GNU logs - #25397, boring messages


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#25397: guile-2.2 regression in utf8 support in scm_puts scm_lfwrite scm_c_put_string
Resent-From: Linas Vepstas <linasvepstas@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Sun, 08 Jan 2017 18:17:01 +0000
Resent-Message-ID: <handler.25397.B.148389941514309 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 25397
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 25397 <at> debbugs.gnu.org
X-Debbugs-Original-To: bug-guile@HIDDEN
Reply-To: linasvepstas@HIDDEN
Received: via spool by submit <at> debbugs.gnu.org id=B.148389941514309
          (code B ref -1); Sun, 08 Jan 2017 18:17:01 +0000
Received: (at submit) by debbugs.gnu.org; 8 Jan 2017 18:16:55 +0000
Received: from localhost ([127.0.0.1]:47062 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1cQI1X-0003ij-5a
	for submit <at> debbugs.gnu.org; Sun, 08 Jan 2017 13:16:55 -0500
Received: from eggs.gnu.org ([208.118.235.92]:50474)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <linasvepstas@HIDDEN>) id 1cQI1V-0003iW-Fa
 for submit <at> debbugs.gnu.org; Sun, 08 Jan 2017 13:16:53 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <linasvepstas@HIDDEN>) id 1cQI1P-0000os-5u
 for submit <at> debbugs.gnu.org; Sun, 08 Jan 2017 13:16:48 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM,
 T_DKIM_INVALID autolearn=disabled version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:50808)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <linasvepstas@HIDDEN>)
 id 1cQI1P-0000on-2W
 for submit <at> debbugs.gnu.org; Sun, 08 Jan 2017 13:16:47 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41444)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <linasvepstas@HIDDEN>) id 1cQI1N-0000Cl-Ru
 for bug-guile@HIDDEN; Sun, 08 Jan 2017 13:16:46 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <linasvepstas@HIDDEN>) id 1cQI1M-0000oL-Pj
 for bug-guile@HIDDEN; Sun, 08 Jan 2017 13:16:45 -0500
Received: from mail-qt0-x22b.google.com ([2607:f8b0:400d:c0d::22b]:34420)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
 (Exim 4.71) (envelope-from <linasvepstas@HIDDEN>)
 id 1cQI1M-0000oF-LU
 for bug-guile@HIDDEN; Sun, 08 Jan 2017 13:16:44 -0500
Received: by mail-qt0-x22b.google.com with SMTP id l7so70341608qtd.1
 for <bug-guile@HIDDEN>; Sun, 08 Jan 2017 10:16:44 -0800 (PST)
X-Received: by 10.237.52.37 with SMTP id w34mr18201493qtd.173.1483899404089;
 Sun, 08 Jan 2017 10:16:44 -0800 (PST)
MIME-Version: 1.0
Received: by 10.12.128.78 with HTTP; Sun, 8 Jan 2017 10:16:23 -0800 (PST)
From: Linas Vepstas <linasvepstas@HIDDEN>
Date: Sun, 8 Jan 2017 12:16:23 -0600
Message-ID: <CAHrUA35wPNE0JgCDonQTk_z=ZNijWJWzKHzu+pdSMYF-3-1_zg@HIDDEN>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
 recognized.
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>

There appears to be a regression in guile-2.2 with utf8 handling
in the scm_puts(), scm_lfwrite() and scm_c_put_string() functions.

In guile-2.0, one could give these utf8-encoded strings, and these
would display just fine.  In 2.2 they get mangled.

The source of the mangling seems to be an assumption that these
three are being given latin1 strings, which they then attempt to
convert to utf8, thus wrecking the encoding.  See, e.g., libguile/ports.c,
line 3526.

Presumably this change was intentional, but I don't understand
why; guile-2.0 seems utf-8 clean, correctly handling utf-8 in
essentially all cases.  Why would one want to go back to the
bad old days of latin1 and iso-8859-1 for guile 2.2?

I could submit a patch for this, but would it be wanted?

Test case is straight-forward:

printf("duuude port-encoding is=%s\n",
   scm_to_utf8_string(scm_port_encoding(scm_current_output_port ())));
scm_puts ("係 拉 丁 字 母", scm_current_output_port ());

which works in guile-2.0 but is garbled in 2.2
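
The mangling described above can be reproduced without Guile.  The sketch
below is not Guile's code; latin1_to_utf8 is a hypothetical helper that
treats each input byte as one Latin-1 character and re-encodes it as UTF-8,
which is the behaviour the report attributes to scm_puts in 2.2.  Fed bytes
that are already UTF-8, it encodes every non-ASCII byte a second time.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libguile.
   Re-encode `len` bytes of `in` as UTF-8, treating every input byte as one
   Latin-1 (ISO-8859-1) character.  `out` must have room for 2*len + 1 bytes.
   Returns the number of bytes written, not counting the trailing NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            out[n++] = c;                  /* ASCII passes through unchanged */
        } else {
            out[n++] = 0xC0 | (c >> 6);    /* lead byte: 0xC2 or 0xC3 */
            out[n++] = 0x80 | (c & 0x3F);  /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the UTF-8 bytes of "Småland" (53 6D C3 A5 6C 61 6E 64) come out
as 53 6D C3 83 C2 A5 6C 61 6E 64, which a UTF-8 terminal renders as
"SmÃ¥land".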




Message sent:


Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: linasvepstas@HIDDEN
Subject: bug#25397: Acknowledgement (guile-2.2 regression in utf8 support
 in scm_puts scm_lfwrite scm_c_put_string)
Message-ID: <handler.25397.B.148389941514309.ack <at> debbugs.gnu.org>
References: <CAHrUA35wPNE0JgCDonQTk_z=ZNijWJWzKHzu+pdSMYF-3-1_zg@HIDDEN>
X-Gnu-PR-Message: ack 25397
X-Gnu-PR-Package: guile
Reply-To: 25397 <at> debbugs.gnu.org
Date: Sun, 08 Jan 2017 18:17:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please
send it to 25397 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
25397: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=25397
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#25397: guile-2.2 regression in utf8 support in scm_puts scm_lfwrite scm_c_put_string
Resent-From: Andy Wingo <wingo@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Mon, 09 Jan 2017 22:04:02 +0000
Resent-Message-ID: <handler.25397.B25397.148399941930248 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 25397
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Linas Vepstas <linasvepstas@HIDDEN>
Cc: 25397 <at> debbugs.gnu.org
Received: via spool by 25397-submit <at> debbugs.gnu.org id=B25397.148399941930248
          (code B ref 25397); Mon, 09 Jan 2017 22:04:02 +0000
Received: (at 25397) by debbugs.gnu.org; 9 Jan 2017 22:03:39 +0000
Received: from localhost ([127.0.0.1]:48208 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1cQi2V-0007ro-8d
	for submit <at> debbugs.gnu.org; Mon, 09 Jan 2017 17:03:39 -0500
Received: from pb-sasl2.pobox.com ([64.147.108.67]:56364
 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <wingo@HIDDEN>) id 1cQi2T-0007rf-FA
 for 25397 <at> debbugs.gnu.org; Mon, 09 Jan 2017 17:03:37 -0500
Received: from sasl.smtp.pobox.com (unknown [127.0.0.1])
 by pb-sasl2.pobox.com (Postfix) with ESMTP id 51B6357C9E;
 Mon,  9 Jan 2017 17:03:36 -0500 (EST)
Received: from pb-sasl2.nyi.icgroup.com (unknown [127.0.0.1])
 by pb-sasl2.pobox.com (Postfix) with ESMTP id 49F9257C9D;
 Mon,  9 Jan 2017 17:03:36 -0500 (EST)
Received: from clucks (unknown [88.160.190.192])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by pb-sasl2.pobox.com (Postfix) with ESMTPSA id 3E2A357C9C;
 Mon,  9 Jan 2017 17:03:35 -0500 (EST)
From: Andy Wingo <wingo@HIDDEN>
References: <CAHrUA35wPNE0JgCDonQTk_z=ZNijWJWzKHzu+pdSMYF-3-1_zg@HIDDEN>
Date: Mon, 09 Jan 2017 23:03:27 +0100
In-Reply-To: <CAHrUA35wPNE0JgCDonQTk_z=ZNijWJWzKHzu+pdSMYF-3-1_zg@HIDDEN>
 (Linas Vepstas's message of "Sun, 8 Jan 2017 12:16:23 -0600")
Message-ID: <87y3yj99hs.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Pobox-Relay-ID: 77B21C20-D6B7-11E6-A179-6141F2301B6D-02397024!pb-sasl2.pobox.com
X-Spam-Score: -3.2 (---)

On Sun 08 Jan 2017 19:16, Linas Vepstas <linasvepstas@HIDDEN> writes:

> There appears to be a regression in guile-2.2 with utf8 handling
> in the scm_puts(), scm_lfwrite() and scm_c_put_string() functions.
>
> In guile-2.0, one could give these utf8-encoded strings, and these
> would display just fine.  In 2.2 they get mangled.

Could it be this from NEWS:

  ** Better locale support in Guile scripts

  When Guile is invoked directly, either from the command line or via a
  hash-bang line (e.g. "#!/usr/bin/guile"), it now installs the current
  locale via a call to `(setlocale LC_ALL "")'.  For users with a unicode
  locale, this makes all ports unicode-capable by default, without the
  need to call `setlocale' in your program.  This behavior may be
  controlled via the GUILE_INSTALL_LOCALE environment variable; see the
  manual for more.
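
For a C program that embeds libguile rather than running the guile
executable, the NEWS entry above would not seem to apply, so the program
presumably has to install the locale itself.  A minimal sketch using only
the standard C and POSIX calls setlocale and nl_langinfo; install_env_locale
is a hypothetical name, and nothing here is Guile API:

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper, not part of libguile.
   Install the locale named by the environment (LANG, LC_ALL, ...) and return
   the name of its character encoding, e.g. "UTF-8".  Returns NULL if the
   environment names a locale the system does not provide. */
const char *install_env_locale(void)
{
    if (setlocale(LC_ALL, "") == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

The returned name depends entirely on the environment the program runs in,
so it is worth printing when debugging encoding problems.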




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#25397: guile-2.2 regression in utf8 support in scm_puts scm_lfwrite scm_c_put_string
Resent-From: Linas Vepstas <linasvepstas@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Tue, 10 Jan 2017 03:36:02 +0000
Resent-Message-ID: <handler.25397.B25397.14840193045689 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 25397
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Andy Wingo <wingo@HIDDEN>
Cc: 25397 <at> debbugs.gnu.org
Reply-To: linasvepstas@HIDDEN
Received: via spool by 25397-submit <at> debbugs.gnu.org id=B25397.14840193045689
          (code B ref 25397); Tue, 10 Jan 2017 03:36:02 +0000
Received: (at 25397) by debbugs.gnu.org; 10 Jan 2017 03:35:04 +0000
Received: from localhost ([127.0.0.1]:48307 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1cQnDE-0001Th-Bx
	for submit <at> debbugs.gnu.org; Mon, 09 Jan 2017 22:35:04 -0500
Received: from mail-qk0-f172.google.com ([209.85.220.172]:33832)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <linasvepstas@HIDDEN>) id 1cQnDC-0001T9-6f
 for 25397 <at> debbugs.gnu.org; Mon, 09 Jan 2017 22:35:02 -0500
Received: by mail-qk0-f172.google.com with SMTP id a20so162908198qkc.1
 for <25397 <at> debbugs.gnu.org>; Mon, 09 Jan 2017 19:35:02 -0800 (PST)
X-Received: by 10.55.114.70 with SMTP id n67mr935130qkc.185.1484019296670;
 Mon, 09 Jan 2017 19:34:56 -0800 (PST)
MIME-Version: 1.0
Received: by 10.12.128.78 with HTTP; Mon, 9 Jan 2017 19:34:36 -0800 (PST)
In-Reply-To: <87y3yj99hs.fsf@HIDDEN>
References: <CAHrUA35wPNE0JgCDonQTk_z=ZNijWJWzKHzu+pdSMYF-3-1_zg@HIDDEN>
 <87y3yj99hs.fsf@HIDDEN>
From: Linas Vepstas <linasvepstas@HIDDEN>
Date: Mon, 9 Jan 2017 21:34:36 -0600
Message-ID: <CAHrUA36xT1x4xW-xgYw_-3zpfSDHFQU4kqtETFk7ScKW-5u0pQ@HIDDEN>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: -1.3 (-)

This short C program illustrates the issue.  The locale, the output port, etc.
are UTF-8.  The bad results are no surprise: the code currently in git for
scm_puts etc. explicitly ignores the locale setting, always, and always
assumes latin1 -- it's hard-coded in there.

--linas

#include <libguile.h>

void *wrap_eval(void* p)
{
   char *wtf = "(setlocale LC_ALL \"\")";
   SCM eval_str = scm_from_utf8_string(wtf);
   scm_eval_string(eval_str);

   return NULL;
}

void *wrap_puts(void* p)
{
   char *wtf = p;

   SCM port = scm_current_output_port ();

   scm_puts("the port-encoding is=", port);
   scm_puts(scm_to_utf8_string(scm_port_encoding(port)), port);

   scm_puts("\nThe string to display is =", port);
   scm_puts (wtf, port);

   scm_puts("\nWas expecting to see this=", port);
   SCM str = scm_from_utf8_string(wtf);
   scm_display(str, port);
   scm_puts("\n\n", port);

   return NULL;
}

int main(int argc, char* argv[])
{
   scm_with_guile(wrap_eval, 0x0);

   char * wtf = "Ćićolina";
   scm_with_guile(wrap_puts, wtf);

   wtf = "Thủ Dầu Một";
   scm_with_guile(wrap_puts, wtf);

   wtf = "Småland";
   scm_with_guile(wrap_puts, wtf);

   wtf = "Hòa Phú Phú Tân";
   scm_with_guile(wrap_puts, wtf);

   wtf = "係 拉 丁 字 母";
   scm_with_guile(wrap_puts, wtf);
}

The output is always this:

the port-encoding is=UTF-8
The string to display is =Ä†iÄ‡olina
Was expecting to see this=Ćićolina

the port-encoding is=UTF-8
The string to display is =Thá»§ Dáº§u Má»™t
Was expecting to see this=Thủ Dầu Một

the port-encoding is=UTF-8
The string to display is =SmÃ¥land
Was expecting to see this=Småland

the port-encoding is=UTF-8
The string to display is =HÃ²a PhÃº PhÃº TÃ¢n
Was expecting to see this=Hòa Phú Phú Tân

the port-encoding is=UTF-8
Was expecting to see this=係 拉 丁 字 母 æ¯


What's cool is that all this stuff works in email!

--linas

On Mon, Jan 9, 2017 at 4:03 PM, Andy Wingo <wingo@HIDDEN> wrote:
> On Sun 08 Jan 2017 19:16, Linas Vepstas <linasvepstas@HIDDEN> writes:
>
>> There appears to be a regression in guile-2.2 with utf8 handling
>> in the scm_puts(), scm_lfwrite() and scm_c_put_string() functions.
>>
>> In guile-2.0, one could give these utf8-encoded strings, and these
>> would display just fine.  In 2.2 they get mangled.
>
> Could it be this from NEWS:
>
>   ** Better locale support in Guile scripts
>
>   When Guile is invoked directly, either from the command line or via a
>   hash-bang line (e.g. "#!/usr/bin/guile"), it now installs the current
>   locale via a call to `(setlocale LC_ALL "")'.  For users with a unicode
>   locale, this makes all ports unicode-capable by default, without the
>   need to call `setlocale' in your program.  This behavior may be
>   controlled via the GUILE_INSTALL_LOCALE environment variable; see the
>   manual for more.




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#25397: guile-2.2 regression in utf8 support in scm_puts scm_lfwrite scm_c_put_string
Resent-From: Andy Wingo <wingo@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 01 Mar 2017 15:46:02 +0000
Resent-Message-ID: <handler.25397.B25397.148838313610808 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 25397
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Linas Vepstas <linasvepstas@HIDDEN>
Cc: 25397 <at> debbugs.gnu.org
Received: via spool by 25397-submit <at> debbugs.gnu.org id=B25397.148838313610808
          (code B ref 25397); Wed, 01 Mar 2017 15:46:02 +0000
Received: (at 25397) by debbugs.gnu.org; 1 Mar 2017 15:45:36 +0000
Received: from localhost ([127.0.0.1]:34559 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1cj6Rc-0002oF-I9
	for submit <at> debbugs.gnu.org; Wed, 01 Mar 2017 10:45:36 -0500
Received: from pb-sasl1.pobox.com ([64.147.108.66]:54548
 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <wingo@HIDDEN>) id 1cj6Rb-0002o8-1l
 for 25397 <at> debbugs.gnu.org; Wed, 01 Mar 2017 10:45:35 -0500
Received: from sasl.smtp.pobox.com (unknown [127.0.0.1])
 by pb-sasl1.pobox.com (Postfix) with ESMTP id 82A775EF3F;
 Wed,  1 Mar 2017 10:45:34 -0500 (EST)
Received: from pb-sasl1.nyi.icgroup.com (unknown [127.0.0.1])
 by pb-sasl1.pobox.com (Postfix) with ESMTP id 7389D5EF3D;
 Wed,  1 Mar 2017 10:45:34 -0500 (EST)
Received: from clucks (unknown [109.190.228.233])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by pb-sasl1.pobox.com (Postfix) with ESMTPSA id 57B0C5EF3A;
 Wed,  1 Mar 2017 10:45:33 -0500 (EST)
From: Andy Wingo <wingo@HIDDEN>
References: <CAHrUA35wPNE0JgCDonQTk_z=ZNijWJWzKHzu+pdSMYF-3-1_zg@HIDDEN>
 <87y3yj99hs.fsf@HIDDEN>
 <CAHrUA36xT1x4xW-xgYw_-3zpfSDHFQU4kqtETFk7ScKW-5u0pQ@HIDDEN>
Date: Wed, 01 Mar 2017 16:45:26 +0100
In-Reply-To: <CAHrUA36xT1x4xW-xgYw_-3zpfSDHFQU4kqtETFk7ScKW-5u0pQ@HIDDEN>
 (Linas Vepstas's message of "Mon, 9 Jan 2017 21:34:36 -0600")
Message-ID: <87y3wpdmqx.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Pobox-Relay-ID: 1B2B23E6-FE96-11E6-8A12-B667064AB293-02397024!pb-sasl1.pobox.com
X-Spam-Score: 0.0 (/)

On Tue 10 Jan 2017 04:34, Linas Vepstas <linasvepstas@HIDDEN> writes:

> void *wrap_puts(void* p)
> {
>    char *wtf = p;
>
>    SCM port = scm_current_output_port ();
>
>    scm_puts("the port-encoding is=", port);
>    scm_puts(scm_to_utf8_string(scm_port_encoding(port)), port);
>
>    scm_puts("\nThe string to display is =", port);
>    scm_puts (wtf, port);
>
>    scm_puts("\nWas expecting to see this=", port);
>    SCM str = scm_from_utf8_string(wtf);
>    scm_display(str, port);
>    scm_puts("\n\n", port);
>
>    return NULL;
> }

So, there are a few questions here.  scm_puts and scm_lfwrite are not
documented, so we need to do basic science on them to see what they are
supposed to do.

Firstly, is scm_puts() a textual interface or a binary interface?
I.e. does it write a sequence of characters or a sequence of bytes?

If I look at uses of scm_puts in Guile sources, it seems clear that it's
a textual interface.  That is to say, at all points, the intention seems
to be to write characters on a Guile port.  All of the uses are of
strings.  Please do a "git grep" on your source to see if your
perceptions correspond.

Now the question is, what encoding is the argument in?  If the port is
UTF-16, that byte string should be decoded to characters, and that
character sequence encoded to UTF-16.
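
As an illustration of the decode-then-encode step described above, here is
a minimal sketch in plain C, independent of Guile.  It assumes the input is
well-formed UTF-8 (a real decoder must validate), and utf8_decode and
utf8_to_utf16 are hypothetical helper names, not libguile functions:

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence starting at s into *cp and return its length in
   bytes.  Assumes well-formed input; a real decoder must validate it. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((uint32_t)(s[0] & 0x0F) << 12)
            | ((uint32_t)(s[1] & 0x3F) << 6)
            | (s[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(s[0] & 0x07) << 18)
        | ((uint32_t)(s[1] & 0x3F) << 12)
        | ((uint32_t)(s[2] & 0x3F) << 6)
        | (s[3] & 0x3F);
    return 4;
}

/* Transcode a NUL-terminated UTF-8 string into UTF-16 code units.
   `out` must have room for one unit per input byte.
   Returns the number of units written. */
size_t utf8_to_utf16(const unsigned char *in, uint16_t *out)
{
    size_t n = 0;
    while (*in != '\0') {
        uint32_t cp;
        in += utf8_decode(in, &cp);       /* bytes -> character */
        if (cp < 0x10000) {
            out[n++] = (uint16_t)cp;      /* character -> one unit */
        } else {
            cp -= 0x10000;                /* character -> surrogate pair */
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return n;
}
```

With this sketch, the three bytes E4 BF 82 (係) become the single UTF-16
unit 0x4FC2.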

All of the scm_puts calls in Guile are of one-byte characters with
codepoints less than 128, so when doing some port refactoring I chose to
interpret the argument as latin1.

FTR, in Guile 2.0, this was effectively a binary interface.  Guile 2.0's
scm_lfwrite interpreted the incoming bytes as ISO-8859-1 codepoints for
the purposes of updating line and column, but scm_puts and scm_lfwrite
just wrote out the bytes to the port directly, regardless of the
encoding.  That was the wrong thing.
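
The line-and-column point can be illustrated without Guile.  Counting every
byte as one character, as the Latin-1 interpretation does, overstates the
column for UTF-8 text; counting only non-continuation bytes gives the number
of code points.  Both functions below are hypothetical helpers, and neither
accounts for display width, tabs or newlines:

```c
#include <stddef.h>

/* Column advance if every byte counts as one character (the Latin-1 view). */
size_t columns_as_latin1(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Column advance if the bytes are UTF-8: count only bytes that start a
   character, i.e. skip continuation bytes of the form 10xxxxxx. */
size_t columns_as_utf8(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For the UTF-8 string "Småland", the first returns 8 and the second 7.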

Are you arguing that the byte string given to scm_puts should be decoded
from UTF-8?  That would be OK.

Andy




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#25397: guile-2.2 regression in utf8 support in scm_puts scm_lfwrite scm_c_put_string
Resent-From: Linas Vepstas <linasvepstas@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 01 Mar 2017 20:20:02 +0000
Resent-Message-ID: <handler.25397.B25397.14883995433563 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 25397
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Andy Wingo <wingo@HIDDEN>
Cc: "25397 <at> debbugs.gnu.org" <25397 <at> debbugs.gnu.org>
Reply-To: linasvepstas@HIDDEN
Received: via spool by 25397-submit <at> debbugs.gnu.org id=B25397.14883995433563
          (code B ref 25397); Wed, 01 Mar 2017 20:20:02 +0000
Received: (at 25397) by debbugs.gnu.org; 1 Mar 2017 20:19:03 +0000
Received: from localhost ([127.0.0.1]:34870 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1cjAiE-0000vO-K7
	for submit <at> debbugs.gnu.org; Wed, 01 Mar 2017 15:19:02 -0500
Received: from mail-qk0-f181.google.com ([209.85.220.181]:35687)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <linasvepstas@HIDDEN>) id 1cjAiC-0000uu-Vs
 for 25397 <at> debbugs.gnu.org; Wed, 01 Mar 2017 15:19:01 -0500
Received: by mail-qk0-f181.google.com with SMTP id u188so89817965qkc.2
 for <25397 <at> debbugs.gnu.org>; Wed, 01 Mar 2017 12:19:00 -0800 (PST)
X-Received: by 10.55.131.4 with SMTP id f4mr11959349qkd.1.1488399535351; Wed,
 01 Mar 2017 12:18:55 -0800 (PST)
MIME-Version: 1.0
Received: by 10.12.174.231 with HTTP; Wed, 1 Mar 2017 12:18:54 -0800 (PST)
In-Reply-To: <87y3wpdmqx.fsf@HIDDEN>
References: <CAHrUA35wPNE0JgCDonQTk_z=ZNijWJWzKHzu+pdSMYF-3-1_zg@HIDDEN>
 <87y3yj99hs.fsf@HIDDEN>
 <CAHrUA36xT1x4xW-xgYw_-3zpfSDHFQU4kqtETFk7ScKW-5u0pQ@HIDDEN>
 <87y3wpdmqx.fsf@HIDDEN>
From: Linas Vepstas <linasvepstas@HIDDEN>
Date: Wed, 1 Mar 2017 14:18:54 -0600
Message-ID: <CAHrUA35TFqxtZuJ23huU1=5FOE-_Cr55R1HJHXjNrcqCtqQMGA@HIDDEN>
Content-Type: multipart/alternative; boundary=94eb2c071e9a96ad5f0549b105a9
X-Spam-Score: 0.5 (/)

--94eb2c071e9a96ad5f0549b105a9
Content-Type: text/plain; charset=UTF-8

In the bad old days, not everything was documented... My use of scm_puts
dates back to guile-1.8.  I only ever send it utf8.  I can change my code,
no problem; I just thought I'd report a regression in case others
are affected.

Linas

On Wednesday, March 1, 2017, Andy Wingo <wingo@HIDDEN> wrote:

> On Tue 10 Jan 2017 04:34, Linas Vepstas <linasvepstas@HIDDEN> writes:
>
> > void *wrap_puts(void* p)
> > {
> >    char *wtf = p;
> >
> >    SCM port = scm_current_output_port ();
> >
> >    scm_puts("the port-encoding is=", port);
> >    scm_puts(scm_to_utf8_string(scm_port_encoding(port)), port);
> >
> >    scm_puts("\nThe string to display is =", port);
> >    scm_puts (wtf, port);
> >
> >    scm_puts("\nWas expecting to see this=", port);
> >    SCM str = scm_from_utf8_string(wtf);
> >    scm_display(str, port);
> >    scm_puts("\n\n", port);
> >
> >    return NULL;
> > }
>
> So, there are a few questions here.  scm_puts and scm_lfwrite are not
> documented, so we need to do basic science on them to see what they are
> supposed to do.
>
> Firstly, is scm_puts() a textual interface or a binary interface?
> I.e. does it write a sequence of characters or a sequence of bytes?
>
> If I look at uses of scm_puts in Guile sources, it seems clear that it's
> a textual interface.  That is to say, at all points, the intention seems
> to be to write characters on a Guile port.  All of the uses are of
> strings.  Please do a "git grep" on your source to see if your
> perceptions correspond.
>
> Now the question is, what encoding is the argument in?  If the port is
> UTF-16, that byte string should be decoded to characters, and that
> character sequence encoded to UTF-16.
>
> All of the scm_puts calls in Guile are of one-byte characters with
> codepoints less than 128, so when doing some port refactoring I chose to
> interpret the argument as latin1.
>
> FTR, in Guile 2.0, this was effectively a binary interface.  Guile 2.0's
> scm_lfwrite interpreted the incoming bytes as ISO-8859-1 codepoints for
> the purposes of updating line and column, but scm_puts and scm_lfwrite
> just wrote out the bytes to the port directly, regardless of the
> encoding.  That was the wrong thing.
>
> Are you arguing that the byte string given to scm_puts should be decoded
> from UTF-8?  That would be OK.
>
> Andy
>
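
To make the latin1 interpretation described above concrete, here is a
minimal sketch in plain C, independent of libguile.  The helper name
latin1_bytes_to_utf8 is hypothetical and is not a Guile function; it only
assumes that each incoming byte is taken as one ISO-8859-1 code point and
then encoded for a UTF-8 port, which is my reading of the behaviour
discussed in this thread rather than Guile's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libguile): treat every input byte as
   one ISO-8859-1 code point and re-encode it as UTF-8.
   `out` must have room for 2 * len + 1 bytes.
   Returns the number of bytes written, not counting the trailing NUL. */
size_t latin1_bytes_to_utf8(const unsigned char *in, size_t len,
                            unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII is unchanged, which is why pure-ASCII callers
               never notice the difference. */
            out[n++] = c;
        } else {
            /* Code points U+0080..U+00FF need two bytes in UTF-8. */
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

Feeding this helper the two UTF-8 bytes of U+00E9 (0xC3 0xA9) yields four
bytes, 0xC3 0x83 0xC2 0xA9, so a caller that hands over UTF-8 gets every
non-ASCII character encoded twice.  Converting with scm_from_utf8_string
and writing with scm_display, as the test case above already does for
comparison, avoids the double encoding.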

--94eb2c071e9a96ad5f0549b105a9--





Last modified: Mon, 25 Nov 2019 12:00:02 UTC