GNU bug report logs - #22901
drain-input doesn't decode

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guile; Reported by: Zefram <zefram@HIDDEN>; dated Fri, 4 Mar 2016 03:11:01 UTC; Maintainer for guile is bug-guile@HIDDEN.

Message received at 22901 <at> debbugs.gnu.org:


From: Andy Wingo <wingo@HIDDEN>
To: Zefram <zefram@HIDDEN>
Subject: Re: bug#22901: drain-input doesn't decode
References: <20160304030944.GA1318@HIDDEN>
Date: Mon, 20 Jun 2016 18:12:50 +0200
In-Reply-To: <20160304030944.GA1318@HIDDEN> (zefram@HIDDEN's message of
 "Fri, 4 Mar 2016 03:09:44 +0000")
Message-ID: <87d1nbg7p9.fsf@HIDDEN>
Cc: 22901 <at> debbugs.gnu.org

On Fri 04 Mar 2016 04:09, Zefram <zefram@HIDDEN> writes:

> The documentation for drain-input says that it returns a string of
> characters, implying that the result is equivalent to what you'd get
> from calling read-char some number of times.  In fact it differs in a
> significant respect: whereas read-char decodes input octets according to
> the port's selected encoding, drain-input ignores the selected encoding
> and always decodes according to ISO-8859-1 (thus preserving the octet
> values in character form).
>
> $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port)))) (if (eof-object? c) (reverse l) (r (cons c l))))))) (newline)'
> "UCS-2BE"
> (353 610 867)
> $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (peek-char (current-input-port)) (write (map char->integer (string->list (drain-input (current-input-port))))) (newline)'
> "UCS-2BE"
> (1 97 2 98 3 99)

Thanks for the test case!  FWIW, this is fixed in Guile 2.1.3.  I am not
sure what we should do about Guile 2.0.  I guess we should make it do
the documented thing though!

Andy




Information forwarded to bug-guile@HIDDEN:
bug#22901; Package guile. Full text available.

Message received at submit <at> debbugs.gnu.org:


Date: Fri, 4 Mar 2016 03:09:44 +0000
From: Zefram <zefram@HIDDEN>
To: bug-guile@HIDDEN
Subject: drain-input doesn't decode
Message-ID: <20160304030944.GA1318@HIDDEN>

The documentation for drain-input says that it returns a string of
characters, implying that the result is equivalent to what you'd get
from calling read-char some number of times.  In fact it differs in a
significant respect: whereas read-char decodes input octets according to
the port's selected encoding, drain-input ignores the selected encoding
and always decodes according to ISO-8859-1 (thus preserving the octet
values in character form).

$ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port)))) (if (eof-object? c) (reverse l) (r (cons c l))))))) (newline)'
"UCS-2BE"
(353 610 867)
$ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (peek-char (current-input-port)) (write (map char->integer (string->list (drain-input (current-input-port))))) (newline)'
"UCS-2BE"
(1 97 2 98 3 99)
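
The two outputs above can be checked by hand from the six input octets: each UCS-2BE character is a big-endian pair, so 1*256 + 97 = 353, 2*256 + 98 = 610, and 3*256 + 99 = 867. The Python sketch below is only an illustration of that arithmetic and does not involve Guile. It assumes Python's `utf-16-be` codec as a stand-in for UCS-2BE, which should be equivalent for these code points because none of them lies in the surrogate range.

```python
# The six octets produced by: echo -n $'\1a\2b\3c'
octets = bytes([1, ord("a"), 2, ord("b"), 3, ord("c")])

# What the read-char loop reports: octet pairs combined big-endian.
# "utf-16-be" stands in for UCS-2BE here (an assumption of this sketch).
decoded = [ord(ch) for ch in octets.decode("utf-16-be")]

# What drain-input reports in the run above: one character per octet,
# i.e. an ISO-8859-1 interpretation of the same bytes.
undecoded = [ord(ch) for ch in octets.decode("iso-8859-1")]

print(decoded)    # [353, 610, 867]
print(undecoded)  # [1, 97, 2, 98, 3, 99]
```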

The practical upshot is that the input returned by drain-input can't
be used in the same way as regular input from read-char.  It can still
be used if the code doing the reading is totally aware of the encoding,
so that it can perform the decoding manually, but this seems a failure
of abstraction.  The value returned by drain-input ought to be coherent
with the abstraction level at which it is specified.
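
The manual decoding mentioned above amounts to a round trip: turn the one-character-per-octet string back into octets, then decode those octets with the port's real encoding. A minimal Python sketch of the idea follows; it is not Guile code, the helper name `redecode` is hypothetical, and `utf-16-be` is again assumed as a stand-in for UCS-2BE.

```python
def redecode(drained: str, encoding: str) -> str:
    """Recover real characters from a string that holds one char per octet."""
    # Characters 0..255 map straight back to the original octets.
    raw = drained.encode("iso-8859-1")
    # Decode the octets with the encoding the caller knows the port uses.
    return raw.decode(encoding)

# The undecoded string from the second run above.
drained = "\x01a\x02b\x03c"
fixed = redecode(drained, "utf-16-be")
print([ord(ch) for ch in fixed])  # [353, 610, 867]
```

The caller has to supply the encoding itself, which is exactly the knowledge the port abstraction is supposed to hide.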

I can see that there is a reason for drain-input to avoid performing
decoding: the problem that occurs if the buffer ends in the middle
of a character.  If drain-input is to return decoded characters then
presumably in this case it would have to read further octets beyond the
buffer contents, in an unbuffered manner, until it reaches a character
boundary.  If this is too unpalatable, perhaps drain-input should be
permitted only on ports configured for single-octet character encodings.
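
The "read further octets until a character boundary" strategy can be sketched with an incremental decoder. This is a Python illustration under stated assumptions, not a description of how Guile implements or fixed drain-input: the function `drain_decoded` and its `read_more` callback are hypothetical names, and `utf-16-be` again stands in for UCS-2BE.

```python
import codecs
import io

def drain_decoded(buffered: bytes, read_more, encoding: str) -> str:
    """Decode the buffered octets, pulling extra octets one at a time
    if the buffer ends in the middle of a character."""
    decoder = codecs.getincrementaldecoder(encoding)()
    text = decoder.decode(buffered)
    # getstate()[0] holds octets consumed but not yet turned into a character.
    while decoder.getstate()[0]:
        octet = read_more(1)
        if not octet:
            break  # EOF inside a character; the partial character is dropped
        text += decoder.decode(octet)
    return text

# The buffer holds 3 octets, so it ends halfway through the second character.
rest = io.BytesIO(b"b\x03c")
text = drain_decoded(b"\x01a\x02", rest.read, "utf-16-be")
print([ord(ch) for ch in text])  # [353, 610]
print(rest.read())               # b'\x03c'
```

Only one octet beyond the buffer is consumed here, and the third character stays unread on the underlying source.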

If, on the other hand, it is decided to endorse the current non-decoding
behaviour, then the break of abstraction needs to be documented.

-zefram




Acknowledgement sent to Zefram <zefram@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guile@HIDDEN. Full text available.
Report forwarded to bug-guile@HIDDEN:
bug#22901; Package guile. Full text available.
Last modified: Mon, 20 Jun 2016 16:15:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.