GNU logs - #22913, boring messages


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#22913: filenames mangled by locale
Resent-From: Zefram <zefram@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Sat, 05 Mar 2016 00:44:02 +0000
Resent-Message-ID: <handler.22913.B.145713858428108 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 22913
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 22913 <at> debbugs.gnu.org
X-Debbugs-Original-To: bug-guile@HIDDEN
Received: via spool by submit <at> debbugs.gnu.org id=B.145713858428108
          (code B ref -1); Sat, 05 Mar 2016 00:44:02 +0000
Received: (at submit) by debbugs.gnu.org; 5 Mar 2016 00:43:04 +0000
Received: from localhost ([127.0.0.1]:34203 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1ac0JE-0007JI-Gs
	for submit <at> debbugs.gnu.org; Fri, 04 Mar 2016 19:43:04 -0500
Received: from eggs.gnu.org ([208.118.235.92]:41983)
 by debbugs.gnu.org with esmtp (Exim 4.84)
 (envelope-from <zefram@HIDDEN>) id 1ac0JD-0007Io-50
 for submit <at> debbugs.gnu.org; Fri, 04 Mar 2016 19:43:03 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1ac0J7-0003nR-3h
 for submit <at> debbugs.gnu.org; Fri, 04 Mar 2016 19:42:58 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:57054)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1ac0J7-0003nM-0N
 for submit <at> debbugs.gnu.org; Fri, 04 Mar 2016 19:42:57 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:32954)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1ac0J6-0000WB-0w
 for bug-guile@HIDDEN; Fri, 04 Mar 2016 19:42:56 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1ac0J5-0003n5-1D
 for bug-guile@HIDDEN; Fri, 04 Mar 2016 19:42:55 -0500
Received: from river6.fysh.org ([2001:41d0:d:20da::2]:34679
 helo=river.fysh.org) by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1ac0J4-0003n0-RK
 for bug-guile@HIDDEN; Fri, 04 Mar 2016 19:42:54 -0500
Received: from zefram by river.fysh.org with local (Exim 4.80 #2 (Debian))
 id 1ac0J1-0006Tx-Fy; Sat, 05 Mar 2016 00:42:51 +0000
Date: Sat, 5 Mar 2016 00:42:51 +0000
From: Zefram <zefram@HIDDEN>
Message-ID: <20160305004251.GF7946@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -5.0 (-----)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>

It seems that guile-2.0 applies locale encoding and decoding to pathnames
being used in system calls.  This radically breaks file access anywhere
that the locale's character encoding is anything other than a simple
8-bit encoding such as ISO-8859-1.  For example, in the default C locale
with its nominal ASCII encoding,

$ guile-2.0 -c '(open-file (list->string (map integer->char '\''(76 195 169 111 110))) "w")'
$ echo L*n | od -tc
0000000   L   ?   ?   o   n  \n
0000006

Those are literal question marks in the name of the file actually
created, apparently arising as substitutions for the high-half octets in
the requested filename.  Existing files with names containing high-half
octets can't be found (resulting in an ENOENT error message that shows the
actually-existing filename), and new ones can't be created (actually being
created under the mangled name instead).  There's no warning or exception
advising that the requested name can't be used, just this misbehaviour.
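
For reference, the five character codes in the command above are the
octets of the UTF-8 encoding of "L", e-acute, "o", "n".  The following is
a minimal shell sketch, not Guile's actual code path: it dumps those
octets and then simulates the observed substitution by replacing every
octet above 127 with a question mark.  The variable names are illustrative
only.

```shell
# The requested filename, written as raw octets (octal 303 251 = 195 169).
name=$(printf 'L\303\251on')

# Decimal dump of the requested name; the values match the list passed
# to list->string in the Guile command.
printf '%s' "$name" | od -An -tu1

# Simulation only: map every high-half octet (128-255) to "?".
mangled=$(printf '%s' "$name" | LC_ALL=C tr '\200-\377' '?')
printf '%s\n' "$mangled"
```

The second line printed has the same shape as the filename that `od -tc`
reported for the file actually created.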

The equivalent problem arises with decoding when filenames are received:

$ echo foo > $'L\303\251on.txt'
$ guile-2.0 -c '(define d (opendir ".")) (let r () (let ((n (readdir d))) (if (eof-object? n) #t (begin (if (eq? (car (reverse (string->list n))) #\t) (begin (write (map char->integer (string->list n))) (newline))) (r)))))'
(76 63 63 111 110 46 116 120 116)

Again no warning or exception, just incorrect data returned.
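
One consequence of a substitution like this is that it loses
information: distinct filenames can come back looking identical, so the
original octets cannot be recovered from what is returned.  The sketch
below assumes the mangling is a plain octet-to-"?" replacement, which is
what the output above suggests but is not confirmed from Guile's source;
`mangle` is a hypothetical helper.

```shell
# Hypothetical helper simulating the observed behaviour:
# every octet above 127 becomes "?".
mangle() {
    printf '%s' "$1" | LC_ALL=C tr '\200-\377' '?'
}

a=$(mangle "$(printf 'L\303\251on.txt')")   # e-acute encoded as UTF-8
b=$(mangle "$(printf 'L\303\250on.txt')")   # e-grave encoded as UTF-8

# Two different on-disk names, one mangled result.
printf '%s\n%s\n' "$a" "$b"
```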

To work around this would require the program to select a locale with
a more accommodating nominal character encoding.  As I've previously
noted, there's no guarantee of such a locale existing.  Thus the above
behaviour is fatal to any attempt to write in Guile Scheme a program to
operate on arbitrarily-named files.
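
To see whether a given system offers a locale with a simple 8-bit
encoding at all, something like the following could be used.  This is a
rough sketch: `first_latin1_locale` is a hypothetical helper, and the
name pattern is an assumption, since systems spell the codeset in
different ways.

```shell
# Print the first locale name read from stdin whose codeset suffix looks
# like ISO-8859-1 (e.g. "iso88591" or "ISO-8859-1").  Prints nothing if
# no name matches.
first_latin1_locale() {
    grep -i -E '\.iso-?8859-?1$' | head -n 1
}

# Check the locales installed on this machine; empty output means none
# matched the assumed pattern.
locale -a 2>/dev/null | first_latin1_locale || true
```

Empty output here illustrates the point above: a program cannot rely on
such a locale being available.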

Guile even applies this mangling to the pathname of a script that it is
to load:

$ echo '(write "hi")(newline)' > $'L\303\251on.scm'
$ guile-2.0 -s L*n.scm
[big error message saying it couldn't find the file that exists]

Obviously, even if a program could turn off the locale mangling in
general, this instance of it occurs too early for the program to avoid.
The guile framework itself has acquired the kind of 8-bit-cleanliness
bug that it is imposing on the programs that it interprets.

-zefram




Message sent:


Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Zefram <zefram@HIDDEN>
Subject: bug#22913: Acknowledgement (filenames mangled by locale)
Message-ID: <handler.22913.B.145713858428108.ack <at> debbugs.gnu.org>
References: <20160305004251.GF7946@HIDDEN>
X-Gnu-PR-Message: ack 22913
X-Gnu-PR-Package: guile
Reply-To: 22913 <at> debbugs.gnu.org
Date: Sat, 05 Mar 2016 00:44:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please
send it to 22913 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
22913: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22913
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems



Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.