From: Zefram <zefram@HIDDEN>
To: 22913 <at> debbugs.gnu.org
Subject: bug#22913: filenames mangled by locale
Date: Sat, 5 Mar 2016 00:42:51 +0000
Message-ID: <20160305004251.GF7946@HIDDEN>

It seems that guile-2.0 applies locale encoding and decoding to pathnames being used in system calls. This radically breaks file access anywhere that the locale's character encoding is anything other than a simple 8-bit encoding such as ISO-8859-1.

For example, in the default C locale with its nominal ASCII encoding,

$ guile-2.0 -c '(open-file (list->string (map integer->char '\''(76 195 169 111 110))) "w")'
$ echo L*n | od -tc
0000000   L   ?   ?   o   n  \n
0000006

Those are literal question marks in the name of the file actually created, apparently arising as substitutions for the high-half octets in the requested filename. Existing files with names containing high-half octets can't be found (resulting in an ENOENT error message that shows the actually-existing filename), and new ones can't be created (actually being created under the mangled name instead). There's no warning or exception advising that the requested name can't be used, just this misbehaviour.

The equivalent problem arises with decoding when filenames are received:

$ echo foo > $'L\303\251on.txt'
$ guile-2.0 -c '(define d (opendir ".")) (let r () (let ((n (readdir d))) (if (eof-object? n) #t (begin (if (eq? (car (reverse (string->list n))) #\t) (begin (write (map char->integer (string->list n))) (newline))) (r)))))'
(76 63 63 111 110 46 116 120 116)

Again no warning or exception, just incorrect data returned.

To work around this would require the program to select a locale with a more accommodating nominal character encoding. As I've previously noted, there's no guarantee of such a locale existing. Thus the above behaviour is fatal to any attempt to write in Guile Scheme a program to operate on arbitrarily-named files.

Guile even applies this mangling to the pathname of a script that it is to load:

$ echo '(write "hi")(newline)' > $'L\303\251on.scm'
$ guile-2.0 -s L*n.scm
[big error message saying it couldn't find the file that exists]

Obviously, even if a program could turn off the locale mangling in general, this instance of it occurs too early for the program to avoid. The guile framework itself has acquired the kind of 8-bit-cleanliness bug that it is imposing on the programs that it interprets.

-zefram
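
Editorial note: the substitution described in the report can be imitated outside Guile, which may help when checking which filenames would be affected. The sketch below is not Guile's code. `mangle_high_octets` is a hypothetical helper name, and it rests on the assumption that the mangling amounts to replacing every octet above 0x7F with a question mark, which is what the C-locale transcripts above appear to show.

```shell
#!/bin/sh
# Hypothetical helper (not part of Guile): replace every octet in the
# range 0x80-0xFF with '?', imitating the substitution seen in the report.
# LC_ALL=C makes tr treat its input as single octets, not characters.
mangle_high_octets() {
    LC_ALL=C tr '\200-\377' '?'
}

# 'L\303\251on' is the five-octet name (76 195 169 111 110) from the report.
printf 'L\303\251on' | mangle_high_octets
echo
# prints: L??on
```

Names made only of ASCII octets pass through this filter unchanged, which is consistent with the report's point that the problem only shows up for names containing high-half octets.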

From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Zefram <zefram@HIDDEN>
Subject: bug#22913: Acknowledgement (filenames mangled by locale)
Date: Sat, 05 Mar 2016 00:44:02 +0000
Message-ID: <handler.22913.B.145713858428108.ack <at> debbugs.gnu.org>
References: <20160305004251.GF7946@HIDDEN>
Reply-To: 22913 <at> debbugs.gnu.org

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message has been received.

Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please send it to 22913 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish to report a problem with the Bug-tracking system.

-- 
22913: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22913
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems