Date: Sat, 5 Mar 2016 00:42:51 +0000
From: Zefram <zefram@HIDDEN>
To: bug-guile@HIDDEN
Subject: filenames mangled by locale
Message-ID: <20160305004251.GF7946@HIDDEN>

It seems that guile-2.0 applies locale encoding and decoding to pathnames being used in system calls. This radically breaks file access anywhere that the locale's character encoding is anything other than a simple 8-bit encoding such as ISO-8859-1. For example, in the default C locale with its nominal ASCII encoding,

$ guile-2.0 -c '(open-file (list->string (map integer->char '\''(76 195 169 111 110))) "w")'
$ echo L*n | od -tc
0000000   L   ?   ?   o   n  \n
0000006

Those are literal question marks in the name of the file actually created, apparently arising as substitutions for the high-half octets in the requested filename.
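
The substitution seen above can be modelled outside Guile. What follows is a minimal shell sketch, assuming the mangling amounts to replacing every octet in the range 0x80-0xFF with a literal question mark. It is a model of the observed result only, not Guile's actual code path, and the function name mangle_high_octets is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Guile): model the observed mangling by
# replacing every octet in the range 0x80-0xFF with a literal '?'.
# LC_ALL=C makes tr treat its input as single octets, not multibyte characters.
mangle_high_octets() {
    LC_ALL=C tr '\200-\377' '[?*]'
}

# The five octets requested in the example above: 76 195 169 111 110.
printf 'L\303\251on' | mangle_high_octets | od -tc
```

Under that assumption the pipeline shows the octets L ? ? o n, the same name as the file that was actually created. Names consisting only of ASCII octets pass through unchanged, which is consistent with the problem going unnoticed for such names.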
Existing files with names containing high-half octets can't be found (resulting in an ENOENT error message that shows the actually-existing filename), and new ones can't be created (actually being created under the mangled name instead). There's no warning or exception advising that the requested name can't be used, just this misbehaviour.

The equivalent problem arises with decoding when filenames are received:

$ echo foo > $'L\303\251on.txt'
$ guile-2.0 -c '(define d (opendir ".")) (let r () (let ((n (readdir d))) (if (eof-object? n) #t (begin (if (eq? (car (reverse (string->list n))) #\t) (begin (write (map char->integer (string->list n))) (newline))) (r)))))'
(76 63 63 111 110 46 116 120 116)

Again no warning or exception, just incorrect data returned.

To work around this would require the program to select a locale with a more accommodating nominal character encoding. As I've previously noted, there's no guarantee of such a locale existing. Thus the above behaviour is fatal to any attempt to write in Guile Scheme a program to operate on arbitrarily-named files.

Guile even applies this mangling to the pathname of a script that it is to load:

$ echo '(write "hi")(newline)' > $'L\303\251on.scm'
$ guile-2.0 -s L*n.scm
[big error message saying it couldn't find the file that exists]

Obviously, even if a program could turn off the locale mangling in general, this instance of it occurs too early for the program to avoid. The guile framework itself has acquired the kind of 8-bit-cleanliness bug that it is imposing on the programs that it interprets.

-zefram

bug#22913; Package guile.