Package: guix-patches;
Reported by: Tomas Volf <~@wolfsden.cz>
Date: Wed, 29 May 2024 13:29:01 UTC
Severity: normal
Tags: patch
Done: Tomas Volf <~@wolfsden.cz>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 71257 in the body.
You can then email your comments to 71257 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
guix <at> cbaines.net, pelzflorian <at> pelzflorian.de, dev <at> jpoiret.xyz, ludo <at> gnu.org, othacehe <at> gnu.org, matt <at> excalamus.com, maxim.cournoyer <at> gmail.com, rekado <at> elephly.net, zimon.toutoune <at> gmail.com, me <at> tobias.gr, guix-patches <at> gnu.org
:bug#71257
; Package guix-patches
.
(Wed, 29 May 2024 13:29:01 GMT) Full text and rfc822 format available.Tomas Volf <~@wolfsden.cz>
:guix <at> cbaines.net, pelzflorian <at> pelzflorian.de, dev <at> jpoiret.xyz, ludo <at> gnu.org, othacehe <at> gnu.org, matt <at> excalamus.com, maxim.cournoyer <at> gmail.com, rekado <at> elephly.net, zimon.toutoune <at> gmail.com, me <at> tobias.gr, guix-patches <at> gnu.org
.
(Wed, 29 May 2024 13:29:02 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Tomas Volf <~@wolfsden.cz> To: guix-patches <at> gnu.org Cc: Tomas Volf <~@wolfsden.cz> Subject: [PATCH core-updates] gexp: Improve support of Unicode characters. Date: Wed, 29 May 2024 15:27:05 +0200
Support for non-ASCII characters was mixed. Some gexp forms did support them, while others did not. Combined with current value for %default-port-conversion-strategy, that sometimes led to unpleasant surprises. For example: (scheme-file "utf8" #~(with-output-to-file #$output (λ _ (display "猫")))) Was written to the store as: ((? _ (display "\u732b"))) No, that is not font issue on your part, that is an actual #\? instead of the lambda character. Which, surprisingly, does not do what it should when executed. The solution is to switch to C.UTF-8 locale where possible, since it is now always available. Or to explicitly set the port encoding. No tests are provided, since majority of tests/gexp.scm use guile in version 2, and it tends to work under it. The issues occur mostly with guile 3. I did test it locally using: #!/bin/sh set -eu set -x [ -f guix.scm ] || { echo >&2 Run from root of Guix repo.; exit 1; } [ -f gnu.scm ] || { echo >&2 Run from root of Guix repo.; exit 1; } cat >猫.scm <<'EOF' (define-module (猫) #:export (say)) (define (say) "nyaaaa~~~~!") EOF mkdir -p dir-with-utf8-file cp 猫.scm dir-with-utf8-file/ cat >repro.scm <<'EOF' (use-modules (guix build utils) (guix derivations) (guix gexp) (guix store) (ice-9 ftw) (ice-9 textual-ports)) (define cat "猫") (define (drv-content drv) (call-with-input-file (derivation->output-path drv) get-string-all)) (define (out-content out) (call-with-input-file out get-string-all)) (define (drv-listing drv) (scandir (derivation->output-path drv))) (define (dir-listing dir) (scandir dir)) (define-macro (test exp lower? report) (let ((type (car exp))) `(false-if-exception (let ((drv (with-store %store (run-with-store %store (,(if lower? lower-object identity) ,exp))))) (format #t "~%~a:~%" ',type) (when (with-store %store (build-derivations %store (list drv))) (format #t "~a~%" (,report drv))))))) (test (computed-file "utf8" #~(with-output-to-file #$output (λ _ (display #$cat)))) #t drv-content) (test (program-file "utf8" #~((λ _ (display #$cat)))) #t drv-content) (test (scheme-file "utf8" #~((λ _ (display #$cat)))) #t drv-content) (test (text-file* "utf8" cat cat cat) #f drv-content) (test (compiled-modules '((猫))) #f drv-listing) (test (file-union "utf8" `((,cat ,(plain-file "utf8" cat)))) #t drv-listing) ;;; No fix needed: (test (imported-modules '((猫))) #f dir-listing) (test (local-file "dir-with-utf8-file" #:recursive? #t) #t dir-listing) (test (plain-file "utf8" cat) #t out-content) (test (mixed-text-file "utf8" cat cat cat) #t drv-content) (test (directory-union "utf8" (list (local-file "dir-with-utf8-file" #:recursive? #t))) #t dir-listing) EOF guix shell -CWN -D guix glibc-locales -- \ env LANG=C.UTF-8 ./pre-inst-env guix repl -- ./repro.scm Before this commit, the output is: + '[' -f guix.scm ']' + '[' -f gnu.scm ']' + cat + mkdir -p dir-with-utf8-file + cp 猫.scm dir-with-utf8-file/ + cat + guix shell -CWN -D guix glibc-locales -- env LANG=C.UTF-8 ./pre-inst-env guix repl -- ./repro.scm computed-file: ? program-file: #!/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile --no-auto-compile !# ((? _ (display "\u732b"))) scheme-file: ((? _ (display "\u732b"))) text-file*: ??? compiled-modules: building path(s) `/gnu/store/ay3jifyvliigfgnz67jf0kgngzpya5a5-module-import-compiled' Backtrace: 5 (primitive-load "/gnu/store/rn7b0dq6iqfmmqyqzamix2mjmfy?") In ice-9/eval.scm: 619:8 4 (_ #f) In srfi/srfi-1.scm: 460:18 3 (fold #<procedure 7ffff79245e0 at ice-9/eval.scm:336:1?> ?) In ice-9/eval.scm: 245:16 2 (_ #(#(#<directory (guix build utils) 7ffff779f320>) # ?)) In ice-9/boot-9.scm: 1982:24 1 (_ _) In unknown file: 0 (stat "./???.scm" #<undefined>) ERROR: In procedure stat: In procedure stat: No such file or directory: "./???.scm" builder for `/gnu/store/dxg87135zcd6a1c92dlrkyvxlbhfwfld-module-import-compiled.drv' failed with exit code 1 file-union: (. .. ?) imported-modules: (. .. 猫.scm) local-file: (. .. 猫.scm) plain-file: 猫 mixed-text-file: 猫猫猫 directory-union: (. .. 猫.scm) Which I think you will agree is far from optimal. After my fix the output changes to: + '[' -f guix.scm ']' + '[' -f gnu.scm ']' + cat + mkdir -p dir-with-utf8-file + cp 猫.scm dir-with-utf8-file/ + cat + guix shell -CWN -D guix glibc-locales -- env LANG=C.UTF-8 ./pre-inst-env guix repl -- ./repro.scm computed-file: 猫 program-file: #!/gnu/store/8kbmn359jqkgsbqgqxnmiryvd9ynz8w7-guile-3.0.9/bin/guile --no-auto-compile !# ((λ _ (display "猫"))) scheme-file: ((λ _ (display "猫"))) text-file*: 猫猫猫 compiled-modules: (. .. 猫.go) file-union: (. .. 猫) imported-modules: (. .. 猫.scm) local-file: (. .. 猫.scm) plain-file: 猫 mixed-text-file: 猫猫猫 directory-union: (. .. 猫.scm) Which is actually what the user would expect. I also added missing arguments to the documentation. * guix/gexp.scm (computed-file): Set LANG to C.UTF-8 by default. (compiled-modules): Try to `setlocale'. (gexp->script), (gexp->file): New `locale' argument defaulting to C.UTF-8. (text-file*): Set output port encoding to UTF-8. * doc/guix.texi (G-Expressions)[computed-file]: Document the changes. Use @var. Document #:guile. [gexp->script]: Document #:locale. Fix default value for #:target. [gexp->file]: Document #:locale, #:system and #:target. Change-Id: Ib323b51af88a588b780ff48ddd04db8be7c729fb --- doc/guix.texi | 11 +++++++---- guix/gexp.scm | 24 ++++++++++++++++++------ 2 files changed, 25 insertions(+), 10 deletions(-) diff --git a/doc/guix.texi b/doc/guix.texi index be4868c188..c6bad3f734 100644 --- a/doc/guix.texi +++ b/doc/guix.texi @@ -12234,7 +12234,9 @@ G-Expressions This is the declarative counterpart of @code{text-file}. @end deffn -@deffn {Procedure} computed-file name gexp [#:local-build? #t] [#:options '()] +@deffn {Procedure} computed-file @var{name} @var{gexp} @ + [#:local-build? #t] [#:guile] @ + [#:options '(#:env-vars (("LANG" . "C.UTF-8")))] Return an object representing the store item @var{name}, a file or directory computed by @var{gexp}. When @var{local-build?} is true (the default), the derivation is built locally. @var{options} is a list of @@ -12245,7 +12247,7 @@ G-Expressions @deffn {Monadic Procedure} gexp->script @var{name} @var{exp} @ [#:guile (default-guile)] [#:module-path %load-path] @ - [#:system (%current-system)] [#:target #f] + [#:system (%current-system)] [#:target 'current] [#:locale "C.UTF-8"] Return an executable script @var{name} that runs @var{exp} using @var{guile}, with @var{exp}'s imported modules in its search path. Look up @var{exp}'s modules in @var{module-path}. @@ -12282,8 +12284,9 @@ G-Expressions @deffn {Monadic Procedure} gexp->file @var{name} @var{exp} @ [#:set-load-path? #t] [#:module-path %load-path] @ - [#:splice? #f] @ - [#:guile (default-guile)] + [#:splice? #f] [#:guile (default-guile)] @ + [#:system (%current-system)] [#:target 'current] @ + [#:locale "C.UTF-8"] Return a derivation that builds a file @var{name} containing @var{exp}. When @var{splice?} is true, @var{exp} is considered to be a list of expressions that will be spliced in the resulting file. diff --git a/guix/gexp.scm b/guix/gexp.scm index 74b4c49f90..af266171fd 100644 --- a/guix/gexp.scm +++ b/guix/gexp.scm @@ -584,7 +584,10 @@ (define-record-type <computed-file> (options computed-file-options)) ;list of arguments (define* (computed-file name gexp - #:key guile (local-build? #t) (options '())) + #:key + guile + (local-build? #t) + (options '(#:env-vars (("LANG" . "C.UTF-8"))))) "Return an object representing the store item NAME, a file or directory computed by GEXP. When LOCAL-BUILD? is #t (the default), it ensures the corresponding derivation is built locally. OPTIONS may be used to pass @@ -1687,6 +1690,9 @@ (define* (compiled-modules modules (system base target) (system base compile)) + ;; Best effort. The locale is not installed in all contexts. + (false-if-exception (setlocale LC_ALL "C.UTF-8")) + (define modules (getenv "modules")) @@ -1977,7 +1983,8 @@ (define* (gexp->script name exp #:key (guile (default-guile)) (module-path %load-path) (system (%current-system)) - (target 'current)) + (target 'current) + (locale "C.UTF-8")) "Return an executable script NAME that runs EXP using GUILE, with EXP's imported modules in its search path. Look up EXP's modules in MODULE-PATH." (mlet* %store-monad ((target (if (eq? target 'current) @@ -2020,7 +2027,8 @@ (define* (gexp->script name exp ;; These derivations are not worth offloading or ;; substituting. #:local-build? #t - #:substitutable? #f))) + #:substitutable? #f + #:env-vars `(("LANG" . ,locale))))) (define* (gexp->file name exp #:key (guile (default-guile)) @@ -2028,7 +2036,8 @@ (define* (gexp->file name exp #:key (module-path %load-path) (splice? #f) (system (%current-system)) - (target 'current)) + (target 'current) + (locale "C.UTF-8")) "Return a derivation that builds a file NAME containing EXP. When SPLICE? is true, EXP is considered to be a list of expressions that will be spliced in the resulting file. @@ -2068,7 +2077,8 @@ (define* (gexp->file name exp #:key #:local-build? #t #:substitutable? #f #:system system - #:target target) + #:target target + #:env-vars `(("LANG" . ,locale))) (gexp->derivation name (gexp (call-with-output-file (ungexp output) @@ -2085,7 +2095,8 @@ (define* (gexp->file name exp #:key #:local-build? #t #:substitutable? #f #:system system - #:target target)))) + #:target target + #:env-vars `(("LANG" . ,locale)))))) (define* (text-file* name #:rest text) "Return as a monadic value a derivation that builds a text file containing @@ -2095,6 +2106,7 @@ (define* (text-file* name #:rest text) (define builder (gexp (call-with-output-file (ungexp output "out") (lambda (port) + (set-port-encoding! port "UTF-8") (display (string-append (ungexp-splicing text)) port))))) (gexp->derivation name builder -- 2.41.0
Tomas Volf <~@wolfsden.cz>
to control <at> debbugs.gnu.org
.
(Sun, 06 Oct 2024 15:44:02 GMT) Full text and rfc822 format available.Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Mon, 04 Nov 2024 12:24:15 GMT) Full text and rfc822 format available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.