GNU bug report logs - #26353
GuixSD /tmp cleaner fails to clean when Umlauts like "ä" are used in filenames


Package: guix;

Reported by: Danny Milosavljevic <dannym <at> scratchpost.org>

Date: Mon, 3 Apr 2017 18:57:02 UTC

Severity: important

Tags: patch

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 26353 in the body.
You can then email your comments to 26353 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Mon, 03 Apr 2017 18:57:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Danny Milosavljevic <dannym <at> scratchpost.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 03 Apr 2017 18:57:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: <bug-guix <at> gnu.org>
Subject: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Mon, 3 Apr 2017 20:56:32 +0200
Hi,

the GuixSD /tmp cleaner fails to clean when Umlauts like "ä" are used in filenames.  It will just leave them there.

For example I have an immortal file "/tmp/!x!home!dannym!scratchpost.org!www!mirror!science!physics!03._Relativitätstheorie!.webseealso~".




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Wed, 12 Apr 2017 13:05:02 GMT) Full text and rfc822 format available.

Message #8 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Wed, 12 Apr 2017 15:04:01 +0200
[Message part 1 (text/plain, inline)]
Hi Danny,

Danny Milosavljevic <dannym <at> scratchpost.org> skribis:

> the GuixSD /tmp cleaner fails to clean when Umlauts like "ä" are used in filenames.  It will just leave them there.
>
> For example I have an immortal file "/tmp/!x!home!dannym!scratchpost.org!www!mirror!science!physics!03._Relativitätstheorie!.webseealso~".

The problem is that the “activation scripts” run in the C locale and
thus Guile interprets file names in this locale encoding (i.e., ASCII),
which fails.

I believe the attached patch mostly fixes the problem.  Could you try
and report back?

I say “mostly” because if /tmp contains a file in an encoding other than
that of the system locale, we still have a problem.

Once we’ve switched to Guile 2.2, we should probably force use of an
ISO-8859-1 locale to avoid file name decoding altogether.

Thanks,
Ludo’.

[Message part 2 (text/x-patch, inline)]
diff --git a/gnu/services.scm b/gnu/services.scm
index 9f6e323e1..500724eec 100644
--- a/gnu/services.scm
+++ b/gnu/services.scm
@@ -248,9 +248,9 @@ directory."
   ;; The service that produces the boot script.
   (service boot-service-type #t))
 
-(define (cleanup-gexp _)
+(define (cleanup-gexp locale)
   "Return as a monadic value a gexp to clean up /tmp and similar places upon
-boot."
+boot.  Run with LOCALE to ensure file names are properly decoded."
   (with-monad %store-monad
     (with-imported-modules '((guix build utils))
       (return #~(begin
@@ -272,6 +272,13 @@ boot."
                                                 #t))))
                     ;; Ignore I/O errors so the system can boot.
                     (fail-safe
+                     ;; Guile decodes file names according to the current
+                     ;; locale's encoding so attempt to use an appropriate
+                     ;; locale.  See <https://bugs.gnu.org/26353>.
+                     ;; TODO: With Guile 2.2, choose an ISO-8859-1 locale
+                     ;; to disable decoding altogether.
+                     (setlocale LC_CTYPE #$locale)
+
                      (delete-file-recursively "/tmp")
                      (delete-file-recursively "/var/run")
                      (mkdir "/tmp")
@@ -280,7 +287,8 @@ boot."
                      (chmod "/var/run" #o755))))))))
 
 (define cleanup-service-type
-  ;; Service that cleans things up in /tmp and similar.
+  ;; Service that cleans things up in /tmp and similar.  Its value is the name
+  ;; of a locale to install before traversing these directories.
   (service-type (name 'cleanup)
                 (extensions
                  (list (service-extension boot-service-type
diff --git a/gnu/system.scm b/gnu/system.scm
index 0f52351cf..5e0d2db7d 100644
--- a/gnu/system.scm
+++ b/gnu/system.scm
@@ -309,7 +309,8 @@ a container or that of a \"bare metal\" system."
            ;; activation code.
            %shepherd-root-service
            %activation-service
-           (service cleanup-service-type #f)
+           (service cleanup-service-type
+                    (operating-system-locale os))
 
            (pam-root-service (operating-system-pam-services os))
            (account-service (append (operating-system-accounts os)
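
The decoding failure described in the message above can be reproduced without touching the file system.  The following is a minimal sketch, not code from the patch: the byte sequence and variable names are made up for illustration, and it assumes a Guile that provides the (ice-9 iconv) module.  It decodes the same UTF-8 bytes as UTF-8, as ASCII (the encoding of the C locale), and as ISO-8859-1.

```scheme
;; Sketch only: why a UTF-8 file name does not survive an ASCII decode.
(use-modules (rnrs bytevectors)   ; u8-list->bytevector
             (ice-9 iconv))       ; bytevector->string, string->bytevector

;; Hypothetical name "mär" as the kernel stores it: "ä" is the two
;; bytes #xc3 #xa4 in UTF-8.
(define name-bytes (u8-list->bytevector '(#x6d #xc3 #xa4 #x72)))

;; Decoded with the matching encoding, the name has three characters.
(define decoded-utf8 (bytevector->string name-bytes "UTF-8"))

;; Decoded as ASCII, the two non-ASCII bytes cannot be represented.
;; With the 'substitute strategy they are replaced by placeholder
;; characters, so the resulting string no longer names the file.
(define decoded-ascii
  (bytevector->string name-bytes "ASCII" 'substitute))

;; ISO-8859-1 assigns a character to every byte value, so decoding
;; cannot fail and encoding the string again yields the original bytes.
(define decoded-latin1 (bytevector->string name-bytes "ISO-8859-1"))
(define latin1-round-trip
  (string->bytevector decoded-latin1 "ISO-8859-1"))
```

A name mangled like decoded-ascii would plausibly explain a "No such file" error on deletion, since the string handed back to the kernel differs from the bytes on disk.  The ISO-8859-1 round trip is the property the "avoid file name decoding altogether" remark relies on.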

Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Sat, 22 Apr 2017 23:32:02 GMT) Full text and rfc822 format available.

Message #11 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Sun, 23 Apr 2017 01:30:56 +0200
Hello,

Did you have a chance to look at this patch?

TIA,
Ludo’.

ludo <at> gnu.org (Ludovic Courtès) skribis:

> Hi Danny,
>
> Danny Milosavljevic <dannym <at> scratchpost.org> skribis:
>
>> the GuixSD /tmp cleaner fails to clean when Umlauts like "ä" are used in filenames.  It will just leave them there.
>>
>> For example I have an immortal file "/tmp/!x!home!dannym!scratchpost.org!www!mirror!science!physics!03._Relativitätstheorie!.webseealso~".
>
> The problem is that the “activation scripts” run in the C locale and
> thus Guile interprets file names in this locale encoding (i.e., ASCII),
> which fails.
>
> I believe the attached patch mostly fixes the problem.  Could you try
> and report back?
>
> I say “mostly” because if /tmp contains a file in an encoding other than
> that of the system locale, we still have a problem.
>
> Once we’ve switched to Guile 2.2, we should probably force use of an
> ISO-8859-1 locale to avoid file name decoding altogether.
>
> Thanks,
> Ludo’.
>
> diff --git a/gnu/services.scm b/gnu/services.scm
> index 9f6e323e1..500724eec 100644
> --- a/gnu/services.scm
> +++ b/gnu/services.scm
> @@ -248,9 +248,9 @@ directory."
>    ;; The service that produces the boot script.
>    (service boot-service-type #t))
>  
> -(define (cleanup-gexp _)
> +(define (cleanup-gexp locale)
>    "Return as a monadic value a gexp to clean up /tmp and similar places upon
> -boot."
> +boot.  Run with LOCALE to ensure file names are properly decoded."
>    (with-monad %store-monad
>      (with-imported-modules '((guix build utils))
>        (return #~(begin
> @@ -272,6 +272,13 @@ boot."
>                                                  #t))))
>                      ;; Ignore I/O errors so the system can boot.
>                      (fail-safe
> +                     ;; Guile decodes file names according to the current
> +                     ;; locale's encoding so attempt to use an appropriate
> +                     ;; locale.  See <https://bugs.gnu.org/26353>.
> +                     ;; TODO: With Guile 2.2, choose an ISO-8859-1 locale
> +                     ;; to disable decoding altogether.
> +                     (setlocale LC_CTYPE #$locale)
> +
>                       (delete-file-recursively "/tmp")
>                       (delete-file-recursively "/var/run")
>                       (mkdir "/tmp")
> @@ -280,7 +287,8 @@ boot."
>                       (chmod "/var/run" #o755))))))))
>  
>  (define cleanup-service-type
> -  ;; Service that cleans things up in /tmp and similar.
> +  ;; Service that cleans things up in /tmp and similar.  Its value is the name
> +  ;; of a locale to install before traversing these directories.
>    (service-type (name 'cleanup)
>                  (extensions
>                   (list (service-extension boot-service-type
> diff --git a/gnu/system.scm b/gnu/system.scm
> index 0f52351cf..5e0d2db7d 100644
> --- a/gnu/system.scm
> +++ b/gnu/system.scm
> @@ -309,7 +309,8 @@ a container or that of a \"bare metal\" system."
>             ;; activation code.
>             %shepherd-root-service
>             %activation-service
> -           (service cleanup-service-type #f)
> +           (service cleanup-service-type
> +                    (operating-system-locale os))
>  
>             (pam-root-service (operating-system-pam-services os))
>             (account-service (append (operating-system-accounts os)




Added tag(s) patch. Request was from ludo <at> gnu.org (Ludovic Courtès) to control <at> debbugs.gnu.org. (Sat, 22 Apr 2017 23:32:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Sun, 23 Apr 2017 00:15:01 GMT) Full text and rfc822 format available.

Message #16 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Sun, 23 Apr 2017 02:14:48 +0200
Hi Ludo,

I've applied it, but the system update is still running.  Because of the massive number of patches I write I don't track master daily. I'm always behind 2 weeks (because that's the time until I can merge a patch).  It seems lately a huge update got merged :)

Right now it's compiling qtbase from source locally (not Hydra - no idea why).

70 GiB non-home root partition also seems to be too small for it all. I have to do guix gc quite often - I'll have to repartition sometime.

texlive finally downloaded correctly *shrugs*.  Texlive is really getting on my nerves - isn't it possible to modularize it more?  Also, one shouldn't require 2 GiB for a word processor and DTP. *mumble mumble*

But I will test the tmp cleaner, it will just take some time.




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Sun, 23 Apr 2017 02:04:02 GMT) Full text and rfc822 format available.

Message #19 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Sun, 23 Apr 2017 04:03:01 +0200
On Sun, 23 Apr 2017 01:30:56 +0200
ludo <at> gnu.org (Ludovic Courtès) wrote:

> Did you have a chance to look at this patch?

Hmm, guix system reconfigure finished with the patch, I rebooted, and I get the same error message (No such file) and the file is still there.

My operating-system locale is en_US.UTF-8.




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Mon, 01 May 2017 14:52:02 GMT) Full text and rfc822 format available.

Message #22 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: TeX Live
Date: Mon, 01 May 2017 16:51:06 +0200
Hi Danny!

Danny Milosavljevic <dannym <at> scratchpost.org> skribis:

> 70 GiB non-home root partition also seems to be too small for it all. I have to do guix gc quite often - I'll have to repartition sometime.
>
> texlive finally downloaded correctly *shrugs*.  Texlive is really getting on my nerves - isn't it possible to modularize it more?  Also, one shouldn't require 2 GiB for a word processor and DTP. *mumble mumble*

TeX Live is getting on everybody’s nerves.  :-)

There are ways to turn it into a zillion packages from CTAN, which is
what Nixpkgs did.  Ricardo (I think?) had some thoughts as to how to
achieve this and I would really like to see it happen.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Mon, 01 May 2017 15:12:01 GMT) Full text and rfc822 format available.

Message #25 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: Marius Bakke <mbakke <at> fastmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>, Danny Milosavljevic
 <dannym <at> scratchpost.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: TeX Live
Date: Mon, 01 May 2017 17:11:26 +0200
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi Danny!
>
> Danny Milosavljevic <dannym <at> scratchpost.org> skribis:
>
>> 70 GiB non-home root partition also seems to be too small for it all. I have to do guix gc quite often - I'll have to repartition sometime.
>>
>> texlive finally downloaded correctly *shrugs*.  Texlive is really getting on my nerves - isn't it possible to modularize it more?  Also, one shouldn't require 2 GiB for a word processor and DTP. *mumble mumble*
>
> TeX Live is getting on everybody’s nerves.  :-)
>
> There are ways to turn it into a zillion packages from CTAN, which is
> what Nixpkgs did.  Ricardo (I think?) had some thoughts as to how to
> achieve this and I would really like to see it happen.

That would be great. I miss this snippet from my ~/.nixpkgs/config.nix:

    myTex = pkgs.texlive.combine {
      inherit (texlive) scheme-small marginnote sectsty cm-super enumitem
      xifthen ifmtarg unicode-math filehook collection-fontsrecommended
      collection-fontsextra libertine gentium-tug ucharcat;
    };

Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Mon, 01 May 2017 21:00:02 GMT) Full text and rfc822 format available.

Message #28 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Mon, 01 May 2017 22:59:22 +0200
Hi,

Danny Milosavljevic <dannym <at> scratchpost.org> skribis:

> On Sun, 23 Apr 2017 01:30:56 +0200
> ludo <at> gnu.org (Ludovic Courtès) wrote:
>
>> Did you have a chance to look at this patch?
>
> Hmm, guix system reconfigure finished with the patch, I rebooted, and I get the same error message (No such file) and the file is still there.

Indeed, I just realized that the cleanup code runs before
/run/current-system has been created; thus it does not have access to
locale data and ‘setlocale’ fails.

I cannot think of a nice way to address this unfortunately.  :-(

The problem of how to deal with file name encoding has been discussed on
the Guile side so hopefully the next release in the 2.2 series will have
a solution for this.

Ludo’.
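
A rough sketch of how this failure mode can be observed, assuming (as far as I know) that Guile's 'setlocale' raises 'system-error' when the requested locale cannot be installed.  A caller that ignores such errors, as the cleanup code appears to do via 'fail-safe', would then silently stay in the C locale.  The helper name and the bogus locale name are hypothetical.

```scheme
;; Sketch only: report whether a locale could be installed instead of
;; letting the failure pass unnoticed.
(define (install-ctype-locale locale)
  "Try to install LOCALE for LC_CTYPE.  Return #t on success and #f if
the locale is unavailable, for instance because its data is missing."
  (catch 'system-error
    (lambda ()
      (setlocale LC_CTYPE locale)
      #t)
    (lambda args
      #f)))

;; The C locale is always available.
(define c-locale-ok? (install-ctype-locale "C"))

;; A locale without installed data cannot be set; this name is made up.
(define bogus-locale-ok? (install-ctype-locale "xx_YY.no-such-charset"))
```

Checking the return value (or logging it) at boot would at least make it visible that the requested locale was never in effect.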




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Mon, 01 May 2017 21:25:01 GMT) Full text and rfc822 format available.

Message #31 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Marius Bakke <mbakke <at> fastmail.com>
Cc: 26353 <at> debbugs.gnu.org, Danny Milosavljevic <dannym <at> scratchpost.org>
Subject: Re: bug#26353: TeX Live
Date: Mon, 01 May 2017 23:24:14 +0200
Marius Bakke <mbakke <at> fastmail.com> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Hi Danny!
>>
>> Danny Milosavljevic <dannym <at> scratchpost.org> skribis:
>>
>>> 70 GiB non-home root partition also seems to be too small for it all. I have to do guix gc quite often - I'll have to repartition sometime.
>>>
>>> texlive finally downloaded correctly *shrugs*.  Texlive is really getting on my nerves - isn't it possible to modularize it more?  Also, one shouldn't require 2 GiB for a word processor and DTP. *mumble mumble*
>>
>> TeX Live is getting on everybody’s nerves.  :-)
>>
>> There are ways to turn it into a zillion packages from CTAN, which is
>> what Nixpkgs did.  Ricardo (I think?) had some thoughts as to how to
>> achieve this and I would really like to see it happen.
>
> That would be great. I miss this snippet from my ~/.nixpkgs/config.nix:
>
>     myTex = pkgs.texlive.combine {
>       inherit (texlive) scheme-small marginnote sectsty cm-super enumitem
>       xifthen ifmtarg unicode-math filehook collection-fontsrecommended
>       collection-fontsextra libertine gentium-tug ucharcat;
>     };

Yeah, I agree.  Hopefully a profile hook could do what
pkgs.texlive.combine does, which would make it more convenient.  Then we
also need a CTAN importer…

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Tue, 02 May 2017 06:33:02 GMT) Full text and rfc822 format available.

Message #34 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26353 <at> debbugs.gnu.org, Danny Milosavljevic <dannym <at> scratchpost.org>
Subject: Re: bug#26353: TeX Live
Date: Tue, 02 May 2017 08:31:59 +0200
Ludovic Courtès <ludo <at> gnu.org> writes:

> There are ways to turn it into a zillion packages from CTAN, which is
> what Nixpkgs did.  Ricardo (I think?) had some thoughts as to how to
> achieve this and I would really like to see it happen.

Yeah, it’s true, I wanted to work on this, but … it hasn’t happened yet :)
I’d be happy if someone could help us out here.

-- 
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net





Severity set to 'important' from 'normal' Request was from ludo <at> gnu.org (Ludovic Courtès) to control <at> debbugs.gnu.org. (Mon, 08 May 2017 14:32:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Thu, 01 Jun 2017 10:58:01 GMT) Full text and rfc822 format available.

Message #39 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 26353 <at> debbugs.gnu.org
Subject: VFS name encoding
Date: Thu, 1 Jun 2017 12:57:43 +0200
Hi Ludo,

> The problem of how to deal with file name encoding has been discussed on
> the Guile side so hopefully the next release in the 2.2 series will have
> a solution for this.

For what it's worth, I think the sane solution is the Plan 9 solution:  Just represent file names as bytevectors.  Programs which don't care about the actual name - for example programs that just want to do (for-each unlink (scandir (string->utf8 "."))) or something - have no reason to care about the encoding at all.  And then use UTF-8 encoding everywhere (for the file names, also for everything else) throughout the operating system for the tools that do care.

There are also utf8 mount options in the Linux kernel to be able to present UTF-8 names to userspace even when the actual names on disk are something else - and we should use them.  (I think we should even modify <file-system> flags to default to "utf8" or "iocharset=utf8" where possible)

This conversion of UTF-8 to UCS-4 especially is really just busywork.  My opinion changed over the years - earlier I was all for UCS-4.  But actually, most tools don't care about the actual content of the file names - it's just an opaque ID to them (similar to a UUID).  Representing them as something else in userspace again (inviting another conversion failure) is just ... unnecessary.

In any case, it would be different if we had a non-UNIX kernel underneath.  But as long as we do have UNIX the kernel VFS interface expects bytevectors, preferably interpreted as UTF-8 (if interpreted at all).

I think this is also the consensus among the major Linux distributions and also among lowlevel libraries like glib: They assume one is using UTF-8 filenames and default to it wherever possible.
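
For illustration, a file system declaration carrying such a mount option might look like the following.  This is only a sketch: the device and mount point are hypothetical, and it assumes that the 'options' field of <file-system> is handed to the kernel as the mount option string.  'utf8' is an option of the vfat driver; other file systems may expect "iocharset=utf8" instead.

```scheme
;; Sketch only: a <file-system> declaration that asks the kernel to
;; present file names to user space as UTF-8.
(use-modules (gnu system file-systems))

(define %usb-stick                ; hypothetical variable name
  (file-system
    (device "/dev/sdb1")          ; hypothetical device
    (mount-point "/mnt/usb")      ; hypothetical mount point
    (type "vfat")
    (options "utf8")))            ; vfat mount option, see mount(8)
```

Such a record would go in the 'file-systems' field of an operating-system declaration.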




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Thu, 01 Jun 2017 11:18:01 GMT) Full text and rfc822 format available.

Message #42 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ng0 <ng0 <at> pragmatique.xyz>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26353 <at> debbugs.gnu.org, Marius Bakke <mbakke <at> fastmail.com>
Subject: Re: bug#26353: TeX Live
Date: Thu, 1 Jun 2017 11:17:15 +0000
[Message part 1 (text/plain, inline)]
Ludovic Courtès transcribed 1.3K bytes:
> Marius Bakke <mbakke <at> fastmail.com> skribis:
> 
> > Ludovic Courtès <ludo <at> gnu.org> writes:
> >
> >> Hi Danny!
> >>
> >> Danny Milosavljevic <dannym <at> scratchpost.org> skribis:
> >>
> >>> 70 GiB non-home root partition also seems to be too small for it all. I have to do guix gc quite often - I'll have to repartition sometime.
> >>>
> >>> texlive finally downloaded correctly *shrugs*.  Texlive is really getting on my nerves - isn't it possible to modularize it more?  Also, one shouldn't require 2 GiB for a word processor and DTP. *mumble mumble*
> >>
> >> TeX Live is getting on everybody’s nerves.  :-)
> >>
> >> There are ways to turn it into a zillion packages from CTAN, which is
> >> what Nixpkgs did.  Ricardo (I think?) had some thoughts as to how to
> >> achieve this and I would really like to see it happen.
> >
> > That would be great. I miss this snippet from my ~/.nixpkgs/config.nix:
> >
> >     myTex = pkgs.texlive.combine {
> >       inherit (texlive) scheme-small marginnote sectsty cm-super enumitem
> >       xifthen ifmtarg unicode-math filehook collection-fontsrecommended
> >       collection-fontsextra libertine gentium-tug ucharcat;
> >     };
> 
> Yeah, I agree.  Hopefully a profile hook could do what
> pkgs.texlive.combine does, which would make it more convenient.  Then we
> also need a CTAN importer…
> 
> Ludo’.
> 
> 
> 

Importers for the importer deity!

But seriously, whoever manages to split Texlive up gets a free non-alcoholic
drink if I should ever meet the person at a conference or somewhere else!
-- 
ng0
OpenPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588

Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Thu, 01 Jun 2017 11:29:01 GMT) Full text and rfc822 format available.

Message #45 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: VFS name encoding
Date: Thu, 01 Jun 2017 13:28:27 +0200
Hi Danny,

Danny Milosavljevic <dannym <at> scratchpost.org> skribis:

>> The problem of how to deal with file name encoding has been
>> discussed on the Guile side so hopefully the next release in the 2.2
>> series will have a solution for this.
>
> For what it's worth, I think the sane solution is the Plan 9 solution:
> Just represent file names as bytevectors.  Programs which don't care
> about the actual name - for example programs that just want to do
> (for-each unlink (scandir (string->utf8 "."))) or something - have no
> reason to care about the encoding at all.  And then use UTF-8 encoding
> everywhere (for the file names, also for everything else) throughout
> the operating system for the tools that do care.

FWIW the problem has been discussed at length in Guile land, although I
don’t think anyone has come up with a complete solution yet.

I think it’s natural to represent file names as strings, but we made a
mistake in 2.0 when we assumed we’d basically always be able to decode
file names using the current locale “on sane systems”.  So now we need a
way to represent file names that cannot be decoded while preserving
backward compatibility.

To be continued!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Fri, 02 Jun 2017 08:33:02 GMT) Full text and rfc822 format available.

Message #48 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: TeX Live
Date: Fri, 02 Jun 2017 10:32:01 +0200
Ricardo Wurmus <rekado <at> elephly.net> writes:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> There are ways to turn it into a zillion packages from CTAN, which is
>> what Nixpkgs did.  Ricardo (I think?) had some thoughts as to how to
>> achieve this and I would really like to see it happen.
>
> Yeah, it’s true, I wanted to work on this, but … it hasn’t happened yet :)
> I’d be happy if someone could help us out here.

So… I already have a Texlive importer that fetches things from SVN
(because the tarballs on CTAN are not versioned).

The texmf-dist tarball actually seems to include a couple of generated
files (such as latex.ltx), which need to be bootstrapped with initex
first.  I’ve already made some progress on this end, but I need to first
build a few metafont fonts.

The hardest part here is to override search paths and figure out
dependencies.  This is a very slow process right now, because it’s
mainly error-driven.

I’m close to finishing the bootstrap of latex-base.  Once that’s done I
should be able to finish the texlive-build-system, and then I’ll try
building the other latex packages that are distributed with Texlive.
There’s more to Texlive (e.g. xetex packages), but I’ll take care of
that later.

One thing that’s still unknown at this point is how the profile hook
should work, but I’ll figure this out as I learn more about the search
paths and the like.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net





Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Fri, 02 Jun 2017 15:08:02 GMT) Full text and rfc822 format available.

Message #51 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: TeX Live
Date: Fri, 02 Jun 2017 17:06:49 +0200
Hi Ricardo,

Ricardo Wurmus <rekado <at> elephly.net> skribis:

> So… I already have a Texlive importer that fetches things from SVN
> (because the tarballs on CTAN are not versioned).

Awesome!

> The texmf-dist tarball actually seems to include a couple of generated
> files (such as latex.ltx), which need to be bootstrapped with initex
> first.  I’ve already made some progress on this end, but I need to first
> build a few metafont fonts.
>
> The hardest part here is to override search paths and figure out
> dependencies.  This is a very slow process right now, because it’s
> mainly error-driven.

Yeah kpathsea and all that.

> I’m close to finishing the bootstrap of latex-base.  Once that’s done I
> should be able to finish the texlive-build-system, and then I’ll try
> building the other latex packages that are distributed with Texlive.
> There’s more to Texlive (e.g. xetex packages), but I’ll take care of
> that later.
>
> One thing that’s still unknown at this point is how the profile hook
> should work, but I’ll figure this out as I learn more about the search
> paths and the like.

OK, we’ll see.

Thank you for this brave effort!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Sat, 03 Jun 2017 19:15:02 GMT) Full text and rfc822 format available.

Message #54 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: TeX Live
Date: Sat, 03 Jun 2017 21:14:42 +0200
Ludovic Courtès <ludo <at> gnu.org> writes:

> Ricardo Wurmus <rekado <at> elephly.net> skribis:
>
>> So… I already have a Texlive importer that fetches things from SVN
>> (because the tarballs on CTAN are not versioned).
>
> Awesome!
>
>> The texmf-dist tarball actually seems to include a couple of generated
>> files (such as latex.ltx), which need to be bootstrapped with initex
>> first.  I’ve already made some progress on this end, but I need to first
>> build a few metafont fonts.
>>
>> The hardest part here is to override search paths and figure out
>> dependencies.  This is a very slow process right now, because it’s
>> mainly error-driven.
>
> Yeah kpathsea and all that.
>
>> I’m close to finishing the bootstrap of latex-base.  Once that’s done I
>> should be able to finish the texlive-build-system, and then I’ll try
>> building the other latex packages that are distributed with Texlive.
>> There’s more to Texlive (e.g. xetex packages), but I’ll take care of
>> that later.
>>
>> One thing that’s still unknown at this point is how the profile hook
>> should work, but I’ll figure this out as I learn more about the search
>> paths and the like.
>
> OK, we’ll see.

I submitted a new bug for this:

    http://debbugs.gnu.org/cgi/bugreport.cgi?bug=27217

…because this allows us the satisfaction of closing this bug once it’s
done; and because we can keep track of progress there.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net





Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Thu, 14 Dec 2017 22:30:02 GMT) Full text and rfc822 format available.

Message #57 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Thu, 14 Dec 2017 23:28:57 +0100
> The problem of how to deal with file name encoding has been discussed on
> the Guile side so hopefully the next release in the 2.2 series will have
> a solution for this.

Hmm, any news on this?  I've again got some immortal files in /tmp ...




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Fri, 15 Dec 2017 10:28:02 GMT) Full text and rfc822 format available.

Message #60 received at 26353 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Fri, 15 Dec 2017 11:27:49 +0100
[Message part 1 (text/plain, inline)]
Danny Milosavljevic <dannym <at> scratchpost.org> skribis:

>> The problem of how to deal with file name encoding has been discussed on
>> the Guile side so hopefully the next release in the 2.2 series will have
>> a solution for this.
>
> Hmm, any news on this?  I've again got some immortal files in /tmp ...

I’m afraid no.  Months ago a solution was proposed on the Guile side but
not implemented.

I tried the attached workaround, which attempts to use UTF-8-only
syscall wrappers for the task.  Unfortunately it doesn’t work because
the cleanup code runs on the initrd’s statically-linked Guile, where
‘dynamic-link’ calls used in (guix build syscalls) fail.  :-/

Ideas?

Ludo’.

[Message part 2 (text/x-patch, inline)]
diff --git a/gnu/services.scm b/gnu/services.scm
index 016ff08e0..7d9fd132f 100644
--- a/gnu/services.scm
+++ b/gnu/services.scm
@@ -361,9 +361,12 @@ directory."
   "Return as a monadic value a gexp to clean up /tmp and similar places upon
 boot."
   (with-monad %store-monad
-    (with-imported-modules '((guix build utils))
+    (with-imported-modules (source-module-closure
+                            '((guix build utils)
+                              (guix build syscalls)))
       (return #~(begin
-                  (use-modules (guix build utils))
+                  (use-modules (guix build utils)
+                               (guix build syscalls))
 
                   ;; Clean out /tmp and /var/run.
                   ;;
@@ -387,8 +390,12 @@ boot."
                      (delete-file "/etc/passwd.lock")
                      (delete-file "/etc/.pwd.lock") ;from 'lckpwdf'
 
-                     (delete-file-recursively "/tmp")
-                     (delete-file-recursively "/var/run")
+                     ;; Assume file names are UTF-8 encoded.  See
+                     ;; <https://bugs.gnu.org/26353>.
+                     (with-utf8-file-names
+                      (delete-file-recursively "/tmp")
+                      (delete-file-recursively "/var/run"))
+
                      (mkdir "/tmp")
                      (chmod "/tmp" #o1777)
                      (mkdir "/var/run")
diff --git a/gnu/tests/base.scm b/gnu/tests/base.scm
index 1bc7a7027..3cec5af7f 100644
--- a/gnu/tests/base.scm
+++ b/gnu/tests/base.scm
@@ -29,6 +29,8 @@
   #:use-module (gnu services mcron)
   #:use-module (gnu services shepherd)
   #:use-module (gnu services networking)
+  #:use-module (gnu packages base)
+  #:use-module (gnu packages bash)
   #:use-module (gnu packages imagemagick)
   #:use-module (gnu packages ocr)
   #:use-module (gnu packages package-management)
@@ -36,11 +38,13 @@
   #:use-module (gnu packages tmux)
   #:use-module (guix gexp)
   #:use-module (guix store)
+  #:use-module (guix monads)
   #:use-module (guix packages)
   #:use-module (srfi srfi-1)
   #:export (run-basic-test
             %test-basic-os
             %test-halt
+            %test-cleanup
             %test-mcron
             %test-nss-mdns))
 
@@ -476,6 +480,67 @@ in a loop.  See <http://bugs.gnu.org/26931>.")
       (run-halt-test (virtual-machine os))))))
 
 
+;;;
+;;; Cleanup of /tmp, /var/run, etc.
+;;;
+
+
+(define %cleanup-os
+  (simple-operating-system
+   (simple-service 'dirty-things
+                   boot-service-type
+                   (with-monad %store-monad
+                     (let ((script (plain-file
+                                    "create-utf8-file.sh"
+                                    "exec touch /tmp/{λαμβδα,witness}")))
+                       (with-imported-modules '((guix build utils))
+                         (return #~(begin
+                                     (setenv "PATH"
+                                             #$(file-append coreutils "/bin"))
+                                     (invoke #$(file-append bash "/bin/sh")
+                                             #$script)))))))))
+
+(define (run-cleanup-test name)
+  (define os
+    (marionette-operating-system %cleanup-os
+                                 #:imported-modules '((gnu services herd)
+                                                      (guix combinators))))
+  (define test
+    (with-imported-modules '((gnu build marionette))
+      #~(begin
+          (use-modules (gnu build marionette)
+                       (srfi srfi-64)
+                       (ice-9 match))
+
+          (define marionette
+            (make-marionette (list #$(virtual-machine os))))
+
+          (mkdir #$output)
+          (chdir #$output)
+
+          (test-begin "cleanup")
+
+          (test-assert "dirty service worked"
+            (marionette-eval '(file-exists? "/witness") marionette))
+
+          (test-equal "/tmp cleaned up"
+            2
+            (marionette-eval '(stat:nlink (stat "/tmp")) marionette))
+
+          (test-end)
+          (exit (= (test-runner-fail-count (test-runner-current)) 0)))))
+
+  (gexp->derivation "cleanup" test))
+
+(define %test-cleanup
+  ;; See <https://bugs.gnu.org/26353>.
+  (system-test
+   (name "cleanup")
+   (description "Make sure the 'cleanup' service can remove files with
+non-ASCII names from /tmp.")
+   (value (run-cleanup-test name))))
+
+
 ;;;
 ;;; Mcron.
 ;;;
diff --git a/guix/build/syscalls.scm b/guix/build/syscalls.scm
index 0cb630cfb..ac27fb5d6 100644
--- a/guix/build/syscalls.scm
+++ b/guix/build/syscalls.scm
@@ -71,6 +71,7 @@
             fdatasync
             pivot-root
             scandir*
+            with-utf8-file-names
             fcntl-flock
 
             set-thread-name
@@ -995,6 +996,35 @@ system to PUT-OLD."
       (lambda ()
         (closedir* directory)))))
 
+(define delete-file*
+  (let ((proc (syscall->procedure int "unlike" '(*))))
+    (lambda* (file #:optional (string->pointer string->pointer/utf-8))
+      (proc (string->pointer file)))))
+
+(define* (call-with-utf8-file-names thunk)
+  (let ((real-delete-file delete-file)
+        (real-opendir     opendir)
+        (real-readdir     readdir))
+    (dynamic-wind
+      (lambda ()
+        (set! delete-file delete-file*)
+        (set! opendir opendir*)
+        (set! readdir readdir*))
+      thunk
+      (lambda ()
+        (set! delete-file real-delete-file)
+        (set! opendir real-opendir)
+        (set! readdir real-readdir)))))
+
+(define-syntax-rule (with-utf8-file-names body ...)
+  "Evaluate BODY in a context where *some* of the core file system bindings
+have been replaced with variants that assume file names are UTF-8-encoded
+instead of locale-encoded.
+
+This hack is meant to address <https://bugs.gnu.org/26353>.  Use with care,
+and only in a single-threaded context!"
+  (call-with-utf8-file-names (lambda () body ...)))
+
 
 ;;;
 ;;; Advisory file locking.
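
As a rough illustration of the failure mode that with-utf8-file-names tries
to sidestep, here is a minimal sketch (not part of the patch).  It round-trips
a UTF-8 file name through an ASCII-only decoding, which is an assumption about
what the locale-encoded bindings effectively do in an early-boot "C" locale.
It relies only on Guile's (ice-9 iconv) and (rnrs bytevectors) modules; the
variable names are made up for this example.

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))

;; The bytes of a UTF-8 file name, as they are stored in the directory.
(define original
  (string->utf8 "03._Relativitätstheorie"))

;; Decode those bytes as ASCII, substituting anything that cannot be
;; decoded.  This stands in for a locale-encoded directory read in an
;; ASCII-only locale (an assumption, see above).
(define decoded
  (bytevector->string original "ASCII" 'substitute))

;; Encode the decoded string again, as would happen before the name is
;; passed back to the kernel for deletion.
(define re-encoded
  (string->bytevector decoded "ASCII" 'substitute))

;; True only if the name survived the round trip byte for byte.
(define round-trip-lossless?
  (bytevector=? original re-encoded))

(display round-trip-lossless?)
(newline)
```

If the two bytevectors differ, the name handed back for deletion no longer
matches the directory entry, which would be consistent with the files being
left behind.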

Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Sat, 09 Jun 2018 09:31:02 GMT)

Message #63 received at 26353 <at> debbugs.gnu.org:

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 26353 <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Sat, 9 Jun 2018 11:30:20 +0200
Hi Ludo,

> +(define delete-file*
> +  (let ((proc (syscall->procedure int "unlike" '(*))))

Typo.  Should be "unlink".

> +    (lambda* (file #:optional (string->pointer string->pointer/utf-8))
> +      (proc (string->pointer file)))))

> Ideas?

Well, we could always include a special wrapper in guile-static - like we do
for load-linux-module/fd.

That way, it is included in the statically linked guile executable.
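
For reference, a self-contained sketch of the corrected wrapper, assuming a
dynamically linked Guile.  It uses (system foreign) directly instead of the
syscall->procedure helper internal to (guix build syscalls), and the name
unlink/utf-8 is hypothetical.  The (dynamic-link) call is the kind of call
reported earlier in this thread to fail on the initrd's statically linked
Guile.

```scheme
(use-modules (system foreign))

;; Hypothetical name; binds the C library's unlink function via the FFI.
(define unlink/utf-8
  (let ((proc (pointer->procedure int
                                  ;; Look up "unlink" among the symbols
                                  ;; already loaded into the process.
                                  (dynamic-func "unlink" (dynamic-link))
                                  '(*))))
    (lambda (file)
      ;; Encode FILE as UTF-8 regardless of the current locale and
      ;; return the C function's integer result (0 on success, -1 on
      ;; failure).
      (proc (string->pointer file "UTF-8")))))
```

A caller would still have to check the return value itself, since this
sketch does not turn a failure into a Scheme exception.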




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Tue, 19 Jun 2018 20:18:02 GMT)

Message #66 received at 26353 <at> debbugs.gnu.org:

From: Nils Gillmann <ng0 <at> n0.is>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 26353 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like "ä" are used in filenames
Date: Tue, 19 Jun 2018 20:17:35 +0000
Danny Milosavljevic transcribed 249 bytes:
> > The problem of how to deal with file name encoding has been discussed on
> > the Guile side so hopefully the next release in the 2.2 series will have
> > a solution for this.
> 
> Hmm, any news on this?  I've again got some immortal files in /tmp ...

Did it ever work for you? I can't recall a single time in my years with
GuixSD when /tmp was cleaned. It was only when I started reading more
system specific code that I found out that the lack of /tmp cleaning
on shutdown is not a default.




Information forwarded to bug-guix <at> gnu.org:
bug#26353; Package guix. (Tue, 19 Jun 2018 20:48:01 GMT)

Message #69 received at 26353 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Nils Gillmann <ng0 <at> n0.is>
Cc: 26353 <at> debbugs.gnu.org, Danny Milosavljevic <dannym <at> scratchpost.org>
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Tue, 19 Jun 2018 22:47:39 +0200
Nils Gillmann <ng0 <at> n0.is> wrote:

> Danny Milosavljevic transcribed 249 bytes:
>> > The problem of how to deal with file name encoding has been discussed on
>> > the Guile side so hopefully the next release in the 2.2 series will have
>> > a solution for this.
>> 
>> Hmm, any news on this?  I've again got some immortal files in /tmp ...
>
> Did it ever work for you? I can't recall a single time in my years with
> GuixSD when /tmp was cleaned. It was only when I started reading more
> system specific code that I found out that the lack of /tmp cleaning
> on shutdown is not a default.

This bug report is about the specific case where it doesn’t work.  :-)

Ludo’.




Reply sent to ludo <at> gnu.org (Ludovic Courtès):
You have taken responsibility. (Wed, 20 Jun 2018 08:08:02 GMT)

Notification sent to Danny Milosavljevic <dannym <at> scratchpost.org>:
bug acknowledged by developer. (Wed, 20 Jun 2018 08:08:02 GMT)

Message #74 received at 26353-done <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 26353-done <at> debbugs.gnu.org
Subject: Re: bug#26353: GuixSD /tmp cleaner fails to clean when Umlauts like
 "ä" are used in filenames
Date: Wed, 20 Jun 2018 10:07:41 +0200
Hello!

Finally fixed with commit 76c321d8e85683091ecbcd3afe8c56fb7c45c00a.
I opted for a simpler approach (and I wonder why it didn’t come to mind
earlier than this…).

Thanks for your patience, and bye bye immortal files!  :-)

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 18 Jul 2018 11:24:05 GMT)

This bug report was last modified 5 years and 277 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.