GNU bug report logs - #33643
[PATCH] gnu-build-system: Enable xz to decompress in parallel.

Package: guix-patches;

Reported by: Christopher Baines <mail <at> cbaines.net>

Date: Thu, 6 Dec 2018 07:57:02 UTC

Severity: normal

Tags: patch

Done: Christopher Baines <mail <at> cbaines.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 33643 in the body.
You can then email your comments to 33643 AT debbugs.gnu.org in the normal way.


Report forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Thu, 06 Dec 2018 07:57:02 GMT)

Acknowledgement sent to Christopher Baines <mail <at> cbaines.net>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Thu, 06 Dec 2018 07:57:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Christopher Baines <mail <at> cbaines.net>
To: guix-patches <at> gnu.org
Subject: [PATCH] gnu-build-system: Enable xz to decompress in parallel.
Date: Thu,  6 Dec 2018 07:56:15 +0000
It can take a little while to decompress some packages with large xz
compressed source tar files. xz includes support for parallelism, so enable
this using the parallel job count for the overall derivation.

* guix/build/gnu-build-system.scm (unpack): Set XZ_OPT to pass the -T option
to xz to enable it to work in parallel if appropriate.
---
 guix/build/gnu-build-system.scm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/guix/build/gnu-build-system.scm b/guix/build/gnu-build-system.scm
index e5f3197b0..9d11e5b1e 100644
--- a/guix/build/gnu-build-system.scm
+++ b/guix/build/gnu-build-system.scm
@@ -147,7 +147,7 @@ chance to be set."
               locale (strerror (system-error-errno args)))
       #t)))
 
-(define* (unpack #:key source #:allow-other-keys)
+(define* (unpack #:key source parallel-build? #:allow-other-keys)
   "Unpack SOURCE in the working directory, and change directory within the
 source.  When SOURCE is a directory, copy it in a sub-directory of the current
 working directory."
@@ -161,6 +161,10 @@ working directory."
         (copy-recursively source "."
                           #:keep-mtime? #t))
       (begin
+        (when parallel-build?
+          (setenv "XZ_OPT"
+                  (format #f "-T~d" (parallel-job-count))))
+
         (if (string-suffix? ".zip" source)
             (invoke "unzip" source)
             (invoke "tar" "xvf" source))
-- 
2.19.2





Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Thu, 06 Dec 2018 08:09:02 GMT)

Message #8 received at 33643 <at> debbugs.gnu.org:

From: Christopher Baines <mail <at> cbaines.net>
To: 33643 <at> debbugs.gnu.org
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Thu, 06 Dec 2018 08:08:29 +0000
Christopher Baines <mail <at> cbaines.net> writes:

> It can take a little while to decompress some packages with large xz
> compressed source tar files. xz includes support for parallelism, so enable
> this using the parallel job count for the overall derivation.

I'm guessing this is only suitable for core-updates, as it'll cause a
lot of rebuilds. I'm also not sure if it's worth it, but it does seem to
make building some packages at least start faster.

Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Thu, 06 Dec 2018 08:15:02 GMT)

Message #11 received at 33643 <at> debbugs.gnu.org:

From: Leo Famulari <leo <at> famulari.name>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 33643 <at> debbugs.gnu.org
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Thu, 6 Dec 2018 03:13:52 -0500
On Thu, Dec 06, 2018 at 07:56:15AM +0000, Christopher Baines wrote:
> It can take a little while to decompress some packages with large xz
> compressed source tar files. xz includes support for parallelism, so enable
> this using the parallel job count for the overall derivation.

The xz man page says that multi-threaded decompression isn't implemented
yet, unfortunately.

Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Thu, 06 Dec 2018 19:39:02 GMT)

Message #14 received at 33643 <at> debbugs.gnu.org:

From: Christopher Baines <mail <at> cbaines.net>
To: Leo Famulari <leo <at> famulari.name>
Cc: 33643 <at> debbugs.gnu.org
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Thu, 06 Dec 2018 19:38:21 +0000
Leo Famulari <leo <at> famulari.name> writes:

> On Thu, Dec 06, 2018 at 07:56:15AM +0000, Christopher Baines wrote:
>> It can take a little while to decompress some packages with large xz
>> compressed source tar files. xz includes support for parallelism, so enable
>> this using the parallel job count for the overall derivation.
>
> The xz man page says that multi-threaded decompression isn't implemented
> yet, unfortunately.

Ah, interesting. Having a read myself now, it also says it:

  "will work on files that contain multiple blocks with size information
   in block headers.  All files compressed in multi-threaded mode meet
   this condition, but files compressed in single-threaded mode don't
   even if --block-size=size is used."

So, if -T was used to compress the data, then it sounds like it'll work
to decompress it. I guess this adds a little more uncertainty to the
benefit of this change, as the impact is dependent on the way the source
data is compressed.
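
A rough way to check that condition for a given archive is to count the blocks recorded in its index. The sketch below is an illustration in Python, not part of the patch. It assumes a single-stream .xz file without stream padding, and the helper names read_vli and count_blocks are made up here. Having more than one block is necessary but not sufficient, since the block headers must also carry size information.

```python
import lzma
import struct

FOOTER_MAGIC = b"YZ"  # last two bytes of an xz stream footer


def read_vli(buf, pos):
    """Decode one xz variable-length integer starting at buf[pos].

    Returns the decoded value and the position just past it.
    """
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def count_blocks(data):
    """Return the number of index records (blocks) in a single-stream .xz.

    Assumption: no stream padding and no concatenated streams.
    """
    footer = data[-12:]
    if len(footer) != 12 or footer[10:12] != FOOTER_MAGIC:
        raise ValueError("no xz stream footer found")
    # Footer layout: CRC32 (4 bytes), Backward Size (4 bytes, little-endian),
    # Stream Flags (2 bytes), magic (2 bytes).
    (stored_backward_size,) = struct.unpack("<I", footer[4:8])
    index_size = (stored_backward_size + 1) * 4
    index_start = len(data) - 12 - index_size
    # The 12-byte stream header comes first; the index starts with a 0x00 byte.
    if index_start < 12 or data[index_start] != 0x00:
        raise ValueError("index indicator byte not found")
    records, _ = read_vli(data, index_start + 1)
    return records


if __name__ == "__main__":
    # lzma.compress() is single-threaded, so a single block is expected here.
    print(count_blocks(lzma.compress(b"guix" * 100000)))
```

An archive reporting a single block cannot benefit from threaded decompression, whatever value -T is given.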

Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Thu, 06 Dec 2018 21:08:02 GMT)

Message #17 received at 33643 <at> debbugs.gnu.org:

From: Leo Famulari <leo <at> famulari.name>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 33643 <at> debbugs.gnu.org
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Thu, 6 Dec 2018 16:06:53 -0500
On Thu, Dec 06, 2018 at 07:38:21PM +0000, Christopher Baines wrote:
> So, if -T was used to compress the data, then it sounds like it'll work
> to decompress it. I guess this adds a little more uncertainty to the
> benefit of this change, as the impact is dependent on the way the source
> data is compressed.

Right. When parallel decompression is implemented, I think we should
enable it in order to get some benefit from upstream tarballs that may
have been created with multi-threaded compression. 

However, we probably won't be able to use the parallel compression
within Guix because it is apparently not deterministic:

<https://bugs.gnu.org/31015>

Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Sun, 09 Dec 2018 14:33:02 GMT)

Message #20 received at 33643 <at> debbugs.gnu.org:

From: Efraim Flashner <efraim <at> flashner.co.il>
To: Leo Famulari <leo <at> famulari.name>
Cc: 33643 <at> debbugs.gnu.org, Christopher Baines <mail <at> cbaines.net>
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Sun, 9 Dec 2018 16:32:01 +0200
On Thu, Dec 06, 2018 at 04:06:53PM -0500, Leo Famulari wrote:
> On Thu, Dec 06, 2018 at 07:38:21PM +0000, Christopher Baines wrote:
> > So, if -T was used to compress the data, then it sounds like it'll work
> > to decompress it. I guess this adds a little more uncertainty to the
> > benefit of this change, as the impact is dependent on the way the source
> > data is compressed.
> 
> Right. When parallel decompression is implemented, I think we should
> enable it in order to get some benefit from upstream tarballs that may
> have been created with multi-threaded compression. 
> 
> However, we probably won't be able to use the parallel compression
> within Guix because it is apparently not deterministic:
> 
> <https://bugs.gnu.org/31015>

If the tarball is compressed in parallel then it can be decompressed in
parallel.

As for compressing in parallel, it *might work* to pass it through our
non-bootstrap tar for 'tar --sort=name' and then pass it through xz
-T(pick-a-num).
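
The tar half of that idea can be sketched in Python. This is only an illustration under assumptions: the function name reproducible_tar_xz is hypothetical, Python's tarfile and lzma modules stand in for the tar and xz commands, and lzma.compress is single-threaded, so it says nothing about whether xz -T output is deterministic.

```python
import io
import lzma
import os
import tarfile


def reproducible_tar_xz(directory):
    """Return .tar.xz bytes for DIRECTORY with sorted names and fixed metadata."""
    # Collect every entry below DIRECTORY, then sort so that the archive
    # order does not depend on the order the file system returns entries in
    # (roughly what 'tar --sort=name' does).
    paths = []
    for root, dirs, files in os.walk(directory):
        for name in dirs + files:
            paths.append(os.path.join(root, name))
    paths.sort()

    def normalize(info):
        # Drop metadata that varies between otherwise identical trees.
        info.mtime = 0
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in paths:
            arcname = os.path.relpath(path, directory)
            # recursive=False: each path is already listed explicitly.
            tar.add(path, arcname=arcname, recursive=False, filter=normalize)

    # Single-threaded xz compression of the deterministic tar stream.
    return lzma.compress(buffer.getvalue())
```

Two directory trees with the same contents and permissions should then give byte-identical archives, regardless of the order in which their files were created.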


-- 
Efraim Flashner   <efraim <at> flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Mon, 10 Dec 2018 16:25:01 GMT)

Message #23 received at 33643 <at> debbugs.gnu.org:

From: Leo Famulari <leo <at> famulari.name>
To: Efraim Flashner <efraim <at> flashner.co.il>
Cc: 33643 <at> debbugs.gnu.org, Christopher Baines <mail <at> cbaines.net>
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Mon, 10 Dec 2018 11:24:29 -0500
On Sun, Dec 09, 2018 at 04:32:01PM +0200, Efraim Flashner wrote:
> If the tarball is compressed in parallel then it can be decompressed in
> parallel.

The xz documentation says that parallel decompression is not
implemented? Is that no longer the case?

> As for compressing in parallel, it *might work* to pass it through our
> non-bootstrap tar for 'tar --sort=name' and then pass it through xz
> -T(pick-a-num).

That could be helpful!

Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Mon, 10 Dec 2018 18:49:01 GMT)

Message #26 received at 33643 <at> debbugs.gnu.org:

From: Efraim Flashner <efraim <at> flashner.co.il>
To: Leo Famulari <leo <at> famulari.name>
Cc: 33643 <at> debbugs.gnu.org, Christopher Baines <mail <at> cbaines.net>
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Mon, 10 Dec 2018 20:48:43 +0200
On Mon, Dec 10, 2018 at 11:24:29AM -0500, Leo Famulari wrote:
> On Sun, Dec 09, 2018 at 04:32:01PM +0200, Efraim Flashner wrote:
> > If the tarball is compressed in parallel then it can be decompressed in
> > parallel.
> 
> The xz documentation says that parallel decompression is not
> implemented? Is that no longer the case?

Looks like I got caught up with the original release notes.
https://git.tukaani.org/?p=xz.git;a=blob;f=NEWS;hb=HEAD#l94
Looks like it's specifically only compression.

> 
> > As for compressing in parallel, it *might work* to pass it through our
> > non-bootstrap tar for 'tar --sort=name' and then pass it through xz
> > -T(pick-a-num).
> 
> That could be helpful!



-- 
Efraim Flashner   <efraim <at> flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Wed, 13 May 2020 18:21:02 GMT)

Message #29 received at 33643 <at> debbugs.gnu.org:

From: Christopher Baines <mail <at> cbaines.net>
To: 33643 <at> debbugs.gnu.org
Cc: Efraim Flashner <efraim <at> flashner.co.il>, Leo Famulari <leo <at> famulari.name>
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Wed, 13 May 2020 19:20:08 +0100
Christopher Baines <mail <at> cbaines.net> writes:

> It can take a little while to decompress some packages with large xz
> compressed source tar files. xz includes support for parallelism, so enable
> this using the parallel job count for the overall derivation.

It's been a long long while, but now that core-updates has recently been
merged, I'd like to try and take a look at this again.

I think the consensus was that this will only help for xz compressed
files where they have been compressed in parallel. I think it's still
worth doing though, as some of the big xz files that need decompressing
have been compressed in parallel, and this will speed up the builds when
multiple cores are available.

Thanks,

Chris

Information forwarded to guix-patches <at> gnu.org:
bug#33643; Package guix-patches. (Wed, 13 May 2020 19:09:02 GMT)

Message #32 received at 33643 <at> debbugs.gnu.org:

From: Efraim Flashner <efraim <at> flashner.co.il>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 33643 <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Wed, 13 May 2020 22:07:21 +0300
On Wed, May 13, 2020 at 07:20:08PM +0100, Christopher Baines wrote:
> 
> Christopher Baines <mail <at> cbaines.net> writes:
> 
> > It can take a little while to decompress some packages with large xz
> > compressed source tar files. xz includes support for parallelism, so enable
> > this using the parallel job count for the overall derivation.
> 
> It's been a long long while, but now that core-updates has recently been
> merged, I'd like to try and take a look at this again.
> 
> I think the consensus was that this will only help for xz compressed
> files where they have been compressed in parallel. I think it's still
> worth doing though, as some of the big xz files that need decompressing
> have been compressed in parallel, and this will speed up the builds when
> multiple cores are available.
> 
> Thanks,
> 
> Chris

I thought the last time we looked into this we figured out that there
was a mistake in release notes or something and that parallel
decompression isn't actually supported.

-- 
Efraim Flashner   <efraim <at> flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

Reply sent to Christopher Baines <mail <at> cbaines.net>:
You have taken responsibility. (Thu, 14 May 2020 07:39:01 GMT)

Notification sent to Christopher Baines <mail <at> cbaines.net>:
bug acknowledged by developer. (Thu, 14 May 2020 07:39:01 GMT)

Message #37 received at 33643-done <at> debbugs.gnu.org:

From: Christopher Baines <mail <at> cbaines.net>
To: Efraim Flashner <efraim <at> flashner.co.il>
Cc: 33643-done <at> debbugs.gnu.org
Subject: Re: [bug#33643] [PATCH] gnu-build-system: Enable xz to decompress in
 parallel.
Date: Thu, 14 May 2020 08:37:56 +0100
Efraim Flashner <efraim <at> flashner.co.il> writes:

> On Wed, May 13, 2020 at 07:20:08PM +0100, Christopher Baines wrote:
>>
>> Christopher Baines <mail <at> cbaines.net> writes:
>>
>> > It can take a little while to decompress some packages with large xz
>> > compressed source tar files. xz includes support for parallelism, so enable
>> > this using the parallel job count for the overall derivation.
>>
>> It's been a long long while, but now that core-updates has recently been
>> merged, I'd like to try and take a look at this again.
>>
>> I think the consensus was that this will only help for xz compressed
>> files where they have been compressed in parallel. I think it's still
>> worth doing though, as some of the big xz files that need decompressing
>> have been compressed in parallel, and this will speed up the builds when
>> multiple cores are available.
>>
>> Thanks,
>>
>> Chris
>
> I thought the last time we looked into this we figured out that there
> was a mistake in release notes or something and that parallel
> decompression isn't actually supported.

Hmm, I had a look to see if I could find some examples of where this
would apply, but I couldn't find any xz archives that we use in Guix
where it's been compressed in a way that allows multithreaded
decompression...

I'm pretty sure I had some examples before, but maybe something's changed
in the intervening year.

Anyway, if I discover this again, I'll actually make a note of where
it's applicable.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 11 Jun 2020 11:24:06 GMT)

This bug report was last modified 3 years and 313 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.