GNU bug report logs - #62656
Cannot fallback to SWH for Guix channel


Package: guix;

Reported by: Nicolas Graves <ngraves <at> ngraves.fr>

Date: Mon, 3 Apr 2023 21:40:01 UTC

Severity: important

Done: Nicolas Graves <ngraves <at> ngraves.fr>

Bug is archived. No further changes may be made.



Report forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Mon, 03 Apr 2023 21:40:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Nicolas Graves <ngraves <at> ngraves.fr>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 03 Apr 2023 21:40:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Nicolas Graves <ngraves <at> ngraves.fr>
To: bug-guix <at> gnu.org
Subject: broken guix time-machine + software-heritage
Date: Mon, 03 Apr 2023 23:39:32 +0200
Hi Guix!

I was trying to use guix time-machine as I did in the past, but the
recent updates with Software Heritage seem to have broken my use of it.

Here's the channels.scm file I used:

(list (channel
        (name 'guix)
        (url "/https://git.savannah.gnu.org/git/guix.git")
        (commit "1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1")
        (introduction
          (make-channel-introduction
            "9edb3f66fd807b096b48283debdcddccfea34bad"
            (openpgp-fingerprint
              "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA")))))

Here is the output and backtrace of the time-machine call, after the ~10
hours of object processing on the Software Heritage side:

> guix time-machine -C channels.scm -- shell
Updating channel 'guix' from Git repository at '/https://git.savannah.gnu.org/git/guix.git'...
SWH: found revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 with directory at 'https://archive.softwareheritage.org/api/1/directory/1ea499e7529e67a0632ecbe0a8214f0618a82c1a/'
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/branches/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/config
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/description
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/hooks/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/info/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/info/exclude
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/info/refs
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/info/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/info/packs
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack/pack-20648aeebad9dc6d8a29c87bd99d8fd773e1266a.idx
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack/pack-20648aeebad9dc6d8a29c87bd99d8fd773e1266a.pack
Backtrace:
In ice-9/boot-9.scm:
  1752:10 19 (with-exception-handler _ _ #:unwind? _ # _)
In guix/store.scm:
   659:37 18 (thunk)
In guix/status.scm:
    830:4 17 (call-with-status-report _ _)
In guix/store.scm:
   1298:8 16 (call-with-build-handler #<procedure 7f8d1bf5adb0 at g…> …)
In guix/inferior.scm:
   928:34 15 (cached-channel-instance #<store-connection 256.99 7f8…> …)
In guix/channels.scm:
    528:7 14 (loop _ _)
In guix/combinators.scm:
    48:26 13 (fold2 #<procedure 7f8d1bf592a0 at guix/channels.scm:5…> …)
In guix/channels.scm:
   538:29 12 (_ #<<channel> name: guix url: "/https://git.savannah.…> …)
   409:17 11 (latest-channel-instance #<store-connection 256.99 7f8…> …)
In guix/git.scm:
   477:29 10 (update-cached-checkout _ #:ref _ #:recursive? _ # _ # _ …)
    378:2  9 (_ git-error #<<git-error> code: -1 message: "failed to…>)
In guix/utils.scm:
    959:8  8 (call-with-temporary-directory _)
In guix/git.scm:
   380:10  7 (_ "/tmp/guix-directory.v8A5Fq")
In guix/swh.scm:
    655:8  6 (call-with-temporary-directory #<procedure 7f8d248dba80…>)
   682:11  5 (_ "/tmp/guix-directory.4kHVt8")
In guix/build/utils.scm:
  1018:28  4 (_)
In unknown file:
           3 (get-bytevector-n! #<input: string 7f8d1aad5cb0> # 0 #)
In web/response.scm:
     95:2  2 (read! _ _ _)
In ice-9/boot-9.scm:
  1685:16  1 (raise-exception _ #:continuable? _)
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Throw to key `bad-response' with args `("EOF while reading response body: ~a bytes of ~a" (53394376 296632320))'.
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: swh\:1\:rev\:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack: Cannot utime: No such file or directory
tar: swh\:1\:rev\:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects: Cannot utime: No such file or directory
tar: swh\:1\:rev\:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git: Cannot utime: No such file or directory
tar: Error is not recoverable: exiting now
zsh: exit 1     guix time-machine -C channels.scm -- shell
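
For reference, the two numbers in the bad-response arguments above read as bytes received and bytes expected.  A minimal shell sketch of that arithmetic follows; the function name transfer_summary is made up for this illustration and is not part of Guix.

```shell
# transfer_summary RECEIVED EXPECTED
# Print how far an interrupted transfer got, using integer arithmetic.
# (Hypothetical helper, for illustration only.)
transfer_summary() {
    received=$1
    expected=$2
    missing=$((expected - received))
    percent=$((received * 100 / expected))   # truncated, not rounded
    echo "$received of $expected bytes (${percent}%), $missing missing"
}

# Values taken from the bad-response error in the backtrace.
transfer_summary 53394376 296632320
```

So the download stopped at roughly a fifth of the 283M archive, which fits with tar reporting an unexpected end of archive.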

Hope this can be fixed soon, good luck ;) 

-- 
Best regards,
Nicolas Graves




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Tue, 04 Apr 2023 11:53:04 GMT) Full text and rfc822 format available.

Message #8 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Simon Tournier <zimon.toutoune <at> gmail.com>
To: Nicolas Graves <ngraves <at> ngraves.fr>, 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Tue, 04 Apr 2023 12:51:38 +0200
Hi,

Cool you did this test! :-)

On Mon, 03 Apr 2023 at 23:39, Nicolas Graves via Bug reports for GNU Guix <bug-guix <at> gnu.org> wrote:

> Here is the content + backtrace of the time-machine call, after the ~10
> hours long object processing on Software Heritage side:

Last time I checked, I never got the object from SWH because of a bug
on their side.  Nice that they now cook the content.

Note that SWH is an archive, so extracting a large dataset such as the
files of the Guix repository is expected to take a long time.  Much of
the data is stored cold, not to say frozen, and that’s why it takes a
long time to warm it up.


>> guix time-machine -C channels.scm -- shell
> Updating channel 'guix' from Git repository at '/https://git.savannah.gnu.org/git/guix.git'...
> SWH: found revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 with directory at 'https://archive.softwareheritage.org/api/1/directory/1ea499e7529e67a0632ecbe0a8214f0618a82c1a/'
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD

[...]

> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack/pack-20648aeebad9dc6d8a29c87bd99d8fd773e1266a.pack
> Backtrace:
> In ice-9/boot-9.scm:
>   1752:10 19 (with-exception-handler _ _ #:unwind? _ # _)
> In guix/store.scm:
>    659:37 18 (thunk)

[...]

> In unknown file:
>            3 (get-bytevector-n! #<input: string 7f8d1aad5cb0> # 0 #)
> In web/response.scm:
>      95:2  2 (read! _ _ _)
> In ice-9/boot-9.scm:
>   1685:16  1 (raise-exception _ #:continuable? _)
>   1685:16  0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Throw to key `bad-response' with args `("EOF while reading response body: ~a bytes of ~a" (53394376 296632320))'.

Well, if I understand correctly, SWH cooked the full Git repository of
Guix and somehow it is probably too big.  Hum, I do not know how to
investigate…

Thanks for the report!

Cheers,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Wed, 26 Apr 2023 09:52:01 GMT) Full text and rfc822 format available.

Message #11 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Nicolas Graves <ngraves <at> ngraves.fr>
Cc: 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Wed, 26 Apr 2023 11:50:57 +0200
Hello,

Nicolas Graves <ngraves <at> ngraves.fr> writes:

> I was trying to use guix time-machine as I did in the past, but the
> recent updates with software heritage seem to have broken my use of it.
>
> Here's the channels.scm file I used:
>
> (list (channel
>         (name 'guix)
>         (url "/https://git.savannah.gnu.org/git/guix.git")
>         (commit "1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1")
>         (introduction
>           (make-channel-introduction
>             "9edb3f66fd807b096b48283debdcddccfea34bad"
>             (openpgp-fingerprint
>               "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA")))))

Interesting test!

> Here is the content + backtrace of the time-machine call, after the ~10
> hours long object processing on Software Heritage side:
>
>> guix time-machine -C channels.scm -- shell
> Updating channel 'guix' from Git repository at '/https://git.savannah.gnu.org/git/guix.git'...
> SWH: found revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 with directory at 'https://archive.softwareheritage.org/api/1/directory/1ea499e7529e67a0632ecbe0a8214f0618a82c1a/'
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/branches/
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/config
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/description

[...]

>            3 (get-bytevector-n! #<input: string 7f8d1aad5cb0> # 0 #)
> In web/response.scm:
>      95:2  2 (read! _ _ _)
> In ice-9/boot-9.scm:
>   1685:16  1 (raise-exception _ #:continuable? _)
>   1685:16  0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Throw to key `bad-response' with args `("EOF while reading response body: ~a bytes of ~a" (53394376 296632320))'.

I can reproduce it like this:

--8<---------------cut here---------------start------------->8---
$ wget -O/tmp/swh.git \
   "https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/"
--2023-04-26 11:43:22--  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Resolving archive.softwareheritage.org (archive.softwareheritage.org)... 128.93.166.15
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 296632320 (283M) [application/x-tar]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git              13%[===>                             ]  39.11M  84.1MB/s    in 0.5s    

2023-04-26 11:43:40 (84.1 MB/s) - Connection closed at byte 41015184. Retrying.

--2023-04-26 11:43:41--  (try: 2)  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 296632320 (283M), 255617136 (244M) remaining [application/x-tar]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git              65%[++++================>            ] 184.66M  96.7MB/s    in 1.5s    

2023-04-26 11:44:00 (96.7 MB/s) - Connection closed at byte 193634304. Retrying.

[…]

--2023-04-26 11:48:01--  (try:12)  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 296632320 (283M), 28199637 (27M) remaining [application/x-tar]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git              90%[+++++++++++++++++++++++++++++    ] 256.00M  5.39KB/s    in 0.3s    

2023-04-26 11:48:19 (5.39 KB/s) - Connection closed at byte 268434406. Retrying.

--2023-04-26 11:48:29--  (try:13)  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 296632320 (283M), 28197914 (27M) remaining [application/x-tar]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git              90%[+++++++++++++++++++++++++++++    ] 256.00M  --.-KB/s    in 0s      

2023-04-26 11:48:46 (0.00 B/s) - Connection closed at byte 268434406. Retrying.
--8<---------------cut here---------------end--------------->8---

The server keeps closing the connection prematurely.  Unlike our client
in Guile, wget keeps retrying and so, little by little, it eventually
gets more bytes.  In my case it seems to get stuck at 90% though, where
each attempt gives it zero or very few additional bytes.
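
The difference in behaviour can be sketched in shell as a plain retry loop around a resumable download.  This is only an illustration under assumptions, not what Guix or wget does internally: the helper names retry and swh_git_bare_url are hypothetical, and the URL pattern is simply copied from the wget log above.

```shell
# retry MAX CMD [ARGS...]
# Run CMD until it succeeds, at most MAX times.  (Hypothetical helper.)
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        "$@" && return 0
        echo "attempt $attempt/$max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# swh_git_bare_url COMMIT
# Build the vault URL used in the wget log above for a given commit ID.
# (Pattern assumed from that log, not from the SWH API documentation.)
swh_git_bare_url() {
    printf 'https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:%s/raw/\n' "$1"
}

# Example (needs network access; wget's -c resumes a partial file):
#   retry 50 wget -c -O /tmp/swh.git \
#       "$(swh_git_bare_url 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1)"
```

Each failed attempt leaves the partial file in place, so the next one only asks for the remaining bytes.  Whether the transfer ever completes still depends on the server.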

I suspect this is an issue at SWH.  I’ll bring it up there.

Thanks,
Ludo’.




Severity set to 'important' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 26 Apr 2023 09:52:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Wed, 26 Apr 2023 10:02:02 GMT) Full text and rfc822 format available.

Message #16 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Nicolas Graves <ngraves <at> ngraves.fr>
Cc: 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Wed, 26 Apr 2023 12:01:01 +0200
Ludovic Courtès <ludovic.courtes <at> inria.fr> writes:

> The server keeps closing the connection prematurely.  Unlike our client
> in Guile, wget keeps retrying and so, little by little, it eventually
> gets more bytes.  In my case it seems to get stuck at 90% though, where
> each attempt gives it zero or very few additional bytes.
>
> I suspect this is an issue at SWH.  I’ll bring it up there.

👉 https://gitlab.softwareheritage.org/swh/devel/swh-vault/-/issues/4346




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Fri, 28 Apr 2023 17:01:02 GMT) Full text and rfc822 format available.

Message #19 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Simon Tournier <zimon.toutoune <at> gmail.com>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>, Nicolas Graves
 <ngraves <at> ngraves.fr>
Cc: 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Fri, 28 Apr 2023 16:43:10 +0200
Hi,

On Wed, 26 Apr 2023 at 11:50, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:

>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD
>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/branches/
>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/config
>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/description

[...]


> I suspect this is an issue at SWH.  I’ll bring it up there.

Aside from the potential bug on the SWH side, maybe we could ask for a
flat cooking instead of a git-bare cooking.

Considering the size of the Guix repository, it can take hours to cook
it – remember the test with CRLF ;-) – when most of the time, we need
only one specific revision.

We could tweak ‘clone-from-swh’ from (guix git) to use 'flat instead
of 'git-bare.  However, I am unsure what other tweaks it would require,
since a Git repository is expected.


Cheers,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Tue, 02 May 2023 07:43:02 GMT) Full text and rfc822 format available.

Message #22 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Simon Tournier <zimon.toutoune <at> gmail.com>
Cc: 62656 <at> debbugs.gnu.org, Nicolas Graves <ngraves <at> ngraves.fr>
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Tue, 02 May 2023 09:42:38 +0200
Hi!

Simon Tournier <zimon.toutoune <at> gmail.com> writes:

> On Wed, 26 Apr 2023 at 11:50, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:
>
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/branches/
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/config
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/description
>
> [...]
>
>
>> I suspect this is an issue at SWH.  I’ll bring it up there.
>
> Aside the potential bug on SWH side, maybe we could ask a flat cooking
> instead of a git-bare cooking.
>
> Considering the size of the Guix repository, it can take hours to cook
> it – remember the test with CRLF ;-) – when most of the time, we need
> only one specific revision.
>
> Somehow, we could tweak ’clone-from-swh’ from (guix git) to use 'flat
> instead of 'git-bare.  However, I am unsure the other tweaks it would
> require since a Git repository is somehow expected.

Yeah, ‘clone-from-swh’ is really cloning, so it needs ‘git-bare’.
Generally, in the case of channels, we need a full clone, not just a
revision.  Various bits of the machinery expect the clone: (guix
describe), (guix channels), and so on.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Tue, 02 May 2023 18:03:02 GMT) Full text and rfc822 format available.

Message #25 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Simon Tournier <zimon.toutoune <at> gmail.com>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: Nicolas Graves <ngraves <at> ngraves.fr>, 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Tue, 02 May 2023 20:01:17 +0200
Hi,

On Tue, 02 May 2023 at 09:42, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:

>> Somehow, we could tweak ’clone-from-swh’ from (guix git) to use 'flat
>> instead of 'git-bare.  However, I am unsure the other tweaks it would
>> require since a Git repository is somehow expected.
>
> Yeah, ‘clone-from-swh’ is really cloning, so it needs ‘git-bare’.
> Generally, in the case of channels, we need a full clone, not just a
> revision.  Various bits of the machinery expect the clone: (guix
> describe), (guix channels), and so on.

Even if the bug on the SWH side were fixed, at the rate the Guix repo is
growing, it would be impractical to cook the whole Guix repo.  And it
seems weird to me when, most of the time, we need only a very
restricted set of commits.

We could imagine locally creating a new repo (git init) and adding only
the content of the commit specified by “guix time-machine”.
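
A rough shell sketch of that idea, assuming the files of the wanted revision are already unpacked in a plain directory (for instance from a flat cooking).  The function name import_snapshot and the committer identity are placeholders invented for this example; nothing here is existing Guix code.

```shell
# import_snapshot SRC REPO
# Create a fresh repository at REPO holding the files under SRC as a
# single root commit.  (Hypothetical helper, for illustration only.)
import_snapshot() {
    src=$1
    repo=$2
    git init -q "$repo" || return 1
    # Copy the flat tree, dotfiles included, into the new work tree.
    cp -R "$src"/. "$repo"/ || return 1
    # --force also adds files that match a .gitignore shipped in the tree.
    git -C "$repo" add --all --force || return 1
    # Placeholder identity so the commit works without global Git config.
    git -C "$repo" \
        -c user.name="snapshot" \
        -c user.email="snapshot@example.invalid" \
        commit -q -m "Snapshot of one revision" || return 1
}

# Example:
#   import_snapshot /tmp/guix-flat /tmp/guix-snapshot
```

Note that the commit created this way has no parents and gets a new commit ID, different from the one named in channels.scm, so anything that walks the history would see a single unrelated commit.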

Cheers,
simon

PS: Some numbers backing the growth rate:

        $ git log --oneline | wc -l
        114457

        $ git log --oneline --before=2019-05-01 | wc -l
        43845

        $ git log --oneline --after=2019-05-01 | wc -l
        70612


 1. We are cooking 43845 commits of the history that are useless because
    they are unreachable with the time-machine.  They pre-date the
    introduction of inferiors – yes, we could refine and consider v0.15
    instead of v1.0.0. :-)

 2. The first commit is from 2012.  Over the first 7 years, 38% of the
    history was produced.  In less than 4 years, we have produced
    62% of the history!  Yeah, that’s cool!

    Basically, in less than 5 years from now, we will generate the same
    number of commits as over the past 10 years.
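
The 38% and 62% figures follow directly from the three commit counts above.  A small shell check, using only integer arithmetic (the helper name percent is made up for this example):

```shell
# percent PART TOTAL
# Integer percentage of PART in TOTAL, rounded to the nearest unit.
percent() {
    echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

# Counts reported by the git log commands above.
total=114457
before=43845
after=70612

echo "before + after = $((before + after))"
echo "before 2019-05-01: $(percent "$before" "$total")%"
echo "after 2019-05-01: $(percent "$after" "$total")%"
```

The two partial counts add up to the total, so no commit is counted twice or missed at the 2019-05-01 boundary.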
    




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Thu, 04 May 2023 07:23:01 GMT) Full text and rfc822 format available.

Message #28 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Simon Tournier <zimon.toutoune <at> gmail.com>
Cc: Nicolas Graves <ngraves <at> ngraves.fr>, 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Thu, 04 May 2023 09:22:23 +0200
Hi,

Simon Tournier <zimon.toutoune <at> gmail.com> writes:

> On Tue, 02 May 2023 at 09:42, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:
>
>>> Somehow, we could tweak ’clone-from-swh’ from (guix git) to use 'flat
>>> instead of 'git-bare.  However, I am unsure the other tweaks it would
>>> require since a Git repository is somehow expected.
>>
>> Yeah, ‘clone-from-swh’ is really cloning, so it needs ‘git-bare’.
>> Generally, in the case of channels, we need a full clone, not just a
>> revision.  Various bits of the machinery expect the clone: (guix
>> describe), (guix channels), and so on.
>
> Even if the bug on SWH would be fixed, at the rate the Guix repo is
> growing, it would be impractical to cook the whole Guix repo.

Falling back to SWH to fetch channels is something we expect to be rare,
though.

> And it appears to me weird when we, most of the time, need a very
> restricted set of commits.
>
> We could imagine to locally create a new repo (git init) and only add
> the content of the commit specified by “guix time-machine”.

To do that we’d need to say goodbye to the features I mentioned above.

> PS: Just some numbers backing the rate of growing:
>
>         $ git log --oneline | wc -l
>         114457
>
>         $ git log --oneline --before=2019-05-01 | wc -l
>         43845
>
>         $ git log --oneline --after=2019-05-01 | wc -l
>         70612
>
>
>  1. We are cooking 43845 commits of the history that are useless because
>     unreachable with the time-machine.  They pre-date the introduction
>     of the inferiors – yes, we could refine and consider v0.15 instead
>     of v1.0.0. :-)
>
>  2. The first commit is from 2012.  Over the first 7 years, 38% of the
>     history had been produced.  In less than 4 years, we have produced
>     62% of the history!  Yeah, that’s cool!
>
>     Basically, from now to less than 5 years, we will generate the same
>     number of commits as over the past 10 years.

Heh, insightful figures!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Thu, 04 May 2023 07:58:02 GMT) Full text and rfc822 format available.

Message #31 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Simon Tournier <zimon.toutoune <at> gmail.com>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: Nicolas Graves <ngraves <at> ngraves.fr>, 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Thu, 4 May 2023 09:57:26 +0200
Hi Ludo,

On Thu, 4 May 2023 at 09:22, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:

> > Even if the bug on SWH would be fixed, at the rate the Guix repo is
> > growing, it would be impractical to cook the whole Guix repo.
>
> Falling back to SWH to fetch channels is something we expect to be rare,
> though.

Being rare will not make it practical. ;-)

What I am trying to point out is that, considering the size of the Guix
repository and its growth rate, the current implementation will not
scale and the fallback will be impossible for the end-user.

> > And it appears to me weird when we, most of the time, need a very
> > restricted set of commits.
> >
> > We could imagine to locally create a new repo (git init) and only add
> > the content of the commit specified by “guix time-machine”.
>
> To do that we’d need to say goodbye to the features I mentioned above.

Well, I do not see which features will be missing.  I am talking about
making this practical:

    guix time-machine -C channels.scm -- shell -m manifest.scm

and not about having a complete working Guix.  Say I read a paper that
mentions this command line and I want to inspect it, so I run this
command.  I do not care about the other 114456 commits of the
history.  And for sure "guix time-machine -C channels.scm --
describe -f channels" will not be a fixed-point.

Maybe we could imagine an option for skipping the complete clone and
restricting it to one specific commit.


Cheers,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Thu, 04 May 2023 13:06:01 GMT) Full text and rfc822 format available.

Message #34 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Simon Tournier <zimon.toutoune <at> gmail.com>
Cc: Nicolas Graves <ngraves <at> ngraves.fr>, 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Thu, 04 May 2023 15:05:44 +0200
Hi,

Simon Tournier <zimon.toutoune <at> gmail.com> writes:

> On Thu, 4 May 2023 at 09:22, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:
>
>> > Even if the bug on SWH would be fixed, at the rate the Guix repo is
>> > growing, it would be impractical to cook the whole Guix repo.
>>
>> Falling back to SWH to fetch channels is something we expect to be rare,
>> though.
>
> Being rare will not make it practical. ;-)
>
> What I am trying to point is that considering the size of the Guix
> repository and its rate, the current implementation will not scale and
> the fallback will be impossible for the end-user.

It’s not impossible, it just takes time (how long exactly, I don’t know,
we should check with the SWH folks what we can expect and what the
relevant factors are.)

That it takes time is acceptable IMO: we’re likely talking about
disaster recovery after the Savannah repo and its GitHub mirror have
disappeared.

Other channels are typically smaller but also more likely to vanish.  I
wonder how that affects the cooking time at SWH; again, something to
ask them.

>> > And it appears to me weird when we, most of the time, need a very
>> > restricted set of commits.
>> >
>> > We could imagine to locally create a new repo (git init) and only add
>> > the content of the commit specified by “guix time-machine”.
>>
>> To do that we’d need to say goodbye to the features I mentioned above.
>
> Well, I do not see which features will be missing.

Those mentioned earlier, provenance tracking and downgrade detection in
particular.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Thu, 04 May 2023 18:28:02 GMT) Full text and rfc822 format available.

Message #37 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Simon Tournier <zimon.toutoune <at> gmail.com>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: 62656 <at> debbugs.gnu.org, Nicolas Graves <ngraves <at> ngraves.fr>
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Thu, 04 May 2023 19:00:28 +0200
Hi,

On Thu, 04 May 2023 at 15:05, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:

>> Well, I do not see which features will be missing.
>
> Those mentioned earlier, provenance tracking and downgrade detection in
> particular.

Do we care about provenance tracking for this scenario?  Similarly, do
we care about downgrade detection for this scenario?

I mean, we are not talking about a regular scenario but, as you said, a
worst-case scenario.

I am missing where “security” (provenance tracking and downgrade
detection) fits in the picture.

Say Savannah is totally down tomorrow, and let’s assume a malicious Eve
is serving https://git.savannah.gnu.org/git/guix.git.  The authentication
is useless since Eve can easily rewrite it.  The only mechanism that
protects Alice is the commit SHA-1 hash she has at hand.  Eve needs to
attack this SHA-1 with some collision.  And if it’s possible to produce
a pre-image attack on SHA-1, then nothing would prevent Eve from also
replacing the origins of some packages in
https://git.savannah.gnu.org/git/guix.git.

Moreover, cloning from SWH using git-bare does not protect either.
Well, you are trusting SWH.  You have no means to be sure that
the repository you get back from SWH is the one you expect.  The only
way is to inspect the signatures; it means the end-user knows exactly
which gpg key from .guix-authorizations they must trust.

Obviously, the former could be injected in the latter. ;-)  Noting that
SWH heavily relies on SHA-1, IIUC.

Yeah, we should talk with SWH’s folks. :-)

Cheers,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Fri, 05 May 2023 07:37:02 GMT) Full text and rfc822 format available.

Message #40 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Simon Tournier <zimon.toutoune <at> gmail.com>
Cc: 62656 <at> debbugs.gnu.org, Nicolas Graves <ngraves <at> ngraves.fr>
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Fri, 05 May 2023 09:36:43 +0200
Hi!

Simon Tournier <zimon.toutoune <at> gmail.com> writes:

> On Thu, 04 May 2023 at 15:05, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:
>
>>> Well, I do not see which features will be missing.
>>
>> Those mentioned earlier, provenance tracking and downgrade detection in
>> particular.
>
> Do we care about provenance tracking for this scenario?  Similarly, do
> we care about downgrade detection for this scenario?

Provenance tracking, yes.  I wrote about the current status: (guix
describe), (guix channels), etc. expect a full Git repo, which is why
things are done this way.

We could imagine a different design, but that’s a broader endeavor.

[...]

> If tomorrow Savannah is totally down and let assume the malicious Eve is
> serving https://git.savannah.gnu.org/git/guix.git.  The authentication
> is useless since Eve can easily rewrite it.

The authentication mechanism is designed to make this impossible.
That’s why one can run:

  guix pull --url=https://github.com/guix-mirror/guix

without fear (worst that can happen is that the mirror is stale).

> The only mechanism that protects Alice is the commit SHA-1 hash she
> has at hand.  Eve needs to attack this SHA-1 with some collision.  And
> if it’s possible to produce pre-image attack for SHA-1, then nothing
> would prevent Eve to also replace the origins of some packages in
> https://git.savannah.gnu.org/git/guix.git.

True to some extent—see the section about SHA1 in the Programming paper¹.

Ludo’.

¹ https://doi.org/10.22152/programming-journal.org/2023/7/1




Changed bug title to 'Cannot fallback to SWH for Guix channel' from 'broken guix time-machine + software-heritage' Request was from Simon Tournier <zimon.toutoune <at> gmail.com> to control <at> debbugs.gnu.org. (Mon, 04 Sep 2023 17:38:04 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#62656; Package guix. (Tue, 24 Oct 2023 13:25:02 GMT) Full text and rfc822 format available.

Message #45 received at 62656 <at> debbugs.gnu.org (full text, mbox):

From: Simon Tournier <zimon.toutoune <at> gmail.com>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>, Nicolas Graves
 <ngraves <at> ngraves.fr>
Cc: 62656 <at> debbugs.gnu.org
Subject: Re: bug#62656: broken guix time-machine + software-heritage
Date: Tue, 24 Oct 2023 15:23:15 +0200
Hi,

On Wed, 26 Apr 2023 at 12:01, Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:

>> I suspect this is an issue at SWH.  I’ll bring it up there.
>
> https://gitlab.softwareheritage.org/swh/devel/swh-vault/-/issues/4346

Issue closed. \o/

Now, it passes:

--8<---------------cut here---------------start------------->8---
$ time wget -O/tmp/swh.git \
   "https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/"
--2023-10-24 15:12:14--  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Resolving archive.softwareheritage.org (archive.softwareheritage.org)... 128.93.166.15
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://swhvaultstorage.blob.core.windows.net/contents-uncompressed/4210e49babbe65df77ab7075d68615ca5edc2a23?se=2023-10-25T13%3A12%3A14Z&sp=r&sv=2019-02-02&sr=b&rscd=attachment%3B%20filename%3D%22swh_1_rev_1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git.tar%22&sig=scSRKMM3zV0UO5rb91lk/M8AUhlQEeKrhm31VbvhB6w%3D [following]
--2023-10-24 15:12:14--  https://swhvaultstorage.blob.core.windows.net/contents-uncompressed/4210e49babbe65df77ab7075d68615ca5edc2a23?se=2023-10-25T13%3A12%3A14Z&sp=r&sv=2019-02-02&sr=b&rscd=attachment%3B%20filename%3D%22swh_1_rev_1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git.tar%22&sig=scSRKMM3zV0UO5rb91lk/M8AUhlQEeKrhm31VbvhB6w%3D
Resolving swhvaultstorage.blob.core.windows.net (swhvaultstorage.blob.core.windows.net)... 20.209.11.33
Connecting to swhvaultstorage.blob.core.windows.net (swhvaultstorage.blob.core.windows.net)|20.209.11.33|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 296632320 (283M) [application/octet-stream]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git                      100%[===========================================================>] 282.89M  6.87MB/s    in 37s     

2023-10-24 15:12:51 (7.70 MB/s) - ‘/tmp/swh.git’ saved [296632320/296632320]


real	0m37.034s
user	0m0.973s
sys	0m2.602s
--8<---------------cut here---------------end--------------->8---
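
For reference, the URL used above follows a fixed pattern, so it can be built from any commit hash. The following is only a sketch: the function names swh_vault_url and swh_vault_fetch are made up for illustration, and it assumes the git-bare bundle for that revision is already available in the vault (otherwise the request is expected to fail).

```shell
# Sketch: build the SWH vault "git-bare" download URL for a commit,
# following the URL pattern of the wget call above.
# Hypothetical helper name; not part of Guix or of any SWH tool.
swh_vault_url() {
    # $1: full commit SHA-1
    printf 'https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:%s/raw/\n' "$1"
}

# Hypothetical helper: download the bundle to a local file.
# wget follows the 302 redirect to the storage backend by default.
swh_vault_fetch() {
    # $1: full commit SHA-1, $2: output file
    wget -O "$2" "$(swh_vault_url "$1")"
}
```

With these definitions, `swh_vault_fetch 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 /tmp/swh.git` would repeat the download shown above.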

Please note:

--8<---------------cut here---------------start------------->8---
$ file swh.git
swh.git: POSIX tar archive (GNU)

$ mkdir -p some-dir
$ mv swh.git some-dir/
$ cd some-dir/
$ tar -xf swh.git
$ mv swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git .git

$ git log --oneline -10
1984d56b0e (HEAD -> master) gnu: Add scilab.
13b2d110ee gnu: Add suitesparse-3.
f4d7b901db gnu: matio: Add header file.
42b938ae8c gnu: Add audmes.
be5e280e5f Revert "gnu: network-manager: Update to 1.43.4."
7ceedc7df7 gnu: conan: Update to 2.0.2.
57c3662ddd gnu: conan: Use gexps and remove input labels.
113146d31c gnu: r-mumin: Update to 1.47.5.
c029bac121 gnu: r-tclust: Update to 1.5-4.
aadc68f297 gnu: r-car: Update to 3.1-2.

$ git log --oneline | wc -l
110743

$ git log --format="%cd %s" | tail -3
Wed Apr 18 23:34:19 2012 +0200 Add `.gitignore'.
Wed Apr 18 23:34:12 2012 +0200 Split (guix) in (guix store) and (guix derivations).
Wed Apr 18 23:21:11 2012 +0200 Initial commit.
--8<---------------cut here---------------end--------------->8---
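
The unpacking steps above can be wrapped in a small function. Again this is just a sketch under assumptions: swh_unpack is a made-up name, and it assumes the archive holds a single top-level bare repository named swh:1:rev:<commit>.git, as in this session.

```shell
# Sketch: unpack a vault tarball so that Git commands work in the
# target directory.  Assumption: the archive contains one top-level
# directory "swh:1:rev:<commit>.git", as observed in the session above.
# Hypothetical helper name; not part of Guix or of any SWH tool.
swh_unpack() {
    # $1: tarball path, $2: full commit SHA-1, $3: target directory
    mkdir -p "$3" || return 1
    # Extract into the target directory instead of moving the tarball.
    tar -xf "$1" -C "$3" || return 1
    # Rename the bare repository to .git, as done by hand above.
    mv "$3/swh:1:rev:$2.git" "$3/.git"
}
```

After `swh_unpack /tmp/swh.git 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 some-dir`, running `git log --oneline -10` from within some-dir should give the same listing as above.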

And only the master branch seems to be around:

--8<---------------cut here---------------start------------->8---
$ git branch -avv
* master 1984d56b0e gnu: Add scilab.
--8<---------------cut here---------------end--------------->8---

Lastly, there is an SWH redirection that is probably not supported:

--8<---------------cut here---------------start------------->8---
$ guix time-machine -q --commit=1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 -- describe
SWH: found revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 with directory at 'https://archive.softwareheritage.org/api/1/directory/1ea499e7529e67a0632ecbe0a8214f0618a82c1a/'
SWH: object swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 could not be fetched from the vault
guix time-machine: warning: revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 of https://git.savannah.gnu.org/git/guix.git could not be fetched from Software Heritage
guix time-machine: error: Git error: failed to resolve address for git.savannah.gnu.org: Name or service not known
--8<---------------cut here---------------end--------------->8---

The issue is progressing…

Cheers,
simon




Reply sent to Nicolas Graves <ngraves <at> ngraves.fr>:
You have taken responsibility. (Sun, 04 Feb 2024 13:05:01 GMT)

Notification sent to Nicolas Graves <ngraves <at> ngraves.fr>:
bug acknowledged by developer. (Sun, 04 Feb 2024 13:05:02 GMT)

Message #50 received at 62656-done <at> debbugs.gnu.org:

From: Nicolas Graves <ngraves <at> ngraves.fr>
To: 62656-done <at> debbugs.gnu.org
Subject: close 62656
Date: Sun, 04 Feb 2024 14:03:48 +0100
Issue fixed.

-- 
Best regards,
Nicolas Graves




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 04 Mar 2024 12:24:11 GMT)



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.