GNU bug report logs - #49781
Successful build on ci.guix not available as substitute

Previous Next

Package: guix;

Reported by: Ludovic Courtès <ludovic.courtes <at> inria.fr>

Date: Fri, 30 Jul 2021 08:57:02 UTC

Severity: normal

Done: Mathieu Othacehe <othacehe <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 49781 in the body.
You can then email your comments to 49781 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to othacehe <at> gnu.org, bug-guix <at> gnu.org:
bug#49781; Package guix. (Fri, 30 Jul 2021 08:57:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ludovic Courtès <ludovic.courtes <at> inria.fr>:
New bug report received and forwarded. Copy sent to othacehe <at> gnu.org, bug-guix <at> gnu.org. (Fri, 30 Jul 2021 08:57:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: bug-guix <at> gnu.org
Subject: Successful build on ci.guix not available as substitute
Date: Fri, 30 Jul 2021 10:56:29 +0200
Hi!

I stumbled upon the case of a build that was successful on ci.guix but
that does not have a corresponding substitute:

  https://ci.guix.gnu.org/build/806609/details

and:

--8<---------------cut here---------------start------------->8---
$ wget -O- https://ci.guix.gnu.org/slm0x0j05vyl3vqbjhgc0nch6jca9fp3.narinfo
--2021-07-30 10:53:15--  https://ci.guix.gnu.org/slm0x0j05vyl3vqbjhgc0nch6jca9fp3.narinfo
Ni solvigas ci.guix.gnu.org (ci.guix.gnu.org)... 141.80.181.40
Konektado al ci.guix.gnu.org (ci.guix.gnu.org)|141.80.181.40|:443... konektita.
HTTP peto sendita, ni atendas respondon... 404 Not Found
2021-07-30 10:53:15 ERARO 404: Not Found.
--8<---------------cut here---------------end--------------->8---

The store item vanished from berlin, though the .drv is still there:

--8<---------------cut here---------------start------------->8---
ludo <at> berlin ~$ ls /gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824
ls: cannot access '/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824': No such file or directory
ludo <at> berlin ~$ guix gc --derivers /gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824
/gnu/store/hrn9hj7v1img3il6afh3r036aczvwnyd-xnnpack-0.0-1.bbe8824.drv
ludo <at> berlin ~$ ls $(guix gc --derivers /gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824)
/gnu/store/hrn9hj7v1img3il6afh3r036aczvwnyd-xnnpack-0.0-1.bbe8824.drv
--8<---------------cut here---------------end--------------->8---

It’s the first time I see this.  My understanding is that Cuirass makes
a GET request on ‘guix publish’ upon build completion.  Could there be a
race condition or something that can explain this?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#49781; Package guix. (Fri, 30 Jul 2021 10:12:02 GMT) Full text and rfc822 format available.

Message #8 received at 49781 <at> debbugs.gnu.org (full text, mbox):

From: Mathieu Othacehe <othacehe <at> gnu.org>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: 49781 <at> debbugs.gnu.org
Subject: Re: bug#49781: Successful build on ci.guix not available as substitute
Date: Fri, 30 Jul 2021 12:11:17 +0200
Hello Ludo,

> It’s the first time I see this.  My understanding is that Cuirass makes
> a GET request on ‘guix publish’ upon build completion.  Could there be a
> race condition or something that can explain this?

Your understanding is correct :). However, looking at the
cuirass-remote-server.log file, I found:

--8<---------------cut here---------------start------------->8---
2021-07-29T22:31:48 fetching '/gnu/store/hrn9hj7v1img3il6afh3r036aczvwnyd-xnnpack-0.0-1.bbe8824.drv' from http://141.80.167.185:5558
2021-07-29T22:31:48 Failed to add /gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824 to store.
--8<---------------cut here---------------end--------------->8---

There are quite a few other "Failed to add" traces in the log, and they
all seem to involve the machine behind the .185 IP address.

Fetching the substitute manually from Berlin works:

--8<---------------cut here---------------start------------->8---
 wget http://141.80.167.185:5558/nar/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824 --no-http-keep-alive
--8<---------------cut here---------------end--------------->8---

but using "ensure-path" fails this way:

--8<---------------cut here---------------start------------->8---
scheme@(guix-user)> (set-build-options* s "http://141.80.167.185:5558")
$1 = #t
scheme@(guix-user)> (ensure-path s "/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824")
substitute: updating substitutes from 'http://141.80.167.185:5558'... 100.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
ice-9/boot-9.scm:1669:16: In procedure raise-exception:
ERROR:
  1. &store-protocol-error:
      message: "path `/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824' does not exist and cannot be created"
      status: 1

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
--8<---------------cut here---------------end--------------->8---

Do you see any reason why ensure-path would fail while wget works?

Thanks,

Mathieu




Information forwarded to bug-guix <at> gnu.org:
bug#49781; Package guix. (Mon, 02 Aug 2021 08:39:02 GMT) Full text and rfc822 format available.

Message #11 received at 49781 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Mathieu Othacehe <othacehe <at> gnu.org>
Cc: 49781 <at> debbugs.gnu.org
Subject: Re: bug#49781: Successful build on ci.guix not available as substitute
Date: Mon, 02 Aug 2021 10:38:39 +0200
Hello!

Mathieu Othacehe <othacehe <at> gnu.org> skribis:

> Your understanding is correct :). However, looking at the
> cuirass-remote-server.log file, I found:
>
> 2021-07-29T22:31:48 fetching '/gnu/store/hrn9hj7v1img3il6afh3r036aczvwnyd-xnnpack-0.0-1.bbe8824.drv' from http://141.80.167.185:5558
> 2021-07-29T22:31:48 Failed to add /gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824 to store.
>
>
> There are quite a few other "Failed to add" traces in the log, and they
> all seem to involve the machine behind the .185 IP address.

Ah ha!

> Fetching the substitute manually from Berlin works:
>
>  wget http://141.80.167.185:5558/nar/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824 --no-http-keep-alive
>
>
> but using "ensure-path" fails this way:
>
> scheme@(guix-user)> (set-build-options* s "http://141.80.167.185:5558")
> $1 = #t
> scheme@(guix-user)> (ensure-path s "/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824")
> substitute: updating substitutes from 'http://141.80.167.185:5558'... 100.0%
> substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
> ice-9/boot-9.scm:1669:16: In procedure raise-exception:
> ERROR:
>   1. &store-protocol-error:
>       message: "path `/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824' does not exist and cannot be created"
>       status: 1
>
> Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
>
> Do you see any reason why ensure-path would fail while wget works?

The cache on berlin says this substitute is available from .185:

--8<---------------cut here---------------start------------->8---
ludo <at> berlin ~$ sudo cat /var/guix/substitute/cache/mlcrs6tdmrqacbxadjqt4svma66gjcemoqjydai2vxshcokzw3fa/slm0x0j05vyl3vqbjhgc0nch6jca9fp3; echo
(narinfo (version 2) (cache-uri "http://141.80.167.185:5558") (date 1627639693) (ttl 129600) (value "StorePath: /gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824\nURL: nar/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824\nCompression: none\nFileSize: 924872\nNarHash: sha256:0ym303zq7shxgrl0p797bpfr8iymdwg6gc3vq85b6ms1akmj396b\nNarSize: 924872\nReferences: 01b4w3m6mp55y531kyi1g8shh722kwqm-gcc-7.5.0-lib 4lgnc0s0h6zpdfbv7ccgj50ya6m6gy4j-pthreadpool-0.1-1.1787867 dv8kki4psjqyf01y0s0vf6j5hw3sm5wk-cpuinfo-0.0-1.866ae6e fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31 slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824\nSystem: x86_64-linux\nDeriver: hrn9hj7v1img3il6afh3r036aczvwnyd-xnnpack-0.0-1.bbe8824.drv\nSignature: 1;hydra-guix-128;KHNpZ25hdHVyZSAKIChkYXRhIAogIChmbGFncyByZmM2OTc5KQogIChoYXNoIHNoYTI1NiAjRjQzQTYyOTFERTM5QjY4MUVCRTQxNEZGRTE0NUNGQ0E5RDhEREQ5OTc3QTM0RUUzOTIxQzg3OEQ3N0JBMkFEOCMpCiAgKQogKHNpZy12YWwgCiAgKGVjZHNhIAogICAociAjMEVGNTIzNkIyNDk5QzAzQ0NBREIzODg2QUMwQUQzODg4QTUyOEU2NTQ3NTIxM0Q0ODYyNjYwMUJEODE1RDdGQyMpCiAgIChzICMwQTQ0ODU5N0VBNUJFODFFOEJFRDlDRkM2MkFFRTEwMUIxQjdGN0U1RUM5RTlFRDc1Q0Q0NjQ4NkYyQUVGNThFIykKICAgKQogICkKIChwdWJsaWMta2V5IAogIChlY2MgCiAgIChjdXJ2ZSBFZDI1NTE5KQogICAocSAjRkQyMEI1QjVFMDkwQjVBMzYzMTI0OTA1RTJDQjAyMEZCOUIzQzk3QTU0NTUxMDI4REYwQTM2NEU0QzRDNjY1MyMpCiAgICkKICApCiApCg==\n"))
--8<---------------cut here---------------end--------------->8---

Yet, I’m also stumbling upon the ‘ensure-path’ failure above, but now
the substitute seems to have vanished:

--8<---------------cut here---------------start------------->8---
scheme@(guix-user)> ,use(guix store)
scheme@(guix-user)> (define s (open-connection ))
scheme@(guix-user)> (set-build-options s #:substitute-urls '("http://141.80.167.185:5558"))
$1 = #t
scheme@(guix-user)> (ensure-path s "/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824")
substitute: updating substitutes from 'http://141.80.167.185:5558'... 100.0%
ice-9/boot-9.scm:1685:16: In procedure raise-exception:
ERROR:
  1. &store-protocol-error:
      message: "path `/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824' does not exist and cannot be created"
      status: 1

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guix-user) [1]> ,q
scheme@(guix-user)> (substitutable-path-info s '("/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824"))
substitute: updating substitutes from 'http://141.80.167.185:5558'... 100.0%
$2 = ()
scheme@(guix-user)> 
[1]+  Stopped                 guix repl
ludo <at> berlin ~$ sudo cat /var/guix/substitute/cache/mlcrs6tdmrqacbxadjqt4svma66gjcemoqjydai2vxshcokzw3fa/slm0x0j05vyl3vqbjhgc0nch6jca9fp3; echo (narinfo (version 2) (cache-uri "http://141.80.167.185:5558") (date 1627892782) (ttl 0) (value #f))
ludo <at> berlin ~$ fg
guix repl

scheme@(guix-user)> (substitutable-path-info s '("/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824"))
substitute: updating substitutes from 'http://141.80.167.185:5558'... 100.0%
$3 = ()
scheme@(guix-user)> (substitutable-path-info s '("/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824"))
substitute: updating substitutes from 'http://141.80.167.185:5558'... 100.0%
$4 = ()
scheme@(guix-user)> ,q
ludo <at> berlin ~$ wget -O - http://141.80.167.185:5558/slm0x0j05vyl3vqbjhgc0nch6jca9fp3.narinfo
--2021-08-02 10:28:22--  http://141.80.167.185:5558/slm0x0j05vyl3vqbjhgc0nch6jca9fp3.narinfo
Connecting to 141.80.167.185:5558... connected.
HTTP request sent, awaiting response... 404 Not Found
2021-08-02 10:28:22 ERROR 404: Not Found.
--8<---------------cut here---------------end--------------->8---

And indeed, the store item no longer is on .185:

--8<---------------cut here---------------start------------->8---
ludo <at> berlin ~$ sudo ssh hydra <at> 141.80.167.185
-bash-5.0$ ls /gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824
ls: cannot access '/gnu/store/slm0x0j05vyl3vqbjhgc0nch6jca9fp3-xnnpack-0.0-1.bbe8824': No such file or directory
--8<---------------cut here---------------end--------------->8---

One of the problems is that .185 did not advertise a TTL on its
narinfos, hence the default of (* 36 3600) = 129600 that we saw above.
That in turn explains some discrepancies: ‘guix substitute’ thinks the
nar is still around but the corresponding /nar request fails because it
has disappeared in the meantime.

But perhaps the core issue is that .185 did not keep the build result
long enough, perhaps because GC ran right after the build?  Is that
plausible?

Thanks,
Ludo’.




Reply sent to Mathieu Othacehe <othacehe <at> gnu.org>:
You have taken responsibility. (Mon, 09 Aug 2021 13:32:02 GMT) Full text and rfc822 format available.

Notification sent to Ludovic Courtès <ludovic.courtes <at> inria.fr>:
bug acknowledged by developer. (Mon, 09 Aug 2021 13:32:02 GMT) Full text and rfc822 format available.

Message #16 received at 49781-done <at> debbugs.gnu.org (full text, mbox):

From: Mathieu Othacehe <othacehe <at> gnu.org>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: 49781-done <at> debbugs.gnu.org
Subject: Re: bug#49781: Successful build on ci.guix not available as substitute
Date: Mon, 09 Aug 2021 15:31:30 +0200
Hey Ludo!

I found out the reason of this failure, the .185 key was not authorized
on Berlin. This issue while really simple, took us a long time to
detect. I think we need to come out with a big warning somewhere when a
substitute is refused because the server is not authorized.

In the meantime this is fixed with
d8f078a2c96b867abba5092a251175b8d82dd37c in the maintenance repository.

Closing this one,

Thanks,

Mathieu




Information forwarded to bug-guix <at> gnu.org:
bug#49781; Package guix. (Tue, 10 Aug 2021 07:54:02 GMT) Full text and rfc822 format available.

Message #19 received at 49781-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Mathieu Othacehe <othacehe <at> gnu.org>
Cc: 49781-done <at> debbugs.gnu.org
Subject: Re: bug#49781: Successful build on ci.guix not available as substitute
Date: Tue, 10 Aug 2021 09:53:34 +0200
Hi,

Mathieu Othacehe <othacehe <at> gnu.org> skribis:

> I found out the reason of this failure, the .185 key was not authorized
> on Berlin. This issue while really simple, took us a long time to
> detect. I think we need to come out with a big warning somewhere when a
> substitute is refused because the server is not authorized.

Ouch!  That machine must have been off-line when I initially imported
all the keys.

The difficulty with printing a warning is that when using ‘--discover’
or similar, we just want to use unauthorized machines as mirrors, and
there should be no warnings in that case.  But it’s hard to tell whether
that’s intended or not.

> In the meantime this is fixed with
> d8f078a2c96b867abba5092a251175b8d82dd37c in the maintenance repository.

Thank you!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 07 Sep 2021 11:24:07 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 225 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.