GNU bug report logs - #57978
'guix substitute' stops when first substitute URL is unroutable


Package: guix

Reported by: Attila Lendvai <attila <at> lendvai.name>

Date: Wed, 21 Sep 2022 13:12:02 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 57978 in the body.
You can then email your comments to 57978 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Wed, 21 Sep 2022 13:12:02 GMT)

Acknowledgement sent to Attila Lendvai <attila <at> lendvai.name>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Wed, 21 Sep 2022 13:12:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Attila Lendvai <attila <at> lendvai.name>
To: "bug-guix <at> gnu.org" <bug-guix <at> gnu.org>
Subject: the fallback machanism for substitute servers doesn't work?
Date: Wed, 21 Sep 2022 13:11:06 +0000
ci.guix.gnu.org is down right now. if i add --substitute-urls=http://bordeaux.guix.gnu.org then things work, but sans that it fails:


$ ./pre-inst-env guix system --no-graphic vm ~/workspace/guix/guix-crypto/tests/swarm-tests.scm
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
substitute: updating substitutes from 'https://substitutes.nonguix.org'... 100.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%guix substitute: warning: ci.guix.gnu.org: connection failed: No route to host
substitute: 
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 100.0%
The following derivations will be built:
  /gnu/store/fkdmiwmvb6ar6n04hr470r3f5frgcbnc-bee-binary-1.8.1.drv
  /gnu/store/ksqiqmdijfy34g9qmqhkn3r5kww7v644-bee-linux-amd64.drv
  /gnu/store/lgc6jnar1qha8dydhi5p9ni2jawp5wmd-module-import-compiled.drv
  /gnu/store/kjza0q20vy6jywfrzr4l5df5va8d5ia9-geth-binary-1.10.25.drv
  /gnu/store/f090qzxym89vp3r13fbqlh4ghbnfc7ls-geth-alltools-linux-amd64-1.10.25-69568c55.tar.gz.drv
  /gnu/store/kxjd60sx5hxygkz8vfj670f2c70xdjxd-module-import-compiled.drv
  /gnu/store/jkp7wrakjv4gqjn475kszaa425zgm62a-openethereum-binary-3.3.5.drv
  /gnu/store/d1cl9x0gy0bns9frqwgliq0z7604vian-openethereum-linux-v3.3.5.zip.drv

71.2 MB will be downloaded
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
guix substitute: warning: ci.guix.gnu.org: connection failed: No route to host
 qemu-minimal-7.1.0-doc  3.4MiB                                                                                                  876.6MiB/s 00:00 [##################] 100.0%
guix substitute: error: connect*: No route to host
substitution of /gnu/store/7czrnkybr466v69wdj6i2sn6vpsg0ks3-cdrkit-libre-1.1.11 failed
guix system: error: corrupt input while restoring archive from #<closed: file 7f37458bd000>

the second time fails with another package:

$ ./pre-inst-env guix system --no-graphic vm ~/workspace/guix/guix-crypto/tests/swarm-tests.scm
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
guix system: warning: the following groups appear more than once: swarm-mainnet
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%guix substitute: warning: ci.guix.gnu.org: connection failed: No route to host
substitute: 
The following derivations will be built:
  /gnu/store/fkdmiwmvb6ar6n04hr470r3f5frgcbnc-bee-binary-1.8.1.drv
  /gnu/store/ksqiqmdijfy34g9qmqhkn3r5kww7v644-bee-linux-amd64.drv
  /gnu/store/lgc6jnar1qha8dydhi5p9ni2jawp5wmd-module-import-compiled.drv
  /gnu/store/kjza0q20vy6jywfrzr4l5df5va8d5ia9-geth-binary-1.10.25.drv
  /gnu/store/f090qzxym89vp3r13fbqlh4ghbnfc7ls-geth-alltools-linux-amd64-1.10.25-69568c55.tar.gz.drv
  /gnu/store/kxjd60sx5hxygkz8vfj670f2c70xdjxd-module-import-compiled.drv
  /gnu/store/jkp7wrakjv4gqjn475kszaa425zgm62a-openethereum-binary-3.3.5.drv
  /gnu/store/d1cl9x0gy0bns9frqwgliq0z7604vian-openethereum-linux-v3.3.5.zip.drv

71.2 MB will be downloaded
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
guix substitute: error: connect*: No route to host
substitution of /gnu/store/qvd2h5fd60h9p6yc161mndznf1785c9p-cpio-2.13 failed
guix system: error: corrupt input while restoring archive from #<closed: file 7fdc6f291000>


Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Wed, 21 Sep 2022 20:51:02 GMT)

Message #8 received at 57978 <at> debbugs.gnu.org:

From: zimoun <zimon.toutoune <at> gmail.com>
To: Attila Lendvai <attila <at> lendvai.name>, 57978 <at> debbugs.gnu.org
Subject: Re: bug#57978: the fallback machanism for substitute servers
 doesn't work?
Date: Wed, 21 Sep 2022 22:43:29 +0200
Hi,

On Wed, 21 Sep 2022 at 13:11, Attila Lendvai <attila <at> lendvai.name> wrote:
> ci.guix.gnu.org is down right now. if i add --substitute-urls=http://bordeaux.guix.gnu.org then things work, but sans that it fails:

[...]

> guix substitute: error: connect*: No route to host
> substitution of /gnu/store/7czrnkybr466v69wdj6i2sn6vpsg0ks3-cdrkit-libre-1.1.11 failed
> guix system: error: corrupt input while restoring archive from #<closed: file 7f37458bd000>

I observed the same behaviour.  In addition, I notice:

 1. even if I have the substitute inside the store of my offloading
    machine, it fails with a similar error

 2. the option --fallback does not fall back and does not build locally.


Cheers,
simon




Severity set to 'important' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 22 Sep 2022 09:34:02 GMT)

Changed bug title to ''guix substitute' stops when first substitute URL is unroutable' from 'the fallback machanism for substitute servers doesn't work?' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 22 Sep 2022 09:35:02 GMT)

Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Thu, 22 Sep 2022 09:49:02 GMT)

Message #15 received at 57978 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Attila Lendvai <attila <at> lendvai.name>
Cc: 57978 <at> debbugs.gnu.org
Subject: Re: bug#57978: the fallback machanism for substitute servers
 doesn't work?
Date: Thu, 22 Sep 2022 11:47:56 +0200
Hi,

Attila Lendvai <attila <at> lendvai.name> skribis:

> guix substitute: warning: ci.guix.gnu.org: connection failed: No route to host
>  qemu-minimal-7.1.0-doc  3.4MiB                                                                                                  876.6MiB/s 00:00 [##################] 100.0%
> guix substitute: error: connect*: No route to host
> substitution of /gnu/store/7czrnkybr466v69wdj6i2sn6vpsg0ks3-cdrkit-libre-1.1.11 failed
> guix system: error: corrupt input while restoring archive from #<closed: file 7f37458bd000>

I observed the same yesterday when ci.guix was down.

Note that the following command, where 203.* is unroutable, does not
reproduce it:

  guix build --substitute-urls="http://203.0.113.1 https://ci.guix.gnu.org" \
    --no-grafts pandoc    # pick a package not in store

So I believe what we experienced yesterday goes along these lines:

  1. We had cached narinfos for ci.guix available locally so the daemon
     assumed it could go ahead and download from ci.guix;

  2. When ‘guix substitute --substitute’ went to download stuff from
     ci.guix, which it assumed was possible because there was a valid
     narinfo for that, it didn’t handle the connection failure.  (The
     same happens if you get, say, 404 while substituting even though
     you have a valid substitute at hand.)

Trying to come up with a fix…

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Fri, 23 Sep 2022 06:17:03 GMT)

Message #18 received at 57978 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: guix-patches <at> gnu.org
Cc: Attila Lendvai <attila <at> lendvai.name>, 57978 <at> debbugs.gnu.org,
 Ludovic Courtès <ludo <at> gnu.org>,
 zimoun <zimon.toutoune <at> gmail.com>
Subject: [PATCH 0/2] Retry nar downloads upon failure
Date: Fri, 23 Sep 2022 08:16:16 +0200
Hello!

This is a long overdue fix for <https://issues.guix.gnu.org/57978>:
when a nar cannot be downloaded from its “preferred” location,
‘guix substitute --substitute’ will now retry once for each substitute
URL instead of failing right away.

This should address the most common issues such as transient
networking failures.
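
As an illustration only (this is not the code from the patch series), the
retry idea can be sketched in plain Guile.  The names 'make-fake-download'
and 'download-with-fallback' are hypothetical, the download is simulated
rather than performed, and the sketch leaves out the narinfo lookup and
authorization checks that the real code has to do for each URL:

```scheme
(use-modules (ice-9 match))

;; Hypothetical stand-in for a nar download: it succeeds only for URLs
;; listed in REACHABLE and otherwise raises a 'system-error carrying
;; EHOSTUNREACH, much as a failed connect(2) would.
(define (make-fake-download reachable)
  (lambda (url)
    (if (member url reachable)
        (string-append "nar from " url)
        (throw 'system-error "connect" "~A"
               (list (strerror EHOSTUNREACH))
               (list EHOSTUNREACH)))))

;; Hypothetical helper: try each URL in turn instead of giving up on
;; the first failure.
(define (download-with-fallback urls download)
  "Call DOWNLOAD on each of URLS in order.  Return the first successful
result, or #f when every URL fails with a 'system-error."
  (let loop ((urls urls))
    (match urls
      (() #f)
      ((url rest ...)
       (catch 'system-error
         (lambda () (download url))
         (lambda args
           ;; ARGS is the full exception argument list, key included,
           ;; which is what 'system-error-errno' expects.
           (format (current-error-port) "~a: ~a; trying next URL~%"
                   url (strerror (system-error-errno args)))
           (loop rest)))))))

;; Usage: the first URL is unreachable, the second one answers.
(display
 (download-with-fallback
  '("https://ci.guix.gnu.org" "https://bordeaux.guix.gnu.org")
  (make-fake-download '("https://bordeaux.guix.gnu.org"))))
(newline)
```

The sketch only catches 'system-error; the actual change also has to
treat HTTP errors such as 404 as a reason to move on to the next URL.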

Comments?

Thanks,
Ludo’.

Ludovic Courtès (2):
  substitute: Split nar download.
  substitute: Retry downloading when a nar is unavailable.

 guix/scripts/substitute.scm | 157 +++++++++++++++++++++++++++---------
 tests/substitute.scm        | 113 ++++++++++++++++++++++++++
 2 files changed, 231 insertions(+), 39 deletions(-)


base-commit: a09655b20850d065333ec333e6e184b604f606a8
-- 
2.37.3





Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Fri, 23 Sep 2022 06:21:01 GMT)

Message #21 received at 57978 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: 58017 <at> debbugs.gnu.org
Cc: Attila Lendvai <attila <at> lendvai.name>, 57978 <at> debbugs.gnu.org,
 Ludovic Courtès <ludo <at> gnu.org>,
 zimoun <zimon.toutoune <at> gmail.com>
Subject: [PATCH 1/2] substitute: Split nar download.
Date: Fri, 23 Sep 2022 08:19:56 +0200
* guix/scripts/substitute.scm (download-nar): New procedure, with most
of the code moved from...
(process-substitution): ... here.  Call it.
---
 guix/scripts/substitute.scm | 52 +++++++++++++++++++++++--------------
 1 file changed, 32 insertions(+), 20 deletions(-)

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index cdf591ac4d..e3b382d0d8 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -437,20 +437,13 @@ (define-syntax-rule (with-cached-connection uri port exp ...)
   "Bind PORT with EXP... to a socket connected to URI."
   (call-with-cached-connection uri (lambda (port) exp ...)))
 
-(define* (process-substitution port store-item destination
-                               #:key cache-urls acl
-                               deduplicate? print-build-trace?)
-  "Substitute STORE-ITEM (a store file name) from CACHE-URLS, and write it to
-DESTINATION as a nar file.  Verify the substitute against ACL, and verify its
-hash against what appears in the narinfo.  When DEDUPLICATE? is true, and if
-DESTINATION is in the store, deduplicate its files.  Print a status line to
-PORT."
-  (define narinfo
-    (lookup-narinfo cache-urls store-item
-                    (if (%allow-unauthenticated-substitutes?)
-                        (const #t)
-                        (cut valid-narinfo? <> acl))))
-
+(define* (download-nar narinfo destination
+                       #:key status-port
+                       deduplicate? print-build-trace?)
+  "Download the nar prescribed in NARINFO, which is assumed to be authentic
+and authorized, and write it to DESTINATION.  When DEDUPLICATE? is true, and
+if DESTINATION is in the store, deduplicate its files.  Print a status line to
+STATUS-PORT."
   (define destination-in-store?
     (string-prefix? (string-append (%store-prefix) "/")
                     destination))
@@ -490,10 +483,6 @@ (define (fetch uri)
        (leave (G_ "unsupported substitute URI scheme: ~a~%")
               (uri->string uri)))))
 
-  (unless narinfo
-    (leave (G_ "no valid substitute for '~a'~%")
-           store-item))
-
   (let ((uri compression file-size
              (narinfo-best-uri narinfo
                                #:fast-decompression?
@@ -575,14 +564,37 @@ (define cpu-usage
       (let ((actual (get-hash)))
         (if (bytevector=? actual expected)
             ;; Tell the daemon that we're done.
-            (format port "success ~a ~a~%"
+            (format status-port "success ~a ~a~%"
                     (narinfo-hash narinfo) (narinfo-size narinfo))
             ;; The actual data has a different hash than that in NARINFO.
-            (format port "hash-mismatch ~a ~a ~a~%"
+            (format status-port "hash-mismatch ~a ~a ~a~%"
                     (hash-algorithm-name algorithm)
                     (bytevector->nix-base32-string expected)
                     (bytevector->nix-base32-string actual)))))))
 
+(define* (process-substitution port store-item destination
+                               #:key cache-urls acl
+                               deduplicate? print-build-trace?)
+  "Substitute STORE-ITEM (a store file name) from CACHE-URLS, and write it to
+DESTINATION as a nar file.  Verify the substitute against ACL, and verify its
+hash against what appears in the narinfo.  When DEDUPLICATE? is true, and if
+DESTINATION is in the store, deduplicate its files.  Print a status line to
+PORT."
+  (define narinfo
+    (lookup-narinfo cache-urls store-item
+                    (if (%allow-unauthenticated-substitutes?)
+                        (const #t)
+                        (cut valid-narinfo? <> acl))))
+
+  (unless narinfo
+    (leave (G_ "no valid substitute for '~a'~%")
+           store-item))
+
+  (download-nar narinfo destination
+                #:status-port port
+                #:deduplicate? deduplicate?
+                #:print-build-trace? print-build-trace?))
+
 
 ;;;
 ;;; Entry point.
-- 
2.37.3





Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Fri, 23 Sep 2022 06:21:02 GMT)

Message #24 received at 57978 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: 58017 <at> debbugs.gnu.org
Cc: Attila Lendvai <attila <at> lendvai.name>, 57978 <at> debbugs.gnu.org,
 Ludovic Courtès <ludo <at> gnu.org>,
 zimoun <zimon.toutoune <at> gmail.com>
Subject: [PATCH 2/2] substitute: Retry downloading when a nar is unavailable.
Date: Fri, 23 Sep 2022 08:19:57 +0200
Fixes <https://issues.guix.gnu.org/57978>
Reported by Attila Lendvai <attila <at> lendvai.name>.

Previously, if a narinfo was available but its corresponding nar was
missing (for instance because the narinfo was cached and the server
became unreachable in the meantime), 'guix substitute --substitute'
would try to download the nar from its preferred location and abort when
that fails.  This change forces one retry with each of the URLs.

* guix/scripts/substitute.scm (download-nar): Do not catch
'http-get-error?' exceptions.
(system-error?, network-error?, process-substitution/fallback): New
procedures.
(process-substitution): Call 'process-substitution/fallback' upon
'network-error?'.
* tests/substitute.scm ("substitute, first URL has narinfo but lacks nar, second URL unauthorized")
("substitute, first URL has narinfo but nar is 404, both URLs authorized")
("substitute, first URL has narinfo but nar is 404, one URL authorized")
("substitute, narinfo is available but nar is missing"): New tests.
---
 guix/scripts/substitute.scm | 113 ++++++++++++++++++++++++++++--------
 tests/substitute.scm        | 113 ++++++++++++++++++++++++++++++++++++
 2 files changed, 203 insertions(+), 23 deletions(-)

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index e3b382d0d8..cf59db4315 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -460,25 +460,20 @@ (define (fetch uri)
        (let ((port (open-file (uri-path uri) "r0b")))
          (values port (stat:size (stat port)))))
       ((http https)
-       (guard (c ((http-get-error? c)
-                  (leave (G_ "download from '~a' failed: ~a, ~s~%")
-                         (uri->string (http-get-error-uri c))
-                         (http-get-error-code c)
-                         (http-get-error-reason c))))
-         ;; Test this with:
-         ;;   sudo tc qdisc add dev eth0 root netem delay 1500ms
-         ;; and then cancel with:
-         ;;   sudo tc qdisc del dev eth0 root
-         (with-timeout %fetch-timeout
-           (begin
-             (warning (G_ "while fetching ~a: server is somewhat slow~%")
-                      (uri->string uri))
-             (warning (G_ "try `--no-substitutes' if the problem persists~%")))
-           (with-cached-connection uri port
-             (http-fetch uri #:text? #f
-                         #:port port
-                         #:keep-alive? #t
-                         #:buffered? #f)))))
+       ;; Test this with:
+       ;;   sudo tc qdisc add dev eth0 root netem delay 1500ms
+       ;; and then cancel with:
+       ;;   sudo tc qdisc del dev eth0 root
+       (with-timeout %fetch-timeout
+         (begin
+           (warning (G_ "while fetching ~a: server is somewhat slow~%")
+                    (uri->string uri))
+           (warning (G_ "try `--no-substitutes' if the problem persists~%")))
+         (with-cached-connection uri port
+           (http-fetch uri #:text? #f
+                       #:port port
+                       #:keep-alive? #t
+                       #:buffered? #f))))
       (else
        (leave (G_ "unsupported substitute URI scheme: ~a~%")
               (uri->string uri)))))
@@ -572,6 +567,68 @@ (define cpu-usage
                     (bytevector->nix-base32-string expected)
                     (bytevector->nix-base32-string actual)))))))
 
+(define system-error?
+  (let ((kind-and-args? (exception-predicate &exception-with-kind-and-args)))
+    (lambda (exception)
+      "Return true if EXCEPTION is a Guile 'system-error exception."
+      (and (kind-and-args? exception)
+           (eq? 'system-error (exception-kind exception))))))
+
+(define network-error?
+  (let ((kind-and-args? (exception-predicate &exception-with-kind-and-args)))
+    (lambda (exception)
+      "Return true if EXCEPTION denotes a networking error."
+      (or (and (system-error? exception)
+               (let ((errno (system-error-errno
+                             (cons 'system-error (exception-args exception)))))
+                 (memv errno (list ECONNRESET ECONNABORTED
+                                   ECONNREFUSED EHOSTUNREACH
+                                   ENOENT))))     ;for "file://"
+          (and (kind-and-args? exception)
+               (memq (exception-kind exception)
+                     '(gnutls-error getaddrinfo-error)))
+          (and (http-get-error? exception)
+               (begin
+                 (warning (G_ "download from '~a' failed: ~a, ~s~%")
+                          (uri->string (http-get-error-uri exception))
+                          (http-get-error-code exception)
+                          (http-get-error-reason exception))
+                 #t))))))
+
+(define* (process-substitution/fallback port narinfo destination
+                                        #:key cache-urls acl
+                                        deduplicate? print-build-trace?)
+  "Attempt to substitute NARINFO, which is assumed to be authorized or
+equivalent, by trying to download its nar from each entry in CACHE-URLS.
+
+This can be less efficient than 'lookup-narinfo', which stops at the first
+entry that provides a valid narinfo, but it makes sure we eventually find a
+way to download the nar."
+  ;; Note: Keep NARINFO's uri-base in CACHE-URLS: that lets us retry in case
+  ;; this was a transient issue.
+  (let loop ((cache-urls cache-urls))
+    (match cache-urls
+      (()
+       (leave (G_ "failed to find alternative substitute for '~a'~%")
+              (narinfo-path narinfo)))
+      ((cache-url rest ...)
+       (match (lookup-narinfos cache-url
+                               (list (narinfo-path narinfo))
+                               #:open-connection
+                               open-connection-for-uri/cached)
+         ((alternate)
+          (if (or (equivalent-narinfo? narinfo alternate)
+                  (valid-narinfo? alternate acl)
+                  (%allow-unauthenticated-substitutes?))
+              (guard (c ((network-error? c) (loop rest)))
+                (download-nar alternate destination
+                              #:status-port port
+                              #:deduplicate? deduplicate?
+                              #:print-build-trace? print-build-trace?))
+              (loop rest)))
+         (()
+          (loop rest)))))))
+
 (define* (process-substitution port store-item destination
                                #:key cache-urls acl
                                deduplicate? print-build-trace?)
@@ -590,10 +647,20 @@ (define narinfo
     (leave (G_ "no valid substitute for '~a'~%")
            store-item))
 
-  (download-nar narinfo destination
-                #:status-port port
-                #:deduplicate? deduplicate?
-                #:print-build-trace? print-build-trace?))
+  (guard (c ((network-error? c)
+             (format (current-error-port)
+                     (G_ "retrying download of '~a' with other substitute URLs...~%")
+                     store-item)
+             (process-substitution/fallback port narinfo destination
+                                            #:cache-urls cache-urls
+                                            #:acl acl
+                                            #:deduplicate? deduplicate?
+                                            #:print-build-trace?
+                                            print-build-trace?)))
+    (download-nar narinfo destination
+                  #:status-port port
+                  #:deduplicate? deduplicate?
+                  #:print-build-trace? print-build-trace?)))
 
 
 ;;;
diff --git a/tests/substitute.scm b/tests/substitute.scm
index 5315292987..9032a50268 100644
--- a/tests/substitute.scm
+++ b/tests/substitute.scm
@@ -523,6 +523,119 @@ (define-syntax-rule (with-narinfo* narinfo directory body ...)
         (lambda ()
           (false-if-exception (delete-file "substitute-retrieved")))))))
 
+(test-equal "substitute, first URL has narinfo but lacks nar, second URL unauthorized"
+  "Substitutable data."
+  (with-narinfo*
+      (string-append %narinfo "Signature: "
+                     (signature-field
+                      %narinfo
+                      #:public-key %wrong-public-key))
+      %alternate-substitute-directory
+
+    (with-narinfo* (string-append %narinfo "Signature: "
+                                  (signature-field %narinfo))
+        %main-substitute-directory
+
+      (dynamic-wind
+        (const #t)
+        (lambda ()
+          ;; Remove this file so that the substitute can only be retrieved
+          ;; from %ALTERNATE-SUBSTITUTE-DIRECTORY.
+          (delete-file (string-append %main-substitute-directory
+                                      "/example.nar"))
+
+          (parameterize ((substitute-urls
+                          (map (cut string-append "file://" <>)
+                               (list %main-substitute-directory
+                                     %alternate-substitute-directory))))
+            (request-substitution (string-append (%store-prefix)
+                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
+                                  "substitute-retrieved"))
+          (call-with-input-file "substitute-retrieved" get-string-all))
+        (lambda ()
+          (false-if-exception (delete-file "substitute-retrieved")))))))
+
+(test-equal "substitute, first URL has narinfo but nar is 404, both URLs authorized"
+  "Substitutable data."
+  (with-narinfo*
+      (string-append %narinfo "Signature: "
+                     (signature-field %narinfo))
+      %main-substitute-directory
+
+    (with-http-server `((200 ,(string-append %narinfo "Signature: "
+                                             (signature-field %narinfo)))
+                        (404 "Sorry, nar is missing!"))
+      (dynamic-wind
+        (const #t)
+        (lambda ()
+          (parameterize ((substitute-urls
+                          (list (%local-url)
+                                (string-append "file://"
+                                               %main-substitute-directory))))
+            (request-substitution (string-append (%store-prefix)
+                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
+                                  "substitute-retrieved"))
+          (call-with-input-file "substitute-retrieved" get-string-all))
+        (lambda ()
+          (false-if-exception (delete-file "substitute-retrieved")))))))
+
+(test-equal "substitute, first URL has narinfo but nar is 404, one URL authorized"
+  "Substitutable data."
+  (with-narinfo*
+      (string-append %narinfo "Signature: "
+                     (signature-field
+                      %narinfo
+                      #:public-key %wrong-public-key))
+      %main-substitute-directory
+
+    (with-http-server `((200 ,(string-append %narinfo "Signature: "
+                                             (signature-field
+                                              %narinfo
+                                              #:public-key %wrong-public-key)))
+                        (404 "Sorry, nar is missing!"))
+      (let ((url1 (%local-url)))
+        (parameterize ((%http-server-port 0))
+          (with-http-server `((200 ,(string-append %narinfo "Signature: "
+                                                   (signature-field %narinfo)))
+                              (404 "Sorry, nar is missing!"))
+            (let ((url2 (%local-url)))
+              (dynamic-wind
+                (const #t)
+                (lambda ()
+                  (parameterize ((substitute-urls
+                                  (list url1 url2
+                                        (string-append "file://"
+                                                       %main-substitute-directory))))
+                    (request-substitution (string-append (%store-prefix)
+                                                         "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
+                                          "substitute-retrieved"))
+                  (call-with-input-file "substitute-retrieved" get-string-all))
+                (lambda ()
+                  (false-if-exception (delete-file "substitute-retrieved")))))))))))
+
+(test-quit "substitute, narinfo is available but nar is missing"
+    "failed to find alternative substitute"
+  (with-narinfo*
+      (string-append %narinfo "Signature: "
+                     (signature-field
+                      %narinfo
+                      #:public-key %wrong-public-key))
+      %main-substitute-directory
+
+    (with-http-server `((200 ,(string-append %narinfo "Signature: "
+                                             (signature-field %narinfo)))
+                        (404 "Sorry, nar is missing!"))
+      (parameterize ((substitute-urls
+                      (list (%local-url)
+                            (string-append "file://"
+                                           %main-substitute-directory))))
+        (delete-file (string-append %main-substitute-directory
+                                    "/example.nar"))
+        (request-substitution (string-append (%store-prefix)
+                                             "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
+                              "substitute-retrieved")
+        (not (file-exists? "substitute-retrieved"))))))
+
 (test-equal "substitute, first narinfo is unsigned and has wrong hash"
   "Substitutable data."
   (with-narinfo* (regexp-substitute #f
-- 
2.37.3





Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Fri, 23 Sep 2022 09:26:01 GMT)

Message #27 received at 57978 <at> debbugs.gnu.org:

From: zimoun <zimon.toutoune <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>, 58017 <at> debbugs.gnu.org
Cc: Attila Lendvai <attila <at> lendvai.name>, 57978 <at> debbugs.gnu.org,
 Ludovic Courtès <ludo <at> gnu.org>
Subject: Re: bug#57978: [PATCH 1/2] substitute: Split nar download.
Date: Fri, 23 Sep 2022 09:56:47 +0200
Hi,


On ven., 23 sept. 2022 at 08:19, Ludovic Courtès <ludo <at> gnu.org> wrote:
> * guix/scripts/substitute.scm (download-nar): New procedure, with most
> of the code moved from...
> (process-substitution): ... here.  Call it.

LGTM.

Just to be sure: the patch also tweaks the logic that checks the narinfo,
and this is not mentioned in the commit message, IMHO.



Cheers,
simon





Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Fri, 23 Sep 2022 09:26:02 GMT) Full text and rfc822 format available.

Message #30 received at 57978 <at> debbugs.gnu.org (full text, mbox):

From: zimoun <zimon.toutoune <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>, 58017 <at> debbugs.gnu.org
Cc: Attila Lendvai <attila <at> lendvai.name>, 57978 <at> debbugs.gnu.org,
 Ludovic Courtès <ludo <at> gnu.org>
Subject: Re: bug#57978: [PATCH 2/2] substitute: Retry downloading when a nar
 is unavailable.
Date: Fri, 23 Sep 2022 10:17:16 +0200
Hi,

On ven., 23 sept. 2022 at 08:19, Ludovic Courtès <ludo <at> gnu.org> wrote:

> Fixes <https://issues.guix.gnu.org/57978>
> Reported by Attila Lendvai <attila <at> lendvai.name>.
>
> Previously, if a narinfo was available but its corresponding nar was
> missing (for instance because the narinfo was cached and the server
> became unreachable in the meantime), 'guix substitute --substitute'
> would try to download the nar from its preferred location and abort when
> that fails.  This change forces one retry with each of the URLs.
>
> * guix/scripts/substitute.scm (download-nar): Do not catch
> 'http-get-error?' exceptions.
> (system-error?, network-error?, process-substitution/fallback): New
> procedures.
> (process-substitution): Call 'process-substitution/fallback' upon
> 'network-error?'.
> * tests/substitute.scm ("substitute, first URL has narinfo but lacks nar, second URL unauthorized")
> ("substitute, first URL has narinfo but nar is 404, both URLs authorized")
> ("substitute, first URL has narinfo but nar is 404, one URL authorized")
> ("substitute, narinfo is available but nar is missing"): New tests.

LGTM.
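
(Aside, for readers of this log: a minimal sketch of the "retry with each
of the URLs" idea from the commit message quoted above.  This is not the
code from the patch.  'fetch-with-fallback' and 'demo-fetch' are
hypothetical names, and catching only 'system-error' is an assumption
standing in for the patch's 'network-error?' predicate.)

```scheme
;; Hypothetical helper, not part of Guix: FETCH is any procedure that
;; takes a URL and either returns data or throws 'system-error'.
(define (fetch-with-fallback fetch urls)
  "Call FETCH on each of URLS in order and return the first result that
does not raise 'system-error'.  Raise an error when every URL failed."
  (let loop ((urls urls))
    (if (null? urls)
        (error "no URL could provide the requested item")
        (catch 'system-error
          (lambda ()
            (fetch (car urls)))
          (lambda _
            ;; This URL failed: move on to the next one.
            (loop (cdr urls)))))))

;; Stand-in fetcher for illustration: only "file://" URLs succeed,
;; anything else behaves like an unreachable server.
(define (demo-fetch url)
  (if (string-prefix? "file://" url)
      "Substitutable data."
      (throw 'system-error "demo-fetch" "~A"
             (list "connection refused") (list ECONNREFUSED))))

(fetch-with-fallback demo-fetch
                     '("http://localhost:1" "file:///tmp/substitutes"))
```

Errors other than 'system-error' are deliberately left to propagate, so
that an unrelated bug is not mistaken for an unreachable server.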


> +(test-equal "substitute, first URL has narinfo but nar is 404, one URL authorized"
> +  "Substitutable data."
> +  (with-narinfo*
> +      (string-append %narinfo "Signature: "
> +                     (signature-field
> +                      %narinfo
> +                      #:public-key %wrong-public-key))
> +      %main-substitute-directory
> +
> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
> +                                             (signature-field
> +                                              %narinfo
> +                                              #:public-key %wrong-public-key)))
> +                        (404 "Sorry, nar is missing!"))
> +      (let ((url1 (%local-url)))
> +        (parameterize ((%http-server-port 0))
> +          (with-http-server `((200 ,(string-append %narinfo "Signature: "
> +                                                   (signature-field %narinfo)))
> +                              (404 "Sorry, nar is missing!"))
> +            (let ((url2 (%local-url)))
> +              (dynamic-wind
> +                (const #t)
> +                (lambda ()
> +                  (parameterize ((substitute-urls
> +                                  (list url1 url2
> +                                        (string-append "file://"
> +                                                       %main-substitute-directory))))
> +                    (request-substitution (string-append (%store-prefix)
> +                                                         "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
> +                                          "substitute-retrieved"))
> +                  (call-with-input-file "substitute-retrieved" get-string-all))
> +                (lambda ()
> +                  (false-if-exception (delete-file "substitute-retrieved")))))))))))

Although I do not understand this test.  Why is 404 appearing twice?


Cheers,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Sat, 24 Sep 2022 01:58:01 GMT) Full text and rfc822 format available.

Message #33 received at 57978 <at> debbugs.gnu.org (full text, mbox):

From: Maxime Devos <maximedevos <at> telenet.be>
To: Ludovic Courtès <ludo <at> gnu.org>, 58017 <at> debbugs.gnu.org
Cc: Attila Lendvai <attila <at> lendvai.name>, 57978 <at> debbugs.gnu.org,
 zimoun <zimon.toutoune <at> gmail.com>
Subject: Re: [bug#58017] [PATCH 2/2] substitute: Retry downloading when a nar
 is unavailable.
Date: Sat, 24 Sep 2022 03:57:33 +0200
> +(test-equal "substitute, first URL has narinfo but nar is 404, both URLs authorized"
> +  "Substitutable data."
> +  (with-narinfo*
> +      (string-append %narinfo "Signature: "
> +                     (signature-field %narinfo))
> +      %main-substitute-directory
> +
> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
> +                                             (signature-field %narinfo)))
> +                        (404 "Sorry, nar is missing!"))
> +      (dynamic-wind
> +        (const #t)
> +        (lambda ()
> +          (parameterize ((substitute-urls
> +                          (list (%local-url)
> +                                (string-append "file://"
> +                                               %main-substitute-directory))))
> +            (request-substitution (string-append (%store-prefix)
> +                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
> +                                  "substitute-retrieved"))
> +          (call-with-input-file "substitute-retrieved" get-string-all))
> +        (lambda ()
> +          (false-if-exception (delete-file "substitute-retrieved")))))))

Shouldn't it only ignore 'file not found' (ENOENT?) exceptions?
If the exception handling is refined a bit, it becomes a bit more 
complicated, and could be simplified to (when [exists] [delete]), as 
there are no atomicity concerns.
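
(A minimal sketch of the two options, assuming plain Guile.  The
procedure names are hypothetical and are not part of the patch or of the
test suite.)

```scheme
;; Option 1: check first, then delete.  Good enough when nothing else
;; can remove the file in between, as in a single-threaded test.
(define (delete-file-if-exists file)
  (when (file-exists? file)
    (delete-file file)))

;; Option 2: attempt the deletion and ignore only ENOENT ("no such file
;; or directory").  Any other error, such as EACCES, is re-thrown.
(define (delete-file/ignore-enoent file)
  (catch 'system-error
    (lambda ()
      (delete-file file))
    (lambda args
      (unless (= ENOENT (system-error-errno args))
        (apply throw args)))))
```

Either form could stand in for the 'false-if-exception' wrapper in the
cleanup thunk; unlike that wrapper, neither hides unrelated failures.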

This test, and some others, can be improved by also checking the URI.
While 'with-http-server' does not currently support that, there are
patches for it at <https://issues.guix.gnu.org/53389> (5 months old; the
v1 has seen some reviewing and a v2 is available).

That patch also _requires_ always mentioning the URI, if the cover 
letter is correct.  It also allows simplifying the use of '%local-url' a 
bit.

Greetings,
Maxime.

Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Sat, 24 Sep 2022 16:21:02 GMT) Full text and rfc822 format available.

Message #36 received at 57978 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: zimoun <zimon.toutoune <at> gmail.com>
Cc: 58017 <at> debbugs.gnu.org, 57978 <at> debbugs.gnu.org,
 Attila Lendvai <attila <at> lendvai.name>
Subject: Re: bug#57978: [PATCH 2/2] substitute: Retry downloading when a nar
 is unavailable.
Date: Sat, 24 Sep 2022 18:20:08 +0200
Hi!

zimoun <zimon.toutoune <at> gmail.com> skribis:

>> +  (with-narinfo*
>> +      (string-append %narinfo "Signature: "
>> +                     (signature-field
>> +                      %narinfo
>> +                      #:public-key %wrong-public-key))
>> +      %main-substitute-directory
>> +
>> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
>> +                                             (signature-field
>> +                                              %narinfo
>> +                                              #:public-key %wrong-public-key)))
>> +                        (404 "Sorry, nar is missing!"))
>> +      (let ((url1 (%local-url)))
>> +        (parameterize ((%http-server-port 0))
>> +          (with-http-server `((200 ,(string-append %narinfo "Signature: "
>> +                                                   (signature-field %narinfo)))
>> +                              (404 "Sorry, nar is missing!"))
>> +            (let ((url2 (%local-url)))
>> +              (dynamic-wind
>> +                (const #t)
>> +                (lambda ()
>> +                  (parameterize ((substitute-urls
>> +                                  (list url1 url2
>> +                                        (string-append "file://"
>> +                                                       %main-substitute-directory))))

[...]

> Although I do not understand this test.  Why is 404 appearing twice?

That’s because it’s testing with 3 substitute URLs.

Thanks for taking a look!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Sat, 24 Sep 2022 16:24:03 GMT) Full text and rfc822 format available.

Message #39 received at 57978 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Maxime Devos <maximedevos <at> telenet.be>
Cc: 58017 <at> debbugs.gnu.org, 57978 <at> debbugs.gnu.org,
 Attila Lendvai <attila <at> lendvai.name>, zimoun <zimon.toutoune <at> gmail.com>
Subject: Re: [bug#58017] [PATCH 2/2] substitute: Retry downloading when a
 nar is unavailable.
Date: Sat, 24 Sep 2022 18:22:59 +0200
Hi Maxime,

Maxime Devos <maximedevos <at> telenet.be> skribis:

>> +(test-equal "substitute, first URL has narinfo but nar is 404, both URLs authorized"
>> +  "Substitutable data."
>> +  (with-narinfo*
>> +      (string-append %narinfo "Signature: "
>> +                     (signature-field %narinfo))
>> +      %main-substitute-directory
>> +
>> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
>> +                                             (signature-field %narinfo)))
>> +                        (404 "Sorry, nar is missing!"))
>> +      (dynamic-wind
>> +        (const #t)
>> +        (lambda ()
>> +          (parameterize ((substitute-urls
>> +                          (list (%local-url)
>> +                                (string-append "file://"
>> +                                               %main-substitute-directory))))
>> +            (request-substitution (string-append (%store-prefix)
>> +                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
>> +                                  "substitute-retrieved"))
>> +          (call-with-input-file "substitute-retrieved" get-string-all))
>> +        (lambda ()
>> +          (false-if-exception (delete-file "substitute-retrieved")))))))
>
> Shouldn't it only ignore 'file not found' (ENOENT?) exceptions?

By “it”, do you mean ‘dynamic-wind’ should be replaced by a ‘catch’
form?

We could discuss it, but note that this patch just keeps with the style
of existing tests.

> This test, and some others, can be improved by also checking the
> URI. While currently 'with-http-server' does not support that, there
> are (5 months, with the v1 having seen some reviewing and a v2
> available) patches for that at <https://issues.guix.gnu.org/53389>.
>
> That patch also _requires_ always mentioning the URI, if the cover
> letter is correct.  It also allows simplifying the use of '%local-url'
> a bit.

Ah, thanks for the reminder!  I’ve just spent most of the day reviewing
patches, but not that one…

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#57978; Package guix. (Sat, 24 Sep 2022 17:19:02 GMT) Full text and rfc822 format available.

Message #42 received at 57978 <at> debbugs.gnu.org (full text, mbox):

From: Maxime Devos <maximedevos <at> telenet.be>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 58017 <at> debbugs.gnu.org, 57978 <at> debbugs.gnu.org,
 Attila Lendvai <attila <at> lendvai.name>, zimoun <zimon.toutoune <at> gmail.com>
Subject: Re: [bug#58017] [PATCH 2/2] substitute: Retry downloading when a nar
 is unavailable.
Date: Sat, 24 Sep 2022 19:18:15 +0200

On 24-09-2022 18:22, Ludovic Courtès wrote:
> Hi Maxime,
> 
> Maxime Devos <maximedevos <at> telenet.be> skribis:
> 
>>> +(test-equal "substitute, first URL has narinfo but nar is 404, both URLs authorized"
>>> +  "Substitutable data."
>>> +  (with-narinfo*
>>> +      (string-append %narinfo "Signature: "
>>> +                     (signature-field %narinfo))
>>> +      %main-substitute-directory
>>> +
>>> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
>>> +                                             (signature-field %narinfo)))
>>> +                        (404 "Sorry, nar is missing!"))
>>> +      (dynamic-wind
>>> +        (const #t)
>>> +        (lambda ()
>>> +          (parameterize ((substitute-urls
>>> +                          (list (%local-url)
>>> +                                (string-append "file://"
>>> +                                               %main-substitute-directory))))
>>> +            (request-substitution (string-append (%store-prefix)
>>> +                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
>>> +                                  "substitute-retrieved"))
>>> +          (call-with-input-file "substitute-retrieved" get-string-all))
>>> +        (lambda ()
>>> +          (false-if-exception (delete-file "substitute-retrieved")))))))
>>
>> Shouldn't it only ignore 'file not found' (ENOENT?) exceptions?
> 
> By “it”, do you mean ‘dynamic-wind’ should be replaced by a ‘catch’
> form?

No, I'm not referring to the dynamic-wind as a whole, rather 'it' = the 
following code:

 (false-if-exception (delete-file "substitute-retrieved"))

-- the catch can stay, AFAIK.

> We could discuss it, but note that this patch just keeps with the style
> of existing tests.

For the reasons given, I don't think this style should be continued, 
though I suppose all of them can be done at once in a separate patch.

Greetings,
Maxime.

Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Wed, 28 Sep 2022 21:25:02 GMT) Full text and rfc822 format available.

Notification sent to Attila Lendvai <attila <at> lendvai.name>:
bug acknowledged by developer. (Wed, 28 Sep 2022 21:25:02 GMT) Full text and rfc822 format available.

Message #47 received at 57978-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: 58017-done <at> debbugs.gnu.org
Cc: Attila Lendvai <attila <at> lendvai.name>, 57978-done <at> debbugs.gnu.org,
 zimoun <zimon.toutoune <at> gmail.com>
Subject: Re: bug#58017: [PATCH 0/2] Retry nar downloads upon failure
Date: Wed, 28 Sep 2022 23:24:42 +0200
Hi,

Ludovic Courtès <ludo <at> gnu.org> skribis:

>   substitute: Split nar download.
>   substitute: Retry downloading when a nar is unavailable.

Pushed as 8bd4126917f59f4af9a4323c3d5699201862dca2.  The ‘guix’ package
has yet to be updated.

Thanks,
Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 27 Oct 2022 11:24:20 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 175 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.