GNU bug report logs - #45323
[PATCH] substitute: Reuse connections for '--query'.

Previous Next

Package: guix-patches;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Sat, 19 Dec 2020 14:51:01 UTC

Severity: normal

Tags: patch

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 45323 in the body.
You can then email your comments to 45323 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to guix-patches <at> gnu.org:
bug#45323; Package guix-patches. (Sat, 19 Dec 2020 14:51:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ludovic Courtès <ludo <at> gnu.org>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Sat, 19 Dec 2020 14:51:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: guix-patches <at> gnu.org
Cc: Ludovic Courtès <ludo <at> gnu.org>
Subject: [PATCH] substitute: Reuse connections for '--query'.
Date: Sat, 19 Dec 2020 15:49:52 +0100
This significantly speeds up things like substituting the closure of a
.drv.  This is a followup to 5ff521452b9ec2aae9ed8e4bb7bdc250a581f203.

* guix/scripts/substitute.scm (http-multiple-get): Add #:open-connection
and #:keep-alive? and honor them.
(open-connection-for-uri/maybe): Use 'open-connection-for-uri/cached'
instead of 'guix:open-connection-for-uri'.  Call 'http-multiple-get'
within 'call-with-cached-connection'.
(open-connection-for-uri/cached): Add #:timeout and #:verify-certificate?
and honor them.
(call-with-cached-connection): Add 'open-connection'  parameter and
honor it.
---
 guix/scripts/substitute.scm | 97 ++++++++++++++++++++++---------------
 1 file changed, 59 insertions(+), 38 deletions(-)

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 38702d0c4b..8084c89ae5 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -514,12 +514,18 @@ return its MAX-LENGTH first elements and its tail."
 
 (define* (http-multiple-get base-uri proc seed requests
                             #:key port (verify-certificate? #t)
+                            (open-connection guix:open-connection-for-uri)
+                            (keep-alive? #t)
                             (batch-size 1000))
   "Send all of REQUESTS to the server at BASE-URI.  Call PROC for each
 response, passing it the request object, the response, a port from which to
 read the response body, and the previous result, starting with SEED, à la
-'fold'.  Return the final result.  When PORT is specified, use it as the
-initial connection on which HTTP requests are sent."
+'fold'.  Return the final result.
+
+When PORT is specified, use it as the initial connection on which HTTP
+requests are sent; otherwise call OPEN-CONNECTION to open a new connection for
+a URI.  When KEEP-ALIVE? is false, close the connection port before
+returning."
   (let connect ((port     port)
                 (requests requests)
                 (result   seed))
@@ -528,10 +534,9 @@ initial connection on which HTTP requests are sent."
 
     ;; (format (current-error-port) "connecting (~a requests left)..."
     ;;         (length requests))
-    (let ((p (or port (guix:open-connection-for-uri
-                       base-uri
-                       #:verify-certificate?
-                       verify-certificate?))))
+    (let ((p (or port (open-connection base-uri
+                                       #:verify-certificate?
+                                       verify-certificate?))))
       ;; For HTTPS, P is not a file port and does not support 'setvbuf'.
       (when (file-port? p)
         (setvbuf p 'block (expt 2 16)))
@@ -556,7 +561,8 @@ initial connection on which HTTP requests are sent."
           (()
            (match (drop requests processed)
              (()
-              (close-port p)
+              (unless keep-alive?
+                (close-port p))
               (reverse result))
              (remainder
               (connect p remainder result))))
@@ -598,18 +604,18 @@ if file doesn't exist, and the narinfo otherwise."
 
 (define* (open-connection-for-uri/maybe uri
                                         #:key
-                                        (verify-certificate? #f)
+                                        fresh?
                                         (time %fetch-timeout))
-  "Open a connection to URI and return a port to it, or, if connection failed,
-print a warning and return #f."
+  "Open a connection to URI via 'open-connection-for-uri/cached' and return a
+port to it, or, if connection failed, print a warning and return #f.  Pass
+#:fresh? to 'open-connection-for-uri/cached'."
   (define host
     (uri-host uri))
 
   (catch #t
     (lambda ()
-      (guix:open-connection-for-uri uri
-                                    #:verify-certificate? verify-certificate?
-                                    #:timeout time))
+      (open-connection-for-uri/cached uri #:timeout time
+                                      #:fresh? fresh?))
     (match-lambda*
       (('getaddrinfo-error error)
        (unless (hash-ref %unreachable-hosts host)
@@ -683,23 +689,26 @@ print a warning and return #f."
   (define (do-fetch uri)
     (case (and=> uri uri-scheme)
       ((http https)
-       (let ((requests (map (cut narinfo-request url <>) paths)))
-         (match (open-connection-for-uri/maybe uri)
-           (#f
-            '())
-           (port
-            (update-progress!)
-            ;; Note: Do not check HTTPS server certificates to avoid depending
-            ;; on the X.509 PKI.  We can do it because we authenticate
-            ;; narinfos, which provides a much stronger guarantee.
-            (let ((result (http-multiple-get uri
-                                             handle-narinfo-response '()
-                                             requests
-                                             #:verify-certificate? #f
-                                             #:port port)))
-              (close-port port)
-              (newline (current-error-port))
-              result)))))
+       ;; Note: Do not check HTTPS server certificates to avoid depending
+       ;; on the X.509 PKI.  We can do it because we authenticate
+       ;; narinfos, which provides a much stronger guarantee.
+       (let* ((requests (map (cut narinfo-request url <>) paths))
+              (result   (call-with-cached-connection uri
+                          (lambda (port)
+                            (if port
+                                (begin
+                                  (update-progress!)
+                                  (http-multiple-get uri
+                                                     handle-narinfo-response '()
+                                                     requests
+                                                     #:open-connection
+                                                     open-connection-for-uri/cached
+                                                     #:verify-certificate? #f
+                                                     #:port port))
+                                '()))
+                          open-connection-for-uri/maybe)))
+         (newline (current-error-port))
+         result))
       ((file #f)
        (let* ((base  (string-append (uri-path uri) "/"))
               (files (map (compose (cut string-append base <> ".narinfo")
@@ -990,10 +999,14 @@ the URI, its compression method (a string), and the compressed file size."
 
 (define open-connection-for-uri/cached
   (let ((cache '()))
-    (lambda* (uri #:key fresh?)
+    (lambda* (uri #:key fresh? timeout verify-certificate?)
       "Return a connection for URI, possibly reusing a cached connection.
-When FRESH? is true, delete any cached connections for URI and open a new
-one.  Return #f if URI's scheme is 'file' or #f."
+When FRESH? is true, delete any cached connections for URI and open a new one.
+Return #f if URI's scheme is 'file' or #f.
+
+When true, TIMEOUT is the maximum number of milliseconds to wait for
+connection establishment.  When VERIFY-CERTIFICATE? is true, verify HTTPS
+server certificates."
       (define host (uri-host uri))
       (define scheme (uri-scheme uri))
       (define key (list host scheme (uri-port uri)))
@@ -1005,7 +1018,9 @@ one.  Return #f if URI's scheme is 'file' or #f."
               ;; CACHE, if any.
               (let-values (((socket)
                             (guix:open-connection-for-uri
-                             uri #:verify-certificate? #f))
+                             uri
+                             #:verify-certificate? verify-certificate?
+                             #:timeout timeout))
                            ((new-cache evicted)
                             (at-most (- %max-cached-connections 1) cache)))
                 (for-each (match-lambda
@@ -1019,14 +1034,19 @@ one.  Return #f if URI's scheme is 'file' or #f."
                   (begin
                     (false-if-exception (close-port socket))
                     (set! cache (alist-delete key cache))
-                    (open-connection-for-uri/cached uri))
+                    (open-connection-for-uri/cached uri #:timeout timeout
+                                                    #:verify-certificate?
+                                                    verify-certificate?))
                   (begin
                     ;; Drain input left from the previous use.
                     (drain-input socket)
                     socket))))))))
 
-(define (call-with-cached-connection uri proc)
-  (let ((port (open-connection-for-uri/cached uri)))
+(define* (call-with-cached-connection uri proc
+                                      #:optional
+                                      (open-connection
+                                       open-connection-for-uri/cached))
+  (let ((port (open-connection uri)))
     (catch #t
       (lambda ()
         (proc port))
@@ -1038,7 +1058,7 @@ one.  Return #f if URI's scheme is 'file' or #f."
         (if (or (and (eq? key 'system-error)
                      (= EPIPE (system-error-errno `(,key ,@args))))
                 (memq key '(bad-response bad-header bad-header-component)))
-            (proc (open-connection-for-uri/cached uri #:fresh? #t))
+            (proc (open-connection uri #:fresh? #t))
             (apply throw key args))))))
 
 (define-syntax-rule (with-cached-connection uri port exp ...)
@@ -1341,6 +1361,7 @@ default value."
 ;;; Local Variables:
 ;;; eval: (put 'with-timeout 'scheme-indent-function 1)
 ;;; eval: (put 'with-cached-connection 'scheme-indent-function 2)
+;;; eval: (put 'call-with-cached-connection 'scheme-indent-function 1)
 ;;; End:
 
 ;;; substitute.scm ends here
-- 
2.29.2





Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Wed, 23 Dec 2020 15:07:02 GMT) Full text and rfc822 format available.

Notification sent to Ludovic Courtès <ludo <at> gnu.org>:
bug acknowledged by developer. (Wed, 23 Dec 2020 15:07:02 GMT) Full text and rfc822 format available.

Message #10 received at 45323-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: 45323-done <at> debbugs.gnu.org
Subject: Re: [bug#45323] [PATCH] substitute: Reuse connections for '--query'.
Date: Wed, 23 Dec 2020 16:06:14 +0100
Ludovic Courtès <ludo <at> gnu.org> skribis:

> This significantly speeds up things like substituting the closure of a
> .drv.  This is a followup to 5ff521452b9ec2aae9ed8e4bb7bdc250a581f203.
>
> * guix/scripts/substitute.scm (http-multiple-get): Add #:open-connection
> and #:keep-alive? and honor them.
> (open-connection-for-uri/maybe): Use 'open-connection-for-uri/cached'
> instead of 'guix:open-connection-for-uri'.  Call 'http-multiple-get'
> within 'call-with-cached-connection'.
> (open-connection-for-uri/cached): Add #:timeout and #:verify-certificate?
> and honor them.
> (call-with-cached-connection): Add 'open-connection'  parameter and
> honor it.
> ---
>  guix/scripts/substitute.scm | 97 ++++++++++++++++++++++---------------
>  1 file changed, 59 insertions(+), 38 deletions(-)

Pushed as be5a75ebb5988b87b2392e2113f6590f353dd6cd!

You can check the effect by running ‘guix build XYZ.drv’, where XYZ.drv
is not available locally yet.

Ludo’.




Information forwarded to guix-patches <at> gnu.org:
bug#45323; Package guix-patches. (Thu, 24 Dec 2020 11:07:02 GMT) Full text and rfc822 format available.

Message #13 received at 45323 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 45323 <at> debbugs.gnu.org
Subject: Re: bug#45323: [PATCH] substitute: Reuse connections for '--query'.
Date: Thu, 24 Dec 2020 11:06:39 +0000
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

> Ludovic Courtès <ludo <at> gnu.org> skribis:
>
>> This significantly speeds up things like substituting the closure of a
>> .drv.  This is a followup to 5ff521452b9ec2aae9ed8e4bb7bdc250a581f203.
>>
>> * guix/scripts/substitute.scm (http-multiple-get): Add #:open-connection
>> and #:keep-alive? and honor them.
>> (open-connection-for-uri/maybe): Use 'open-connection-for-uri/cached'
>> instead of 'guix:open-connection-for-uri'.  Call 'http-multiple-get'
>> within 'call-with-cached-connection'.
>> (open-connection-for-uri/cached): Add #:timeout and #:verify-certificate?
>> and honor them.
>> (call-with-cached-connection): Add 'open-connection'  parameter and
>> honor it.
>> ---
>>  guix/scripts/substitute.scm | 97 ++++++++++++++++++++++---------------
>>  1 file changed, 59 insertions(+), 38 deletions(-)
>
> Pushed as be5a75ebb5988b87b2392e2113f6590f353dd6cd!
>
> You can check the effect by running ‘guix build XYZ.drv’, where XYZ.drv
> is not available locally yet.

Hey,

I did do some testing of this, and didn't spot any issues, but I think
it might be causing some issues when things go wrong.

The Guix Build Coordinator uses code from this script, and I'm sometimes
seeing exceptions like [1] when running with these changes. This is when
calling lookup-narinfos.

1:
#<&compound-exception components: (#<&error> #<&irritants irritants: (#<gnutls-error-enum The specified session has been invalidated for some
reason.> write_to_session_record_port)> #<&exception-with-kind-and-args kind: gnutls-error args: (#<gnutls-error-enum The specified session hasbeen invalidated for some reason.> write_to_session_record_port)>)>,

When this happens, things seem to get stuck and retrying calling
lookup-narinfos leads to the same exception. I'm guessing this might be
happening because the broken connection is being cached and reused.

Any ideas?

Thanks,

Chris
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#45323; Package guix-patches. (Sun, 27 Dec 2020 14:58:01 GMT) Full text and rfc822 format available.

Message #16 received at 45323 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 45323 <at> debbugs.gnu.org
Subject: Re: bug#45323: [PATCH] substitute: Reuse connections for '--query'.
Date: Sun, 27 Dec 2020 15:57:42 +0100
[Message part 1 (text/plain, inline)]
Hi!

Christopher Baines <mail <at> cbaines.net> skribis:

> The Guix Build Coordinator uses code from this script, and I'm sometimes
> seeing exceptions like [1] when running with these changes. This is when
> calling lookup-narinfos.
>
> 1:
> #<&compound-exception components: (#<&error> #<&irritants irritants: (#<gnutls-error-enum The specified session has been invalidated for some
> reason.> write_to_session_record_port)> #<&exception-with-kind-and-args kind: gnutls-error args: (#<gnutls-error-enum The specified session hasbeen invalidated for some reason.> write_to_session_record_port)>)>,
>
> When this happens, things seem to get stuck and retrying calling
> lookup-narinfos leads to the same exception. I'm guessing this might be
> happening because the broken connection is being cached and reused.

Ah, that looks like another thing that might break.  Does the patch
below help?

Thanks for reporting it,
Ludo’.

[Message part 2 (text/x-patch, inline)]
diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 8084c89ae5..e53de8c304 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -43,6 +43,7 @@
                           (open-connection-for-uri
                            . guix:open-connection-for-uri)
                           store-path-abbreviation byte-count->string))
+  #:autoload   (gnutls) (error/invalid-session)
   #:use-module (guix progress)
   #:use-module ((guix build syscalls)
                 #:select (set-thread-name))
@@ -1054,9 +1055,12 @@ server certificates."
         ;; If PORT was cached and the server closed the connection in the
         ;; meantime, we get EPIPE.  In that case, open a fresh connection and
         ;; retry.  We might also get 'bad-response or a similar exception from
-        ;; (web response) later on, once we've sent the request.
+        ;; (web response) later on, once we've sent the request, or a
+        ;; ERROR/INVALID-SESSION from GnuTLS.
         (if (or (and (eq? key 'system-error)
                      (= EPIPE (system-error-errno `(,key ,@args))))
+                (and (eq? key 'gnutls-error)
+                     (eq? (first args) error/invalid-session))
                 (memq key '(bad-response bad-header bad-header-component)))
             (proc (open-connection uri #:fresh? #t))
             (apply throw key args))))))

Information forwarded to guix-patches <at> gnu.org:
bug#45323; Package guix-patches. (Wed, 30 Dec 2020 22:57:01 GMT) Full text and rfc822 format available.

Message #19 received at 45323 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 45323 <at> debbugs.gnu.org
Subject: Re: bug#45323: [PATCH] substitute: Reuse connections for '--query'.
Date: Wed, 30 Dec 2020 22:55:56 +0000
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi!
>
> Christopher Baines <mail <at> cbaines.net> skribis:
>
>> The Guix Build Coordinator uses code from this script, and I'm sometimes
>> seeing exceptions like [1] when running with these changes. This is when
>> calling lookup-narinfos.
>>
>> 1:
>> #<&compound-exception components: (#<&error> #<&irritants irritants: (#<gnutls-error-enum The specified session has been invalidated for some
>> reason.> write_to_session_record_port)> #<&exception-with-kind-and-args kind: gnutls-error args: (#<gnutls-error-enum The specified session hasbeen invalidated for some reason.> write_to_session_record_port)>)>,
>>
>> When this happens, things seem to get stuck and retrying calling
>> lookup-narinfos leads to the same exception. I'm guessing this might be
>> happening because the broken connection is being cached and reused.
>
> Ah, that looks like another thing that might break.  Does the patch
> below help?

I've tried using it, and I haven't spotted any problems yet, so I
believe so.

Thanks,

Chris
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#45323; Package guix-patches. (Mon, 04 Jan 2021 10:56:01 GMT) Full text and rfc822 format available.

Message #22 received at 45323 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 45323 <at> debbugs.gnu.org
Subject: Re: bug#45323: [PATCH] substitute: Reuse connections for '--query'.
Date: Mon, 04 Jan 2021 11:55:40 +0100
Hi,

Christopher Baines <mail <at> cbaines.net> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:

[...]

>>> #<&compound-exception components: (#<&error> #<&irritants irritants: (#<gnutls-error-enum The specified session has been invalidated for some
>>> reason.> write_to_session_record_port)> #<&exception-with-kind-and-args kind: gnutls-error args: (#<gnutls-error-enum The specified session hasbeen invalidated for some reason.> write_to_session_record_port)>)>,
>>>
>>> When this happens, things seem to get stuck and retrying calling
>>> lookup-narinfos leads to the same exception. I'm guessing this might be
>>> happening because the broken connection is being cached and reused.
>>
>> Ah, that looks like another thing that might break.  Does the patch
>> below help?
>
> I've tried using it, and I haven't spotted any problems yet, so I
> believe so.

Pushed as 9158020d7853b6e7925802e0d0a082801c680e8f, thanks!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 01 Feb 2021 12:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 84 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.