GNU bug report logs - #48903
guix substitute: error: TLS error in procedure 'read_from_session_record_port': Error decoding the received TLS packet.

Previous Next

Package: guix;

Reported by: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Date: Mon, 7 Jun 2021 15:18:02 UTC

Severity: important

Merged with 49071

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 48903 in the body.
You can then email your comments to 48903 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Mon, 07 Jun 2021 15:18:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 07 Jun 2021 15:18:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: bug-guix <bug-guix <at> gnu.org>
Subject: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Mon, 07 Jun 2021 11:10:06 -0400
[Message part 1 (text/plain, inline)]
Hello,

Using Guix from the master branch at commit
b2122b07dc24007263b92247cc479713c2101390, with a system reconfigured on
the 2nd of June (Guix commit bb325c5611553a6db21ee7499ac07d5757d24fc3):

--8<---------------cut here---------------start------------->8---
Generation 216  Jun 02 2021 10:14:19    (current)
  file name: /var/guix/profiles/system-216-link
  canonical file name: /gnu/store/apjg70083nc5xj816y0ff3r8ir9gh5py-system
  label: GNU with Linux-Libre 5.11.20
  bootloader: grub
  root device: /dev/mapper/cryptroot
  kernel: /gnu/store/ghijd80qabdyf0p6jcich9ggnpwrbwxw-linux-libre-5.11.20/bzImage
  channels:
    sfl-packages:
      repository URL: https://gitlab.com/Apteryks/sfl-guix-channel
      branch: master
      commit: 37d017573350b64f8a8c992530153f42806b6a6f
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      branch: master
      commit: bb325c5611553a6db21ee7499ac07d5757d24fc3
  configuration file:
  /gnu/store/qvhl7ya2xn4gr9mn29hg93p1dcbdlyfy-configuration.scm
--8<---------------cut here---------------end--------------->8---
  
with the guix-daemon running being:

--8<---------------cut here---------------start------------->8---
/gnu/store/9zh3bg8d4y08jnkqyrk6xczahiahhcy4-guix-1.3.0-1.771b866/bin/guix-daemon
29920 guixbuild --max-silent-time 0 --timeout 0 --log-compression none
--discover=no --substitute-urls http://127.0.0.1:8080
https://ci.guix.gnu.org --max-jobs=4
--8<---------------cut here---------------end--------------->8---

Attempting to update my profile keeps failing with:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix package -m ~/stow/guix/manifest.scm -L ~/src/sfl-guix-channel/ --substitute-urls=https://ci.guix.gnu.org --no-offload
;;; note: source file /home/maxim/src/guix-master/gnu/packages/networking.scm
;;;       newer than compiled /home/maxim/src/guix-master/gnu/packages/networking.go
;;; note: source file /home/maxim/src/guix-master/gnu/packages/networking.scm
;;;       newer than compiled /run/current-system/profile/lib/guile/3.0/site-ccache/gnu/packages/networking.go
The following packages will be installed:
   acpi                      1.7
   adb                       7.1.2_r36
   adwaita-icon-theme        3.34.3
   alsa-utils                1.2.4
   [...]
   xrandr                    1.5.1
   xrdb                      1.2.0
   xsetroot                  1.1.2
   yelp                      3.32.2

122.8 MB will be downloaded
 libreoffice-6.4.7.2  117.1MiB                                  344KiB/s 03:04 [#########         ]  52.7%guix substitute: error: TLS error in procedure 'read_from_session_record_port': Error decoding the received TLS packet.
substitution of /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 failed
guix package: error: some substitutes for the outputs of derivation `/gnu/store/9f8sffldy39mprihx6xgrs7hys9j75jm-libreoffice-6.4.7.2.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source
--8<---------------cut here---------------end--------------->8---

I'm attaching my (large!) profile manifest.  It depends on the
https://gitlab.com/Apteryks/sfl-guix-channel channel, which just adds 3
Python packages.  You could comment out the "sflvault-client" package
from the manifest to lift that requirement.

Thanks,

Maxim

[manifest.scm (text/plain, attachment)]

Severity set to 'important' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 07 Jun 2021 15:39:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Tue, 08 Jun 2021 09:02:02 GMT) Full text and rfc822 format available.

Message #10 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Maxime Devos <maximedevos <at> telenet.be>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>, 48903 <at> debbugs.gnu.org
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Mon, 07 Jun 2021 19:46:01 +0200
[Message part 1 (text/plain, inline)]
> 122.8 MB will be downloaded
>  libreoffice-6.4.7.2  117.1MiB                                  344KiB/s 03:04 [#########         ]  52.7%guix substitute: error: TLS error in procedure 'read_from_session_record_port': Error decoding the received TLS packet.
> substitution of /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 failed
> guix package: error: some substitutes for the outputs of derivation `/gnu/store/9f8sffldy39mprihx6xgrs7hys9j75jm-libreoffice-6.4.7.2.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source
> --8<---------------cut here---------------end--------------->8---
> 

I often have the same problem when I do "guix package -u".
(Same error message, same package libreoffice, same derivation)
(Usually libreoffice, sometimes with other packages as well.)

I don't know the cause though.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Fri, 11 Jun 2021 15:10:01 GMT) Full text and rfc822 format available.

Message #13 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Maxime Devos <maximedevos <at> telenet.be>
Cc: 48903 <at> debbugs.gnu.org, Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Fri, 11 Jun 2021 17:09:36 +0200
Hi Maxim{,e}!

Maxime Devos <maximedevos <at> telenet.be> skribis:

>> 122.8 MB will be downloaded
>>  libreoffice-6.4.7.2  117.1MiB                                  344KiB/s 03:04 [#########         ]  52.7%guix substitute: error: TLS error in procedure 'read_from_session_record_port': Error decoding the received TLS packet.
>> substitution of /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 failed
>> guix package: error: some substitutes for the outputs of derivation `/gnu/store/9f8sffldy39mprihx6xgrs7hys9j75jm-libreoffice-6.4.7.2.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source
>> --8<---------------cut here---------------end--------------->8---
>> 
>
> I often have the same problem when I do "guix package -u".
> (Same error message, same package libreoffice, same derivation)
> (Usually libreoffice, sometimes with other packages as well.)

As a first step, can you reproduce the bug like this:

  while echo substitute /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 /tmp/t | guix substitute --substitute ; do chmod -R +w /tmp/t && rm -rf /tmp/t; done

?

FWIW, I can’t seem to reproduce it with:

--8<---------------cut here---------------start------------->8---
$ guix describe 
Generacio 185   Jun 07 2021 15:07:46    (nuna)
  guix e3611cc
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: e3611cc412e7b1c750a56d17fb1b7cde684baa3f
--8<---------------cut here---------------end--------------->8---

TIA,
Ludo’.




Merged 48903 49071. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Fri, 18 Jun 2021 07:49:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Tue, 29 Jun 2021 13:24:03 GMT) Full text and rfc822 format available.

Message #18 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 48903 <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Tue, 29 Jun 2021 09:23:18 -0400
Hello Ludovic,

Ludovic Courtès <ludo <at> gnu.org> writes:

[...]

> As a first step, can you reproduce the bug like this:
>
>   while echo substitute
> /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 /tmp/t
> | guix substitute --substitute ; do chmod -R +w /tmp/t && rm -rf
> /tmp/t; done
>
> ?
>
> FWIW, I can’t seem to reproduce it with:
>
> $ guix describe 
> Generacio 185   Jun 07 2021 15:07:46    (nuna)
>   guix e3611cc
>     repository URL: https://git.savannah.gnu.org/git/guix.git
>     branch: master
>     commit: e3611cc412e7b1c750a56d17fb1b7cde684baa3f

I can't seem to reproduce either.  Perhaps the issue only arises when
there are many things happening concurrently.  My daemon runs with:

--8<---------------cut here---------------start------------->8---
$ sudo ps -eF | grep guix-daemon
root     25193   216  0  3074  1524   3 Jun28 ?        00:00:00 /gnu/store/vphx2839xv0qj9xwcwrb95592lzrrnx7-guix-1.3.0-3.50dfbbf/bin/guix-daemon 25178               guixbuild --max-silent-time 0 --timeout 0 --log-compression none --discover=no --substitute-urls http://127.0.0.1:8080 https://ci.guix.gnu.org --max-jobs=4--8<---------------cut here---------------end--------------->8---

I can rather easily (and annoyingly!) trigger the problem (and a few
variations of it, it seems) with something like:

$ packages=$(guix refresh -l protobuf | sed 's/^.*: //')
$ guix build -v3 --keep-going $packages

For example, running the above, I just got:

--8<---------------cut here---------------start------------->8---
guix build: error: corrupt input while restoring archive from #<closed:
file 7fc95acfc2a0>
--8<---------------cut here---------------end--------------->8---

Does the above commands succeed on the first time on your end?  If you
have already lots of things cached, you can try for an architecture you
don't often build for by adding the '--system=i686-linux' option; that
should cause a massive amount of downloads, likely to trigger the
problem.  Perhaps also try to use --max-jobs=4.

If you have ideas of how to debug this when I hit the issue I'm all ears
:-).

Thank you!

Maxim




Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Tue, 29 Jun 2021 21:19:01 GMT) Full text and rfc822 format available.

Message #21 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Cc: 48903 <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Tue, 29 Jun 2021 23:18:39 +0200
[Message part 1 (text/plain, inline)]
Hi,

Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:

> $ sudo ps -eF | grep guix-daemon
> root     25193   216  0  3074  1524   3 Jun28 ?        00:00:00 /gnu/store/vphx2839xv0qj9xwcwrb95592lzrrnx7-guix-1.3.0-3.50dfbbf/bin/guix-daemon 25178               guixbuild --max-silent-time 0 --timeout 0 --log-compression none --discover=no --substitute-urls http://127.0.0.1:8080 https://ci.guix.gnu.org --max-jobs=4--8<---------------cut here---------------end--------------->8---
>
> I can rather easily (and annoyingly!) trigger the problem (and a few
> variations of it, it seems) with something like:
>
> $ packages=$(guix refresh -l protobuf | sed 's/^.*: //')
> $ guix build -v3 --keep-going $packages
>
> For example, running the above, I just got:
>
> guix build: error: corrupt input while restoring archive from #<closed:
> file 7fc95acfc2a0>
> --8<---------------cut here---------------end--------------->8---
>
> Does the above commands succeed on the first time on your end?  If you
> have already lots of things cached, you can try for an architecture you
> don't often build for by adding the '--system=i686-linux' option; that
> should cause a massive amount of downloads, likely to trigger the
> problem.  Perhaps also try to use --max-jobs=4.

I’ve tried that, with --max-jobs=4, and it fills my disk just fine.  :-/

> If you have ideas of how to debug this when I hit the issue I'm all ears
> :-).

The attached patch substitutes a number of store items in a row; run:

  guix repl -- substitute-stress.scm

and it’ll fill /tmp/substitute-test with 200 substitutes, which should
be equivalent to the kind of stress test you had above.

It doesn’t crash for me.  There are a few “error: no valid substitute
for /gnu/store/…” errors, but these are expected: was ask for
substitutes for 200 packages without first checking whether substitutes
are available.

Could you run it and report back?

You can try with more packages, different substitute URLs, etc.

TIA!

Ludo’.

[substitute-stress.scm (text/plain, inline)]
(use-modules (guix) (gnu packages)
             (guix scripts substitute)
             (guix grafts)
             (guix build utils)
             (srfi srfi-1)
             (ice-9 match)
             (ice-9 threads))

(define test-directory
  "/tmp/substitute-test")

(define packages
  ;; Subset of packages for which we request substitutes.
  (take (fold-packages cons '()) 200))

(define (spawn-substitution-thread input urls)
  "Spawn a 'guix substitute' thread that reads commands from INPUT and uses
URLS as the substitute servers."
  (call-with-new-thread
   (lambda ()
     (parameterize ((%reply-file-descriptor #f)
                    (current-input-port input))
       (setenv "_NIX_OPTIONS"
               (string-append "substitute-urls=" (string-join urls)))
       (let loop ()
         (format (current-error-port) "starting substituter~%")
         ;; Catch "no valid substitute" errors.
         (catch 'quit
           (lambda ()
             (guix-substitute "--substitute"))
           (const #f))
         (unless (eof-object? (peek-char input))
           (loop)))))))

(match (pipe)
  ((input . output)
   (let ((thread (spawn-substitution-thread input
                                            %default-substitute-urls)))
     ;; Remove the test directory.
     (when (file-exists? test-directory)
       (for-each make-file-writable
                 (find-files test-directory #:directories? #t))
       (delete-file-recursively test-directory))
     (mkdir-p test-directory)

     (parameterize ((%graft? #false))
       (with-store store
         ;; Ask for substitutes for PACKAGES.
         (for-each (lambda (package n)
                     (define item
                       (run-with-store store
                         (package-file package)))

                     (format output "substitute ~a ~a/~a~%"
                             item test-directory n))
                   packages
                   (iota (length packages))))
       (format #t "sent ~a substitution requests...~%"
               (length packages))
       (close-port output)

       ;; Wait for substitution to complete.
       (join-thread thread)))))

Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Wed, 30 Jun 2021 16:27:01 GMT) Full text and rfc822 format available.

Message #24 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 48903 <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Wed, 30 Jun 2021 12:26:32 -0400
Hello!

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi,
>
> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>
>> $ sudo ps -eF | grep guix-daemon
>> root 25193 216 0 3074 1524 3 Jun28 ?  00:00:00
>> /gnu/store/vphx2839xv0qj9xwcwrb95592lzrrnx7-guix-1.3.0-3.50dfbbf/bin/guix-daemon
>> 25178 guixbuild --max-silent-time 0 --timeout 0 --log-compression
>> none --discover=no --substitute-urls http://127.0.0.1:8080
>> https://ci.guix.gnu.org --max-jobs=4--8<---------------cut
>> here---------------end--------------->8---
>>
>> I can rather easily (and annoyingly!) trigger the problem (and a few
>> variations of it, it seems) with something like:
>>
>> $ packages=$(guix refresh -l protobuf | sed 's/^.*: //')
>> $ guix build -v3 --keep-going $packages
>>
>> For example, running the above, I just got:
>>
>> guix build: error: corrupt input while restoring archive from #<closed:
>> file 7fc95acfc2a0>
>> --8<---------------cut here---------------end--------------->8---
>>
>> Does the above commands succeed on the first time on your end?  If you
>> have already lots of things cached, you can try for an architecture you
>> don't often build for by adding the '--system=i686-linux' option; that
>> should cause a massive amount of downloads, likely to trigger the
>> problem.  Perhaps also try to use --max-jobs=4.
>
> I’ve tried that, with --max-jobs=4, and it fills my disk just fine.  :-/
>
>> If you have ideas of how to debug this when I hit the issue I'm all ears
>> :-).
>
> The attached patch substitutes a number of store items in a row; run:
>
>   guix repl -- substitute-stress.scm
>
> and it’ll fill /tmp/substitute-test with 200 substitutes, which should
> be equivalent to the kind of stress test you had above.
>
> It doesn’t crash for me.  There are a few “error: no valid substitute
> for /gnu/store/…” errors, but these are expected: was ask for
> substitutes for 200 packages without first checking whether substitutes
> are available.
>
> Could you run it and report back?
>
> You can try with more packages, different substitute URLs, etc.
>
> TIA!
>
> Ludo’.

[...]

I've tried with the following modified version which runs multiple
threads in parallel (to mimic --max-jobs=4 on the daemon), and I've yet
to trigger it, although the hard drive is grinding heavily:

--8<---------------cut here---------------start------------->8---
(use-modules (guix) (gnu packages)
             (guix scripts substitute)
             (guix grafts)
             (guix build utils)
             (srfi srfi-1)
             (ice-9 match)
             (ice-9 threads))

(define test-directory "/tmp/substitute-test")

(define max-jobs 4)

(define packages
  ;; Subset of packages for which we request substitutes.
  (append (map specification->package '("libreoffice"
                                        "ungoogled-chromium"
                                        "openjdk"
                                        "texmacs"))
          (take (fold-packages cons '()) 1000)))

(define (spawn-substitution-thread input urls)
  "Spawn a 'guix substitute' thread that reads commands from INPUT and uses
URLS as the substitute servers."
  (call-with-new-thread
   (lambda ()
     (parameterize ((%reply-file-descriptor #f)
                    (current-input-port input))
       (setenv "_NIX_OPTIONS"
               (string-append "substitute-urls=" (string-join urls)))
       (let loop ()
         (format (current-error-port) "starting substituter~%")
         ;; Catch "no valid substitute" errors.
         (catch 'quit
           (lambda ()
             (guix-substitute "--substitute"))
           (const #f))
         (unless (eof-object? (peek-char input))
           (loop)))))))

(for-each (lambda (job)
            (match (pipe)
              ((input . output)
               (let ((test-directory* (string-append test-directory "-"
                                                     (number->string job)))
                     (thread (spawn-substitution-thread
                              input %default-substitute-urls)))
                 ;; Remove the test directory.
                 (when (file-exists? test-directory*)
                   (for-each (lambda (f)
                               (false-if-exception (make-file-writable f)))
                             (find-files test-directory #:directories? #t))
                   (delete-file-recursively test-directory*))
                 (mkdir-p test-directory*)

                 (parameterize ((%graft? #false))
                   (with-store store
                     ;; Ask for substitutes for PACKAGES.
                     (for-each (lambda (package n)
                                 (define item
                                   (run-with-store store
                                     (package-file package)))

                                 (format output "substitute ~a ~a/~a~%"
                                         item test-directory* n))
                               packages
                               (iota (length packages))))
                   (format #t "sent ~a substitution requests...~%"
                           (length packages))
                   (close-port output)

                   ;; Wait for substitution to complete.
                   (join-thread thread))))))
          (iota max-jobs))
--8<---------------cut here---------------end--------------->8---

I wonder if there's something more happening in the real scenario
(validating signatures when putting things in the store? or something
similar) that may have a role in the failure.

That's a tough nut to crack!

I'll keep looking for clues.

Thanks for your time!

Maxim




Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Thu, 01 Jul 2021 13:13:02 GMT) Full text and rfc822 format available.

Message #27 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Cc: 48903 <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Thu, 01 Jul 2021 15:12:11 +0200
Hi,

Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:

> I've tried with the following modified version which runs multiple
> threads in parallel (to mimic --max-jobs=4 on the daemon), and I've yet
> to trigger it, although the hard drive is grinding heavily:

Note that ‘--max-jobs=4’ leads guix-daemon to spawn 4 ‘guix substitute’
processes, which is not what the script is doing here.

Are the other conditions the same, for instance same network, etc.?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Fri, 02 Jul 2021 14:27:02 GMT) Full text and rfc822 format available.

Message #30 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 48903 <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Fri, 02 Jul 2021 10:26:48 -0400
Hello,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi,
>
> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>
>> I've tried with the following modified version which runs multiple
>> threads in parallel (to mimic --max-jobs=4 on the daemon), and I've yet
>> to trigger it, although the hard drive is grinding heavily:
>
> Note that ‘--max-jobs=4’ leads guix-daemon to spawn 4 ‘guix substitute’
> processes, which is not what the script is doing here.

Oh!  I had overlooked that.  What the modified version did is create
threads rather than processes, right?

> Are the other conditions the same, for instance same network, etc.?

Yes!

Maxim




Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Mon, 05 Jul 2021 16:13:02 GMT) Full text and rfc822 format available.

Message #33 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Cc: 48903 <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Mon, 05 Jul 2021 18:12:30 +0200
Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Hi,
>>
>> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>>
>>> I've tried with the following modified version which runs multiple
>>> threads in parallel (to mimic --max-jobs=4 on the daemon), and I've yet
>>> to trigger it, although the hard drive is grinding heavily:
>>
>> Note that ‘--max-jobs=4’ leads guix-daemon to spawn 4 ‘guix substitute’
>> processes, which is not what the script is doing here.
>
> Oh!  I had overlooked that.  What the modified version did is create
> threads rather than processes, right?

Yes.

So I’m not sure how to better test this.  Perhaps you could try
introducing random delays in the loop (which could cause connections to
go stale), using different substitute URLs, things like that.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Thu, 08 Jul 2021 14:00:02 GMT) Full text and rfc822 format available.

Message #36 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 48903 <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Thu, 08 Jul 2021 09:59:22 -0400
[Message part 1 (text/plain, inline)]
Hi!

Ludovic Courtès <ludo <at> gnu.org> writes:

> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>
>> Ludovic Courtès <ludo <at> gnu.org> writes:
>>
>>> Hi,
>>>
>>> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>>>
>>>> I've tried with the following modified version which runs multiple
>>>> threads in parallel (to mimic --max-jobs=4 on the daemon), and I've yet
>>>> to trigger it, although the hard drive is grinding heavily:
>>>
>>> Note that ‘--max-jobs=4’ leads guix-daemon to spawn 4 ‘guix substitute’
>>> processes, which is not what the script is doing here.
>>
>> Oh!  I had overlooked that.  What the modified version did is create
>> threads rather than processes, right?
>
> Yes.
>
> So I’m not sure how to better test this.  Perhaps you could try
> introducing random delays in the loop (which could cause connections to
> go stale), using different substitute URLs, things like that.

I've tried some to reproduce the issue with the modified scripts below,
but in vain.  I'm not sure if my delay is inserted at the right place.
I also suspect that my attempt to shuffle the substitute-urls is not
really useful, as that's probably what would have happened anyway
(although I haven't followed in the code deeply enough to confirm).

[substitute-stress-launcher.sh (application/x-sh, inline)]
[substitute-stress.scm (application/octet-stream, inline)]
[Message part 4 (text/plain, inline)]
Thanks,

Maxim

Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Sat, 10 Jul 2021 10:21:01 GMT) Full text and rfc822 format available.

Message #39 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Cc: 48903 <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Sat, 10 Jul 2021 12:20:25 +0200
Hi,

Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:

> I've tried some to reproduce the issue with the modified scripts below,
> but in vain.  I'm not sure if my delay is inserted at the right place.
> I also suspect that my attempt to shuffle the substitute-urls is not
> really useful, as that's probably what would have happened anyway
> (although I haven't followed in the code deeply enough to confirm).

Bah.  :-/  Do the two of you still experience the bug initially reported
here in “real” conditions?

Are we sure we’re using the same Guix + Guile when running the stress
test and in real conditions?

Thanks for testing!

Ludo’.




Reply sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>:
You have taken responsibility. (Sat, 07 Aug 2021 04:24:02 GMT) Full text and rfc822 format available.

Notification sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>:
bug acknowledged by developer. (Sat, 07 Aug 2021 04:24:02 GMT) Full text and rfc822 format available.

Message #44 received at 48903-done <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 48903-done <at> debbugs.gnu.org, Maxime Devos <maximedevos <at> telenet.be>
Subject: Re: bug#48903: guix substitute: error: TLS error in procedure
 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Sat, 07 Aug 2021 00:23:07 -0400
Hello,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi,
>
> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>
>> I've tried some to reproduce the issue with the modified scripts below,
>> but in vain.  I'm not sure if my delay is inserted at the right place.
>> I also suspect that my attempt to shuffle the substitute-urls is not
>> really useful, as that's probably what would have happened anyway
>> (although I haven't followed in the code deeply enough to confirm).
>
> Bah.  :-/  Do the two of you still experience the bug initially reported
> here in “real” conditions?
>
> Are we sure we’re using the same Guix + Guile when running the stress
> test and in real conditions?
>
> Thanks for testing!
>
> Ludo’.

I've been doing builds on core-updates, which would previously trigger
this problem rather often, *without* suffering from the problem.

I consider it resolved.  I'm not exactly sure how, which is not
satisfying, but I'm glad it's gone.

Thank you!

Closing.

Maxim




Reply sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>:
You have taken responsibility. (Sat, 07 Aug 2021 04:24:02 GMT) Full text and rfc822 format available.

Notification sent to Maxime Devos <maximedevos <at> telenet.be>:
bug acknowledged by developer. (Sat, 07 Aug 2021 04:24:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 04 Sep 2021 11:24:04 GMT) Full text and rfc822 format available.

bug unarchived. Request was from Thiago Jung Bauermann <bauermann <at> kolabnow.com> to control <at> debbugs.gnu.org. (Thu, 13 Jan 2022 20:23:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#48903; Package guix. (Thu, 13 Jan 2022 20:48:02 GMT) Full text and rfc822 format available.

Message #56 received at 48903 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Jung Bauermann <bauermann <at> kolabnow.com>
To: 48903 <at> debbugs.gnu.org
Subject: Re: bug#42902: texlive substitute TLS error: decoding the received
 packet
Date: Thu, 13 Jan 2022 17:47:33 -0300
Hello,

[ Sending again to 48903 (and excluding everyone else from the recipients)
  because this bug was archived and thus the original message bounced. ]

I’m sending this email to two bugs because they are both about the same
problem. Bug 48903 (which is currently closed) has fairly intense attempts
at debugging what is going on, but unfortunately without arriving at an
answer.

This problem is also affecting Cuirass builds. This powerpc64le-linux build:

https://ci.guix.gnu.org/build/70/details

failed because of the intermitent TLS error:

--8<---------------cut here---------------start------------->8---
guix substitute: error: TLS error in procedure 'read_from_session_record_port': Error decoding the received TLS packet.
fetching path `/gnu/store/fkkfffvwrj103zs5cf6d8bf9as46ywhc-python-minimal-3.5.9' (empty status: '')
@ substituter-failed /gnu/store/fkkfffvwrj103zs5cf6d8bf9as46ywhc-python-minimal-3.5.9  fetching path `/gnu/store/fkkfffvwrj103zs5cf6d8bf9as46ywhc-python-minimal-3.5.9' (empty status: '')
--8<---------------cut here---------------end--------------->8---

The daemon in this case is:

/gnu/store/ny30pxjzv866m3w0v1vfbzdbqi17k8wn-guix-daemon-1.3.0-21.e427593/bin/guix-daemon

From this Guix version

--8<---------------cut here---------------start------------->8---
guixp9: sudo -i guix describe
Generation 11   Jan 11 2022 02:07:42    (current)
  guix 83abdc8
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: 83abdc8371d90b6d4591a69fae5585a2a99c1627
--8<---------------cut here---------------end--------------->8---

Not sure if this is related, but during that build Guix noticed that the substitute server is slow:

--8<---------------cut here---------------start------------->8---
Downloading https://ci.guix.gnu.org/nar/lzip/fkkfffvwrj103zs5cf6d8bf9as46ywhc-python-minimal-3.5.9...
guix substitute: warning: while fetching https://ci.guix.gnu.org/nar/lzip/fkkfffvwrj103zs5cf6d8bf9as46ywhc-python-minimal-3.5.9: server is somewhat slow
guix substitute: warning: try `--no-substitutes' if the problem persists
--8<---------------cut here---------------end--------------->8---

There’s bug 46942 which is specifically about ci.guix.gnu.org being slow,
and the bug reporter there also hits this same TLS error, so there’s
probably at least some correlation between this problem and the network
being slow.

-- 
Thanks,
Thiago







bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 11 Feb 2022 12:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 75 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.