GNU bug report logs - #61646
Bandwidth-induced offload timeout aborts whole operation
Report forwarded to bug-guix <at> gnu.org: bug#61646; Package guix. (Mon, 20 Feb 2023 03:29:02 GMT)
Acknowledgement sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 20 Feb 2023 03:29:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hi Guix,
I can reproduce this rather easily on my system:
--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build icedove
The following derivations will be built:
/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv
/gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv
/gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv
process 19542 acquired build slot '/var/guix/offload/localhost:6666/0'
normalized load on machine 'localhost' is 0.08
building /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv...
process 19548 acquired build slot '/var/guix/offload/localhost:6666/1'
normalized load on machine 'localhost' is 0.08
building /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv...
guix offload: sending 1 store item (558 MiB) to 'localhost'...
exporting path `/gnu/store/bwb5hcdyzgq16kmbsva7ax0zq6lzg78z-icedove-102.7.2.tar.xz'
guix offload: error: failed to connect to 'localhost': Timeout connecting to localhost
cannot build derivation `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv': 1 dependencies couldn't be built
guix build: error: build of
`/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv' failed
--8<---------------cut here---------------end--------------->8---
The third derivation tries to get a build slot and times out, because
the first two have already saturated the bandwidth of the link and it
takes more time than expected to get a reply.
The workaround is to use '-k' ("--keep-going") and retry the third,
failing derivation after the first two have completed.
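The workaround above can be scripted roughly as follows. This is a minimal sketch, not something shipped with Guix: `build_with_retry` is a hypothetical helper name, and the only Guix-specific piece is `guix build --keep-going` (short form `-k`), shown in the commented-out usage line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Guix): run a command and, if it fails,
# run it again, up to MAX_ATTEMPTS attempts in total.
# Usage: build_with_retry MAX_ATTEMPTS COMMAND [ARGS...]
build_with_retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the command exactly as given; stop at the first success.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (needs Guix and an offload setup, so it is left commented out):
# build_with_retry 2 guix build --keep-going icedove
```

With `--keep-going`, the first attempt still completes the derivations that do not time out, so a second attempt only has to redo the one that failed.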
I don't have a clear idea of how to improve the situation other than using
longer timeouts... but perhaps these timeouts could be dynamic, based on
the load of the network or CPU?
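To make the "dynamic timeout" idea concrete, here is a purely illustrative sketch. Nothing in it exists in Guix: `scaled_timeout`, its arguments, and the scaling rule (base timeout multiplied by one plus the load, capped at a maximum) are all assumptions chosen for the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of Guix): scale a base timeout by a load
# figure and cap the result.
# Usage: scaled_timeout BASE_SECONDS LOAD MAX_SECONDS
scaled_timeout() {
    # awk handles the fractional load; the result is truncated to whole seconds.
    awk -v base="$1" -v load="$2" -v max="$3" 'BEGIN {
        t = base * (1 + load)
        if (t > max) t = max
        printf "%d\n", t
    }'
}

# Example: a 10-second base timeout under a load of 2, capped at 60 seconds.
scaled_timeout 10 2 60
```

The load figure could come from anywhere (for instance the normalized load that the offload hook already prints); how to measure network saturation is left open here, as it is in the report.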
--
Thanks,
Maxim
Information forwarded to bug-guix <at> gnu.org: bug#61646; Package guix. (Thu, 23 Feb 2023 22:27:01 GMT)
Message #8 received at 61646 <at> debbugs.gnu.org:
Hi Maxim,
Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
> I can reproduce this rather easily on my system:
>
> $ ./pre-inst-env guix build icedove
> The following derivations will be built:
> /gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv
> /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv
> /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv
> process 19542 acquired build slot '/var/guix/offload/localhost:6666/0'
> normalized load on machine 'localhost' is 0.08
> building /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv...
> process 19548 acquired build slot '/var/guix/offload/localhost:6666/1'
> normalized load on machine 'localhost' is 0.08
> building /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv...
> guix offload: sending 1 store item (558 MiB) to 'localhost'...
> exporting path `/gnu/store/bwb5hcdyzgq16kmbsva7ax0zq6lzg78z-icedove-102.7.2.tar.xz'
> guix offload: error: failed to connect to 'localhost': Timeout connecting to localhost
> cannot build derivation `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv': 1 dependencies couldn't be built
> guix build: error: build of
> `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv' failed
>
> The third derivation tries to get a build slot and times out, because
> the first two have already saturated the bandwidth of the link and it
> takes more time than expected to get a reply.
Weird. Since it’s a timeout while connecting, I suppose the patch
below would improve the situation:
diff --git a/guix/scripts/offload.scm b/guix/scripts/offload.scm
index 578b3b9888..90cf97401c 100644
--- a/guix/scripts/offload.scm
+++ b/guix/scripts/offload.scm
@@ -220,7 +220,7 @@ (define* (open-ssh-session machine #:optional max-silent-time)
         (session (make-session #:user (build-machine-user machine)
                                #:host (build-machine-name machine)
                                #:port (build-machine-port machine)
-                               #:timeout 10 ;initial timeout (seconds)
+                               #:timeout 30 ;initial timeout (seconds)
                                ;; #:log-verbosity 'protocol
                                #:identity (build-machine-private-key machine)
WDYT?
Ludo’.
Information forwarded to bug-guix <at> gnu.org: bug#61646; Package guix. (Sat, 25 Feb 2023 02:47:02 GMT)
Message #11 received at 61646 <at> debbugs.gnu.org:
Hi Ludovic,
Ludovic Courtès <ludo <at> gnu.org> writes:
> Hi Maxim,
>
> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>
>> I can reproduce this rather easily on my system:
>>
>> $ ./pre-inst-env guix build icedove
>> The following derivations will be built:
>> /gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv
>> /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv
>> /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv
>> process 19542 acquired build slot '/var/guix/offload/localhost:6666/0'
>> normalized load on machine 'localhost' is 0.08
>> building /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv...
>> process 19548 acquired build slot '/var/guix/offload/localhost:6666/1'
>> normalized load on machine 'localhost' is 0.08
>> building /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv...
>> guix offload: sending 1 store item (558 MiB) to 'localhost'...
>> exporting path `/gnu/store/bwb5hcdyzgq16kmbsva7ax0zq6lzg78z-icedove-102.7.2.tar.xz'
>> guix offload: error: failed to connect to 'localhost': Timeout connecting to localhost
>> cannot build derivation
>> `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv': 1
>> dependencies couldn't be built
>> guix build: error: build of
>> `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv' failed
>>
>> The third derivation tries to get a build slot and times out, because
>> the first two have already saturated the bandwidth of the link and it
>> takes more time than expected to get a reply.
>
> Weird. Since it’s a timeout while connecting, I suppose the patch
> below would improve the situation:
>
> diff --git a/guix/scripts/offload.scm b/guix/scripts/offload.scm
> index 578b3b9888..90cf97401c 100644
> --- a/guix/scripts/offload.scm
> +++ b/guix/scripts/offload.scm
> @@ -220,7 +220,7 @@ (define* (open-ssh-session machine #:optional max-silent-time)
>          (session (make-session #:user (build-machine-user machine)
>                                 #:host (build-machine-name machine)
>                                 #:port (build-machine-port machine)
> -                               #:timeout 10 ;initial timeout (seconds)
> +                               #:timeout 30 ;initial timeout (seconds)
>                                 ;; #:log-verbosity 'protocol
>                                 #:identity (build-machine-private-key machine)
Hm, how can I test this again?
I tried launching a daemon both on the remote and locally, with
something like:
sudo -E ./pre-inst-env ./guix-daemon --build-users-group guixbuild \
--max-silent-time 0 --timeout 0 --log-compression none --discover=yes \
--substitute-urls "https://ci.guix.gnu.org https://bordeaux.guix.gnu.org" \
--max-jobs=20
and the edited code doesn't seem to run (I put an (error 'hello) in
there and nothing happened).
--
Thanks,
Maxim
Reply sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>: You have taken responsibility. (Sat, 25 Feb 2023 03:08:02 GMT)
Notification sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>: bug acknowledged by developer. (Sat, 25 Feb 2023 03:08:02 GMT)
Message #16 received at 61646-done <at> debbugs.gnu.org:
Hello,
Ludovic Courtès <ludo <at> gnu.org> writes:
[...]
> Weird. Since it’s a timeout while connecting, I suppose the patch
> below would improve the situation:
>
> diff --git a/guix/scripts/offload.scm b/guix/scripts/offload.scm
> index 578b3b9888..90cf97401c 100644
> --- a/guix/scripts/offload.scm
> +++ b/guix/scripts/offload.scm
> @@ -220,7 +220,7 @@ (define* (open-ssh-session machine #:optional max-silent-time)
>          (session (make-session #:user (build-machine-user machine)
>                                 #:host (build-machine-name machine)
>                                 #:port (build-machine-port machine)
> -                               #:timeout 10 ;initial timeout (seconds)
> +                               #:timeout 30 ;initial timeout (seconds)
>                                 ;; #:log-verbosity 'protocol
>                                 #:identity (build-machine-private-key machine)
Never mind my previous message: --sysconfdir had not been set, so my
offload setup (/etc/guix/machines.scm) was being ignored. The command
below worked to test the change from the local machine:
--8<---------------cut here---------------start------------->8---
sudo -E ./pre-inst-env ./guix-daemon --build-users-group guixbuild \
--max-silent-time 0 --timeout 0 --log-compression none --discover=yes \
--substitute-urls "https://ci.guix.gnu.org https://bordeaux.guix.gnu.org" \
--max-jobs=4
--8<---------------cut here---------------end--------------->8---
I pushed the fix in commit 53d718f61b.
Closing, thank you!
--
Thanks,
Maxim
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 25 Mar 2023 11:24:07 GMT)
This bug report was last modified 2 years and 49 days ago.