GNU bug report logs - #67485
[Cuirass] Workers not waking up after server went away


Package: guix

Reported by: Ludovic Courtès <ludovic.courtes <at> inria.fr>

Date: Mon, 27 Nov 2023 13:30:02 UTC

Severity: normal

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.





Report forwarded to bug-guix <at> gnu.org:
bug#67485; Package guix. (Mon, 27 Nov 2023 13:30:02 GMT)

Acknowledgement sent to Ludovic Courtès <ludovic.courtes <at> inria.fr>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 27 Nov 2023 13:30:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: bug-guix <at> gnu.org
Subject: [Cuirass] Workers not waking up after server went away
Date: Mon, 27 Nov 2023 14:28:41 +0100

Hello,

The ‘cuirass remote-worker’ processes (1.2.0-1.bdc1f9f) didn’t wake up
after ‘cuirass remote-server’ stopped responding earlier today,
remaining stuck while waiting for a reply to their latest “request work”
message:

--8<---------------cut here---------------start------------->8---
Nov 27 02:47:30 guixp9 cuirass[22122]: COhE8Mw6: derivation `/gnu/store/acljcvz7wb3pc9bxipkl1vf74ac7ns2z-calf-0.90.3.drv' build failed: build o
Nov 27 02:47:30 guixp9 cuirass[22122]: COhE8Mw6: request work.
Nov 27 02:47:30 guixp9 cuirass[22122]: HKCtyhxH: derivation `/gnu/store/z51fxy3j476136wcqd5gmy9v9r2vyqwn-csdr-0.18.2.drv' build failed: build o
Nov 27 02:47:30 guixp9 cuirass[22122]: HKCtyhxH: request work.
Nov 27 02:47:44 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:47:44 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:48:44 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:48:44 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:49:45 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:49:45 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:50:45 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:50:45 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:51:45 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:51:45 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:52:45 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:52:45 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:53:46 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:53:46 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:54:46 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:54:46 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:55:46 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:55:46 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:55:53 guixp9 cuirass[22122]: worker's alive
Nov 27 02:56:46 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
Nov 27 02:56:46 guixp9 cuirass[22122]: HKCtyhxH: ping tcp://10.0.0.1:5555.
Nov 27 02:57:47 guixp9 cuirass[22122]: COhE8Mw6: ping tcp://10.0.0.1:5555.
--8<---------------cut here---------------end--------------->8---

They had to be manually restarted.

This shouldn’t be the case.  Instead, they should say “received
bootstrap message” when the new ‘cuirass remote-server’ is spawned and
keep going.

Ludo’.




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Thu, 29 Aug 2024 09:40:02 GMT)

Notification sent to Ludovic Courtès <ludovic.courtes <at> inria.fr>:
bug acknowledged by developer. (Thu, 29 Aug 2024 09:40:02 GMT)

Message #10 received at 67485-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: 67485-done <at> debbugs.gnu.org
Subject: Re: bug#67485: [Cuirass] Workers not waking up after server went away
Date: Thu, 29 Aug 2024 11:38:22 +0200

Ludovic Courtès <ludovic.courtes <at> inria.fr> wrote:

> The ‘cuirass remote-worker’ processes (1.2.0-1.bdc1f9f) didn’t wake up
> after ‘cuirass remote-server’ stopped responding earlier today,
> remaining stuck while waiting for a reply to their latest “request work”
> message:

I believe this is fixed.  In particular, Cuirass commit
fdb6bdfa27d9da8d052ed76b6a05b3817ff19777 added a timeout waiting for
“request work” replies.

Ludo’.
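
For illustration only: the fix mentioned above is described as adding a timeout while waiting for “request work” replies. The sketch below is not Cuirass code (Cuirass is written in Guile Scheme); it is a minimal Python model of the general idea, under stated assumptions. The names request_work, worker_loop and silent_then_back, the queue-based transport, and the retry policy are all hypothetical and are not taken from the commit.

```python
import queue
import threading


def request_work(requests, replies, timeout):
    """Send one "request work" message; return the reply, or None on timeout."""
    requests.put("request work")
    try:
        # Block for at most `timeout` seconds instead of waiting forever.
        return replies.get(timeout=timeout)
    except queue.Empty:
        return None


def worker_loop(requests, replies, timeout, max_attempts):
    """Re-send the request after each timeout; return a log of what happened."""
    log = []
    for _ in range(max_attempts):
        reply = request_work(requests, replies, timeout)
        if reply is None:
            log.append("timeout")  # server stayed silent: wake up and ask again
            continue
        log.append(reply)
        break
    return log


def silent_then_back(requests, replies, ignored):
    """Hypothetical server: drops the first `ignored` requests, then answers one."""
    for _ in range(ignored):
        requests.get()
    requests.get()
    replies.put("work")


if __name__ == "__main__":
    requests, replies = queue.Queue(), queue.Queue()
    threading.Thread(target=silent_then_back, args=(requests, replies, 2),
                     daemon=True).start()
    print(worker_loop(requests, replies, timeout=0.2, max_attempts=5))
```

Without the timeout in request_work, the first unanswered request would block the worker indefinitely, which matches the symptom in the log above: pings continue, but no further “request work” message is ever sent.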




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 26 Sep 2024 11:24:17 GMT)

This bug report was last modified 227 days ago.


