GNU bug report logs - #63368
Build coordinator "Signals delivery fails constantly" crashes
Report forwarded to bug-guix <at> gnu.org: bug#63368; Package guix. (Mon, 08 May 2023 10:55:02 GMT)
Acknowledgement sent to Christopher Baines <mail <at> cbaines.net>: New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 08 May 2023 10:55:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Since the recent core-updates merge, I've seen the build coordinator
using less memory, but it's also been crashing in a new way, up to 10
times a day.
In the log, you see something like:
2023-05-07 09:15:42 Signals delivery fails constantly at GC #71051
2023-05-07 09:15:42 Signals delivery fails constantly
I'm guessing the switch from libgc-8.0.4 to libgc-8.2.2 has something to
do with this.
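To put a number on "up to 10 times a day", the crash lines can be counted per calendar day. The following is a minimal sketch, assuming every log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that each crash logs exactly one line carrying the `at GC #N` suffix, as in the excerpt above. The function name `crashes_per_day` is hypothetical.

```python
import re
from collections import Counter

# Matches the first of the two lines quoted in the report, e.g.
# "2023-05-07 09:15:42 Signals delivery fails constantly at GC #71051".
CRASH_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} "
    r"Signals delivery fails constantly at GC #(\d+)\s*$"
)


def crashes_per_day(lines):
    """Return a Counter mapping 'YYYY-MM-DD' to the number of crash lines."""
    counts = Counter()
    for line in lines:
        match = CRASH_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2023-05-07 09:15:42 Signals delivery fails constantly at GC #71051",
        "2023-05-07 09:15:42 Signals delivery fails constantly",
    ]
    print(dict(crashes_per_day(sample)))
```

The second line of each pair (without the `at GC #N` suffix) is deliberately ignored so that a crash is not counted twice.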
Information forwarded to bug-guix <at> gnu.org: bug#63368; Package guix. (Wed, 10 May 2023 12:50:03 GMT)
Message #8 received at 63368 <at> debbugs.gnu.org (full text, mbox):
Christopher Baines <mail <at> cbaines.net> writes:
> Since the recent core-updates merge, I've seen the build coordinator
> using less memory, but it's also been crashing in a new way, up to 10
> times a day.
>
> In the log, you see something like:
>
> 2023-05-07 09:15:42 Signals delivery fails constantly at GC #71051
> 2023-05-07 09:15:42 Signals delivery fails constantly
>
> I'm guessing the switch from libgc-8.0.4 to libgc-8.2.2 has something to
> do with this.
I think I've found a workaround. I found a list of environment variables
[1] you can set to affect the GC behaviour, and the first one I tried
(GC_RETRY_SIGNALS=0) seems to have had the desired effect, in that the
crashes/restarts have stopped.
1: https://github.com/ivmai/bdwgc/blob/master/docs/README.environment
I've sent a patch [2] to apply this setting as part of the service.
2: https://issues.guix.gnu.org/63417
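For readers who want to try the workaround outside the Guix service, it amounts to putting `GC_RETRY_SIGNALS=0` into the environment of the process before it starts, on the assumption that libgc reads such variables when it initialises. The sketch below is only illustrative: the variable name and value come from the message above, while the helper names and the placeholder child command are hypothetical.

```python
import os
import subprocess
import sys


def env_with_gc_workaround(base_env=None):
    """Return a copy of the environment with GC_RETRY_SIGNALS=0 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["GC_RETRY_SIGNALS"] = "0"
    return env


def run_with_workaround(command):
    """Run `command` (a list of strings) with the workaround applied."""
    return subprocess.run(command, env=env_with_gc_workaround(), check=False)


if __name__ == "__main__":
    # Placeholder command: a child Python process that echoes the variable.
    # In practice this would be the Guile program affected by the crashes.
    run_with_workaround(
        [sys.executable, "-c",
         "import os; print(os.environ.get('GC_RETRY_SIGNALS'))"]
    )
```

The environment is copied rather than modified in place, so the calling process is unaffected.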
Information forwarded to bug-guix <at> gnu.org: bug#63368; Package guix. (Thu, 25 May 2023 15:26:01 GMT)
Message #11 received at 63368 <at> debbugs.gnu.org (full text, mbox):
Hi,
Christopher Baines <mail <at> cbaines.net> skribis:
> Since the recent core-updates merge, I've seen the build coordinator
> using less memory, but it's also been crashing in a new way, up to 10
> times a day.
>
> In the log, you see something like:
>
> 2023-05-07 09:15:42 Signals delivery fails constantly at GC #71051
> 2023-05-07 09:15:42 Signals delivery fails constantly
>
> I'm guessing the switch from libgc-8.0.4 to libgc-8.2.2 has something to
> do with this.
Normally on GNU/Linux libgc has:
#define SIG_SUSPEND SIGPWR
The Coordinator fiddles with SIGALRM, SIGUSR1, SIGINT, and SIGPIPE,
which should normally be fine.
Is there anything else that might interfere with libgc?
Ludo’.
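One way to look for interference with SIGPWR is to inspect the signal masks of the running process. On Linux, `/proc/PID/status` reports `SigBlk`, `SigIgn` and `SigCgt` as hexadecimal bitmasks in which bit n-1 stands for signal n. The sketch below decodes them; the function names are hypothetical. Note that `SigBlk` is per thread, so `/proc/PID/task/TID/status` would have to be read for threads other than the main one.

```python
import signal


def decode_sigmask(hex_mask):
    """Decode a hex mask from /proc/PID/status into signal names.

    Names are returned in ascending signal-number order.
    """
    mask = int(hex_mask, 16)
    names = []
    for signum in range(1, 65):
        if mask & (1 << (signum - 1)):
            try:
                names.append(signal.Signals(signum).name)
            except ValueError:
                # Signal number without a name in Python's enum.
                names.append(f"SIG{signum}")
    return names


def signal_masks(status_text):
    """Extract SigBlk/SigIgn/SigCgt from the text of /proc/PID/status."""
    masks = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("SigBlk", "SigIgn", "SigCgt"):
            masks[key] = decode_sigmask(value.strip())
    return masks


if __name__ == "__main__":
    # Linux only: inspect the current process.
    with open("/proc/self/status") as status_file:
        for field, names in signal_masks(status_file.read()).items():
            print(field, names)
```

If SIGPWR showed up under `SigBlk` or `SigIgn`, or were missing from `SigCgt`, that would point at something changing its disposition behind libgc's back.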
Information forwarded to bug-guix <at> gnu.org: bug#63368; Package guix. (Thu, 25 May 2023 15:42:01 GMT)
Message #14 received at 63368 <at> debbugs.gnu.org (full text, mbox):
Ludovic Courtès <ludo <at> gnu.org> writes:
> Christopher Baines <mail <at> cbaines.net> skribis:
>
>> Since the recent core-updates merge, I've seen the build coordinator
>> using less memory, but it's also been crashing in a new way, up to 10
>> times a day.
>>
>> In the log, you see something like:
>>
>> 2023-05-07 09:15:42 Signals delivery fails constantly at GC #71051
>> 2023-05-07 09:15:42 Signals delivery fails constantly
>>
>> I'm guessing the switch from libgc-8.0.4 to libgc-8.2.2 has something to
>> do with this.
>
> Normally on GNU/Linux libgc has:
>
> #define SIG_SUSPEND SIGPWR
>
> The Coordinator fiddles with SIGALRM, SIGUSR1, SIGINT, and SIGPIPE,
> which should normally be fine.
>
> Is there anything else that might interfere with libgc?
I've seen this issue in both the build coordinator and nar-herder, both
of which use guile-sqlite, so I wonder if that could have something to
do with it.
Information forwarded to bug-guix <at> gnu.org: bug#63368; Package guix. (Fri, 02 Jun 2023 17:13:01 GMT)
Message #17 received at 63368 <at> debbugs.gnu.org (full text, mbox):
Christopher Baines <mail <at> cbaines.net> writes:
> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Christopher Baines <mail <at> cbaines.net> skribis:
>>
>>> Since the recent core-updates merge, I've seen the build coordinator
>>> using less memory, but it's also been crashing in a new way, up to 10
>>> times a day.
>>>
>>> In the log, you see something like:
>>>
>>> 2023-05-07 09:15:42 Signals delivery fails constantly at GC #71051
>>> 2023-05-07 09:15:42 Signals delivery fails constantly
>>>
>>> I'm guessing the switch from libgc-8.0.4 to libgc-8.2.2 has something to
>>> do with this.
>>
>> Normally on GNU/Linux libgc has:
>>
>> #define SIG_SUSPEND SIGPWR
>>
>> The Coordinator fiddles with SIGALRM, SIGUSR1, SIGINT, and SIGPIPE,
>> which should normally be fine.
>>
>> Is there anything else that might interfere with libgc?
>
> I've seen this issue in both the build coordinator and nar-herder, both
> of which use guile-sqlite, so I wonder if that could have something to
> do with it.
I've seen this happen with the build coordinator agent now (on
milano-guix-1):
2023-06-02 18:59:55 2023-06-02 18:59:55 (DEBUG): fb9f06cf-cc1d-4493-88b8-3eac9437f5d4: checking the availability of build inputs
2023-06-02 18:59:55 2023-06-02 18:59:55 (INFO ): fb9f06cf-cc1d-4493-88b8-3eac9437f5d4: setup successful, building: /gnu/store/7fbrli2a8nzn676q8gz2b0i0y0lr9nxv-r-quasr-1.40.0.drv
2023-06-02 19:00:46 Signals delivery fails constantly at GC #55
2023-06-02 19:01:22 Signals delivery fails constantly
2023-06-02 19:01:29 locale is en_US.utf8
2023-06-02 19:01:29 (gnutls version: 3.7.7, guix version: 1.4.0-6.dc5430c)
Which is a bit more concerning, since the build coordinator agent is
intentionally quite simple (no SQLite for example).
Information forwarded to bug-guix <at> gnu.org: bug#63368; Package guix. (Tue, 06 Jun 2023 15:10:02 GMT)
Message #20 received at 63368 <at> debbugs.gnu.org (full text, mbox):
Christopher Baines <mail <at> cbaines.net> skribis:
> I've seen this happen with the build coordinator agent now (on
> milano-guix-1):
>
> 2023-06-02 18:59:55 2023-06-02 18:59:55 (DEBUG): fb9f06cf-cc1d-4493-88b8-3eac9437f5d4: checking the availability of build inputs
> 2023-06-02 18:59:55 2023-06-02 18:59:55 (INFO ): fb9f06cf-cc1d-4493-88b8-3eac9437f5d4: setup successful, building: /gnu/store/7fbrli2a8nzn676q8gz2b0i0y0lr9nxv-r-quasr-1.40.0.drv
> 2023-06-02 19:00:46 Signals delivery fails constantly at GC #55
> 2023-06-02 19:01:22 Signals delivery fails constantly
> 2023-06-02 19:01:29 locale is en_US.utf8
> 2023-06-02 19:01:29 (gnutls version: 3.7.7, guix version: 1.4.0-6.dc5430c)
>
> Which is a bit more concerning, since the build coordinator agent is
> intentionally quite simple (no SQLite for example).
The closure of (guix-build-coordinator agent) seems to be quite large
still.
Could you check what .so files are loaded by that code, perhaps via
/proc/PID/maps?
Thanks,
Ludo’.
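The check suggested above can be scripted. Each line of `/proc/PID/maps` has six whitespace-separated fields (address range, permissions, offset, device, inode, pathname), and the pathname is empty for anonymous mappings. The following sketch collects the distinct shared-object paths from that text; the function name is hypothetical and the heuristic for recognising a shared object (a file name ending in `.so` or containing `.so.`) is an assumption.

```python
def loaded_shared_objects(maps_text):
    """Return the sorted, de-duplicated .so paths found in /proc/PID/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping: no pathname field
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        # Skip pseudo-entries such as [heap] or [stack].
        if path.startswith("/") and (name.endswith(".so") or ".so." in name):
            paths.add(path)
    return sorted(paths)


if __name__ == "__main__":
    # Linux only: list the shared objects mapped into the current process.
    with open("/proc/self/maps") as maps_file:
        for path in loaded_shared_objects(maps_file.read()):
            print(path)
```

To inspect another process, read `/proc/PID/maps` for its PID instead of `/proc/self/maps`, which typically requires the same user or root.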
Information forwarded to bug-guix <at> gnu.org: bug#63368; Package guix. (Tue, 06 Jun 2023 15:21:02 GMT)
Message #23 received at 63368 <at> debbugs.gnu.org (full text, mbox):
Ludovic Courtès <ludo <at> gnu.org> writes:
> Christopher Baines <mail <at> cbaines.net> skribis:
>
>> I've seen this happen with the build coordinator agent now (on
>> milano-guix-1):
>>
>> 2023-06-02 18:59:55 2023-06-02 18:59:55 (DEBUG):
>> fb9f06cf-cc1d-4493-88b8-3eac9437f5d4: checking the availability of
>> build inputs
>> 2023-06-02 18:59:55 2023-06-02 18:59:55 (INFO ):
>> fb9f06cf-cc1d-4493-88b8-3eac9437f5d4: setup successful, building:
>> /gnu/store/7fbrli2a8nzn676q8gz2b0i0y0lr9nxv-r-quasr-1.40.0.drv
>> 2023-06-02 19:00:46 Signals delivery fails constantly at GC #55
>> 2023-06-02 19:01:22 Signals delivery fails constantly
>> 2023-06-02 19:01:29 locale is en_US.utf8
>> 2023-06-02 19:01:29 (gnutls version: 3.7.7, guix version: 1.4.0-6.dc5430c)
>>
>> Which is a bit more concerning, since the build coordinator agent is
>> intentionally quite simple (no SQLite for example).
>
> The closure of (guix-build-coordinator agent) seems to be quite large
> still.
>
> Could you check what .so files are loaded by that code, perhaps via
> /proc/PID/maps?
I think I see these (that's on milano-guix-1 currently):
/gnu/store/0i81lpfnn05pmjc5f43q4nfvd27r08f7-guile-gnutls-3.7.12/lib/guile/3.0/extensions/guile-gnutls-v-2.so.0.0.0
/gnu/store/0jk7sl5xqwwdkzjpp9sxgz9z0d48a3vy-libunistring-1.0/lib/libunistring.so.2.2.0
/gnu/store/1r1azdi4hvfypnx14d01n60p4aa7g2im-libidn2-2.3.4/lib/libidn2.so.0.3.8
/gnu/store/1w1r6r56z9lhg8ghcb7lxss6mkn7d5l1-libgc-8.2.2/lib/libgc.so.1.5.1
/gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1.6.0
/gnu/store/8y0pwifz8a3d7zbdfzsawa1amf4afx1s-libgcrypt-1.10.1/lib/libgcrypt.so.20.4.1
/gnu/store/930nwsiysdvy2x5zv1sf6v7ym75z8ayk-gcc-11.3.0-lib/lib/libgcc_s.so.1
/gnu/store/c2fx42ial6lr60s96xcbml5hd8vwaxq3-nettle-3.8.1/lib/libhogweed.so.6.6
/gnu/store/c2fx42ial6lr60s96xcbml5hd8vwaxq3-nettle-3.8.1/lib/libnettle.so.8.6
/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/ld-linux-x86-64.so.2
/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libcrypt.so.1
/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libc.so.6
/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libm.so.6
/gnu/store/ib2n2vzqpchc3bhh9i712w5sq9zapn8d-gmp-6.2.1/lib/libgmp.so.10.4.1
/gnu/store/j5kzdjan6mnf2ngmkc50fia8vrbpqi9b-libtasn1-4.19.0/lib/libtasn1.so.6.6.3
/gnu/store/k0p01a6b7hsxjfr65ga4f2gh6lh92aiq-lzlib-1.13/lib/liblz.so.1.13
/gnu/store/m9wi9hcrf7f9dm4ri32vw1jrbh1csywi-libgpg-error-1.45/lib/libgpg-error.so.0.33.0
/gnu/store/slzq3zqwj75lbrg4ly51hfhbv2vhryv5-zlib-1.2.13/lib/libz.so.1.2.13
/gnu/store/vq7dxp5la2lnhsvniwv38j0ggvsmzim7-p11-kit-0.24.1/lib/libp11-kit.so.0.3.0
/gnu/store/w8b0l8hk6g0fahj4fvmc4qqm3cvaxnmv-libffi-3.4.4/lib/libffi.so.8.1.2
/gnu/store/yr4lbvdyc4dgs76yij1dw2w2z8s84af8-gnutls-3.7.7/lib/libgnutls.so.30.34.1
Information forwarded to bug-guix <at> gnu.org: bug#63368; Package guix. (Fri, 09 Jun 2023 13:15:01 GMT)
Message #26 received at 63368 <at> debbugs.gnu.org (full text, mbox):
Christopher Baines <mail <at> cbaines.net> skribis:
> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Christopher Baines <mail <at> cbaines.net> skribis:
[...]
>>> 2023-06-02 19:01:22 Signals delivery fails constantly
>>> 2023-06-02 19:01:29 locale is en_US.utf8
>>> 2023-06-02 19:01:29 (gnutls version: 3.7.7, guix version: 1.4.0-6.dc5430c)
>>>
>>> Which is a bit more concerning, since the build coordinator agent is
>>> intentionally quite simple (no SQLite for example).
>>
>> The closure of (guix-build-coordinator agent) seems to be quite large
>> still.
>>
>> Could you check what .so files are loaded by that code, perhaps via
>> /proc/PID/maps?
>
> I think I see these (that's on milano-guix-1 currently):
>
> /gnu/store/0i81lpfnn05pmjc5f43q4nfvd27r08f7-guile-gnutls-3.7.12/lib/guile/3.0/extensions/guile-gnutls-v-2.so.0.0.0
> /gnu/store/0jk7sl5xqwwdkzjpp9sxgz9z0d48a3vy-libunistring-1.0/lib/libunistring.so.2.2.0
> /gnu/store/1r1azdi4hvfypnx14d01n60p4aa7g2im-libidn2-2.3.4/lib/libidn2.so.0.3.8
> /gnu/store/1w1r6r56z9lhg8ghcb7lxss6mkn7d5l1-libgc-8.2.2/lib/libgc.so.1.5.1
> /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/lib/libguile-3.0.so.1.6.0
> /gnu/store/8y0pwifz8a3d7zbdfzsawa1amf4afx1s-libgcrypt-1.10.1/lib/libgcrypt.so.20.4.1
> /gnu/store/930nwsiysdvy2x5zv1sf6v7ym75z8ayk-gcc-11.3.0-lib/lib/libgcc_s.so.1
> /gnu/store/c2fx42ial6lr60s96xcbml5hd8vwaxq3-nettle-3.8.1/lib/libhogweed.so.6.6
> /gnu/store/c2fx42ial6lr60s96xcbml5hd8vwaxq3-nettle-3.8.1/lib/libnettle.so.8.6
> /gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/ld-linux-x86-64.so.2
> /gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libcrypt.so.1
> /gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libc.so.6
> /gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35/lib/libm.so.6
> /gnu/store/ib2n2vzqpchc3bhh9i712w5sq9zapn8d-gmp-6.2.1/lib/libgmp.so.10.4.1
> /gnu/store/j5kzdjan6mnf2ngmkc50fia8vrbpqi9b-libtasn1-4.19.0/lib/libtasn1.so.6.6.3
> /gnu/store/k0p01a6b7hsxjfr65ga4f2gh6lh92aiq-lzlib-1.13/lib/liblz.so.1.13
> /gnu/store/m9wi9hcrf7f9dm4ri32vw1jrbh1csywi-libgpg-error-1.45/lib/libgpg-error.so.0.33.0
> /gnu/store/slzq3zqwj75lbrg4ly51hfhbv2vhryv5-zlib-1.2.13/lib/libz.so.1.2.13
> /gnu/store/vq7dxp5la2lnhsvniwv38j0ggvsmzim7-p11-kit-0.24.1/lib/libp11-kit.so.0.3.0
> /gnu/store/w8b0l8hk6g0fahj4fvmc4qqm3cvaxnmv-libffi-3.4.4/lib/libffi.so.8.1.2
> /gnu/store/yr4lbvdyc4dgs76yij1dw2w2z8s84af8-gnutls-3.7.7/lib/libgnutls.so.30.34.1
Hmm no idea. I’ve never seen “Signals delivery fails” before so I
really wonder what could be causing this. Would be great if you could
come up with a reduced test case, but I guess that won’t be easy.
Or perhaps you could run a Coordinator agent under ‘strace -f’ to see if
we get hints?
Ludo’.
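An `strace -f` log of a busy agent is likely to be large, so it may help to keep only the signal-related lines. The sketch below is a rough filter that keeps calls to a handful of signal syscalls and the `--- SIGxxx {...} ---` delivery lines that strace prints. The function name and the choice of syscalls are assumptions, and no real trace from the affected machines is shown here.

```python
import re

# Syscalls that send signals or change their disposition or mask.
SIGNAL_SYSCALLS = (
    "kill", "tkill", "tgkill",
    "rt_sigaction", "rt_sigprocmask", "rt_sigsuspend",
)
_SYSCALL_RE = re.compile(r"\b(" + "|".join(SIGNAL_SYSCALLS) + r")\(")
# strace reports a delivered signal as "--- SIGNAME {...} ---".
_DELIVERY_RE = re.compile(r"--- SIG[A-Z0-9_]+ ")


def signal_related(lines, signal_name=None):
    """Keep strace lines about signals, optionally only those naming one signal.

    strace abbreviates signal sets (for example in rt_sigprocmask), so such
    lines may not contain the full name given in `signal_name`.
    """
    selected = []
    for line in lines:
        if _SYSCALL_RE.search(line) or _DELIVERY_RE.search(line):
            if signal_name is None or signal_name in line:
                selected.append(line)
    return selected
```

Called as `signal_related(open("trace.log"), "SIGPWR")`, with `trace.log` being a hypothetical file written by `strace -f -o trace.log`, this would narrow the log down to the suspend signal that libgc uses on GNU/Linux.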
This bug report was last modified 330 days ago.