GNU bug report logs -
#43334
Cuirass crashes
Previous Next
Reported by: Mathieu Othacehe <othacehe <at> gnu.org>
Date: Fri, 11 Sep 2020 12:00:02 UTC
Severity: normal
Done: Mathieu Othacehe <othacehe <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 43334 in the body.
You can then email your comments to 43334 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-guix <at> gnu.org
:
bug#43334
; Package
guix
.
(Fri, 11 Sep 2020 12:00:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Mathieu Othacehe <othacehe <at> gnu.org>
:
New bug report received and forwarded. Copy sent to
bug-guix <at> gnu.org
.
(Fri, 11 Sep 2020 12:00:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hello,
I've observed a few Cuirass crashes the past days. The log looks like:
--8<---------------cut here---------------start------------->8---
2020-09-11T12:55:35 next evaluation in 300 seconds
GC Warning: Repeated allocation of very large block (appr. size 28766208):
May lead to memory leak and poor performance
2020-09-11T12:58:52 heap: 942.38 MiB; threads: 110; file descriptors: 257
2020-09-11T13:00:35 fetching input 'core-updates' of spec 'core-updates-core-updates'
2020-09-11T13:00:54 fetched input 'core-updates' of spec 'core-updates-core-updates' (commit "1bec03df9b60f156c657a64a323ef27f4ed14b44")
2020-09-11T13:00:54 fetching input 'guix' of spec 'guix-master'
2020-09-11T13:01:13 fetched input 'guix' of spec 'guix-master' (commit "7daa99e52d94e409f05a874813bdf739709807a2")
2020-09-11T13:01:13 evaluating spec 'guix-master'
2020-09-11T13:01:13 fetching input 'guix-modular' of spec 'guix-modular-master'
2020-09-11T13:01:17 fetched input 'guix-modular' of spec 'guix-modular-master' (commit "7daa99e52d94e409f05a874813bdf739709807a2")
2020-09-11T13:01:17 evaluating spec 'guix-modular-master'
2020-09-11T13:01:17 fetching input 'kernel-updates' of spec 'kernel-updates'
2020-09-11T13:01:21 fetched input 'kernel-updates' of spec 'kernel-updates' (commit "1de80be489e443e7c0d8c79ea84762e1706e81ff")
2020-09-11T13:01:21 fetching input 'staging' of spec 'staging-staging'
2020-09-11T13:01:24 fetched input 'staging' of spec 'staging-staging' (commit "de3c03a47160dec355d9b19ad5ca210d90c15fd7")
2020-09-11T13:01:24 fetching input 'version-1.0.1' of spec 'version-1.0.1'
2020-09-11T13:01:27 fetched input 'version-1.0.1' of spec 'version-1.0.1' (commit "58d7909c97c1ab2457faee1d7af925ee32ad15c2")
2020-09-11T13:01:27 fetching input 'version-1.1.0' of spec 'version-1.1.0'
mmap(PROT_NONE) failed
WARNING: (guile-user): imported module (fibers) overrides core binding `sleep'
2020-09-11T13:01:30 performing database optimizations
--8<---------------cut here---------------end--------------->8---
It looks like a memory allocation failed causing a Cuirass/Guile crash.
Thanks,
Mathieu
Information forwarded
to
bug-guix <at> gnu.org
:
bug#43334
; Package
guix
.
(Fri, 11 Sep 2020 12:53:01 GMT)
Full text and
rfc822 format available.
Message #8 received at 43334 <at> debbugs.gnu.org (full text, mbox):
Mathieu Othacehe <othacehe <at> gnu.org> writes:
> Hello,
>
> I've observed a few Cuirass crashes the past days. The log looks like:
>
> --8<---------------cut here---------------start------------->8---
> 2020-09-11T12:55:35 next evaluation in 300 seconds
> GC Warning: Repeated allocation of very large block (appr. size 28766208):
> May lead to memory leak and poor performance
> 2020-09-11T12:58:52 heap: 942.38 MiB; threads: 110; file descriptors: 257
> 2020-09-11T13:00:35 fetching input 'core-updates' of spec 'core-updates-core-updates'
> 2020-09-11T13:00:54 fetched input 'core-updates' of spec 'core-updates-core-updates' (commit "1bec03df9b60f156c657a64a323ef27f4ed14b44")
> 2020-09-11T13:00:54 fetching input 'guix' of spec 'guix-master'
> 2020-09-11T13:01:13 fetched input 'guix' of spec 'guix-master' (commit "7daa99e52d94e409f05a874813bdf739709807a2")
> 2020-09-11T13:01:13 evaluating spec 'guix-master'
> 2020-09-11T13:01:13 fetching input 'guix-modular' of spec 'guix-modular-master'
> 2020-09-11T13:01:17 fetched input 'guix-modular' of spec 'guix-modular-master' (commit "7daa99e52d94e409f05a874813bdf739709807a2")
> 2020-09-11T13:01:17 evaluating spec 'guix-modular-master'
> 2020-09-11T13:01:17 fetching input 'kernel-updates' of spec 'kernel-updates'
> 2020-09-11T13:01:21 fetched input 'kernel-updates' of spec 'kernel-updates' (commit "1de80be489e443e7c0d8c79ea84762e1706e81ff")
> 2020-09-11T13:01:21 fetching input 'staging' of spec 'staging-staging'
> 2020-09-11T13:01:24 fetched input 'staging' of spec 'staging-staging' (commit "de3c03a47160dec355d9b19ad5ca210d90c15fd7")
> 2020-09-11T13:01:24 fetching input 'version-1.0.1' of spec 'version-1.0.1'
> 2020-09-11T13:01:27 fetched input 'version-1.0.1' of spec 'version-1.0.1' (commit "58d7909c97c1ab2457faee1d7af925ee32ad15c2")
> 2020-09-11T13:01:27 fetching input 'version-1.1.0' of spec 'version-1.1.0'
> mmap(PROT_NONE) failed
> WARNING: (guile-user): imported module (fibers) overrides core binding `sleep'
> 2020-09-11T13:01:30 performing database optimizations
> --8<---------------cut here---------------end--------------->8---
>
> It looks like a memory allocation failed causing a Cuirass/Guile crash.
On ci.guix.gnu.org? We have 188GiB RAM there according to free.
--
Ricardo
Information forwarded
to
bug-guix <at> gnu.org
:
bug#43334
; Package
guix
.
(Fri, 11 Sep 2020 19:24:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 43334 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Mathieu Othacehe <othacehe <at> gnu.org> writes:
> Hello,
>
> I've observed a few Cuirass crashes the past days. The log looks like:
>
> --8<---------------cut here---------------start------------->8---
> 2020-09-11T12:55:35 next evaluation in 300 seconds
> GC Warning: Repeated allocation of very large block (appr. size 28766208):
> May lead to memory leak and poor performance
> 2020-09-11T12:58:52 heap: 942.38 MiB; threads: 110; file descriptors: 257
> 2020-09-11T13:00:35 fetching input 'core-updates' of spec 'core-updates-core-updates'
> 2020-09-11T13:00:54 fetched input 'core-updates' of spec 'core-updates-core-updates' (commit "1bec03df9b60f156c657a64a323ef27f4ed14b44")
> 2020-09-11T13:00:54 fetching input 'guix' of spec 'guix-master'
> 2020-09-11T13:01:13 fetched input 'guix' of spec 'guix-master' (commit "7daa99e52d94e409f05a874813bdf739709807a2")
> 2020-09-11T13:01:13 evaluating spec 'guix-master'
> 2020-09-11T13:01:13 fetching input 'guix-modular' of spec 'guix-modular-master'
> 2020-09-11T13:01:17 fetched input 'guix-modular' of spec 'guix-modular-master' (commit "7daa99e52d94e409f05a874813bdf739709807a2")
> 2020-09-11T13:01:17 evaluating spec 'guix-modular-master'
> 2020-09-11T13:01:17 fetching input 'kernel-updates' of spec 'kernel-updates'
> 2020-09-11T13:01:21 fetched input 'kernel-updates' of spec 'kernel-updates' (commit "1de80be489e443e7c0d8c79ea84762e1706e81ff")
> 2020-09-11T13:01:21 fetching input 'staging' of spec 'staging-staging'
> 2020-09-11T13:01:24 fetched input 'staging' of spec 'staging-staging' (commit "de3c03a47160dec355d9b19ad5ca210d90c15fd7")
> 2020-09-11T13:01:24 fetching input 'version-1.0.1' of spec 'version-1.0.1'
> 2020-09-11T13:01:27 fetched input 'version-1.0.1' of spec 'version-1.0.1' (commit "58d7909c97c1ab2457faee1d7af925ee32ad15c2")
> 2020-09-11T13:01:27 fetching input 'version-1.1.0' of spec 'version-1.1.0'
> mmap(PROT_NONE) failed
> WARNING: (guile-user): imported module (fibers) overrides core binding `sleep'
> 2020-09-11T13:01:30 performing database optimizations
> --8<---------------cut here---------------end--------------->8---
>
> It looks like a memory allocation failed causing a Cuirass/Guile crash.
So, I've seen this before but in a slightly different context, [1]. To
summarise, with Guile built with libgc <at> 8 the Guix Data Service couldn't
processes Guix revisions, because the code it had Guile built with
libgc <at> 8 run caused it to consistently crash with this error. The
workaround was to add a Guile variant built with libgc <at> 7 and use this
for the guix package [2].
1: http://issues.guix.info/40525
2: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40684
I'm not quite sure what Guile process is crashing here, but switching to
use Guile built with libgc <at> 7 might help.
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to
bug-guix <at> gnu.org
:
bug#43334
; Package
guix
.
(Sat, 12 Sep 2020 06:47:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 43334 <at> debbugs.gnu.org (full text, mbox):
Hey Chris,
>> It looks like a memory allocation failed causing a Cuirass/Guile crash.
>
> So, I've seen this before but in a slightly different context, [1]. To
> summarise, with Guile built with libgc <at> 8 the Guix Data Service couldn't
> processes Guix revisions, because the code it had Guile built with
> libgc <at> 8 run caused it to consistently crash with this error. The
> workaround was to add a Guile variant built with libgc <at> 7 and use this
> for the guix package [2].
>
> 1: http://issues.guix.info/40525
> 2: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40684
>
> I'm not quite sure what Guile process is crashing here, but switching to
> use Guile built with libgc <at> 7 might help.
Thanks for pointing to this, I somehow missed it at the time. I
collected the strace log which sounds indeed really similar:
--8<---------------cut here---------------start------------->8---
[pid 49511] getdents64(271, 0x7f5374304930 /* 455 entries */, 32768) = 32760
[pid 42583] mmap(0x7f5361976000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid 42583] write(2, "mmap(PROT_NONE) failed", 22) = 22
[pid 42583] write(2, "\n", 1) = 1
[pid 42583] rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
[pid 42583] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
[pid 42583] getpid() = 42562
[pid 42583] gettid() = 42583
[pid 42583] tgkill(42562, 42583, SIGABRT) = 0
[pid 42583] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 42583] --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=42562, si_uid=997} ---
[pid 42738] <... read resumed> <unfinished ...>) = ?
--8<---------------cut here---------------end--------------->8---
The abort seem to be received by the finalizer thread. I can try to use
guile-3.0/libgc-7 to confirm this theory, but I guess we'll need to dig
deeper.
Thanks,
Mathieu
Reply sent
to
Mathieu Othacehe <othacehe <at> gnu.org>
:
You have taken responsibility.
(Thu, 25 Mar 2021 12:50:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Mathieu Othacehe <othacehe <at> gnu.org>
:
bug acknowledged by developer.
(Thu, 25 Mar 2021 12:50:02 GMT)
Full text and
rfc822 format available.
Message #19 received at 43334-done <at> debbugs.gnu.org (full text, mbox):
Hello,
Closing as Cuirass evaluation process now uses less memory.
Thanks,
Mathieu
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Fri, 23 Apr 2021 11:24:08 GMT)
Full text and
rfc822 format available.
This bug report was last modified 2 years and 341 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.