GNU bug report logs - #52646
GC thread freeze

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guile; Reported by: Mathieu Othacehe <othacehe@HIDDEN>; dated Sat, 18 Dec 2021 20:53:02 UTC; Maintainer for guile is bug-guile@HIDDEN.

Message received at 52646 <at> debbugs.gnu.org:


Received: (at 52646) by debbugs.gnu.org; 21 Dec 2021 10:16:49 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Dec 21 05:16:49 2021
Received: from localhost ([127.0.0.1]:52418 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1mzcCP-0003sK-7y
	for submit <at> debbugs.gnu.org; Tue, 21 Dec 2021 05:16:49 -0500
Received: from hera.aquilenet.fr ([185.233.100.1]:60284)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1mzcCL-0003s1-01
 for 52646 <at> debbugs.gnu.org; Tue, 21 Dec 2021 05:16:47 -0500
Received: from localhost (localhost [127.0.0.1])
 by hera.aquilenet.fr (Postfix) with ESMTP id 0500253A;
 Tue, 21 Dec 2021 11:16:39 +0100 (CET)
X-Virus-Scanned: Debian amavisd-new at aquilenet.fr
Received: from hera.aquilenet.fr ([127.0.0.1])
 by localhost (hera.aquilenet.fr [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id dKdEjXb88NLy; Tue, 21 Dec 2021 11:16:38 +0100 (CET)
Received: from ribbon (unknown [IPv6:2001:660:6102:320:e120:2c8f:8909:cdfe])
 by hera.aquilenet.fr (Postfix) with ESMTPSA id A913A3FC;
 Tue, 21 Dec 2021 11:16:37 +0100 (CET)
From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
To: Mathieu Othacehe <othacehe@HIDDEN>
Subject: Re: bug#52646: GC thread freeze
References: <878rwhbppt.fsf@HIDDEN>
Date: Tue, 21 Dec 2021 11:16:36 +0100
In-Reply-To: <878rwhbppt.fsf@HIDDEN> (Mathieu Othacehe's message of "Sat, 18
 Dec 2021 21:52:30 +0100")
Message-ID: <87wnjy6z5n.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spamd-Bar: /
Authentication-Results: hera.aquilenet.fr;
	none
X-Rspamd-Server: hera
X-Rspamd-Queue-Id: 0500253A
X-Spamd-Result: default: False [0.53 / 15.00]; ARC_NA(0.00)[];
 RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[];
 TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[];
 MIME_GOOD(-0.10)[text/plain]; RCPT_COUNT_TWO(0.00)[2];
 FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+];
 R_MIXED_CHARSET(0.63)[subject]; RCVD_COUNT_TWO(0.00)[2];
 RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]
X-Spam-Score: 1.0 (+)
X-Debbugs-Envelope-To: 52646
Cc: 52646 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Hello!

Mathieu Othacehe <othacehe@HIDDEN> skribis:

> I experiment a strange behaviour with this Guile 3.0.7 process:
> https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/tree/src/cuirass/=
scripts/remote-worker.scm.
>
> The process is forking N processes that in turn start 4 threads.

This is happening in this order, right?

POSIX leaves unspecified the behavior of a child process forked from a
multi-threaded process; there could be deadlocks, etc.  =E2=80=98primitive-=
fork=E2=80=99
prints a warning when called from a multi-threaded Guile process.

The solution is for multi-threaded Guile processes to not fork at all,
or to fork only via =E2=80=98open-pipe*=E2=80=99, =E2=80=98system*=E2=80=99=
, etc., which are =E2=80=9Cknown
good=E2=80=9D (they take care of post-fork handling in the child and call =
=E2=80=98exec=E2=80=99
before anything bad could happen.)

Thanks,
Ludo=E2=80=99.




Information forwarded to bug-guile@HIDDEN:
bug#52646; Package guile. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 18 Dec 2021 20:52:37 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Dec 18 15:52:37 2021
Received: from localhost ([127.0.0.1]:44269 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1mygh3-0004x4-A8
	for submit <at> debbugs.gnu.org; Sat, 18 Dec 2021 15:52:37 -0500
Received: from lists.gnu.org ([209.51.188.17]:54984)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <othacehe@HIDDEN>) id 1mygh1-0004ww-2M
 for submit <at> debbugs.gnu.org; Sat, 18 Dec 2021 15:52:36 -0500
Received: from eggs.gnu.org ([209.51.188.92]:50122)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <othacehe@HIDDEN>) id 1myggz-00089F-FD
 for bug-guile@HIDDEN; Sat, 18 Dec 2021 15:52:34 -0500
Received: from [2001:470:142:3::e] (port=38898 helo=fencepost.gnu.org)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <othacehe@HIDDEN>) id 1myggz-00047j-7R
 for bug-guile@HIDDEN; Sat, 18 Dec 2021 15:52:33 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-Version:Date:Subject:To:From:in-reply-to:
 references; bh=ojtgPcImAWSdEucuTEB+77vgiLtFRZDbqDa5KNqBjs4=; b=Mbjo8sTqytMdr8
 hr+n9/qhmVGnaIal1iPKgWw9ojAW4Xdpq35BJYHjwvtYLJUD+VQUdcAOgyyuhw6KoX5lYVqEFICP2
 H/tn++RIJrV+Sor5R4qPTczG4eElx4AjdcGOydHoWi+3E6bywyVX7/VJNRCwqYkRuRCgKRj6kaM+b
 xx74oqi382QtjrgY45SvQsbjya7uhboW7xnjyIxRhhCyf6LH52TINp5UmNiYmR4TaJWCjT2yL1V4F
 2pg83Nc4vzpCgaNcWDfKqUrZVWH+aNtMHJ+kJIFpkttOt4RnLtFowDzhqynG0TpRnWOjgP+SSGhBZ
 NTx5BqVxvMnTPc2QOXlw==;
Received: from [2a01:e0a:19b:d9a0:45b5:a14a:5c75:5737] (port=54116 helo=meije)
 by fencepost.gnu.org with esmtpsa
 (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1)
 (envelope-from <othacehe@HIDDEN>) id 1myggy-0004x3-Qd
 for bug-guile@HIDDEN; Sat, 18 Dec 2021 15:52:33 -0500
From: Mathieu Othacehe <othacehe@HIDDEN>
To: bug-guile@HIDDEN
Subject: GC thread freeze
Date: Sat, 18 Dec 2021 21:52:30 +0100
Message-ID: <878rwhbppt.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)


Hello,

I experiment a strange behaviour with this Guile 3.0.7 process:
https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/tree/src/cuirass/scripts/remote-worker.scm.

The process is forking N processes that in turn start 4 threads. On
aarch64 machines specifically, some of those threads are freezing. Here
is what GDB is reporting:

--8<---------------cut here---------------start------------->8---
(gdb) attach 5660 ;frozen cuirass-remote-worker PID
(gdb) info thr
  Id   Target Id                                       Frame 
* 1    Thread 0xffffafd32e20 (LWP 5660) "yHg3r3fS"     0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
  2    Thread 0xffffa6c1c1d0 (LWP 5666) "ZMQbg/Reaper" 0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
  3    Thread 0xffffaf0071d0 (LWP 5667) "ZMQbg/IO/0"   0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
  4    Thread 0xffffa641b1d0 (LWP 5674) "yHg3r3fS"     0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0  0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#1  0x0000ffffafb3fb78 in __new_sem_wait_slow.constprop.0 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#2  0x0000ffffafb80318 in GC_stop_world () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3  0x0000ffffafb6c020 in GC_stopped_mark () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4  0x0000ffffafb6c8dc in GC_try_to_collect_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5  0x0000ffffafb6d598 in GC_collect_or_expand () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6  0x0000ffffafb73b4c in GC_alloc_large () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#7  0x0000ffffafb74038 in GC_generic_malloc () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#8  0x0000ffffafb74298 in GC_malloc_kind_global () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#9  0x0000ffffafc11fa8 in scm_make_bytevector () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffacacc418 in ?? ()
#11 0x0000ffffacc2ef2c in ?? ()
(gdb) thr 4
[Switching to thread 4 (Thread 0xffffa641b1d0 (LWP 5674))]
#0  0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0  0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#1  0x0000ffffaf7bf55c in nanosleep () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#2  0x0000ffffafb7e844 in GC_lock () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3  0x0000ffffafb7ecdc in GC_do_blocking_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4  0x0000ffffafb73998 in GC_with_callee_saves_pushed () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5  0x0000ffffafb79654 in GC_do_blocking () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6  0x0000ffffafc96d94 in scm_without_guile () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#7  0x0000ffffafc97050 in scm_std_select () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#8  0x0000ffffafc97b5c in scm_std_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#9  0x0000ffffafc75918 in scm_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffa6c50d94 in ?? ()
#11 0x0000ffffacc2ee0c in ?? ()
--8<---------------cut here---------------end--------------->8---

The threads 1 and 4 do no respond anymore and are stuck, thread 1 on a
futex wait and thread 4 on a sleep, both in the GC library. For what
it's worth, I do not experiment this behaviour on x86 machines.

I tried to come up with a smaller reproducer without success, but I'll
keep trying.

Thanks,

Mathieu




Acknowledgement sent to Mathieu Othacehe <othacehe@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guile@HIDDEN. Full text available.
Report forwarded to bug-guile@HIDDEN:
bug#52646; Package guile. Full text available.
Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.
Last modified: Tue, 21 Dec 2021 10:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.