From: Ludovic Courtès <ludo@HIDDEN>
To: Mathieu Othacehe <othacehe@HIDDEN>
Cc: 52646 <at> debbugs.gnu.org
Subject: Re: bug#52646: GC thread freeze
Date: Tue, 21 Dec 2021 11:16:36 +0100
Message-ID: <87wnjy6z5n.fsf@HIDDEN>
In-Reply-To: <878rwhbppt.fsf@HIDDEN> (Mathieu Othacehe's message of "Sat, 18 Dec 2021 21:52:30 +0100")

Hello!

Mathieu Othacehe <othacehe@HIDDEN> skribis:

> I am experiencing a strange behaviour with this Guile 3.0.7 process:
> https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/tree/src/cuirass/scripts/remote-worker.scm.
>
> The process is forking N processes that in turn start 4 threads.

This is happening in this order, right?

POSIX leaves unspecified the behavior of a child process forked from a
multi-threaded process; there could be deadlocks, etc. ‘primitive-fork’
prints a warning when called from a multi-threaded Guile process.

The solution is for multi-threaded Guile processes to not fork at all,
or to fork only via ‘open-pipe*’, ‘system*’, etc., which are “known good”
(they take care of post-fork handling in the child and call ‘exec’
before anything bad could happen).

Thanks,
Ludo’.
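
The "fork only through an exec-ing helper" advice above can be illustrated with a minimal sketch. It is written in Python rather than Guile, purely as a stand-in for the POSIX-level pattern; it is not the Cuirass code, and the names `worker` and `run_workers` are hypothetical. Each thread starts its child with `subprocess.run`, which launches a fresh program instead of letting a forked copy of the multi-threaded parent keep running parent code. That is broadly the role ‘system*’ and ‘open-pipe*’ play in Guile.

```python
import subprocess
import sys
import threading


def worker(results, index):
    # Start a child that immediately runs a fresh program. The child never
    # continues executing the parent's own code in a forked copy, so it
    # cannot block on locks that were held by other parent threads.
    completed = subprocess.run(
        [sys.executable, "-c", f"print({index} * 2)"],
        capture_output=True,
        text=True,
        check=True,
    )
    results[index] = int(completed.stdout.strip())


def run_workers(count=4):
    # Illustrative only: 'count' threads, each spawning one child process.
    results = [None] * count
    threads = [
        threading.Thread(target=worker, args=(results, i)) for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_workers())
```

The point of the design is that no application code runs between the creation of the child and the ‘exec’ of the new program. A bare fork from a threaded process gives no such guarantee.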
From: Mathieu Othacehe <othacehe@HIDDEN>
To: bug-guile@HIDDEN
Subject: GC thread freeze
Date: Sat, 18 Dec 2021 21:52:30 +0100
Message-ID: <878rwhbppt.fsf@HIDDEN>

Hello,

I am experiencing a strange behaviour with this Guile 3.0.7 process:
https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/tree/src/cuirass/scripts/remote-worker.scm.

The process is forking N processes that in turn start 4 threads. On
aarch64 machines specifically, some of those threads are freezing.

Here is what GDB is reporting:

--8<---------------cut here---------------start------------->8---
(gdb) attach 5660 ;frozen cuirass-remote-worker PID
(gdb) info thr
  Id   Target Id                                       Frame
* 1    Thread 0xffffafd32e20 (LWP 5660) "yHg3r3fS"     0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
  2    Thread 0xffffa6c1c1d0 (LWP 5666) "ZMQbg/Reaper" 0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
  3    Thread 0xffffaf0071d0 (LWP 5667) "ZMQbg/IO/0"   0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
  4    Thread 0xffffa641b1d0 (LWP 5674) "yHg3r3fS"     0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0  0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#1  0x0000ffffafb3fb78 in __new_sem_wait_slow.constprop.0 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#2  0x0000ffffafb80318 in GC_stop_world () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3  0x0000ffffafb6c020 in GC_stopped_mark () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4  0x0000ffffafb6c8dc in GC_try_to_collect_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5  0x0000ffffafb6d598 in GC_collect_or_expand () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6  0x0000ffffafb73b4c in GC_alloc_large () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#7  0x0000ffffafb74038 in GC_generic_malloc () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#8  0x0000ffffafb74298 in GC_malloc_kind_global () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#9  0x0000ffffafc11fa8 in scm_make_bytevector () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffacacc418 in ?? ()
#11 0x0000ffffacc2ef2c in ?? ()
(gdb) thr 4
[Switching to thread 4 (Thread 0xffffa641b1d0 (LWP 5674))]
#0  0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0  0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#1  0x0000ffffaf7bf55c in nanosleep () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#2  0x0000ffffafb7e844 in GC_lock () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3  0x0000ffffafb7ecdc in GC_do_blocking_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4  0x0000ffffafb73998 in GC_with_callee_saves_pushed () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5  0x0000ffffafb79654 in GC_do_blocking () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6  0x0000ffffafc96d94 in scm_without_guile () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#7  0x0000ffffafc97050 in scm_std_select () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#8  0x0000ffffafc97b5c in scm_std_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#9  0x0000ffffafc75918 in scm_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffa6c50d94 in ?? ()
#11 0x0000ffffacc2ee0c in ?? ()
--8<---------------cut here---------------end--------------->8---

Threads 1 and 4 no longer respond and are stuck: thread 1 on a futex
wait and thread 4 on a sleep, both in the GC library. For what it's
worth, I do not experience this behaviour on x86 machines.

I tried to come up with a smaller reproducer without success, but I'll
keep trying.

Thanks,

Mathieu
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.