GNU bug report logs - #52182
[cuirass] remote-worker process freeze

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Reported by: Mathieu Othacehe <othacehe@HIDDEN>; dated Mon, 29 Nov 2021 15:20:02 UTC; Maintainer for guix is bug-guix@HIDDEN.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 29 Nov 2021 15:19:52 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Nov 29 10:19:52 2021
Received: from localhost ([127.0.0.1]:38900 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1mriRb-0006EH-NM
	for submit <at> debbugs.gnu.org; Mon, 29 Nov 2021 10:19:52 -0500
Received: from lists.gnu.org ([209.51.188.17]:56588)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <othacehe@HIDDEN>) id 1mriRa-0006EA-PA
 for submit <at> debbugs.gnu.org; Mon, 29 Nov 2021 10:19:51 -0500
Received: from eggs.gnu.org ([209.51.188.92]:52064)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <othacehe@HIDDEN>) id 1mriRa-0004ug-KG
 for bug-guix@HIDDEN; Mon, 29 Nov 2021 10:19:50 -0500
Received: from [2001:470:142:3::e] (port=47628 helo=fencepost.gnu.org)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <othacehe@HIDDEN>) id 1mriRa-0001b4-9c
 for bug-guix@HIDDEN; Mon, 29 Nov 2021 10:19:50 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-Version:Date:Subject:To:From:in-reply-to:
 references; bh=Kdu9oFdCKG/pFUuxFu6Bxs4NV02A7B2COEBDqZ+uXzU=; b=AGDGRvK3qhx2Df
 bDiCIOuyHyAKUWs3uRoSK9EUQIidRriHD1plGRcN3BU8m2mSHyKIV+Mi4q5p8dHrzwtMjIponvPkf
 vaLEoa8a/mF58hwz5r624SWWLb+jolG1extw51LoueYo/K8qrSX2mN86+DIffkh7rUMrOwtXC9g3o
 dkLb4s8nSYWteUrgXYCdcx4QGlVMgE34iBOjoDMh0bW9yWXfytzwd3IvoGkXRASVnlDSosGIemkXk
 SlOJJK8EPZ/28Go1bK2oaeCGXBDImwRa5MXNXosRLkx2ARlGHJcZ9XPFW/as6lwLJYLJ0lXzJop23
 N/MYaxf6yv+dHaErapWQ==;
Received: from [2a01:e0a:19b:d9a0:2ddb:d3d2:32e8:d31a] (port=60754 helo=meije)
 by fencepost.gnu.org with esmtpsa
 (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1)
 (envelope-from <othacehe@HIDDEN>) id 1mriRa-0006Tk-53
 for bug-guix@HIDDEN; Mon, 29 Nov 2021 10:19:50 -0500
From: Mathieu Othacehe <othacehe@HIDDEN>
To: bug-guix@HIDDEN
Subject: [cuirass] remote-worker process freeze
Date: Mon, 29 Nov 2021 16:19:48 +0100
Message-ID: <87ilwbj8ff.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)


Hello,

On the newly installed honeycomb machines, some Cuirass remote-worker
process freeze completely and stop communicating with the
remote-server.

This has already been observed, but is for some reason more repeatable
on those machines.

Here are the information I could collect on such a frozen process using
GDB:

--8<---------------cut here---------------start------------->8---
(gdb) attach 5660 ;frozen cuirass-remote-worker PID
(gdb) info thr
  Id   Target Id                                       Frame 
* 1    Thread 0xffffafd32e20 (LWP 5660) "yHg3r3fS"     0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
  2    Thread 0xffffa6c1c1d0 (LWP 5666) "ZMQbg/Reaper" 0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
  3    Thread 0xffffaf0071d0 (LWP 5667) "ZMQbg/IO/0"   0x0000ffffaf7ec294 in epoll_pwait () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
  4    Thread 0xffffa641b1d0 (LWP 5674) "yHg3r3fS"     0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0  0x0000ffffafb3fa80 in do_futex_wait.constprop () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#1  0x0000ffffafb3fb78 in __new_sem_wait_slow.constprop.0 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libpthread.so.0
#2  0x0000ffffafb80318 in GC_stop_world () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3  0x0000ffffafb6c020 in GC_stopped_mark () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4  0x0000ffffafb6c8dc in GC_try_to_collect_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5  0x0000ffffafb6d598 in GC_collect_or_expand () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6  0x0000ffffafb73b4c in GC_alloc_large () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#7  0x0000ffffafb74038 in GC_generic_malloc () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#8  0x0000ffffafb74298 in GC_malloc_kind_global () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#9  0x0000ffffafc11fa8 in scm_make_bytevector () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffacacc418 in ?? ()
#11 0x0000ffffacc2ef2c in ?? ()
(gdb) thr 4
[Switching to thread 4 (Thread 0xffffa641b1d0 (LWP 5674))]
#0  0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
(gdb) bt
#0  0x0000ffffaf7b9d04 in clock_nanosleep@@GLIBC_2.17 () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#1  0x0000ffffaf7bf55c in nanosleep () from /gnu/store/cb88z63hyg1icd2kkahiink2p291mhr2-glibc-2.31/lib/libc.so.6
#2  0x0000ffffafb7e844 in GC_lock () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#3  0x0000ffffafb7ecdc in GC_do_blocking_inner () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#4  0x0000ffffafb73998 in GC_with_callee_saves_pushed () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#5  0x0000ffffafb79654 in GC_do_blocking () from /gnu/store/jsda4njqwjp4kb60fwa7n4mlfi1aanpq-libgc-7.6.12/lib/libgc.so.1
#6  0x0000ffffafc96d94 in scm_without_guile () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#7  0x0000ffffafc97050 in scm_std_select () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#8  0x0000ffffafc97b5c in scm_std_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#9  0x0000ffffafc75918 in scm_sleep () from /gnu/store/7g3nbnf2kf31jk696k0nyz9ck55b11a0-guile-3.0.7/lib/libguile-3.0.so.1
#10 0x0000ffffa6c50d94 in ?? ()
#11 0x0000ffffacc2ee0c in ?? ()
--8<---------------cut here---------------end--------------->8---

So the threads 2 and 3 are managed internally by ZMQ. The threads 1 and
4 are respectively the thread pinging the remote-server and the thread
actually building stuff.

Looks like they are both stuck doing GC stuff.

Thanks,

Mathieu




Acknowledgement sent to Mathieu Othacehe <othacehe@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#52182; Package guix. Full text available.
Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.
Last modified: Mon, 29 Nov 2021 15:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.