GNU bug report logs - #65178
[Shepherd] Non-responding service control fiber


Package: guix;

Reported by: Hilton Chain <hako <at> ultrarare.space>

Date: Wed, 9 Aug 2023 12:43:02 UTC

Severity: important

Merged with 65419

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 65178 in the body.
You can then email your comments to 65178 AT debbugs.gnu.org in the normal way.




Report forwarded to danclm <at> tutanota.com, contact <at> robbyzambito.me, skyvine <at> protonmail.com, ignas <at> lapenas.dev, etienne.roesch <at> gmail.com, chris <at> catsu.it, bug-guix <at> gnu.org:
bug#65178; Package guix. (Wed, 09 Aug 2023 12:43:02 GMT)

Acknowledgement sent to Hilton Chain <hako <at> ultrarare.space>:
New bug report received and forwarded. Copy sent to danclm <at> tutanota.com, contact <at> robbyzambito.me, skyvine <at> protonmail.com, ignas <at> lapenas.dev, etienne.roesch <at> gmail.com, chris <at> catsu.it, bug-guix <at> gnu.org. (Wed, 09 Aug 2023 12:43:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Hilton Chain <hako <at> ultrarare.space>
To: bug-guix <at> gnu.org
Subject: Shepherd hangs (was: Getting Guix to shutdown my laptop properly with
 Sway and no DE)
Date: Wed, 09 Aug 2023 20:41:44 +0800
Hello!

I have experienced many instances of Shepherd hanging through my use
of Guix, though I don't have a clear record of when it first happened.

Recently I have seen a few reports on the subject.  A quick search of
recent bug reports turned up nothing related; I only found this
thread [1] on help-guix.  So I'm opening a bug report here, although I
don't know how to debug Shepherd and I haven't found a way to
reproduce the hang reliably.

I'm not sure whether Shepherd also hangs during normal use, but most
of the time I notice it has already hung when I run a
reconfiguration.  The reconfiguration becomes unresponsive and
ignores ^C, and herd actions hang as well.  This usually happens
with home reconfiguration, but I remember it happening once with
system reconfiguration, after adding and deleting some services in
the configuration file.

I'm not sure how Shepherd hangs either: in the latter case (system
reconfiguration) I could still see log messages saying that it was
trying to respawn a process I had killed manually, but that was only
log output and no process was actually spawned.

And as shown in [1], there are also cases where Shepherd hangs at some
point in the halting process, usually after syslogd has been
terminated but before term-tty*.

(The termination message indicates that Shepherd was still functional
at that point, and the absence of later log entries shows that syslogd
really was stopped, but for the same reason I can't tell what happened
afterwards.  I'm still able to switch ttys after that, so I assume the
term-tty* services are still alive.)

I don't know whether they are relevant, but my configurations are
linked below:
<https://codeberg.org/hako/Testament/src/branch/trunk/dorphine-home.scm>
<https://codeberg.org/hako/Testament/src/branch/trunk/dorphine-system.scm>

Thanks

[1]:
<https://lists.gnu.org/archive/html/help-guix/2023-07/msg00021.html>
(public-inbox mirror on yhetil.org)
<https://yhetil.org/guix/NZXMeM4--3-9 <at> tutanota.com/t/#u>




Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Sun, 13 Aug 2023 15:29:02 GMT)

Message #8 received at 65178 <at> debbugs.gnu.org:

From: Hilton Chain <hako <at> ultrarare.space>
To: 65178 <at> debbugs.gnu.org
Subject: Re: Shepherd hangs (was: Getting Guix to shutdown my laptop properly
 with Sway and no DE)
Date: Sun, 13 Aug 2023 23:25:59 +0800
On Wed, 09 Aug 2023 20:41:44 +0800,
Hilton Chain wrote:
> I'm not sure whether Shepherd also hangs during normal use, but most
> of the time I notice it has already hung when I run a
> reconfiguration.  The reconfiguration becomes unresponsive and
> ignores ^C, and herd actions hang as well.  This usually happens
> with home reconfiguration,

Today I encountered the home reconfiguration issue.  The behavior is
similar to <https://issues.guix.gnu.org/54919>.

The tail of the output from the hanging reconfiguration:
--8<---------------cut here---------------start------------->8---
[...]
Symlinking /home/hako/.config/fontconfig/fonts.conf -> /gnu/store/fvvqbma1xxgisfcq7rrwihbw7jwnyliv-fonts.conf... done
Symlinking /home/hako/.gnupg/gpg-agent.conf -> /gnu/store/kfaz4zrxmfz6p72x47c7qrqvb873gbyi-gpg-agent.conf... done
Symlinking /home/hako/.ssh/config -> /gnu/store/xb6f584pwclg48fr28wl21v1mxplqp6f-ssh.conf... done
Symlinking /home/hako/.icons/default/index.theme -> /gnu/store/3sraq69nrs04ii0fjgk36aw2c57q6z27-icons.theme... done
 done
Finished updating symlinks.


--8<---------------cut here---------------end--------------->8---

And `herd status' also hangs:
--8<---------------cut here---------------start------------->8---
$ herd status

--8<---------------cut here---------------end--------------->8---




Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Tue, 15 Aug 2023 13:21:01 GMT)

Message #11 received at 65178 <at> debbugs.gnu.org:

From: Hilton Chain <hako <at> ultrarare.space>
To: 65178 <at> debbugs.gnu.org
Subject: Re: Shepherd hangs (was: Getting Guix to shutdown my laptop properly
 with Sway and no DE)
Date: Tue, 15 Aug 2023 21:20:04 +0800
On Sun, 13 Aug 2023 23:25:59 +0800,
Hilton Chain wrote:
>
> Today I encountered the home reconfiguration issue.  The behavior is
> similar to <https://issues.guix.gnu.org/54919>.

And today Shepherd hung after starting a service [1]; the service
itself started successfully (process running, logs available):
--8<---------------cut here---------------start------------->8---
$ sudo herd enable cloudflare-tunnel && sudo herd start cloudflare-tunnel
Enabled service cloudflare-tunnel.

--8<---------------cut here---------------end--------------->8---

[1]: <https://codeberg.org/hako/Rosenthal/src/commit/c7dc95c2932d7362673c28cdc2f52e6bb8357c18/rosenthal/services/child-error.scm#L151>




Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Sat, 02 Sep 2023 20:50:02 GMT)

Message #14 received at 65178 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Hilton Chain <hako <at> ultrarare.space>
Cc: 65178 <at> debbugs.gnu.org
Subject: Re: bug#65178: Shepherd hangs (was: Getting Guix to shutdown my
 laptop properly with Sway and no DE)
Date: Sat, 02 Sep 2023 22:49:35 +0200
Hi!

Hilton Chain <hako <at> ultrarare.space> writes:

> On Sun, 13 Aug 2023 23:25:59 +0800,
> Hilton Chain wrote:
>>
>> Today I encountered the home reconfiguration issue.  The behavior is
>> similar to <https://issues.guix.gnu.org/54919>.
>
> And today Shepherd hung after starting a service [1], the service
> itself started successfully (process started, logs available):

I’m assuming this is shepherd 0.10.2, right?

> $ sudo herd enable cloudflare-tunnel && sudo herd start cloudflare-tunnel
> Enabled service cloudflare-tunnel.
>
> [1]: <https://codeberg.org/hako/Rosenthal/src/commit/c7dc95c2932d7362673c28cdc2f52e6bb8357c18/rosenthal/services/child-error.scm#L151>

Is any of the services you’re using doing “non-standard things” such as
using constructors/destructors other than those provided by shepherd
(‘make-forkexec-constructor’ et al.)?

Is it reproducible, and do you think you could come up with a reduced
test case (for example by removing services from the config until you
reach the minimum)?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Sun, 03 Sep 2023 08:22:01 GMT)

Message #17 received at 65178 <at> debbugs.gnu.org:

From: Hilton Chain <hako <at> ultrarare.space>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 65178 <at> debbugs.gnu.org
Subject: Re: bug#65178: Shepherd hangs (was: Getting Guix to shutdown my
 laptop properly with Sway and no DE)
Date: Sun, 03 Sep 2023 16:21:06 +0800
On Sun, 03 Sep 2023 04:49:35 +0800,
Ludovic Courtès wrote:
>
> Hi!
>
> Hilton Chain <hako <at> ultrarare.space> writes:
>
> > On Sun, 13 Aug 2023 23:25:59 +0800,
> > Hilton Chain wrote:
> >>
> >> Today I encountered the home reconfiguration issue.  The behavior is
> >> similar to <https://issues.guix.gnu.org/54919>.
> >
> > And today Shepherd hung after starting a service [1], the service
> > itself started successfully (process started, logs available):
>
> I’m assuming this is shepherd 0.10.2, right?


Yes!


>
> > $ sudo herd enable cloudflare-tunnel && sudo herd start cloudflare-tunnel
> > Enabled service cloudflare-tunnel.
> >
> > [1]: <https://codeberg.org/hako/Rosenthal/src/commit/c7dc95c2932d7362673c28cdc2f52e6bb8357c18/rosenthal/services/child-error.scm#L151>
>
> Is any of the services you’re using doing “non-standard things” such as
> using constructors/destructors other than those provided by shepherd
> (‘make-forkexec-constructor’ et al.)?


No, not that I'm aware of.


> Is it reproducible, and do you think you could come up with a reduced
> test case (for example by removing services from the config until you
> reach the minimum)?


I still don't know what condition triggers it, so I can't produce a
test case.

I can't reproduce it reliably.  And I don't think it's really related
to the config, since Shepherd doesn't hang after rebooting into a
system generation whose reconfiguration previously made it hang.

It might be related to bug#65419 ([Shepherd] Non-responding service
control fiber), which you reported, since I see similar behavior:
`herd status nscd' still works when Shepherd hangs.

Merged 65178 65419. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Sun, 03 Sep 2023 20:00:03 GMT)

Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Sun, 03 Sep 2023 20:00:03 GMT)

Changed bug title to '[Shepherd] Non-responding service control fiber' from 'Shepherd hangs (was: Getting Guix to shutdown my laptop properly with Sway and no DE)'. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 23 Nov 2023 20:43:02 GMT)

Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Thu, 14 Dec 2023 22:57:01 GMT)

Message #26 received at 65178 <at> debbugs.gnu.org:

From: "Timo Wilken" <guix <at> twilken.net>
To: <67538 <at> debbugs.gnu.org>, <67230 <at> debbugs.gnu.org>, <65178 <at> debbugs.gnu.org>
Cc: Attila Lendvai <attila <at> lendvai.name>
Subject: Re: Shepherd stops responding during "guix system reconfigure"
Date: Thu, 14 Dec 2023 23:55:47 +0100
After a bit of searching, it looks like 67538, 67230 and 65178 may be the same
issue.

Attila Lendvai wrote:
> > > my suspicion is that it's due to some error coming from a start
> > > GEXP that somehow derails shepherd's event loop.
> >
> > iirc I once managed to get a debugger out when it happened and it's
> > stuck waiting in one of the epoll/select/alike calls,
>
> ...or one of the start/stop GEXP's calls something that (sometimes?) blocks
> indefinitely (which violates the API of shepherd).

Same symptoms here again.

For context: this time I was trying to deploy some OCI/Docker containers using
Guix' `oci-container-service-type', specifically a Shepherd service called
"conduit". My code is here:

https://cgit.twilken.net/dotfiles/log/?h=matrix-containers

(Specifically, commits bf94f7872a1df293bd904bbd2c1ef7229f4f98a8 and
c87dcdae79c6266ac3dac70af08fbef5eb21629b.)

This is with Guix commit 1b2505217cf222d98cc960b8510660976a01cfa1.

I first ran "guix system reconfigure -L . tw/system/lud.scm" with commit
bf94f7872a1df293bd904bbd2c1ef7229f4f98a8, which had a bug (an env var was
wrong, so the container failed to start). This worked as expected in that
Shepherd tried to start the service, which failed, so Shepherd disabled it.

Then, I fixed the env var and re-ran "guix system reconfigure -L .
tw/system/lud.scm" with commit c87dcdae79c6266ac3dac70af08fbef5eb21629b.
Shepherd loaded the new "conduit" service fine, as far as I can tell, but
didn't restart it because it was still disabled.

I then enabled and started the service manually. Enabling worked fine, but on
start, I got no terminal output from Shepherd, and it hung.

I still had an error in my setup (directory permissions were wrong), and I got
a message in /var/log/messages to that effect:

--8<---------------cut here---------------start------------->8---
Dec 14 21:33:50 localhost shepherd[1]: Service conduit is currently disabled. 
Dec 14 21:34:04 localhost shepherd[1]: Enabled service conduit. 
Dec 14 21:34:07 localhost shepherd[1]: Starting service user-homes... 
Dec 14 21:34:07 localhost shepherd[1]: Service user-homes has been started. 
Dec 14 21:34:07 localhost shepherd[1]: Service user-homes started. 
Dec 14 21:34:07 localhost shepherd[1]: Service user-homes running with value #t. 
Dec 14 21:34:07 localhost shepherd[1]: Starting service conduit... 
Dec 14 21:34:07 localhost shepherd[1]: Service conduit has been started. 
Dec 14 21:34:07 localhost shepherd[1]: Service conduit started. 
Dec 14 21:34:07 localhost shepherd[1]: Service conduit running with value 13226. 
Dec 14 21:34:07 localhost shepherd[1]: [docker] conduit: [...] "IO error: While open a file for appending: /var/lib/matrix-conduit/LOG: Permission denied"
--8<---------------cut here---------------end--------------->8---

...showing that Shepherd had at least tried to start the new container. The
container is not running, though (due to the error shown above), and nothing
with PID 13226 is running.
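
The check described above (take the PID that Shepherd logged and see
whether such a process still exists) could be scripted roughly as
follows.  This is only a sketch: the function names are made up for
illustration, and the regular expression assumes the log wording shown
in the excerpt above ("Service NAME running with value PID.").

```python
import os
import re

# Matches shepherd log lines such as
# "Service conduit running with value 13226."
# Non-numeric values (e.g. "#t" for user-homes) are deliberately skipped.
RUNNING_RE = re.compile(r"Service (\S+) running with value (\d+)\.")


def reported_pids(log_text):
    """Map each service name to the last PID shepherd logged for it."""
    pids = {}
    for match in RUNNING_RE.finditer(log_text):
        pids[match.group(1)] = int(match.group(2))
    return pids


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Feeding the contents of /var/log/messages to reported_pids() and then
calling pid_alive() on each value would list services that Shepherd
believes are running although their process is gone.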

The "herd start conduit" command did not return, and ^C-ing it did not help.
Afterwards, every "herd" command also hung without any output.

Here are the last four lines of the output of "sudo strace -s1000 herd status"
on such a hung machine:

--8<---------------cut here---------------start------------->8---
connect(10, {sa_family=AF_UNIX, sun_path="/var/run/shepherd/socket"}, 26) = 0
getcwd("/home/timo", 100)               = 11
write(10, "(shepherd-command (version 0) (action status) (service root) (arguments ()) (directory \"/home/timo\"))", 101) = 101
read(10,
--8<---------------cut here---------------end--------------->8---

The "read(10, " call never completes.
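
For anyone who wants to probe a possibly hung Shepherd without getting
stuck in the same read(), here is a minimal sketch that sends the same
request with a timeout.  The request format is copied from the strace
output above and nothing more; the helper names and the 5-second
default are my own assumptions, and since the reply format isn't shown
here, the reply is returned as raw bytes.

```python
import socket


def quote(text):
    """Render text as a double-quoted string, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_request(action, service, directory):
    """Build a request shaped like the one seen in the strace output."""
    return (
        f"(shepherd-command (version 0) (action {action}) "
        f"(service {service}) (arguments ()) (directory {quote(directory)}))"
    )


def query(socket_path, request, timeout=5.0):
    """Send the request; return the raw reply, or None if none arrives in time."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)  # applies to connect, send and recv
        sock.connect(socket_path)
        sock.sendall(request.encode("utf-8"))
        try:
            return sock.recv(65536)
        except socket.timeout:
            return None  # shepherd accepted the request but never answered
```

Something like query("/var/run/shepherd/socket",
build_request("status", "root", "/home/timo")) returning None would
correspond to the hang shown above, without needing ^C.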

At least in this case, Shepherd still seems to be processing inbound inet
connections, so I can open new SSH connections to the machine.

Attaching to PID 1 with strace shows it is stuck in "epoll_wait(13, "
(unsurprisingly, fd 13 points to "anon_inode:[eventpoll]"). Here's a backtrace
of all threads in "gdb -p 1":

--8<---------------cut here---------------start------------->8---
(gdb) info threads
  Id   Target Id                                     Frame 
* 1    Thread 0x7f786544c380 (LWP 1) "shepherd"      0x00007f7865552626 in epoll_wait ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  2    Thread 0x7f7864e16640 (LWP 186) "GC-marker-0" 0x00007f78654cf16a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  3    Thread 0x7f7864615640 (LWP 187) "GC-marker-1" 0x00007f78654cf16a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  4    Thread 0x7f7863e14640 (LWP 188) "GC-marker-2" 0x00007f78654cf16a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  5    Thread 0x7f78634c6640 (LWP 190) "shepherd"    0x00007f786554300c in read ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
(gdb) thread apply all bt

Thread 5 (Thread 0x7f78634c6640 (LWP 190) "shepherd"):
#0  0x00007f786554300c in read () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007f7865a48cc7 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#2  0x00007f78659427d1 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007f786594438c in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007f786594e83c in GC_do_blocking () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#5  0x00007f7865a65455 in scm_without_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#6  0x00007f7865a4d570 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#7  0x00007f7865a71390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#8  0x00007f7865a7edb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#9  0x00007f78659e5b3e in scm_call_with_unblocked_asyncs () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#10 0x00007f7865a71390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#11 0x00007f7865a7edb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#12 0x00007f7865a6b0f3 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#13 0x00007f78659e7e1a in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#14 0x00007f7865a71390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#15 0x00007f7865a7edb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#16 0x00007f78659e95ca in scm_call_2 () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#17 0x00007f7865a90092 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#18 0x00007f7865a6be1f in scm_c_catch () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#19 0x00007f78659ea396 in scm_c_with_continuation_barrier () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#20 0x00007f7865a6b049 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#21 0x00007f786594e7fa in GC_call_with_stack_base () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#22 0x00007f7865a64c5d in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#23 0x00007f78654d23aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#24 0x00007f7865552f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 4 (Thread 0x7f7863e14640 (LWP 188) "GC-marker-2"):
#0  0x00007f78654cf16a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007f78654d17e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007f7865948740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007f7865948897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007f78654d23aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007f7865552f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 3 (Thread 0x7f7864615640 (LWP 187) "GC-marker-1"):
#0  0x00007f78654cf16a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007f78654d17e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007f7865948740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007f7865948897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007f78654d23aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007f7865552f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 2 (Thread 0x7f7864e16640 (LWP 186) "GC-marker-0"):
#0  0x00007f78654cf16a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007f78654d17e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007f7865948740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007f7865948897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007f78654d23aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007f7865552f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 1 (Thread 0x7f786544c380 (LWP 1) "shepherd"):
#0  0x00007f7865552626 in epoll_wait () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007f7862bb9335 in ?? () from /gnu/store/h4nsywbhn8b4qyh40fhykk3q40qkr3wd-guile-fibers-1.3.1/lib/guile/3.0/extensions/fibers-epoll.so
#2  0x00007f78659427d1 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007f786594438c in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007f786594e83c in GC_do_blocking () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#5  0x00007f7865a65455 in scm_without_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#6  0x00007f7862bb96ce in ?? () from /gnu/store/h4nsywbhn8b4qyh40fhykk3q40qkr3wd-guile-fibers-1.3.1/lib/guile/3.0/extensions/fibers-epoll.so
#7  0x00007f78606246c2 in ?? ()
#8  0x00007f78620ba628 in ?? ()
#9  0x00007f7860627610 in ?? ()
#10 0x00007f786520ad80 in ?? ()
#11 0x00007f7865a14edc in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#12 0x00007f7865a71215 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#13 0x00007f7865a7edb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#14 0x00007f78659e9977 in scm_primitive_eval () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#15 0x00007f7865a1dff9 in scm_primitive_load () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#16 0x00007f7865a71390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#17 0x00007f7865a7edb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#18 0x00007f78659e9977 in scm_primitive_eval () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#19 0x00007f78659ef846 in scm_eval () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#20 0x00007f7865a4e3e6 in scm_shell () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#21 0x00007f7865a008cc in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#22 0x00007f78659e7e1a in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#23 0x00007f7865a71390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#24 0x00007f7865a7edb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#25 0x00007f78659e95ca in scm_call_2 () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#26 0x00007f7865a90092 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#27 0x00007f7865a6be1f in scm_c_catch () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#28 0x00007f78659ea396 in scm_c_with_continuation_barrier () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#29 0x00007f7865a6b049 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#30 0x00007f786594e7fa in GC_call_with_stack_base () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#31 0x00007f7865a653f8 in scm_with_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#32 0x00007f7865a098e5 in scm_boot_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#33 0x00000000004010f7 in ?? ()
#34 0x00007f78654761f7 in __libc_start_call_main () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#35 0x00007f78654762ac in __libc_start_main_impl () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#36 0x0000000000401171 in ?? ()
--8<---------------cut here---------------end--------------->8---
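
To compare dumps like this one across machines, the "info threads"
listing can be reduced to one (thread, function) pair per thread.  The
following is a rough sketch with hypothetical helper names; the regular
expression is written against the gdb output format shown above and may
need adjusting for other gdb versions.

```python
import re
from collections import Counter

# One match per thread line of "info threads", e.g.
# * 1    Thread 0x7f786544c380 (LWP 1) "shepherd"   0x00007f7865552626 in epoll_wait ()
THREAD_RE = re.compile(
    r'^\*?\s*(\d+)\s+Thread\s+0x[0-9a-f]+\s+\(LWP\s+(\d+)\)\s+"([^"]+)"'
    r'\s+0x[0-9a-f]+\s+in\s+(\S+)\s+\(\)',
    re.MULTILINE,
)


def blocked_calls(info_threads_output):
    """Return (thread id, LWP, thread name, function) for each listed thread."""
    return [
        (int(tid), int(lwp), name, func)
        for tid, lwp, name, func in THREAD_RE.findall(info_threads_output)
    ]


def summarize(info_threads_output):
    """Count how many threads of each name sit in each function."""
    return Counter(
        (name, func) for _, _, name, func in blocked_calls(info_threads_output)
    )
```

For the listing above this would report one "shepherd" thread in
epoll_wait, one in read, and the GC marker threads waiting on a futex.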

Separately, I also have another Shepherd on a different machine that became
stuck after I ran a bunch of "guix system reconfigure" commands. Here are the
backtraces from that machine, in case they help:

--8<---------------cut here---------------start------------->8---
(gdb) info threads 
  Id   Target Id                                     Frame 
* 1    Thread 0x7ffaceef2380 (LWP 1) "shepherd"      0x00007fface938626 in epoll_wait ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  2    Thread 0x7fface1aa640 (LWP 231) "GC-marker-0" 0x00007fface8b516a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  3    Thread 0x7ffacd9a9640 (LWP 232) "GC-marker-1" 0x00007fface8b516a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  4    Thread 0x7ffacd1a8640 (LWP 233) "GC-marker-2" 0x00007fface8b516a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  5    Thread 0x7ffacc9a7640 (LWP 234) "GC-marker-3" 0x00007fface8b516a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  6    Thread 0x7ffacc1a6640 (LWP 235) "GC-marker-4" 0x00007fface8b516a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  7    Thread 0x7ffacb9a5640 (LWP 236) "GC-marker-5" 0x00007fface8b516a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  8    Thread 0x7ffacb1a4640 (LWP 237) "GC-marker-6" 0x00007fface8b516a in __futex_abstimed_wait_common ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  9    Thread 0x7ffaca832640 (LWP 249) "shepherd"    0x00007fface92900c in read ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
  10   Thread 0x7ffac89ca640 (LWP 26693) "shepherd"  0x00007fface92900c in read ()
   from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
(gdb) thread apply all bt

Thread 10 (Thread 0x7ffac89ca640 (LWP 26693) "shepherd"):
#0  0x00007fface92900c in read () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007ffacedf0e57 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#2  0x00007ffaced3c7d1 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced3e38c in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007ffaced4883c in GC_do_blocking () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#5  0x00007ffacee62455 in scm_without_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#6  0x00007ffacedf903d in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#7  0x00007ffacede4e1a in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#8  0x00007ffac6832022 in ?? ()
#9  0x00007fface4d97f0 in ?? ()
#10 0x00007ffac94766c0 in ?? ()
#11 0x00007fface5f4b40 in ?? ()
#12 0x00007ffacee11edc in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#13 0x00007ffacee6e215 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#14 0x00007ffacee7bdb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#15 0x00007ffacede65ca in scm_call_2 () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#16 0x00007ffacee8d092 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#17 0x00007ffacee68e1f in scm_c_catch () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#18 0x00007ffacede7396 in scm_c_with_continuation_barrier () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#19 0x00007ffacee68049 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#20 0x00007ffaced487fa in GC_call_with_stack_base () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#21 0x00007ffacee623f8 in scm_with_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#22 0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#23 0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 9 (Thread 0x7ffaca832640 (LWP 249) "shepherd"):
#0  0x00007fface92900c in read () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007ffacee45cc7 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#2  0x00007ffaced3c7d1 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced3e38c in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007ffaced4883c in GC_do_blocking () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#5  0x00007ffacee62455 in scm_without_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#6  0x00007ffacee4a570 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#7  0x00007ffacee6e390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#8  0x00007ffacee7bdb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#9  0x00007ffacede2b3e in scm_call_with_unblocked_asyncs () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#10 0x00007ffacee6e390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#11 0x00007ffacee7bdb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#12 0x00007ffacee680f3 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#13 0x00007ffacede4e1a in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#14 0x00007ffacee6e390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#15 0x00007ffacee7bdb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#16 0x00007ffacede65ca in scm_call_2 () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#17 0x00007ffacee8d092 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#18 0x00007ffacee68e1f in scm_c_catch () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#19 0x00007ffacede7396 in scm_c_with_continuation_barrier () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#20 0x00007ffacee68049 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#21 0x00007ffaced487fa in GC_call_with_stack_base () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#22 0x00007ffacee61c5d in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#23 0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#24 0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 8 (Thread 0x7ffacb1a4640 (LWP 237) "GC-marker-6"):
#0  0x00007fface8b516a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007fface8b77e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007ffaced42740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced42897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 7 (Thread 0x7ffacb9a5640 (LWP 236) "GC-marker-5"):
#0  0x00007fface8b516a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007fface8b77e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007ffaced42740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced42897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 6 (Thread 0x7ffacc1a6640 (LWP 235) "GC-marker-4"):
#0  0x00007fface8b516a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007fface8b77e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007ffaced42740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced42897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 5 (Thread 0x7ffacc9a7640 (LWP 234) "GC-marker-3"):
#0  0x00007fface8b516a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007fface8b77e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007ffaced42740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced42897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 4 (Thread 0x7ffacd1a8640 (LWP 233) "GC-marker-2"):
#0  0x00007fface8b516a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007fface8b77e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007ffaced42740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced42897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 3 (Thread 0x7ffacd9a9640 (LWP 232) "GC-marker-1"):
#0  0x00007fface8b516a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007fface8b77e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007ffaced42740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced42897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 2 (Thread 0x7fface1aa640 (LWP 231) "GC-marker-0"):
#0  0x00007fface8b516a in __futex_abstimed_wait_common () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007fface8b77e8 in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2  0x00007ffaced42740 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced42897 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007fface8b83aa in start_thread () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#5  0x00007fface938f7c in clone3 () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6

Thread 1 (Thread 0x7ffaceef2380 (LWP 1) "shepherd"):
#0  0x00007fface938626 in epoll_wait () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1  0x00007ffac9efc335 in ?? () from /gnu/store/h4nsywbhn8b4qyh40fhykk3q40qkr3wd-guile-fibers-1.3.1/lib/guile/3.0/extensions/fibers-epoll.so
#2  0x00007ffaced3c7d1 in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#3  0x00007ffaced3e38c in ?? () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#4  0x00007ffaced4883c in GC_do_blocking () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#5  0x00007ffacee62455 in scm_without_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#6  0x00007ffac9efc6ce in ?? () from /gnu/store/h4nsywbhn8b4qyh40fhykk3q40qkr3wd-guile-fibers-1.3.1/lib/guile/3.0/extensions/fibers-epoll.so
#7  0x00007ffac76416c2 in ?? ()
#8  0x00007ffac934f594 in ?? ()
#9  0x00007ffac9476d83 in ?? ()
#10 0x00007fface5f4d80 in ?? ()
#11 0x00007ffacee11edc in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#12 0x00007ffacee6e652 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#13 0x00007ffacee7bdb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#14 0x00007ffacede6977 in scm_primitive_eval () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#15 0x00007ffacee1aff9 in scm_primitive_load () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#16 0x00007ffacee6e390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#17 0x00007ffacee7bdb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#18 0x00007ffacede6977 in scm_primitive_eval () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#19 0x00007ffacedec846 in scm_eval () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#20 0x00007ffacee4b3e6 in scm_shell () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#21 0x00007ffacedfd8cc in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#22 0x00007ffacede4e1a in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#23 0x00007ffacee6e390 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#24 0x00007ffacee7bdb5 in scm_call_n () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#25 0x00007ffacede65ca in scm_call_2 () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#26 0x00007ffacee8d092 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#27 0x00007ffacee68e1f in scm_c_catch () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#28 0x00007ffacede7396 in scm_c_with_continuation_barrier () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#29 0x00007ffacee68049 in ?? () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#30 0x00007ffaced487fa in GC_call_with_stack_base () from /gnu/store/k1ha4n9v8d7myiiszvl2ic7xnb56l219-libgc-8.2.2/lib/libgc.so.1
#31 0x00007ffacee623f8 in scm_with_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#32 0x00007ffacee068e5 in scm_boot_guile () from /gnu/store/n24l8hxn6nvb7lz7zjlyd7i05khrm0i4-guile-3.0.9/lib/libguile-3.0.so.1
#33 0x00000000004010f7 in ?? ()
#34 0x00007fface85c1f7 in __libc_start_call_main () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#35 0x00007fface85c2ac in __libc_start_main_impl () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#36 0x0000000000401171 in ?? ()
--8<---------------cut here---------------end--------------->8---




Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Fri, 15 Dec 2023 19:49:02 GMT)

Message #29 received at 65178 <at> debbugs.gnu.org:

From: Attila Lendvai <attila <at> lendvai.name>
To: Timo Wilken <guix <at> twilken.net>
Cc: 67538 <at> debbugs.gnu.org, 67230 <at> debbugs.gnu.org, 65178 <at> debbugs.gnu.org
Subject: Re: Shepherd stops responding during "guix system reconfigure"
Date: Fri, 15 Dec 2023 19:47:43 +0000
i think i have found the root cause of this, as documented here: https://issues.guix.gnu.org/67839

that issue contains patches for shepherd to reproduce it in its test suite.

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39





Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Fri, 15 Dec 2023 20:34:02 GMT)

Message #32 received at 65178 <at> debbugs.gnu.org:

From: "Timo Wilken" <guix <at> twilken.net>
To: "Attila Lendvai" <attila <at> lendvai.name>
Cc: 67538 <at> debbugs.gnu.org, 67230 <at> debbugs.gnu.org, 67839 <at> debbugs.gnu.org,
 65178 <at> debbugs.gnu.org
Subject: Re: Shepherd stops responding during "guix system reconfigure"
Date: Fri, 15 Dec 2023 21:33:15 +0100
On Fri Dec 15, 2023 at 8:47 PM CET, Attila Lendvai wrote:
> i think i have found the root cause of this, as documented here: https://issues.guix.gnu.org/67839
>
> that issue contains patches for shepherd to reproduce it in its test suite.

Thank you very much for this, Attila!

Are the patches in 67839 and/or your branch "attila" linked from there in a
state that I could test them locally? Would it be valuable to you if I ran a
patched Shepherd and sent logs and/or backtraces as I encountered them?




Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Fri, 15 Dec 2023 21:25:04 GMT)

Message #35 received at 65178 <at> debbugs.gnu.org:

From: Attila Lendvai <attila <at> lendvai.name>
To: Timo Wilken <guix <at> twilken.net>
Cc: 67538 <at> debbugs.gnu.org, 67230 <at> debbugs.gnu.org, 67839 <at> debbugs.gnu.org,
 65178 <at> debbugs.gnu.org
Subject: Re: Shepherd stops responding during "guix system reconfigure"
Date: Fri, 15 Dec 2023 21:24:15 +0000
> Thank you very much for this, Attila!


you're welcome! :)


> Are the patches in 67839 and/or your branch "attila" linked from there in a
> state that I could test them locally? Would it be valuable to you if I ran a
> patched Shepherd and sent logs and/or backtraces as I encountered them?


it's nice of you, but not really, now that we have a failing test case in shepherd's unit tests that can reproduce it much more easily.

with #67839 you would only get an extra "Assertion failed" message over master, without much useful output.

as for my branch, it would emit a lot of useful log output, including backtraces, but i keep force-pushing into it. i'm running my servers with it, though, so if you feel really adventurous and want to join the debugging, then you can try... otherwise it's too much in flux.

what we need to focus on now is making shepherd's test suite run clean again, one way or another. then i can test it in a real life environment, and report back with any possible findings.

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39





Information forwarded to bug-guix <at> gnu.org:
bug#65178; Package guix. (Tue, 19 Dec 2023 23:01:02 GMT)

Message #38 received at 65178 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Attila Lendvai <attila <at> lendvai.name>
Cc: 65419 <at> debbugs.gnu.org, 67538 <at> debbugs.gnu.org, 67230 <at> debbugs.gnu.org,
 65178 <at> debbugs.gnu.org, Timo Wilken <guix <at> twilken.net>
Subject: Re: bug#65419: [Shepherd] Non-responding service control fiber
Date: Wed, 20 Dec 2023 00:00:36 +0100
Hello,

Attila Lendvai <attila <at> lendvai.name> writes:

> i think i have found the root cause of this, as documented here: https://issues.guix.gnu.org/67839
>
> that issue contains patches for shepherd to reproduce it in its test suite.

Yes, it looks like this long-standing and hard-to-debug issue may well
be fixed now, thumbs up Attila!!

We have accumulated quite a few fixes by now so I think I’ll release
0.10.3 hopefully in 2023 and otherwise soon after.

Thanks,
Ludo’.




bug closed, send any further explanations to 65419 <at> debbugs.gnu.org and Ludovic Courtès <ludovic.courtes <at> inria.fr> Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Tue, 02 Jan 2024 22:11:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 31 Jan 2024 12:24:07 GMT)

This bug report was last modified 1 year and 99 days ago.


