GNU bug report logs - #26931
GuixSD rebooting fails when tmux is running

Package: guix;

Reported by: Leo Famulari <leo <at> famulari.name>

Date: Sun, 14 May 2017 19:31:01 UTC

Severity: normal

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 26931 in the body.
You can then email your comments to 26931 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Sun, 14 May 2017 19:31:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Leo Famulari <leo <at> famulari.name>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sun, 14 May 2017 19:31:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Leo Famulari <leo <at> famulari.name>
To: bug-guix <at> gnu.org
Subject: GuixSD rebooting fails when tmux is running
Date: Sun, 14 May 2017 15:30:43 -0400
When tmux is running, GuixSD fails to reboot. No more details yet, but
confirmation from at least 3 users:

[19:24:57] <CharlieBrown> My system can't shut itself down. Once it says everything is shut down, I have to power off with the power button.
[19:25:05] <lfam> CharlieBrown: I've had that issue before
[19:25:12] <lfam> CharlieBrown: Are you using tmux or screen?
[19:25:23] <CharlieBrown> lfam: Yes.
[19:25:38] <lfam> CharlieBrown: I haven't seriously debugged yet, but I think Shepherd is failing to kill tmux, preventing shutdown
[19:25:47] <paroneayea> must be that your computer doesn't want to shut down screen
[19:25:51] <CharlieBrown> lfam: :-(
[19:25:53] <paroneayea> because it's using a screen-saver
[19:25:57] <paroneayea> sorry, bad pun
[19:25:59] <lfam> Lol
[19:26:21] <lfam> I should make a bug report
[19:26:36] <lfam> CharlieBrown: Tmux or screen?
[19:26:37] <ng0> I have this bug for 2 years now.. so it's a bug.
[19:26:46] <lfam> And ng0, tmux or screen?
[19:26:46] <ng0> here it is tmux
[19:27:07] <lfam> It's quite bad for remote machines. It breaks rebooting completely
[19:27:20] <CharlieBrown> lfam: tmux

https://gnunet.org/bot/log/guix/2017-05-14#T1386137

Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Sun, 14 May 2017 21:37:02 GMT) Full text and rfc822 format available.

Message #8 received at 26931 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Leo Famulari <leo <at> famulari.name>
Cc: 26931 <at> debbugs.gnu.org
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Sun, 14 May 2017 23:36:17 +0200
Leo Famulari <leo <at> famulari.name> skribis:

> When tmux is running, GuixSD fails to reboot. No more details yet, but
> confirmation from at least 3 users:

What does /var/log/shepherd.log show around the time where you hit
“halt”?

I get something like this:

--8<---------------cut here---------------start------------->8---
18:06:26 Service mcron has been stopped.
18:06:26 sending all processes the TERM signal
18:06:30 all processes have been terminated
18:06:30 Service user-processes has been stopped.
18:06:30 Service file-systems has been stopped.
18:06:30 Service file-system-/dev/pts has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/memory has been stopped.
18:06:30 system-error("umount" "~S: ~A" ("/run/user" "Device or resource busy") (16))
18:06:30 Service file-system-/run/user has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/blkio has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/cpuacct has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/freezer has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/cpu has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/hugetlb has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/devices has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/elogind has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/cpuset has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup/perf_event has been stopped.
18:06:30 Service file-system-/sys/fs/cgroup has been stopped.
18:06:30 Service file-system-/boot/efi has been stopped.
18:06:30 Service file-system-/run/systemd has been stopped.
18:06:30 Service file-system-/gnu/store has been stopped.
18:06:30 Service udev has been stopped.
18:06:30 Service file-system-/dev/shm has been stopped.
18:06:30 closing log
--8<---------------cut here---------------end--------------->8---

“closing log” is the last message the Shepherd writes before calling
reboot(2).

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Tue, 16 May 2017 23:19:01 GMT) Full text and rfc822 format available.

Message #11 received at 26931 <at> debbugs.gnu.org (full text, mbox):

From: Leo Famulari <leo <at> famulari.name>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26931 <at> debbugs.gnu.org
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Tue, 16 May 2017 19:18:39 -0400
On Sun, May 14, 2017 at 11:36:17PM +0200, Ludovic Courtès wrote:
> What does /var/log/shepherd.log show around the time where you hit
> “halt”?
> 
> I get something like this:
> 
> --8<---------------cut here---------------start------------->8---
> 18:06:26 Service mcron has been stopped.
> 18:06:26 sending all processes the TERM signal

For me, this is where it gets stuck:

------
2017-05-16 19:12:53 sending all processes the TERM signal
2017-05-16 19:12:58 waiting for process termination (processes left: (1 494)) 
2017-05-16 19:13:00 waiting for process termination (processes left: (1 494)) 
2017-05-16 19:13:02 waiting for process termination (processes left: (1 494)) 
------

In my experience, it will wait here forever.

And from `ps aux`:

leo        494  0.0  0.1  27232  3676 ?        Ss   19:12   0:00 tmux

Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Wed, 17 May 2017 07:41:02 GMT) Full text and rfc822 format available.

Message #14 received at 26931 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Leo Famulari <leo <at> famulari.name>
Cc: 26931 <at> debbugs.gnu.org
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Wed, 17 May 2017 09:39:45 +0200
Leo Famulari <leo <at> famulari.name> skribis:

> On Sun, May 14, 2017 at 11:36:17PM +0200, Ludovic Courtès wrote:
>> What does /var/log/shepherd.log show around the time where you hit
>> “halt”?
>> 
>> I get something like this:
>> 
>> --8<---------------cut here---------------start------------->8---
>> 18:06:26 Service mcron has been stopped.
>> 18:06:26 sending all processes the TERM signal
>
> For me, this is where it gets stuck:
>
> ------
> 2017-05-16 19:12:53 sending all processes the TERM signal
> 2017-05-16 19:12:58 waiting for process termination (processes left: (1 494)) 
> 2017-05-16 19:13:00 waiting for process termination (processes left: (1 494)) 
> 2017-05-16 19:13:02 waiting for process termination (processes left: (1 494)) 
> ------
>
> In my experience, it will wait here forever.
>
> And from `ps aux`:
>
> leo        494  0.0  0.1  27232  3676 ?        Ss   19:12   0:00 tmux

Interesting.  The code for this is in (gnu services base).  It sends
SIGTERM, waits for a few seconds, and then sends SIGKILL, which
processes cannot survive AFAIK, and then enters that ‘wait’ loop.
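
Editorial sketch, not the code from (gnu services base), which is written in Scheme: the sequence described above (SIGTERM, a grace period, SIGKILL, then a polling loop) can be approximated in Python as follows. The names `exists` and `stop_processes`, the timing values, and the `max_polls` bound are assumptions made for illustration only.

```python
import os
import signal
import time


def exists(pid):
    """Return True while PID is still present in the process table.

    Signal 0 delivers nothing; it only performs the existence and
    permission checks.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_processes(pids, grace=2.0, poll=0.5, max_polls=6):
    """Send SIGTERM, wait `grace` seconds, send SIGKILL, then poll.

    Returns the PIDs still present after `max_polls` polls.  The bound is
    an addition for this sketch so that it always returns; the loop in
    the report appears to wait indefinitely.
    """
    for pid in pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # already gone
    time.sleep(grace)
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # exited during the grace period and was collected
    left = [pid for pid in pids if exists(pid)]
    polls = 0
    while left and polls < max_polls:
        print(f"waiting for process termination (processes left: {left})")
        time.sleep(poll)
        left = [pid for pid in left if exists(pid)]
        polls += 1
    return left
```

A PID that has exited but has not yet been waited for by its parent still passes the `exists` check, so a loop of this shape keeps reporting it until the parent collects it.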

This is on the bare metal and /etc/shepherd/do-not-kill does not exist,
right?

We could always add a round of SIGKILL in the ‘wait’ loop, but that
doesn’t sound right.  Does /var/log/messages contain any hints as to why
tmux wasn’t terminated?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Thu, 18 May 2017 01:14:01 GMT) Full text and rfc822 format available.

Message #17 received at 26931 <at> debbugs.gnu.org (full text, mbox):

From: Leo Famulari <leo <at> famulari.name>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26931 <at> debbugs.gnu.org
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Wed, 17 May 2017 21:13:55 -0400
On Wed, May 17, 2017 at 09:39:45AM +0200, Ludovic Courtès wrote:
> This is on the bare metal and /etc/shepherd/do-not-kill does not exist,
> right?

Yes, on a Thinkpad x200s (x86_64) with a recent kernel. Nothing is
protected by a 'do-not-kill' file.

> We could always add a round of SIGKILL in the ‘wait’ loop, but that
> doesn’t sound right.

Agreed.

> Does /var/log/messages contain any hints as to why tmux wasn’t
> terminated?

Not from what I can see. I'll keep digging.

Here is the excerpt of /var/log/messages that spans the (forced) reboot.
The last message before I powered off the system is syslogd exiting
after receiving SIGTERM, and the first message after booting is syslogd
restarting.

May 16 19:11:41 localhost ntpd[326]: Listen normally on 4 wls1 192.168.1.103:123
May 16 19:11:41 localhost ntpd[326]: bind(24) AF_INET6 fde0:3702:f8fc:0:21e:65ff:fece:9a5a#123 flags 0x11 failed: Cannot assign requested address
May 16 19:11:41 localhost ntpd[326]: unable to create socket on wls1 (5) for fde0:3702:f8fc:0:21e:65ff:fece:9a5a#123
May 16 19:11:41 localhost ntpd[326]: failed to init interface for address fde0:3702:f8fc:0:21e:65ff:fece:9a5a
May 16 19:11:41 localhost ntpd[326]: bind(24) AF_INET6 2601:47:4101:9916:21e:65ff:fece:9a5a#123 flags 0x11 failed: Cannot assign requested address
May 16 19:11:41 localhost ntpd[326]: unable to create socket on wls1 (6) for 2601:47:4101:9916:21e:65ff:fece:9a5a#123
May 16 19:11:41 localhost ntpd[326]: failed to init interface for address 2601:47:4101:9916:21e:65ff:fece:9a5a
May 16 19:11:41 localhost ntpd[326]: Listen normally on 7 wls1 [fe80::21e:65ff:fece:9a5a%3]:123
May 16 19:11:42 localhost sshd[435]: Accepted publickey for leo from 192.168.1.101 port 42140 ssh2: RSA SHA256:[public key removed for privacy]
May 16 19:11:43 localhost sshd[437]: Received disconnect from 192.168.1.101 port 42140:11: disconnected by user
May 16 19:11:43 localhost sshd[437]: Disconnected from user leo 192.168.1.101 port 42140
May 16 19:11:43 localhost ntpd[326]: Listen normally on 8 wls1 [fde0:3702:f8fc:0:21e:65ff:fece:9a5a]:123
May 16 19:11:43 localhost ntpd[326]: Listen normally on 9 wls1 [2601:47:4101:9916:21e:65ff:fece:9a5a]:123
May 16 19:12:52 localhost syslogd: exiting on signal 15
May 16 19:13:40 localhost syslogd (GNU inetutils 1.9.4): restart




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Thu, 18 May 2017 09:07:02 GMT) Full text and rfc822 format available.

Message #20 received at 26931 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Leo Famulari <leo <at> famulari.name>
Cc: 26931 <at> debbugs.gnu.org
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Thu, 18 May 2017 11:06:12 +0200
Leo Famulari <leo <at> famulari.name> skribis:

> On Wed, May 17, 2017 at 09:39:45AM +0200, Ludovic Courtès wrote:
>> This is on the bare metal and /etc/shepherd/do-not-kill does not exist,
>> right?
>
> Yes, on a Thinkpad x200s (x86_64) with a recent kernel. Nothing is
> protected by a 'do-not-kill' file.
>
>> We could always add a round of SIGKILL in the ‘wait’ loop, but that
>> doesn’t sound right.
>
> Agreed.
>
>> Does /var/log/messages contain any hints as to why tmux wasn’t
>> terminated?
>
> Not from what I can see. I'll keep digging.

Could it be that “something” respawned a tmux process after the first
one had been killed with SIGKILL?

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Thu, 18 May 2017 12:24:02 GMT) Full text and rfc822 format available.

Message #23 received at 26931 <at> debbugs.gnu.org (full text, mbox):

From: ng0 <ng0 <at> pragmatique.xyz>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26931 <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Thu, 18 May 2017 12:22:50 +0000
Ludovic Courtès transcribed 0.7K bytes:
> Leo Famulari <leo <at> famulari.name> skribis:
> 
> > On Wed, May 17, 2017 at 09:39:45AM +0200, Ludovic Courtès wrote:
> >> This is on the bare metal and /etc/shepherd/do-not-kill does not exist,
> >> right?
> >
> > Yes, on a Thinkpad x200s (x86_64) with a recent kernel. Nothing is
> > protected by a 'do-not-kill' file.
> >
> >> We could always add a round of SIGKILL in the ‘wait’ loop, but that
> >> doesn’t sound right.
> >
> > Agreed.
> >
> >> Does /var/log/messages contain any hints as to why tmux wasn’t
> >> terminated?
> >
> > Not from what I can see. I'll keep digging.
> 
> Could it be that “something” respawned a tmux process after the first
> one had been killed with SIGKILL?
> 
> Ludo’.
> 
> 
> 

terminal 1:
tmux new

terminal 2:
[user <at> abyayala ~]$ killall tmux
tmux: no process found

I haven't looked into the core of shepherd, but I think
it should try to kill the process by processid:

terminal 1:
[user <at> abyayala ~]$ pidof tmux
29611 29609
[user <at> abyayala ~]$ kill 29611

terminal 2 (the "tmux" is still from when I started it):
[user <at> abyayala ~]$ tmux
[server exited]


I hope this helps.
-- 
https://pragmatique.xyz
PGP: https://people.pragmatique.xyz/ng0/




Reply sent to ludo <at> gnu.org (Ludovic Courtès):
You have taken responsibility. (Mon, 28 Aug 2017 08:24:01 GMT) Full text and rfc822 format available.

Notification sent to Leo Famulari <leo <at> famulari.name>:
bug acknowledged by developer. (Mon, 28 Aug 2017 08:24:02 GMT) Full text and rfc822 format available.

Message #28 received at 26931-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Leo Famulari <leo <at> famulari.name>
Cc: 26931-done <at> debbugs.gnu.org
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Mon, 28 Aug 2017 10:22:52 +0200
Hi,

Leo Famulari <leo <at> famulari.name> skribis:

> On Sun, May 14, 2017 at 11:36:17PM +0200, Ludovic Courtès wrote:
>> What does /var/log/shepherd.log show around the time where you hit
>> “halt”?
>> 
>> I get something like this:
>> 
>> --8<---------------cut here---------------start------------->8---
>> 18:06:26 Service mcron has been stopped.
>> 18:06:26 sending all processes the TERM signal
>
> For me, this is where it gets stuck:
>
> ------
> 2017-05-16 19:12:53 sending all processes the TERM signal
> 2017-05-16 19:12:58 waiting for process termination (processes left: (1 494)) 
> 2017-05-16 19:13:00 waiting for process termination (processes left: (1 494)) 
> 2017-05-16 19:13:02 waiting for process termination (processes left: (1 494)) 
> ------
>
> In my experience, it will wait here forever.
>
> And from `ps aux`:
>
> leo        494  0.0  0.1  27232  3676 ?        Ss   19:12   0:00 tmux

The bug was 100% reproducible in a VM, and AFAICS it is fixed by
7f090203d5fb033eb1b64778b03afad5bb35f5f2.

The problem was that the tmux server process would be left as a zombie,
and then the loop would always see it because the parent process of the
tmux server process is PID 1 and for some reason the PID 1 either didn’t
get SIGCHLD or the handler didn’t run.
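
Editorial sketch of the zombie state mentioned above, in Python and assuming a Linux system with /proc mounted. It does not involve tmux or PID 1: the parent here is the script itself, which simply delays calling `waitpid`. The helper names `proc_state`, `make_zombie` and `wait_for_state` are hypothetical.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/PID/stat, or None if absent."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The state is the first field after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def make_zombie():
    """Fork a child that exits at once; return its PID without reaping it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately, skipping Python cleanup
    return pid


def wait_for_state(pid, wanted, timeout=2.0):
    """Poll until PID reaches state `wanted`; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc_state(pid) == wanted:
            return True
        time.sleep(0.01)
    return False


if __name__ == "__main__":
    pid = make_zombie()
    wait_for_state(pid, "Z")
    print("state before waitpid:", proc_state(pid))
    os.waitpid(pid, 0)  # the parent collects the exit status
    print("state after waitpid:", proc_state(pid))
```

Until the parent calls `waitpid`, the dead child keeps its PID and its /proc entry, which is why a loop that lists remaining processes continues to see it.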

The test that this commit adds does exactly the same thing: launch tmux
and then invoke “halt”.  I tried to create a synthetic test not
involving tmux, simply creating a process that gets PID 1 as its parent,
but it wouldn’t trigger the bug.  I’m unclear as to why tmux triggers it
and not that other simple test.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Mon, 28 Aug 2017 08:31:01 GMT) Full text and rfc822 format available.

Message #31 received at 26931-done <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26931-done <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Mon, 28 Aug 2017 10:30:37 +0200
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi,
>
> Leo Famulari <leo <at> famulari.name> skribis:
>
>> On Sun, May 14, 2017 at 11:36:17PM +0200, Ludovic Courtès wrote:
>>> What does /var/log/shepherd.log show around the time where you hit
>>> “halt”?
>>> 
>>> I get something like this:
>>> 
>>> --8<---------------cut here---------------start------------->8---
>>> 18:06:26 Service mcron has been stopped.
>>> 18:06:26 sending all processes the TERM signal
>>
>> For me, this is where it gets stuck:
>>
>> ------
>> 2017-05-16 19:12:53 sending all processes the TERM signal
>> 2017-05-16 19:12:58 waiting for process termination (processes left: (1 494)) 
>> 2017-05-16 19:13:00 waiting for process termination (processes left: (1 494)) 
>> 2017-05-16 19:13:02 waiting for process termination (processes left: (1 494)) 
>> ------
>>
>> In my experience, it will wait here forever.
>>
>> And from `ps aux`:
>>
>> leo        494  0.0  0.1  27232  3676 ?        Ss   19:12   0:00 tmux
>
> The bug was 100% reproducible in a VM, and AFAICS it is fixed by
> 7f090203d5fb033eb1b64778b03afad5bb35f5f2.
>
> The problem was that the tmux server process would be left as a zombie,
> and then the loop would always see it because the parent process of the
> tmux server process is PID1 and for some reason the PID1 either didn’t
> get SIGCHLD or the handler didn’t run.
>
> The test that this commit adds does exactly the same thing: launch tmux
> and then invoke “halt”.  I tried to create a synthetic test not
> involving tmux, simply creating a process that gets PID1 as its parent,
> but it wouldn’t trigger the bug.  I’m unclear as to why tmux triggers it
> and not that other simple test.

FYI I have the exact same problem with guix-publish: I have to do 'herd
stop guix-publish', otherwise I can't reboot.  I'll report a bug soon.
I would be interested to know if someone else reproduces it.




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Mon, 28 Aug 2017 08:52:01 GMT) Full text and rfc822 format available.

Message #34 received at 26931 <at> debbugs.gnu.org (full text, mbox):

From: ng0 <ng0 <at> infotropique.org>
To: 26931 <at> debbugs.gnu.org, ludo <at> gnu.org, leo <at> famulari.name
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Mon, 28 Aug 2017 08:50:39 +0000
Ludovic Courtès transcribed 1.7K bytes:
> Hi,
> 
> Leo Famulari <leo <at> famulari.name> skribis:
> 
> > On Sun, May 14, 2017 at 11:36:17PM +0200, Ludovic Courtès wrote:
> >> What does /var/log/shepherd.log show around the time where you hit
> >> “halt”?
> >> 
> >> I get something like this:
> >> 
> >> --8<---------------cut here---------------start------------->8---
> >> 18:06:26 Service mcron has been stopped.
> >> 18:06:26 sending all processes the TERM signal
> >
> > For me, this is where it gets stuck:
> >
> > ------
> > 2017-05-16 19:12:53 sending all processes the TERM signal
> > 2017-05-16 19:12:58 waiting for process termination (processes left: (1 494)) 
> > 2017-05-16 19:13:00 waiting for process termination (processes left: (1 494)) 
> > 2017-05-16 19:13:02 waiting for process termination (processes left: (1 494)) 
> > ------
> >
> > In my experience, it will wait here forever.
> >
> > And from `ps aux`:
> >
> > leo        494  0.0  0.1  27232  3676 ?        Ss   19:12   0:00 tmux
> 
> The bug was 100% reproducible in a VM, and AFAICS it is fixed by
> 7f090203d5fb033eb1b64778b03afad5bb35f5f2.
> 
> The problem was that the tmux server process would be left as a zombie,
> and then the loop would always see it because the parent process of the
> tmux server process is PID 1 and for some reason the PID 1 either didn’t
> get SIGCHLD or the handler didn’t run.
> 
> The test that this commit adds does exactly the same thing: launch tmux
> and then invoke “halt”.  I tried to create a synthetic test not
> involving tmux, simply creating a process that gets PID 1 as its parent,
> but it wouldn’t trigger the bug.  I’m unclear as to why tmux triggers it
> and not that other simple test.
> 
> Thanks,
> Ludo’.
> 
> 
> 
> 
I just found this upstream issue: https://github.com/tmux/tmux/issues/311
which has been fixed in tmux 2.5. I think we should take this bug to upstream,
even if it's just to get more insight if it is a tmux bug.
-- 
ng0
GnuPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588
GnuPG: https://n0is.noblogs.org/my-keys
https://www.infotropique.org https://krosos.org

Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Mon, 28 Aug 2017 10:17:02 GMT) Full text and rfc822 format available.

Message #37 received at 26931-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 26931-done <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Mon, 28 Aug 2017 12:16:09 +0200
Clément Lassieur <clement <at> lassieur.org> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Hi,
>>
>> Leo Famulari <leo <at> famulari.name> skribis:
>>
>>> On Sun, May 14, 2017 at 11:36:17PM +0200, Ludovic Courtès wrote:
>>>> What does /var/log/shepherd.log show around the time where you hit
>>>> “halt”?
>>>> 
>>>> I get something like this:
>>>> 
>>>> --8<---------------cut here---------------start------------->8---
>>>> 18:06:26 Service mcron has been stopped.
>>>> 18:06:26 sending all processes the TERM signal
>>>
>>> For me, this is where it gets stuck:
>>>
>>> ------
>>> 2017-05-16 19:12:53 sending all processes the TERM signal
>>> 2017-05-16 19:12:58 waiting for process termination (processes left: (1 494)) 
>>> 2017-05-16 19:13:00 waiting for process termination (processes left: (1 494)) 
>>> 2017-05-16 19:13:02 waiting for process termination (processes left: (1 494)) 
>>> ------
>>>
>>> In my experience, it will wait here forever.
>>>
>>> And from `ps aux`:
>>>
>>> leo        494  0.0  0.1  27232  3676 ?        Ss   19:12   0:00 tmux
>>
>> The bug was 100% reproducible in a VM, and AFAICS it is fixed by
>> 7f090203d5fb033eb1b64778b03afad5bb35f5f2.

[...]

> FYI I have the exact same problem with guix-publish: I have to do 'herd
> stop guix-publish', otherwise I can't reboot.  I'll report a bug soon.
> I would be interested to know if someone else reproduces it.

The commit above makes sure all child processes are reaped.  Could you
check whether that solves the problem you’re seeing with ‘guix publish’?
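
Editorial sketch of what "reaping all child processes" usually looks like, in Python. This is the common `waitpid(-1, WNOHANG)` idiom and is not taken from the commit, whose actual change is not reproduced in this thread. The function name `reap_children` is hypothetical.

```python
import os


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, status) pairs for the children reaped.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG returns at once if none is ready.
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped
```

A function like this is typically called from a SIGCHLD handler or from the main loop. Calling it in a loop rather than once matters because several children may exit before the parent gets to run.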

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Mon, 28 Aug 2017 10:19:01 GMT) Full text and rfc822 format available.

Message #40 received at 26931 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: 26931 <at> debbugs.gnu.org
Cc: leo <at> famulari.name
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Mon, 28 Aug 2017 12:18:05 +0200
ng0 <ng0 <at> infotropique.org> skribis:

> I just found this upstream issue: https://github.com/tmux/tmux/issues/311
> which has been fixed in tmux 2.5. I think we should take this bug to upstream,
> even if it's just to get more insight if it is a tmux bug.

Oh that probably explains why tmux leaves a zombie process, though the
root problem here is that PID 1 should reap those processes upon
shutdown in the first place.

Thanks for the pointer!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Wed, 30 Aug 2017 07:37:02 GMT) Full text and rfc822 format available.

Message #43 received at 26931-done <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 26931-done <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Wed, 30 Aug 2017 09:35:56 +0200
Ludovic Courtès <ludo <at> gnu.org> writes:

> The commit above makes sure all child processes are reaped.  Could you
> check whether that solves the problem you’re seeing with ‘guix publish’?

Indeed it's fixed.  Thank you very much :-)




Information forwarded to bug-guix <at> gnu.org:
bug#26931; Package guix. (Wed, 30 Aug 2017 08:40:02 GMT) Full text and rfc822 format available.

Message #46 received at 26931-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 26931-done <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#26931: GuixSD rebooting fails when tmux is running
Date: Wed, 30 Aug 2017 10:39:28 +0200
Clément Lassieur <clement <at> lassieur.org> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> The commit above makes sure all child processes are reaped.  Could you
>> check whether that solves the problem you’re seeing with ‘guix publish’?
>
> Indeed it's fixed.  Thank you very much :-)

Excellent, thanks for checking!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 27 Sep 2017 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 183 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.