GNU bug report logs - #53225
shepherd freezes if wireguard is started with dns config enabled

Package: guix;

Reported by: Nathan Dehnel <ncdehnel <at> gmail.com>

Date: Thu, 13 Jan 2022 00:28:02 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 53225 in the body.
You can then email your comments to 53225 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Thu, 13 Jan 2022 00:28:02 GMT)

Acknowledgement sent to Nathan Dehnel <ncdehnel <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Thu, 13 Jan 2022 00:28:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Nathan Dehnel <ncdehnel <at> gmail.com>
To: Tobias Geerinckx-Rice via Bug reports for GNU Guix <bug-guix <at> gnu.org>
Subject: shepherd freezes if wireguard is started with dns config enabled
Date: Wed, 12 Jan 2022 18:27:24 -0600
When dns is specified, wireguard runs wg-quick, which runs resolvconf,
which runs /run/current-system/profile/bin/herd restart, which causes
shepherd to freeze because I guess it doesn't like being given
multiple start commands at once. I'm not sure how to fix it.




Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Thu, 13 Jan 2022 15:12:02 GMT)

Message #8 received at 53225 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Nathan Dehnel <ncdehnel <at> gmail.com>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Thu, 13 Jan 2022 16:11:36 +0100
Hi,

Nathan Dehnel <ncdehnel <at> gmail.com> skribis:

> When dns is specified, wireguard runs wg-quick, which runs resolvconf,
> which runs /run/current-system/profile/bin/herd restart, which causes
> shepherd to freeze because I guess it doesn't like being given
> multiple start commands at once. I'm not sure how to fix it.

What do you mean by “freezing”?  Does ‘herd status’ and similar commands
block forever?  Or is it something else?

Requests in the Shepherd are currently handled sequentially.  So if you
issue several ‘herd restart’ commands, they’ll be processed one at a
time.  This is usually okay because ‘start’ commands are expected to be
quick (just wait for the daemon to write its PID file or similar).

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Thu, 13 Jan 2022 22:43:01 GMT)

Message #11 received at 53225 <at> debbugs.gnu.org:

From: Nathan Dehnel <ncdehnel <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Thu, 13 Jan 2022 16:41:44 -0600
> What do you mean by “freezing”?  Does ‘herd status’ and similar commands
> block forever?
Yes

> Requests in the Shepherd are currently handled sequentially.  So if you
> issue several ‘herd restart’ commands, they’ll be processed one at a
> time.  This is usually okay because ‘start’ commands are expected to be
> quick (just wait for the daemon to write its PID file or similar).
What is the nature of this serialization? Does wireguard need to
finish before resolvconf can start? Because that's probably the issue.






Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Mon, 17 Jan 2022 13:49:01 GMT)

Message #14 received at 53225 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Nathan Dehnel <ncdehnel <at> gmail.com>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Mon, 17 Jan 2022 14:48:40 +0100
Hi,

Nathan Dehnel <ncdehnel <at> gmail.com> skribis:

>> What do you mean by “freezing”?  Does ‘herd status’ and similar commands
>> block forever?
> Yes
>
>> Requests in the Shepherd are currently handled sequentially.  So if you
>> issue several ‘herd restart’ commands, they’ll be processed one at a
>> time.  This is usually okay because ‘start’ commands are expected to be
>> quick (just wait for the daemon to write its PID file or similar).
> What is the nature of this serialization? Does wireguard need to
> finish before resolvconf can start? Because that's probably the issue.

One command sent to shepherd by ‘herd …’ must have completed before the
next one is processed.

You can experience it like this:

  sudo herd eval root '(sleep 3)' & echo status && sudo herd status

Here the first ‘herd’ command has shepherd block for 3 seconds, so the
second ‘herd’ command won’t print anything until 3 seconds have passed.

HTH,
Ludo’.
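
The serialization described in the message above can be modeled outside the Shepherd. The following is a toy Python sketch, not Shepherd code: `serve_sequentially`, `slow_eval`, and `status` are made-up names, and the 3-second sleep is scaled down to 0.3 s. It only illustrates why a second request gets no answer until the first handler has returned.

```python
import time


def serve_sequentially(requests):
    """Handle (name, handler) pairs one at a time.

    Returns a list of (name, seconds elapsed when the request completed).
    """
    began = time.monotonic()
    finished = []
    for name, handler in requests:
        handler()  # the next request is not looked at until this returns
        finished.append((name, time.monotonic() - began))
    return finished


def slow_eval():
    # Stands in for: herd eval root '(sleep 3)', scaled down.
    time.sleep(0.3)


def status():
    # Stands in for: herd status, which is itself quick.
    pass


if __name__ == "__main__":
    for name, elapsed in serve_sequentially([("eval", slow_eval),
                                             ("status", status)]):
        print(f"{name} completed after {elapsed:.2f}s")
```

In this model the `status` request is cheap, yet it completes only after the slow request ahead of it, which matches the delayed `herd status` output in the shell one-liner.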




Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Wed, 01 Jun 2022 22:57:02 GMT)

Message #17 received at 53225 <at> debbugs.gnu.org:

From: Nathan Dehnel <ncdehnel <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Wed, 1 Jun 2022 17:56:04 -0500
Just tested and Shepherd 0.9 does not fix this issue.





Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Thu, 02 Jun 2022 13:40:02 GMT)

Message #20 received at 53225 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Nathan Dehnel <ncdehnel <at> gmail.com>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Thu, 02 Jun 2022 15:38:56 +0200
Hi Nathan,

Nathan Dehnel <ncdehnel <at> gmail.com> skribis:

> Just tested and Shepherd 0.9 does not fix this issue.

Could you be more specific?  Specifically, could you share
/var/log/messages for the parts related to Wireguard?

> On Mon, Jan 17, 2022 at 7:48 AM Ludovic Courtès <ludo <at> gnu.org> wrote:

[...]

>> One command sent to shepherd by ‘herd …’ must have completed before the
>> next one is processed.
>>
>> You can experience it like this:
>>
>>   sudo herd eval root '(sleep 3)' & echo status && sudo herd status
>>
>> Here the first ‘herd’ command has shepherd block for 3 seconds, so the
>> second ‘herd’ command won’t print anything until 3 seconds have passed.

This is actually still the case with 0.9, because here we’re calling
(@ (guile) sleep), which blocks.  So… not a good example.

The short story is: it is still possible to write code that blocks
shepherd, as with the ‘sleep’ example above.  However, the standard
service constructors/destructors no longer block, and shepherd can serve
multiple clients concurrently.

Ludo’.
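
The distinction drawn above, between code that blocks the whole daemon and code that lets other clients be served, can be sketched with Python's `asyncio`. This is only an analogy under the assumption that the 0.9 concurrency is cooperative; it is not how the Shepherd is implemented, and `blocking_action`, `cooperative_action`, and `status_latency` are hypothetical names.

```python
import asyncio
import time


async def blocking_action():
    # Plays the role of calling a blocking sleep inside the daemon:
    # the whole event loop is stuck until it returns.
    time.sleep(0.3)


async def cooperative_action():
    # Suspends only this task; the loop can serve other clients meanwhile.
    await asyncio.sleep(0.3)


async def status_latency(action):
    """Start `action`, then report how long a trivial request had to wait."""
    began = time.monotonic()
    task = asyncio.create_task(action())
    await asyncio.sleep(0)  # yield once so the action gets to run
    waited = time.monotonic() - began
    await task
    return waited


if __name__ == "__main__":
    for action in (blocking_action, cooperative_action):
        waited = asyncio.run(status_latency(action))
        print(f"{action.__name__}: status waited {waited:.2f}s")
```

With the blocking variant the trivial request waits for the full sleep; with the cooperative variant it is served almost immediately.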




Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Wed, 08 Jun 2022 23:24:02 GMT)

Message #23 received at 53225 <at> debbugs.gnu.org:

From: Nathan Dehnel <ncdehnel <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Wed, 8 Jun 2022 18:23:31 -0500
>Could you be more specific?  Specifically, could you share
>/var/log/messages for the parts related to Wireguard?

root <at> guixtest ~# cat /var/log/messages | grep -i wireguard
Jun  8 18:20:07 localhost vmunix: [    6.330271] wireguard: WireGuard 1.0.0
loaded. See www.wireguard.com for information.
Jun  8 18:20:07 localhost vmunix: [    6.330276] wireguard: Copyright
(C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights
Reserved.

>However, the standard
>service constructors/destructors no longer block, and shepherd can serve
>multiple clients concurrently.

I don't know, I guess wireguard uses "non-standard" constructors.





Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Thu, 09 Jun 2022 15:06:02 GMT)

Message #26 received at 53225 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Nathan Dehnel <ncdehnel <at> gmail.com>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Thu, 09 Jun 2022 17:05:04 +0200
Hi Nathan,

Nathan Dehnel <ncdehnel <at> gmail.com> skribis:

>>Could you be more specific?  Specifically, could you share
>>/var/log/messages for the parts related to Wireguard?
>
> root <at> guixtest ~# cat /var/log/messages | grep -i wireguard
> Jun  8 18:20:07 localhost vmunix: [    6.330271] wireguard: WireGuard 1.0.0
> loaded. See www.wireguard.com for information.
> Jun  8 18:20:07 localhost vmunix: [    6.330276] wireguard: Copyright
> (C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights
> Reserved.

There should be lines like:

  shepherd[1]: Service 'wireguard-XXX' has been started.

Perhaps they’ve been moved to a different file due to log rotation?

Without these, I cannot tell what happened.

>>However, the standard
>>service constructors/destructors no longer block, and shepherd can serve
>>multiple clients concurrently.
>
> I don't know, I guess wireguard uses "non-standard" constructors.

Indeed, it invokes ‘wg-quick up’ and waits for completion.

I suppose that command blocks until it has set up the VPN, right?

If so, we’ll need to rewrite it differently.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Thu, 09 Jun 2022 15:50:01 GMT)

Message #29 received at 53225 <at> debbugs.gnu.org:

From: Nathan Dehnel <ncdehnel <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Thu, 9 Jun 2022 10:49:07 -0500
> There should be lines like:
>
>   shepherd[1]: Service 'wireguard-XXX' has been started.
>
> Perhaps they’ve been moved to a different file due to log rotation?
>
> Without these, I cannot tell what happened.

I tried it again and found this
Jun  9 10:47:44 localhost vmunix: [    6.497581] wireguard: WireGuard
1.0.0 loaded. See www.wireguard.com for information.
Jun  9 10:47:44 localhost vmunix: [    6.497584] wireguard: Copyright
(C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights
Reserved.
Jun  9 10:47:44 localhost shepherd[1]: Failed to start wireguard-test
in the background.





Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Thu, 09 Jun 2022 20:16:01 GMT)

Message #32 received at 53225 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Nathan Dehnel <ncdehnel <at> gmail.com>
Cc: 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Thu, 09 Jun 2022 22:15:06 +0200
Hi,

Nathan Dehnel <ncdehnel <at> gmail.com> skribis:

> I tried it again and found this
> Jun  9 10:47:44 localhost vmunix: [    6.497581] wireguard: WireGuard
> 1.0.0 loaded. See www.wireguard.com for information.
> Jun  9 10:47:44 localhost vmunix: [    6.497584] wireguard: Copyright
> (C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights
> Reserved.
> Jun  9 10:47:44 localhost shepherd[1]: Failed to start wireguard-test
> in the background.

Could you provide me (privately if you prefer) the /var/log/messages
sequence starting from boot (the line that reads “syslogd (GNU inetutils
2.0): restart”) up to the last line above?

Thanks in advance,
Ludo’.
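
One way to cut out the requested log sequence is a small filter over the lines of /var/log/messages. The sketch below is a hypothetical helper, not an existing tool: `boot_to_failure` and its default marker strings are assumptions based on the lines quoted in this thread, and the sample data is illustrative rather than a real log.

```python
def boot_to_failure(lines,
                    boot_marker="syslogd (GNU inetutils",
                    failure_marker="Failed to start wireguard"):
    """Return the lines from the most recent boot marker that precedes the
    last failure line, up to and including that failure line."""
    failures = [i for i, line in enumerate(lines) if failure_marker in line]
    if not failures:
        return []
    end = failures[-1]
    boots = [i for i in range(end + 1) if boot_marker in lines[i]]
    start = boots[-1] if boots else 0  # no boot marker: start from the top
    return lines[start:end + 1]


if __name__ == "__main__":
    # Illustrative sample only: timestamps and the first, second, and last
    # lines are made up for the demonstration.
    sample = [
        "Jun 11 11:40:00 localhost shepherd[1]: line from an earlier boot",
        "Jun 11 11:43:30 localhost syslogd (GNU inetutils 2.0): restart",
        "Jun 11 11:43:33 localhost shepherd[1]: Service networking has been started.",
        "Jun 11 11:43:33 localhost shepherd[1]: Failed to start wireguard-test in the background.",
        "Jun 11 11:44:00 localhost shepherd[1]: a later, unrelated line",
    ]
    print("\n".join(boot_to_failure(sample)))
```

On a real system the input would be the lines of /var/log/messages, which usually requires root to read.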




Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Mon, 13 Jun 2022 09:32:01 GMT)

Message #35 received at 53225 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Nathan Dehnel <ncdehnel <at> gmail.com>
Cc: Guillaume Le Vaillant <glv <at> posteo.net>, Mathieu Othacehe <othacehe <at> gnu.org>,
 53225 <at> debbugs.gnu.org
Subject: Re: bug#53225: shepherd freezes if wireguard is started with dns
 config enabled
Date: Mon, 13 Jun 2022 11:31:18 +0200
Hi,

The /var/log/messages excerpt you sent me has nothing beyond:

> Jun 11 11:43:33 localhost shepherd[1]: Service networking has been started.

[…]

> Jun 11 11:43:33 localhost vmunix: [    5.552395] wireguard: WireGuard
> 1.0.0 loaded. See www.wireguard.com for information.
> Jun 11 11:43:33 localhost vmunix: [    5.552398] wireguard: Copyright
> (C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights
> Reserved.
> Jun 11 11:43:33 localhost shepherd[1]: Failed to start wireguard-test
> in the background.

That there’s not a single error message from wireguard doesn’t help.

Mathieu, Guillaume: any idea what might prevent the wireguard Shepherd
service from starting, or how we could gather debugging info?

(Context: <https://issues.guix.gnu.org/53225>.)

Ludo’.




Severity set to 'important' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Sat, 12 Nov 2022 18:08:02 GMT)

Information forwarded to bug-guix <at> gnu.org:
bug#53225; Package guix. (Sat, 12 Nov 2022 18:12:02 GMT)

Message #40 received at 53225 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Mathieu Othacehe <othacehe <at> gnu.org>
Cc: 53225 <at> debbugs.gnu.org, 58926 <at> debbugs.gnu.org
Subject: Re: bug#58926: Shepherd becomes unresponsive after an interrupt
Date: Sat, 12 Nov 2022 19:10:56 +0100
Mathieu Othacehe <othacehe <at> gnu.org> skribis:

> 1. On my laptop with a Wireguard service trying to reach a non-existing
> DNS server.
>
>             (service wireguard-service-type
>                      (wireguard-configuration
>                       (addresses (list "10.0.0.2/24"))
>                       (dns '("10.0.0.50")) ; does not exist

This one is similar to:

  https://issues.guix.gnu.org/53225
  https://issues.guix.gnu.org/53381

It has to do with the fact that “wg-quick up” blocks until it succeeds
and that ‘invoke’ gets stuck on ‘waitpid’ until the “wg-quick” process
terminates.

The solution will be to use something non-blocking instead of ‘invoke’;
I’m looking into it.

Ludo’.
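
Combined with the original report (wg-quick runs resolvconf, which runs `herd restart`), the blocking wait suggests a circular dependency: the daemon waits for a child that is itself waiting for the daemon. The Python sketch below models that reading and is not Shepherd or wg-quick code. A thread stands in for the child process, `asyncio` stands in for the daemon's event loop, all function names are hypothetical, and a 0.5 s timeout replaces what would otherwise be an indefinite hang.

```python
import asyncio
import concurrent.futures
import threading


async def handle_restart():
    # The daemon's answer to the child's "herd restart"-like request.
    return "restarted"


def child(loop, outcome):
    """Stand-in for the child process: it needs a reply from the daemon."""
    request = asyncio.run_coroutine_threadsafe(handle_restart(), loop)
    try:
        outcome.append(request.result(timeout=0.5))
    except concurrent.futures.TimeoutError:
        outcome.append("timed out")  # a real process would hang instead


async def start_blocking(outcome):
    worker = threading.Thread(
        target=child, args=(asyncio.get_running_loop(), outcome))
    worker.start()
    worker.join()  # like waitpid: the loop cannot answer the child meanwhile


async def start_nonblocking(outcome):
    # The loop stays free while the child runs, so the request is answered.
    await asyncio.to_thread(child, asyncio.get_running_loop(), outcome)


def run(start):
    outcome = []
    asyncio.run(start(outcome))
    return outcome[0]


if __name__ == "__main__":
    print("blocking start:    ", run(start_blocking))
    print("non-blocking start:", run(start_nonblocking))
```

The only difference between the two variants is whether the wait for the child yields control back to the loop, which is the kind of change the message above proposes for the wireguard service.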




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Thu, 17 Nov 2022 10:24:03 GMT)

Notification sent to Nathan Dehnel <ncdehnel <at> gmail.com>:
bug acknowledged by developer. (Thu, 17 Nov 2022 10:24:03 GMT)

Message #45 received at 53225-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Mathieu Othacehe <othacehe <at> gnu.org>
Cc: 53225-done <at> debbugs.gnu.org, 58926-done <at> debbugs.gnu.org
Subject: Re: bug#58926: Shepherd becomes unresponsive after an interrupt
Date: Thu, 17 Nov 2022 11:23:09 +0100
Hi,

Ludovic Courtès <ludo <at> gnu.org> skribis:

> Mathieu Othacehe <othacehe <at> gnu.org> skribis:
>
>> 1. On my laptop with a Wireguard service trying to reach a non-existing
>> DNS server.
>>
>>             (service wireguard-service-type
>>                      (wireguard-configuration
>>                       (addresses (list "10.0.0.2/24"))
>>                       (dns '("10.0.0.50")) ; does not exist
>
> This one is similar to:
>
>   https://issues.guix.gnu.org/53225
>   https://issues.guix.gnu.org/53381
>
> It has to do with the fact that “wg-quick up” blocks until it succeeds
> and that ‘invoke’ gets stuck on ‘waitpid’ until the “wg-quick” process
> terminates.
>
> The solution will be to use something non-blocking instead of ‘invoke’;
> I’m looking into it.

This is fixed in the Shepherd 0.9.3, which landed in Guix commit
283d7318c5b312d7129adb6dbeea6ad205ce89d1.

As I wrote, I’m not sure whether it fixes the nginx situation since I
could not reproduce it.  I’m closing and let’s open a new issue
specifically for nginx if it comes up again with 0.9.3.

Thanks,
Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 15 Dec 2022 12:24:07 GMT)

This bug report was last modified 1 year and 126 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.