GNU bug report logs - #53225
shepherd freezes if wireguard is started with dns config enabled
Reported by: Nathan Dehnel <ncdehnel <at> gmail.com>
Date: Thu, 13 Jan 2022 00:28:02 UTC
Severity: important
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 53225 in the body.
You can then email your comments to 53225 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Thu, 13 Jan 2022 00:28:02 GMT)
Acknowledgement sent to Nathan Dehnel <ncdehnel <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Thu, 13 Jan 2022 00:28:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
When dns is specified, wireguard runs wg-quick, which runs resolvconf,
which runs /run/current-system/profile/bin/herd restart, which causes
shepherd to freeze because I guess it doesn't like being given
multiple start commands at once. I'm not sure how to fix it.
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Thu, 13 Jan 2022 15:12:02 GMT)
Message #8 received at 53225 <at> debbugs.gnu.org (full text, mbox):
Hi,
Nathan Dehnel <ncdehnel <at> gmail.com> skribis:
> When dns is specified, wireguard runs wg-quick, which runs resolvconf,
> which runs /run/current-system/profile/bin/herd restart, which causes
> shepherd to freeze because I guess it doesn't like being given
> multiple start commands at once. I'm not sure how to fix it.
What do you mean by “freezing”? Do ‘herd status’ and similar commands
block forever? Or is it something else?
Requests in the Shepherd are currently handled sequentially. So if you
issue several ‘herd restart’ commands, they’ll be processed one at a
time. This is usually okay because ‘start’ commands are expected to be
quick (just wait for the daemon to write its PID file or similar).
Thanks,
Ludo’.
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Thu, 13 Jan 2022 22:43:01 GMT)
Message #11 received at 53225 <at> debbugs.gnu.org (full text, mbox):
> What do you mean by “freezing”? Do ‘herd status’ and similar commands
> block forever?
Yes
> Requests in the Shepherd are currently handled sequentially. So if you
> issue several ‘herd restart’ commands, they’ll be processed one at a
> time. This is usually okay because ‘start’ commands are expected to be
> quick (just wait for the daemon to write its PID file or similar).
What is the nature of this serialization? Does wireguard need to
finish before resolvconf can start? Because that's probably the issue.
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Mon, 17 Jan 2022 13:49:01 GMT)
Message #14 received at 53225 <at> debbugs.gnu.org (full text, mbox):
Hi,
Nathan Dehnel <ncdehnel <at> gmail.com> skribis:
>> What do you mean by “freezing”? Do ‘herd status’ and similar commands
>> block forever?
> Yes
>
>> Requests in the Shepherd are currently handled sequentially. So if you
>> issue several ‘herd restart’ commands, they’ll be processed one at a
>> time. This is usually okay because ‘start’ commands are expected to be
>> quick (just wait for the daemon to write its PID file or similar).
> What is the nature of this serialization? Does wireguard need to
> finish before resolvconf can start? Because that's probably the issue.
One command sent to shepherd by ‘herd …’ must have completed before the
next one is processed.
You can experience it like this:
sudo herd eval root '(sleep 3)' & echo status && sudo herd status
Here the first ‘herd’ command has shepherd block for 3 seconds, so the
second ‘herd’ command won’t print anything until 3 seconds have passed.
HTH,
Ludo’.
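The ordering effect described in this message can be modelled in a few lines. This is only a sketch in Python, not Shepherd's Guile code: `completion_times` is a hypothetical helper, and the durations are simulated rather than slept, so it shows the arithmetic of one-at-a-time handling and nothing more.

```python
def completion_times(requests):
    """Return when each request finishes if they are served strictly in order.

    `requests` is a list of (name, seconds_of_work) pairs; time starts at 0.
    """
    clock = 0.0
    finished = {}
    for name, work in requests:
        clock += work          # the server is busy; nothing else is served
        finished[name] = clock
    return finished


# The experiment from the message: an `eval` that sleeps 3 seconds,
# followed by a `status` request that needs almost no work itself.
for name, when in completion_times([("eval (sleep 3)", 3.0),
                                    ("status", 0.01)]).items():
    print(f"{name}: answered after {when:.2f}s")
```

In this model the `status` request is cheap, yet it is answered only after the request queued in front of it has finished.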
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Wed, 01 Jun 2022 22:57:02 GMT)
Message #17 received at 53225 <at> debbugs.gnu.org (full text, mbox):
Just tested and Shepherd 0.9 does not fix this issue.
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Thu, 02 Jun 2022 13:40:02 GMT)
Message #20 received at 53225 <at> debbugs.gnu.org (full text, mbox):
Hi Nathan,
Nathan Dehnel <ncdehnel <at> gmail.com> skribis:
> Just tested and Shepherd 0.9 does not fix this issue.
Could you be more specific? Specifically, could you share
/var/log/messages for the parts related to Wireguard?
> On Mon, Jan 17, 2022 at 7:48 AM Ludovic Courtès <ludo <at> gnu.org> wrote:
[...]
>> One command sent to shepherd by ‘herd …’ must have completed before the
>> next one is processed.
>>
>> You can experience it like this:
>>
>> sudo herd eval root '(sleep 3)' & echo status && sudo herd status
>>
>> Here the first ‘herd’ command has shepherd block for 3 seconds, so the
>> second ‘herd’ command won’t print anything until 3 seconds have passed.
This is actually still the case with 0.9, because here we’re calling
(@ (guile) sleep), which blocks. So… not a good example.
The short story is: it is still possible to write code that blocks
shepherd, as with the ‘sleep’ example above. However, the standard
service constructors/destructors no longer block, and shepherd can serve
multiple clients concurrently.
Ludo’.
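The distinction drawn here, between code that blocks the whole server and code that lets other clients be served, can be illustrated with an event loop. The sketch below uses Python's asyncio purely as an analogy; it is an assumption that this maps onto Shepherd's behaviour, and `blocking_handler`, `cooperative_handler` and `status_latency` are hypothetical names.

```python
import asyncio
import time


async def blocking_handler():
    # Blocks the whole event loop, in the way a plain blocking sleep would.
    time.sleep(0.2)


async def cooperative_handler():
    # Yields to the loop while waiting, so other clients can be served.
    await asyncio.sleep(0.2)


async def status_latency(handler):
    """Start `handler`, then a status request; return the status latency."""
    started = time.monotonic()

    async def status():
        return time.monotonic() - started

    slow = asyncio.create_task(handler())
    quick = asyncio.create_task(status())
    _, latency = await asyncio.gather(slow, quick)
    return latency


if __name__ == "__main__":
    for handler in (blocking_handler, cooperative_handler):
        latency = asyncio.run(status_latency(handler))
        print(f"{handler.__name__}: status answered after {latency:.2f}s")
```

With the blocking handler the status request waits for the full sleep; with the cooperative one it is answered almost immediately.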
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Wed, 08 Jun 2022 23:24:02 GMT)
Message #23 received at 53225 <at> debbugs.gnu.org (full text, mbox):
> Could you be more specific? Specifically, could you share
> /var/log/messages for the parts related to Wireguard?
root@guixtest ~# cat /var/log/messages | grep -i wireguard
Jun 8 18:20:07 localhost vmunix: [ 6.330271] wireguard: WireGuard 1.0.0 loaded. See www.wireguard.com for information.
Jun 8 18:20:07 localhost vmunix: [ 6.330276] wireguard: Copyright (C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights Reserved.
>However, the standard
>service constructors/destructors no longer block, and shepherd can serve
>multiple clients concurrently.
I don't know, I guess wireguard uses "non-standard" constructors.
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Thu, 09 Jun 2022 15:06:02 GMT)
Message #26 received at 53225 <at> debbugs.gnu.org (full text, mbox):
Hi Nathan,
Nathan Dehnel <ncdehnel <at> gmail.com> skribis:
>>Could you be more specific? Specifically, could you share
>>/var/log/messages for the parts related to Wireguard?
>
> root@guixtest ~# cat /var/log/messages | grep -i wireguard
> Jun 8 18:20:07 localhost vmunix: [ 6.330271] wireguard: WireGuard 1.0.0 loaded. See www.wireguard.com for information.
> Jun 8 18:20:07 localhost vmunix: [ 6.330276] wireguard: Copyright (C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights Reserved.
There should be lines like:
shepherd[1]: Service 'wireguard-XXX' has been started.
Perhaps they’ve been moved to a different file due to log rotation?
Without these, I cannot tell what happened.
>>However, the standard
>>service constructors/destructors no longer block, and shepherd can serve
>>multiple clients concurrently.
>
> I don't know, I guess wireguard uses "non-standard" constructors.
Indeed, it invokes ‘wg-quick up’ and waits for completion.
I suppose that command blocks until it has set up the VPN, right?
If so, we’ll need to rewrite it differently.
Thanks,
Ludo’.
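If the start procedure waits for ‘wg-quick up’ while, as the original report suggests, that command ends up running ‘herd restart’, the result is a circular wait. The sketch below models that hypothesis in Python with threads and a queue. All names are hypothetical, and the timeout exists only so the model terminates; the situation reported in this thread has no such timeout.

```python
import queue
import threading


def run_model(timeout=0.5):
    """Model a single-threaded server whose handler waits on its own client.

    Returns (service_started, pending_requests).
    """
    requests = queue.Queue()
    helper_done = threading.Event()

    def helper():
        # Stands in for the wg-quick -> resolvconf -> `herd restart` chain:
        # it sends a request to the server and waits for the reply.
        reply = threading.Event()
        requests.put(("restart", reply))
        if reply.wait(timeout):
            helper_done.set()

    def start_service():
        # Stands in for a start procedure that spawns the helper and blocks
        # until it exits, without ever returning to the request loop.
        threading.Thread(target=helper, daemon=True).start()
        return helper_done.wait(timeout)

    # The server is "inside" start_service, so nobody reads the queue:
    # the helper's request stays pending and the helper never finishes.
    started = start_service()
    return started, requests.qsize()


if __name__ == "__main__":
    started, pending = run_model()
    print(f"service started: {started}, requests left unanswered: {pending}")
```

The helper cannot finish until its request is answered, and the server cannot answer until the helper finishes.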
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Thu, 09 Jun 2022 15:50:01 GMT)
Message #29 received at 53225 <at> debbugs.gnu.org (full text, mbox):
> There should be lines like:
>   shepherd[1]: Service 'wireguard-XXX' has been started.
> Perhaps they’ve been moved to a different file due to log rotation?
> Without these, I cannot tell what happened.
I tried it again and found this:
Jun 9 10:47:44 localhost vmunix: [ 6.497581] wireguard: WireGuard 1.0.0 loaded. See www.wireguard.com for information.
Jun 9 10:47:44 localhost vmunix: [ 6.497584] wireguard: Copyright (C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights Reserved.
Jun 9 10:47:44 localhost shepherd[1]: Failed to start wireguard-test in the background.
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Thu, 09 Jun 2022 20:16:01 GMT)
Message #32 received at 53225 <at> debbugs.gnu.org (full text, mbox):
Hi,
Nathan Dehnel <ncdehnel <at> gmail.com> skribis:
> I tried it again and found this
> Jun 9 10:47:44 localhost vmunix: [ 6.497581] wireguard: WireGuard
> 1.0.0 loaded. See www.wireguard.com for information.
> Jun 9 10:47:44 localhost vmunix: [ 6.497584] wireguard: Copyright
> (C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights
> Reserved.
> Jun 9 10:47:44 localhost shepherd[1]: Failed to start wireguard-test
> in the background.
Could you provide me (privately if you prefer) the /var/log/messages
sequence starting from boot (the line that reads “syslogd (GNU inetutils
2.0): restart”) up to the last line above?
Thanks in advance,
Ludo’.
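One way to cut out the requested sequence is to keep everything from the last boot marker onward. The following is a minimal Python sketch: the helper names are hypothetical, and the marker is the prefix of the syslogd line quoted in this message, on the assumption that it appears once per boot.

```python
BOOT_MARKER = "syslogd (GNU inetutils"


def since_last_boot(lines, marker=BOOT_MARKER):
    """Return the lines from the last boot marker onward (all lines if none)."""
    start = 0
    for index, line in enumerate(lines):
        if marker in line:
            start = index      # remember the most recent boot seen so far
    return lines[start:]


def read_since_last_boot(path="/var/log/messages"):
    """Read the log at `path` and keep only the current boot's lines."""
    with open(path, errors="replace") as log:
        return since_last_boot(log.read().splitlines())
```

Calling `read_since_last_boot()` as root and printing the result line by line would give the excerpt asked for; rotated logs are not handled.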
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Mon, 13 Jun 2022 09:32:01 GMT)
Message #35 received at 53225 <at> debbugs.gnu.org (full text, mbox):
Hi,
The /var/log/messages excerpt you sent me has nothing beyond:
> Jun 11 11:43:33 localhost shepherd[1]: Service networking has been started.
[…]
> Jun 11 11:43:33 localhost vmunix: [ 5.552395] wireguard: WireGuard
> 1.0.0 loaded. See www.wireguard.com for information.
> Jun 11 11:43:33 localhost vmunix: [ 5.552398] wireguard: Copyright
> (C) 2015-2019 Jason A. Donenfeld <Jason <at> zx2c4.com>. All Rights
> Reserved.
> Jun 11 11:43:33 localhost shepherd[1]: Failed to start wireguard-test
> in the background.
That there’s not a single error message from wireguard doesn’t help.
Mathieu, Guillaume: any idea what might prevent the wireguard Shepherd
service from starting, or how we could gather debugging info?
(Context: <https://issues.guix.gnu.org/53225>.)
Ludo’.
Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Sat, 12 Nov 2022 18:08:02 GMT)
Information forwarded to bug-guix <at> gnu.org: bug#53225; Package guix. (Sat, 12 Nov 2022 18:12:02 GMT)
Message #40 received at 53225 <at> debbugs.gnu.org (full text, mbox):
Mathieu Othacehe <othacehe <at> gnu.org> skribis:
> 1. On my laptop with a Wireguard service trying to reach a non-existing
> DNS server.
>
> (service wireguard-service-type
>          (wireguard-configuration
>           (addresses (list "10.0.0.2/24"))
>           (dns '("10.0.0.50")) ; does not exist
This one is similar to:
https://issues.guix.gnu.org/53225
https://issues.guix.gnu.org/53381
It has to do with the fact that “wg-quick up” blocks until it succeeds
and that ‘invoke’ gets stuck on ‘waitpid’ until the “wg-quick” process
terminates.
The solution will be to use something non-blocking instead of ‘invoke’;
I’m looking into it.
Ludo’.
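The difference between waiting on a child with a blocking call and checking on it without blocking can be sketched as follows. This is a Python illustration under stated assumptions, not the change that went into Shepherd: the child is a short Python sleep standing in for a slow “wg-quick up”, and `blocking_start` and `polling_start` are hypothetical names.

```python
import subprocess
import sys
import time

# A child that takes a while, standing in for a slow `wg-quick up`.
CHILD = [sys.executable, "-c", "import time; time.sleep(0.3)"]


def blocking_start():
    """Wait for the child with a blocking call: no other work gets done."""
    ticks = 0
    subprocess.run(CHILD, check=True)   # returns only after the child exits
    return ticks


def polling_start(interval=0.05):
    """Spawn the child, then keep doing other work until it has exited."""
    ticks = 0
    child = subprocess.Popen(CHILD)
    while child.poll() is None:         # non-blocking check of child status
        ticks += 1                      # stand-in for serving other requests
        time.sleep(interval)
    return ticks


if __name__ == "__main__":
    print("work done while blocked:", blocking_start())
    print("work done while polling:", polling_start())
```

Polling is used here only because it is short to write; an event-driven wait would serve the same purpose without the fixed interval.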
Reply sent to Ludovic Courtès <ludo <at> gnu.org>: You have taken responsibility. (Thu, 17 Nov 2022 10:24:03 GMT)
Notification sent to Nathan Dehnel <ncdehnel <at> gmail.com>: bug acknowledged by developer. (Thu, 17 Nov 2022 10:24:03 GMT)
Message #45 received at 53225-done <at> debbugs.gnu.org (full text, mbox):
Hi,
Ludovic Courtès <ludo <at> gnu.org> skribis:
> Mathieu Othacehe <othacehe <at> gnu.org> skribis:
>
>> 1. On my laptop with a Wireguard service trying to reach a non-existing
>> DNS server.
>>
>> (service wireguard-service-type
>>          (wireguard-configuration
>>           (addresses (list "10.0.0.2/24"))
>>           (dns '("10.0.0.50")) ; does not exist
>
> This one is similar to:
>
> https://issues.guix.gnu.org/53225
> https://issues.guix.gnu.org/53381
>
> It has to do with the fact that “wg-quick up” blocks until it succeeds
> and that ‘invoke’ gets stuck on ‘waitpid’ until the “wg-quick” process
> terminates.
>
> The solution will be to use something non-blocking instead of ‘invoke’;
> I’m looking into it.
This is fixed in Shepherd 0.9.3, which landed in Guix commit
283d7318c5b312d7129adb6dbeea6ad205ce89d1.
As I wrote, I’m not sure whether it fixes the nginx situation since I
could not reproduce it. I’m closing and let’s open a new issue
specifically for nginx if it comes up again with 0.9.3.
Thanks,
Ludo’.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 15 Dec 2022 12:24:07 GMT)
This bug report was last modified 1 year and 126 days ago.