GNU bug report logs - #48521
opendht-service-type hangs Shepherd at boot

Package: guix;

Reported by: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Date: Wed, 19 May 2021 12:00:02 UTC

Severity: normal

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.

Report forwarded to bug-guix <at> gnu.org:
bug#48521; Package guix. (Wed, 19 May 2021 12:00:02 GMT)

Acknowledgement sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Wed, 19 May 2021 12:00:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: bug-guix <bug-guix <at> gnu.org>
Subject: opendht-service-type hangs Shepherd at boot
Date: Wed, 19 May 2021 07:59:19 -0400
Hello,

I just noticed this problem following a reboot.  I can also reproduce
it in 'guix system vm' simply by adding the opendht-service-type to my
operating-system declaration.

The boot proceeds until 'error in finalization thread: Success' then
hangs indefinitely.
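
For reference, a minimal declaration along these lines should be enough
to try the reproduction with 'guix system vm'.  This is only a sketch:
the host name, time zone, bootloader target and root file system are
placeholders that do not come from the report, and it assumes that
opendht-service-type can be instantiated with its default configuration.

```scheme
;; config.scm: hypothetical minimal system used only to reproduce the hang.
;; Everything except the opendht-service-type line is placeholder boilerplate.
(use-modules (gnu))
(use-service-modules networking)

(operating-system
  (host-name "opendht-test")
  (timezone "Etc/UTC")
  (bootloader (bootloader-configuration
               (bootloader grub-bootloader)
               ;; Placeholder device; older Guix revisions spell this
               ;; field 'target' and give it a single string.
               (targets '("/dev/vda"))))
  (file-systems (cons (file-system
                        (device (file-system-label "root"))
                        (mount-point "/")
                        (type "ext4"))
                      %base-file-systems))
  ;; The one line that matters for this bug.
  (services (cons (service opendht-service-type)
                  %base-services)))
```

Running 'guix system vm config.scm' and launching the script it returns
should then show whether the boot gets past the opendht service.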

What troubles me is that the service definition is rather
straightforward.  It uses make-forkexec-constructor/container like so:

--8<---------------cut here---------------start------------->8---
(define (opendht-shepherd-service config)
  "Return a <shepherd-service> running OpenDHT."
  (shepherd-service
   (documentation "Run an OpenDHT node.")
   (provision '(opendht dhtnode dhtproxy))
   (requirement '(user-processes syslogd))
   (start #~(make-forkexec-constructor/container
             (list #$@(opendht-configuration->command-line-arguments config))
             #:mappings (list (file-system-mapping
                               (source "/dev/log") ;for syslog
                               (target source)))
             #:user "opendht"))
   (stop #~(make-kill-destructor))))
--8<---------------cut here---------------end--------------->8---

I'm not sure how using such basic building blocks could lead to a hang
in Shepherd?

Thanks,

Maxim




Information forwarded to bug-guix <at> gnu.org:
bug#48521; Package guix. (Wed, 19 May 2021 21:37:02 GMT)

Message #8 received at 48521 <at> debbugs.gnu.org:

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: 48521 <at> debbugs.gnu.org
Subject: Re: bug#48521: opendht-service-type hangs Shepherd at boot
Date: Wed, 19 May 2021 17:36:38 -0400
After much trial and error, I found that the service no longer hangs
Shepherd once the #:mappings argument is removed:

--8<---------------cut here---------------start------------->8---
modified   gnu/services/networking.scm
@@ -845,9 +845,9 @@ CONFIG, an <opendht-configuration> object."
    (requirement '(user-processes networking syslogd))
    (start #~(make-forkexec-constructor/container
              (list #$@(opendht-configuration->command-line-arguments config))
-             #:mappings (list (file-system-mapping
-                               (source "/dev/log") ;for syslog
-                               (target source)))
+             ;; #:mappings (list (file-system-mapping
+             ;;                   (source "/dev/log") ;for syslog
+             ;;                   (target source)))
              #:user "opendht"))
    (stop #~(make-kill-destructor))))
--8<---------------cut here---------------end--------------->8---

I have no idea why that is, but given that the tor-service-type does the
same thing, I can only conclude that it is some strange interaction
between dhtnode and syslog.

The above fixes the hang, but breaks logging to syslog.

Ideas?

Maxim




Reply sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>:
You have taken responsibility. (Thu, 20 May 2021 02:53:02 GMT)

Notification sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>:
bug acknowledged by developer. (Thu, 20 May 2021 02:53:02 GMT)

Message #13 received at 48521-done <at> debbugs.gnu.org:

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: 48521-done <at> debbugs.gnu.org
Subject: Re: bug#48521: opendht-service-type hangs Shepherd at boot
Date: Wed, 19 May 2021 22:52:06 -0400
Hello,

It seems Shepherd can't cope with a start procedure that fails because
of an unbound variable.  The best way to diagnose the problem turned
out to be extracting the constructor code into a separate script and
running it on its own.  This made the error quickly apparent: "Unbound
variable: file-system-mapping".
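
The failure mode can be imitated outside of Shepherd with plain Guile.
The sketch below is not the script that was actually used; it simply
evaluates the #:mappings expression in a fresh module where
(gnu system file-systems) has not been imported, which is assumed to be
the situation the constructor code was in, and prints the resulting
error instead of hanging.  The file name and variable name are made up
for the example.

```scheme
;; Hypothetical stand-alone check, run with: guile check-start.scm
;; Evaluate the mapping expression in a fresh module that does not
;; import (gnu system file-systems), and report any error.
(define start-expression
  '(list (file-system-mapping
          (source "/dev/log")
          (target source))))

(catch #t
  (lambda ()
    (eval start-expression (make-fresh-user-module)))
  (lambda (key . args)
    ;; An unbound variable is expected to land here rather than
    ;; aborting the whole program.
    (format #t "start code failed: ~a ~s~%" key args)))
```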

Shepherd should handle this class of errors by reporting a useful
message, and *not* crash or otherwise hang.

Pushed with commit a09cdf1f9d.
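
The commit is not quoted in this thread.  One plausible shape for such
a fix is sketched below, assuming the cause is that the
(gnu system file-systems) module, which provides file-system-mapping,
was neither imported into the gexp nor listed in the service's modules.
It follows the with-imported-modules pattern that other containerized
Guix services appear to use, and should not be read as the actual
content of the commit.

```scheme
;; Sketch only; meant to sit in gnu/services/networking.scm, where
;; opendht-configuration->command-line-arguments is defined.  It also
;; needs (guix modules) for source-module-closure.
(define (opendht-shepherd-service config)
  "Return a <shepherd-service> running OpenDHT."
  (with-imported-modules (source-module-closure
                          '((gnu build shepherd)
                            (gnu system file-systems)))
    (shepherd-service
     (documentation "Run an OpenDHT node.")
     (provision '(opendht dhtnode dhtproxy))
     (requirement '(user-processes syslogd))
     ;; Make file-system-mapping and the container constructor
     ;; visible to the code that Shepherd evaluates.
     (modules '((gnu build shepherd)
                (gnu system file-systems)))
     (start #~(make-forkexec-constructor/container
               (list #$@(opendht-configuration->command-line-arguments config))
               #:mappings (list (file-system-mapping
                                 (source "/dev/log") ;for syslog
                                 (target source)))
               #:user "opendht"))
     (stop #~(make-kill-destructor)))))
```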

Closing.

Maxim




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 17 Jun 2021 11:24:07 GMT)


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.