GNU bug report logs -
#63190
[Shepherd] Nested calls lead to a hang
Previous Next
Reported by: Bruno Victal <mirai <at> makinata.eu>
Date: Sun, 30 Apr 2023 15:22:01 UTC
Severity: normal
Tags: notabug
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 63190 in the body.
You can then email your comments to 63190 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-guix <at> gnu.org
:
bug#63190
; Package
guix
.
(Sun, 30 Apr 2023 15:22:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Bruno Victal <mirai <at> makinata.eu>
:
New bug report received and forwarded. Copy sent to
bug-guix <at> gnu.org
.
(Sun, 30 Apr 2023 15:22:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Original discussion (IRC): <https://logs.guix.gnu.org/guix/2023-04-29.log#180735>
Minimal example (annotated):
test-system.scm:
--8<---------------cut here---------------start------------->8---
(use-modules (gnu)
(gnu tests)
(gnu packages)
(gnu packages base) ; coreutils/sleep
(gnu packages admin) ; shepherd
(gnu services shepherd))
;; Some dummy service whose start action simply waits for some seconds,
;; about enough to check with herd status before it exits.
(define dummy-service-type
(shepherd-service-type
'dummy
(lambda (cfg)
(shepherd-service
(documentation "Dummy action to start service.")
(provision '(dummy-service))
(respawn? #f) ; <<<<<< note, this disables the service!
(modules (cons* '(gnu services herd)
%default-modules))
(start #~(lambda _
(format #t "Starting a delay on dummy service.~%")
(fork+exec-command (list #$(file-append coreutils "/bin/sleep")
"30"))))
(stop #~(make-kill-destructor))
(actions
(list (shepherd-action
(name 'my-action)
(documentation "lorem ipsum")
(procedure
#~(lambda (x)
;; Scenario 1: using code from (gnu services herd), this hangs shepherd
#;(start-service 'dummy) ; hangs shepherd
;; Scenario 2: this doesn't hang shepherd but do note that the service has to be re-enabled either manually or automatically here
#;(system* #$(file-append shepherd "/bin/herd") "start" "dummy-service")
;; Scenario 3: use the already imported (shepherd service) module, doesn't hang shepherd
;; Like Scenario 2, the service must be re-enabled since (respawn? #f) disabled this.
;; Comment: Won't re-enabling mean that this service will relaunch once it quits?
;; That means the service has to disable itself on a successful exit, perhaps within the (stop ...) field?
(start 'dummy-service))))))))
#f ; no config
(description "lorem ipsum.")))
(operating-system
(inherit %simple-os)
(services
(cons*
(service dummy-service-type)
%base-services)))
--8<---------------cut here---------------end--------------->8---
Required modifications to gnu/services/shepherd.scm for scenario 1:
--8<---------------cut here---------------start------------->8---
diff --git a/gnu/services/shepherd.scm b/gnu/services/shepherd.scm
index b2601c0128..158806f421 100644
--- a/gnu/services/shepherd.scm
+++ b/gnu/services/shepherd.scm
@@ -282,7 +282,8 @@ (define (shepherd-service-file-name service)
(define (shepherd-service-file service)
"Return a file defining SERVICE."
(scheme-file (shepherd-service-file-name service)
- (with-imported-modules %default-imported-modules
+ (with-imported-modules (cons '(gnu services herd)
+ %default-imported-modules)
#~(begin
(use-modules #$@(shepherd-service-modules service))
--8<---------------cut here---------------end--------------->8---
Information forwarded
to
bug-guix <at> gnu.org
:
bug#63190
; Package
guix
.
(Sat, 06 May 2023 17:27:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 63190 <at> debbugs.gnu.org (full text, mbox):
Hi,
Bruno Victal <mirai <at> makinata.eu> skribis:
> Original discussion (IRC): <https://logs.guix.gnu.org/guix/2023-04-29.log#180735>
[...]
> (procedure
> #~(lambda (x)
> ;; Scenario 1: using code from (gnu services herd), this hangs shepherd
> #;(start-service 'dummy) ; hangs shepherd
(gnu services herd) provides a client to talk to the shepherd process.
However, the code of actions runs in the shepherd process itself, so
there’s no need to use the client library. Don’t do that. :-)
(Whether that leads to a deadlock depends; at first sight, I’d say
there’s no reason for this to deadlock in general, but you can of course
end up with a logic bug like A starts B, which spawns a client to start
A, which doesn’t start because it’s waiting for B.)
> ;; Scenario 2: this doesn't hang shepherd but do note that the service has to be re-enabled either manually or automatically here
> #;(system* #$(file-append shepherd "/bin/herd") "start" "dummy-service")
This is equivalent to the one above.
> ;; Scenario 3: use the already imported (shepherd service) module, doesn't hang shepherd
> ;; Like Scenario 2, the service must be re-enabled since (respawn? #f) disabled this.
> ;; Comment: Won't re-enabling mean that this service will relaunch once it quits?
> ;; That means the service has to disable itself on a successful exit, perhaps within the (stop ...) field?
> (start 'dummy-service))))))))
This should work without blocking.
However, starting a service from another one doesn’t sound great.
Could you give more context?
Thanks,
Ludo’.
Information forwarded
to
bug-guix <at> gnu.org
:
bug#63190
; Package
guix
.
(Mon, 08 May 2023 10:28:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 63190 <at> debbugs.gnu.org (full text, mbox):
Ludovic Courtès <ludo <at> gnu.org> skribis:
> Bruno Victal <mirai <at> makinata.eu> skribis:
>
>> Original discussion (IRC): <https://logs.guix.gnu.org/guix/2023-04-29.log#180735>
>
> [...]
>
>> (procedure
>> #~(lambda (x)
>> ;; Scenario 1: using code from (gnu services herd), this hangs shepherd
>> #;(start-service 'dummy) ; hangs shepherd
>
> (gnu services herd) provides a client to talk to the shepherd process.
> However, the code of actions runs in the shepherd process itself, so
> there’s no need to use the client library. Don’t do that. :-)
Also, the socket created in (gnu services herd) lacks SOCK_NONBLOCK so
the code above is bound to block forever.
Ludo’.
Information forwarded
to
bug-guix <at> gnu.org
:
bug#63190
; Package
guix
.
(Fri, 12 May 2023 23:07:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 63190 <at> debbugs.gnu.org (full text, mbox):
Ludovic Courtès <ludo <at> gnu.org> writes:
> (Whether that leads to a deadlock depends; at first sight, I’d
> say
> there’s no reason for this to deadlock in general, but you can
> of course
> end up with a logic bug like A starts B, which spawns a client
> to start
> A, which doesn’t start because it’s waiting for B.)
It's been a while since I looked at this, but my rough
recollection is the deadlock occurs because shepherd can only
process one request over its socket at a time. If that request
happens to *also* try to talk over the same socket, it'll hang
indefinitely waiting for its turn to come off the accept queue.
I'm not sure there's much to be done in the 0.9 version of
shepherd about it. I'm hoping that 0.10 and up will be able to
cope with situations like this without completely deadlocking the
shepherd itself. It's obviously pretty bad if pid 1 hangs for any
reason at all, even user error.
-bjc
Information forwarded
to
bug-guix <at> gnu.org
:
bug#63190
; Package
guix
.
(Sat, 13 May 2023 09:47:02 GMT)
Full text and
rfc822 format available.
Message #17 received at 63190 <at> debbugs.gnu.org (full text, mbox):
Brian Cully <bjc <at> spork.org> skribis:
> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> (Whether that leads to a deadlock depends; at first sight, I’d say
>> there’s no reason for this to deadlock in general, but you can of
>> course
>> end up with a logic bug like A starts B, which spawns a client to
>> start
>> A, which doesn’t start because it’s waiting for B.)
>
> It's been a while since I looked at this, but my rough recollection is
> the deadlock occurs because shepherd can only process one request over
> its socket at a time.
That’s not the case in 0.9: it can process several requests
concurrently. However, as I wrote in a followup message, the client
socket created by (gnu services herd) lacks SOCK_NONBLOCK, which can
thus block the process on reads and writes.
Ludo’.
Added tag(s) notabug.
Request was from
Ludovic Courtès <ludo <at> gnu.org>
to
control <at> debbugs.gnu.org
.
(Sat, 13 May 2023 09:47:02 GMT)
Full text and
rfc822 format available.
bug closed, send any further explanations to
63190 <at> debbugs.gnu.org and Bruno Victal <mirai <at> makinata.eu>
Request was from
Ludovic Courtès <ludo <at> gnu.org>
to
control <at> debbugs.gnu.org
.
(Sat, 13 May 2023 09:47:02 GMT)
Full text and
rfc822 format available.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Sat, 10 Jun 2023 11:24:09 GMT)
Full text and
rfc822 format available.
This bug report was last modified 313 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.