GNU bug report logs - #77274
[Shepherd] Competing one-shot service starter gets erroneous failure

Package: guix

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Wed, 26 Mar 2025 10:15:02 UTC

Severity: normal

Done: Ludovic Courtès <ludo <at> gnu.org>

To reply to this bug, email your comments to 77274 AT debbugs.gnu.org.
There is no need to reopen the bug first.


Report forwarded to bug-guix <at> gnu.org:
bug#77274; Package guix. (Wed, 26 Mar 2025 10:15:02 GMT)

Acknowledgement sent to Ludovic Courtès <ludo <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Wed, 26 Mar 2025 10:15:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: bug-guix <at> gnu.org
Subject: [Shepherd] Competing one-shot service starter gets erroneous failure
Date: Wed, 26 Mar 2025 11:14:24 +0100
[Message part 1 (text/plain, inline)]
As of 1.0.3, when two clients start the same one-shot service, the one
that loses the race never sees the value that was produced by the
‘start’ method.

  herd start one-shot & herd start one-shot

Here one of the ‘herd start’ processes will wrongfully fail with “failed
to start service one-shot”.

Instead of receiving that value, the losing client calls
‘service-running-value’, which always returns #f because the one-shot
service was stopped in the meantime.  I’m referring to this bit of
‘start-service’:

      (match (get-message reply)
        (#f
         ;; We lost the race: SERVICE is already running.
         (service-running-value service))   ;<- here
        …)
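
To see which of the two competing clients fails, one can capture the
exit status of each background job separately.  The following is only a
minimal sketch: ‘start_one_shot’ is a hypothetical stand-in for
‘herd start one-shot’ so that it runs without a Shepherd instance; swap
in the real command to observe the behavior described above.

```shell
#!/bin/sh
# Hypothetical stand-in for 'herd start one-shot' (an assumption made
# for illustration); replace the body with the real command.
start_one_shot ()
{
    sleep 1
}

# Launch two competing clients in the background.
start_one_shot &
pid1=$!
start_one_shot &
pid2=$!

# 'wait PID' returns the exit status of that particular job, so each
# client's outcome can be recorded independently.
status1=0
status2=0
wait "$pid1" || status1=$?
wait "$pid2" || status2=$?

echo "first client: $status1, second client: $status2"
```

With the real command on an affected version, one of the two statuses
would be expected to be non-zero even though the service started fine.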

Attached is a reproducer.

Ludo’.

[Message part 2 (text/x-patch, inline)]
diff --git a/tests/one-shot.sh b/tests/one-shot.sh
index eeecea7..491eeae 100644
--- a/tests/one-shot.sh
+++ b/tests/one-shot.sh
@@ -1,5 +1,5 @@
 # GNU Shepherd --- Test one-shot services.
-# Copyright © 2019, 2023-2024 Ludovic Courtès <ludo <at> gnu.org>
+# Copyright © 2019, 2023-2025 Ludovic Courtès <ludo <at> gnu.org>
 #
 # This file is part of the GNU Shepherd.
 #
@@ -197,4 +197,35 @@ test "$(cat "$stamp")" = "third"
 $herd start fourth && false
 $herd start fourth && false
 
+# Check the behavior of two clients competing to start the same one-shot
+# service.  Both should succeed.
+
+cat > "$conf" <<EOF
+(register-services
+  (list (service
+          '(fifth)
+          #:one-shot? #t
+          #:start (lambda ()
+                    (let loop ()
+                      (unless (file-exists? "$stamp")
+                        (sleep 0.5)
+                        (loop)))
+                    #t))))
+EOF
+
+$herd load root "$conf"
+
+rm -f "$stamp"
+
+$herd start fifth &
+herd_start_pid1=$!
+$herd start fifth &
+herd_start_pid2=$!
+until $herd status fifth | grep starting; do sleep 0.5; done
+touch "$stamp"			# trigger starting->running transition
+
+# Both 'herd start' processes should have succeeded.
+wait $herd_start_pid1
+wait $herd_start_pid2
+
 $herd stop root

Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Wed, 26 Mar 2025 11:37:01 GMT)

Notification sent to Ludovic Courtès <ludo <at> gnu.org>:
bug acknowledged by developer. (Wed, 26 Mar 2025 11:37:02 GMT)

Message #10 received at 77274-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: 77274-done <at> debbugs.gnu.org
Subject: Re: bug#77274: [Shepherd] Competing one-shot service starter gets erroneous failure
Date: Wed, 26 Mar 2025 12:36:17 +0100
Ludovic Courtès <ludo <at> gnu.org> writes:

> As of 1.0.3, when two clients start the same one-shot service, the one
> that loses the race never sees the value that was produced by the
> ‘start’ method.
>
>   herd start one-shot & herd start one-shot
>
> Here one of the ‘herd start’ processes will wrongfully fail with “failed
> to start service one-shot”.
>
> Instead of receiving that value, the losing client calls
> ‘service-running-value’, which always returns #f because the one-shot
> service was stopped in the meantime.  I’m referring to this bit of
> ‘start-service’:
>
>       (match (get-message reply)
>         (#f
>          ;; We lost the race: SERVICE is already running.
>          (service-running-value service))   ;<- here
>         …)

Fixed in f730106fe1cf9a3efc2f327cc5716335585ac92b.

Ludo'.




This bug report was last modified 13 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.