GNU bug report logs - #55444
elogind startup race between shepherd and dbus-daemon

Previous Next

Package: guix;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Mon, 16 May 2022 08:27:01 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 55444 in the body.
You can then email your comments to 55444 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#55444; Package guix. (Mon, 16 May 2022 08:27:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ludovic Courtès <ludo <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 16 May 2022 08:27:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: bug-guix <at> gnu.org
Subject: elogind startup race between shepherd and dbus-daemon
Date: Mon, 16 May 2022 10:26:15 +0200
Hello!

Currently (40a729a0e6f1d660b942241416c1e2c567616d4d), shepherd and
dbus-daemon compete to start elogind: shepherd tries to start it
eagerly, and dbus-daemon starts it on-demand upon bus activation.

Sometimes dbus-daemon wins, and thus shepherd tries a few times to start
it anyway, leading to the infamous:

  elogind is already running as PID 123

(elogind checks whether its PID file exists.  Note that you may see that
message also when shepherd wins, because dbus-daemon tries to start it
anyway.)  Eventually, shepherd considers that elogind cannot be started
and disables it.

In addition to being ridiculous, it’s harmful: the ‘xorg-server’ service
(from ‘gdm-service-type’ and ‘sddm-service-type’ depends on ‘elogind’),
so if shepherd loses the race, Xorg isn’t started (on my laptop,
shepherd never loses the race it seems, but i’ve seen it lose half of
the time on a slower machine).

The reason elogind is started by shepherd is explained in this comment:

       ;; Start elogind from the Shepherd rather than waiting
       ;; for bus activation.  This ensures that it can handle
       ;; events like lid close, etc.

This comes from 94a881178af9a9a918ce6de55641daa245c92e73, which was a
fix for <http://issues.guix.gnu.org/27580>.  I believe the justification
still holds.

So it would seem that the solution to this is to prevent dbus-daemon
from starting elogind.  We can do that by changing
org.freedesktop.login1.service so that it has “Exec=true” instead of
“Exec=elogind --daemon”.

“Exec=true” is a bit crude because it doesn’t guarantee that elogind is
really started; if that isn’t good enough, we could instead wait for the
PID file or something (as of Shepherd 0.9.0, invoking ‘herd start
elogind’ potentially leads shepherd to start a second instance if the
first one is still being started, so we can’t really do that).

Depending on what we end up with, we might also revisit whether
xorg-server needs to explicitly depend on elogind.

Thoughts?

Ludo’.




Severity set to 'important' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 16 May 2022 08:41:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#55444; Package guix. (Tue, 24 May 2022 02:28:01 GMT) Full text and rfc822 format available.

Message #10 received at 55444 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 55444 <at> debbugs.gnu.org
Subject: Re: bug#55444: elogind startup race between shepherd and dbus-daemon
Date: Mon, 23 May 2022 22:27:20 -0400
Hi,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hello!
>
> Currently (40a729a0e6f1d660b942241416c1e2c567616d4d), shepherd and
> dbus-daemon compete to start elogind: shepherd tries to start it
> eagerly, and dbus-daemon starts it on-demand upon bus activation.
>
> Sometimes dbus-daemon wins, and thus shepherd tries a few times to start
> it anyway, leading to the infamous:
>
>   elogind is already running as PID 123

Do we have a system test that sometimes reproduce it, or at least the
above message?  I have some branch where I introduce some D-Bus
synchronization primitives I had started to fix
https://issues.guix.gnu.org/issue/52051, which ended up being fixed
differently (bumping the timeout value); perhaps it could be of use
here.

Thanks,

Maxim




Information forwarded to bug-guix <at> gnu.org:
bug#55444; Package guix. (Tue, 24 May 2022 19:27:01 GMT) Full text and rfc822 format available.

Message #13 received at 55444 <at> debbugs.gnu.org (full text, mbox):

From: Liliana Marie Prikler <liliana.prikler <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>, 55444 <at> debbugs.gnu.org
Subject: Re: elogind startup race between shepherd and dbus-daemon
Date: Tue, 24 May 2022 21:25:57 +0200
Hi,

Am Montag, dem 16.05.2022 um 10:26 +0200 schrieb Ludovic Courtès:
> [...]
> So it would seem that the solution to this is to prevent dbus-daemon
> from starting elogind.  We can do that by changing
> org.freedesktop.login1.service so that it has “Exec=true” instead of
> “Exec=elogind --daemon”.
> 
> “Exec=true” is a bit crude because it doesn’t guarantee that elogind
> is really started; if that isn’t good enough, we could instead wait
> for the PID file or something (as of Shepherd 0.9.0, invoking ‘herd
> start elogind’ potentially leads shepherd to start a second instance
> if the first one is still being started, so we can’t really do that).
Why does shepherd race with itself here?  That sounds like a very evil
bug.  Rather than waiting for a log file, I'd suggest writing an ad-hoc
Guile script that communicates with shepherd and blocks until shepherd
signals that elogind has been started, but this script too would have
to work around shepherd racing against itself.

> Depending on what we end up with, we might also revisit whether
> xorg-server needs to explicitly depend on elogind.
At least in the case of GDM I think it does heavily depend on elogind.
For the future, I think we also should take over dbus-daemon's
autostart in the same way systemd already has.

Cheers




Information forwarded to bug-guix <at> gnu.org:
bug#55444; Package guix. (Wed, 25 May 2022 12:27:02 GMT) Full text and rfc822 format available.

Message #16 received at 55444 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Liliana Marie Prikler <liliana.prikler <at> gmail.com>
Cc: 55444 <at> debbugs.gnu.org
Subject: Re: bug#55444: elogind startup race between shepherd and dbus-daemon
Date: Wed, 25 May 2022 14:26:13 +0200
Hi,

Liliana Marie Prikler <liliana.prikler <at> gmail.com> skribis:

> Am Montag, dem 16.05.2022 um 10:26 +0200 schrieb Ludovic Courtès:
>> [...]
>> So it would seem that the solution to this is to prevent dbus-daemon
>> from starting elogind.  We can do that by changing
>> org.freedesktop.login1.service so that it has “Exec=true” instead of
>> “Exec=elogind --daemon”.
>> 
>> “Exec=true” is a bit crude because it doesn’t guarantee that elogind
>> is really started; if that isn’t good enough, we could instead wait
>> for the PID file or something (as of Shepherd 0.9.0, invoking ‘herd
>> start elogind’ potentially leads shepherd to start a second instance
>> if the first one is still being started, so we can’t really do that).
> Why does shepherd race with itself here?  That sounds like a very evil
> bug.  Rather than waiting for a log file, I'd suggest writing an ad-hoc
> Guile script that communicates with shepherd and blocks until shepherd
> signals that elogind has been started, but this script too would have
> to work around shepherd racing against itself.

Right.  Currently services have two states: stopped, and started.
Fixing that needs non-trivial changes to how shepherd handles state.

We’ll have to do that (the way I see it, we’ll move state out of
<service> and have a fiber explicitly handle state, including
distinguishing between “stopped” and “starting”), but I think/hope we
can fix this bug without first addressing this issue.

>> Depending on what we end up with, we might also revisit whether
>> xorg-server needs to explicitly depend on elogind.
> At least in the case of GDM I think it does heavily depend on elogind.
> For the future, I think we also should take over dbus-daemon's
> autostart in the same way systemd already has.

Agreed, though that one is trickier: we’d need an implementation of the
D-Bus protocol.  There’s guile-ac-d-bus but it’s probably under-tested.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#55444; Package guix. (Wed, 25 May 2022 12:46:02 GMT) Full text and rfc822 format available.

Message #19 received at 55444 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Cc: 55444 <at> debbugs.gnu.org
Subject: Re: bug#55444: elogind startup race between shepherd and dbus-daemon
Date: Wed, 25 May 2022 14:45:50 +0200
Hi,

Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Hello!
>>
>> Currently (40a729a0e6f1d660b942241416c1e2c567616d4d), shepherd and
>> dbus-daemon compete to start elogind: shepherd tries to start it
>> eagerly, and dbus-daemon starts it on-demand upon bus activation.
>>
>> Sometimes dbus-daemon wins, and thus shepherd tries a few times to start
>> it anyway, leading to the infamous:
>>
>>   elogind is already running as PID 123
>
> Do we have a system test that sometimes reproduce it, or at least the
> above message?

Any system along the lines of gnu/system/examples/desktop.tmpl gets this
message, though usually Xorg starts without any problem.

There’s an elogind system test but it doesn’t catch the problem because
it’s a non-deterministic and rather rare issue.

> I have some branch where I introduce some D-Bus synchronization
> primitives I had started to fix
> https://issues.guix.gnu.org/issue/52051, which ended up being fixed
> differently (bumping the timeout value); perhaps it could be of use
> here.

Hmm I had forgotten about that bug, and I wonder if it’s the same bug I
was seeing, and that in fact the shepherd/dbus-daemon race isn’t the
root cause.  The machine where I saw elogind startup failures was
extremely slow to start for other reasons.

Anyhow, getting rid of this race seems like the right thing to do.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#55444; Package guix. (Fri, 27 May 2022 13:55:01 GMT) Full text and rfc822 format available.

Message #22 received at 55444 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: 55444 <at> debbugs.gnu.org
Subject: Re: bug#55444: elogind startup race between shepherd and dbus-daemon
Date: Fri, 27 May 2022 15:54:44 +0200
Ludovic Courtès <ludo <at> gnu.org> skribis:

> Currently (40a729a0e6f1d660b942241416c1e2c567616d4d), shepherd and
> dbus-daemon compete to start elogind: shepherd tries to start it
> eagerly, and dbus-daemon starts it on-demand upon bus activation.
>
> Sometimes dbus-daemon wins, and thus shepherd tries a few times to start
> it anyway, leading to the infamous:

Here’s an example where dbus-daemon wins:

--8<---------------cut here---------------start------------->8---
$ sudo grep --color -E '^May 27 .*(dbus.*login|elogind)' /var/log/messages
May 27 15:06:36 localhost dbus-daemon[326]: [system] Activating service name='org.freedesktop.login1' requested by ':1.0' (uid=0 pid=307 comm="/gnu/store/2lis8khrdk0zzjzs5ydi8rs5h6f6wjr7-shadow") (using servicehelper)
May 27 15:06:37 localhost elogind-daemon[335]: New seat seat0.
May 27 15:06:37 localhost elogind-daemon[335]: Watching system buttons on /dev/input/event2 (Power Button)
May 27 15:06:37 localhost elogind-daemon[335]: Watching system buttons on /dev/input/event0 (Power Button)
May 27 15:06:37 localhost elogind-daemon[335]: Watching system buttons on /dev/input/event1 (Sleep Button)
May 27 15:06:41 localhost elogind-daemon[335]: Watching system buttons on /dev/input/event4 (Darfon HP USB Keyboard)
May 27 15:06:43 localhost elogind-daemon[335]: Watching system buttons on /dev/input/event5 (Darfon HP USB Keyboard Consumer Control)
May 27 15:06:48 localhost dbus-daemon[326]: [system] Successfully activated service 'org.freedesktop.login1'
May 27 15:06:54 localhost elogind-daemon[335]: New session c1 of user ludo.
May 27 15:07:45 localhost shepherd[1]: Service elogind has been started. 
May 27 15:08:02 localhost shepherd[1]: Respawning elogind. 
May 27 15:08:14 localhost shepherd[1]: Service elogind has been started. 
May 27 15:08:32 localhost shepherd[1]: Respawning elogind. 
May 27 15:08:43 localhost shepherd[1]: Service elogind has been started. 
May 27 15:08:47 localhost vmunix: [   25.123255] shepherd[1]: Service file-system-/sys/fs/cgroup/elogind has been started.
May 27 15:08:53 localhost shepherd[1]: Respawning elogind. 
May 27 15:08:59 localhost shepherd[1]: Service elogind has been started. 
May 27 15:09:00 localhost vmunix: [   79.976531] elogind[348]: elogind is already running as PID 335
May 27 15:09:00 localhost vmunix: [  107.880971] elogind[364]: elogind is already running as PID 335
May 27 15:09:00 localhost vmunix: [  109.160864] elogind-daemon[335]: New session c2 of user ludo.
May 27 15:09:00 localhost vmunix: [  135.017068] elogind[369]: elogind is already running as PID 335
May 27 15:09:00 localhost vmunix: [  159.849027] elogind[370]: elogind is already running as PID 335
May 27 15:09:00 localhost vmunix: [  181.608889] elogind[371]: elogind is already running as PID 335
May 27 15:09:00 localhost shepherd[1]: Respawning elogind. 
May 27 15:09:00 localhost shepherd[1]: Service elogind has been started. 
May 27 15:09:04 localhost shepherd[1]: Respawning elogind. 
May 27 15:09:04 localhost shepherd[1]: Service elogind has been started. 
May 27 15:09:04 localhost elogind[410]: elogind is already running as PID 335
May 27 15:09:04 localhost shepherd[1]: Respawning elogind. 
May 27 15:09:04 localhost shepherd[1]: Service elogind has been started. 
May 27 15:09:05 localhost elogind[411]: elogind is already running as PID 335
May 27 15:09:05 localhost shepherd[1]: Respawning elogind. 
May 27 15:09:05 localhost shepherd[1]: Service elogind has been started. 
May 27 15:09:05 localhost elogind[412]: elogind is already running as PID 335
May 27 15:09:05 localhost shepherd[1]: Respawning elogind. 
May 27 15:09:05 localhost shepherd[1]: Service elogind has been started. 
May 27 15:09:05 localhost elogind[416]: elogind is already running as PID 335
May 27 15:09:05 localhost shepherd[1]: Respawning elogind. 
May 27 15:09:05 localhost shepherd[1]: Service elogind has been started. 
May 27 15:09:05 localhost elogind[417]: elogind is already running as PID 335
May 27 15:09:05 localhost shepherd[1]: Service elogind has been disabled. 
May 27 15:09:08 localhost elogind-daemon[335]: New session c3 of user gdm.
May 27 15:12:08 localhost elogind-daemon[335]: New session c4 of user ludo.
--8<---------------cut here---------------end--------------->8---

(In this case ‘xorg-server’ started but ‘herd status elogind’ shows it
is stopped and disabled.)

Contrast with a successful startup where shepherd wins:

--8<---------------cut here---------------start------------->8---
May 27 11:03:55 localhost shepherd[1]: Service elogind has been started. 
May 27 11:03:54 localhost elogind[476]: New seat seat0.
May 27 11:03:54 localhost elogind[476]: Watching system buttons on /dev/input/event3 (Power Button)
May 27 11:03:54 localhost elogind[476]: Watching system buttons on /dev/input/event1 (Power Button)
May 27 11:03:54 localhost elogind[476]: Watching system buttons on /dev/input/event0 (Lid Switch)
May 27 11:03:54 localhost elogind[476]: Watching system buttons on /dev/input/event2 (Sleep Button)
May 27 11:03:54 localhost elogind[476]: Watching system buttons on /dev/input/event5 (Dell Dell USB Keyboard)
May 27 11:03:54 localhost dbus-daemon[470]: [system] Activating service name='org.freedesktop.login1' requested by ':1.2' (uid=0 pid=477 comm="/gnu/store/qpaw2f734zlsq153fkn5afcv4k4fk63z-upower") (using servicehelper)
May 27 11:03:54 localhost elogind[476]: Watching system buttons on /dev/input/event4 (AT Translated Set 2 keyboard)
May 27 11:03:55 localhost elogind[496]: elogind is already running as PID 476
May 27 11:03:55 localhost dbus-daemon[470]: [system] Successfully activated service 'org.freedesktop.login1'
May 27 11:04:03 localhost vmunix: [ 2089.808033] shepherd[1]: Service file-system-/sys/fs/cgroup/elogind has been started.
May 27 11:04:04 localhost elogind[476]: New session c1 of user gdm.
--8<---------------cut here---------------end--------------->8---

Besides, Gentoo recommends starting it from runit rather than on-demand:

  https://wiki.gentoo.org/wiki/Elogind#Service
  https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-auth/elogind/elogind-241.1.ebuild?id=bea47cee314829edbb41453d1e89fa1d1d3f9993

They don’t seem to be doing anything to avoid the race though.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#55444; Package guix. (Fri, 27 May 2022 20:55:01 GMT) Full text and rfc822 format available.

Message #25 received at 55444 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: 55444 <at> debbugs.gnu.org
Subject: Re: bug#55444: elogind startup race between shepherd and dbus-daemon
Date: Fri, 27 May 2022 22:54:49 +0200
[Message part 1 (text/plain, inline)]
Hey there,

Ludovic Courtès <ludo <at> gnu.org> skribis:

> So it would seem that the solution to this is to prevent dbus-daemon
> from starting elogind.  We can do that by changing
> org.freedesktop.login1.service so that it has “Exec=true” instead of
> “Exec=elogind --daemon”.
>
> “Exec=true” is a bit crude because it doesn’t guarantee that elogind is
> really started; if that isn’t good enough, we could instead wait for the
> PID file or something (as of Shepherd 0.9.0, invoking ‘herd start
> elogind’ potentially leads shepherd to start a second instance if the
> first one is still being started, so we can’t really do that).

The patch below address that: it changes the “Exec=” line of
‘org.freedesktop.login1’ to refer to a wrapper.  That wrapper connects
to shepherd and waits until ‘elogind’ is started.

That way, if dbus-daemon comes first, it won’t actually launch anything
and instead wait for the Shepherd ‘elogind’ service to be up.  (And if
it comes second, dbus-daemon won’t try to launch anything, so no
spurious “already running” messages.)

I tested it in a ‘desktop.tmpl’ VM, quickly logging in on tty1.  On
/var/log/messages, you can see the “Activating ….login1” message from
dbus-daemon, followed by “Service elogind started” from shepherd,
followed by “Successfully activated ….login1” from dbus-daemon.

The “elogind” system test passes too.

Thoughts?  Objections?

Ludo’.

[0001-services-elogind-When-started-by-dbus-daemon-wait-fo.patch (text/x-patch, inline)]
From 7ef63d7426677961afd2bd937af19b08209c5b70 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo <at> gnu.org>
Date: Fri, 27 May 2022 22:41:55 +0200
Subject: [PATCH] services: elogind: When started by dbus-daemon, wait for the
 Shepherd service.

Fixes <https://issues.guix.gnu.org/55444>.

Previously shepherd and dbus-daemon would race to start elogind.  In
some cases (for instance if one logs in quickly enough on the tty),
dbus-daemon would "win" and start elogind before shepherd has had a
chance to do it.  Consequently, shepherd would fail to start elogind and
mark it as stopped and disabled, in turn preventing services that depend
on it such as 'xorg-server' from starting.

* gnu/services/desktop.scm (elogind-dbus-service): Rewrite to refer to a
wrapper that waits for the 'elogind' Shepherd service.
---
 gnu/services/desktop.scm | 79 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 75 insertions(+), 4 deletions(-)

diff --git a/gnu/services/desktop.scm b/gnu/services/desktop.scm
index 24fd43a207..318107a2ca 100644
--- a/gnu/services/desktop.scm
+++ b/gnu/services/desktop.scm
@@ -1075,10 +1075,81 @@ (define-syntax-rule (ini-file config file clause ...)
    ("HybridSleepMode" (sleep-list elogind-hybrid-sleep-mode))))
 
 (define (elogind-dbus-service config)
-  (list (wrapped-dbus-service (elogind-package config)
-                              "libexec/elogind/elogind"
-                              `(("ELOGIND_CONF_FILE"
-                                 ,(elogind-configuration-file config))))))
+  "Return a @file{org.freedesktop.login1.service} file that tells D-Bus how to
+\"start\" elogind.  In practice though, our elogind is started when booting by
+shepherd.  Thus, the @code{Exec} line of this @file{.service} file does not
+explain how to start elogind; instead, it spawns a wrapper that waits for the
+@code{elogind} shepherd service.  This avoids a race condition where both
+@command{shepherd} and @command{dbus-daemon} would attempt to start elogind."
+  ;; For more info on the elogind startup race, see
+  ;; <https://issues.guix.gnu.org/55444>.
+
+  (define elogind
+    (elogind-package config))
+
+  (define wrapper
+    (program-file "elogind-dbus-shepherd-sync"
+                  (with-imported-modules '((gnu services herd))
+                    #~(begin
+                        (use-modules (gnu services herd)
+                                     (srfi srfi-1)
+                                     (ice-9 match))
+
+                        (define (elogind-service? service)
+                          (memq 'elogind (live-service-provision service)))
+
+                        (define max-attempts
+                          ;; Number of attempts before assuming elogind failed
+                          ;; to start.
+                          20)
+
+                        ;; Repeatedly check whether the 'elogind' shepherd
+                        ;; service is up and running.  (As of Shepherd 0.9.1,
+                        ;; we cannot just call the 'start' method and wait for
+                        ;; it: it would spawn an additional elogind process.)
+                        (let loop ((attempts 0))
+                          (define services
+                            (current-services))
+
+                          (when (>= attempts max-attempts)
+                            (format (current-error-port)
+                                    "elogind shepherd service not started~%")
+                            (exit 2))
+
+                          (match (find elogind-service? services)
+                            (#f
+                             (format (current-error-port)
+                                     "no elogind shepherd service~%")
+                             (exit 1))
+                            (service
+                             (unless (live-service-running service)
+                               (sleep 1)
+                               (loop (+ attempts 1))))))))))
+
+  (define build
+    (with-imported-modules '((guix build utils))
+      #~(begin
+          (use-modules (guix build utils)
+                       (ice-9 match))
+
+          (define service-directory
+            "/share/dbus-1/system-services")
+
+          (mkdir-p (dirname (string-append #$output service-directory)))
+          (copy-recursively (string-append #$elogind service-directory)
+                            (string-append #$output service-directory))
+          (symlink (string-append #$elogind "/etc") ;for etc/dbus-1
+                   (string-append #$output "/etc"))
+
+          ;; Replace the "Exec=" line of the 'org.freedesktop.login1.service'
+          ;; file with one that refers to WRAPPER instead of elogind.
+          (match (find-files #$output "\\.service$")
+            ((file)
+             (substitute* file
+               (("Exec[[:blank:]]*=.*" _)
+                (string-append "Exec=" #$wrapper "\n"))))))))
+
+  (list (computed-file "elogind-dbus-service-wrapper" build)))
 
 (define (pam-extension-procedure config)
   "Return an extension for PAM-ROOT-SERVICE-TYPE that ensures that all the PAM
-- 
2.36.0


Information forwarded to bug-guix <at> gnu.org:
bug#55444; Package guix. (Sat, 28 May 2022 08:14:02 GMT) Full text and rfc822 format available.

Message #28 received at 55444 <at> debbugs.gnu.org (full text, mbox):

From: Josselin Poiret <dev <at> jpoiret.xyz>
To: Ludovic Courtès <ludo <at> gnu.org>, 55444 <at> debbugs.gnu.org
Subject: Re: bug#55444: elogind startup race between shepherd and dbus-daemon
Date: Sat, 28 May 2022 10:13:26 +0200
Hello Ludo,

Ludovic Courtès <ludo <at> gnu.org> writes:

> The patch below address that: it changes the “Exec=” line of
> ‘org.freedesktop.login1’ to refer to a wrapper.  That wrapper connects
> to shepherd and waits until ‘elogind’ is started.
>
> That way, if dbus-daemon comes first, it won’t actually launch anything
> and instead wait for the Shepherd ‘elogind’ service to be up.  (And if
> it comes second, dbus-daemon won’t try to launch anything, so no
> spurious “already running” messages.)
>
> I tested it in a ‘desktop.tmpl’ VM, quickly logging in on tty1.  On
> /var/log/messages, you can see the “Activating ….login1” message from
> dbus-daemon, followed by “Service elogind started” from shepherd,
> followed by “Successfully activated ….login1” from dbus-daemon.
>
> The “elogind” system test passes too.
>
> Thoughts?  Objections?
>
> Ludo’.

Great idea!  The patch LGTM, although I'd argue that most of the wrapper
code could belong in (gnu services herd), in something like
(wait-for-services #:select? select? #:retries (n 20)) so that someone
else doesn't end up recoding this for another service.

Best,
-- 
Josselin Poiret




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Sat, 28 May 2022 21:27:02 GMT) Full text and rfc822 format available.

Notification sent to Ludovic Courtès <ludo <at> gnu.org>:
bug acknowledged by developer. (Sat, 28 May 2022 21:27:02 GMT) Full text and rfc822 format available.

Message #33 received at 55444-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Josselin Poiret <dev <at> jpoiret.xyz>
Cc: 55444-done <at> debbugs.gnu.org
Subject: Re: bug#55444: elogind startup race between shepherd and dbus-daemon
Date: Sat, 28 May 2022 23:26:23 +0200
Hello,

Josselin Poiret <dev <at> jpoiret.xyz> skribis:

> Great idea!  The patch LGTM, although I'd argue that most of the wrapper
> code could belong in (gnu services herd), in something like
> (wait-for-services #:select? select? #:retries (n 20)) so that someone
> else doesn't end up recoding this for another service.

Good idea, done here:

  f383838a09 services: elogind: When started by dbus-daemon, wait for the Shepherd service.
  b04ae71def services: herd: Add 'wait-for-service'.

Let me know what you think!

Thanks,
Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 26 Jun 2022 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 276 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.