GNU bug report logs - #65306
[shepherd] ntpd throws shepherd out of the loop

Package: guix;

Reported by: Liliana Marie Prikler <liliana.prikler <at> gmail.com>

Date: Tue, 15 Aug 2023 05:19:01 UTC

Severity: normal

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 65306 in the body.
You can then email your comments to 65306 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guix <at> gnu.org:
bug#65306; Package guix. (Tue, 15 Aug 2023 05:19:01 GMT)

Acknowledgement sent to Liliana Marie Prikler <liliana.prikler <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Tue, 15 Aug 2023 05:19:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Liliana Marie Prikler <liliana.prikler <at> gmail.com>
To: bug-guix <at> gnu.org
Subject: [shepherd] ntpd throws shepherd out of the loop
Date: Tue, 15 Aug 2023 07:18:02 +0200
Hi Guix,

I have a laptop that's a little stuck in the past… more accurately
January of 2020 thanks to what I believe to be an empty CMOS battery. 
As of recently (maybe it dates back longer, but I first experienced it
two weeks ago and just now got to debugging it a little), Shepherd gets
stuck at 100% CPU usage "early" on first boot.  I can prevent this
issue by getting the system time "close enough" to the actual time
before the NTP sync, but see the first sentence.  Not having a network
connection also works, but that's somewhat impractical.  Also, the high
CPU usage still occurs if a sync is done later.  I have yet to
encounter the bug post hibernation, but I also wish not to.  There
doesn't appear to be anything particularly interesting in the logs
either.

Cheers




Information forwarded to bug-guix <at> gnu.org:
bug#65306; Package guix. (Tue, 15 Aug 2023 11:19:02 GMT)

Message #8 received at submit <at> debbugs.gnu.org:

From: Csepp <raingloom <at> riseup.net>
To: Liliana Marie Prikler <liliana.prikler <at> gmail.com>
Cc: bug-guix <at> gnu.org, 65306 <at> debbugs.gnu.org
Subject: Re: bug#65306: [shepherd] ntpd throws shepherd out of the loop
Date: Tue, 15 Aug 2023 13:13:50 +0200
Liliana Marie Prikler <liliana.prikler <at> gmail.com> writes:

> Hi Guix,
>
> I have a laptop that's a little stuck in the past… more accurately
> January of 2020 thanks to what I believe to be an empty CMOS battery. 
> As of recently (maybe it dates back longer, but I first experienced it
> two weeks ago and just now got to debugging it a little), Shepherd gets
> stuck at 100% CPU usage "early" on first boot.  I can prevent this
> issue by getting the system time "close enough" to the actual time
> before the NTP sync, but see the first sentence.  Not having a network
> connection also works, but that's somewhat impractical.  Also, the high
> CPU usage still occurs if a sync is done later.  I have yet to
> encounter the bug post hibernation, but I also wish not to.  There
> doesn't appear to be anything particularly interesting in the logs
> either.
>
> Cheers

This sounds like an issue with slow incremental system time updates,
although I don't understand why that would cause Shepherd to hang.
Maybe the NTP service is configured to report itself as initialized
only once it has finished synchronizing, which defeats the point of
incremental updating.
There is probably a config setting to tell ntpd to perform the update in
a single step; at least I know chrony has one.

P.S.: don't wait until the battery starts leaking to replace it.




Information forwarded to bug-guix <at> gnu.org:
bug#65306; Package guix. (Tue, 15 Aug 2023 14:28:02 GMT)

Message #14 received at 65306 <at> debbugs.gnu.org:

From: Timotej Lazar <timotej.lazar <at> araneo.si>
To: Liliana Marie Prikler <liliana.prikler <at> gmail.com>, 65306 <at> debbugs.gnu.org
Subject: Re: bug#65306: [shepherd] ntpd throws shepherd out of the loop
Date: Tue, 15 Aug 2023 16:27:21 +0200
Liliana Marie Prikler <liliana.prikler <at> gmail.com> [2023-08-15 07:18:02+0200]:
> As of recently (maybe it dates back longer, but I first experienced it
> two weeks ago and just now got to debugging it a little), Shepherd gets
> stuck at 100% CPU usage "early" on first boot.

I have this issue on all Guix systems without a (working) RTC. It seems
to be caused by a recentish update to guile-fibers:

https://github.com/wingo/fibers/issues/89

For me this happens regardless of whether the system time is pushed
forward manually or by ntpd. Depending on the time delta and CPU speed,
the usage returns to normal after a couple of days. During that time any
socket-activated services like SSH are also unreachable.




Information forwarded to bug-guix <at> gnu.org:
bug#65306; Package guix. (Sat, 02 Sep 2023 20:45:02 GMT)

Message #17 received at 65306 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Timotej Lazar <timotej.lazar <at> araneo.si>
Cc: Liliana Marie Prikler <liliana.prikler <at> gmail.com>, 65306 <at> debbugs.gnu.org
Subject: Re: bug#65306: [shepherd] ntpd throws shepherd out of the loop
Date: Sat, 02 Sep 2023 22:44:03 +0200
Hi,

Timotej Lazar <timotej.lazar <at> araneo.si> writes:

> Liliana Marie Prikler <liliana.prikler <at> gmail.com> [2023-08-15 07:18:02+0200]:
>> As of recently (maybe it dates back longer, but I first experienced it
>> two weeks ago and just now got to debugging it a little), Shepherd gets
>> stuck at 100% CPU usage "early" on first boot.
>
> I have this issue on all Guix systems without a (working) RTC. It seems
> to be caused by a recentish update to guile-fibers:
>
> https://github.com/wingo/fibers/issues/89

Yeah, that’s the one.

Liliana, Timotej: could you try the Guix patch I posted at
<https://issues.guix.gnu.org/64966>?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#65306; Package guix. (Sat, 02 Sep 2023 21:42:01 GMT)

Message #20 received at 65306 <at> debbugs.gnu.org:

From: Liliana Marie Prikler <liliana.prikler <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>, Timotej Lazar
 <timotej.lazar <at> araneo.si>
Cc: 65306 <at> debbugs.gnu.org
Subject: Re: bug#65306: [shepherd] ntpd throws shepherd out of the loop
Date: Sat, 02 Sep 2023 23:41:16 +0200
On Saturday, 2 Sep 2023 at 22:44 +0200, Ludovic Courtès wrote:
> Hi,
> 
> Timotej Lazar <timotej.lazar <at> araneo.si> writes:
> 
> > Liliana Marie Prikler <liliana.prikler <at> gmail.com> [2023-08-15
> > 07:18:02+0200]:
> > > As of recently (maybe it dates back longer, but I first
> > > experienced it two weeks ago and just now got to debugging it a
> > > little), Shepherd gets stuck at 100% CPU usage "early" on first
> > > boot.
> > 
> > I have this issue on all Guix systems without a (working) RTC. It
> > seems to be caused by a recentish update to guile-fibers:
> > 
> > https://github.com/wingo/fibers/issues/89
> 
> Yeah, that’s the one.
> 
> Liliana, Timotej: could you try the Guix patch I posted at
> <https://issues.guix.gnu.org/64966>?
Do we have a guide on how to swap out shepherd from the config.scm? 
The machine that experiences this fault isn't set up for Guix hacking.

Cheers




Information forwarded to bug-guix <at> gnu.org:
bug#65306; Package guix. (Sun, 03 Sep 2023 19:59:01 GMT)

Message #23 received at 65306 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Liliana Marie Prikler <liliana.prikler <at> gmail.com>
Cc: Timotej Lazar <timotej.lazar <at> araneo.si>, 65306 <at> debbugs.gnu.org
Subject: Re: bug#65306: [shepherd] ntpd throws shepherd out of the loop
Date: Sun, 03 Sep 2023 21:58:08 +0200
Hi,

Liliana Marie Prikler <liliana.prikler <at> gmail.com> writes:

> On Saturday, 2 Sep 2023 at 22:44 +0200, Ludovic Courtès wrote:

[...]

>> Liliana, Timotej: could you try the Guix patch I posted at
>> <https://issues.guix.gnu.org/64966>?
> Do we have a guide on how to swap out shepherd from the config.scm? 
> The machine that experiences this fault isn't set up for Guix hacking.

You can do something like this in your OS config:

  (essential-services
   (modify-services (operating-system-default-essential-services
                     this-operating-system)
     (shepherd-root-service-type
      config => (shepherd-configuration
                 (shepherd insert-custom-shepherd-here)))))

(Initially mentioned at
<https://lists.gnu.org/archive/html/guix-devel/2023-04/msg00396.html>.)
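
For context, here is a sketch of where that fragment might sit in a
complete config.scm.  It is an illustration under assumptions, not a
tested configuration: `my-shepherd' is a hypothetical name standing in
for whichever patched package is being tried, the remaining
operating-system fields are elided, and the `(inherit config)' clause is
an addition here, meant to keep the other fields of the default
configuration unchanged.

```scheme
;; Sketch only.  Assumption: `my-shepherd' is bound elsewhere to the
;; Shepherd package object you want to test; Guix does not define it.
(use-modules (gnu)
             (gnu services shepherd)) ;shepherd-root-service-type,
                                      ;shepherd-configuration

(operating-system
  ;; ... host-name, bootloader, file-systems, services, and so on ...
  (essential-services
   (modify-services (operating-system-default-essential-services
                     this-operating-system)
     (shepherd-root-service-type
      config => (shepherd-configuration
                 (inherit config)           ;keep the other fields as-is
                 (shepherd my-shepherd))))))
```

After reconfiguring, the replacement Shepherd should only take over as
PID 1 on the next boot.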

HTH!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#65306; Package guix. (Mon, 04 Sep 2023 05:47:01 GMT)

Message #26 received at 65306 <at> debbugs.gnu.org:

From: Timotej Lazar <timotej.lazar <at> araneo.si>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Liliana Marie Prikler <liliana.prikler <at> gmail.com>, 65306 <at> debbugs.gnu.org
Subject: Re: bug#65306: [shepherd] ntpd throws shepherd out of the loop
Date: Mon, 04 Sep 2023 07:46:03 +0200
Ludovic Courtès <ludo <at> gnu.org> [2023-09-02 22:44:03+0200]:
> Liliana, Timotej: could you try the Guix patch I posted at
> <https://issues.guix.gnu.org/64966>?

That patch works for my aarch64 board. I encounter the same issue on an
x86_64 system without a functional RTC, but at least now I know how to
apply a workaround. Thanks!




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Fri, 08 Sep 2023 16:51:01 GMT)

Notification sent to Liliana Marie Prikler <liliana.prikler <at> gmail.com>:
bug acknowledged by developer. (Fri, 08 Sep 2023 16:51:01 GMT)

Message #31 received at 65306-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Timotej Lazar <timotej.lazar <at> araneo.si>
Cc: 65306-done <at> debbugs.gnu.org,
 Liliana Marie Prikler <liliana.prikler <at> gmail.com>
Subject: Re: bug#65306: [shepherd] ntpd throws shepherd out of the loop
Date: Fri, 08 Sep 2023 18:50:30 +0200
Timotej Lazar <timotej.lazar <at> araneo.si> writes:

> Ludovic Courtès <ludo <at> gnu.org> [2023-09-02 22:44:03+0200]:
>> Liliana, Timotej: could you try the Guix patch I posted at
>> <https://issues.guix.gnu.org/64966>?
>
> That patch works for my aarch64 board. I encounter the same issue on an
> x86_64 system without a functional RTC, but at least now I know how to
> apply a workaround. Thanks!

Right.  I’ve committed a variant of this patch (will push shortly).

Thanks for testing!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 07 Oct 2023 11:24:05 GMT)

This bug report was last modified 1 year and 217 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.