GNU bug report logs - #77390
[Shepherd] Timer spawns many commands on DST change (CET -> CEST)

Previous Next

Package: guix;

Reported by: "Timo Wilken" <guix <at> twilken.net>

Date: Sun, 30 Mar 2025 20:05:02 UTC

Severity: important

Merged with 77401

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 77390 in the body.
You can then email your comments to 77390 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#77390; Package guix. (Sun, 30 Mar 2025 20:05:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Timo Wilken" <guix <at> twilken.net>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sun, 30 Mar 2025 20:05:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Timo Wilken" <guix <at> twilken.net>
To: <bug-guix <at> gnu.org>
Subject: Shepherd: bug in timer scheduling on daylight savings time change
Date: Sun, 30 Mar 2025 22:04:07 +0200
Hi,

Last night, my server's time changed from CET to CEST automatically, as it is
set to Europe/Berlin time.

I have a Shepherd timer set to run every night at 02:00 local time:
https://cgit.twilken.net/dotfiles/tree/tw/system.scm?h=cd39f5d3340c0bcd59bb80d3ff4f12741db75e07#n136

--8<---------------cut here
   (simple-service 'disk-maintenance shepherd-root-service-type
     (list (shepherd-service
            (provision '(guix-gc))
            (requirement '(user-processes file-systems))
            (documentation "Delete unused files from Guix' store.")
            (modules '((shepherd service timer)))
            (start #~(make-timer-constructor
                      (calendar-event #:hours '(2) #:minutes '(0))
                      (command (list "guix" "gc" "-d" "2w"))))
            (stop #~(make-timer-destructor))
            (actions (list (shepherd-action
                            (name 'trigger) (procedure #~trigger-timer)
                            (documentation "Collect garbage now.")))))))
--8<---------------cut here

Last night, this hit the exact moment of the CET/CEST change and seems to have
confused the Shepherd a lot, to the point where it kept spamming the same
message to /var/log/messages, and used up 80% of my system's total memory,
according to htop. (This led to various things getting oom-killed, and the
system grinding to a halt until I had a look at it in the afternoon.)

First, at 01:00, the Shepherd spawned about 30 processes for this one timer:

--8<---------------cut here
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1691.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'nextcloud-cron' spawned process 1692.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for nextcloud-cron.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'gpsplot' spawned process 1693.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for gpsplot.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'dynamic-dns-ipv4.api.mythic-beasts.com' spawned process 1694.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for dynamic-dns-ipv4.api.mythic-beasts.com.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1698.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1703.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1713.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1715.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1723.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1726.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1732.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1736.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1740.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1750.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1754.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1761.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1766.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1768.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1778.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1785.
2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1791.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1806.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1811.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1818.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1822.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1829.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1838.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1848.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1851.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1862.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1872.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1876.
2025-03-30 01:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 01:00:02 localhost shepherd[1]: Timer 'guix-gc' spawned process 1883.
2025-03-30 01:00:02 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 03:00:00 localhost linux: [46479.948177] perf: interrupt took too long (2504 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
2025-03-30 03:00:01 localhost linux: [47490.137060] perf: interrupt took too long (3138 > 3130), lowering kernel.perf_event_max_sample_rate to 63500
2025-03-30 03:00:01 localhost shepherd[1]: Timer 'guix-gc' spawned process 1887.
2025-03-30 03:00:01 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 03:00:02 localhost shepherd[1]: Timer 'guix-gc' spawned process 1896.
2025-03-30 03:00:02 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 03:00:03 localhost shepherd[1]: Timer 'guix-gc' spawned process 1910.
2025-03-30 03:00:03 localhost shepherd[1]: Registering new logger for guix-gc.
2025-03-30 03:00:03 localhost shepherd[1]: Timer 'guix-gc' spawned process 1915.
--8<---------------cut here

I then get 731(!!) lines like the following over the course of 3 seconds:

--8<---------------cut here
2025-03-30 03:00:03 localhost shepherd[1]: Waiting anew for timer 'guix-gc' (resuming from sleep state?).
[...]
2025-03-30 03:00:06 localhost shepherd[1]: Waiting anew for timer 'guix-gc' (resuming from sleep state?).
--8<---------------cut here

The Shepherd then gets more confused and starts spamming the same message over
and over again, but now without newlines, leading to lines like:

--8<---------------cut here
2025-03-30 03:00:06 localhost shepherd[1]: Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for
2025-03-30 03:00:06 localhost shepherd[1]:  timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc' (resuming from sleep state?).Waiting anew for timer 'guix-gc'
--8<---------------cut here

Then there's some output from the various "guix gc" processes spawned by the
timer, which of course all step on each other's toes. The Shepherd keeps
happily writing the above lines out in large blocks.

It goes on like this until the afternoon, when the OOM kills start (probably
when I start SSH'ing in to see what's going on):

--8<---------------cut here
2025-03-30 15:21:44 localhost linux: [93869.155828] sshd-session invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
--8<---------------cut here

After a bit of that, I decide to reboot the machine.

I've saved the system logs from that night - let me know if you need more details.

I'm running Shepherd 1.0.3, as packaged in Guix:

--8<---------------cut here
Generation 235	Mar 29 2025 12:07:15	(current)
  file name: /var/guix/profiles/system-235-link
  canonical file name: /gnu/store/p7infvmm678nysvdg3xvxlr83xxkj66j-system
  label: GNU with Linux-Libre 6.13.8
  bootloader: grub-efi
  root device: UUID: c63af3e6-3c2b-43d2-b1e6-944f09a10e0f
  kernel: /gnu/store/d7z9027yrs1vfmpvds10iqfl69cyvixn-linux-libre-6.13.8/bzImage
  channels:
    tw:
      repository URL: https://git.twilken.net/dotfiles
      branch: master
      commit: 39abf96d68ce500abbd55f9bdbd68e1c63282d36
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      branch: master
      commit: 67f4a2feadbb6cc04a61b5a3bc7b8f4bace2a0bb
    [non-free channels omitted]
--8<---------------cut here

Cheers,
Timo




Merged 77390 77401. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 31 Mar 2025 13:11:05 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#77390; Package guix. (Mon, 31 Mar 2025 13:18:04 GMT) Full text and rfc822 format available.

Message #10 received at 77390 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: "Timo Wilken" <guix <at> twilken.net>
Cc: 77390 <at> debbugs.gnu.org
Subject: Re: bug#77390: Shepherd: bug in timer scheduling on daylight
 savings time change
Date: Mon, 31 Mar 2025 15:17:00 +0200
Hi Timo,

"Timo Wilken" <guix <at> twilken.net> skribis:

> Last night, this hit the exact moment of the CET/CEST change and seems to have
> confused the Shepherd a lot, to the point where it kept spamming the same
> message to /var/log/messages, and used up 80% of my system's total memory,
> according to htop. (This led to various things getting oom-killed, and the
> system grinding to a halt until I had a look at it in the afternoon.)
>
> First, at 01:00, the Shepherd spawned about 30 processes for this one timer:
>
> --8<---------------cut here
> 2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1691.
> 2025-03-30 01:00:00 localhost shepherd[1]: Timer 'nextcloud-cron' spawned process 1692.
> 2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for nextcloud-cron.
> 2025-03-30 01:00:00 localhost shepherd[1]: Timer 'gpsplot' spawned process 1693.
> 2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for gpsplot.
> 2025-03-30 01:00:00 localhost shepherd[1]: Timer 'dynamic-dns-ipv4.api.mythic-beasts.com' spawned process 1694.
> 2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for dynamic-dns-ipv4.api.mythic-beasts.com.
> 2025-03-30 01:00:00 localhost shepherd[1]: Timer 'guix-gc' spawned process 1698.
> 2025-03-30 01:00:00 localhost shepherd[1]: Registering new logger for guix-gc.

I saw the exact same problem on berlin (aka. ci.guix.gnu.org):

  https://issues.guix.gnu.org/77401

but not on the other servers I co-administer; I suspect the difference
is that these other servers did not have a timer running at 01:00.

So I guess that the fix for <https://issues.guix.gnu.org/75622> was
bogus somehow?

To be continued…

Ludo’.




Severity set to 'important' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 31 Mar 2025 13:19:04 GMT) Full text and rfc822 format available.

Changed bug title to '[Shepherd] Timer spawns many commands on DST change (CET -> CEST)' from 'Shepherd: bug in timer scheduling on daylight savings time change' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 31 Mar 2025 20:30:02 GMT) Full text and rfc822 format available.

Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Mon, 31 Mar 2025 21:23:02 GMT) Full text and rfc822 format available.

Notification sent to "Timo Wilken" <guix <at> twilken.net>:
bug acknowledged by developer. (Mon, 31 Mar 2025 21:23:03 GMT) Full text and rfc822 format available.

Message #19 received at 77390-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: "Timo Wilken" <guix <at> twilken.net>
Cc: 77401-done <at> debbugs.gnu.org, 77390-done <at> debbugs.gnu.org
Subject: Re: bug#77401: [Shepherd] Timer spawns too many commands
Date: Mon, 31 Mar 2025 23:21:56 +0200
Hi Timo,

"Timo Wilken" <guix <at> twilken.net> skribis:

> Last night, this hit the exact moment of the CET/CEST change and seems to have
> confused the Shepherd a lot, to the point where it kept spamming the same
> message to /var/log/messages, and used up 80% of my system's total memory,
> according to htop. (This led to various things getting oom-killed, and the
> system grinding to a halt until I had a look at it in the afternoon.)
>
> First, at 01:00, the Shepherd spawned about 30 processes for this one timer:

This is now fixed:

  https://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=b36e97a730596dbf3c376f940150d5fd1f08ecf1

Let me know if you find something fishy.

The irony is that the work I did for <https://issues.guix.gnu.org/75622>
helped, but the one test that would have caught this issue was wrong: it
was looking at the wrong date when testing the CET->CEST change.

Setting #:wait-for-termination? #true would have mitigated this bug (it
wouldn’t have triggered).

Thanks,
Ludo’.




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Mon, 31 Mar 2025 21:23:03 GMT) Full text and rfc822 format available.

Notification sent to Ludovic Courtès <ludo <at> gnu.org>:
bug acknowledged by developer. (Mon, 31 Mar 2025 21:23:03 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 29 Apr 2025 11:24:13 GMT) Full text and rfc822 format available.

This bug report was last modified 14 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.