GNU bug report logs - #79492
Shepherd catastrophic memory leak
To reply to this bug, email your comments to 79492 AT debbugs.gnu.org.
Report forwarded to bug-guix <at> gnu.org: bug#79492; Package guix. (Mon, 22 Sep 2025 19:24:02 GMT)
Acknowledgement sent to "Zack Weinberg" <zack <at> owlfolio.org>: New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 22 Sep 2025 19:24:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I left my Guix System-based web server running for 26 days and PID 1 has
ballooned to consume 75% of all available RAM. Because of this, it can
no longer fork. Which, in turn, means the system is almost but not quite
dead in the water. Daemons that are already running, such as the actual
web server, are fine, but any transient service -- like ssh -- won't
start. I could log in on the console, because getty was already
running, but `reboot` just hangs, and if I log out I expect it won't be
able to start another getty process.
Here is some relevant troubleshooting info:
# uptime
19:08:57 up 26 days 20:17, 1 user, load average: 0.01, 0.02, 0.00
# free
               total        used        free      shared  buff/cache   available
Mem:         2020468     1768960      103008        6472      307064      251508
Swap:        2094056      168268     1925788
# ps -p 1 lc
F   UID   PID  PPID PRI  NI     VSZ     RSS WCHAN  STAT TTY        TIME COMMAND
4     0     1     0  20   0 1988980 1528612 do_epo Sl   ?        175:14 shepher
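The "75% of all available RAM" figure can be checked directly from the `free` and `ps` output above. The following is a minimal sketch in Python that uses only the numbers quoted in this report; it assumes both tools report in KiB, which is their default, and the helper name `share` is purely illustrative.

```python
# Figures copied from the `free` and `ps -p 1 lc` output in this report (KiB).
MEM_TOTAL_KIB = 2020468   # "Mem: total" column of free
PID1_RSS_KIB = 1528612    # RSS column of ps for PID 1
PID1_VSZ_KIB = 1988980    # VSZ column of ps for PID 1


def share(part_kib: int, whole_kib: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part_kib / whole_kib


rss_pct = share(PID1_RSS_KIB, MEM_TOTAL_KIB)
vsz_pct = share(PID1_VSZ_KIB, MEM_TOTAL_KIB)

print(f"PID 1 resident set: {rss_pct:.1f}% of physical RAM")
print(f"PID 1 virtual size: {vsz_pct:.1f}% of physical RAM")
```

The resident set alone comes to roughly three quarters of physical memory, which matches the description in the report.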
# grep -v MARK messages
2025-09-14 22:00:48 localhost shepherd[1]: Rotating '/var/log/messages' to '/var/log/messages.1'.
2025-09-14 22:00:48 localhost linux: [1638517.256304] __vm_enough_memory: pid: 1, comm: shepherd, bytes: 8388608 not enough memory for the allocation
2025-09-14 22:00:48 localhost shepherd[1]: Exception caught while calling action of timer 'log-rotation': (system-error "primitive-fork" "~A" ("Cannot allocate memory") (12))
2025-09-22 19:06:33 localhost shepherd[1]: Stopping service root...
2025-09-22 19:06:33 localhost shepherd[1]: Exiting shepherd...
2025-09-22 19:06:33 localhost shepherd[1]: Service guix-ownership is not running.
2025-09-22 19:06:33 localhost shepherd[1]: Service user-homes is not running.
2025-09-22 19:06:33 localhost shepherd[1]: Stopping service swap-7cb6821e-5fbb-48b1-85f8-74b4c41e9b7f...
2025-09-22 19:06:33 localhost linux: [2319321.058327] __vm_enough_memory: pid: 1, comm: shepherd, bytes: 2144313344 not enough memory for the allocation
2025-09-22 19:06:33 localhost shepherd[1]: Ignoring error while stopping swap-7cb6821e-5fbb-48b1-85f8-74b4c41e9b7f: (system-error "swapoff" "~S: ~A" ("/dev/vda2" "Cannot allocate memory") (12))
2025-09-22 19:06:33 localhost shepherd[1]: Service swap-7cb6821e-5fbb-48b1-85f8-74b4c41e9b7f might have failed to stop.
2025-09-22 19:06:33 localhost shepherd[1]: Service swap-7cb6821e-5fbb-48b1-85f8-74b4c41e9b7f is now stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Stopping service ntpd...
2025-09-22 19:06:34 localhost ntpd[134]: ntpd exiting on signal 15 (Terminated)
2025-09-22 19:06:34 localhost shepherd[1]: Service ntpd stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Service ntpd is now stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Stopping service ssh-daemon...
2025-09-22 19:06:34 localhost shepherd[1]: Service ssh-daemon stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Service ssh-daemon is now stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Stopping service certbot-certificate-renewal...
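The two `__vm_enough_memory` messages in the log can be read against the `free` output. The sketch below (Python, using only values quoted in this report) converts the byte counts. The interpretation in the comments, that `swapoff` has to be able to account for the whole swap area before disabling it, is an assumption about kernel behavior and is not stated anywhere in the log.

```python
KIB = 1024
MIB = KIB * KIB

# Values copied from the kernel log lines and the `free` output in this report.
FORK_FAILED_BYTES = 8388608        # allocation refused during primitive-fork
SWAPOFF_FAILED_BYTES = 2144313344  # allocation refused during swapoff
SWAP_TOTAL_KIB = 2094056           # "Swap: total" column of free

fork_failed_mib = FORK_FAILED_BYTES / MIB
swapoff_failed_kib = SWAPOFF_FAILED_BYTES // KIB

# The fork failure concerns a single 8 MiB region.
print(f"fork request:    {fork_failed_mib:.0f} MiB")

# The swapoff failure is exactly the size of the swap area, which is
# consistent with (but does not prove) a check against the whole device.
print(f"swapoff request: {swapoff_failed_kib} KiB")
print(f"equals swap total: {swapoff_failed_kib == SWAP_TOTAL_KIB}")
```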
--
Closely related issue: For situations just such as this, reboot(8) is
supposed to have an option (conventionally `-f/--force`) which causes it
to issue the reboot system call itself, bypassing init. But the
Shepherd's version of reboot is missing this option.
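For illustration, "issuing the reboot system call itself" can be sketched as follows. This is a hypothetical Python helper, not the Shepherd's code and not a proposed patch: the name `force_reboot` and its `dry_run` parameter are invented here. It relies on the glibc `reboot()` wrapper and the `RB_AUTOBOOT` constant from `<sys/reboot.h>`, requires root, and skips the orderly shutdown of services, so unsaved state in running daemons is lost.

```python
import ctypes
import os

# RB_AUTOBOOT from <sys/reboot.h> (LINUX_REBOOT_CMD_RESTART in the kernel headers).
RB_AUTOBOOT = 0x01234567


def force_reboot(dry_run: bool = True) -> int:
    """Restart the machine without asking PID 1 (hypothetical helper).

    With dry_run=True (the default) nothing happens and the command
    constant that would be passed to reboot() is returned.
    """
    if dry_run:
        return RB_AUTOBOOT

    # Flush dirty buffers first; reboot() itself does not sync.
    os.sync()

    # Load the symbols of the running process, which include glibc on Linux.
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.reboot(RB_AUTOBOOT) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return RB_AUTOBOOT
```

Calling `force_reboot(dry_run=False)` as root would restart the machine immediately, so the default is deliberately the harmless dry run.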
--
I was already pretty frustrated with Guix System and this memory leak is
the last straw. This server is shortly going to be reformatted with
another distribution. However, I will preserve a disk image in case it
is useful to anyone.
zw
Information forwarded to bug-guix <at> gnu.org: bug#79492; Package guix. (Mon, 22 Sep 2025 23:49:02 GMT)
Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):
"Zack Weinberg" via Bug reports for GNU Guix <bug-guix <at> gnu.org> writes:
> I left my Guix System-based web server running for 26 days and PID 1 has
> ballooned to consume 75% of all available RAM. Because of this, it can
> no longer fork. Which, in turn, means the system is almost but not
> quite
…
> # free
>                total        used        free      shared  buff/cache   available
> Mem:         2020468     1768960      103008        6472      307064      251508
> Swap:        2094056      168268     1925788
You still have almost all swap free, so you should be able to start
programs (though slowly).
What I found, though, is that SSH can get into trouble when cgroups run
out (which happens quickly if you make heavy use of docker).
When that happens, I delete the unused cgroups:
find /sys/fs/cgroup/ -depth -type d -name 'c*' | xargs -I {} sudo bash -c 'if test "$(cat {}/pids.current)" -eq 0; then echo {}; cat {}/pids.current; rmdir {}; fi'
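The same cleanup can be sketched with a dry-run mode, so the candidates can be inspected before anything is removed. This is a minimal Python sketch under the same assumptions as the one-liner above (cgroup v2 mounted at `/sys/fs/cgroup`, directory names matching `c*`, a `pids.current` file in each cgroup); the function names `empty_cgroups` and `prune` are hypothetical.

```python
import fnmatch
import os
from typing import Iterator, List


def empty_cgroups(root: str = "/sys/fs/cgroup", pattern: str = "c*") -> Iterator[str]:
    """Yield cgroup directories under root whose pids.current is 0.

    Directories are visited children first, like `find -depth`, so a
    parent is only considered after its children.
    """
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        # Never consider the mount point itself.
        if dirpath == root:
            continue
        if not fnmatch.fnmatch(os.path.basename(dirpath), pattern):
            continue
        if "pids.current" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "pids.current")) as fh:
                if int(fh.read().strip()) == 0:
                    yield dirpath
        except (OSError, ValueError):
            # The cgroup vanished or the file was unreadable: skip it.
            continue


def prune(root: str = "/sys/fs/cgroup", pattern: str = "c*",
          dry_run: bool = True) -> List[str]:
    """Return the empty cgroups; remove them unless dry_run is set."""
    selected = []
    for path in empty_cgroups(root, pattern):
        if not dry_run:
            try:
                os.rmdir(path)
            except OSError:
                # Still busy or has child cgroups: leave it alone.
                continue
        selected.append(path)
    return selected
```

Running `prune()` lists what would be removed; `prune(dry_run=False)` as root performs the removal.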
Best wishes,
Arne
--
Being apolitical
means being political
without noticing it.
draketo.de
Information forwarded to bug-guix <at> gnu.org: bug#79492; Package guix. (Tue, 23 Sep 2025 06:45:02 GMT)
Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi Zack,
"Zack Weinberg" via Bug reports for GNU Guix <bug-guix <at> gnu.org> writes:
> I left my Guix System-based web server running for 26 days and PID 1 has
> ballooned to consume 75% of all available RAM. Because of this, it can
> no longer fork.
This is being tracked at
<https://codeberg.org/shepherd/shepherd/issues/1>.
It would seem a workaround is to use Inetutils syslogd instead of the
built-in ‘system-log’:
--8<---------------cut here---------------start------------->8---
(operating-system
  ;; …
  (services (append (list …
                          (service syslog-service-type))
                    (modify-services %base-services
                      (delete shepherd-system-log-service-type)))))
--8<---------------cut here---------------end--------------->8---
Ludo’.
Information forwarded to bug-guix <at> gnu.org: bug#79492; Package guix. (Tue, 23 Sep 2025 07:55:03 GMT)
Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):
"Dr. Arne Babenhauserheide" <arne_bab <at> web.de> writes:
> "Zack Weinberg" via Bug reports for GNU Guix <bug-guix <at> gnu.org> writes:
>> I left my Guix System-based web server running for 26 days and PID 1 has
>> ballooned to consume 75% of all available RAM. Because of this, it can
>> no longer fork. Which, in turn, means the system is almost but not
> You still have almost all swap free, so you should be able to start
I have to take back this comment; I didn’t read closely enough.
(Forking duplicates the parent's address space, at least for memory
accounting purposes, so a PID 1 that uses 75% of memory can no longer fork.)
I’m sorry for the noise.
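Whether a fork is refused depends on the kernel's overcommit policy, which can be read from `/proc`. The sketch below (Python) parses `/proc/meminfo` and reports the commit headroom. The thread does not say which overcommit mode the affected server was using, so this is a diagnostic aid rather than an explanation; the function names are hypothetical, and `CommitLimit` is only enforced when `vm.overcommit_memory` is 2.

```python
from typing import Dict

# Meaning of /proc/sys/vm/overcommit_memory values.
OVERCOMMIT_MODES = {0: "heuristic", 1: "always", 2: "never (strict accounting)"}


def parse_meminfo(text: str) -> Dict[str, int]:
    """Map /proc/meminfo field names to their numeric values (KiB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kib(meminfo: Dict[str, int]) -> int:
    """Return CommitLimit minus Committed_AS; negative means over the limit."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]


def report() -> str:
    """Summarize the overcommit mode and commit headroom of this machine."""
    with open("/proc/sys/vm/overcommit_memory") as fh:
        mode = int(fh.read().strip())
    with open("/proc/meminfo") as fh:
        info = parse_meminfo(fh.read())
    label = OVERCOMMIT_MODES.get(mode, "unknown")
    return f"overcommit mode {mode} ({label}), headroom {commit_headroom_kib(info)} KiB"
```

Running `print(report())` on an affected machine would show how close the system is to its commit limit at that moment.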
Best wishes,
Arne
--
Being apolitical
means being political
without noticing it.
draketo.de
Information forwarded to bug-guix <at> gnu.org: bug#79492; Package guix. (Fri, 26 Sep 2025 22:20:01 GMT)
Message #26 received at 79492 <at> debbugs.gnu.org (full text, mbox):
Hi,
Ludovic Courtès <ludo <at> gnu.org> writes:
> It would seem a workaround is to use Inetutils syslogd instead of the
> built-in ‘system-log’:
>
> (operating-system
>   ;; …
>   (services (append (list …
>                           (service syslog-service-type))
>                     (modify-services %base-services
>                       (delete shepherd-system-log-service-type)))))
Thank you for the suggestion. I will give that a try. It would probably
be a good idea to mention it on the Codeberg issue as well.
Have a nice day,
Tomas
--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
This bug report was last modified 39 days ago.