GNU bug report logs - #72166
Shepherd periodically goes unresponsive on one of my machines

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Reported by: "Jonathan Frederickson" <jonathan@HIDDEN>; dated Thu, 18 Jul 2024 00:44:01 UTC; Maintainer for guix is bug-guix@HIDDEN.

Message received at 72166 <at> debbugs.gnu.org:


From: Ludovic Courtès <ludo@HIDDEN>
To: "Jonathan Frederickson" <jonathan@HIDDEN>
Subject: Re: bug#72166: Shepherd periodically goes unresponsive on one of my
 machines
In-Reply-To: <7974c622-e7d8-48b3-9948-14e8d7654793@HIDDEN> (Jonathan
 Frederickson's message of "Fri, 19 Jul 2024 12:25:37 -0400")
References: <df6e8894-fd84-446f-a67f-50cdcc9de5b3@HIDDEN>
 <878qxxtmwu.fsf@HIDDEN>
 <7974c622-e7d8-48b3-9948-14e8d7654793@HIDDEN>
Date: Mon, 22 Jul 2024 09:14:29 +0200
Message-ID: <87zfq9kiei.fsf@HIDDEN>

Hi,

"Jonathan Frederickson" <jonathan@HIDDEN> skribis:

> Hi Ludo, thanks for the troubleshooting help. Looks like I'm running 0.10.4:
>
> jfred@terracard ~$ cat /proc/1/cmdline | xargs -0
> /gnu/store/bhynhk0c6ssq3fqqc59fvhxjzwywsjbb-guile-3.0.9/bin/guile --no-auto-compile /gnu/store/39li5qpiaj1lx89xgahlbgvfnjhpcpwg-shepherd-0.10.4/bin/shepherd --config /gnu/store/hfyri6ygfdjq4w3nkha2ypa2k98hhfxj-shepherd.conf
>
> I see now that 0.10.5 was released a few weeks ago, does that have a fix that could be related?

Yes, it could be related.  Per the ‘NEWS’ file of Shepherd:

  ** ‘herd unload root SERVICE’ no longer hangs when there’s a replacement
     (<https://issues.guix.gnu.org/71478>)

  It used to be that, for a running service S that has a replacement
  registered, ‘herd unload root S’ would hang shepherd, making it totally
  unresponsive: ‘herd status’, ‘halt’, etc. would hang forever, and
  inetd-style services would no longer start, etc.  This is now fixed.

Depending on previous ‘guix system reconfigure’ invocations on these
machines, it’s possible that you ended up in this state.

Would be great if you could upgrade and see if the problem still occurs.

Thanks,
Ludo’.




Information forwarded to bug-guix@HIDDEN:
bug#72166; Package guix. Full text available.

Message received at 72166 <at> debbugs.gnu.org:


Message-Id: <7974c622-e7d8-48b3-9948-14e8d7654793@HIDDEN>
In-Reply-To: <878qxxtmwu.fsf@HIDDEN>
References: <df6e8894-fd84-446f-a67f-50cdcc9de5b3@HIDDEN>
 <878qxxtmwu.fsf@HIDDEN>
Date: Fri, 19 Jul 2024 12:25:37 -0400
From: "Jonathan Frederickson" <jonathan@HIDDEN>
To: Ludovic Courtès <ludo@HIDDEN>
Subject: Re: bug#72166: Shepherd periodically goes unresponsive on one of my
 machines

On Fri, Jul 19, 2024, at 11:35 AM, Ludovic Courtès wrote:
> Hi Jonathan,
>
> "Jonathan Frederickson" <jonathan@HIDDEN> skribis:
>
> > I've been running into an issue with Shepherd on one of my machines. Every so often (and I haven't figured out what conditions trigger it), my Shepherd instances (both home and PID 1) will go unresponsive. I thought I had tracked it down to a misbehaving home service that I had configured, but it's just happened again without that service running.
> >
> > 'herd status' hangs indefinitely:
> >
> > jfred@terracard ~$ sudo herd status
> > Password:
> > <never returns>
> >
> > ...on both instances:
> >
> > jfred@terracard ~$ herd status
> > <never returns>
>
> Ouch.  What version of shepherd is running?  (You can view it with
> “cat /proc/1/cmdline | xargs -0”.)
>
> > The PID 1 shepherd instance isn't reaping defunct processes:
> >
> > jfred@terracard ~$ ps aux | grep -i lock
> > jfred      541  0.0  0.0   3700  2304 ?        S    18:30   0:00 swayidle -w timeout 300 swaylock -f -i ~/.wallpapers/user-manual.jpg timeout 10 if pgrep swaylock; then swaymsg "output * dpms off"; fi resume swaymsg "output * dpms on" before-sleep swaylock -f -i ~/.wallpapers/user-manual.jpg
> > jfred     3111  0.0  0.0      0     0 ?        Z    18:53   0:00 [swaylock] <defunct>
> > jfred     3112  0.0  0.0      0     0 ?        Zs   18:53   0:00 [swaylock] <defunct>
> >
> > Some further troubleshooting... strace indicates that it's waiting on a read() on its fd 9:
>
> Interesting.  There were bugs in earlier 0.10.x versions that could cause
> this sort of thing; let’s see what version you have, first.
>
> Ludo’.
>

Hi Ludo, thanks for the troubleshooting help. Looks like I'm running 0.10.4:

jfred@terracard ~$ cat /proc/1/cmdline | xargs -0
/gnu/store/bhynhk0c6ssq3fqqc59fvhxjzwywsjbb-guile-3.0.9/bin/guile --no-auto-compile /gnu/store/39li5qpiaj1lx89xgahlbgvfnjhpcpwg-shepherd-0.10.4/bin/shepherd --config /gnu/store/hfyri6ygfdjq4w3nkha2ypa2k98hhfxj-shepherd.conf

I see now that 0.10.5 was released a few weeks ago, does that have a fix that could be related?
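
The version number in the command line above can also be extracted mechanically. This is only a minimal sketch, not something posted in this thread: the function name shepherd_version is made up for illustration, and it assumes the store item is laid out as /gnu/store/HASH-shepherd-VERSION/bin/shepherd, as in the output above.

```shell
# Hypothetical helper (not part of Shepherd or Guix): print the Shepherd
# version found in a NUL-separated command line read from stdin.
# Assumption: one argument has the form
#   /gnu/store/HASH-shepherd-VERSION/bin/shepherd
shepherd_version() {
    # Turn the NUL separators into newlines, one argument per line.
    tr '\0' '\n' |
        # Print only the VERSION part of the matching argument.
        sed -n 's|^/gnu/store/[^/]*-shepherd-\([0-9][0-9.]*\)/bin/shepherd$|\1|p' |
        # Keep the first match in case the path appears more than once.
        head -n 1
}
```

On a machine where PID 1 is shepherd, it would presumably be called as "shepherd_version < /proc/1/cmdline". It prints nothing when no argument matches the assumed layout.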




Information forwarded to bug-guix@HIDDEN:
bug#72166; Package guix. Full text available.

Message received at 72166 <at> debbugs.gnu.org:


From: Ludovic Courtès <ludo@HIDDEN>
To: "Jonathan Frederickson" <jonathan@HIDDEN>
Subject: Re: bug#72166: Shepherd periodically goes unresponsive on one of my
 machines
In-Reply-To: <df6e8894-fd84-446f-a67f-50cdcc9de5b3@HIDDEN> (Jonathan
 Frederickson's message of "Wed, 17 Jul 2024 20:43:15 -0400")
References: <df6e8894-fd84-446f-a67f-50cdcc9de5b3@HIDDEN>
Date: Fri, 19 Jul 2024 17:35:29 +0200
Message-ID: <878qxxtmwu.fsf@HIDDEN>

Hi Jonathan,

"Jonathan Frederickson" <jonathan@HIDDEN> skribis:

> I've been running into an issue with Shepherd on one of my machines. Every so often (and I haven't figured out what conditions trigger it), my Shepherd instances (both home and PID 1) will go unresponsive. I thought I had tracked it down to a misbehaving home service that I had configured, but it's just happened again without that service running.
>
> 'herd status' hangs indefinitely:
>
> jfred@terracard ~$ sudo herd status
> Password:
> <never returns>
>
> ...on both instances:
>
> jfred@terracard ~$ herd status
> <never returns>

Ouch.  What version of shepherd is running?  (You can view it with
“cat /proc/1/cmdline | xargs -0”.)

> The PID 1 shepherd instance isn't reaping defunct processes:
>
> jfred@terracard ~$ ps aux | grep -i lock
> jfred      541  0.0  0.0   3700  2304 ?        S    18:30   0:00 swayidle -w timeout 300 swaylock -f -i ~/.wallpapers/user-manual.jpg timeout 10 if pgrep swaylock; then swaymsg "output * dpms off"; fi resume swaymsg "output * dpms on" before-sleep swaylock -f -i ~/.wallpapers/user-manual.jpg
> jfred     3111  0.0  0.0      0     0 ?        Z    18:53   0:00 [swaylock] <defunct>
> jfred     3112  0.0  0.0      0     0 ?        Zs   18:53   0:00 [swaylock] <defunct>
>
> Some further troubleshooting... strace indicates that it's waiting on a read() on its fd 9:

Interesting.  There were bugs in earlier 0.10.x versions that could cause
this sort of thing; let’s see what version you have, first.

Ludo’.




Information forwarded to bug-guix@HIDDEN:
bug#72166; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


Message-Id: <df6e8894-fd84-446f-a67f-50cdcc9de5b3@HIDDEN>
Date: Wed, 17 Jul 2024 20:43:15 -0400
From: "Jonathan Frederickson" <jonathan@HIDDEN>
To: bug-guix@HIDDEN
Subject: Shepherd periodically goes unresponsive on one of my machines

I've been running into an issue with Shepherd on one of my machines. Every so often (and I haven't figured out what conditions trigger it), my Shepherd instances (both home and PID 1) will go unresponsive. I thought I had tracked it down to a misbehaving home service that I had configured, but it's just happened again without that service running.

'herd status' hangs indefinitely:

jfred@terracard ~$ sudo herd status
Password: 
<never returns>

...on both instances:

jfred@terracard ~$ herd status
<never returns>

The PID 1 shepherd instance isn't reaping defunct processes:

jfred@terracard ~$ ps aux | grep -i lock
jfred      541  0.0  0.0   3700  2304 ?        S    18:30   0:00 swayidle -w timeout 300 swaylock -f -i ~/.wallpapers/user-manual.jpg timeout 10 if pgrep swaylock; then swaymsg "output * dpms off"; fi resume swaymsg "output * dpms on" before-sleep swaylock -f -i ~/.wallpapers/user-manual.jpg
jfred     3111  0.0  0.0      0     0 ?        Z    18:53   0:00 [swaylock] <defunct>
jfred     3112  0.0  0.0      0     0 ?        Zs   18:53   0:00 [swaylock] <defunct>
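
A zombie stays in the process table until its parent waits for it, so it can help to list defunct processes together with their parent PID. The following is a rough sketch only: list_zombies is a hypothetical name, and it assumes a procps-style ps that accepts "-eo pid=,ppid=,stat=,comm=" to print header-less "PID PPID STAT COMMAND" lines.

```shell
# Hypothetical helper: read "PID PPID STAT COMMAND" lines on stdin and
# print the ones whose state starts with Z (zombie / defunct).
list_zombies() {
    awk '$3 ~ /^Z/ { printf "%s (parent %s): %s\n", $1, $2, $4 }'
}

# Possible use (assumes a procps-style ps):
#   ps -eo pid=,ppid=,stat=,comm= | list_zombies
```

The parent column shows which process is expected to reap each zombie. Orphaned processes are re-parented, normally to PID 1, so an unresponsive PID 1 would leave them defunct.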

Some further troubleshooting... strace indicates that it's waiting on a read() on its fd 9:

jfred@terracard ~ [env]$ sudo strace -fp 1
Password: 
strace: Process 1 attached with 5 threads
[pid   144] read(9,  <unfinished ...>
[pid   142] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid   141] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid   140] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY^

...which seems to be:

jfred@terracard ~ [env]$ sudo ls -l /proc/1/fd/9
lr-x------ 1 root root 64 Jul 17 20:39 /proc/1/fd/9 -> 'pipe:[4015]'
jfred@terracard ~ [env]$ sudo lsof -n | grep 4015
lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
      Output information may be incomplete.
shepherd     1                      root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1                      root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  140 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  140 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  141 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  141 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  142 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  142 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  144 shepherd        root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  144 shepherd        root   11w     FIFO               0,15       0t0       4015 pipe

My system configuration for this machine can be found here: https://github.com/jfrederickson/dotfiles/blob/master/guix/guix/system/machines/terracard/config.scm

I last ran 'guix pull' on June 21.

Has anyone else run into this?
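
The lsof listing above shows PID 1 holding both the read end (fd 9) and the write end (fd 11) of pipe 4015. That check can be sketched as a small filter over lsof output. This is an illustration under assumptions, not a tool from this thread: pipes_with_both_ends is a made-up name, and the parsing assumes the trailing lsof columns are FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME, as in the output above.

```shell
# Hypothetical helper: read lsof output on stdin and report every pipe
# for which a single PID holds both a readable and a writable descriptor.
# Assumption: the last six columns are FD TYPE DEVICE SIZE/OFF NODE NAME.
pipes_with_both_ends() {
    awk '
        NF >= 6 && $NF == "pipe" && $(NF - 4) == "FIFO" {
            fd = $(NF - 5)
            # FD looks like "9r" or "11w": digits, then the access mode.
            if (!match(fd, /^[0-9]+[rwu]/)) next
            mode = substr(fd, RLENGTH, 1)
            key = $2 " " $(NF - 1)          # "PID NODE"
            if (mode != "w") readers[key] = 1   # r or u: can read
            if (mode != "r") writers[key] = 1   # w or u: can write
        }
        END {
            for (key in readers)
                if (key in writers) {
                    split(key, part, " ")
                    printf "pid %s holds both ends of pipe %s\n", part[1], part[2]
                }
        }
    '
}

# Possible use:
#   sudo lsof -n | pipes_with_both_ends
```

Holding both ends of a pipe is not a bug in itself, since programs commonly keep an internal pipe for wakeups. It does mean that a read() on the pipe never sees end-of-file, so it returns only once the process itself writes to the other end.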




Acknowledgement sent to "Jonathan Frederickson" <jonathan@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#72166; Package guix. Full text available.
Last modified: Mon, 22 Jul 2024 07:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.