GNU bug report logs - #31785
Multiple client 'build-paths' RPCs can lead to daemon deadlock

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Severity: important; Reported by: ludo@HIDDEN (Ludovic Courtès); dated Mon, 11 Jun 2018 14:07:02 UTC; Maintainer for guix is bug-guix@HIDDEN.

Message received at 31785 <at> debbugs.gnu.org:


Received: (at 31785) by debbugs.gnu.org; 6 Sep 2019 09:05:37 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Sep 06 05:05:36 2019
Received: from localhost ([127.0.0.1]:36406 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1i6ABU-0006rO-LP
	for submit <at> debbugs.gnu.org; Fri, 06 Sep 2019 05:05:36 -0400
Received: from eggs.gnu.org ([209.51.188.92]:48037)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1i6ABS-0006r9-Op
 for 31785 <at> debbugs.gnu.org; Fri, 06 Sep 2019 05:05:35 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:53162)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <ludo@HIDDEN>)
 id 1i6AAh-0004sA-Eh
 for 31785 <at> debbugs.gnu.org; Fri, 06 Sep 2019 05:05:28 -0400
Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (port=37632 helo=ribbon)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <ludo@HIDDEN>) id 1i6AAg-00050u-OG
 for 31785 <at> debbugs.gnu.org; Fri, 06 Sep 2019 05:04:47 -0400
From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
To: 31785 <at> debbugs.gnu.org
Subject: Re: bug#31785: Multiple client 'build-paths' RPCs can lead to daemon
 deadlock
References: <87602ph0yv.fsf@HIDDEN>
Date: Fri, 06 Sep 2019 11:04:43 +0200
In-Reply-To: <87602ph0yv.fsf@HIDDEN> ("Ludovic
 \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\=
 \=\?utf-8\?Q\?s\?\= message of "Mon, 11 Jun 2018 16:06:16 +0200")
Message-ID: <87lfv1eqxg.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 31785
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

Here=E2=80=99s another example from berlin today:

--8<---------------cut here---------------start------------->8---
ludo@berlin$ sudo guix processes
SessionPID: 38649
ClientPID: 38611
ClientCommand: /gnu/store/7x9lrzs2l00mfcr3ya7hg1i06i7lq9pd-profile/bin/guil=
e \ /home/mbakke/guix/master/scripts/guix build --no-grafts --keep-going --=
max-silent-time=3D36000 --timeout=3D216000 -s armhf-linux nmoldyn r-msnid r=
-yamss netcdf-fortran python-netcdf4
LockHeld: /gnu/store/v6hcm1gvv3gdfbnq6cd4kn1i6ip3y74s-netcdf-fortran-4.4.4.=
lock
LockHeld: /gnu/store/ydxl0ws9kk6ix58m99ib7qhi157yq49h-r-msnid-1.18.1.lock
ChildProcess: 33181: /gnu/store/9alic3caqhay3h8mx4iihpmyj6ymqpcx-guile-2.2.=
4/bin/guile \ /gnu/store/vyhfp0gj4rx74yw0ybalzj8bvmqgp9pr-guix-1.0.1-4.c902=
458/bin/.guix-real offload x86_64-linux 36000 1 216000

[...]

SessionPID: 44043
ClientPID: 44023
ClientCommand: /gnu/store/7x9lrzs2l00mfcr3ya7hg1i06i7lq9pd-profile/bin/guil=
e \ /home/mbakke/guix/master/scripts/guix build --no-grafts --keep-going --=
max-silent-time=3D36000 --timeout=3D216000 -s armhf-linux netcdf-fortran r-=
msnid nmoldyn r-yamss
LockHeld: /gnu/store/69jjabypcpx83482rh3yp4a9ncjimw6b-r-yamss-1.10.0.lock
LockHeld: /gnu/store/ibdbwxb1i8yal8ilqxxgnzh6bxpinic1-nmoldyn-3.0.11.lock
ChildProcess: 48997: /gnu/store/9alic3caqhay3h8mx4iihpmyj6ymqpcx-guile-2.2.=
4/bin/guile \ /gnu/store/vyhfp0gj4rx74yw0ybalzj8bvmqgp9pr-guix-1.0.1-4.c902=
458/bin/.guix-real offload x86_64-linux 36000 1 216000

ludo@berlin ~$ sudo strace -p 38649 -e openat,fcntl
strace: Process 38649 attached
openat(AT_FDCWD, "/gnu/store/ibdbwxb1i8yal8ilqxxgnzh6bxpinic1-nmoldyn-3.0.1=
1.lock", O_RDWR|O_CREAT, 0600) =3D 10
fcntl(10, F_GETFD)                      =3D 0
fcntl(10, F_SETFD, FD_CLOEXEC)          =3D 0
fcntl(10, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=3D0, l_l=
en=3D0}) =3D -1 EAGAIN (Resource temporarily unavailable)
openat(AT_FDCWD, "/gnu/store/69jjabypcpx83482rh3yp4a9ncjimw6b-r-yamss-1.10.=
0.lock", O_RDWR|O_CREAT, 0600) =3D 10
fcntl(10, F_GETFD)                      =3D 0
fcntl(10, F_SETFD, FD_CLOEXEC)          =3D 0
fcntl(10, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=3D0, l_l=
en=3D0}) =3D -1 EAGAIN (Resource temporarily unavailable)
^Cstrace: Process 38649 detached

ludo@berlin ~$ sudo strace -p 44043 -e openat,fcntl
strace: Process 44043 attached
openat(AT_FDCWD, "/gnu/store/ydxl0ws9kk6ix58m99ib7qhi157yq49h-r-msnid-1.18.=
1.lock", O_RDWR|O_CREAT, 0600) =3D 10
fcntl(10, F_GETFD)                      =3D 0
fcntl(10, F_SETFD, FD_CLOEXEC)          =3D 0
fcntl(10, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=3D0, l_l=
en=3D0}) =3D -1 EAGAIN (Resource temporarily unavailable)
^Cstrace: Process 44043 detached
--8<---------------cut here---------------end--------------->8---

A very palpable deadlock.

Ludo=E2=80=99.




Information forwarded to bug-guix@HIDDEN:
bug#31785; Package guix. Full text available.
Severity set to 'important' from 'normal' Request was from ludo@HIDDEN (Ludovic Courtès) to control <at> debbugs.gnu.org. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 11 Jun 2018 14:06:43 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Jun 11 10:06:43 2018
Received: from localhost ([127.0.0.1]:43514 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1fSNT1-0008II-Bm
	for submit <at> debbugs.gnu.org; Mon, 11 Jun 2018 10:06:43 -0400
Received: from eggs.gnu.org ([208.118.235.92]:56334)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1fSNSz-0008I6-9a
 for submit <at> debbugs.gnu.org; Mon, 11 Jun 2018 10:06:41 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <ludo@HIDDEN>) id 1fSNSq-0001YW-18
 for submit <at> debbugs.gnu.org; Mon, 11 Jun 2018 10:06:35 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:47059)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <ludo@HIDDEN>) id 1fSNSp-0001YQ-Tg
 for submit <at> debbugs.gnu.org; Mon, 11 Jun 2018 10:06:31 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47293)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <ludo@HIDDEN>) id 1fSNSg-0002pO-VQ
 for bug-guix@HIDDEN; Mon, 11 Jun 2018 10:06:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <ludo@HIDDEN>) id 1fSNSc-0001TT-TN
 for bug-guix@HIDDEN; Mon, 11 Jun 2018 10:06:22 -0400
Received: from fencepost.gnu.org ([2001:4830:134:3::e]:44274)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <ludo@HIDDEN>)
 id 1fSNSc-0001TH-PY
 for bug-guix@HIDDEN; Mon, 11 Jun 2018 10:06:18 -0400
Received: from [193.50.110.75] (port=57708 helo=ribbon)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <ludo@HIDDEN>) id 1fSNSc-0003FS-9d
 for bug-guix@HIDDEN; Mon, 11 Jun 2018 10:06:18 -0400
From: ludo@HIDDEN (Ludovic =?utf-8?Q?Court=C3=A8s?=)
To: bug-guix@HIDDEN
Subject: Multiple client 'build-paths' RPCs can lead to daemon deadlock
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 23 Prairial an 226 de la =?utf-8?Q?R=C3=A9volution?=
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-pc-linux-gnu
Date: Mon, 11 Jun 2018 16:06:16 +0200
Message-ID: <87602ph0yv.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -5.0 (-----)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -6.0 (------)

Hello,

I tried running this:

  guix build --max-jobs=3D200 $(guix gc -R $(guix build -d inkscape --no-gr=
afts) | sort) & \
  guix build --max-jobs=3D200 $(guix gc -R $(guix build -d inkscape --no-gr=
afts) | sort -r)

=E2=80=A6 also in parallel with this (for good measure):

  guix build --max-jobs=3D200 $(guix gc -R $(guix build -d inkscape --no-gr=
afts) | sort -R)

Since we have 3 clients, that leads to 3 guix-daemon processes, and
those are stuck in a deadlock:

--8<---------------cut here---------------start------------->8---
$ ps aux | grep guix-daemon
[=E2=80=A6]
root     20501  0.6  0.0  39292 12012 ?        Ss   15:51   0:01 guix-daemo=
n 20455
root     20503  4.3  0.0  39420 12096 ?        Ss   15:51   0:12 guix-daemo=
n 20491
root     22154  0.0  0.0  39028 11016 ?        Ss   15:52   0:00 guix-daemo=
n 22148
$ sudo strace -p 22154 -t
strace: Process 22154 attached
15:57:03 select(0, 0x7ffdca30b610, NULL, NULL, {tv_sec=3D1, tv_usec=3D31418=
9}) =3D 0 (Timeout)
15:57:04 openat(AT_FDCWD, "/gnu/store/ghc9vcc8ydv305jz0spd73qrrf30gxik-binu=
tils-bootstrap-0.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/5hdpffkgkfszpsin2ir50zscz06x0mb5-core=
utils-8.29-debug.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/aasfcliv85frq3aawpa4rw18nvlvl735-gett=
ext-minimal-0.19.8.1.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/j1kpzcpi1m1ihzwyqb60zd4s74yp85g8-glib=
c-bootstrap-0.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/4llnxsgr6mxxj6pk21d9vbwaxnkbn99k-gnut=
ls-3.5.18-doc.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/7rs9qx86f2w4hjbf9wjm10k4m2vvlbyv-grof=
f-1.22.3-doc.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/d645d919gp1zk8v65qa2r4ia5x1c5n6k-libs=
tdc++-5.5.0.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1=
.4.18.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/p262qk7zsd2wnlal6x1jax2xbqvdddqq-m4-1=
.4.18.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/zvvgr301hz95ac15rp0wsjy5j4cdgkg4-modu=
le-import-compiled.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:57:04 fcntl(8, F_GETFD)              =3D 0
15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:57:04 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:57:04 close(8)                       =3D 0
15:57:04 openat(AT_FDCWD, "/gnu/store/760kj698c9x1vcmlzk1qr8i9hqw961n3-open=
ssl-1.0.2o-doc.lock", O_RDWR|O_CREAT, 0600) =3D 8
^C
$ sudo strace -p 20503 -t
strace: Process 20503 attached
15:58:23 select(445, [15 54 159 177 321 411 414 444], NULL, NULL, {tv_sec=
=3D1, tv_usec=3D595302}) =3D 0 (Timeout)
15:58:25 openat(AT_FDCWD, "/gnu/store/amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1=
.4.18.lock", O_RDWR|O_CREAT, 0600) =3D 14
15:58:25 fcntl(14, F_GETFD)             =3D 0
15:58:25 fcntl(14, F_SETFD, FD_CLOEXEC) =3D 0
15:58:25 fcntl(14, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:58:25 close(14)                      =3D 0
15:58:25 openat(AT_FDCWD, "/gnu/store/zvvgr301hz95ac15rp0wsjy5j4cdgkg4-modu=
le-import-compiled.lock", O_RDWR|O_CREAT, 0600) =3D 14
15:58:25 fcntl(14, F_GETFD)             =3D 0
15:58:25 fcntl(14, F_SETFD, FD_CLOEXEC) =3D 0
15:58:25 fcntl(14, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:58:25 close(14)                      =3D 0
15:58:25 openat(AT_FDCWD, "/gnu/store/760kj698c9x1vcmlzk1qr8i9hqw961n3-open=
ssl-1.0.2o-doc.lock", O_RDWR|O_CREAT, 0600) =3D 14
15:58:25 fcntl(14, F_GETFD)             =3D 0
15:58:25 fcntl(14, F_SETFD, FD_CLOEXEC) =3D 0
15:58:25 fcntl(14, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:58:25 close(14)                      =3D 0
15:58:25 select(445, [15 54 159 177 321 411 414 444], NULL, NULL, {tv_sec=
=3D5, tv_usec=3D0}) =3D 0 (Timeout)
15:58:30 openat(AT_FDCWD, "/gnu/store/amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1=
.4.18.lock", O_RDWR|O_CREAT, 0600) =3D 14
15:58:30 fcntl(14, F_GETFD)             =3D 0
15:58:30 fcntl(14, F_SETFD, FD_CLOEXEC) =3D 0
15:58:30 fcntl(14, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:58:30 close(14)                      =3D 0
15:58:30 openat(AT_FDCWD, "/gnu/store/zvvgr301hz95ac15rp0wsjy5j4cdgkg4-modu=
le-import-compiled.lock", O_RDWR|O_CREAT, 0600) =3D 14
15:58:30 fcntl(14, F_GETFD)             =3D 0
15:58:30 fcntl(14, F_SETFD, FD_CLOEXEC) =3D 0
15:58:30 fcntl(14, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:58:30 close(14)                      =3D 0
15:58:30 openat(AT_FDCWD, "/gnu/store/760kj698c9x1vcmlzk1qr8i9hqw961n3-open=
ssl-1.0.2o-doc.lock", O_RDWR|O_CREAT, 0600) =3D 14
15:58:30 fcntl(14, F_GETFD)             =3D 0
15:58:30 fcntl(14, F_SETFD, FD_CLOEXEC) =3D 0
15:58:30 fcntl(14, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:58:30 close(14)                      =3D 0
15:58:30 select(445, [15 54 159 177 321 411 414 444], NULL, NULL, {tv_sec=
=3D5, tv_usec=3D0}^Cstrace: Process 20503 detached
$ sudo strace -p 20501 -t
strace: Process 20501 attached
15:59:09 select(121, [20 30 120], NULL, NULL, {tv_sec=3D0, tv_usec=3D535316=
}) =3D 0 (Timeout)
15:59:10 openat(AT_FDCWD, "/gnu/store/ghc9vcc8ydv305jz0spd73qrrf30gxik-binu=
tils-bootstrap-0.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:59:10 fcntl(8, F_GETFD)              =3D 0
15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:59:10 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:59:10 close(8)                       =3D 0
15:59:10 openat(AT_FDCWD, "/gnu/store/5hdpffkgkfszpsin2ir50zscz06x0mb5-core=
utils-8.29-debug.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:59:10 fcntl(8, F_GETFD)              =3D 0
15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:59:10 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:59:10 close(8)                       =3D 0
15:59:10 openat(AT_FDCWD, "/gnu/store/aasfcliv85frq3aawpa4rw18nvlvl735-gett=
ext-minimal-0.19.8.1.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:59:10 fcntl(8, F_GETFD)              =3D 0
15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:59:10 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:59:10 close(8)                       =3D 0
15:59:10 openat(AT_FDCWD, "/gnu/store/j1kpzcpi1m1ihzwyqb60zd4s74yp85g8-glib=
c-bootstrap-0.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:59:10 fcntl(8, F_GETFD)              =3D 0
15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:59:10 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:59:10 close(8)                       =3D 0
15:59:10 openat(AT_FDCWD, "/gnu/store/4llnxsgr6mxxj6pk21d9vbwaxnkbn99k-gnut=
ls-3.5.18-doc.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:59:10 fcntl(8, F_GETFD)              =3D 0
15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:59:10 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:59:10 close(8)                       =3D 0
15:59:10 openat(AT_FDCWD, "/gnu/store/7rs9qx86f2w4hjbf9wjm10k4m2vvlbyv-grof=
f-1.22.3-doc.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:59:10 fcntl(8, F_GETFD)              =3D 0
15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:59:10 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:59:10 close(8)                       =3D 0
15:59:10 openat(AT_FDCWD, "/gnu/store/d645d919gp1zk8v65qa2r4ia5x1c5n6k-libs=
tdc++-5.5.0.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:59:10 fcntl(8, F_GETFD)              =3D 0
15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:59:10 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:59:10 close(8)                       =3D 0
15:59:10 openat(AT_FDCWD, "/gnu/store/p262qk7zsd2wnlal6x1jax2xbqvdddqq-m4-1=
.4.18.lock", O_RDWR|O_CREAT, 0600) =3D 8
15:59:10 fcntl(8, F_GETFD)              =3D 0
15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC)  =3D 0
15:59:10 fcntl(8, F_SETLK, {l_type=3DF_WRLCK, l_whence=3DSEEK_SET, l_start=
=3D0, l_len=3D0}) =3D -1 EAGAIN (Risurco dumtempe ne disponeblas)
15:59:10 close(8)                       =3D 0
15:59:10 select(121, [20 30 120], NULL, NULL, {tv_sec=3D5, tv_usec=3D0}^Cst=
race: Process 20501 detached
$ sudo lsof | grep '/gnu/store/.*\.lock'
guix-daem 20501             root   14uW     REG              253,0         =
 0    3150879 /gnu/store/zvvgr301hz95ac15rp0wsjy5j4cdgkg4-module-import-com=
piled.lock
guix-daem 20501             root   26uW     REG              253,0         =
 0    3212618 /gnu/store/760kj698c9x1vcmlzk1qr8i9hqw961n3-openssl-1.0.2o-do=
c.lock
guix-daem 20501             root  116uW     REG              253,0         =
 0    3212684 /gnu/store/amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1.4.18.lock
guix-daem 20503             root    8uW     REG              253,0         =
 0    3212520 /gnu/store/7rs9qx86f2w4hjbf9wjm10k4m2vvlbyv-groff-1.22.3-doc.=
lock
guix-daem 20503             root   50uW     REG              253,0         =
 0    3212514 /gnu/store/ghc9vcc8ydv305jz0spd73qrrf30gxik-binutils-bootstra=
p-0.lock
guix-daem 20503             root  155uW     REG              253,0         =
 0    3212547 /gnu/store/j1kpzcpi1m1ihzwyqb60zd4s74yp85g8-glibc-bootstrap-0=
.lock
guix-daem 20503             root  173uW     REG              253,0         =
 0    3212552 /gnu/store/4llnxsgr6mxxj6pk21d9vbwaxnkbn99k-gnutls-3.5.18-doc=
.lock
guix-daem 20503             root  317uW     REG              253,0         =
 0    3212602 /gnu/store/d645d919gp1zk8v65qa2r4ia5x1c5n6k-libstdc++-5.5.0.l=
ock
guix-daem 20503             root  407uW     REG              253,0         =
 0    3212655 /gnu/store/5hdpffkgkfszpsin2ir50zscz06x0mb5-coreutils-8.29-de=
bug.lock
guix-daem 20503             root  410uW     REG              253,0         =
 0    3212659 /gnu/store/aasfcliv85frq3aawpa4rw18nvlvl735-gettext-minimal-0=
.19.8.1.lock
guix-daem 20503             root  440uW     REG              253,0         =
 0    3212686 /gnu/store/p262qk7zsd2wnlal6x1jax2xbqvdddqq-m4-1.4.18.lock
--8<---------------cut here---------------end--------------->8---

So for example:

  =E2=80=A2 20501 tries to acquire
    aasfcliv85frq3aawpa4rw18nvlvl735-gettext-minimal-0.19.8.1.lock,
    which is held by 20503;

  =E2=80=A2 20503 tries to acquire
    amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1.4.18.lock, which is held by
    20501.

This comes from the fact that =E2=80=98LocalStore::buildPaths=E2=80=99 take=
s the
user-supplied derivation list as is, without sorting it, and then
acquires locks in that order in =E2=80=98Worker::run=E2=80=99.

A topological sort (or maybe an alphanumeric sort?) should allow us to
guarantee that guix-daemon processes take locks in the same order, and
then don=E2=80=99t end up in a deadlock.

I discovered this bug while monitoring Cuirass on berlin: several
sessions submit batches of 200 derivations in =E2=80=98build-paths=E2=80=99=
 RPCs, and
sometimes most of the corresponding guix-daemon processes would end up
being stuck in a lock-acquiring loop.

Ludo=E2=80=99.




Acknowledgement sent to ludo@HIDDEN (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#31785; Package guix. Full text available.
Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.