GNU bug report logs - #34033
Offloading sometimes hangs

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Reported by: Ludovic Courtès <ludo@HIDDEN>; dated Thu, 10 Jan 2019 16:10:02 UTC; Maintainer for guix is bug-guix@HIDDEN.

Message received at 34033 <at> debbugs.gnu.org:


From: Maxim Cournoyer <maxim.cournoyer@HIDDEN>
To: Ludovic Courtès <ludo@HIDDEN>
Subject: Re: bug#34033: Offloading sometimes hangs
References: <87o98obikk.fsf@HIDDEN> <87fttuq2mz.fsf@HIDDEN>
 <87wo8fqlu5.fsf@HIDDEN>
 <87v9nyuzq1.fsf@HIDDEN>
Date: Mon, 24 Feb 2020 08:59:25 -0500
In-Reply-To: <87v9nyuzq1.fsf@HIDDEN>
 ("Ludovic Courtès"'s message of "Sat, 22 Feb 2020 21:35:50 +0100")
Message-ID: <87blpof5mq.fsf@HIDDEN>
Cc: 34033 <at> debbugs.gnu.org

Hello Ludovic,

Ludovic Courtès <ludo@HIDDEN> writes:


[...]

> The issues above are in libssh and were fixed a while ago.  ‘guix
> substitute’ doesn’t use Guile-SSH/libssh, so the problem you’re seeing
> must be something different.

OK, good to know!

> What do you mean by “the substitute server is down”?  You mean ‘guix
> publish’ is not running, or the machine is unavailable altogether?

The machine is turned off (i.e., the machine is unavailable altogether
:-).  It doesn't hang forever, but the timeout is rather long.  I'm
using that machine as both a substitute and an offload server.

Maxim




Information forwarded to bug-guix@HIDDEN:
bug#34033; Package guix. Full text available.

Message received at 34033 <at> debbugs.gnu.org:


From: Ludovic Courtès <ludo@HIDDEN>
To: Maxim Cournoyer <maxim.cournoyer@HIDDEN>
Subject: Re: bug#34033: Offloading sometimes hangs
References: <87o98obikk.fsf@HIDDEN> <87fttuq2mz.fsf@HIDDEN>
 <87wo8fqlu5.fsf@HIDDEN>
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 4 Ventôse an 228 de la Révolution
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-pc-linux-gnu
Date: Sat, 22 Feb 2020 21:35:50 +0100
In-Reply-To: <87wo8fqlu5.fsf@HIDDEN>
 (Maxim Cournoyer's message of "Fri, 21 Feb 2020 23:37:06 -0500")
Message-ID: <87v9nyuzq1.fsf@HIDDEN>
Cc: 34033 <at> debbugs.gnu.org

Hi Maxim,

Maxim Cournoyer <maxim.cournoyer@HIDDEN> wrote:

> Ludovic Courtès <ludo@HIDDEN> writes:
>
>> Hello,
>>
>> Ludovic Courtès <ludo@HIDDEN> wrote:
>>
>>> A simple thing would be to somehow get libssh to pass POLLIN | POLLRDHUP
>>> instead of just POLLIN.
>>
>> Reported here:
>>
>>   https://www.libssh.org/archive/libssh/2019-01/0000000.html
>>
>> A fix has been proposed by upstream and should be committed shortly.
>>
>>> Additionally, we could change Guile-SSH so that we can specify a timeout
>>> when reading from a channel.
>>
>> Turns out we can set a per-session timeout, which we already do (see
>> #:timeout in ‘open-ssh-session’ in (guix scripts offload)) but
>> ‘ssh_channel_read’ would ignore it and instead pass an infinite timeout
>> to poll(2):
>>
>>   https://www.libssh.org/archive/libssh/2019-01/0000001.html
>>
>> This issue happens to be fixed in libssh 0.8.x, so I upgraded our libssh
>> package in commit a8b0556ea1e439c89dc1ba33c8864e8b9b811f08.
>>
>> (That still doesn’t tell us why our ‘guix offload’ processes would
>> occasionally be stuck but at least it ensures the build farm keeps
>> making progress even when that happens.)
>>
>> Ludo’.
>
> Seems the patch in the response at the URL you linked is awaiting some
> feedback/review.  Is this the reason 'guix substitute' hangs for so long
> when the substitute server is down? (like 1 minute or so).

The issues above are in libssh and were fixed a while ago.  ‘guix
substitute’ doesn’t use Guile-SSH/libssh, so the problem you’re seeing
must be something different.

What do you mean by “the substitute server is down”?  You mean ‘guix
publish’ is not running, or the machine is unavailable altogether?

Thanks,
Ludo’.





Message received at 34033 <at> debbugs.gnu.org:


From: Maxim Cournoyer <maxim.cournoyer@HIDDEN>
X-Google-Original-From: Maxim Cournoyer
 <maxim@HIDDEN>
To: Ludovic Courtès <ludo@HIDDEN>
Subject: Re: bug#34033: Offloading sometimes hangs
References: <87o98obikk.fsf@HIDDEN> <87fttuq2mz.fsf@HIDDEN>
Date: Fri, 21 Feb 2020 23:37:06 -0500
In-Reply-To: <87fttuq2mz.fsf@HIDDEN>
 ("Ludovic Courtès"'s message of "Mon, 14 Jan 2019 23:45:56 +0100")
Message-ID: <87wo8fqlu5.fsf@HIDDEN>
Cc: 34033 <at> debbugs.gnu.org

Hello Ludovic,

Ludovic Courtès <ludo@HIDDEN> writes:

> Hello,
>
> Ludovic Courtès <ludo@HIDDEN> wrote:
>
>> A simple thing would be to somehow get libssh to pass POLLIN | POLLRDHUP
>> instead of just POLLIN.
>
> Reported here:
>
>   https://www.libssh.org/archive/libssh/2019-01/0000000.html
>
> A fix has been proposed by upstream and should be committed shortly.
>
>> Additionally, we could change Guile-SSH so that we can specify a timeout
>> when reading from a channel.
>
> Turns out we can set a per-session timeout, which we already do (see
> #:timeout in ‘open-ssh-session’ in (guix scripts offload)) but
> ‘ssh_channel_read’ would ignore it and instead pass an infinite timeout
> to poll(2):
>
>   https://www.libssh.org/archive/libssh/2019-01/0000001.html
>
> This issue happens to be fixed in libssh 0.8.x, so I upgraded our libssh
> package in commit a8b0556ea1e439c89dc1ba33c8864e8b9b811f08.
>
> (That still doesn’t tell us why our ‘guix offload’ processes would
> occasionally be stuck but at least it ensures the build farm keeps
> making progress even when that happens.)
>
> Ludo’.

Seems the patch in the response at the URL you linked is awaiting some
feedback/review.  Is this the reason 'guix substitute' hangs for so long
when the substitute server is down? (like 1 minute or so).

Maxim





Message received at 34033 <at> debbugs.gnu.org:


From: Ludovic Courtès <ludo@HIDDEN>
To: 34033 <at> debbugs.gnu.org
Subject: Re: bug#34033: Offloading sometimes hangs
References: <87o98obikk.fsf@HIDDEN>
Date: Mon, 14 Jan 2019 23:45:56 +0100
In-Reply-To: <87o98obikk.fsf@HIDDEN>
 ("Ludovic Courtès"'s message of "Thu, 10 Jan 2019 17:09:31 +0100")
Message-ID: <87fttuq2mz.fsf@HIDDEN>

Hello,

Ludovic Courtès <ludo@HIDDEN> wrote:

> A simple thing would be to somehow get libssh to pass POLLIN | POLLRDHUP
> instead of just POLLIN.

Reported here:

  https://www.libssh.org/archive/libssh/2019-01/0000000.html

A fix has been proposed by upstream and should be committed shortly.

> Additionally, we could change Guile-SSH so that we can specify a timeout
> when reading from a channel.

Turns out we can set a per-session timeout, which we already do (see
#:timeout in ‘open-ssh-session’ in (guix scripts offload)) but
‘ssh_channel_read’ would ignore it and instead pass an infinite timeout
to poll(2):

  https://www.libssh.org/archive/libssh/2019-01/0000001.html

This issue happens to be fixed in libssh 0.8.x, so I upgraded our libssh
package in commit a8b0556ea1e439c89dc1ba33c8864e8b9b811f08.

(That still doesn’t tell us why our ‘guix offload’ processes would
occasionally be stuck but at least it ensures the build farm keeps
making progress even when that happens.)

Ludo’.





Message received at submit <at> debbugs.gnu.org:


From: Ludovic Courtès <ludo@HIDDEN>
To: bug-guix@HIDDEN
Subject: Offloading sometimes hangs
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 21 Nivôse an 227 de la Révolution
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-pc-linux-gnu
Date: Thu, 10 Jan 2019 17:09:31 +0100
Message-ID: <87o98obikk.fsf@HIDDEN>

Hello,

So there’s another situation where offloading regularly hangs on
berlin.  The ‘guix offload’ process looks like this:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  0x00007f1f715686a1 in __GI___poll (fds=0x14e9b30, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f1f673b94e7 in ssh_poll (timeout=<optimized out>, nfds=<optimized out>, fds=<optimized out>)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/poll.c:98
#2  ssh_poll_ctx_dopoll (ctx=ctx@entry=0x14ee2e0, timeout=timeout@entry=-1)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/poll.c:612
#3  0x00007f1f673ba449 in ssh_handle_packets (session=session@entry=0x2249360, timeout=timeout@entry=-1)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/session.c:634
#4  0x00007f1f673ba51d in ssh_handle_packets_termination (session=session@entry=0x2249360, timeout=<optimized out>,
    timeout@entry=-3, fct=fct@entry=0x7f1f673a4430 <ssh_channel_read_termination>, user=user@entry=0x7ffce23953f0)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/session.c:696
#5  0x00007f1f673a6aaf in ssh_channel_read_timeout (channel=0x224e360, dest=dest@entry=0x18ef020,
    count=count@entry=8, is_stderr=<optimized out>, timeout=-3, timeout@entry=-1)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/channels.c:2705
#6  0x00007f1f673a6bbb in ssh_channel_read (channel=<optimized out>, dest=dest@entry=0x18ef020, count=count@entry=8,
    is_stderr=<optimized out>) at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/channels.c:2621
#7  0x00007f1f67413a23 in read_from_channel_port (
    channel=<error reading variable: ERROR: In procedure gdbscm_memory_port_fill_input: error reading memory>0x22f01a0, dst=<optimized out>, start=0, count=8) at channel-type.c:161
#8  0x00007f1f71b65287 in scm_i_read_bytes (
    port=port@entry=<error reading variable: ERROR: In procedure gdbscm_memory_port_fill_input: error reading memory>0x22f01a0, dst=dst@entry="#<vu8vector>" = {...}, start=start@entry=0, count=count@entry=8) at ports.c:1559
#9  0x00007f1f71b6996c in scm_c_read_bytes (
    port=port@entry=<error reading variable: ERROR: In procedure gdbscm_memory_port_fill_input: error reading memory>0x22f01a0, dst=dst@entry="#<vu8vector>" = {...}, start=start@entry=0, count=count@entry=8) at ports.c:1639
#10 0x00007f1f71b6fd80 in scm_get_bytevector_n (
    port=<error reading variable: ERROR: In procedure gdbscm_memory_port_fill_input: error reading memory>0x22f01a0,
    count=<optimized out>) at r6rs-ports.c:421
#11 0x00007f1f71ba4715 in vm_regular_engine (thread=0x14e9b30, vp=0xc31f30, registers=0xffffffff, resume=1901495969)
    at vm-engine.c:786

[...]

(gdb) p *fds
$1 = {fd = 15, events = 1, revents = 0}
(gdb) shell ls -l /proc/12185/fd
total 0
lr-x------ 1 root root 64 Jan 10 16:56 0 -> 'pipe:[76778016]'
l-wx------ 1 root root 64 Jan 10 16:56 1 -> 'pipe:[76778015]'
lr-x------ 1 root root 64 Jan 10 16:56 10 -> 'pipe:[76838317]'
l-wx------ 1 root root 64 Jan 10 16:56 11 -> 'pipe:[76838317]'
lr-x------ 1 root root 64 Jan 10 16:56 12 -> 'pipe:[76851360]'
l-wx------ 1 root root 64 Jan 10 16:56 13 -> 'pipe:[76851360]'
l-wx------ 1 root root 64 Jan 10 16:56 14 -> /var/guix/offload/overdrive1.guixsd.org/1
lrwx------ 1 root root 64 Jan 10 16:56 15 -> 'socket:[76860702]'
lr-x------ 1 root root 64 Jan 10 16:56 16 -> /dev/urandom
l-wx------ 1 root root 64 Jan 10 16:56 2 -> 'pipe:[76778015]'
lr-x------ 1 root root 64 Jan 10 16:56 3 -> 'pipe:[76838313]'
l-wx------ 1 root root 64 Jan 10 16:56 4 -> 'pipe:[76778017]'
l-wx------ 1 root root 64 Jan 10 16:56 5 -> 'pipe:[76838313]'
lr-x------ 1 root root 64 Jan 10 16:56 6 -> 'pipe:[76838316]'
l-wx------ 1 root root 64 Jan 10 16:56 7 -> 'pipe:[76838316]'
lr-x------ 1 root root 64 Jan 10 16:56 8 -> 'pipe:[76841414]'
l-wx------ 1 root root 64 Jan 10 16:56 9 -> 'pipe:[76841414]'
--8<---------------cut here---------------end--------------->8---

It’s a ‘get-bytevector-n’ for 8 bytes, so it looks like the daemon
protocol.  At that point the socket is actually dead: if I connect on
the remote machine (overdrive1.guixsd.org) I can see that there are no
other open SSH sessions.
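
For context, here is a minimal sketch of what such an 8-byte read
amounts to.  It assumes the daemon protocol encodes integers as 8-byte
little-endian words, which is an assumption about the wire format and
not something the backtrace shows.  The helper names read_exactly and
read_int are hypothetical and are not Guix procedures.

```python
import io
import struct


def read_exactly(stream, count):
    """Read exactly `count` bytes from `stream`, or raise EOFError."""
    data = b""
    while len(data) < count:
        chunk = stream.read(count - len(data))
        if not chunk:
            # The stream ended before the full word arrived.
            raise EOFError(f"stream closed after {len(data)} of {count} bytes")
        data += chunk
    return data


def read_int(stream):
    """Decode one 8-byte little-endian unsigned integer (assumed format)."""
    (value,) = struct.unpack("<Q", read_exactly(stream, 8))
    return value


# An in-memory stream stands in for the SSH channel port.
print(read_int(io.BytesIO(bytes([42, 0, 0, 0, 0, 0, 0, 0]))))  # prints 42
```

On a blocking socket, a read like this can only return once all 8 bytes
have arrived or the end of the stream is reported, which is why a peer
that disappears silently leaves the reader waiting.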

A simple thing would be to somehow get libssh to pass POLLIN | POLLRDHUP
instead of just POLLIN.
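
As an illustration of the difference, here is a minimal Linux-only
sketch, not libssh code.  It uses a local socket pair in place of the
SSH connection, and the function name poll_events is hypothetical.  The
"events = 1" in the gdb output above corresponds to POLLIN alone.
Whether requesting POLLRDHUP would have woken up the hung process in
this report is not established here; the sketch only shows that the
kernel reports the flag when it is requested and the peer has shut down
its sending side.

```python
import select
import socket


def poll_events(requested):
    """Return the revents that poll(2) reports once the peer stops sending."""
    # An AF_UNIX stream pair stands in for the SSH socket.
    local, peer = socket.socketpair()
    try:
        peer.shutdown(socket.SHUT_WR)  # the peer hangs up its sending side
        poller = select.poll()
        poller.register(local.fileno(), requested)
        events = poller.poll(1000)  # finite timeout, in milliseconds
        return events[0][1] if events else 0
    finally:
        local.close()
        peer.close()


only_in = poll_events(select.POLLIN)
with_rdhup = poll_events(select.POLLIN | select.POLLRDHUP)
print("POLLRDHUP seen when requesting POLLIN only:",
      bool(only_in & select.POLLRDHUP))
print("POLLRDHUP seen when requesting POLLIN | POLLRDHUP:",
      bool(with_rdhup & select.POLLRDHUP))
```

Note that poll(2) only reports the event bits that were requested, plus
POLLERR, POLLHUP and POLLNVAL, so a caller that passes POLLIN alone
never sees POLLRDHUP.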

Additionally, we could change Guile-SSH so that we can specify a timeout
when reading from a channel.
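
A read with a finite timeout could look like the following sketch.  It
works on a plain socket and is neither Guile-SSH nor libssh API; the
function name read_with_timeout and the millisecond values are
hypothetical.  The point is only that passing a finite timeout to
poll(2), instead of the -1 seen in the backtrace, turns an indefinite
wait into an error that the caller can handle.

```python
import select
import socket


def read_with_timeout(sock, count, timeout_ms):
    """Read exactly `count` bytes, waiting at most `timeout_ms` per chunk."""
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    data = b""
    while len(data) < count:
        if not poller.poll(timeout_ms):
            # An empty result means nothing arrived within the timeout.
            raise TimeoutError(f"no data within {timeout_ms} ms")
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise EOFError("peer closed the connection")
        data += chunk
    return data


local, peer = socket.socketpair()
peer.sendall(b"12345678")
print(read_with_timeout(local, 8, 100))  # the 8 bytes sent above

try:
    read_with_timeout(local, 8, 50)  # the peer stays silent this time
except TimeoutError as error:
    print("gave up:", error)
finally:
    local.close()
    peer.close()
```

The timeout here applies to each wait rather than to the whole read, so
a peer that trickles data slowly can still keep the call alive longer
than timeout_ms in total.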

Ludo’.




Acknowledgement sent to Ludovic Courtès <ludo@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#34033; Package guix. Full text available.
Last modified: Mon, 24 Feb 2020 14:15:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.