GNU bug report logs - #34033
Offloading sometimes hangs

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Reported by: Ludovic Courtès <ludo@HIDDEN>; dated Thu, 10 Jan 2019 16:10:02 UTC; Maintainer for guix is bug-guix@HIDDEN.

Message received at submit <at> debbugs.gnu.org:


From: Ludovic Courtès <ludo@HIDDEN>
To: bug-guix@HIDDEN
Subject: Offloading sometimes hangs
Date: Thu, 10 Jan 2019 17:09:31 +0100
Message-ID: <87o98obikk.fsf@HIDDEN>

Hello,

So there’s another situation where offloading regularly hangs on
berlin.  The ‘guix offload’ process looks like this:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  0x00007f1f715686a1 in __GI___poll (fds=0x14e9b30, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007f1f673b94e7 in ssh_poll (timeout=<optimized out>, nfds=<optimized out>, fds=<optimized out>)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/poll.c:98
#2  ssh_poll_ctx_dopoll (ctx=ctx@entry=0x14ee2e0, timeout=timeout@entry=-1)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/poll.c:612
#3  0x00007f1f673ba449 in ssh_handle_packets (session=session@entry=0x2249360, timeout=timeout@entry=-1)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/session.c:634
#4  0x00007f1f673ba51d in ssh_handle_packets_termination (session=session@entry=0x2249360, timeout=<optimized out>,
    timeout@entry=-3, fct=fct@entry=0x7f1f673a4430 <ssh_channel_read_termination>, user=user@entry=0x7ffce23953f0)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/session.c:696
#5  0x00007f1f673a6aaf in ssh_channel_read_timeout (channel=0x224e360, dest=dest@entry=0x18ef020,
    count=count@entry=8, is_stderr=<optimized out>, timeout=-3, timeout@entry=-1)
    at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/channels.c:2705
#6  0x00007f1f673a6bbb in ssh_channel_read (channel=<optimized out>, dest=dest@entry=0x18ef020, count=count@entry=8,
    is_stderr=<optimized out>) at /tmp/guix-build-libssh-0.7.7.drv-0/libssh-0.7.7-checkout/src/channels.c:2621
#7  0x00007f1f67413a23 in read_from_channel_port (
    channel=<error reading variable: ERROR: In procedure gdbscm_memory_port_fill_input: error reading memory>0x22f01a0, dst=<optimized out>, start=0, count=8) at channel-type.c:161
#8  0x00007f1f71b65287 in scm_i_read_bytes (
    port=port@entry=<error reading variable: ERROR: In procedure gdbscm_memory_port_fill_input: error reading memory>0x22f01a0, dst=dst@entry="#<vu8vector>" = {...}, start=start@entry=0, count=count@entry=8) at ports.c:1559
#9  0x00007f1f71b6996c in scm_c_read_bytes (
    port=port@entry=<error reading variable: ERROR: In procedure gdbscm_memory_port_fill_input: error reading memory>0x22f01a0, dst=dst@entry="#<vu8vector>" = {...}, start=start@entry=0, count=count@entry=8) at ports.c:1639
#10 0x00007f1f71b6fd80 in scm_get_bytevector_n (
    port=<error reading variable: ERROR: In procedure gdbscm_memory_port_fill_input: error reading memory>0x22f01a0,
    count=<optimized out>) at r6rs-ports.c:421
#11 0x00007f1f71ba4715 in vm_regular_engine (thread=0x14e9b30, vp=0xc31f30, registers=0xffffffff, resume=1901495969)
    at vm-engine.c:786

[...]

(gdb) p *fds
$1 = {fd = 15, events = 1, revents = 0}
(gdb) shell ls -l /proc/12185/fd
total 0
lr-x------ 1 root root 64 Jan 10 16:56 0 -> 'pipe:[76778016]'
l-wx------ 1 root root 64 Jan 10 16:56 1 -> 'pipe:[76778015]'
lr-x------ 1 root root 64 Jan 10 16:56 10 -> 'pipe:[76838317]'
l-wx------ 1 root root 64 Jan 10 16:56 11 -> 'pipe:[76838317]'
lr-x------ 1 root root 64 Jan 10 16:56 12 -> 'pipe:[76851360]'
l-wx------ 1 root root 64 Jan 10 16:56 13 -> 'pipe:[76851360]'
l-wx------ 1 root root 64 Jan 10 16:56 14 -> /var/guix/offload/overdrive1.guixsd.org/1
lrwx------ 1 root root 64 Jan 10 16:56 15 -> 'socket:[76860702]'
lr-x------ 1 root root 64 Jan 10 16:56 16 -> /dev/urandom
l-wx------ 1 root root 64 Jan 10 16:56 2 -> 'pipe:[76778015]'
lr-x------ 1 root root 64 Jan 10 16:56 3 -> 'pipe:[76838313]'
l-wx------ 1 root root 64 Jan 10 16:56 4 -> 'pipe:[76778017]'
l-wx------ 1 root root 64 Jan 10 16:56 5 -> 'pipe:[76838313]'
lr-x------ 1 root root 64 Jan 10 16:56 6 -> 'pipe:[76838316]'
l-wx------ 1 root root 64 Jan 10 16:56 7 -> 'pipe:[76838316]'
lr-x------ 1 root root 64 Jan 10 16:56 8 -> 'pipe:[76841414]'
l-wx------ 1 root root 64 Jan 10 16:56 9 -> 'pipe:[76841414]'
--8<---------------cut here---------------end--------------->8---

It’s a ‘get-bytevector-n’ for 8 bytes, so it looks like the daemon
protocol.  At that point the socket is actually dead: if I connect on
the remote machine (overdrive1.guixsd.org) I can see that there are no
other open SSH sessions.
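
To illustrate why an 8-byte read hints at the daemon protocol, here is a
minimal sketch. It assumes that the protocol frames each integer as an
8-byte little-endian word, as the Nix-derived wire format does; the report
itself only says that it "looks like" the daemon protocol. The name
`read_protocol_int` is made up for this example, and an in-memory stream
stands in for the SSH channel port.

```python
import io
import struct


def read_protocol_int(port):
    """Read one integer, assumed to be an 8-byte little-endian word."""
    data = port.read(8)
    if len(data) != 8:
        # A short read means the stream ended before a full word arrived.
        raise EOFError(f"expected 8 bytes, got {len(data)}")
    return struct.unpack("<Q", data)[0]


if __name__ == "__main__":
    # 42 encoded as a 64-bit little-endian word.
    stream = io.BytesIO(struct.pack("<Q", 42))
    print(read_protocol_int(stream))
```

On a real socket the equivalent read blocks until all 8 bytes arrive or
the end of the stream is seen, which matches where the backtrace is stuck.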

A simple thing would be to somehow get libssh to pass POLLIN | POLLRDHUP
instead of just POLLIN.
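
As a rough illustration of the difference (a sketch on a local socket
pair, not libssh code): Python's `select.poll` wraps the same poll(2) call
seen in frame #0, and `events = 1` in the gdb output is POLLIN on Linux.
With only POLLIN requested, a peer shutdown is reported as plain
readability; adding the Linux-specific POLLRDHUP makes the hang-up
explicit in `revents`. The helper names `poll_events` and `peer_hung_up`
are made up for this example.

```python
import select
import socket


def poll_events(sock, mask, timeout_ms=0):
    """Return the revents bitmask poll(2) reports for sock, or 0 if none."""
    poller = select.poll()
    poller.register(sock.fileno(), mask)
    for fd, revents in poller.poll(timeout_ms):
        if fd == sock.fileno():
            return revents
    return 0


def peer_hung_up(sock):
    """True if the peer closed or shut down its writing side."""
    revents = poll_events(sock, select.POLLIN | select.POLLRDHUP)
    return bool(revents & (select.POLLRDHUP | select.POLLHUP))


if __name__ == "__main__":
    local, remote = socket.socketpair()
    print("before shutdown:", peer_hung_up(local))
    remote.shutdown(socket.SHUT_WR)  # peer stops sending
    print("after shutdown:", peer_hung_up(local))
    local.close()
    remote.close()
```

This only detects a shutdown that the local kernel has been told about;
a peer that vanishes without closing the connection sets no flag at all.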

Additionally, we could change Guile-SSH so that we can specify a timeout
when reading from a channel.
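
The backtrace shows ssh_channel_read_timeout being entered with
timeout=-1, that is, waiting forever. The sketch below shows the general
shape of a bounded read on an ordinary socket: poll with a finite timeout
and give up once a deadline passes. It is not the Guile-SSH or libssh API;
`read_exact` and its parameters are hypothetical names for illustration.

```python
import select
import socket
import time


def read_exact(sock, count, timeout_s):
    """Read exactly count bytes from sock within timeout_s seconds.

    Raises TimeoutError if the deadline passes first, and EOFError if
    the peer closes the connection before count bytes arrive.
    """
    deadline = time.monotonic() + timeout_s
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    chunks = []
    remaining = count
    while remaining > 0:
        left_ms = int((deadline - time.monotonic()) * 1000)
        # An empty result from poll() means the timeout expired.
        if left_ms <= 0 or not poller.poll(left_ms):
            raise TimeoutError(f"no data within {timeout_s} s")
        chunk = sock.recv(remaining)
        if not chunk:
            raise EOFError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    local, remote = socket.socketpair()
    remote.sendall(b"12345678")
    print(read_exact(local, 8, 1.0))
    try:
        read_exact(local, 8, 0.1)  # nothing more is coming
    except TimeoutError as err:
        print("timed out:", err)
    local.close()
    remote.close()
```

With a bound like this, a stuck read turns into an error that the caller
can handle, for instance by retrying or giving up.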

Ludo’.




Acknowledgement sent to Ludovic Courtès <ludo@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#34033; Package guix. Full text available.
Last modified: Thu, 10 Jan 2019 16:15:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.