Received: (at 61839) by debbugs.gnu.org; 17 Apr 2023 22:55:35 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Apr 17 18:55:35 2023 Received: from localhost ([127.0.0.1]:57154 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1poXl1-0008RT-9Q for submit <at> debbugs.gnu.org; Mon, 17 Apr 2023 18:55:35 -0400 Received: from eggs.gnu.org ([209.51.188.92]:59892) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <ludo@HIDDEN>) id 1poXky-0008RG-Oq for 61839 <at> debbugs.gnu.org; Mon, 17 Apr 2023 18:55:33 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1poXks-00028Y-Mo; Mon, 17 Apr 2023 18:55:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:In-Reply-To:Date:References:Subject:To: From; bh=8/7uXQWdXdN9WS7JoSCh8JZZYXXLVe40koqpM5e5VME=; b=aTPpXzp+RWfa7BNSly2K +yUSDOjIyFQQZ/9OpPb1q2hTNEOnMzcG0rPmBHsJ+e66snk1NTQbsZEpEMiw9gCXhbt4PVsK7xSS9 /zY2ZkoVZYE87cAsjsV5E94TqGFq97hCeTuZWA65Q6RtaiiOehKX+IthU6vCsXxo9gqsHasrHvLPT Cn8bsqjVLfuQy4GS2p/8nsBK7BpuqqDrn9Olp73aFW6/Ov0890SYmMETfFQ2+qs9dPOBsJSGHHx8j NiCMnWV3A5VsFCNNqvfjOj2KOUDCCG4KJFSJIOJ8BmfENmsx+x7lKq+MP3Q6Y0uK8CUzbxp2iO4d0 3fNjr8ApSklBrg==; Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1poXks-0006Hn-9q; Mon, 17 Apr 2023 18:55:26 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN> To: 61839 <at> debbugs.gnu.org Subject: Re: bug#61839: 'guix offload' occasionally hangs on berlin, hitting max-silent-time References: <87a60zo0le.fsf@HIDDEN> <87edoicvn3.fsf@HIDDEN> Date: Tue, 18 Apr 2023 00:55:22 +0200 In-Reply-To: <87edoicvn3.fsf@HIDDEN> ("Ludovic =?utf-8?Q?Court=C3=A8s=22'?= =?utf-8?Q?s?= message of "Mon, 17 Apr 2023 18:37:36 +0200") Message-ID: <871qkiazl1.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 61839 Cc: Mathieu Othacehe <othacehe@HIDDEN>, Andreas Enge <andreas@HIDDEN> X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Hi, Ludovic Court=C3=A8s <ludo@HIDDEN> skribis: > Turns out that guix-daemon on build machines is running with =E2=80=98--d= ebug=E2=80=99 > starting with maintenance.git commit > 72d8a62466b71eda657b52ca17e99f7189ea18ad (=E2=80=9Csysadmin: build-machin= es: > Enable guix-daemon debug output.=E2=80=9D) > > The immediate fix is to undo that=E2=80=94I=E2=80=99ll do that later. Done: https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=3Dcbe75= edae348bf0aa235e93ebeb58fe9db14fa7a I redeployed the build machines (on the same Guix commit as before). The 4 machines that host a childhurd are still being redeployed though. > The other one is to swallow =E2=80=98current-error-port=E2=80=99 in > =E2=80=98store-import-channel=E2=80=99. Done: https://git.savannah.gnu.org/cgit/guix.git/commit/?id=3D166a3e3fdeaa5279b= 2dd5000088f913c05af9558 Ludo=E2=80=99.
bug-guix@HIDDEN
:bug#61839
; Package guix
.
Full text available.Received: (at 61839) by debbugs.gnu.org; 17 Apr 2023 16:37:47 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Apr 17 12:37:47 2023 Received: from localhost ([127.0.0.1]:55923 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1poRrO-0003Qz-U0 for submit <at> debbugs.gnu.org; Mon, 17 Apr 2023 12:37:47 -0400 Received: from eggs.gnu.org ([209.51.188.92]:43922) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <ludo@HIDDEN>) id 1poRrN-0003Qk-5C for 61839 <at> debbugs.gnu.org; Mon, 17 Apr 2023 12:37:46 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1poRrH-0000k7-2a; Mon, 17 Apr 2023 12:37:39 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:In-Reply-To:Date:References:Subject:To: From; bh=aJyv1vdXysdiIG2f34skQEptpu3vNlbj/J4KbAfaBwY=; b=b/0jMb6NNKHoJE9X2ERk +QTDkQDLBE6LQRE/nk+bywrxJioUKx8Cj9zkd6ncBbxf1bEVpLE/bV2Jnb6qxqvSJFNCtpQYRaEyT aI29M70gLVyLr+Vh079J1xVWsJP8jSkaG6SqJ5Lg5CH3h14/SzrbqzHlkNLLe9YiGXN9PkU0GsMCp r0NEPeXNFPAp3/sWrhpe95pSkUBN6uxg/Sx+n/z7h0iJOQccO9A3wNrWAJNddYQ9ai8a7PUUZvUvX T+fNBDNHQdqE4KeedO+prLDj2AiVArlDwMYHxhoeNK8lH9WxshDcgTDhNb8QDJ9tVnJ2twJwLjI8c YgZMmaJmtcMd5w==; Received: from [193.50.110.118] (helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1poRrG-0006xR-Kf; Mon, 17 Apr 2023 12:37:38 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN> To: 61839 <at> debbugs.gnu.org Subject: Re: bug#61839: 'guix offload' occasionally hangs on berlin, hitting max-silent-time References: <87a60zo0le.fsf@HIDDEN> Date: Mon, 17 Apr 2023 18:37:36 +0200 In-Reply-To: <87a60zo0le.fsf@HIDDEN> ("Ludovic =?utf-8?Q?Court=C3=A8s=22?= =?utf-8?Q?'s?= message of "Mon, 27 Feb 2023 11:37:49 +0100") Message-ID: <87edoicvn3.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 61839 Cc: Mathieu Othacehe <othacehe@HIDDEN>, Andreas Enge <andreas@HIDDEN> X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Hi! Ludovic Court=C3=A8s <ludo@HIDDEN> skribis: > We often get evaluation errors for =E2=80=98core-updates=E2=80=99, where = the build log > reveals that this is due to a build timeout while building the Guix > instance that will be used for the evaluation. Example: [...] > exporting path `/gnu/store/m2hyx7qk2j1bsvpj1qv89701klv8kqkk-guix-command' > exporting path `/gnu/store/vlm7401d5jvna6j7qdxhcr581cd8g8vh-guix-daemon-1= .4.0-4.01fd830' > exporting path `/gnu/store/zv3c0anfj085d2lyxh4g4aahhczmcaj8-guix-daemon' > exporting path `/gnu/store/f26rw4haizf2h126k9ja9wk31cm49mj0-guix-17bd0243= 3' > exporting path `/gnu/store/wc736va72zl2zz36817ahf3kc5fkfr69-inferior-scri= pt.scm' > building of `/gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv' timed= out after 3600 seconds of silence With help from Andreas, I was able to reproduce it on berlin by running: guix build /gnu/store/b22580wj63wk4a3i09n22wa60y36kaa5-openjdk-16.0.2.drv Timeout happens while =E2=80=98guix offload=E2=80=99 is sending build requi= sites to the target machine. At that point, the client =E2=80=98guix offload=E2=80=99 p= rocess is stuck like so: --8<---------------cut here---------------start------------->8--- (gdb) bt #0 0x00007f1fd7ad7d6f in __GI___poll (fds=3D0x766b00, nfds=3D1, timeout=3D= 3594784) at ../sysdeps/unix/sysv/linux/poll.c:29 #1 0x00007f1fcc220de7 in ssh_poll_ctx_dopoll () from target:/gnu/store/f3h= n75p2jjc94jfxfwsgf96s2wz8kaql-libssh-0.9.6/lib/libssh.so.4 #2 0x00007f1fcc221f0c in ssh_handle_packets () from target:/gnu/store/f3hn= 75p2jjc94jfxfwsgf96s2wz8kaql-libssh-0.9.6/lib/libssh.so.4 #3 0x00007f1fcc221fdb in ssh_handle_packets_termination () from target:/gnu/store/f3hn75p2jjc94jfxfwsgf96s2wz8kaql-libssh-0.9.6/lib= /libssh.so.4 #4 0x00007f1fcc2034bb in channel_write_common () from target:/gnu/store/f3= hn75p2jjc94jfxfwsgf96s2wz8kaql-libssh-0.9.6/lib/libssh.so.4 #5 0x00007f1fcc28b15b in ?? () from target:/gnu/store/rjhq6cz7xbck4l7zar6g= a3ygph2chwzr-guile-ssh-0.16.0/lib/libguile-ssh.so #6 0x00007f1fd7cf4f34 in scm_i_write_bytes (port=3D0x7f1fc556ba00, src=3D0= x7f1fc3c27000, start=3D0, count=3D65536) at ports.c:2858 #7 0x00007f1fd7cff546 in scm_put_bytevector (port=3D0x7f1fc556ba00, bv=3D0= x7f1fc3c27000, start=3D<optimized out>, count=3D<optimized out>) at r6rs-ports.c:676 #8 0x00007f1fc5ac5197 in ?? () #9 0x000000000069731c in ?? () #10 0x00007f1fc5acdf71 in ?? () #11 0x0000000000697318 in ?? () #12 0x00007f1fd7cd2ccc in scm_jit_enter_mcode (thread=3D0x7f1fd73b1d80, mco= de=3D0x69bf14 "\034;\004") at jit.c:6038 #13 0x00007f1fd7d2874f in vm_regular_engine (thread=3D0x7f1fd73b1d80) at vm= -engine.c:546 #14 0x00007f1fd7d355d9 in scm_call_n (proc=3D<optimized out>, argv=3D<optim= ized out>, nargs=3D1) at vm.c:1610 #15 0x00007f1fd7c9d457 in scm_primitive_eval (exp=3D<optimized out>, exp@en= try=3D0x7f1fcfab8cf0) at eval.c:671 #16 0x00007f1fd7ca34b6 in scm_eval (exp=3D0x7f1fcfab8cf0, module_or_state= =3D0x7f1fcfaacc80) at eval.c:705 #17 0x00007f1fd7d073b6 in scm_shell (argc=3D11, argv=3D0x6a9fc0) at script.= c:357 #18 0x00007f1fd7cb249c in invoke_main_func (body_data=3D0x7fffd2c0d370) at = init.c:312 #19 0x00007f1fd7c9b85a in c_body (d=3D0x7fffd2c0d290) at continuations.c:430 #20 0x00007f1fd7d28326 in vm_regular_engine (thread=3D0x7f1fd73b1d80) at vm= -engine.c:972 #21 0x00007f1fd7d355d9 in scm_call_n (proc=3D<optimized out>, argv=3D<optim= ized out>, nargs=3D2) at vm.c:1610 #22 0x00007f1fd7c9d09a in scm_call_2 (proc=3D<optimized out>, arg1=3D<optim= ized out>, arg2=3D<optimized out>) at eval.c:503 #23 0x00007f1fd7d53742 in scm_c_with_exception_handler.constprop.0 (type=3D= 0x404, handler_data=3Dhandler_data@entry=3D0x7fffd2c0d220, thunk_data=3Dthunk_data@entry=3D0x7fffd2c0d220, thunk=3D<optimized out>= , handler=3D<optimized out>) at exceptions.c:170 #24 0x00007f1fd7d2588f in scm_c_catch (tag=3D<optimized out>, body=3D<optim= ized out>, body_data=3D<optimized out>, handler=3D<optimized out>, handler_data=3D<optimized out>, pre_unwind_handler=3D<optimized out>, p= re_unwind_handler_data=3D0x7f1fcfa10040) at throw.c:168 #25 0x00007f1fd7c9de66 in scm_i_with_continuation_barrier (pre_unwind_handl= er=3D0x7f1fd7c9db80 <pre_unwind_handler>, pre_unwind_handler_data=3D0x7f1fcfa10040, handler_data=3D0x7fffd2c0d290= , handler=3D0x7f1fd7ca48b0 <c_handler>, body_data=3D0x7fffd2c0d290, body=3D0x7f1fd7c9b850 <c_body>) at continuations.c:368 #26 scm_c_with_continuation_barrier (func=3D<optimized out>, data=3D<optimi= zed out>) at continuations.c:464 #27 0x00007f1fd7d24b39 in with_guile (base=3D0x7fffd2c0d318, data=3D0x7fffd= 2c0d340) at threads.c:645 #28 0x00007f1fd7bfb0ba in GC_call_with_stack_base () from target:/gnu/store= /2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1 #29 0x00007f1fd7d1d8b8 in scm_i_with_guile (dynamic_state=3D<optimized out>= , data=3D<optimized out>, func=3D<optimized out>) at threads.c:688 #30 scm_with_guile (func=3D<optimized out>, data=3D<optimized out>) at thre= ads.c:694 #31 0x00007f1fd7cbb025 in scm_boot_guile (argc=3D<optimized out>, argv=3D<o= ptimized out>, main_func=3D<optimized out>, closure=3D<optimized out>) at init.c:295 #32 0x0000000000401191 in ?? () #33 0x00007f1fd7a117dd in __libc_start_main (main=3D0x401110, argc=3D8, arg= v=3D0x7fffd2c0d4a8, init=3D<optimized out>, fini=3D<optimized out>, rtld_fini=3D<optimized out>, stack_end=3D0x7fffd2c0d498) at ../csu/libc= -start.c:332 #34 0x00000000004011da in ?? () (gdb) p *fds $1 =3D {fd =3D 15, events =3D 1, revents =3D 0} (gdb) shell ls -l /proc/62345/fd/15 lrwx------ 1 root root 64 Apr 17 17:42 /proc/62345/fd/15 -> 'socket:[323897= 5699]' --8<---------------cut here---------------end--------------->8--- The SSH connection is still live, and on the server side I see the =E2=80= =98guix repl -t machine=E2=80=99 process stuck in: --8<---------------cut here---------------start------------->8--- write(2, "linking \342\200\230/gnu/store/0gbj38wswwxq50i68ci973fgi2l531b4-o= penjdk-16.0.2-checkout/test/hotspot/jtreg/compiler/c2/cr6340864/TestLongVec= t.java\342\200\231 to \342\200\230/gnu/store/.links/03zscrls9431zipkpx1sn45= 0llvf5hkhvbkpjk57j22f7d2nb3vs\342\200\231\n", 221 --8<---------------cut here---------------end--------------->8--- (This message comes from optimise-store.cc:192.) This suggests that =E2=80=98guix repl -t machine=E2=80=99 has filled its ou= tput buffer; its output isn=E2=80=99t being read so it gets stuck in that write(2) call.= Its parent =E2=80=98sshd=E2=80=99 process is stuck in: --8<---------------cut here---------------start------------->8--- ppoll([{fd=3D3, events=3DPOLLIN}, {fd=3D4, events=3D0}, {fd=3D10, events=3D= POLLIN}, {fd=3D12, events=3DPOLLIN}, {fd=3D11, events=3DPOLLOUT}], 5, NULL,= [], 8^C --8<---------------cut here---------------end--------------->8--- One hypothesis is that when transferring store items with many files, such as openjdk-checkout here, guix-daemon on the remote machine outputs many such =E2=80=9Clinking =E2=80=A6 to =E2=80=A6=E2=80=9D messages; if the= client doesn=E2=80=99t read them, then we end up with a deadlock like this, with the server-side process stuck writing more messages. However, that =E2=80=9Clinking =E2=80=A6=E2=80=9D message is supposed to be= written only at non-default debugging levels: printMsg(lvlTalkative, format("linking =E2=80=98%1%=E2=80=99 to =E2=80= =98%2%=E2=80=99") % path % linkPath); Turns out that guix-daemon on build machines is running with =E2=80=98--deb= ug=E2=80=99 starting with maintenance.git commit 72d8a62466b71eda657b52ca17e99f7189ea18ad (=E2=80=9Csysadmin: build-machines: Enable guix-daemon debug output.=E2=80=9D) The immediate fix is to undo that=E2=80=94I=E2=80=99ll do that later. The other one is to swallow =E2=80=98current-error-port=E2=80=99 in =E2=80=98store-import-channel=E2=80=99. To be continued=E2=80=A6 Ludo=E2=80=99.
bug-guix@HIDDEN
:bug#61839
; Package guix
.
Full text available.Ludovic Courtès <ludo@HIDDEN>
to control <at> debbugs.gnu.org
.
Full text available.Received: (at 61839) by debbugs.gnu.org; 17 Mar 2023 13:21:13 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Mar 17 09:21:13 2023 Received: from localhost ([127.0.0.1]:43984 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pdA1B-0003dT-Dv for submit <at> debbugs.gnu.org; Fri, 17 Mar 2023 09:21:13 -0400 Received: from hera.aquilenet.fr ([185.233.100.1]:37342) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <andreas@HIDDEN>) id 1pdA1A-0003dH-3T for 61839 <at> debbugs.gnu.org; Fri, 17 Mar 2023 09:21:12 -0400 Received: from localhost (localhost [127.0.0.1]) by hera.aquilenet.fr (Postfix) with ESMTP id 00C73CD3; Fri, 17 Mar 2023 14:21:06 +0100 (CET) X-Virus-Scanned: Debian amavisd-new at hera.aquilenet.fr Received: from hera.aquilenet.fr ([127.0.0.1]) by localhost (hera.aquilenet.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zE2j33BYpVfW; Fri, 17 Mar 2023 14:21:05 +0100 (CET) Received: from jurong (unknown [IPv6:2001:861:c4:f2f0::c64]) by hera.aquilenet.fr (Postfix) with ESMTPSA id 6E43922F; Fri, 17 Mar 2023 14:21:05 +0100 (CET) Date: Fri, 17 Mar 2023 14:21:03 +0100 From: Andreas Enge <andreas@HIDDEN> To: Ludovic =?iso-8859-15?Q?Court=E8s?= <ludo@HIDDEN>, 61839 <at> debbugs.gnu.org Subject: Re: Offloading problems on berlin Message-ID: <ZBRpP/cDAYMQ5220@jurong> References: <ZAtFp/uT6Z0N3XmN@jurong> <87pm9bmix7.fsf@HIDDEN> <ZBHJMMYvI0BhllzF@jurong> <ZBHctIg6Dk+Yn0Ik@jurong> <87wn3gereq.fsf_-_@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <87wn3gereq.fsf_-_@HIDDEN> X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 61839 X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-) Am Thu, Mar 16, 2023 at 02:50:53PM +0100 schrieb Ludovic Courtès: > I occasionally offload from my laptop to machines at work but didn’t hit > this bug, so we’ll have to see if it’s specific to berlin or not. My impression is that in my case, the problem is not with ssh, but somehow with the guix daemon. I tend to observe this: guix offload: sending 1 store item (6,490 MiB) to '141.80.167.169'... exporting path `/gnu/store/y4ipvkapf1gninaabwdl6pcz46c1frak-texlive-texmf-20210325' Then jnettop shows data transfer of about 100 to 120MB/s during a bit over a minute, which makes me think that the actual data transfer succeeds, but that somehow the data is not registered in the store. Then nothing tangible happens and offloading times out. Since I am apparently not able to connect to the machines via ssh, I cannot have a closer look at the receiving end. Andreas
bug-guix@HIDDEN
:bug#61839
; Package guix
.
Full text available.Received: (at 61839) by debbugs.gnu.org; 3 Mar 2023 14:24:26 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Mar 03 09:24:26 2023 Received: from localhost ([127.0.0.1]:60008 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pY6Kg-0001Hm-3W for submit <at> debbugs.gnu.org; Fri, 03 Mar 2023 09:24:26 -0500 Received: from eggs.gnu.org ([209.51.188.92]:43604) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <ludo@HIDDEN>) id 1pY6Kf-0001HX-2v for 61839 <at> debbugs.gnu.org; Fri, 03 Mar 2023 09:24:25 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1pY6KZ-0005Z5-Oc for 61839 <at> debbugs.gnu.org; Fri, 03 Mar 2023 09:24:19 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:In-Reply-To:Date:References:Subject:To: From; bh=GzKp0NqEk85EOFRYaglJljC5m1S1rWp78gqTujAoz2Q=; b=c+1c3bELKcPlVSBcHl+l 8kY+JPE9u9JZ5u9W0yq/SG3GeCXV2lWsYWvpsTbGp8fYtFba3SoFbn12tPNrIXGyxa/yvWwkW3arU cVNiYUC0y4QzbKaTwXzVi6ZI/wnfBnrpr+HqXTFHg5uPvTMTlspHWAFY8qAGami835WW27h5Gyk9Z lz6uH7xYRCq2bEridoZ2DGF8aeDeqWFs6alzBTOsVDMY3UD9tkG2mPo5EDTzAZS5YMOkUFSbPhAPH bVkwVtLt2XVTMflMb9yXNQhFdRnyeHeTGpnlW/rRF4gbOeMyK/iYSOXSF7++z4hWwYcdVYl/si8MW uz2zPh3ZRHHmJg==; Received: from [193.50.110.138] (helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1pY6KZ-00081x-5Y; Fri, 03 Mar 2023 09:24:19 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN> To: 61839 <at> debbugs.gnu.org Subject: Re: bug#61839: [Cuirass] Build timeouts during evaluations References: <87a60zo0le.fsf@HIDDEN> Date: Fri, 03 Mar 2023 15:24:16 +0100 In-Reply-To: <87a60zo0le.fsf@HIDDEN> ("Ludovic =?utf-8?Q?Court=C3=A8s=22?= =?utf-8?Q?'s?= message of "Mon, 27 Feb 2023 11:37:49 +0100") Message-ID: <87sfelx69b.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 61839 Cc: Mathieu Othacehe <othacehe@HIDDEN> X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Ludovic Court=C3=A8s <ludo@HIDDEN> skribis: > exporting path `/gnu/store/f26rw4haizf2h126k9ja9wk31cm49mj0-guix-17bd0243= 3' > exporting path `/gnu/store/wc736va72zl2zz36817ahf3kc5fkfr69-inferior-scri= pt.scm' > building of `/gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv' timed= out after 3600 seconds of silence I experienced something similar while building from the command line on berlin, so it looks like an offloading problem. Ludo=E2=80=99.
bug-guix@HIDDEN
:bug#61839
; Package guix
.
Full text available.Ludovic Courtès <ludo@HIDDEN>
to control <at> debbugs.gnu.org
.
Full text available.Received: (at submit) by debbugs.gnu.org; 27 Feb 2023 10:37:55 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Feb 27 05:37:55 2023 Received: from localhost ([127.0.0.1]:46185 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pWatH-0001jy-53 for submit <at> debbugs.gnu.org; Mon, 27 Feb 2023 05:37:55 -0500 Received: from lists.gnu.org ([209.51.188.17]:34636) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <ludo@HIDDEN>) id 1pWatF-0001jr-Fp for submit <at> debbugs.gnu.org; Mon, 27 Feb 2023 05:37:54 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1pWatE-0003vy-TT for bug-guix@HIDDEN; Mon, 27 Feb 2023 05:37:53 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1pWatE-0002qy-Li for bug-guix@HIDDEN; Mon, 27 Feb 2023 05:37:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:Date:Subject:To:From:in-reply-to: references; bh=gFYGy5VJddheBiQLAVNYVs12BiehJcywFkEnng7brHo=; b=mlst+oxlzfZNyN FEyT7dPdMoH4tbd5BA4JaoHCFMDSgIOffVyRp7WyO8Eu3IUzVQ5QKKU7niv+83c15P0flSm+oMSBk eJONWnNRokno3djViYoxH2CqITBHjiodZpj/IKc+bH/QV76NfSjk6l2SheOHOK/lOLkejwURNw/wC aMfY8KYpyop2Y2FQR5HF9ODoYGq1UJHI+fBOcWzISTsMd15wGAqSiRwTtY7tdfiJkNSGlf7SuicxT TMR8L8JmCy3fbKJE9IXjP9L8VfGB7Mk15mPcGq+B+JMoEW9goIN/e9VGkUYSE7O4fJiv/duei7XNB EZn62V/dvMM88BajuCGw==; Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <ludo@HIDDEN>) id 1pWatE-0007Ox-2o for bug-guix@HIDDEN; Mon, 27 Feb 2023 05:37:52 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN> To: bug-guix@HIDDEN Subject: [Cuirass] Build timeouts during evaluations X-Debbugs-Cc: Mathieu Othacehe <othacehe@HIDDEN> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: Nonidi 9 =?utf-8?Q?Vent=C3=B4se?= an 231 de la =?utf-8?Q?R=C3=A9volution=2C?= jour du Marsault X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Mon, 27 Feb 2023 11:37:49 +0100 Message-ID: <87a60zo0le.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Hello Guix! We often get evaluation errors for =E2=80=98core-updates=E2=80=99, where th= e build log reveals that this is due to a build timeout while building the Guix instance that will be used for the evaluation. Example: --8<---------------cut here---------------start------------->8--- building of `/gnu/store/x9dyzywwmsp8vmpvpx54yhcnz7rqin0l-asciidoc-9.1.0.drv= ' timed out after 3600 seconds of silence cannot build derivation `/gnu/store/bl7aq8999lmp6flbyw3k656m522jmcl3-git-2.= 39.1.drv': 1 dependencies couldn't be built cannot build derivation `/gnu/store/ldajjyvb4np1zq9l9s09j0dpsbpscpan-gnulib= -2022-12-06-1.440b528.drv': 1 dependencies couldn't be built cannot build derivation `/gnu/store/fdks470j7xjw3dc6fiyw15hdsxj6cdbc-guile-= gnutls-3.7.11.drv': 1 dependencies couldn't be built cannot build derivation `/gnu/store/71pl6a38bd0b81qgi45wz2ymvlv0x7xn-guix-c= li-core.drv': 1 dependencies couldn't be built [...] cannot build derivation `/gnu/store/m7r73wypxhhmv8cqza1x8g2zyq9k1xpg-profil= e.drv': 1 dependencies couldn't be built [...] ERROR: 1. &store-protocol-error: message: "build of `/gnu/store/m7r73wypxhhmv8cqza1x8g2zyq9k1xpg-profi= le.drv' failed" status: 101 --8<---------------cut here---------------end--------------->8--- (From <https://ci.guix.gnu.org/eval/233402/log/raw>.) Likewise for the =E2=80=98guile=E2=80=99 jobset: --8<---------------cut here---------------start------------->8--- exporting path `/gnu/store/m2hyx7qk2j1bsvpj1qv89701klv8kqkk-guix-command' exporting path `/gnu/store/vlm7401d5jvna6j7qdxhcr581cd8g8vh-guix-daemon-1.4= .0-4.01fd830' exporting path `/gnu/store/zv3c0anfj085d2lyxh4g4aahhczmcaj8-guix-daemon' exporting path `/gnu/store/f26rw4haizf2h126k9ja9wk31cm49mj0-guix-17bd02433' exporting path `/gnu/store/wc736va72zl2zz36817ahf3kc5fkfr69-inferior-script= .scm' building of `/gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv' timed o= ut after 3600 seconds of silence cannot build derivation `/gnu/store/nrcsvww8j24rjzyin90w76qgbkgphhvp-profil= e.drv': 1 dependencies couldn't be built Backtrace: In ice-9/boot-9.scm: 1752:10 12 (with-exception-handler _ _ #:unwind? _ # _) In unknown file: 11 (apply-smob/0 #<thunk 7fd8cbe13340>) In ice-9/boot-9.scm: 724:2 10 (call-with-prompt _ _ #<procedure default-prompt-handle?>) In ice-9/eval.scm: 619:8 9 (_ #(#(#<directory (guile-user) 7fd8cbe16c80>))) In cuirass/ui.scm: 104:10 8 (run-cuirass-command _ . _) In cuirass/scripts/evaluate.scm: 105:27 7 (cuirass-evaluate _) In ice-9/boot-9.scm: 1752:10 6 (with-exception-handler _ _ #:unwind? _ # _) In guix/store.scm: 658:37 5 (thunk) 2123:25 4 (run-with-store #<store-connection 256.99 7fd8cad09dc0> ?) In cuirass/scripts/evaluate.scm: 87:8 3 (_ _) In guix/store.scm: 1995:38 2 (_ #<store-connection 256.99 7fd8a33e6d70>) 1421:15 1 (_ #<store-connection 256.99 7fd8a33e6d70> _ _) In ice-9/boot-9.scm: 1685:16 0 (raise-exception _ #:continuable? _) ice-9/boot-9.scm:1685:16: In procedure raise-exception: ERROR: 1. &store-protocol-error: message: "build of `/gnu/store/nrcsvww8j24rjzyin90w76qgbkgphhvp-profi= le.drv' failed" status: 101 --8<---------------cut here---------------end--------------->8--- Building this particular derivation is quick though: --8<---------------cut here---------------start------------->8--- $ time guix build /gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv --c= heck The following derivation will be built: /gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv building /gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv... (repl-version 0 1 1) (values (value "/gnu/store/rgk1sc2zvllf7cgbyj5g28b9v2x6nvn3-guile/share/gui= le/site/3.0")) successfully built /gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv successfully built /gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv /gnu/store/rgk1sc2zvllf7cgbyj5g28b9v2x6nvn3-guile real 0m1.253s user 0m0.751s sys 0m0.034s --8<---------------cut here---------------end--------------->8--- So the problem must be elsewhere. One thing is that this particular .drv is quite big: --8<---------------cut here---------------start------------->8--- $ guix size /gnu/store/yq9r44lic8cn4nphksbi5s3ppwfcsh1n-guile.drv store item total s= elf /gnu/store/wjr1sqxci75jfkvwzmmvlgypkyp3fvqi-guix-17bd024 105.0 = 105.0 42.9% /gnu/store/6b3fp2kwqxw7cvnh6jz9g49n9k2n5avy-guix-packages-base-source 36= .2 36.2 14.8% /gnu/store/h3hjni2bd8f5kbn8w0g2xx2jnagfynnr-guile-e2ed33e 29.0 = 29.0 11.8% /gnu/store/gfb26xn5f2qyrxzv847cd6iql6h8igh0-packages 23.1 = 23.1 9.4% /gnu/store/dj8rq4jd8qgk4k1hzg8djcvi6hlbdb7m-doc 22.3 = 22.3 9.1% /gnu/store/a5z192qxni40ly92cknwxkpnxc8zxfgf-guix 7.8 = 7.8 3.2% /gnu/store/f8wg2y3rrh4dgi6znjrv8a1l4ymkc7rb-guix-packages-source 2.8 = 2.8 1.2% /gnu/store/8iwm8x6472xnf7g6hfxzw029gxm1qbkq-guix-system-source 2.5 = 2.5 1.0% /gnu/store/2dxbf11shgvs1cgvzz6wpnfra97f0sfw-guix-extra-source 2.5 = 2.5 1.0% /gnu/store/q57az26zdgafic5b8xsbxlh4iwddvr86-doc 2.1 = 2.1 0.9% [=E2=80=A6] /gnu/store/0mxnx8l4fgigvd7gakwdk6hc6im4wnai-disarchive-mirrors 0.0 = 0.0 0.0% total: 244.8 MiB --8<---------------cut here---------------end--------------->8--- Merely transferring it for offloading takes time; depending on machine load, perhaps it can hit the 1h max-silent timeout? That=E2=80=99d mean the throughput is below 70=C2=A0KiB/s. Plausible? The closure of /gnu/store/x9dyzywwmsp8vmpvpx54yhcnz7rqin0l-asciidoc-9.1.0.drv is 3.0=C2=A0MiB and it builds in 4 seconds, so there must be something else going on, perhaps a bug in libssh/Guile-SSH leading to stale connections? Thoughts? Ludo=E2=80=99.
Ludovic Courtès <ludo@HIDDEN>
:othacehe@HIDDEN, bug-guix@HIDDEN
.
Full text available.othacehe@HIDDEN, bug-guix@HIDDEN
:bug#61839
; Package guix
.
Full text available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.