GNU bug report logs - #70761
Guix deploy cannot reboot remote machines without an error

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Reported by: Richard Sent <richard@HIDDEN>; dated Fri, 3 May 2024 21:59:02 UTC; Maintainer for guix is bug-guix@HIDDEN.

Message received at 70761 <at> debbugs.gnu.org:


From: Gabriel Wicki <gabriel@HIDDEN>
To: Richard Sent <richard@HIDDEN>
Subject: deploy and reboot
Date: Tue, 04 Feb 2025 17:43:14 +0100
Message-ID: <871pwdu0d9.fsf@HIDDEN>
Cc: 70761 <at> debbugs.gnu.org

Hi Richard!

I just stumbled over the same issue. While I am not really sure what to
think about the screen/tmux/SIGHUP proposal (does it apply to all deploy
commands, or only to more sophisticated usage scenarios?), I'd go with a
special --reboot flag for the deploy command that

 1. causes a reboot and then

 2. waits for the machine(s) to come back up.

I am not sure whether this is possible already, but will happily dive in
a little further (and prepare a patch if circumstances allow).
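
As a rough illustration of the two steps above, here is a minimal shell
sketch that works outside of guix deploy. It is not an existing guix
deploy option: the names reboot_and_wait, wait_for_host and REMOTE_RUN
are hypothetical, and the sketch assumes SSH access as a user who is
allowed to run reboot on the target.

```shell
#!/bin/sh
# Hypothetical sketch, not part of Guix: reboot a host, then poll until
# it answers again.  REMOTE_RUN is an assumed hook naming the program
# used to run a command on a host; it defaults to plain ssh.
: "${REMOTE_RUN:=ssh}"

# wait_for_host HOST TRIES DELAY
# Succeeds as soon as "true" can be run on HOST, fails after TRIES attempts.
wait_for_host() {
    host=$1
    tries=$2
    delay=$3
    while [ "$tries" -gt 0 ]; do
        if "$REMOTE_RUN" "$host" true 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# reboot_and_wait HOST [TRIES] [DELAY]
reboot_and_wait() {
    host=$1
    # The connection usually drops while reboot runs, so its exit
    # status is deliberately ignored.
    "$REMOTE_RUN" "$host" reboot || true
    wait_for_host "$host" "${2:-30}" "${3:-10}"
}
```

Something like `reboot_and_wait horizon.local` would then block until the
machine accepts connections again or the attempts run out. One caveat:
the first poll can succeed before the machine has actually gone down, so
a real implementation would probably need a short initial delay or a
boot-ID check.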

Have a nice week,
gabber




Information forwarded to bug-guix@HIDDEN:
bug#70761; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


From: Richard Sent <richard@HIDDEN>
To: bug-guix@HIDDEN
Subject: Guix deploy cannot reboot remote machines without an error
Date: Fri, 03 May 2024 17:57:38 -0400
Message-ID: <87wmoapnhp.fsf@HIDDEN>

Hi Guix,

One neat feature of guix deploy is the ability to run a command on a
list of remote machines. One command that would commonly be run is
reboot, so that upgrades to, say, the Linux kernel take effect.

While the command itself /does/ run and the system /does/ restart, guix
deploy doesn't gracefully handle the connection loss. The first machine
rebooted will throw an error, halting the reboot of the rest of the
machines in the list.

tmux and screen aren't really compatible with '-x -- <command>', since
you can't start a session and detach from it, and the & in '-x -- nohup
<command> &' gets swallowed by the host shell. I haven't found a
workaround. Even if they did work, I suspect there is a race condition
between "host closes session" and "remote restarts".

We shouldn't catch errors by default on the assumption that any command
might close the SSH session.

One solution could be adding an alternative to -x that nohup's the
command and attempts to cleanly close the SSH session. If the session
errors out after the command is nohup'd (e.g. reboot race condition),
catch the SSH error and exit.
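
To make the idea concrete, here is a minimal sketch in plain shell, run
outside of guix deploy. It is not how guix deploy is implemented:
run_detached, run_on_all and REMOTE_RUN are hypothetical names, and the
sketch relies on OpenSSH's ssh exiting with status 255 when the
connection itself fails.

```shell
#!/bin/sh
# Hypothetical sketch, not guix deploy's implementation.
# REMOTE_RUN is an assumed hook naming the program used to reach a host.
: "${REMOTE_RUN:=ssh}"

# run_detached HOST COMMAND
# Ask the remote shell to start COMMAND under nohup, in the background
# and after a short delay, so the SSH session can close before COMMAND
# takes effect.  COMMAND must not contain single quotes.
run_detached() {
    "$REMOTE_RUN" "$1" "nohup sh -c 'sleep 2; $2' >/dev/null 2>&1 </dev/null &"
}

# run_on_all COMMAND HOST...
# Run COMMAND on every host and keep going after a failure.  Status 255
# (ssh's connection-level error) is treated as an expected disconnect.
run_on_all() {
    cmd=$1
    shift
    failed=0
    for host in "$@"; do
        status=0
        run_detached "$host" "$cmd" || status=$?
        case $status in
            0)   echo "$host: ok" ;;
            255) echo "$host: connection lost, continuing" ;;
            *)   echo "$host: failed with status $status"; failed=1 ;;
        esac
    done
    return "$failed"
}
```

A call such as `run_on_all reboot horizon.local other.local` would then
visit every machine instead of stopping at the first error. Note that
status 255 also covers hosts that were never reachable, so this sketch
cannot tell "dropped because of the reboot" apart from "never connected".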

Alternatively we could add a --reboot flag, although I prefer the more
general solution.

Perhaps this can be the impetus for implementing the "deploy-hook"
functionality described at https://issues.guix.gnu.org/53486. In a
particularly fancy world, we could combine rebooting with pre and post
reboot command execution, but now I'm thinking of pies 🥧 in
skies ⛅.

Or maybe I'm completely wrong and this is possible (sorry!), in which
case we probably could add a quick mention of it in the manual.

--8<---------------cut here---------------start------------->8---
gibraltar :( rsent$ guix deploy rsent/machines/lan.scm --no-grafts -x -- reboot
guix deploy: warning: <machine-ssh-configuration> without a 'host-key' is deprecated
guix deploy: sending 1 store item (0 MiB) to 'horizon.local'...
;;; [2024/05/03 17:16:45.032361, 0] [GSSH ERROR] Parent session is not connected: #<unknown channel (freed) 7fbe66aa71a0>
Backtrace:
          16 (primitive-load "/home/richard/.config/guix/current/bin/guix")
In guix/ui.scm:
   2312:7 15 (run-guix . _)
  2275:10 14 (run-guix-command _ . _)
In ice-9/boot-9.scm:
  1752:10 13 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In guix/status.scm:
    839:4 12 (call-with-status-report _ _)
In ice-9/boot-9.scm:
  1752:10 11 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In guix/store.scm:
   666:37 10 (thunk)
   1302:8  9 (call-with-build-handler _ _)
   1302:8  8 (call-with-build-handler #<procedure 7fbe69be4690 at guix/ui.scm:1222:2 (continue store things mode)> _)
In guix/scripts/deploy.scm:
   274:23  7 (_)
In srfi/srfi-1.scm:
   460:18  6 (fold #<procedure 7fbe78c25540 at guix/scripts/deploy.scm:274:28 (machine result)> #t (#<<machine> operating-system: #<<operating-system> ke…>))
In guix/scripts/deploy.scm:
    214:2  5 (_ #<<machine> operating-system: #<<operating-system> kernel: #<package linux@HIDDEN nongnu/packages/linux.scm:118 7fbe6343c0b0> kernel-loada…> …)
In guix/store.scm:
  2182:25  4 (run-with-store #<store-connection 256.100 7fbe78cf8960> #<procedure 7fbe66b423c0 at guix/remote.scm:119:2 (state)> #:guile-for-build _ # _ # _)
In guix/remote.scm:
    72:20  3 (_ _)
In unknown file:
           2 (channel-get-exit-status #<unknown channel (freed) 7fbe66aa71a0>)
In ice-9/boot-9.scm:
  1685:16  1 (raise-exception _ #:continuable? _)
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Throw to key `guile-ssh-error' with args `("channel-get-exit-status" "Parent session is not connected" #<unknown channel (freed) 7fbe66aa71a0> #f)'.
--8<---------------cut here---------------end--------------->8---


-- 
Take it easy,
Richard Sent
Making my computer weirder one commit at a time.




Acknowledgement sent to Richard Sent <richard@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#70761; Package guix. Full text available.
Last modified: Tue, 4 Feb 2025 16:45:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.