GNU bug report logs - #24496
offloading should fall back to local build after n tries

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Reported by: ng0 <ngillmann@HIDDEN>; dated Wed, 21 Sep 2016 15:41:02 UTC; Maintainer for guix is bug-guix@HIDDEN.

Message received at 24496 <at> debbugs.gnu.org:


Received: (at 24496) by debbugs.gnu.org; 5 Oct 2016 11:36:34 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 05 07:36:34 2016
Received: from localhost ([127.0.0.1]:45221 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1brkV0-0006MI-0E
	for submit <at> debbugs.gnu.org; Wed, 05 Oct 2016 07:36:34 -0400
Received: from eggs.gnu.org ([208.118.235.92]:52119)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1brkUx-0006M3-UN
 for 24496 <at> debbugs.gnu.org; Wed, 05 Oct 2016 07:36:32 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <ludo@HIDDEN>) id 1brkUp-0005gy-Jz
 for 24496 <at> debbugs.gnu.org; Wed, 05 Oct 2016 07:36:26 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-1.8 required=5.0 tests=BAYES_50,RP_MATCHES_RCVD
 autolearn=disabled version=3.3.2
Received: from fencepost.gnu.org ([2001:4830:134:3::e]:34659)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <ludo@HIDDEN>)
 id 1brkUp-0005gp-Gw; Wed, 05 Oct 2016 07:36:23 -0400
Received: from reverse-83.fdn.fr ([80.67.176.83]:48446 helo=pluto)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <ludo@HIDDEN>)
 id 1brkUo-00030Q-OY; Wed, 05 Oct 2016 07:36:23 -0400
From: ludo@HIDDEN (Ludovic Courtès)
To: ng0 <ngillmann@HIDDEN>
Subject: Re: bug#24496: offloading should fall back to local build after n
 tries
References: <8760ppr3q3.fsf@HIDDEN> <87r387nhjg.fsf@HIDDEN>
 <87vax8nis5.fsf@HIDDEN>
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 14 Vendémiaire an 225 de la Révolution
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-unknown-linux-gnu
Date: Wed, 05 Oct 2016 13:36:20 +0200
In-Reply-To: <87vax8nis5.fsf@HIDDEN> (ng0's message of "Tue, 04
 Oct 2016 17:08:58 +0000")
Message-ID: <87a8ej81u3.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 2001:4830:134:3::e
X-Spam-Score: -7.7 (-------)
X-Debbugs-Envelope-To: 24496
Cc: 24496 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -7.7 (-------)

ng0 <ngillmann@HIDDEN> skribis:

> Ludovic Courtès <ludo@HIDDEN> writes:

[...]

>> Like you say, on Hydra-style setup this could be a problem: the
>> front-end machine may have --max-jobs=0, meaning that it cannot perform
>> builds on its own.
>>
>> So I guess we would need a command-line option to select a different
>> behavior.  I’m not sure how to do that because ‘guix offload’ is
>> “hidden” behind ‘guix-daemon’, so there’s no obvious place for such an
>> option.
>
> Could the daemon run with --enable-hydra-style or --disable-hydra-style,
> where --disable-hydra-style would allow falling back to a local build
> if, after a defined time (keeping slow connections in mind), the
> machine did not reply?

That would be too ad-hoc IMO, and the problem mentioned above remains.

>> In the meantime, you could also hack up your machines.scm: it would
>> return a list where unreachable machines have been filtered out.
>
> How can I achieve this?

Something like:

  (define the-machine (build-machine …))

  (if (managed-to-connect-timely the-machine)
      (list the-machine)
      '())

… where ‘managed-to-connect-timely’ would try to connect to the
machine with a timeout.
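
A minimal sketch of how such a check might look in Guile, using only core socket procedures. The procedure name follows the snippet above (with a trailing ‘?’); the default port and timeout and the placeholder host name are assumptions, not part of Guix. It checks only that a TCP connection to the SSH port can be established in time, not that authentication would succeed.

```scheme
;; Sketch only: host name, port, and timeout values are assumptions.
(use-modules (ice-9 match))

(define* (managed-to-connect-timely? host #:key (port 22) (timeout 5))
  "Return #t if a TCP connection to HOST:PORT completes within TIMEOUT
seconds, and #f otherwise (including on name-resolution errors)."
  (catch #t
    (lambda ()
      (let* ((ai   (car (getaddrinfo host (number->string port))))
             (sock (socket (addrinfo:fam ai) SOCK_STREAM 0)))
        (dynamic-wind
          (const #t)
          (lambda ()
            ;; Make the socket non-blocking so 'connect' returns at once.
            (fcntl sock F_SETFL (logior O_NONBLOCK (fcntl sock F_GETFL)))
            ;; Depending on the Guile version, a pending connection is
            ;; signalled by a #f return value or by an EINPROGRESS error.
            (catch 'system-error
              (lambda () (connect sock (addrinfo:addr ai)))
              (lambda args
                (unless (= EINPROGRESS (system-error-errno args))
                  (apply throw args))))
            ;; Wait until the descriptor is writable, then read SO_ERROR
            ;; to find out whether the connection actually succeeded.
            (match (select '() (list (fileno sock)) '() timeout)
              ((_ (_) _) (zero? (getsockopt sock SOL_SOCKET SO_ERROR)))
              (_ #f)))
          (lambda () (close-port sock)))))
    ;; Any error (unknown host, refused connection, ...) means "no".
    (lambda _ #f)))

;; Usage in machines.scm, mirroring the snippet above
;; ("build1.example.org" is a placeholder):
;;
;;   (if (managed-to-connect-timely? "build1.example.org")
;;       (list the-machine)
;;       '())
```

With this approach an unreachable machine costs at most the timeout before the list comes back empty, instead of an endless retry.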

> And to append to this bug: it seems to me that offloading requires one
> lsh key for each build machine.

The main machine needs to be able to connect to each build machine over
SSH, so indeed, that requires proper SSH key registration (host keys and
authorized user keys).

> (https://lists.gnu.org/archive/html/help-guix/2016-10/msg00007.html)
> and that you cannot address the machines directly. Say I want to
> create a system where I build on machine 1 AND machine 2. Having two
> x86_64 machines in machines.scm only selects one of them (if two were
> working, see the linked thread) and builds on the one that is
> accessible first. If, however, the first machine is somehow blocked
> and fails, and therefore terminates the lsh connection, the build does
> not happen at all.

The code that selects machines is in (guix scripts offload),
specifically ‘choose-build-machine’.  It tries to choose the “best”
machine, which means, roughly, the fastest and least loaded one.
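
To illustrate the idea of “fastest and least loaded”, here is a small self-contained sketch. It is not the code of ‘choose-build-machine’: the record type, the field names, and the scoring formula (speed divided by one plus load) are all assumptions made for the example.

```scheme
(use-modules (srfi srfi-1)    ; fold
             (srfi srfi-9))   ; define-record-type

;; Hypothetical stand-in for a build machine; not Guix's own record type.
(define-record-type <candidate>
  (make-candidate name speed load-avg)
  candidate?
  (name     candidate-name)
  (speed    candidate-speed)      ; relative speed factor, higher is faster
  (load-avg candidate-load-avg))  ; current load average, lower is better

(define (score c)
  ;; Assumed heuristic: speed discounted by the current load.
  (/ (candidate-speed c) (+ 1 (candidate-load-avg c))))

(define (choose-best candidates)
  "Return the candidate with the highest score, or #f if CANDIDATES is empty."
  (fold (lambda (c best)
          (if (or (not best) (> (score c) (score best)))
              c
              best))
        #f
        candidates))

;; A fast but busy machine (score 0.5) loses to a slower idle one (1.0).
(candidate-name
 (choose-best (list (make-candidate "machine-1" 2.0 3.0)
                    (make-candidate "machine-2" 1.0 0.0))))
```

Under such a scheme only one machine is picked per build, which matches the observation quoted above that two x86_64 entries in machines.scm do not both receive the same build.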

HTH,
Ludo’.




Information forwarded to bug-guix@HIDDEN:
bug#24496; Package guix. Full text available.

Message received at 24496 <at> debbugs.gnu.org:


Received: (at 24496) by debbugs.gnu.org; 4 Oct 2016 17:09:09 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Oct 04 13:09:08 2016
Received: from localhost ([127.0.0.1]:44808 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1brTDI-00041C-LP
	for submit <at> debbugs.gnu.org; Tue, 04 Oct 2016 13:09:08 -0400
Received: from aibo.runbox.com ([91.220.196.211]:44400)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ng0@HIDDEN>) id 1brTDH-000413-E2
 for 24496 <at> debbugs.gnu.org; Tue, 04 Oct 2016 13:09:08 -0400
Received: from [10.9.9.212] (helo=mailfront12.runbox.com)
 by bars.runbox.com with esmtp (Exim 4.71)
 (envelope-from <ng0@HIDDEN>)
 id 1brTDF-0001CG-Qg; Tue, 04 Oct 2016 19:09:05 +0200
Received: from x5d83ef73.dyn.telefonica.de ([93.131.239.115] helo=localhost)
 by mailfront12.runbox.com with esmtpsa (uid:892961 )
 (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82)
 id 1brTD9-0004tx-IW; Tue, 04 Oct 2016 19:08:59 +0200
From: ng0 <ngillmann@HIDDEN>
To: Ludovic Courtès <ludo@HIDDEN>
Subject: Re: bug#24496: offloading should fall back to local build after n
 tries
In-Reply-To: <87r387nhjg.fsf@HIDDEN>
References: <8760ppr3q3.fsf@HIDDEN> <87r387nhjg.fsf@HIDDEN>
Date: Tue, 04 Oct 2016 17:08:58 +0000
Message-ID: <87vax8nis5.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 24496
Cc: 24496 <at> debbugs.gnu.org

Ludovic Courtès <ludo@HIDDEN> writes:

> Hello!
>
> ng0 <ngillmann@HIDDEN> skribis:
>
>> When I forget that my build machine is offline and do not pass
>> --no-build-hook, offloading keeps retrying forever until I cancel the
>> build, boot the build machine, and start the build again.
>>
>> A solution could be a config option or default behavior which, after
>> failing to offload n times, gives up and uses the local builder.
>>
>> Is this desired at all? Setups like Hydra could run into problems, but
>> for small setups with the same architecture, could there be a solution
>> beyond --no-build-hook?
>
> Like you say, on Hydra-style setup this could be a problem: the
> front-end machine may have --max-jobs=0, meaning that it cannot perform
> builds on its own.
>
> So I guess we would need a command-line option to select a different
> behavior.  I’m not sure how to do that because ‘guix offload’ is
> “hidden” behind ‘guix-daemon’, so there’s no obvious place for such an
> option.

Could the daemon run with --enable-hydra-style or --disable-hydra-style,
where --disable-hydra-style would allow falling back to a local build
if, after a defined time (keeping slow connections in mind), the machine
did not reply?

> In the meantime, you could also hack up your machines.scm: it would
> return a list where unreachable machines have been filtered out.

How can I achieve this?

And to append to this bug: it seems to me that offloading requires one
lsh key for each build machine
(https://lists.gnu.org/archive/html/help-guix/2016-10/msg00007.html)
and that you cannot address the machines directly. Say I want to create
a system where I build on machine 1 AND machine 2. Having two x86_64
machines in machines.scm only selects one of them (if two were working,
see the linked thread) and builds on the one that is accessible first.
If, however, the first machine is somehow blocked and fails, and
therefore terminates the lsh connection, the build does not happen at
all.

Leaving the problems aside, what I want to do, in short: how could I
build on both systems at the same time when I want to?

> Ludo’.
>

-- 




Information forwarded to bug-guix@HIDDEN:
bug#24496; Package guix. Full text available.

Message received at 24496 <at> debbugs.gnu.org:


Received: (at 24496) by debbugs.gnu.org; 26 Sep 2016 15:50:17 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Sep 26 11:50:17 2016
Received: from localhost ([127.0.0.1]:36572 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1boYAb-0002pw-5a
	for submit <at> debbugs.gnu.org; Mon, 26 Sep 2016 11:50:17 -0400
Received: from mail2-relais-roc.national.inria.fr ([192.134.164.83]:27149)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1boYAY-0002ph-8w
 for 24496 <at> debbugs.gnu.org; Mon, 26 Sep 2016 11:50:15 -0400
X-IronPort-AV: E=Sophos;i="5.30,400,1470693600"; d="scan'208";a="238321904"
Received: from smb-adpcdg1-06.hotspot.hub-one.net (HELO pluto)
 ([213.174.99.134])
 by mail2-relais-roc.national.inria.fr with ESMTP/TLS/AES256-GCM-SHA384;
 26 Sep 2016 17:50:07 +0200
From: ludo@HIDDEN (Ludovic Courtès)
To: ng0 <ngillmann@HIDDEN>
Subject: Re: bug#24496: offloading should fall back to local build after n
 tries
In-Reply-To: <8760ppr3q3.fsf@HIDDEN> (ng0's message of "Wed, 21
 Sep 2016 09:39:48 +0000")
Date: Mon, 26 Sep 2016 18:20:51 +0900
Message-ID: <87r387nhjg.fsf@HIDDEN>
References: <8760ppr3q3.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 5 Vendémiaire an 225 de la Révolution
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-unknown-linux-gnu
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: -2.9 (--)
X-Debbugs-Envelope-To: 24496
Cc: 24496 <at> debbugs.gnu.org

Hello!

ng0 <ngillmann@HIDDEN> skribis:

> When I forget that my build machine is offline and do not pass
> --no-build-hook, offloading keeps retrying forever until I cancel the
> build, boot the build machine, and start the build again.
>
> A solution could be a config option or default behavior which, after
> failing to offload n times, gives up and uses the local builder.
>
> Is this desired at all? Setups like Hydra could run into problems, but
> for small setups with the same architecture, could there be a solution
> beyond --no-build-hook?

Like you say, on Hydra-style setup this could be a problem: the
front-end machine may have --max-jobs=0, meaning that it cannot perform
builds on its own.

So I guess we would need a command-line option to select a different
behavior.  I’m not sure how to do that because ‘guix offload’ is
“hidden” behind ‘guix-daemon’, so there’s no obvious place for such an
option.

In the meantime, you could also hack up your machines.scm: it would
return a list where unreachable machines have been filtered out.

Ludo’.




Information forwarded to bug-guix@HIDDEN:
bug#24496; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 21 Sep 2016 15:40:18 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Sep 21 11:40:18 2016
Received: from localhost ([127.0.0.1]:59729 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bmjdB-0002k2-Ur
	for submit <at> debbugs.gnu.org; Wed, 21 Sep 2016 11:40:18 -0400
Received: from eggs.gnu.org ([208.118.235.92]:51847)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ng0@HIDDEN>) id 1bme17-0007LF-H9
 for submit <at> debbugs.gnu.org; Wed, 21 Sep 2016 05:40:37 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <ng0@HIDDEN>) id 1bme11-0001BM-Bi
 for submit <at> debbugs.gnu.org; Wed, 21 Sep 2016 05:40:32 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM
 autolearn=disabled version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:36936)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <ng0@HIDDEN>) id 1bme11-0001BG-9F
 for submit <at> debbugs.gnu.org; Wed, 21 Sep 2016 05:40:31 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42811)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <ng0@HIDDEN>) id 1bme10-0002RY-07
 for bug-guix@HIDDEN; Wed, 21 Sep 2016 05:40:30 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <ng0@HIDDEN>) id 1bme0v-00019e-VW
 for bug-guix@HIDDEN; Wed, 21 Sep 2016 05:40:29 -0400
Received: from aibo.runbox.com ([91.220.196.211]:35375)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <ng0@HIDDEN>) id 1bme0v-000184-PB
 for bug-guix@HIDDEN; Wed, 21 Sep 2016 05:40:25 -0400
Received: from [10.9.9.210] (helo=mailfront10.runbox.com)
 by bars.runbox.com with esmtp (Exim 4.71)
 (envelope-from <ng0@HIDDEN>) id 1bme0p-0005Ny-Rz
 for bug-guix@HIDDEN; Wed, 21 Sep 2016 11:40:19 +0200
Received: from xd9bb8cb8.dyn.telefonica.de ([217.187.140.184] helo=localhost)
 by mailfront10.runbox.com with esmtpsa (uid:892961 )
 (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) id 1bme0L-0006RN-3W
 for bug-guix@HIDDEN; Wed, 21 Sep 2016 11:39:49 +0200
From: ng0 <ngillmann@HIDDEN>
To: bug-guix@HIDDEN
Subject: offloading should fall back to local build after n tries
Date: Wed, 21 Sep 2016 09:39:48 +0000
Message-ID: <8760ppr3q3.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -5.0 (-----)
X-Debbugs-Envelope-To: submit
X-Mailman-Approved-At: Wed, 21 Sep 2016 11:40:16 -0400

When I forget that my build machine is offline and do not pass
--no-build-hook, offloading keeps retrying forever until I cancel the
build, boot the build machine, and start the build again.

A solution could be a config option or default behavior which, after
failing to offload n times, gives up and uses the local builder.

Is this desired at all? Setups like Hydra could run into problems, but
for small setups with the same architecture, could there be a solution
beyond --no-build-hook?
-- 
              ng0




Acknowledgement sent to ng0 <ngillmann@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#24496; Package guix. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.