GNU bug report logs - #34157
Hydra: mozjs-60 builds on x86_64 and i686 seemingly get stuck


Package: guix;

Reported by: Mark H Weaver <mhw <at> netris.org>

Date: Mon, 21 Jan 2019 15:33:02 UTC

Severity: normal

Merged with 35181

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 34157 in the body.
You can then email your comments to 34157 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-guix <at> gnu.org:
bug#34157; Package guix. (Mon, 21 Jan 2019 15:33:02 GMT)

Acknowledgement sent to Mark H Weaver <mhw <at> netris.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Mon, 21 Jan 2019 15:33:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Mark H Weaver <mhw <at> netris.org>
To: bug-guix <at> gnu.org
Subject: Hydra: mozjs-60 builds on x86_64 and i686 seemingly get stuck
Date: Mon, 21 Jan 2019 10:31:46 -0500
Yesterday on Hydra, I found both Intel mozjs-60 builds seemingly stuck
while exporting the source checkout to hydra.gnunet.org.  One had been
going for ~22.5 hours, and the other for ~12 hours.  I forcefully killed
them and restarted them.  Now I see the same thing has happened on the
second attempt.  Both builds have been seemingly stuck like this for
about 19 hours:

  https://hydra.gnu.org/build/3342528
  https://hydra.gnu.org/build/3343511

In both cases, the build logs are empty, and the hydra log ends with:

  sending 1 store item to 'hydra.gnunet.org'...
  exporting path `/gnu/store/j2sz7dg35vkcz38sim71jll2ix1nk554-mozjs-60.2.3-2-checkout'

Of course, it's possible that they're not really stuck, but that they're
merely taking a ridiculously long time to send the source checkout to
the build slave.  My personal checkout of the mozilla-esr60 branch,
without the .hg directory, is about 2.1 gigabytes.
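
As a rough sanity check on the "merely slow" hypothesis, the arithmetic can be sketched as below. This assumes the exported store item is about the same size as the 2.1 GB personal checkout (it may not be), uses decimal gigabytes, and treats the 1 MB/s link speed as a purely hypothetical figure for comparison.

```python
# Rough transfer-time arithmetic for the "is it just slow?" question.
# ASSUMPTIONS: the store item is ~2.1 GB (decimal), like the personal
# checkout mentioned above; the 1 MB/s link speed is hypothetical.

SIZE_BYTES = 2.1e9


def implied_rate_bytes_per_s(size_bytes: float, elapsed_hours: float) -> float:
    """Average rate at which size_bytes would take exactly elapsed_hours."""
    return size_bytes / (elapsed_hours * 3600)


def transfer_hours(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to move size_bytes at a steady rate."""
    return size_bytes / rate_bytes_per_s / 3600


if __name__ == "__main__":
    # A transfer still unfinished after 19 hours must average below this rate.
    print(f"{implied_rate_bytes_per_s(SIZE_BYTES, 19) / 1e3:.1f} kB/s")
    # Time needed at a modest, hypothetical 1 MB/s.
    print(f"{transfer_hours(SIZE_BYTES, 1e6) * 60:.0f} minutes")
```

Under these assumptions, a transfer that is still running after 19 hours would have to average under roughly 31 kB/s, whereas a modest 1 MB/s link would finish in about 35 minutes.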

What do you think?

      Mark




Information forwarded to bug-guix <at> gnu.org:
bug#34157; Package guix. (Mon, 21 Jan 2019 15:40:02 GMT)

Message #8 received at 34157 <at> debbugs.gnu.org:

From: Efraim Flashner <efraim <at> flashner.co.il>
To: Mark H Weaver <mhw <at> netris.org>
Cc: 34157 <at> debbugs.gnu.org
Subject: Re: bug#34157: Hydra: mozjs-60 builds on x86_64 and i686 seemingly
 get stuck
Date: Mon, 21 Jan 2019 17:39:47 +0200
On Mon, Jan 21, 2019 at 10:31:46AM -0500, Mark H Weaver wrote:
> Yesterday on Hydra, I found both Intel mozjs-60 builds seemingly stuck
> while exporting the source checkout to hydra.gnunet.org.  One had been
> going for ~22.5 hours, and the other for ~12 hours.  I forcefully killed
> them and restarted them.  Now I see the same thing has happened on the
> second attempt.  Both builds have been seemingly stuck like this for
> about 19 hours:
> 
>   https://hydra.gnu.org/build/3342528
>   https://hydra.gnu.org/build/3343511
> 
> In both cases, the build logs are empty, and the hydra log ends with:
> 
>   sending 1 store item to 'hydra.gnunet.org'...
>   exporting path `/gnu/store/j2sz7dg35vkcz38sim71jll2ix1nk554-mozjs-60.2.3-2-checkout'
> 
> Of course, it's possible that they're not really stuck, but that they're
> merely taking a ridiculously long time to send the source checkout to
> the build slave.  My personal checkout of the mozilla-esr60 branch,
> without the .hg directory, is about 2.1 gigabytes.
> 
> What do you think?
> 
>       Mark
> 
12 hours is far too long for it to tie up a build slave, sending code or
not. Being silent that long doesn't trigger the auto-kill?

-- 
Efraim Flashner   <efraim <at> flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

Information forwarded to bug-guix <at> gnu.org:
bug#34157; Package guix. (Tue, 22 Jan 2019 02:56:01 GMT)

Message #11 received at 34157 <at> debbugs.gnu.org:

From: Mark H Weaver <mhw <at> netris.org>
To: Efraim Flashner <efraim <at> flashner.co.il>
Cc: 34157 <at> debbugs.gnu.org
Subject: Re: bug#34157: Hydra: mozjs-60 builds on x86_64 and i686 seemingly
 get stuck
Date: Mon, 21 Jan 2019 21:54:43 -0500
Efraim Flashner <efraim <at> flashner.co.il> writes:

> On Mon, Jan 21, 2019 at 10:31:46AM -0500, Mark H Weaver wrote:
>> Yesterday on Hydra, I found both Intel mozjs-60 builds seemingly stuck
>> while exporting the source checkout to hydra.gnunet.org.  One had been
>> going for ~22.5 hours, and the other for ~12 hours.  I forcefully killed
>> them and restarted them.  Now I see the same thing has happened on the
>> second attempt.  Both builds have been seemingly stuck like this for
>> about 19 hours:
>> 
>>   https://hydra.gnu.org/build/3342528
>>   https://hydra.gnu.org/build/3343511
>> 
>> In both cases, the build logs are empty, and the hydra log ends with:
>> 
>>   sending 1 store item to 'hydra.gnunet.org'...
>>   exporting path `/gnu/store/j2sz7dg35vkcz38sim71jll2ix1nk554-mozjs-60.2.3-2-checkout'
>> 
>> Of course, it's possible that they're not really stuck, but that they're
>> merely taking a ridiculously long time to send the source checkout to
>> the build slave.  My personal checkout of the mozilla-esr60 branch,
>> without the .hg directory, is about 2.1 gigabytes.
>> 
>> What do you think?
>> 
>>       Mark
>> 
> 12 hours is far too long for it to tie up a build slave, sending code or
> not.

Those two builds are still occupying build slots.  As I write this,
they've been running for over 30 hours.

I was curious whether the transfers were actually happening, even if
slowly, so I looked at 'netstat' output:

--8<---------------cut here---------------start------------->8---
root@20121227-hydra:~# netstat --inet --program | grep net.in.tum
tcp        0      0 20121227-hydra.gn:58007 hydra.net.in.tum.de:ssh ESTABLISHED 18774/guile     
tcp        0      0 20121227-hydra.gn:42586 hydra.net.in.tum.de:ssh ESTABLISHED 10042/guile     
tcp        0      0 20121227-hydra.gn:56413 hydra.net.in.tum.de:ssh ESTABLISHED 16236/guile     
--8<---------------cut here---------------end--------------->8---

There are currently three builds allocated to hydra.gnunet.org
(a.k.a. hydra.net.in.tum), so it appears that all three ssh connections
are still active.  However, even after repeating this command many
times, I've never seen a non-zero "Send-Q" value.  This suggests that no
data is actually being sent, but that it's stuck waiting for something.
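
The manual check above (re-running netstat and looking at the Send-Q column) could be scripted along these lines. This is only a sketch: it parses netstat text rather than running the command, it assumes the usual seven-column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name), and the names `Conn`, `parse_netstat` and `all_send_queues_empty` are made up for illustration.

```python
from typing import List, NamedTuple

# The three lines captured above, used here as sample input.
SAMPLE = """\
tcp        0      0 20121227-hydra.gn:58007 hydra.net.in.tum.de:ssh ESTABLISHED 18774/guile
tcp        0      0 20121227-hydra.gn:42586 hydra.net.in.tum.de:ssh ESTABLISHED 10042/guile
tcp        0      0 20121227-hydra.gn:56413 hydra.net.in.tum.de:ssh ESTABLISHED 16236/guile
"""


class Conn(NamedTuple):
    proto: str
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str
    pid: int
    program: str


def parse_netstat(text: str) -> List[Conn]:
    """Parse TCP lines of `netstat --inet --program` output.

    ASSUMPTION: seven whitespace-separated columns with no spaces in the
    program name; header lines and anything else are skipped.
    """
    conns = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or not fields[0].startswith("tcp"):
            continue
        pid, _, program = fields[6].partition("/")
        if not pid.isdigit():  # e.g. "-" when the PID is not visible
            continue
        conns.append(Conn(fields[0], int(fields[1]), int(fields[2]),
                          fields[3], fields[4], fields[5], int(pid), program))
    return conns


def all_send_queues_empty(conns: List[Conn]) -> bool:
    """True if no connection has bytes waiting in its send queue."""
    return all(c.send_q == 0 for c in conns)


if __name__ == "__main__":
    conns = parse_netstat(SAMPLE)
    print(len(conns), "connections; all Send-Q empty:",
          all_send_queues_empty(conns))
```

A single snapshot with empty queues proves little, since a healthy transfer can also drain its queue between samples. It is the repeated observation of zero across many samples that points toward a stalled sender.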

I'll leave these builds alone for now, in case Ludovic wants to
investigate further.

> Being silent that long doesn't trigger the auto-kill?

I guess that the usual timeouts do not apply to file transfers performed
before the actual build takes place.
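
To illustrate the kind of limit that seems to be missing here, below is a minimal sketch of a silence watchdog applied to a transfer: it resets whenever bytes move and reports expiry after a configurable quiet period. This is not how Guix's offload code works. The class name, its methods and the one-hour limit are hypothetical, and it only mirrors the general idea behind a `--max-silent-time` style limit on build output.

```python
import time


class SilenceWatchdog:
    """Track how long a transfer has gone without making progress.

    HYPOTHETICAL helper for illustration only; not part of Guix.
    """

    def __init__(self, max_silent_seconds: float, clock=time.monotonic):
        self.max_silent_seconds = max_silent_seconds
        self.clock = clock  # injectable so the logic can be tested
        self.last_activity = clock()

    def note_progress(self, nbytes: int) -> None:
        """Record that nbytes were transferred; zero bytes is not progress."""
        if nbytes > 0:
            self.last_activity = self.clock()

    def expired(self) -> bool:
        """True once the quiet period exceeds the configured limit."""
        return self.clock() - self.last_activity > self.max_silent_seconds


if __name__ == "__main__":
    now = [0.0]  # fake clock, in seconds
    dog = SilenceWatchdog(3600, clock=lambda: now[0])
    now[0] = 1800.0
    dog.note_progress(4096)   # some data moved after 30 minutes
    now[0] = 19 * 3600.0      # then nothing for many hours
    print("would abort transfer:", dog.expired())
```

A caller would invoke `note_progress` after each successful write and check `expired` periodically, aborting the transfer and freeing the build slot once it returns true.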

     Thanks,
       Mark




Information forwarded to bug-guix <at> gnu.org:
bug#34157; Package guix. (Tue, 22 Jan 2019 13:25:02 GMT)

Message #14 received at 34157 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Mark H Weaver <mhw <at> netris.org>
Cc: Efraim Flashner <efraim <at> flashner.co.il>, 34157 <at> debbugs.gnu.org
Subject: Re: bug#34157: Hydra: mozjs-60 builds on x86_64 and i686 seemingly
 get stuck
Date: Tue, 22 Jan 2019 14:24:51 +0100
Hi Mark,

Mark H Weaver <mhw <at> netris.org> skribis:

> Those two builds are still occupying build slots.  As I write this,
> they've been running for over 30 hours.
>
> I was curious whether the transfers were actually happening, even if
> slowly, so I looked at 'netstat' output:
>
> root@20121227-hydra:~# netstat --inet --program | grep net.in.tum
> tcp        0      0 20121227-hydra.gn:58007 hydra.net.in.tum.de:ssh ESTABLISHED 18774/guile     
> tcp        0      0 20121227-hydra.gn:42586 hydra.net.in.tum.de:ssh ESTABLISHED 10042/guile     
> tcp        0      0 20121227-hydra.gn:56413 hydra.net.in.tum.de:ssh ESTABLISHED 16236/guile     
>
> There are currently three builds allocated to hydra.gnunet.org
> (a.k.a. hydra.net.in.tum), so it appears that all three ssh connections
> are still active.  However, even after repeating this command many
> times, I've never seen a non-zero "Send-Q" value.  This suggests that no
> data is actually being sent, but that it's stuck waiting for something.

Weird.

> I'll leave these builds alone for now, in case Ludovic wants to
> investigate further.

I think you can terminate them as I’d rather not commit to investigate
further now.

I believe hydra.gnu.org is still running a rather old
guix-daemon/offload, right?  We should upgrade to the latest and
greatest to make sure we’re after a bug that’s still present.

Thanks,
Ludo’.




Merged 34157 35181. Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Tue, 09 Apr 2019 01:08:02 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 13 May 2023 11:24:08 GMT)

This bug report was last modified 347 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.