GNU bug report logs - #60803
Cuirass stopped processing jobs for aarch64-linux and x86_64-linux

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Reported by: Marius Bakke <marius@HIDDEN>; Done: Ludovic Courtès <ludo@HIDDEN>; Maintainer for guix is bug-guix@HIDDEN.
bug closed, send any further explanations to 60803 <at> debbugs.gnu.org and Marius Bakke <marius@HIDDEN> Request was from Ludovic Courtès <ludo@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 60803 <at> debbugs.gnu.org:


From: Marius Bakke <marius@HIDDEN>
To: 60803 <at> debbugs.gnu.org
Subject: Re: bug#60803: Cuirass stopped processing jobs for aarch64-linux
 and x86_64-linux
In-Reply-To: <87bkn1n15x.fsf@HIDDEN>
References: <87bkn1n15x.fsf@HIDDEN>
Date: Sun, 15 Jan 2023 05:26:27 +0100
Message-ID: <878ri4mnik.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
 micalg=pgp-sha512; protocol="application/pgp-signature"
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 60803
Cc: othacehe@HIDDEN

--=-=-=
Content-Type: text/plain

Marius Bakke <marius@HIDDEN> writes:

> The 335212 build is for x86_64-linux; we have the same problem with
> 335087 (also perftest) on aarch64.  i686-linux and powerpc64le-linux are
> fine.

I deleted these two from the Builds and BuildDependencies tables, which
allowed Cuirass to move forward (or backwards, really, as it was
processing new jobs just fine).
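
The message does not show the statements that were run, so the following is
only a sketch of what such a cleanup might look like. It uses Python's
sqlite3 module as a stand-in for the real PostgreSQL database, and the
schema is an assumption reduced to the columns named in this report. Builds
1 and 2 and all dependency edges are hypothetical; only 335212 and 335087
come from the report.

```python
import sqlite3

# Assumed, simplified schema: only columns mentioned in the report.
SCHEMA = """
CREATE TABLE Builds (id INTEGER PRIMARY KEY, system TEXT, status INTEGER);
CREATE TABLE BuildDependencies (source INTEGER, target INTEGER);
"""

def delete_build(conn, build_id):
    """Remove one build and every dependency edge that mentions it."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "DELETE FROM BuildDependencies WHERE source = ? OR target = ?",
            (build_id, build_id),
        )
        conn.execute("DELETE FROM Builds WHERE id = ?", (build_id,))

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO Builds VALUES (?, ?, ?)",
    [
        (1, "x86_64-linux", -2),       # hypothetical healthy build
        (2, "x86_64-linux", 0),        # hypothetical healthy build
        (335087, "aarch64-linux", -2),  # stuck build from the report
        (335212, "x86_64-linux", -2),   # stuck build from the report
    ],
)
conn.executemany(
    "INSERT INTO BuildDependencies VALUES (?, ?)",
    [(1, 2), (335212, 2), (335087, 2)],  # hypothetical edges
)

for stuck in (335212, 335087):
    delete_build(conn, stuck)

remaining_builds = [r[0] for r in conn.execute("SELECT id FROM Builds ORDER BY id")]
remaining_deps = conn.execute(
    "SELECT source, target FROM BuildDependencies ORDER BY source"
).fetchall()
print(remaining_builds, remaining_deps)
```

Deleting the dependency edges first and the build second, inside a single
transaction, avoids leaving edges that point at a build that no longer
exists.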

I'm not sure how to prevent the problem (perhaps a race when two
evaluations create different derivations with identical outputs at the
same time?), but at least we know how to recover from it.

Speaking of builds, I started debugging #60016 and accidentally deleted
build 175246!  Enough late night debugging for me...  I'll set up my own
Cuirass to experiment on "soon".

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iIUEARYKAC0WIQRNTknu3zbaMQ2ddzTocYulkRQQdwUCY8OAcw8cbWFyaXVzQGdu
dS5vcmcACgkQ6HGLpZEUEHe+tgEAx8zHVH7VGv9EY/Xc3FKqHDeMY35FA4zCvcG2
YlE/Dr8BAPiTogxuXk3DdA7uDvOEFoBF/fGtmXH0cCf9Pwrd89QJ
=05EW
-----END PGP SIGNATURE-----
--=-=-=--




Information forwarded to bug-guix@HIDDEN:
bug#60803; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


From: Marius Bakke <marius@HIDDEN>
To: bug-guix@HIDDEN
Subject: Cuirass stopped processing jobs for aarch64-linux and x86_64-linux
X-Debbugs-CC: othacehe@HIDDEN
Date: Sat, 14 Jan 2023 06:19:22 +0100
Message-ID: <87bkn1n15x.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: submit

Hello Guix,

Cuirass has stopped processing (old) jobs for aarch64 and x86_64.  After
digging through the database, it appears to be because
(db-get-pending-build ...) returns a build that is missing from the Jobs
table:

  WITH pending_dependencies AS
  (SELECT Builds.id, count(dep.id) as deps FROM Builds
  LEFT JOIN BuildDependencies as bd ON bd.source = Builds.id
  LEFT JOIN Builds AS dep ON bd.target = dep.id AND dep.status != 0
  WHERE Builds.status = -2 AND Builds.system = 'x86_64-linux'
  GROUP BY builds.id
  ORDER BY Builds.priority ASC, Builds.timestamp DESC)
  SELECT id FROM pending_dependencies where deps = 0 limit 1;

     id
  --------
   335212
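
To make the selection rule in the query above easier to follow, here is a
small sketch that runs the same joins against an in-memory SQLite database
from Python. The schema is an assumption trimmed to the columns involved,
the four builds are hypothetical, and the sketch assumes (as the query
suggests) that status 0 means a dependency succeeded and -2 means the build
is still scheduled. Unlike the original, it lists every scheduled build
with its count of unfinished dependencies instead of applying "limit 1".

```python
import sqlite3

# Assumed, simplified schema: only what the joins need.
SCHEMA = """
CREATE TABLE Builds (id INTEGER PRIMARY KEY, system TEXT, status INTEGER);
CREATE TABLE BuildDependencies (source INTEGER, target INTEGER);
"""

# Same joins as the query in the report; ordering and LIMIT are dropped
# so that every scheduled build and its dependency count is visible.
PENDING = """
WITH pending_dependencies AS
  (SELECT Builds.id AS id, count(dep.id) AS deps FROM Builds
   LEFT JOIN BuildDependencies AS bd ON bd.source = Builds.id
   LEFT JOIN Builds AS dep ON bd.target = dep.id AND dep.status != 0
   WHERE Builds.status = -2 AND Builds.system = ?
   GROUP BY Builds.id)
SELECT id, deps FROM pending_dependencies ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO Builds VALUES (?, ?, ?)",
    [
        (1, "x86_64-linux", -2),  # scheduled, depends on failed build 2
        (2, "x86_64-linux", 1),   # dependency that did not succeed
        (3, "x86_64-linux", -2),  # scheduled, depends on succeeded build 4
        (4, "x86_64-linux", 0),   # dependency that succeeded
    ],
)
conn.executemany("INSERT INTO BuildDependencies VALUES (?, ?)", [(1, 2), (3, 4)])

counts = conn.execute(PENDING, ("x86_64-linux",)).fetchall()
ready = [build_id for build_id, deps in counts if deps == 0]
print(counts, ready)
```

Only builds whose count is zero are candidates, which is why a build with
no recorded dependencies at all is always eligible to be picked.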

However:

  select * from jobs where build = 335212;
   name | evaluation | build | status | system
  ------+------------+-------+--------+--------
  (0 rows)

For clarity:

  select id,derivation,evaluation,job_name,nix_name,status from builds where id = 335212;
     id   |                            derivation                             | evaluation |       job_name        |     nix_name      | status
  --------+-------------------------------------------------------------------+------------+-----------------------+-------------------+--------
   335212 | /gnu/store/yzgcza0nijnp79mzz878q9a61p6jykgh-perftest-4.5-0.20.drv |     103435 | perftest.x86_64-linux | perftest-4.5-0.20 |     -2

The derivation is also missing from the Outputs table.  As a result, the
monster query in (db-get-builds ...), which workers call to fetch the
next job, returns nothing.
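
A check along the following lines could list every scheduled build that
lacks a Jobs or Outputs row. This is a sketch, not something taken from
Cuirass: it uses Python's sqlite3 module rather than PostgreSQL, the schema
is an assumption reduced to the columns shown in this report (the Outputs
columns in particular are guessed), and builds 335211 and 335213 are given
hypothetical derivation names.

```python
import sqlite3

# Assumed, simplified schema; Outputs is keyed by derivation here.
SCHEMA = """
CREATE TABLE Builds (id INTEGER PRIMARY KEY, derivation TEXT,
                     system TEXT, status INTEGER);
CREATE TABLE Jobs (name TEXT, evaluation INTEGER, build INTEGER,
                   status INTEGER, system TEXT);
CREATE TABLE Outputs (derivation TEXT, name TEXT, path TEXT);
"""

def find_orphaned_builds(conn, system):
    """Return ids of scheduled builds with no Jobs row or no Outputs row."""
    rows = conn.execute(
        """
        SELECT b.id FROM Builds AS b
        LEFT JOIN Jobs AS j ON j.build = b.id
        LEFT JOIN Outputs AS o ON o.derivation = b.derivation
        WHERE b.status = -2 AND b.system = ?
          AND (j.build IS NULL OR o.derivation IS NULL)
        GROUP BY b.id
        ORDER BY b.id
        """,
        (system,),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
perftest = "/gnu/store/yzgcza0nijnp79mzz878q9a61p6jykgh-perftest-4.5-0.20.drv"
conn.executemany(
    "INSERT INTO Builds VALUES (?, ?, ?, ?)",
    [
        (335211, "example-a.drv", "x86_64-linux", -2),  # hypothetical name
        (335212, perftest, "x86_64-linux", -2),         # from the report
        (335213, "example-b.drv", "x86_64-linux", -2),  # hypothetical name
    ],
)
# Only the neighbouring builds get Jobs and Outputs rows, as in the report.
conn.executemany(
    "INSERT INTO Jobs VALUES (?, ?, ?, ?, ?)",
    [
        ("example-a.x86_64-linux", 103436, 335211, -2, "x86_64-linux"),
        ("example-b.x86_64-linux", 103436, 335213, -2, "x86_64-linux"),
    ],
)
conn.executemany(
    "INSERT INTO Outputs VALUES (?, ?, ?)",
    [
        ("example-a.drv", "out", "example-a-output"),
        ("example-b.drv", "out", "example-b-output"),
    ],
)

orphans = find_orphaned_builds(conn, "x86_64-linux")
print(orphans)
```

The two LEFT JOINs act as anti-joins: a build survives the filter only
when at least one of the joined tables has no matching row for it.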

335212 belongs to evaluation 103435 according to the above query, but
does not show up here:

  https://ci.guix.gnu.org/eval/103435?all=&paginate=0

The build id sequence appears to belong to this evaluation:

  https://ci.guix.gnu.org/eval/103436?all=&paginate=0

(notice how it has 335211 and 335213).

I'm not sure how to recover from this.  Either manually create the
entries in Jobs and Outputs, or delete the offending Builds entry?

The 335212 build is for x86_64-linux; we have the same problem with
335087 (also perftest) on aarch64.  i686-linux and powerpc64le-linux are
fine.

Ideas?




Acknowledgement sent to Marius Bakke <marius@HIDDEN>:
New bug report received and forwarded. Copy sent to othacehe@HIDDEN, bug-guix@HIDDEN. Full text available.
Report forwarded to othacehe@HIDDEN, bug-guix@HIDDEN:
bug#60803; Package guix. Full text available.
Last modified: Sun, 14 Jul 2024 21:45:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.