GNU bug report logs - #61493
[PATCH 0/2] gnu: hwloc: Skip failing test on non-x86 systems.

Package: guix-patches;

Reported by: Simon South <simon <at> simonsouth.net>

Date: Mon, 13 Feb 2023 20:58:02 UTC

Severity: normal

Tags: patch

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 61493 in the body.
You can then email your comments to 61493 AT debbugs.gnu.org in the normal way.


Report forwarded to guix-patches <at> gnu.org:
bug#61493; Package guix-patches. (Mon, 13 Feb 2023 20:58:02 GMT)

Acknowledgement sent to Simon South <simon <at> simonsouth.net>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Mon, 13 Feb 2023 20:58:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Simon South <simon <at> simonsouth.net>
To: guix-patches <at> gnu.org
Subject: [PATCH 0/2] gnu: hwloc: Skip failing test on non-x86 systems.
Date: Mon, 13 Feb 2023 15:56:47 -0500

Here's a patch that circumvents a test failure in hwloc 2.9.0 on non-x86
systems (and specifically on AArch64), allowing the package to build
successfully on these machines.

An additional, bonus patch removes a pair of obsolete comments from the hwloc
package definitions.

I've tested these changes on x86-64 and AArch64 and generally, things seem
fine.

- On x86-64, of hwloc's 136 dependents the only seven[0] that fail to build
  appear to be existing failures, according to ci.guix.gnu.org.

- On AArch64, the package builds fine; many of its dependents fail (in fact I
  am still waiting for builds to complete) but again, none of the failures
  I've investigated appear to be new.

----------

Here's some background information regarding the fix in case it's useful:

One of hwloc's primary functions is to provide information about the host
computer's processor topology, in terms of NUMA nodes, CPU clusters and so on.
At start-up it tries to collect this information by querying a sequence of
"topology backends" that each implement a different strategy for detecting the
host system's configuration.

The first source of information is the operating system, so on most Guix
machines the "Linux" backend runs first.  This tries to pull information from
the /sys filesystem tree but since that's inaccessible from within build
containers, this always fails during hwloc's tests.

For x86 machines specifically, hwloc provides an architecture-specific,
fallback backend that can obtain the same information by querying the hardware
directly.  This normally succeeds within the build environment, and so hwloc
passes its tests without issue on x86 and x86-64 machines.

But those are the only platforms for which an architecture-specific topology
backend is provided: On other systems, once the Linux backend fails, hwloc has
nothing else to try and so any tests that rely on the host system's topology
having been detected will fail.
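
The fallback behaviour described above can be sketched as follows. This is
only an illustration of the "try each backend in turn" idea and is not taken
from hwloc's sources: the backend_fn type, the two stand-in backends and
discover_topology() are all hypothetical names invented for this example.

```c
#include <stddef.h>

/* A backend returns 0 on success and nonzero when it cannot detect
   the topology.  (Hypothetical signature, for illustration only.) */
typedef int (*backend_fn)(void);

/* Stand-in for the Linux backend inside a build container, where
   /sys is inaccessible: discovery always fails. */
int linux_backend_without_sys(void) { return -1; }

/* Stand-in for the x86 backend, which queries the hardware directly
   and so does not depend on /sys. */
int x86_backend(void) { return 0; }

/* Try each backend in order; return the index of the first one that
   succeeds, or -1 if every backend fails. */
int discover_topology(const backend_fn *backends, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (backends[i]() == 0)
            return (int) i;
    }
    return -1;
}
```

With the list { linux_backend_without_sys, x86_backend }, as on an x86
machine, discovery succeeds at the second entry. With only
linux_backend_without_sys, as on AArch64, every backend fails and any test
that needs the detected topology has nothing to work with.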

My patch fixes the build on these machines by skipping the one (other) test
that relies on this information being available, only on non-x86 systems where
the unavailability of /sys means certain failure.
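
The skip itself relies on a convention of Automake-style test suites, where a
test program that exits with status 77 is reported as skipped rather than
failed; patch 2/2 in this series inserts an "exit (77);" call into the test on
non-x86 systems. A minimal sketch of that convention follows, assuming a
hypothetical helper topology_test_status() that does not exist in hwloc:

```c
#include <stdlib.h>

/* Exit status that Automake's test harness reports as SKIP rather
   than PASS or FAIL. */
#define TEST_SKIP_STATUS 77

/* Hypothetical helper: choose the status with which a
   topology-dependent test should finish.  `have_topology` is nonzero
   when some backend managed to detect the host's topology. */
int topology_test_status(int have_topology, int checks_passed)
{
    if (!have_topology)
        return TEST_SKIP_STATUS;   /* nothing to test against: skip */
    return checks_passed ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A test's main() would return this value. Note that the actual patch decides
at package-build time, via target-x86?, whether to insert the skip, instead of
probing for a topology at run time as this sketch does.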

For reference, the backends mentioned above are implemented in hwloc's
hwloc/topology-linux.c and hwloc/topology-x86.c.

--
Simon South
simon <at> simonsouth.net

[0] combinatorial-blas, cube, elemental, elpa-openmpi, python-dolfin-adjoint,
    scorep-openmpi and superlu-dist.


Simon South (2):
  gnu: hwloc: Remove obsolete comments.
  gnu: hwloc: Skip failing test on non-x86 systems.

 gnu/packages/mpi.scm | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)


base-commit: 5b1eab43f011983d9ee560d6935409b6b39706ff
-- 
2.39.1





Information forwarded to guix-patches <at> gnu.org:
bug#61493; Package guix-patches. (Mon, 13 Feb 2023 21:02:02 GMT)

Message #8 received at 61493 <at> debbugs.gnu.org:

From: Simon South <simon <at> simonsouth.net>
To: 61493 <at> debbugs.gnu.org
Subject: [PATCH 2/2] gnu: hwloc: Skip failing test on non-x86 systems.
Date: Mon, 13 Feb 2023 16:01:12 -0500

* gnu/packages/mpi.scm (hwloc-2)[arguments]<#:phases>: Rename
"skip-test-that-requires-/sys" phase to "skip-tests-that-require-/sys" and
expand to skip additional test requiring /sys on non-x86 systems.
---
 gnu/packages/mpi.scm | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/gnu/packages/mpi.scm b/gnu/packages/mpi.scm
index febd0b4124..22d47b966c 100644
--- a/gnu/packages/mpi.scm
+++ b/gnu/packages/mpi.scm
@@ -164,10 +164,19 @@ (define-public hwloc-2
                (substitute* "tests/hwloc/linux-libnuma.c"
                  (("numa_available\\(\\)")
                   "-1"))))
-           (add-before 'check 'skip-test-that-requires-/sys
+           (add-before 'check 'skip-tests-that-require-/sys
              (lambda _
                ;; 'test-gather-topology.sh' requires /sys as of 2.9.0; skip it.
-               (setenv "HWLOC_TEST_GATHER_TOPOLOGY" "0")))
+               (setenv "HWLOC_TEST_GATHER_TOPOLOGY" "0")
+
+               ;; 'hwloc_backends' also requires /sys on non-x86 systems, for
+               ;; which hwloc lacks a topology backend not reliant on the
+               ;; operating system; skip it also on these machines.
+               (substitute* "tests/hwloc/hwloc_backends.c"
+                 ,@(if (not (target-x86?))
+                       '((("putenv\\(\\(char \\*\\) \"HWLOC_L" all)
+                          (string-append "exit (77);\n" all)))
+                       '()))))
            (add-before 'check 'skip-test-that-fails-on-qemu
              (lambda _
                ;; Skip test that fails on emulated hardware due to QEMU bug:
-- 
2.39.1





Information forwarded to guix-patches <at> gnu.org:
bug#61493; Package guix-patches. (Mon, 13 Feb 2023 21:02:03 GMT)

Message #11 received at 61493 <at> debbugs.gnu.org:

From: Simon South <simon <at> simonsouth.net>
To: 61493 <at> debbugs.gnu.org
Subject: [PATCH 1/2] gnu: hwloc: Remove obsolete comments.
Date: Mon, 13 Feb 2023 16:01:11 -0500

hwloc 2.x became the default with commit 8ec7ca22d3, "gnu: hwloc: Default to
2.x.".

* gnu/packages/mpi.scm (hwloc-1): Remove obsolete comment.
(hwloc-2): Remove obsolete comment.
---
 gnu/packages/mpi.scm | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gnu/packages/mpi.scm b/gnu/packages/mpi.scm
index 70b14c30b3..febd0b4124 100644
--- a/gnu/packages/mpi.scm
+++ b/gnu/packages/mpi.scm
@@ -53,8 +53,6 @@ (define-module (gnu packages mpi)
   #:use-module (ice-9 match))
 
 (define-public hwloc-1
-  ;; Note: For now we keep 1.x as the default because many packages have yet
-  ;; to migrate to 2.0.
   (package
     (name "hwloc")
     (version "1.11.13")
@@ -140,7 +138,6 @@ (define-public hwloc-1
     (license license:bsd-3)))
 
 (define-public hwloc-2
-  ;; Note: 2.x isn't the default yet, see above.
   (package
     (inherit hwloc-1)
     (version "2.9.0")
-- 
2.39.1





Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Mon, 27 Feb 2023 14:54:02 GMT)

Notification sent to Simon South <simon <at> simonsouth.net>:
bug acknowledged by developer. (Mon, 27 Feb 2023 14:54:02 GMT)

Message #16 received at 61493-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Simon South <simon <at> simonsouth.net>
Cc: 61493-done <at> debbugs.gnu.org
Subject: Re: bug#61493: [PATCH 0/2] gnu: hwloc: Skip failing test on non-x86
 systems.
Date: Mon, 27 Feb 2023 15:52:43 +0100

Hi Simon,

Simon South <simon <at> simonsouth.net> skribis:

> Here's a patch that circumvents a test failure in hwloc 2.9.0 on non-x86
> systems (and specifically on AArch64), allowing the package to build
> successfully on these machines.
>
> An additional, bonus patch removes a pair of obsolete comments from the hwloc
> package definitions.
>
> I've tested these changes on x86-64 and AArch64 and generally, things seem
> fine.
>
> - On x86-64, of hwloc's 136 dependents the only seven[0] that fail to build
>   appear to be existing failures, according to ci.guix.gnu.org.
>
> - On AArch64, the package builds fine; many of its dependents fail (in fact I
>   am still waiting for builds to complete) but again, none of the failures
>   I've investigated appear to be new.

It’s a clear improvement according to <https://qa.guix.gnu.org/issue/61493>.

> ----------
>
> Here's some background information regarding the fix in case it's useful:
>
> One of hwloc's primary functions is to provide information about the host
> computer's processor topology, in terms of NUMA nodes, CPU clusters and so on.
> At start-up it tries to collect this information by querying a sequence of
> "topology backends" that each implement a different strategy for detecting the
> host system's configuration.
>
> The first source of information is the operating system, so on most Guix
> machines the "Linux" backend runs first.  This tries to pull information from
> the /sys filesystem tree but since that's inaccessible from within build
> containers, this always fails during hwloc's tests.
>
> For x86 machines specifically, hwloc provides an architecture-specific,
> fallback backend that can obtain the same information by querying the hardware
> directly.  This normally succeeds within the build environment, and so hwloc
> passes its tests without issue on x86 and x86-64 machines.
>
> But those are the only platforms for which an architecture-specific topology
> backend is provided: On other systems, once the Linux backend fails, hwloc has
> nothing else to try and so any tests that rely on the host system's topology
> having been detected will fail.
>
> My patch fixes the build on these machines by skipping the one (other) test
> that relies on this information being available, only on non-x86 systems where
> the unavailability of /sys means certain failure.
>
> For reference, the backends mentioned above are implemented in hwloc's
> hwloc/topology-linux.c and hwloc/topology-x86.c.

Interesting, thanks for explaining!

Ludo’.




Information forwarded to guix-patches <at> gnu.org:
bug#61493; Package guix-patches. (Mon, 27 Feb 2023 15:38:01 GMT)

Message #19 received at 61493 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: Simon South <simon <at> simonsouth.net>
Cc: 61493 <at> debbugs.gnu.org
Subject: Re: bug#61493: [PATCH 0/2] gnu: hwloc: Skip failing test on non-x86
 systems.
Date: Mon, 27 Feb 2023 16:37:33 +0100

Hi again,

Simon South <simon <at> simonsouth.net> skribis:

> Here's a patch that circumvents a test failure in hwloc 2.9.0 on non-x86
> systems (and specifically on AArch64), allowing the package to build
> successfully on these machines.

I forwarded this to Brice Goglin, a colleague of mine and an hwloc
co-maintainer, and they kindly opened an issue upstream:

  https://github.com/open-mpi/hwloc/pull/570

Feel free to comment there!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 28 Mar 2023 11:24:14 GMT)
