GNU bug report logs - #67722
[PATCH] gnu: libtorrent-rasterbar: Work around hang in test_ssl.


Package: guix-patches

Reported by: Tomas Volf <~@wolfsden.cz>

Date: Sat, 9 Dec 2023 00:33:01 UTC

Severity: normal

Tags: patch

Done: Tomas Volf <~@wolfsden.cz>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 67722 in the body.
You can then email your comments to 67722 AT debbugs.gnu.org in the normal way.



Report forwarded to guix-patches <at> gnu.org:
bug#67722; Package guix-patches. (Sat, 09 Dec 2023 00:33:01 GMT)

Acknowledgement sent to Tomas Volf <~@wolfsden.cz>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Sat, 09 Dec 2023 00:33:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: guix-patches <at> gnu.org
Cc: Tomas Volf <~@wolfsden.cz>
Subject: [PATCH] gnu: libtorrent-rasterbar: Remove timeout for tests.
Date: Sat,  9 Dec 2023 01:31:25 +0100

The timeout is still enforced by the build farm for the build as a whole, so
it should not cause any builds to be permanently stuck.

* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout.

Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
 gnu/packages/bittorrent.scm | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 8c032940d4..5d7d05178b 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -470,7 +470,6 @@ (define-public libtorrent-rasterbar
                     (exclude-regex (string-append "^("
                                                   (string-join disabled-tests "|")
                                                   ")$"))
-                    (timeout "600")
                     (jobs (if parallel-tests?
                               (number->string (parallel-job-count))
                               "1")))
@@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
                  (invoke "ctest"
                          "-E" exclude-regex
                          "-j" jobs
-                         "--timeout" timeout
                          "--output-on-failure")
                  ;; test_ssl relies on bundled TLS certificates with a fixed
                  ;; expiry date.  To ensure succesful builds in the future,
@@ -488,16 +486,11 @@ (define-public libtorrent-rasterbar
                  ;; test_fast_extension, test_privacy and test_resolve_links
                  ;; to hang, even with FAKETIME_ONLY_CMDS.  Not sure why.  So
                  ;; execute only test_ssl under faketime.
-                 ;;
-                 ;; Note: The test_ssl test times out in the ci.
-                 ;; Temporarily disable it until that is resolved.
-                 ;; (invoke "faketime" "2022-10-24"
-                 ;;         "ctest"
-                 ;;         "-R" "^test_ssl$"
-                 ;;         "-j" jobs
-                 ;;         "--timeout" timeout
-                 ;;         "--output-on-failure")
-                 )))))))
+                 (invoke "faketime" "2022-10-24"
+                         "ctest"
+                         "-R" "^test_ssl$"
+                         "-j" jobs
+                         "--output-on-failure"))))))))
     (inputs (list boost openssl))
     (native-inputs `(("libfaketime" ,libfaketime)
                      ("python-wrapper" ,python-wrapper)

base-commit: 5e4c31518aba62b2cca7c346bcc56cfa9a4d10d0
-- 
2.41.0





Information forwarded to guix-patches <at> gnu.org:
bug#67722; Package guix-patches. (Sat, 09 Dec 2023 19:36:01 GMT)

Message #8 received at 67722 <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: 67722 <at> debbugs.gnu.org
Cc: Tomas Volf <~@wolfsden.cz>
Subject: [PATCH v2] gnu: libtorrent-rasterbar: Remove timeout for tests.
Date: Sat,  9 Dec 2023 20:34:55 +0100

The timeout is still enforced by the build farm for the build as a whole, so
it should not cause any builds to be permanently stuck.

* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout.

Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
 gnu/packages/bittorrent.scm | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 731c8e1c20..5d7d05178b 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -470,7 +470,6 @@ (define-public libtorrent-rasterbar
                     (exclude-regex (string-append "^("
                                                   (string-join disabled-tests "|")
                                                   ")$"))
-                    (timeout "600")
                     (jobs (if parallel-tests?
                               (number->string (parallel-job-count))
                               "1")))
@@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
                  (invoke "ctest"
                          "-E" exclude-regex
                          "-j" jobs
-                         "--timeout" timeout
                          "--output-on-failure")
                  ;; test_ssl relies on bundled TLS certificates with a fixed
                  ;; expiry date.  To ensure succesful builds in the future,
@@ -492,7 +490,6 @@ (define-public libtorrent-rasterbar
                          "ctest"
                          "-R" "^test_ssl$"
                          "-j" jobs
-                         "--timeout" timeout
                          "--output-on-failure"))))))))
     (inputs (list boost openssl))
     (native-inputs `(("libfaketime" ,libfaketime)

base-commit: 61f2d84e75c340c2ba528d392f522c51b8843f34
-- 
2.41.0





Information forwarded to guix-patches <at> gnu.org:
bug#67722; Package guix-patches. (Tue, 12 Dec 2023 08:12:01 GMT)

Message #11 received at 67722 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Tomas Volf <~@wolfsden.cz>
Cc: 67722 <at> debbugs.gnu.org
Subject: Re: [bug#67722] [PATCH v2] gnu: libtorrent-rasterbar: Remove timeout for tests.
Date: Tue, 12 Dec 2023 09:11:07 +0100

Hello,

Tomas Volf <~@wolfsden.cz> wrote:

> The timeout is still enforced by the build farm for the build as a whole, so
> it should not cause any builds to be permanently stuck.
>
> * gnu/packages/bittorrent.scm
> (libtorrent-rasterbar)[arguments]<#:phases>['check]: Remote test timeout.
>
> Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278

[...]

> -                    (timeout "600")
>                      (jobs (if parallel-tests?
>                                (number->string (parallel-job-count))
>                                "1")))
> @@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
>                   (invoke "ctest"
>                           "-E" exclude-regex
>                           "-j" jobs
> -                         "--timeout" timeout

What’s the rationale though?

If we know that tests, individually, are meant to take less than 10mn,
it still seems nicer to stop at 10mn rather than wait for the 1h
max-silent timeout, no?

Thanks,
Ludo’.




Information forwarded to guix-patches <at> gnu.org:
bug#67722; Package guix-patches. (Tue, 12 Dec 2023 23:04:02 GMT)

Message #14 received at 67722 <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 67722 <at> debbugs.gnu.org
Subject: Re: [bug#67722] [PATCH v2] gnu: libtorrent-rasterbar: Remove timeout for tests.
Date: Wed, 13 Dec 2023 00:03:05 +0100

On 2023-12-12 09:11:07 +0100, Ludovic Courtès wrote:
> What’s the rationale though?
>
> If we know that tests, individually, are meant to take less than 10mn,
> it still seems nicer to stop at 10mn rather than wait for the 1h
> max-silent timeout, no?

Originally the rationale was just to try it out and see whether it works in the
CI.  The timeout was not there originally; I added it while fixing the tests,
so I wondered whether it was a mistake.  Since the QA is still in "Pending",
the jury is still out on that one.

I do not know enough about the architecture and utilization of the build
machines to be sure, so one of my hypotheses was that the machine might have
been overloaded during the test run, causing the timeout.  So I wanted to test
it.  (I sent this as #67693 at first, but I think the CI got confused by the WIP
prefix.  I am not sure how to mark patches that are intended to check the CI
but not necessarily to be merged.  Sorry you wasted time reviewing this; I did
not expect anyone to look until the CI passes.)

In the meantime I ran the build locally to see whether I could reproduce the
hang, and I can.  After running in a loop for a couple of hours, the build
failed with the timeout (not sure in which round; guix build --rounds does not
report that).
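
Because guix build --rounds does not report which round failed, one way to find out is to drive the rebuilds one at a time from a small Guile script. The following is a minimal sketch, not something used in this thread. It assumes the package has already been built once, so that guix build --check has something to rebuild. The procedure name first-failing-round and the limit of 64 rounds are made up for illustration.

```scheme
;; Hypothetical helper, not part of Guix: rebuild PACKAGE up to MAX-ROUNDS
;; times with "guix build --check" and return the number of the first round
;; that fails, or #f if every round succeeds.
(define (first-failing-round package max-rounds)
  (let loop ((round 1))
    (cond ((> round max-rounds) #f)
          ;; system* returns a wait status; status:exit-val extracts the
          ;; exit code, or #f if the process was killed by a signal.
          ((eqv? 0 (status:exit-val
                    (system* "guix" "build" "--check" package)))
           (loop (+ round 1)))
          (else round))))

(let ((round (first-failing-round "libtorrent-rasterbar" 64)))
  (if round
      (format #t "build failed in round ~a~%" round)
      (format #t "all rounds passed~%")))
```

Note that --check also fails when the rebuilt output differs from the store item, so a reported round still needs a look at the build log to confirm that the failure was the test_ssl timeout.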

So it seems that test_ssl is just prone to sporadic failures.  Since we both
succeeded in building locally, I assume we just got unlucky (or lucky?) in the
CI.

I am currently testing a v3, and once it passes --rounds=64 (which will take a
while) I will send it as an updated patch.

Tomas

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.

Information forwarded to guix-patches <at> gnu.org:
bug#67722; Package guix-patches. (Wed, 13 Dec 2023 16:40:01 GMT)

Message #17 received at 67722 <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: 67722 <at> debbugs.gnu.org
Cc: Tomas Volf <~@wolfsden.cz>
Subject: [PATCH v3] gnu: libtorrent-rasterbar: Work around hang in test_ssl.
Date: Wed, 13 Dec 2023 17:38:57 +0100

test_ssl sometimes hangs (at least when executed under faketime).  It is
somewhat unlikely to happen, and (on my machine) required a build with
--rounds=32 to reproduce it.

The workaround is to set a somewhat lower timeout of 240s (expected test
duration * 5, rounded up to whole minutes) and retry a few times on failure.
In this way, --rounds=64 finished successfully (on my machine).

At the same time remove the timeout from the other tests, since it is not
necessary (they do not hang), and one of them runs for ~270s (almost half the
original timeout), so it could pose a problem on a slow or overloaded machine.

* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout for
most tests.  Lower the timeout for test_ssl.  Retry test_ssl on failure.

Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
 gnu/packages/bittorrent.scm | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 731c8e1c20..4585c3b088 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -470,7 +470,6 @@ (define-public libtorrent-rasterbar
                     (exclude-regex (string-append "^("
                                                   (string-join disabled-tests "|")
                                                   ")$"))
-                    (timeout "600")
                     (jobs (if parallel-tests?
                               (number->string (parallel-job-count))
                               "1")))
@@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
                  (invoke "ctest"
                          "-E" exclude-regex
                          "-j" jobs
-                         "--timeout" timeout
                          "--output-on-failure")
                  ;; test_ssl relies on bundled TLS certificates with a fixed
                  ;; expiry date.  To ensure succesful builds in the future,
@@ -492,7 +490,11 @@ (define-public libtorrent-rasterbar
                          "ctest"
                          "-R" "^test_ssl$"
                          "-j" jobs
-                         "--timeout" timeout
+                         ;; test_ssl sometimes hangs (at least when run under
+                         ;; faketime), therefore set a time limit and retry
+                         ;; few times on failure.
+                         "--timeout" "240"
+                         "--repeat" "until-pass:5"
                          "--output-on-failure"))))))))
     (inputs (list boost openssl))
     (native-inputs `(("libfaketime" ,libfaketime)

base-commit: 1b2505217cf222d98cc960b8510660976a01cfa1
-- 
2.41.0
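
The timeout rule and the resulting test_ssl invocation from the v3 patch above can be sketched as a standalone Guile script. This is only an illustration: the procedure names are made up, and the 45 s expected duration is an assumed value chosen so that the rule yields the 240 s from the commit message (the thread does not state the measured duration).

```scheme
;; Round SECONDS up to a whole number of minutes, expressed in seconds.
(define (round-up-to-minutes seconds)
  (* 60 (ceiling (/ seconds 60))))

;; Timeout rule from the commit message: expected duration times FACTOR,
;; rounded up to whole minutes.
(define* (test-ssl-timeout expected-seconds #:key (factor 5))
  (round-up-to-minutes (* expected-seconds factor)))

;; Argument list matching the invocation in the patch.  JOBS is a string,
;; as in the package definition.
(define (test-ssl-command jobs timeout attempts)
  (list "faketime" "2022-10-24"
        "ctest"
        "-R" "^test_ssl$"
        "-j" jobs
        "--timeout" (number->string timeout)
        "--repeat" (string-append "until-pass:" (number->string attempts))
        "--output-on-failure"))

;; 45 is an assumed expected duration: 45 * 5 = 225 s, rounded up to 240 s.
(display (string-join (test-ssl-command "1" (test-ssl-timeout 45) 5) " "))
(newline)
```

With --repeat until-pass:5, ctest reruns a failing or timed-out test until it passes, up to five attempts. The worst case for a test_ssl that hangs every time is therefore 5 * 240 s = 20 minutes, which is still well below the 1 h max-silent limit mentioned earlier in the thread.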





Information forwarded to guix-patches <at> gnu.org:
bug#67722; Package guix-patches. (Fri, 15 Dec 2023 11:34:01 GMT)

Message #20 received at 67722 <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: 67722 <at> debbugs.gnu.org
Cc: Tomas Volf <~@wolfsden.cz>
Subject: [PATCH v4] gnu: libtorrent-rasterbar: Work around hang in test_ssl.
Date: Fri, 15 Dec 2023 12:32:18 +0100

test_ssl sometimes hangs (at least when executed under faketime).  It is
somewhat unlikely to happen, and (on my machine) required a build with
--rounds=32 to reproduce it.

The workaround is to set a somewhat lower timeout of 240s (expected test
duration * 5, rounded up to whole minutes) and retry a few times on failure.
In this way, --rounds=64 finished successfully (on my machine).

At the same time remove the timeout from the other tests, since it is not
necessary (they do not hang), and one of them runs for ~270s (almost half the
original timeout), so it could pose a problem on a slow or overloaded machine.

* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout for
most tests.  Lower the timeout for test_ssl.  Retry test_ssl on failure.

Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
No changes, just rebase, resolving a merge conflict.

 gnu/packages/bittorrent.scm | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 8c032940d4..4585c3b088 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -470,7 +470,6 @@ (define-public libtorrent-rasterbar
                     (exclude-regex (string-append "^("
                                                   (string-join disabled-tests "|")
                                                   ")$"))
-                    (timeout "600")
                     (jobs (if parallel-tests?
                               (number->string (parallel-job-count))
                               "1")))
@@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
                  (invoke "ctest"
                          "-E" exclude-regex
                          "-j" jobs
-                         "--timeout" timeout
                          "--output-on-failure")
                  ;; test_ssl relies on bundled TLS certificates with a fixed
                  ;; expiry date.  To ensure succesful builds in the future,
@@ -488,16 +486,16 @@ (define-public libtorrent-rasterbar
                  ;; test_fast_extension, test_privacy and test_resolve_links
                  ;; to hang, even with FAKETIME_ONLY_CMDS.  Not sure why.  So
                  ;; execute only test_ssl under faketime.
-                 ;;
-                 ;; Note: The test_ssl test times out in the ci.
-                 ;; Temporarily disable it until that is resolved.
-                 ;; (invoke "faketime" "2022-10-24"
-                 ;;         "ctest"
-                 ;;         "-R" "^test_ssl$"
-                 ;;         "-j" jobs
-                 ;;         "--timeout" timeout
-                 ;;         "--output-on-failure")
-                 )))))))
+                 (invoke "faketime" "2022-10-24"
+                         "ctest"
+                         "-R" "^test_ssl$"
+                         "-j" jobs
+                         ;; test_ssl sometimes hangs (at least when run under
+                         ;; faketime), therefore set a time limit and retry
+                         ;; few times on failure.
+                         "--timeout" "240"
+                         "--repeat" "until-pass:5"
+                         "--output-on-failure"))))))))
     (inputs (list boost openssl))
     (native-inputs `(("libfaketime" ,libfaketime)
                      ("python-wrapper" ,python-wrapper)

base-commit: b681e339fa37f2a26763458ee56b31af1d6a7ec5
--
2.41.0




Changed bug title to '[PATCH] gnu: libtorrent-rasterbar: Work around hang in test_ssl.' from '[PATCH] gnu: libtorrent-rasterbar: Remove timeout for tests.' Request was from Tomas Volf <~@wolfsden.cz> to control <at> debbugs.gnu.org. (Tue, 16 Jan 2024 13:26:03 GMT)

Information forwarded to guix-patches <at> gnu.org:
bug#67722; Package guix-patches. (Tue, 16 Jan 2024 13:28:01 GMT)

Message #25 received at 67722 <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: 67722 <at> debbugs.gnu.org
Subject: Re: [bug#67722] [PATCH v4] gnu: libtorrent-rasterbar: Work around hang in test_ssl.
Date: Tue, 16 Jan 2024 14:27:28 +0100

Polite ping.  Would anyone have time to look into this?




bug closed, send any further explanations to 67722 <at> debbugs.gnu.org and Tomas Volf <~@wolfsden.cz> Request was from Tomas Volf <~@wolfsden.cz> to control <at> debbugs.gnu.org. (Sun, 06 Oct 2024 16:13:03 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 04 Nov 2024 12:24:07 GMT)

This bug report was last modified 129 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.