GNU bug report logs -
#67722
[PATCH] gnu: libtorrent-rasterbar: Work around hang in test_ssl.
Reported by: Tomas Volf <~@wolfsden.cz>
Date: Sat, 9 Dec 2023 00:33:01 UTC
Severity: normal
Tags: patch
Done: Tomas Volf <~@wolfsden.cz>
Bug is archived. No further changes may be made.
Report forwarded to guix-patches <at> gnu.org: bug#67722; Package guix-patches. (Sat, 09 Dec 2023 00:33:01 GMT)
Acknowledgement sent to Tomas Volf <~@wolfsden.cz>: New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Sat, 09 Dec 2023 00:33:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
The timeout is still enforced by the build farm for the build as a whole, so
it should not cause any builds to be permanently stuck.
* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout.
Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
gnu/packages/bittorrent.scm | 17 +++++------------
1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 8c032940d4..5d7d05178b 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -470,7 +470,6 @@ (define-public libtorrent-rasterbar
(exclude-regex (string-append "^("
(string-join disabled-tests "|")
")$"))
- (timeout "600")
(jobs (if parallel-tests?
(number->string (parallel-job-count))
"1")))
@@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
(invoke "ctest"
"-E" exclude-regex
"-j" jobs
- "--timeout" timeout
"--output-on-failure")
;; test_ssl relies on bundled TLS certificates with a fixed
;; expiry date. To ensure succesful builds in the future,
@@ -488,16 +486,11 @@ (define-public libtorrent-rasterbar
;; test_fast_extension, test_privacy and test_resolve_links
;; to hang, even with FAKETIME_ONLY_CMDS. Not sure why. So
;; execute only test_ssl under faketime.
- ;;
- ;; Note: The test_ssl test times out in the ci.
- ;; Temporarily disable it until that is resolved.
- ;; (invoke "faketime" "2022-10-24"
- ;; "ctest"
- ;; "-R" "^test_ssl$"
- ;; "-j" jobs
- ;; "--timeout" timeout
- ;; "--output-on-failure")
- )))))))
+ (invoke "faketime" "2022-10-24"
+ "ctest"
+ "-R" "^test_ssl$"
+ "-j" jobs
+ "--output-on-failure"))))))))
(inputs (list boost openssl))
(native-inputs `(("libfaketime" ,libfaketime)
("python-wrapper" ,python-wrapper)
base-commit: 5e4c31518aba62b2cca7c346bcc56cfa9a4d10d0
--
2.41.0
Information forwarded to guix-patches <at> gnu.org: bug#67722; Package guix-patches. (Sat, 09 Dec 2023 19:36:01 GMT)
Message #8 received at 67722 <at> debbugs.gnu.org:
The timeout is still enforced by the build farm for the build as a whole, so
it should not cause any builds to be permanently stuck.
* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout.
Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
gnu/packages/bittorrent.scm | 3 ---
1 file changed, 3 deletions(-)
diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 731c8e1c20..5d7d05178b 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -470,7 +470,6 @@ (define-public libtorrent-rasterbar
(exclude-regex (string-append "^("
(string-join disabled-tests "|")
")$"))
- (timeout "600")
(jobs (if parallel-tests?
(number->string (parallel-job-count))
"1")))
@@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
(invoke "ctest"
"-E" exclude-regex
"-j" jobs
- "--timeout" timeout
"--output-on-failure")
;; test_ssl relies on bundled TLS certificates with a fixed
;; expiry date. To ensure succesful builds in the future,
@@ -492,7 +490,6 @@ (define-public libtorrent-rasterbar
"ctest"
"-R" "^test_ssl$"
"-j" jobs
- "--timeout" timeout
"--output-on-failure"))))))))
(inputs (list boost openssl))
(native-inputs `(("libfaketime" ,libfaketime)
base-commit: 61f2d84e75c340c2ba528d392f522c51b8843f34
--
2.41.0
Information forwarded to guix-patches <at> gnu.org: bug#67722; Package guix-patches. (Tue, 12 Dec 2023 08:12:01 GMT)
Message #11 received at 67722 <at> debbugs.gnu.org:
Hello,
Tomas Volf <~@wolfsden.cz> writes:
> The timeout is still enforced by the build farm for the build as a whole, so
> it should not cause any builds to be permanently stuck.
>
> * gnu/packages/bittorrent.scm
> (libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout.
>
> Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
[...]
> - (timeout "600")
> (jobs (if parallel-tests?
> (number->string (parallel-job-count))
> "1")))
> @@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
> (invoke "ctest"
> "-E" exclude-regex
> "-j" jobs
> - "--timeout" timeout
What’s the rationale though?
If we know that tests, individually, are meant to take less than 10 min,
it still seems nicer to stop at 10 min rather than wait for the 1 h
max-silent timeout, no?
Thanks,
Ludo’.
Information forwarded to guix-patches <at> gnu.org: bug#67722; Package guix-patches. (Tue, 12 Dec 2023 23:04:02 GMT)
Message #14 received at 67722 <at> debbugs.gnu.org:
On 2023-12-12 09:11:07 +0100, Ludovic Courtès wrote:
> What’s the rationale though?
>
> If we know that tests, individually, are meant to take less than 10 min,
> it still seems nicer to stop at 10 min rather than wait for the 1 h
> max-silent timeout, no?
Originally the rationale was just to try it out and see whether it works in the
CI. The timeout was not there originally; I added it while fixing the tests, so
I wondered whether it was a mistake. Since the QA is still in "Pending", the
jury is still out on that one.
I do not know enough about the architecture and utilization of the build
machines to be sure, so one of my hypotheses was that the machine might have
been overloaded during the test run, causing the timeout. So I wanted to test
it. (I sent this as #67693 at first, but I think the CI got confused by the WIP
prefix. I am not sure how to mark patches that are intended to check the CI but
not necessarily to be merged. Sorry you spent time reviewing this; I did not
expect anyone to look until the CI passes.)
In the meantime I was running the build locally to see whether I could reproduce
the hang, and I can. After running in a loop for a couple of hours, the build
failed with the timeout (not sure in which round; guix build --rounds does not
report that).
So it seems that test_ssl is just prone to sporadic failures. Since we both
succeeded in building locally, I assume we just got unlucky (or lucky?) in the
CI.
I am currently testing a v3, and once it passes --rounds=64 (which will take a
while) I will send it as an updated patch.
Tomas
--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
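The message above notes that a looped build does not report which round hit the timeout. As a rough sketch only (not something used in this thread), a small Python wrapper could rerun a command and report the first failing round. The function name `first_failing_round` and its parameters are hypothetical, and the command to run is left to the caller.

```python
import subprocess
import sys


def first_failing_round(cmd, max_rounds, timeout=None):
    """Run cmd up to max_rounds times.

    Return the 1-based number of the first round that exits non-zero
    or exceeds `timeout` seconds, or None if every round passed.
    """
    for round_no in range(1, max_rounds + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising, so a hang
            # is reported as a failure of this round.
            return round_no
        if result.returncode != 0:
            return round_no
    return None


if __name__ == "__main__":
    # Stand-in command for demonstration; replace it with the real
    # build or test invocation.
    always_ok = [sys.executable, "-c", "pass"]
    print(first_failing_round(always_ok, max_rounds=3, timeout=60))
```

Passing the actual build command as `cmd` and a generous `timeout` would give the round number that the looped build does not report.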
Information forwarded to guix-patches <at> gnu.org: bug#67722; Package guix-patches. (Wed, 13 Dec 2023 16:40:01 GMT)
Message #17 received at 67722 <at> debbugs.gnu.org:
test_ssl does sometimes hang (at least when executed under faketime). The hang
is fairly rare: on my machine it took a build with --rounds=32 to reproduce it.
The workaround is to set a somewhat lower timeout of 240s (expected test
duration * 5, rounded up to whole minutes) and retry a few times on failure.
With this change, --rounds=64 finished successfully (on my machine).
At the same time, remove the timeout from the other tests, since it is not
necessary (they do not hang), and one of them runs for ~270s (almost half the
original timeout), so it could pose a problem on a slow or overloaded machine.
* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout for
most tests. Lower the timeout for test_ssl. Retry test_ssl on failure.
Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
gnu/packages/bittorrent.scm | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 731c8e1c20..4585c3b088 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -470,7 +470,6 @@ (define-public libtorrent-rasterbar
(exclude-regex (string-append "^("
(string-join disabled-tests "|")
")$"))
- (timeout "600")
(jobs (if parallel-tests?
(number->string (parallel-job-count))
"1")))
@@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
(invoke "ctest"
"-E" exclude-regex
"-j" jobs
- "--timeout" timeout
"--output-on-failure")
;; test_ssl relies on bundled TLS certificates with a fixed
;; expiry date. To ensure succesful builds in the future,
@@ -492,7 +490,11 @@ (define-public libtorrent-rasterbar
"ctest"
"-R" "^test_ssl$"
"-j" jobs
- "--timeout" timeout
+ ;; test_ssl sometimes hangs (at least when run under
+ ;; faketime), therefore set a time limit and retry
+ ;; few times on failure.
+ "--timeout" "240"
+ "--repeat" "until-pass:5"
"--output-on-failure"))))))))
(inputs (list boost openssl))
(native-inputs `(("libfaketime" ,libfaketime)
base-commit: 1b2505217cf222d98cc960b8510660976a01cfa1
--
2.41.0
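The patch above combines ctest's `--timeout 240` with `--repeat until-pass:5`, which lets a test run up to five times in order to pass. The retry behaviour can be approximated outside ctest as follows. This is a simplified sketch, not ctest's implementation; `run_until_pass` and its arguments are hypothetical names.

```python
import random  # only used by the demonstration command below
import subprocess
import sys


def run_until_pass(cmd, attempts, timeout):
    """Return True as soon as one attempt exits 0 within `timeout` seconds.

    A timed-out attempt counts as a failure and triggers the next attempt.
    Return False once all attempts have failed.
    """
    for _ in range(attempts):
        try:
            if subprocess.run(cmd, timeout=timeout).returncode == 0:
                return True
        except subprocess.TimeoutExpired:
            pass  # treat the hang as a failed attempt and retry
    return False


if __name__ == "__main__":
    # Stand-in for a flaky test: fails about half of the time.
    flaky = [sys.executable, "-c",
             "import random, sys; sys.exit(1 if random.random() < 0.5 else 0)"]
    print(run_until_pass(flaky, attempts=5, timeout=240))
```

With this scheme a single sporadic hang costs at most one timeout period, and the run as a whole fails only if every attempt fails.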
Information forwarded to guix-patches <at> gnu.org: bug#67722; Package guix-patches. (Fri, 15 Dec 2023 11:34:01 GMT)
Message #20 received at 67722 <at> debbugs.gnu.org:
test_ssl does sometimes hang (at least when executed under faketime). The hang
is fairly rare: on my machine it took a build with --rounds=32 to reproduce it.
The workaround is to set a somewhat lower timeout of 240s (expected test
duration * 5, rounded up to whole minutes) and retry a few times on failure.
With this change, --rounds=64 finished successfully (on my machine).
At the same time, remove the timeout from the other tests, since it is not
necessary (they do not hang), and one of them runs for ~270s (almost half the
original timeout), so it could pose a problem on a slow or overloaded machine.
* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout for
most tests. Lower the timeout for test_ssl. Retry test_ssl on failure.
Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
No changes, just a rebase resolving a merge conflict.
gnu/packages/bittorrent.scm | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 8c032940d4..4585c3b088 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -470,7 +470,6 @@ (define-public libtorrent-rasterbar
(exclude-regex (string-append "^("
(string-join disabled-tests "|")
")$"))
- (timeout "600")
(jobs (if parallel-tests?
(number->string (parallel-job-count))
"1")))
@@ -478,7 +477,6 @@ (define-public libtorrent-rasterbar
(invoke "ctest"
"-E" exclude-regex
"-j" jobs
- "--timeout" timeout
"--output-on-failure")
;; test_ssl relies on bundled TLS certificates with a fixed
;; expiry date. To ensure succesful builds in the future,
@@ -488,16 +486,16 @@ (define-public libtorrent-rasterbar
;; test_fast_extension, test_privacy and test_resolve_links
;; to hang, even with FAKETIME_ONLY_CMDS. Not sure why. So
;; execute only test_ssl under faketime.
- ;;
- ;; Note: The test_ssl test times out in the ci.
- ;; Temporarily disable it until that is resolved.
- ;; (invoke "faketime" "2022-10-24"
- ;; "ctest"
- ;; "-R" "^test_ssl$"
- ;; "-j" jobs
- ;; "--timeout" timeout
- ;; "--output-on-failure")
- )))))))
+ (invoke "faketime" "2022-10-24"
+ "ctest"
+ "-R" "^test_ssl$"
+ "-j" jobs
+ ;; test_ssl sometimes hangs (at least when run under
+ ;; faketime), therefore set a time limit and retry
+ ;; few times on failure.
+ "--timeout" "240"
+ "--repeat" "until-pass:5"
+ "--output-on-failure"))))))))
(inputs (list boost openssl))
(native-inputs `(("libfaketime" ,libfaketime)
("python-wrapper" ,python-wrapper)
base-commit: b681e339fa37f2a26763458ee56b31af1d6a7ec5
--
2.41.0
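The 240s limit in the commit message is described as the expected test duration times 5, rounded up to whole minutes. That rule can be written out as below. The thread does not state the expected duration of test_ssl, so the 40-second input is purely an illustrative assumption, and `ctest_timeout` is a hypothetical helper name.

```python
import math


def ctest_timeout(expected_seconds, factor=5):
    """Scale the expected duration by `factor`, then round up to whole minutes."""
    return math.ceil(expected_seconds * factor / 60) * 60


if __name__ == "__main__":
    # 40 s is an assumed duration, not a figure from the thread.
    print(ctest_timeout(40))   # 40 * 5 = 200 s, rounded up to 240 s
    # The same rule applied to the ~270 s test mentioned in the message.
    print(ctest_timeout(270))  # 270 * 5 = 1350 s, rounded up to 1380 s
```

Under this rule any expected duration above 36s and up to 48s yields 240s. Applied to the ~270s test, the rule would give 1380s, well above the original 600s limit.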
Changed bug title to '[PATCH] gnu: libtorrent-rasterbar: Work around hang in test_ssl.' from '[PATCH] gnu: libtorrent-rasterbar: Remove timeout for tests.'
Request was from Tomas Volf <~@wolfsden.cz> to control <at> debbugs.gnu.org. (Tue, 16 Jan 2024 13:26:03 GMT)
Information forwarded to guix-patches <at> gnu.org: bug#67722; Package guix-patches. (Tue, 16 Jan 2024 13:28:01 GMT)
Message #25 received at 67722 <at> debbugs.gnu.org:
Polite ping. Would anyone have time to look into this?
bug closed, send any further explanations to
67722 <at> debbugs.gnu.org and Tomas Volf <~@wolfsden.cz>
Request was from Tomas Volf <~@wolfsden.cz> to control <at> debbugs.gnu.org. (Sun, 06 Oct 2024 16:13:03 GMT)
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 04 Nov 2024 12:24:07 GMT)