GNU bug report logs - #31776
guile-2.2: FTBFS on armhf: FAIL: gc.test: gc: after-gc-hook gets called


Package: guile

Reported by: Rob Browning <rlb <at> defaultvalue.org>

Date: Sun, 10 Jun 2018 17:33:02 UTC

Severity: normal

To reply to this bug, email your comments to 31776 AT debbugs.gnu.org.



Report forwarded to bug-guile <at> gnu.org:
bug#31776; Package guile. (Sun, 10 Jun 2018 17:33:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Rob Browning <rlb <at> defaultvalue.org>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sun, 10 Jun 2018 17:33:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Rob Browning <rlb <at> defaultvalue.org>
To: bug-guile <at> gnu.org
Cc: Emilio Pozuelo Monfort <pochu <at> debian.org>, 900652 <at> bugs.debian.org,
 900652-forwarded <at> bugs.debian.org
Subject: guile-2.2: FTBFS on armhf: FAIL: gc.test: gc: after-gc-hook gets
 called
Date: Sun, 10 Jun 2018 12:32:11 -0500
It looks like gc.test may be failing intermittently in Debian (see below).
Searching around I saw at least one other report of this in the #guile
logs from last year.

For now, I'm wondering if it would be plausible to mark the test as
unresolved to avoid guile-2.2's removal from Debian testing, or if the
failure is likely to indicate a problem serious enough to warrant that
removal.

Emilio Pozuelo Monfort <pochu <at> debian.org> writes:

> Your package failed to build on armhf:
>
> Running gc.test
> FAIL: gc.test: gc: after-gc-hook gets called
> [...]
> Totals for this test run:
> passes:                 40732
> failures:               1
> unexpected passes:      0
> expected failures:      10
> unresolved test cases:  578
> untested test cases:    1
> unsupported test cases: 1
> errors:                 0
>
> FAIL: check-guile
> ==================================
> 1 of 1 test failed
>
>
> Full log at https://buildd.debian.org/status/package.php?p=guile-2.2

Thanks
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4




Information forwarded to bug-guile <at> gnu.org:
bug#31776; Package guile. (Sat, 16 Jun 2018 22:08:01 GMT) Full text and rfc822 format available.

Message #8 received at 31776 <at> debbugs.gnu.org (full text, mbox):

From: Rob Browning <rlb <at> defaultvalue.org>
To: 31776 <at> debbugs.gnu.org
Cc: Emilio Pozuelo Monfort <pochu <at> debian.org>, 900652 <at> bugs.debian.org,
 900652-forwarded <at> bugs.debian.org
Subject: Re: bug#31776: guile-2.2: FTBFS on armhf: FAIL: gc.test:
 gc: after-gc-hook gets called
Date: Sat, 16 Jun 2018 17:07:05 -0500
Rob Browning <rlb <at> defaultvalue.org> writes:

> It looks like gc.test may be failing intermittently in Debian (see below).
> Searching around I saw at least one other report of this in the #guile
> logs from last year.
>
> For now, I'm wondering if it would be plausible to mark the test as
> unresolved to avoid guile-2.2's removal from Debian testing, or if the
> failure is likely to indicate a problem serious enough to warrant that
> removal.

Just wanted to check back about this.  It's caused a build on the buildds
to fail again.

Thanks
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4




Information forwarded to bug-guile <at> gnu.org:
bug#31776; Package guile. (Thu, 19 Jul 2018 16:17:01 GMT) Full text and rfc822 format available.

Message #11 received at 31776 <at> debbugs.gnu.org (full text, mbox):

From: Rob Browning <rlb <at> defaultvalue.org>
To: 31776 <at> debbugs.gnu.org
Cc: Emilio Pozuelo Monfort <pochu <at> debian.org>, 900652 <at> bugs.debian.org,
 900652-forwarded <at> bugs.debian.org, guile-devel <at> gnu.org
Subject: Re: bug#31776: guile-2.2: FTBFS on armhf: FAIL:
 gc.test: gc: after-gc-hook gets called
Date: Thu, 19 Jul 2018 11:16:38 -0500
Rob Browning <rlb <at> defaultvalue.org> writes:

> Rob Browning <rlb <at> defaultvalue.org> writes:
>
>> It looks like gc.test may be failing intermittently in Debian (see below).
>> Searching around I saw at least one other report of this in the #guile
>> logs from last year.
>>
>> For now, I'm wondering if it would be plausible to mark the test as
>> unresolved to avoid guile-2.2's removal from Debian testing, or if the
>> failure is likely to indicate a problem serious enough to warrant that
>> removal.
>
> Just wanted to check back about this.  It's caused a build on the buildds
> to fail again.

As an update, if we don't resolve this, guile-2.2 will be removed from
Debian testing this weekend.

Thanks
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4




Information forwarded to bug-guile <at> gnu.org:
bug#31776; Package guile. (Mon, 06 Aug 2018 13:22:01 GMT) Full text and rfc822 format available.

Message #14 received at 31776 <at> debbugs.gnu.org (full text, mbox):

From: Göran Weinholt <goran <at> weinholt.se>
To: Rob Browning <rlb <at> defaultvalue.org>
Cc: Emilio Pozuelo Monfort <pochu <at> debian.org>, 900652-forwarded <at> bugs.debian.org,
 guile-devel <at> gnu.org, 31776 <at> debbugs.gnu.org
Subject: Re: bug#31776: guile-2.2: FTBFS on armhf: FAIL:
 gc.test: gc: after-gc-hook gets called
Date: Mon, 06 Aug 2018 15:20:32 +0200
Rob Browning <rlb <at> defaultvalue.org> writes:

> Rob Browning <rlb <at> defaultvalue.org> writes:
>
>> Rob Browning <rlb <at> defaultvalue.org> writes:
>>
>>> It looks like gc.test may be failing intermittently in Debian (see below).
>>> Searching around I saw at least one other report of this in the #guile
>>> logs from last year.
>>>
>>> For now, I'm wondering if it would be plausible to mark the test as
>>> unresolved to avoid guile-2.2's removal from Debian testing, or if the
>>> failure is likely to indicate a problem serious enough to warrant that
>>> removal.
>>
>> Just wanted to check back about this.  It's caused a build on the buildds
>> to fail again.
>
> As an update, if we don't resolve this, guile-2.2 will be removed from
> Debian testing this weekend.

Hello Rob,

The test fails with 2.2.4+1-1 on amd64 as well:

https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/guile-2.2.html

It's really tricky to get it to fail predictably, but you can improve your
odds by testing only asyncs.test and gc.test:

 apt-get source guile-2.2
 cd guile-2.2-2.2.4+1
 dpkg-buildpackage -us -uc
 mkdir test-suite/async-tests
 cp test-suite/tests/{asyncs,gc}.test test-suite/async-tests/
 meta/guile --debug -L $PWD/test-suite --no-auto-compile \
   -e main -s $PWD/test-suite/guile-test \
   --test-suite $PWD/test-suite/async-tests \
   --log-file check-guile-async.log

Try the last command around a dozen times and it'll fail eventually.
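
The "run it a dozen times" step can be scripted. What follows is only a minimal sketch: the helper name repeat_until_fail is made up for illustration, and it assumes that guile-test exits with a non-zero status when any test fails. If that assumption does not hold for your build, check the log file for the FAIL line instead.

```shell
#!/bin/sh
# repeat_until_fail MAX CMD [ARGS...]   (hypothetical helper, not part of Guile)
# Runs CMD up to MAX times and stops at the first non-zero exit status.
# Prints which run failed and returns 1, or returns 0 if every run passed.
repeat_until_fail() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "failed on run $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max runs"
    return 0
}

# Usage, from the top of the built source tree (command taken from the
# steps above; assumes guile-test signals failure via its exit status):
# repeat_until_fail 12 meta/guile --debug -L "$PWD/test-suite" --no-auto-compile \
#     -e main -s "$PWD/test-suite/guile-test" \
#     --test-suite "$PWD/test-suite/async-tests" \
#     --log-file check-guile-async.log
```

Stopping at the first failure keeps check-guile-async.log from being overwritten by a later passing run.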

I didn't get further with debugging than determining that something
probably goes wrong in the interaction between queue_after_gc_hook(),
scm_i_async_push() and scm_i_async_pop(). Every time there was a
failure, this condition was false (the cdr was set to the empty list):

  if (scm_is_false (SCM_CDR (after_gc_async_cell)))
    {
      SCM_SETCDR (after_gc_async_cell, t->pending_asyncs);
      t->pending_asyncs = after_gc_async_cell;
    }

Regards,

-- 
Göran Weinholt
Debian developer
73 de SA6CJK

Information forwarded to bug-guile <at> gnu.org:
bug#31776; Package guile. (Tue, 16 Apr 2019 21:40:02 GMT) Full text and rfc822 format available.

Message #17 received at 31776 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Andrea Azzarone <azzaronea <at> gmail.com>
Cc: guile-devel <at> gnu.org, 31776 <at> debbugs.gnu.org
Subject: Re: [PATCH] Fix gc.test "after-gc-hook gets called" failures
Date: Tue, 16 Apr 2019 17:38:00 -0400
Hi Andrea,

Andrea Azzarone <azzaronea <at> gmail.com> writes:

> "after-gc-hook gets called" test randomly fails as reported
> downstream, for example:
> - https://debbugs.gnu.org/cgi/bugreport.cgi?bug=31776
> - https://bugs.launchpad.net/ubuntu/+source/guile-2.2/+bug/1823459
>
> I'm attaching a patch that seems to fix the failures.
>
> From 2efba337d5b636cd975260f19ea74e27ecf0ca17 Mon Sep 17 00:00:00 2001
> From: Andrea Azzarone <andrea.azzarone <at> canonical.com>
> Date: Thu, 11 Apr 2019 16:30:58 +0100
> Subject: Fix gc.test "after-gc-hook gets called" failures
>
> * libguile/scmsigs.c: Call scm_async_tick to give any pending asyncs a chance to
>   run before we block indefinitely waiting for a signal to arrive.

Thanks for this.  I pushed your commit (with minor reformatting) to our
'stable-2.2' branch as commit 546b0e87294b837ec29164d87cf17102e9aeee0c.
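
To check whether a given Guile git checkout already contains that commit, something like the following sketch may help. The helper name has_fix and the checkout path are hypothetical; the commit ID is the one quoted above. Note that a distribution package may carry the same change as a cherry-picked patch under a different hash, in which case this check reports it as missing.

```shell
#!/bin/sh
# Commit ID quoted in the message above (stable-2.2 branch).
FIX_COMMIT=546b0e87294b837ec29164d87cf17102e9aeee0c

# has_fix REPO [REF]   (hypothetical helper)
# Succeeds if FIX_COMMIT is an ancestor of REF (default HEAD) in REPO.
# If the commit is unknown to the repository, git exits non-zero, so that
# case is also reported as "not present".
has_fix() {
    repo=$1
    ref=${2:-HEAD}
    git -C "$repo" merge-base --is-ancestor "$FIX_COMMIT" "$ref" 2>/dev/null
}

# Usage (the checkout path is an assumption):
# if has_fix ~/src/guile; then echo "fix present"; else echo "fix missing"; fi
```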

I believe that this will prevent the problem from happening in the most
common cases, e.g. when there's only one user-visible thread, or when
there are no long-sleeping user-visible threads.

However, it occurs to me that in a multithreaded Guile program, a user
thread might trigger a GC and then sleep for a long time, without
calling 'scm_async_tick' in between.  If we're unlucky and the
'after_gc_async' gets queued in the wrong thread, it might be a long
time before the hook runs.

Fundamentally, the problem we face here is similar to the thorny
problems faced with finalizers and signal handlers: we must choose a
proper time and context for them to be run safely, when the data they
need to access is in a consistent state, etc.

To deal with the issues around finalizers, Guile recently gained a
finalizer thread.  It may be that we should arrange to run the
'after_gc_async' in the finalizer thread as well, instead of whatever
random thread we happen to be in when GC is triggered.

Thoughts?

      Regards,
        Mark




This bug report was last modified 5 years and 3 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.