GNU bug report logs - #62691
Calling system* in the module body hangs Guile, while calling open-pipe* does not

Previous Next

Package: guile;

Reported by: Михаил Бахтерев <mike.bakhterev <at> gmail.com>

Date: Thu, 6 Apr 2023 06:26:03 UTC

Severity: normal

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 62691 in the body.
You can then email your comments to 62691 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guile <at> gnu.org:
bug#62691; Package guile. (Thu, 06 Apr 2023 06:26:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Михаил Бахтерев <mike.bakhterev <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Thu, 06 Apr 2023 06:26:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Михаил Бахтерев
 <mike.bakhterev <at> gmail.com>
To: bug-guile <at> gnu.org
Subject: Calling system* in the module body hangs Guile, while calling
 open-pipe* does not
Date: Thu, 6 Apr 2023 01:47:54 +0500
[Message part 1 (text/plain, inline)]
Greetings! I've hit the following issue.

1. $ guile --version
guile (GNU Guile) 3.0.9

2.  uname -a
Linux ein 6.2.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 30 Mar 2023 14:51:14
+0000 x86_64 GNU/Linux

3. $ pacman -Qi guile
Name            : guile
Version         : 3.0.9-1
Description     : Portable, embeddable Scheme implementation written in C
Architecture    : x86_64

4. When loading (i am not sure about the stage) module which contains in
the body system* call Guile hangs on futex operation. The code to reproduce
the behavior.

$ cat a.scm
(add-to-load-path ".")
(import (b))
(display "hello world from SCM!")
(newline)

$ cat b.scm
(define-module (b))
(system* "echo" "hello world from SYS!")

$ guile a.scm
HANGS HERE!

But if system* is changed to open-pipe*, like so:
$ cat b.scm
(define-module (b) #:use-module (ice-9 popen))
(close-pipe (open-pipe* OPEN_WRITE "echo" "hello world from SYS!"))

everything seems ok:
$ guile a.scm
hello world from SYS!
Hello World from SCM!

- MB, with best regards.
[Message part 2 (text/html, inline)]

Information forwarded to bug-guile <at> gnu.org:
bug#62691; Package guile. (Sun, 09 Apr 2023 07:59:01 GMT) Full text and rfc822 format available.

Message #8 received at 62691 <at> debbugs.gnu.org (full text, mbox):

From: Timothy Sample <samplet <at> ngyro.com>
To: Михаил Бахтерев
 <mike.bakhterev <at> gmail.com>
Cc: 62691 <at> debbugs.gnu.org
Subject: Re: bug#62691: Calling system* in the module body hangs Guile,
 while calling open-pipe* does not
Date: Sun, 09 Apr 2023 01:58:48 -0600
Hi!

Михаил Бахтерев <mike.bakhterev <at> gmail.com> writes:

> Greetings! I've hit the following issue.
>
> [...]
>
> 4. When loading (i am not sure about the stage) module which contains
> in the body system* call Guile hangs on futex operation. The code to
> reproduce the behavior.
>
> $ cat a.scm
> (add-to-load-path ".")
> (import (b))
> (display "hello world from SCM!")
> (newline)
>  
> $ cat b.scm 
> (define-module (b))
> (system* "echo" "hello world from SYS!")
>
> $ guile a.scm 
> HANGS HERE!

It hangs for me, too.  It turns out that Guile cannot start its signal
handling thread during module resolution.  A slightly simpler reproducer
can be obtained by replacing the call to ‘system*’ with:

  (sigaction SIGINT SIG_DFL)

It doesn’t matter which signal or how it’s handled, just that calling
‘sigaction’ forces Guile to start the signal handling thread.  (The
reason ‘system*’ hangs is that it calls ‘sigaction’ under the hood.)

If I’m reading everything right, this is because Guile uses
‘call-with-new-thread’ to start the signal handling thread.  That
procedure creates a new thread, and among other things, sets an object
property on the thread [1]:

  (set! (thread-join-data thread) (cons cv mutex))

However, setting the object property requires module resolution, since
the above expands to:

  (((@@ (guile) setter) thread-join-data) thread (cons cv mutex))
  ;; ^^^^^^^^^^^^^^^^^  Module resolution.

At this point, we have two threads.  The first one is holding the
“autoload” lock (acquired during module resolution) and is trying to
start the signal handling thread.  It ends up waiting for the signal
thread to indicate that it started [1]:

  (wait-condition-variable cv mutex)

The second thread is trying to acquire the “autoload” lock so that it
can set its ‘thread-join-data’ property (by resolving ‘setter’).  It
won’t signal that it has started until it does so.

The two threads are deadlocked.

A simple fix would be to use the underlying ‘setter’ generic directly:

  (setter thread-join-data thread (cons cv mutex))

It’s hard to read (IMO), but maybe with a comment it’s good enough.

For now it’s best to simply avoid calling ‘system*’ at the top level of
a module.


-- Tim

[1] See ‘call-with-new-thread’ in “module/ice-9/threads.scm”.




Information forwarded to bug-guile <at> gnu.org:
bug#62691; Package guile. (Tue, 11 Apr 2023 17:01:02 GMT) Full text and rfc822 format available.

Message #11 received at 62691 <at> debbugs.gnu.org (full text, mbox):

From: Timothy Sample <samplet <at> ngyro.com>
To: Михаил Бахтерев
 <mike.bakhterev <at> gmail.com>
Cc: 62691 <at> debbugs.gnu.org
Subject: Re: bug#62691: Calling system* in the module body hangs Guile,
 while calling open-pipe* does not
Date: Tue, 11 Apr 2023 10:59:50 -0600
[Message part 1 (text/plain, inline)]
Timothy Sample <samplet <at> ngyro.com> writes:

> It turns out that Guile cannot start its signal handling thread during
> module resolution.

I should add that merely using 'call-with-new-thread' in a module being
imported is enough to trigger deadlock.

> A simple fix would be to use the underlying ‘setter’ generic directly:
>
>   (setter thread-join-data thread (cons cv mutex))
>
> It’s hard to read (IMO), but maybe with a comment it’s good enough.

Here’s the patch.

I’m not super familiar with Guile’s threading code, so maybe this quick
fix is a little too quick.  That is, maybe it would be better to avoid
object properties altogether, or maybe we could make other changes to
the way that the new thread signals the calling thread.  We could also
think through the usage of the ‘autoload’ mutex.  The benefit of this
patch is that it is (almost) exactly the same as the old code, so it’s
unlikely to result in nasty surprises.

[0001-Avoid-module-resolution-in-call-with-new-thread.patch (text/x-patch, inline)]
From 6b6792c7a21de9be1825719bfca0f01177381cf9 Mon Sep 17 00:00:00 2001
From: Timothy Sample <samplet <at> ngyro.com>
Date: Tue, 11 Apr 2023 10:22:46 -0600
Subject: [PATCH] Avoid module resolution in 'call-with-new-thread'.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fixes <https://bugs.gnu.org/62691>.
Reported by Михаил Бахтерев <mike.bakhterev <at> gmail.com>.

* module/ice-9/threads.scm (call-with-new-thread): Do not use 'set!'
to set object properties while the calling thread is waiting on the
new thread to initialize.
---
 module/ice-9/threads.scm | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/module/ice-9/threads.scm b/module/ice-9/threads.scm
index 5a13cec1d..048d8b085 100644
--- a/module/ice-9/threads.scm
+++ b/module/ice-9/threads.scm
@@ -151,7 +151,15 @@ Once @var{thunk} or @var{handler} returns, the return value is made the
                  (lambda ()
                    (lock-mutex mutex)
                    (set! thread (current-thread))
-                   (set! (thread-join-data thread) (cons cv mutex))
+                   ;; Rather than use the 'set!' syntax here, we use the
+                   ;; underlying 'setter' generic function to set the
+                   ;; 'thread-join-data' property on 'thread'.  This is
+                   ;; because 'set!' will try to resolve 'setter' in the
+                   ;; '(guile)' module, which means acquiring the
+                   ;; 'autoload' mutex.  If the calling thread is
+                   ;; already holding that mutex, this will result in
+                   ;; deadlock.  See <https://bugs.gnu.org/62691>.
+                   ((setter thread-join-data) thread (cons cv mutex))
                    (signal-condition-variable cv)
                    (unlock-mutex mutex)
                    (call-with-unblocked-asyncs
-- 
2.39.2


Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Sun, 16 Jul 2023 20:29:02 GMT) Full text and rfc822 format available.

Notification sent to Михаил Бахтерев <mike.bakhterev <at> gmail.com>:
bug acknowledged by developer. (Sun, 16 Jul 2023 20:29:02 GMT) Full text and rfc822 format available.

Message #16 received at 62691-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Timothy Sample <samplet <at> ngyro.com>
Cc: 62691-done <at> debbugs.gnu.org,
 Михаил Бахтерев
 <mike.bakhterev <at> gmail.com>
Subject: Re: bug#62691: Calling system* in the module body hangs Guile,
 while calling open-pipe* does not
Date: Sun, 16 Jul 2023 22:28:37 +0200
Hi,

Timothy Sample <samplet <at> ngyro.com> skribis:

>>From 6b6792c7a21de9be1825719bfca0f01177381cf9 Mon Sep 17 00:00:00 2001
> From: Timothy Sample <samplet <at> ngyro.com>
> Date: Tue, 11 Apr 2023 10:22:46 -0600
> Subject: [PATCH] Avoid module resolution in 'call-with-new-thread'.
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Fixes <https://bugs.gnu.org/62691>.
> Reported by Михаил Бахтерев <mike.bakhterev <at> gmail.com>.
>
> * module/ice-9/threads.scm (call-with-new-thread): Do not use 'set!'
> to set object properties while the calling thread is waiting on the
> new thread to initialize.

Woow, good catch!  Finally, applied.  Thanks!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 14 Aug 2023 11:24:09 GMT) Full text and rfc822 format available.

This bug report was last modified 247 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.