GNU bug report logs - #59055
[PATCH] Fix possible deadlock.

Package: guile;

Reported by: Olivier Dion <olivier.dion <at> polymtl.ca>

Date: Sat, 5 Nov 2022 17:00:02 UTC

Severity: normal

Tags: patch

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

Report forwarded to bug-guile <at> gnu.org:
bug#59055; Package guile. (Sat, 05 Nov 2022 17:00:02 GMT)

Acknowledgement sent to Olivier Dion <olivier.dion <at> polymtl.ca>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sat, 05 Nov 2022 17:00:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Olivier Dion <olivier.dion <at> polymtl.ca>
To: bug-guile <at> gnu.org
Cc: Olivier Dion <olivier.dion <at> polymtl.ca>
Subject: [PATCH] Fix possible deadlock.
Date: Sat,  5 Nov 2022 12:59:23 -0400
If we are interrupted while waiting on our condition variable, we unlock
the kernel mutex momentarily to run asynchronous operations before
putting ourselves back into the waiting queue.

However, we have to retry acquiring the mutex before getting back into
the queue; otherwise we may wait indefinitely, since the mutex could
have no owner for a while and thus nobody to wake us up.

* libguile/threads.c (lock_mutex): Try acquiring the mutex after signal
interruption.
---
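For illustration only, the control flow after the wait returns can be
modeled in isolation.  What follows is a minimal sketch, not the code in
libguile/threads.c: `toy_mutex`, `toy_step` and `toy_after_wait` are
hypothetical names, a NULL owner stands in for SCM_BOOL_F, and the
pthread locking, the async tick and the real wait queue are all left
out.  The `patched` flag switches between the old behaviour (go straight
back to waiting on EINTR) and the new one (check the owner first).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scm_mutex.  owner == NULL means
   "unowned"; the real code compares m->owner against SCM_BOOL_F.  */
struct toy_mutex
{
  const void *owner;
};

enum toy_step
{
  TOY_ACQUIRED,    /* the caller now owns the mutex */
  TOY_WAIT_AGAIN,  /* go back to sleep on the wait queue */
  TOY_TIMED_OUT,   /* give up without acquiring the mutex */
  TOY_ERROR        /* unexpected error from the wait */
};

/* Model one iteration of the wait loop, after the blocking wait has
   returned ERR.  PATCHED selects the behaviour with the patch applied
   (true) or without it (false).  */
static enum toy_step
toy_after_wait (struct toy_mutex *m, const void *self, int err, bool patched)
{
  assert (m != NULL);

  if (err == ETIMEDOUT)
    return TOY_TIMED_OUT;
  if (err != 0 && err != EINTR)
    return TOY_ERROR;

  /* Without the patch, an interrupted wait never looks at the owner:
     it goes straight back to waiting.  */
  if (err == EINTR && !patched)
    return TOY_WAIT_AGAIN;

  /* Corresponds to the "maybe_acquire" label: take the mutex if nobody
     owns it, otherwise keep waiting.  */
  if (m->owner == NULL)
    {
      m->owner = self;
      return TOY_ACQUIRED;
    }
  return TOY_WAIT_AGAIN;
}
```

In this model, an unowned mutex combined with err == EINTR yields
TOY_WAIT_AGAIN without the patch and TOY_ACQUIRED with it; that
difference is the missed acquisition the patch addresses.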
 libguile/threads.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/libguile/threads.c b/libguile/threads.c
index 280d306bf..0f5cf2ed5 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -1022,14 +1022,7 @@ lock_mutex (enum scm_mutex_kind kind, struct scm_mutex *m,
 
         if (err == 0)
           {
-            if (scm_is_eq (m->owner, SCM_BOOL_F))
-              {
-                m->owner = current_thread->handle;
-                scm_i_pthread_mutex_unlock (&m->lock);
-                return SCM_BOOL_T;
-              }
-            else
-              continue;
+            goto maybe_acquire;
           }
         else if (err == ETIMEDOUT)
           {
@@ -1041,7 +1034,7 @@ lock_mutex (enum scm_mutex_kind kind, struct scm_mutex *m,
             scm_i_pthread_mutex_unlock (&m->lock);
             scm_async_tick ();
             scm_i_scm_pthread_mutex_lock (&m->lock);
-            continue;
+            goto maybe_acquire;
           }
         else
           {
@@ -1050,6 +1043,14 @@ lock_mutex (enum scm_mutex_kind kind, struct scm_mutex *m,
             errno = err;
             SCM_SYSERROR;
           }
+
+      maybe_acquire:
+        if (scm_is_eq (m->owner, SCM_BOOL_F))
+          {
+            m->owner = current_thread->handle;
+            scm_i_pthread_mutex_unlock (&m->lock);
+            return SCM_BOOL_T;
+          }
       }
 }
 #undef FUNC_NAME
-- 
2.38.0





Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Sun, 20 Nov 2022 17:20:02 GMT)

Notification sent to Olivier Dion <olivier.dion <at> polymtl.ca>:
bug acknowledged by developer. (Sun, 20 Nov 2022 17:20:02 GMT)

Message #10 received at 59055-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Olivier Dion <olivier.dion <at> polymtl.ca>
Cc: 59055-done <at> debbugs.gnu.org
Subject: Re: bug#59055: [PATCH] Fix possible deadlock.
Date: Sun, 20 Nov 2022 18:19:07 +0100
Hi,

Olivier Dion <olivier.dion <at> polymtl.ca> skribis:

> If we are interrupted while waiting on our condition variable, we unlock
> the kernel mutex momentarily to run asynchronous operations before
> putting ourselves back into the waiting queue.
>
> However, we have to retry acquiring the mutex before getting back into
> the queue; otherwise we may wait indefinitely, since the mutex could
> have no owner for a while and thus nobody to wake us up.
>
> * libguile/threads.c (lock_mutex): Try acquiring the mutex after signal
> interruption.

Looks reasonable to me; applied.

Did you try to come up with a reproducer?  That would be awesome but I
guess it’s hard because you need to trigger EINTR at the right point.

Thanks,
Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#59055; Package guile. (Sun, 20 Nov 2022 18:35:01 GMT)

Message #13 received at 59055-done <at> debbugs.gnu.org:

From: Olivier Dion <olivier.dion <at> polymtl.ca>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 59055-done <at> debbugs.gnu.org
Subject: Re: bug#59055: [PATCH] Fix possible deadlock.
Date: Sun, 20 Nov 2022 13:33:56 -0500
On Sun, 20 Nov 2022, Ludovic Courtès <ludo <at> gnu.org> wrote:

> Did you try to come up with a reproducer?  That would be awesome but I
> guess it’s hard because you need to trigger EINTR at the right point.

Yes, with a stress test in guile-parallel.  It is indeed very hard to
reproduce.  I think you can also reproduce it with `ice-9 futures`.

Here's the stress test that I've been using:
--8<---------------cut here---------------start------------->8---
(use-modules
  ((ice-9 futures) #:prefix ice-9:)
  (srfi srfi-1)
  (srfi srfi-26))

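;; Create N futures with FUTURE, each running a trivial thunk, then
;; wait for every one of them with TOUCH.
(define (run-stress-test N future touch)
  (for-each
   touch
   (unfold
    (cut = N <>)         ; stop once the counter reaches N
    (lambda (_)
      (future
       (const #t)))      ; (const #t) serves as a thunk returning #t
    1+                   ; next counter value
    0)))                 ; initial counter value

;; Ten million futures, using the (ice-9 futures) implementation.
(run-stress-test 10000000 ice-9:make-future ice-9:touch)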
--8<---------------cut here---------------end--------------->8---

-- 
Olivier Dion
oldiob.dev




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 19 Dec 2022 12:24:10 GMT)
