GNU bug report logs - #37757
shepherd segfaults upon shutdown (kernel panic)

Package: guix

Reported by: Jesse Gibbons <jgibbons2357 <at> gmail.com>

Date: Tue, 15 Oct 2019 04:35:02 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

Report forwarded to bug-guix <at> gnu.org:
bug#37757; Package guix. (Tue, 15 Oct 2019 04:35:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jesse Gibbons <jgibbons2357 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Tue, 15 Oct 2019 04:35:08 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jesse Gibbons <jgibbons2357 <at> gmail.com>
To: bug-guix mailing list <bug-guix <at> gnu.org>
Subject: kernel panic
Date: Mon, 14 Oct 2019 21:49:32 -0600

[Message part 1 (text/plain, inline)]

Attached is a picture of the kernel panic. It happened when I tried to shut
down.
I do not know what log to look at to get any details about what happened
about that time. Of course, the panic itself is not in any of the logs in
/var/log.
This is not the first time there was a kernel panic during the shutdown
process.

[Image-HRCX9Z.png (image/png, attachment)]

Changed bug title to 'Kernel panic upon shutdown' from 'kernel panic' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 28 Oct 2019 22:20:02 GMT) Full text and rfc822 format available.

Severity set to 'important' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 28 Oct 2019 22:20:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#37757; Package guix. (Mon, 28 Oct 2019 22:29:02 GMT) Full text and rfc822 format available.

Message #12 received at 37757 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Jesse Gibbons <jgibbons2357 <at> gmail.com>
Cc: 37757 <at> debbugs.gnu.org
Subject: Kernel panic upon shutdown
Date: Mon, 28 Oct 2019 23:28:30 +0100

Hi,

Jesse Gibbons <jgibbons2357 <at> gmail.com> writes:

> Attached is a picture of the kernel panic. It happened when I tried to shut
> down.
> I do not know what log to look at to get any details about what happened
> about that time. Of course, the panic itself is not in any of the logs in
> /var/log.
> This is not the first time there was a kernel panic during the shutdown
> process.

I’ve just seen it on a laptop running GNOME and ‘%desktop-services’.
The kernel panic appeared right after shutting down ModemManager (I
don’t have ModemManager on my own laptop and I’ve never experienced the
bug, but I don’t know if it’s significant.)

Note that we see (roughly):

  attempted to kill init! exit code=0x0000000b

which, unless I’m mistaken, means that PID 1 segfaulted (SIGSEGV = 11),
which is bad.
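
To illustrate how that number decodes, here is a minimal C sketch. It
assumes the value printed by the kernel uses the same encoding as a
wait(2) status, where the low 7 bits hold the terminating signal. The
helper names 'terminating_signal' and 'describe_exit_code' are made up
for this example.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Return the signal that terminated the process described by STATUS, or 0
   if STATUS does not describe death by signal.  Assumption: STATUS follows
   the wait(2) status encoding.  */
int
terminating_signal (int status)
{
  return WIFSIGNALED (status) ? WTERMSIG (status) : 0;
}

/* Print a one-line, human-readable description of STATUS.  */
void
describe_exit_code (int status)
{
  int sig = terminating_signal (status);

  if (sig != 0)
    printf ("0x%08x: killed by signal %d (%s)\n",
            (unsigned int) status, sig, strsignal (sig));
  else
    printf ("0x%08x: not killed by a signal\n", (unsigned int) status);
}
```

Calling 'describe_exit_code (0x0000000b)' should report signal 11,
which is SIGSEGV on Linux.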

According to reboot(2), the ‘reboot’ syscall doesn’t return in this
case, so the segfault must have happened before the ‘reboot’ call.

The problem appeared roughly after the ‘core-updates’ merge, but I don’t
see any change to the ‘reboot’ wrapper in glibc 2.29.

Is it reproducible for you in a VM built with ‘guix system vm’?  It
would be helpful if we had that.

Thanks,
Ludo’.

Information forwarded to bug-guix <at> gnu.org:
bug#37757; Package guix. (Wed, 13 Nov 2019 22:06:02 GMT) Full text and rfc822 format available.

Message #15 received at 37757 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Jesse Gibbons <jgibbons2357 <at> gmail.com>
Cc: 37757 <at> debbugs.gnu.org
Subject: Re: bug#37757: Kernel panic upon shutdown
Date: Wed, 13 Nov 2019 23:05:13 +0100

Ludovic Courtès <ludo <at> gnu.org> writes:

> I’ve just seen it on a laptop running GNOME and ‘%desktop-services’.
> The kernel panic appeared right after shutting down ModemManager (I
> don’t have ModemManager on my own laptop and I’ve never experienced the
> bug, but I don’t know if it’s significant.)
>
> Note that we see (roughly):
>
>   attempted to kill init! exit code=0x0000000b

[...]

> Is it reproducible for you in a VM built with ‘guix system vm’?  It
> would be helpful if we had that.

For the record, apparently I can’t reproduce it in a ‘guix system vm
gnu/system/examples/desktop.tmpl’ VM.

Ludo’.

Information forwarded to bug-guix <at> gnu.org:
bug#37757; Package guix. (Wed, 13 Nov 2019 22:23:02 GMT) Full text and rfc822 format available.

Message #18 received at 37757 <at> debbugs.gnu.org (full text, mbox):

From: Jan <tona_kosmicznego_smiecia <at> interia.pl>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Jesse Gibbons <jgibbons2357 <at> gmail.com>, 37757 <at> debbugs.gnu.org
Subject: Re: bug#37757: Kernel panic upon shutdown
Date: Wed, 13 Nov 2019 23:22:02 +0100

Hi,
I encountered the same error today. I had run "sudo herd stop tor" and
then "sudo herd stop xorg-server" and it panicked.


Jan Wielkiewicz

Changed bug title to 'shepherd segfaults upon shutdown (kernel panic)' from 'Kernel panic upon shutdown' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 28 Nov 2019 11:42:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#37757; Package guix. (Thu, 28 Nov 2019 11:46:01 GMT) Full text and rfc822 format available.

Message #23 received at 37757 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Jesse Gibbons <jgibbons2357 <at> gmail.com>
Cc: 37757 <at> debbugs.gnu.org
Subject: Re: bug#37757: Kernel panic upon shutdown
Date: Thu, 28 Nov 2019 12:45:00 +0100

[Message part 1 (text/plain, inline)]

Hello!

The attached patch should allow shepherd (PID 1) to dump core when it
crashes (systemd does something similar).

Jesse (and anyone else experiencing this!), could you try to (1)
reconfigure with this patch, (2) reboot, (3) try to halt the system to
reproduce the crash, and (4) retrieve a backtrace from the ‘core’ file?

For #4, you’ll have to do something along these lines once you’ve
rebooted after the crash:

  sudo gdb /run/current-system/profile/bin/guile /core

and then type “thread apply all bt” at the GDB prompt.

I’ll also try to do that on another machine where I’ve seen it happen.

Thanks in advance!

Ludo’.

[Message part 2 (text/x-patch, inline)]

diff --git a/gnu/services/shepherd.scm b/gnu/services/shepherd.scm
index 08bb33039c..ec49244cf6 100644
--- a/gnu/services/shepherd.scm
+++ b/gnu/services/shepherd.scm
@@ -277,45 +277,87 @@ and return the resulting '.go' file."
 
   (let ((files (map shepherd-service-file services)))
     (define config
-      #~(begin
-          (use-modules (srfi srfi-34)
-                       (system repl error-handling))
+      (with-imported-modules '((guix build syscalls))
+        #~(begin
+            (use-modules (srfi srfi-34)
+                         (system repl error-handling)
+                         (guix build syscalls)
+                         (system foreign))
 
-          ;; Arrange to spawn a REPL if something goes wrong.  This is better
-          ;; than a kernel panic.
-          (call-with-error-handling
-            (lambda ()
-              (apply register-services
-                     (map load-compiled '#$(map scm->go files)))))
+            (define signal
+              (let ((proc (pointer->procedure int
+                                              (dynamic-func "signal"
+                                                            (dynamic-link))
+                                              (list int '*))))
+                (lambda (signum handler)
+                  (proc signum
+                        (if (integer? handler)                ;SIG_DFL, etc.
+                            (make-pointer handler)
+                            (procedure->pointer void handler (list int)))))))
 
-          ;; guix-daemon 0.6 aborts if 'PATH' is undefined, so work around
-          ;; it.
-          (setenv "PATH" "/run/current-system/profile/bin")
+            (define (handle-crash sig)
+              (dynamic-wind
+                (const #t)
+                (lambda ()
+                  (gc-disable)
+                  (pk 'crash! sig)
+                  ;; Fork and have the child dump core at the root.
+                  (match (clone SIGCHLD)
+                    (0
+                     (setrlimit 'core #f #f)
+                     (chdir "/")
+                     (signal sig SIG_DFL)
+                     ;; Note: 'getpid' would return 1, hence this hack.
+                     (kill (string->number (readlink "/proc/self"))
+                           sig)
+                     (primitive-_exit 253))
+                    (child
+                     (waitpid child)
+                     (sync)
+                     ;; Hopefully at this point core has been dumped.
+                     (pk 'done)
+                     (sleep 3)
+                     (primitive-_exit 255))))
+                (lambda ()
+                  (primitive-_exit 254))))
 
-          (format #t "starting services...~%")
-          (for-each (lambda (service)
-                      ;; In the Shepherd 0.3 the 'start' method can raise
-                      ;; '&action-runtime-error' if it fails, so protect
-                      ;; against it.  (XXX: 'action-runtime-error?' is not
-                      ;; exported is 0.3, hence 'service-error?'.)
-                      (guard (c ((service-error? c)
-                                 (format (current-error-port)
-                                         "failed to start service '~a'~%"
-                                         service)))
-                        (start service)))
-                    '#$(append-map shepherd-service-provision
-                                   (filter shepherd-service-auto-start?
-                                           services)))
+            (signal SIGSEGV handle-crash)
 
-          ;; Hang up stdin.  At this point, we assume that 'start' methods
-          ;; that required user interaction on the console (e.g.,
-          ;; 'cryptsetup open' invocations, post-fsck emergency REPL) have
-          ;; completed.  User interaction becomes impossible after this
-          ;; call; this avoids situations where services wrongfully lead
-          ;; PID 1 to read from stdin (the console), which users may not
-          ;; have access to (see <https://bugs.gnu.org/23697>).
-          (redirect-port (open-input-file "/dev/null")
-                         (current-input-port))))
+            ;; Arrange to spawn a REPL if something goes wrong.  This is better
+            ;; than a kernel panic.
+            (call-with-error-handling
+              (lambda ()
+                (apply register-services
+                       (map load-compiled '#$(map scm->go files)))))
+
+            ;; guix-daemon 0.6 aborts if 'PATH' is undefined, so work around
+            ;; it.
+            (setenv "PATH" "/run/current-system/profile/bin")
+
+            (format #t "starting services...~%")
+            (for-each (lambda (service)
+                        ;; In the Shepherd 0.3 the 'start' method can raise
+                        ;; '&action-runtime-error' if it fails, so protect
+                        ;; against it.  (XXX: 'action-runtime-error?' is not
+                        ;; exported is 0.3, hence 'service-error?'.)
+                        (guard (c ((service-error? c)
+                                   (format (current-error-port)
+                                           "failed to start service '~a'~%"
+                                           service)))
+                          (start service)))
+                      '#$(append-map shepherd-service-provision
+                                     (filter shepherd-service-auto-start?
+                                             services)))
+
+            ;; Hang up stdin.  At this point, we assume that 'start' methods
+            ;; that required user interaction on the console (e.g.,
+            ;; 'cryptsetup open' invocations, post-fsck emergency REPL) have
+            ;; completed.  User interaction becomes impossible after this
+            ;; call; this avoids situations where services wrongfully lead
+            ;; PID 1 to read from stdin (the console), which users may not
+            ;; have access to (see <https://bugs.gnu.org/23697>).
+            (redirect-port (open-input-file "/dev/null")
+                           (current-input-port)))))
 
     (scheme-file "shepherd.conf" config)))

Information forwarded to bug-guix <at> gnu.org:
bug#37757; Package guix. (Mon, 02 Dec 2019 17:34:01 GMT) Full text and rfc822 format available.

Message #26 received at 37757 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Jesse Gibbons <jgibbons2357 <at> gmail.com>,
 Jan <tona_kosmicznego_smiecia <at> interia.pl>
Cc: 37757 <at> debbugs.gnu.org
Subject: Re: bug#37757: Kernel panic upon shutdown
Date: Mon, 02 Dec 2019 18:33:03 +0100

[Message part 1 (text/plain, inline)]

Hi!

Ludovic Courtès <ludo <at> gnu.org> writes:

> Jesse (and anyone else experiencing this!), could you try to (1)
> reconfigure with this patch, (2) reboot, (3) try to halt the system to
> reproduce the crash, and (4) retrieve a backtrace from the ‘core’ file?
>
> For #4, you’ll have to do something along these lines once you’ve
> rebooted after the crash:
>
>   sudo gdb /run/current-system/profile/bin/guile /core
>
> and then type “thread apply all bt” at the GDB prompt.

It turns out the previous patch didn’t work; in short, we really have to
use async-signal-safe functions only from the signal handler, so this
has to be done in C.
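
To illustrate the constraint, here is a minimal sketch of a handler
restricted to async-signal-safe calls: write(2) and _exit(2), both
listed in signal-safety(7). This is not the attached patch; the names
'on_crash' and 'install_crash_handler' are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>
#include <unistd.h>

static const char crash_message[] = "crashed!\n";

/* Signal handler that only calls async-signal-safe functions.  No
   allocation, no stdio, no locks.  */
void
on_crash (int sig)
{
  (void) sig;

  /* 'sizeof crash_message - 1' skips the trailing NUL byte.  */
  ssize_t written = write (STDERR_FILENO, crash_message,
                           sizeof crash_message - 1);
  (void) written;

  _exit (255);
}

/* Install ON_CRASH as the SIGSEGV handler.  Return 0 on success and -1
   on error, like sigaction(2).  */
int
install_crash_handler (void)
{
  struct sigaction action;

  memset (&action, 0, sizeof action);
  action.sa_handler = on_crash;
  sigemptyset (&action.sa_mask);

  return sigaction (SIGSEGV, &action, NULL);
}
```

Anything that may allocate memory or take a lock, which includes most
of what a Scheme procedure does, is off-limits in such a handler.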

The attached patch does that.  I’ve tried it with ‘guix system
container’ and it seems to dump core as expected, from what I can see.

Let me know if you manage to reproduce the bug and to get a core dumped
with this patch.

To everyone reading this: if you’re experiencing shepherd crashes,
please raise your hand :-) and consider applying this patch so we can
gather debugging info!

Thanks,
Ludo’.

[Message part 2 (text/x-patch, inline)]

diff --git a/gnu/services/shepherd.scm b/gnu/services/shepherd.scm
index 08bb33039c..cf82ef0a4c 100644
--- a/gnu/services/shepherd.scm
+++ b/gnu/services/shepherd.scm
@@ -271,6 +271,23 @@ and return the resulting '.go' file."
                          (compile-file #$file #:output-file #$output
                                        #:env env))))))
 
+(define (crash-handler)
+  (define gcc-toolchain
+    (module-ref (resolve-interface '(gnu packages commencement))
+                'gcc-toolchain))
+
+  (define source
+    (local-file "../system/aux-files/shepherd-crash-handler.c"))
+
+  (computed-file "crash-handler.so"
+                 #~(begin
+                     (setenv "PATH" #+(file-append gcc-toolchain "/bin"))
+                     (setenv "CPATH" #+(file-append gcc-toolchain "/include"))
+                     (setenv "LIBRARY_PATH"
+                             #+(file-append gcc-toolchain "/lib"))
+                     (system* "gcc" "-Wall" "-g" "-O3" "-fPIC"
+                              "-shared" "-o" #$output #$source))))
+
 (define (shepherd-configuration-file services)
   "Return the shepherd configuration file for SERVICES."
   (assert-valid-graph services)
@@ -281,6 +298,9 @@ and return the resulting '.go' file."
           (use-modules (srfi srfi-34)
                        (system repl error-handling))
 
+          ;; Load the crash handler, which allows shepherd to dump core.
+          (dynamic-link #$(crash-handler))
+
           ;; Arrange to spawn a REPL if something goes wrong.  This is better
           ;; than a kernel panic.
           (call-with-error-handling
diff --git a/gnu/system/aux-files/shepherd-crash-handler.c b/gnu/system/aux-files/shepherd-crash-handler.c
new file mode 100644
index 0000000000..6b2db10866
--- /dev/null
+++ b/gnu/system/aux-files/shepherd-crash-handler.c
@@ -0,0 +1,70 @@
+#define _GNU_SOURCE
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <sched.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <sys/syscall.h>   /* For SYS_xxx definitions */
+#include <signal.h>
+
+static void
+handle_crash (int sig)
+{
+  static const char msg[] = "Shepherd crashed!\n";
+  write (2, msg, sizeof msg);
+
+#ifdef __sparc__
+  /* See 'raw_clone' in systemd.  */
+# error "SPARC uses a different 'clone' syscall convention"
+#endif
+
+  pid_t pid = syscall (SYS_clone, SIGCHLD, NULL);
+  if (pid < 0)
+    abort ();
+
+  if (pid == 0)
+    {
+      /* Restore the default signal handler to get a core dump.  */
+      signal (sig, SIG_DFL);
+
+      const struct rlimit infinity = { RLIM_INFINITY, RLIM_INFINITY };
+      setrlimit (RLIMIT_CORE, &infinity);
+      chdir ("/");
+
+      int pid = syscall (SYS_getpid);
+      kill (pid, sig);
+
+      /* As it turns out, 'kill' simply returns without doing anything, which
+	 is consistent with the "Notes" section of kill(2).  Thus, force a
+	 crash.  */
+      * (int *) 0 = 42;
+
+      _exit (254);
+    }
+  else
+    {
+      signal (sig, SIG_IGN);
+
+      int status;
+      waitpid (pid, &status, 0);
+
+      sync ();
+
+      _exit (255);
+    }
+
+  _exit (253);
+}
+
+static void initialize_crash_handler (void)
+  __attribute__ ((constructor));
+
+static void
+initialize_crash_handler (void)
+{
+  signal (SIGSEGV, handle_crash);
+  signal (SIGABRT, handle_crash);
+}

Information forwarded to bug-guix <at> gnu.org:
bug#37757; Package guix. (Tue, 03 Dec 2019 09:44:02 GMT) Full text and rfc822 format available.

Message #29 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Arne Babenhauserheide <arne_bab <at> web.de>
To: bug-guix <at> gnu.org
Cc: Jan <tona_kosmicznego_smiecia <at> interia.pl>,
 Jesse Gibbons <jgibbons2357 <at> gmail.com>, 37757 <at> debbugs.gnu.org
Subject: Re: bug#37757: Kernel panic upon shutdown
Date: Tue, 03 Dec 2019 10:43:13 +0100

[Message part 1 (text/plain, inline)]

Ludovic Courtès <ludo <at> gnu.org> writes:
> To everyone reading this: if you’re experiencing shepherd crashes,
> please raise your hand :-)

\o

> and consider applying this patch so we can gather debugging info!

Can I do that without installing from a local checkout?

Best wishes,
Arne
--
Being apolitical
means being political
without noticing it

[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#37757; Package guix. (Mon, 09 Dec 2019 13:49:01 GMT) Full text and rfc822 format available.

Message #35 received at 37757 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Jesse Gibbons <jgibbons2357 <at> gmail.com>
Cc: Andy Wingo <wingo <at> igalia.com>, Jan <tona_kosmicznego_smiecia <at> interia.pl>,
 37757 <at> debbugs.gnu.org
Subject: Re: bug#37757: Kernel panic upon shutdown
Date: Mon, 09 Dec 2019 14:47:59 +0100

[Message part 1 (text/plain, inline)]

Hello,

[+Cc: Andy for a heads-up on the fix below.]

Ludovic Courtès <ludo <at> gnu.org> writes:

> It turns out the previous patch didn’t work; in short, we really have to
> use async-signal-safe functions only from the signal handler, so this
> has to be done in C.
>
> The attached patch does that.  I’ve tried it with ‘guix system
> container’ and it seems to dump core as expected, from what I can see.
>
> Let me know if you manage to reproduce the bug and to get a core dumped
> with this patch.

Good news!  The patch does indeed allow shepherd to dump core, and I
managed to grab the backtrace below on an x86_64 machine running Guix
System (from yesterday) with GNOME:

--8<---------------cut here---------------start------------->8---
Using host libthread_db library "/gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libthread_db.so.1".
Core was generated by `/gnu/store/1mkkv2caiqbdbbd256c4dirfi4kwsacv-guile-2.2.6/bin/guile --no-auto-com'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  handle_crash (sig=11)
    at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
43	      * (int *) 0 = 42;
[Current thread is 1 (LWP 4635)]

[…]

Thread 1 (LWP 4635):
#0  handle_crash (sig=11) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
        infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        pid = <optimized out>
        msg = "Shepherd crashed!\n"
        pid = <optimized out>
#1  <signal handler called>
No locals.
#2  handle_crash (sig=6) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
        infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        pid = <optimized out>
        msg = "Shepherd crashed!\n"
        pid = <optimized out>
#3  <signal handler called>
No locals.
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
        set = {__val = {0, 2314885530818445312, 0 <repeats 14 times>}}
        pid = <optimized out>
        tid = <optimized out>
        ret = <optimized out>
#5  0x00007f03eef40891 in __GI_abort () at abort.c:79
        save_stage = 1
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 13 times>, 139654877144192, 0, 139654877624544}}, sa_flags = -279049286, sa_restorer = 0x7f03ef57e480 <read_finalization_pipe_data>}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#6  0x00007f03ef57e89a in finalization_thread_proc (unused=<optimized out>) at finalizers.c:228
        data = {byte = -24 '\350', n = -1, err = 4}
#7  0x00007f03ef56f35a in c_body (d=0x7f03ed152e50) at continuations.c:422
        data = 0x7f03ed152e50
#8  0x00007f03ef5f079f in vm_regular_engine (thread=0x2, vp=0x7f03eb1caea0, registers=0x0, resume=-286001158) at vm-engine.c:786
        ret = 2
        ip = <optimized out>
        sp = <optimized out>
        op = 10
        jump_table_ = {…}
        jump_table = 0x7f03ef64d8e0 <jump_table_>

[…]

#19 scm_with_guile (func=<optimized out>, data=<optimized out>) at threads.c:710
No locals.
#20 0x00007f03ef497015 in start_thread (arg=0x7f03ed153700) at pthread_create.c:486
        ret = <optimized out>
        pd = 0x7f03ed153700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139654839219968, -749312912628550421, 140727702524830, 140727702524831, 140727702524832, 139654839219968, 837174519050892523, 837169745183601899}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#21 0x00007f03eeffd91f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
--8<---------------cut here---------------end--------------->8---

So what happens is that ‘finalization_thread_proc’ in Guile receives
EINTR (data.err == 4) but then, despite EINTR, it goes on to check the
value of ‘data.byte’ and aborts because it’s neither 0 nor 1.
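
As a rough sketch of the intended logic (not the actual Guile code),
the decision can be written as a pure function of the read(2) result.
The enum and the function name 'classify_read' are hypothetical; the
point is that the byte is only inspected when something was actually
read.

```c
#include <errno.h>
#include <sys/types.h>

enum pipe_action
  {
    ACTION_RETRY,            /* interrupted: read again */
    ACTION_FAIL,             /* real error or EOF: report and stop */
    ACTION_RUN_FINALIZERS,   /* byte 0 */
    ACTION_STOP,             /* byte 1 */
    ACTION_ABORT             /* unexpected byte */
  };

/* N is the return value of read(2), ERR the errno saved right after the
   call, and BYTE the one-byte buffer.  BYTE is meaningful only if N > 0.  */
enum pipe_action
classify_read (ssize_t n, int err, char byte)
{
  if (n <= 0)
    /* Nothing was read, so BYTE holds garbage: never look at it.  */
    return err == EINTR ? ACTION_RETRY : ACTION_FAIL;

  switch (byte)
    {
    case 0:
      return ACTION_RUN_FINALIZERS;
    case 1:
      return ACTION_STOP;
    default:
      return ACTION_ABORT;
    }
}
```

With the values from the backtrace (n = -1, err = 4, byte = -24), this
returns ACTION_RETRY rather than reaching the abort branch.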

My plan is to:

  1. push the patch below to the ‘stable-2.2’ branch of Guile;
     done:
     <https://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.2&id=edf5aea7ac852db2356ef36cba4a119eb0c81ea9>;

  2. use a patched Guile for the ‘shepherd’ package;

  3. include the crash handler in the Shepherd.

Thoughts?

Thanks,
Ludo’.

[Message part 2 (text/x-patch, inline)]

diff --git a/libguile/finalizers.c b/libguile/finalizers.c
index c5d69e8e3..94a6e6b0a 100644
--- a/libguile/finalizers.c
+++ b/libguile/finalizers.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2012, 2013, 2014 Free Software Foundation, Inc.
+/* Copyright (C) 2012, 2013, 2014, 2019 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -211,21 +211,26 @@ finalization_thread_proc (void *unused)
 
       scm_without_guile (read_finalization_pipe_data, &data);
       
-      if (data.n <= 0 && data.err != EINTR) 
+      if (data.n <= 0)
         {
-          perror ("error in finalization thread");
-          return NULL;
+          if (data.err != EINTR)
+            {
+              perror ("error in finalization thread");
+              return NULL;
+            }
         }
-
-      switch (data.byte)
+      else
         {
-        case 0:
-          scm_run_finalizers ();
-          break;
-        case 1:
-          return NULL;
-        default:
-          abort ();
+          switch (data.byte)
+            {
+            case 0:
+              scm_run_finalizers ();
+              break;
+            case 1:
+              return NULL;
+            default:
+              abort ();
+            }
         }
     }
 }

Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Mon, 09 Dec 2019 23:14:02 GMT) Full text and rfc822 format available.

Notification sent to Jesse Gibbons <jgibbons2357 <at> gmail.com>:
bug acknowledged by developer. (Mon, 09 Dec 2019 23:14:03 GMT) Full text and rfc822 format available.

Message #40 received at 37757-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Jesse Gibbons <jgibbons2357 <at> gmail.com>
Cc: Arne Babenhauserheide <arne_bab <at> web.de>,
 Jan <tona_kosmicznego_smiecia <at> interia.pl>, 37757-done <at> debbugs.gnu.org
Subject: Re: bug#37757: Kernel panic upon shutdown
Date: Tue, 10 Dec 2019 00:13:44 +0100

Hi,

Ludovic Courtès <ludo <at> gnu.org> writes:

> My plan is to:
>
>   1. push the patch below to the ‘stable-2.2’ branch of Guile;
>      done:
>      <https://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.2&id=edf5aea7ac852db2356ef36cba4a119eb0c81ea9>;
>
>   2. use a patched Guile for the ‘shepherd’ package;

Done:
<https://git.savannah.gnu.org/cgit/guix.git/commit/?id=24ba2cee2b1671c5dae36bb4cdba139f1fd09023>.

>   3. include the crash handler in the Shepherd.

Done:
<https://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=dfb7c7ecdb2d12061073e6939ec6e765ae59c00c>.

I’m closing the bug.  Please reopen it if you notice anything wrong!

Ludo’.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 07 Jan 2020 12:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 1 day ago.
