GNU bug report logs - #34276
‘guix system disk-image’ successfully builds a bad image


Package: guix

Reported by: Tobias Geerinckx-Rice <me <at> tobias.gr>

Date: Fri, 1 Feb 2019 15:59:01 UTC

Severity: important

Merged with 37164

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 34276 in the body.
You can then email your comments to 34276 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-guix <at> gnu.org:
bug#34276; Package guix. (Fri, 01 Feb 2019 15:59:02 GMT)

Acknowledgement sent to Tobias Geerinckx-Rice <me <at> tobias.gr>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Fri, 01 Feb 2019 15:59:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Tobias Geerinckx-Rice <me <at> tobias.gr>
To: Bug Guix <bug-guix <at> gnu.org>
Subject: ‘guix system disk-image’ successfully builds a bad image
Date: Fri, 01 Feb 2019 16:57:48 +0100
Hullo!

I wanted to install this ‘Guix’ thing that everyone's so hyped up 
about.

I have a small forgotten script in my ~/guix.git that runs:

 ./pre-inst-env guix system disk-image --fallback \
   --image-size=1.5G \
   gnu/system/install.scm

This was written back when 1.5G was higher than the default.

Now it's much lower than the default and too small to store all of
Guix.  However, the command still completes ‘successfully’:

copying 422 store items  [#########:
In srfi/srfi-1.scm:
  466:18 19 (fold #<procedure 1a60440 at ice-9/ftw.scm:452:38 
  (sub?> ?)
In unknown file:
         18 (_ #<procedure 1917270 at ice-9/ftw.scm:454:44 ()> 
         #<p?> ?)
In ice-9/ftw.scm:
  452:32 17 (loop _ _ #(21 1706421 16749 3 0 0 0 4096 1548869386 
  ?) ?)
In srfi/srfi-1.scm:
  466:18 16 (fold #<procedure 1a60160 at ice-9/ftw.scm:452:38 
  (sub?> ?)
In unknown file:
         15 (_ #<procedure 1917240 at ice-9/ftw.scm:454:44 ()> 
         #<p?> ?)
In ice-9/ftw.scm:
  452:32 14 (loop _ _ #(21 1739151 16749 3 0 0 0 4096 1548869386 
  ?) ?)
In srfi/srfi-1.scm:
  466:18 13 (fold #<procedure 1b8f8c0 at ice-9/ftw.scm:452:38 
  (sub?> ?)
In unknown file:
         12 (_ #<procedure 1b5bc90 at ice-9/ftw.scm:454:44 ()> 
         #<p?> ?)
In ice-9/ftw.scm:
  452:32 11 (loop _ _ #(21 1772091 16749 13 0 0 0 4096 1548869389 
  ?) ?)
In srfi/srfi-1.scm:
  466:18 10 (fold #<procedure 1b8f280 at ice-9/ftw.scm:452:38 
  (sub?> ?)
In unknown file:
          9 (_ #<procedure 1a56750 at ice-9/ftw.scm:454:44 ()> 
          #<p?> ?)
In ice-9/ftw.scm:
  452:32  8 (loop _ _ #(21 2132258 16749 98 0 0 0 4096 1548869432 
  ?) ?)
In srfi/srfi-1.scm:
  466:18  7 (fold #<procedure 140dd20 at ice-9/ftw.scm:452:38 
  (sub?> ?)
In unknown file:
          6 (_ #<procedure 19ea030 at ice-9/ftw.scm:454:44 ()> 
          #<p?> ?)
In ice-9/ftw.scm:
  452:32  5 (loop _ _ #(21 4589344 16749 24 0 0 0 4096 1548869676 
  ?) ?)
In srfi/srfi-1.scm:
  466:18  4 (fold #<procedure 1969540 at ice-9/ftw.scm:452:38 
  (sub?> ?)
In unknown file:
          3 (_ #<procedure 1725750 at ice-9/ftw.scm:454:44 ()> 
          #<p?> ?)
In ice-9/ftw.scm:
  482:39  2 (loop _ _ #(21 4589402 16749 3 0 0 0 4096 1548869687 
  ?) ?)
In ./guix/build/utils.scm:
  312:27  1 (_ 
  "/gnu/store/ricf82z3mqqrqim67jz3jlsglfm1g1a8-linux-?" ?)
In unknown file:
          0 (copy-file 
          "/gnu/store/ricf82z3mqqrqim67jz3jlsglfm1g1a?" ?)

ERROR: In procedure copy-file:
In procedure copy-file: No space left on device
copying 422 store items
boot program 
'/gnu/store/lbvrvrlqab4qpw9f907na445kppmknab-linux-vm-loader' 
terminated, rebooting
[ 1071.512054] Unregister pv shared memory for cpu 0
[ 1071.522414] reboot: Restarting system
[ 1071.542285] reboot: machine restart
successfully built 
/gnu/store/lbyq5790j5hfq3spbm76i1yw3sj41l8b-disk-image.drv
/gnu/store/dby523cy1l4wrqi8wwmk5ln9qr7g5mh8-disk-image

Kind regards,

T G-R

Sent from my GNU Emacs




Information forwarded to bug-guix <at> gnu.org:
bug#34276; Package guix. (Sun, 17 Mar 2019 12:10:02 GMT)

Message #8 received at 34276 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Tobias Geerinckx-Rice <me <at> tobias.gr>
Cc: 34276 <at> debbugs.gnu.org
Subject: Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
Date: Sun, 17 Mar 2019 13:09:35 +0100
[Message part 1 (text/plain, inline)]
Hello,

Tobias Geerinckx-Rice <me <at> tobias.gr> skribis:

> ERROR: In procedure copy-file:
> In procedure copy-file: No space left on device
> copying 422 store items
> boot program
> '/gnu/store/lbvrvrlqab4qpw9f907na445kppmknab-linux-vm-loader'
> terminated, rebooting
> [ 1071.512054] Unregister pv shared memory for cpu 0
> [ 1071.522414] reboot: Restarting system
> [ 1071.542285] reboot: machine restart
> successfully built
> /gnu/store/lbyq5790j5hfq3spbm76i1yw3sj41l8b-disk-image.drv

I investigated a bit.  I managed to get our code to cause a kernel panic
upon failure (patch below).  However, I failed to turn that guest kernel
panic into a different QEMU exit code.

I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
module in the guest, and “-device pvpanic” on the QEMU command line),
but unfortunately that thing is almost undocumented and I can’t get it
to turn the panic into a non-zero exit code, nor do I know if it’s
possible.

Thoughts anyone?

The other option would be to create a special file in the 9p mount
that’s shared with the host upon success, but that seems a bit hacky.
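A rough sketch of that marker-file idea, seen from the host side, is given below. It is written in Python rather than Guile purely for illustration, and everything in it is an assumption: the marker name `build-succeeded`, the helper `run_with_success_marker`, and the notion that the shared directory is visible to the host as an ordinary directory. It is not the code Guix uses.

```python
import subprocess
from pathlib import Path

# Hypothetical marker name; the guest would create this file in the
# shared directory only after its builder finished without error.
MARKER_NAME = "build-succeeded"


def run_with_success_marker(command, shared_dir):
    """Run COMMAND and report success only if it left the marker behind.

    The exit status of COMMAND is deliberately ignored: in the failure
    described in this report the VM exits with status 0 either way, so
    the marker file is the only trustworthy signal.
    """
    marker = Path(shared_dir) / MARKER_NAME
    # A stale marker from an earlier run must not count as success.
    marker.unlink(missing_ok=True)
    subprocess.run(command, check=False)
    return marker.exists()
```

The design choice here is that success has to be asserted explicitly by the guest, so any way of dying early (out of disk space, kernel panic, killed VM) is reported as a failure by default.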

Thanks,
Ludo’.

[Message part 2 (text/x-patch, inline)]
diff --git a/gnu/system/linux-initrd.scm b/gnu/system/linux-initrd.scm
index 983c6d81c8..cb29a656b9 100644
--- a/gnu/system/linux-initrd.scm
+++ b/gnu/system/linux-initrd.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018 Ludovic Courtès <ludo <at> gnu.org>
+;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo <at> gnu.org>
 ;;; Copyright © 2016 Mark H Weaver <mhw <at> netris.org>
 ;;; Copyright © 2016 Jan Nieuwenhuizen <janneke <at> gnu.org>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe <at> gmail.com>
@@ -279,6 +279,7 @@ FILE-SYSTEMS."
             "isci")                      ;for SAS controllers like Intel C602
           '())
 
+    "pvpanic"
     ,@virtio-modules))
 
 (define-syntax %base-initrd-modules
diff --git a/gnu/system/vm.scm b/gnu/system/vm.scm
index e561285964..b671c74ab8 100644
--- a/gnu/system/vm.scm
+++ b/gnu/system/vm.scm
@@ -187,8 +187,9 @@ made available under the /xchg CIFS share."
                   ;; When USER-BUILDER succeeds, reboot (indicating a
                   ;; success), otherwise die, which causes a kernel panic
                   ;; ("Attempted to kill init!").
-                  #~(when (zero? (system* #$user-builder))
-                      (reboot))))
+                  #~(if (zero? (system* #$user-builder))
+                        (reboot)
+                        (exit 1))))
 
   (let ((initrd (or initrd
                     (base-initrd file-systems

Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 01 May 2019 20:20:01 GMT)

Merged 34276 37164. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Sun, 01 Sep 2019 20:38:03 GMT)

Information forwarded to bug-guix <at> gnu.org:
bug#34276; Package guix. (Thu, 19 Mar 2020 20:06:02 GMT)

Message #15 received at 34276 <at> debbugs.gnu.org:

From: Brice Waegeneire <brice <at> waegenei.re>
To: 34276 <at> debbugs.gnu.org
Subject: Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
Date: Thu, 19 Mar 2020 20:05:08 +0000
Hello Ludovic,

> I investigated a bit.  I managed to get our code to cause a kernel panic
> upon failure (patch below).  However, I failed to turn that guest kernel
> panic into a different QEMU exit code.
> 
> I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
> module in the guest, and “-device pvpanic” on the QEMU command line),
> but unfortunately that thing is almost undocumented and I can’t get it
> to turn the panic into a non-zero exit code, nor do I know if it’s
> possible.
> 
> Thoughts anyone?

I looked into it a little and found out how to use pvpanic.
Unfortunately, it's not as straightforward as getting a non-zero exit
code from QEMU.  When the pvpanic device is added to a VM, as you did
with “-device pvpanic”, QEMU generates events[0] on the QMP interface
when a crash happens, and it then either shuts down or, when
--no-shutdown is used, pauses[1].

(gnu build marionette), which uses the “-monitor” interface, could be
adapted to use “-qmp”, a machine interface based on JSON.

The following is the log of a QMP session in which the guest panicked[2]:
--8<---------------cut here---------------start------------->8---
{
    "QMP": {
        "version": {
            "qemu": {
                "micro": 0,
                "minor": 2,
                "major": 4
            },
            "package": ""
        },
        "capabilities": [
            "oob"
        ]
    }
}
{ "execute": "qmp_capabilities" }
{
    "return": {
    }
}
{
    "timestamp": {
        "seconds": 1584645026,
        "microseconds": 936550
    },
    "event": "GUEST_PANICKED",
    "data": {
        "action": "pause"
    }
}
{
    "timestamp": {
        "seconds": 1584645026,
        "microseconds": 936675
    },
    "event": "GUEST_PANICKED",
    "data": {
        "action": "poweroff"
    }
}
{
    "timestamp": {
        "seconds": 1584645026,
        "microseconds": 936776
    },
    "event": "SHUTDOWN",
    "data": {
        "guest": true,
        "reason": "guest-panic"
    }
}
--8<---------------cut here---------------end--------------->8---
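As an illustration of how a log like the one above could be checked mechanically, here is a minimal Python sketch. It assumes the log is a plain concatenation of JSON objects, pretty-printed or not, and it relies only on the `event`, `data` and `reason` fields visible in the session above. The helper names `iter_json_objects` and `shutdown_reason` are made up for this example.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value from TEXT, a concatenation of JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def shutdown_reason(text):
    """Return the 'reason' of the first SHUTDOWN event in TEXT, or None."""
    for message in iter_json_objects(text):
        if message.get("event") == "SHUTDOWN":
            return message.get("data", {}).get("reason")
    return None
```

Applied to the session above, a caller could treat a reason of "guest-panic" as a failed build and anything else as a normal shutdown.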


[0]: 
https://github.com/qemu/qemu/blob/9ced5c7c20cb16dff0c2fa3242c3ee96b68cec2a/qapi/run-state.json#L339-L355
[1]: 
https://github.com/qemu/qemu/blob/4dd6517e369828171290b65e11f6a45aeeed15af/softmmu/vl.c#L1423-L1427
[2]: 
https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/qmp-intro.txt;hb=HEAD

Brice.




Information forwarded to bug-guix <at> gnu.org:
bug#34276; Package guix. (Sat, 21 Mar 2020 15:59:01 GMT)

Message #18 received at 34276 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Brice Waegeneire <brice <at> waegenei.re>
Cc: 34276 <at> debbugs.gnu.org
Subject: Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
Date: Sat, 21 Mar 2020 16:58:02 +0100
Hi Brice,

Brice Waegeneire <brice <at> waegenei.re> skribis:

>> I investigated a bit.  I managed to get our code to cause a kernel panic
>> upon failure (patch below).  However, I failed to turn that guest kernel
>> panic into a different QEMU exit code.
>> 
>> I tried to use the “pvpanic” paravirtualized device (the ‘pvpanic.ko’
>> module in the guest, and “-device pvpanic” on the QEMU command line),
>> but unfortunately that thing is almost undocumented and I can’t get it
>> to turn the panic into a non-zero exit code, nor do I know if it’s
>> possible.
>> 
>> Thoughts anyone?
>
> I looked into it a little and found out how to use pvpanic.
> Unfortunately, it's not as straightforward as getting a non-zero exit
> code from QEMU.  When the pvpanic device is added to a VM, as you did
> with “-device pvpanic”, QEMU generates events[0] on the QMP interface
> when a crash happens, and it then either shuts down or, when
> --no-shutdown is used, pauses[1].
>
> (gnu build marionette), which uses the “-monitor” interface, could be
> adapted to use “-qmp”, a machine interface based on JSON.
>
> The following is the log of a QMP session in which the guest panicked[2]:

Oooh, I see, thanks for digging into this!

Any idea how to implement it?  Is QMP a request/reply kind of interface
like the monitor?

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#34276; Package guix. (Sat, 21 Mar 2020 16:45:01 GMT)

Message #21 received at 34276 <at> debbugs.gnu.org:

From: Brice Waegeneire <brice <at> waegenei.re>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 34276 <at> debbugs.gnu.org
Subject: Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
Date: Sat, 21 Mar 2020 16:44:02 +0000
Hello Ludo,

On 2020-03-21 15:58, Ludovic Courtès wrote:
> Any idea how to implement it?  Is QMP a request/reply kind of interface
> like the monitor?

Not really, or I would have sent a patch instead.

QMP is similar to the monitor in the sense that you can send a command
and receive a reply, but it gives us access to more features; in our
case, asynchronous events.  To be notified by the pvpanic device that a
panic occurred on the guest, the following steps are needed:
1. Connect to the socket
2. Receive the server greeting
3. Send the capabilities request (qmp_capabilities)
4. Receive the capabilities response
5. Listen for GUEST_PANICKED events
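
The five steps above can be sketched as follows. This is a minimal, hedged sketch in Python rather than Guile, assuming that the caller has already connected a socket to QEMU's QMP endpoint (step 1) and that QEMU sends one JSON object per line, as it does when pretty-printing is off. The function name `wait_for_guest_panic` is made up for illustration and is not part of Guix or QEMU.

```python
import json
import socket


def wait_for_guest_panic(sock: socket.socket) -> bool:
    """Return True if a GUEST_PANICKED event arrives before the peer closes.

    SOCK must already be connected to a QMP endpoint (step 1).
    """
    reader = sock.makefile("r", encoding="utf-8")
    writer = sock.makefile("w", encoding="utf-8", newline="\n")

    # Step 2: the server speaks first, with a greeting keyed by "QMP".
    greeting = json.loads(reader.readline())
    if "QMP" not in greeting:
        raise ValueError("peer did not send a QMP greeting")

    # Step 3: leave capabilities-negotiation mode.
    writer.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    writer.flush()

    # Steps 4 and 5: the reply to the request and any asynchronous events
    # arrive on the same stream, so handle both in a single loop.
    for line in reader:
        if not line.strip():
            continue
        message = json.loads(line)
        if "error" in message:
            raise RuntimeError(f"QMP error: {message['error']}")
        if message.get("event") == "GUEST_PANICKED":
            return True
    # The stream ended (QEMU exited) without reporting a panic.
    return False
```

A caller would typically connect an `AF_UNIX` socket to a path given to QEMU with an option along the lines of `-qmp unix:/tmp/qmp.sock,server,nowait` (the path is only an example) and treat a `True` result as a failed build.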

The QMP specification is available here[0].

[0]: 
https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/qmp-spec.txt;hb=HEAD

Brice.




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Thu, 26 Mar 2020 22:59:02 GMT)

Notification sent to Tobias Geerinckx-Rice <me <at> tobias.gr>:
bug acknowledged by developer. (Thu, 26 Mar 2020 22:59:02 GMT)

Message #26 received at 34276-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Tobias Geerinckx-Rice <me <at> tobias.gr>
Cc: 34276-done <at> debbugs.gnu.org
Subject: Re: bug#34276: ‘guix system disk-image’ successfully builds a bad image
Date: Thu, 26 Mar 2020 23:57:53 +0100
Hi,

Ludovic Courtès <ludo <at> gnu.org> skribis:

> The other option would be to create a special file in the 9p mount
> that’s shared with the host upon success, but that seems a bit hacky.

Turns out that was easily done and better than the status quo.
Done in commit be6520e6a58d7f6ee58f4cab76db9d1245410113!

Ludo’.




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Thu, 26 Mar 2020 22:59:02 GMT)

Notification sent to Jesse Gibbons <jgibbons2357 <at> gmail.com>:
bug acknowledged by developer. (Thu, 26 Mar 2020 22:59:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 24 Apr 2020 11:24:06 GMT)

This bug report was last modified 4 years and 3 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.