GNU bug report logs - #40927
[PATCH] Allow resume from swap device during boot

Previous Next

Package: guix-patches;

Reported by: Jean-Baptiste Note <jean-baptiste.note <at> m4x.org>

Date: Tue, 28 Apr 2020 09:25:02 UTC

Severity: normal

Tags: patch

Done: Tobias Geerinckx-Rice <me <at> tobias.gr>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 40927 in the body.
You can then email your comments to 40927 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to guix-patches <at> gnu.org:
bug#40927; Package guix-patches. (Tue, 28 Apr 2020 09:25:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jean-Baptiste Note <jean-baptiste.note <at> m4x.org>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Tue, 28 Apr 2020 09:25:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jean-Baptiste Note <jean-baptiste.note <at> m4x.org>
To: guix-patches <at> gnu.org
Subject: [PATCH] Allow resume from swap device during boot
Date: Tue, 28 Apr 2020 07:26:15 +0000
[Message part 1 (text/plain, inline)]
Dear GUIX maintainers,

Current the GUIX SD boot process does not allow resuming from sleep,
even thought sleep options are available through loginctl, eg:

loginctl hybrid-sleep
loginctl hibernate

This is a very important feature for people like me using GUIX SD on a
laptop (yes, it is possible, mine is a corebooted X230 running
linux-libre!)

This patch is based on a patch floating around. The core functionality
has been isolated, the resume function isolated, the patch rebased and
tested. I'm not taking credit for it, even though tracing the exact
origin is hard.

The resume hook is called if the resume= kernel argument is provided,
which one can do during system configuration.

My scheme level is zero, so please bear with me. In particular, some
conditionals could maybe be moved within the function, or the function
itself called within some already-available hooks. Also it is not clear
if the commit log is adequate for such a change.

Please let me know how to improve this and get this merged; I can also
write some documentation (probably once the mechanism is in place) to
explain how the feature can be used.

Kind regards,
Jean-Baptiste

[0001-linux-boot-Add-support-for-resuming-from-swap-device.patch (text/x-patch, attachment)]
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#40927; Package guix-patches. (Fri, 01 May 2020 14:52:01 GMT) Full text and rfc822 format available.

Message #8 received at 40927 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Jean-Baptiste Note <jean-baptiste.note <at> m4x.org>
Cc: 40927 <at> debbugs.gnu.org
Subject: Re: [bug#40927] [PATCH] Allow resume from swap device during boot
Date: Fri, 1 May 2020 16:50:45 +0200
[Message part 1 (text/plain, inline)]
Hi Jean-Baptiste Note,

On Tue, 28 Apr 2020 07:26:15 +0000
Jean-Baptiste Note <jean-baptiste.note <at> m4x.org> wrote:

> linux-libre/Documentation/swsusp.txt

Should be linux-libre/Documentation/power/swsusp.txt .

Otherwise OK.

Please try to find the names of the actual authors so we can commit the parts
they wrote with them as author.

Who provides the "resume=" argument eventually?

Also, if we want to support this, what would the Guix installer have to do?

Add a swap partition?  Does it have to be exactly as big as the amount of RAM
or does it need to be bigger because of overhead?
[Message part 2 (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#40927; Package guix-patches. (Fri, 01 May 2020 16:36:01 GMT) Full text and rfc822 format available.

Message #11 received at 40927 <at> debbugs.gnu.org (full text, mbox):

From: Tobias Geerinckx-Rice <me <at> tobias.gr>
To: guix-patches <at> gnu.org
Cc: 40927 <at> debbugs.gnu.org
Subject: Re: [bug#40927] [PATCH] Allow resume from swap device during boot
Date: Fri, 01 May 2020 18:35:28 +0200
[Message part 1 (text/plain, inline)]
Jean-Baptiste,

Jean-Baptiste Note 写道:
> This is a very important feature for people like me using GUIX 
> SD on a
> laptop (yes, it is possible, mine is a corebooted X230 running
> linux-libre!)

Greetings, brother in hardware.

Note that hibernation and resumption already work fine if you rely 
on the kernel's built-in support.  I've been hibernating my X230T 
for years without patches.

This explicit initramfs support is only needed if your storage 
drivers aren't available when the kernel itself tries to resume, 
and the initramfs has to retry later.  That's slower but allows 
things like ahci to be modular instead of built-in.

> This patch is based on a patch floating around. The core 
> functionality
> has been isolated, the resume function isolated, the patch 
> rebased and
> tested. I'm not taking credit for it, even though tracing the 
> exact
> origin is hard.

Heh.  It's certainly very similar to a patch of mine that's ‘out 
there’ although it's probably not the only one.

> The resume hook is called if the resume= kernel argument is 
> provided,
> which one can do during system configuration.

This patch ignores the ‘noresume’ flag, which is bad.  If it's 
present anywhere on the command line resumption should be skipped:

+        (when (and (not (member "noresume" args))
+                   resume-device
+                   (file-exists? resume-device)
+                   (file-exists? "/sys/power/resume"))
+          (resume-from-device resume-device))

> My scheme level is zero, so please bear with me. In particular, 
> some
> conditionals could maybe be moved within the function,

I'd like to see everything moved into a self-contained DTRT 
[try-to-]resume procedure, except for the ‘noresume’ check (so 
callers are free to explicitly resume if they so please):

+        (unless (member "noresume" args)
+                (resume-if-hibernated  (find-long-option "resume" 
args)))

This is what the last iteration of my patch does.

If (find-long-option "resume" args) is #f, fall back to 
CONFIG_PM_STD_PARTITION.  This is what the kernel does: even if 
‘resume=’ is missing, it will try to resume from that device if 
specified.  We should match that behaviour if we can.

Problem is, I forgot how to get that value from user space, or if 
it's even possible.  I gave up on Linux's built-in hibernation 
(swsusp) years ago.  It's poorly written and maintained, and the 
author fiercely defends it from all improvement.

Instead I use TuxOnIce, which exposes it under 
/sys/power/tuxonice/….  That's obviously not an option here, 
although it would be friendly to fall back to it for us ToI users 
:-)

*user.

I think ToI even exposes the ‘last used hibernation device’ 
somewhere, so user space can just use that instead of 
CONFIG_PM_STD_PARTITION.  This is obviously the right thing to do. 
Again, not sure if swsusp does.

> or the function itself called within some already-available 
> hooks.

We don't have a concept of ‘initramfs hooks’ and I think that's a 
good thing.  Gives me dracut flashbacks.  Don't lets bother with 
them until we need them, which is never.

> Also it is not clear if the commit log is adequate for such a 
> change.

It's all right :-)  If anything, it's too long:

 linux-boot: Add support for resuming from swap device.

 * gnu/build/linux-boot.scm (resume-from-device): New procedure.
 * gnu/build/linux-boot.scm (boot-system): Call it, unless 
 ‘noresume’
   is present on the kernel command line.

> Please let me know how to improve this and get this merged; I 
> can also
> write some documentation (probably once the mechanism is in 
> place) to
> explain how the feature can be used.

If this works properly no documentation is needed.  The kernel by 
default writes to the first swap device; we should magically 
resume from it.  Forcing users to know (or care) about part 2 is 
asymmetrical and wrong.

Not sure if that's possible with vanilla Linux-Libre…

Will stop shilling ToI for now,

T G-R
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#40927; Package guix-patches. (Fri, 01 May 2020 16:53:02 GMT) Full text and rfc822 format available.

Information forwarded to guix-patches <at> gnu.org:
bug#40927; Package guix-patches. (Sun, 03 May 2020 11:06:01 GMT) Full text and rfc822 format available.

Message #17 received at 40927 <at> debbugs.gnu.org (full text, mbox):

From: Tobias Geerinckx-Rice <me <at> tobias.gr>
To: 40927 <at> debbugs.gnu.org
Subject: Re: [bug#40927] [PATCH] Allow resume from swap device during boot
Date: Sun, 03 May 2020 13:05:52 +0200
[Message part 1 (text/plain, inline)]
Hullo again,

Tobias Geerinckx-Rice 写道:
> This is what the last iteration of my patch does.

I managed to find it!  Long live dusty back-ups.

It uses native procedures to divine the device number, instead of 
— clever! — /sys/blockery that supports only a subset of specs 
(and reminds me too much of ‘look what I found lying around’ shell 
scripting).  If any are missing we can add them to 
CANONICALIZE-DEVICE-SPEC to the benefit of all.

Tested with both built-in & modular ATA drivers × mainline swsusp 
& TuxOnIce.  More welcome.

These copyright dates are downright embarrassing.  Let's Get 
(something like) This Merged!

Kind regards,

T G-R

[0001-linux-boot-Resume-from-hibernation.patch (text/x-patch, inline)]
From 46c4c1010d9257f3d1d1ddb201dc7f7519d42ba0 Mon Sep 17 00:00:00 2001
From: Tobias Geerinckx-Rice <me <at> tobias.gr>
Date: Fri, 26 Feb 2016 17:57:07 +0100
Subject: [PATCH] linux-boot: Resume from hibernation.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* gnu/build/linux-boot.scm (resume-if-hibernated): New procedure.
(boot-system): Call it unless ‘noresume’ was specified.
---
 gnu/build/linux-boot.scm | 52 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/gnu/build/linux-boot.scm b/gnu/build/linux-boot.scm
index 05e833c0c6..74e76b6a31 100644
--- a/gnu/build/linux-boot.scm
+++ b/gnu/build/linux-boot.scm
@@ -1,5 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo <at> gnu.org>
+;;; Copyright © 2016, 2017, 2019 Tobias Geerinckx-Rice <me <at> tobias.gr>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe <at> gmail.com>
 ;;; Copyright © 2019 Guillaume Le Vaillant <glv <at> posteo.net>
 ;;;
@@ -110,6 +111,51 @@ OPTION doesn't appear in ARGUMENTS."
                        (substring arg (+ 1 (string-index arg #\=)))))
                 arguments)))
 
+(define (resume-if-hibernated device)
+  "Resume from hibernation if possible.  This is safe ONLY if no on-disk file
+systems have been mounted; calling it later risks severe file system corruption!
+See <Documentation/swsusp.txt> in the kernel source directory.  This is the
+caller's responsibility, as is catching exceptions if resumption was supposed to
+happen but didn't.
+
+Resume only from DEVICE if it's a string.  If it's #f, use the kernel's default
+hibernation device (CONFIG_PM_STD_PARTITION).  Never return if resumption
+succeeds.  Return nothing otherwise.  The kernel logs any details to dmesg."
+
+  (define (string->major:minor string)
+    "Return a string with MAJOR:MINOR numbers of the device specified by STRING"
+
+    ;; The "resume=" kernel command-line option always provides a string, which
+    ;; can represent a device, a UUID, or a label.  Check for all three.
+    (let* ((spec (cond ((string-prefix? "/" string) string)
+                       ((uuid string) => identity)
+                       (else (file-system-label string))))
+           ;; XXX kernel's swsusp_resume_can_resume() waits if ‘resumewait’ is
+           ;; found on the command line; our canonicalize-device-spec gives up
+           ;; after 20 seconds.  We could loop (ew!) if someone relies on it…
+           (device (canonicalize-device-spec spec))
+           (rdev (stat:rdev (stat device)))
+           (minor (modulo rdev 256))
+           (major (/ (- rdev minor) 256)))
+      (format #f "~a:~a" major minor)))
+
+  ;; Write the MAJOR:MINOR numbers of the specified or default resume DEVICE to
+  ;; this magic file.  The kernel will immediately try to resume from it.
+  (let ((resume "/sys/power/resume"))
+    (when (file-exists? resume)         ; this kernel supports hibernation
+      ;; Honour the kernel's default device (only) if none other was given.
+      (let ((major:minor (if device
+                             (string->major:minor device)
+                             (let ((default (call-with-input-file resume
+                                              read-line)))
+                               ;; Don't waste time echoing ‘nothing’ to /sys.
+                               (if (string=? "0:0" default)
+                                   #f
+                                   default)))))
+        (when major:minor
+          (call-with-output-file resume ; may throw an ‘Invalid argument’
+            (cut display major:minor <>))))))) ; may never return
+
 (define* (make-disk-device-nodes base major #:optional (minor 0))
   "Make the block device nodes around BASE (something like \"/root/dev/sda\")
 with the given MAJOR number, starting with MINOR."
@@ -504,6 +550,12 @@ upon error."
         (load-linux-modules-from-directory linux-modules
                                            linux-module-directory)
 
+        (unless (member "noresume" args)
+          ;; Try to resume immediately after loading (storage) modules
+          ;; but before any on-disk file systems have been mounted.
+          (false-if-exception           ; failure is not fatal
+           (resume-if-hibernated (find-long-option "resume" args))))
+
         (when keymap-file
           (let ((status (system* "loadkeys" keymap-file)))
             (unless (zero? status)
-- 
2.25.2

[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#40927; Package guix-patches. (Tue, 05 May 2020 18:08:01 GMT) Full text and rfc822 format available.

Message #20 received at 40927 <at> debbugs.gnu.org (full text, mbox):

From: Jean-Baptiste Note <jean-baptiste.note <at> m4x.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 40927 <at> debbugs.gnu.org
Subject: Re: [bug#40927] [PATCH] Allow resume from swap device during boot
Date: Tue, 05 May 2020 18:07:15 +0000
[Message part 1 (text/plain, inline)]
Hi Danny,

Thanks for being so guix in answering the original report.
Your questions prompted me to take a bit of time before answering.

Regarding attribution, i've dug two instances of this patch:

https://git.pantherx.org/mirror/guix/commit/af8d58efa05927b24694c87379cccc378f3fdde7

https://issues.guix.gnu.org/issue/37290
https://lists.gnu.org/archive/html/bug-guix/2019-09/msg00017.html

Both instances of the patch are given by author Mark H Weaver <mhw <at> netris.org>

As you can see there's already an open issue in GUIX regarding resume;
i'm sorry i missed that. Maybe for attribution it would make sense for
me to resend the patch there; let me know how to proceed.

Regarding the patch itself and your questions:

  - for comparison to the original patch: elogind currently has the
    configurability required to make the second half of the original
    patch moot, and I think it would be better to split the work
    anyways;

  - i've addressed the reference to swsusp.txt

  - the resume argument is provided by (in my case) the system
    definition, so I basically have something like that:

#+begin_src scheme
(operating-system
  (kernel-arguments
  (cons
   (string-append "modprobe.blacklist="
		  (comma-separated
		   %redundant-linux-modules))
  '("--resume=/dev/sdb3" "quiet"))))
#end_src

  - in order to support this in the GUIX installer, I guess adding such
    a line in the generated system definition, along with a swap
    partition, would be sufficient. My knowledge of the workings of the
    guix installer is null, tough :)

  - please note that if you have a suspend signature on your swap
    partition, and fail to resume from it (for instance because of a
    missing kernel commandline argument...), then shepherd will fail
    bringing up the swap device, and you will need to handle this
    situation by hand currently;

  - My knowledge of the details of swsusp is rather limited: I don't
    know, for instance, how this would translate in case of a more
    complex setup of multi-swap devices;

  - However, it seems very clear from the documentation that a mere
    resume= argument syntax can trigger specific initialization paths
    within the kernel itself leading to a resume. I find it better
    style, ultimately, to properly separate the responsibilities of
    kernel and initrd, follow the GUIX conventions with a --resume
    argument.

  - therefore I propose the following, updated patch, with a more
    GUIX-like --resume argument; I've tested it on my laptop, which
    happens to have an encrypted /home partition (but only one swap
    device).

  - as far as I can understand the documentation, the swap partition
    does not have to be as big as the RAM; by default the kernel will
    strive to make the suspend image 2/5 of the physical RAM, but this
    can be tuned more aggressively. There's actually less than overhead:
    some parts of the memory can be discarded (shared MMAPd files, the
    buffer cache); and even then you can hope compression will reduce
    the needs. If the swap space is not large enough, suspend will
    simply fail.

This probably raises more issues than it solves, however i'd rather you
have the whole picture. Let me know how to proceed; I'm willing to
follow-up with other patches to get the feature up to the required
standards of quality.

Kind regards,
Jean-Baptiste
[0001-linux-boot-Add-support-for-resuming-from-swap-device.patch (text/x-patch, attachment)]
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#40927; Package guix-patches. (Tue, 05 May 2020 19:30:02 GMT) Full text and rfc822 format available.

Message #23 received at 40927 <at> debbugs.gnu.org (full text, mbox):

From: Jean-Baptiste Note <jean-baptiste.note <at> m4x.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 40927 <at> debbugs.gnu.org
Subject: Re: [bug#40927] [PATCH] Allow resume from swap device during boot
Date: Tue, 05 May 2020 19:29:36 +0000
[Message part 1 (text/plain, inline)]
Hi there,

As i'm not subscribed to the list, I completely missed on Tobias'
messages -- and my previous one must seem strange in hindsight. Sorry
about this.

I'm now subscribed, but still can't reply directly to his email.

Though I still won't be able to answer directly to Tobias' latest email,
I will test the patch, and /somehow/ report back on Thursday.

For now I must say that on my kernel -- official guix linux-libre-5.4 --
the built-in support does not seem to be working (I'm pretty sure i've
tested this); I will double-check however.

Kind regards,
Jean-Baptiste
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#40927; Package guix-patches. (Thu, 07 May 2020 21:00:02 GMT) Full text and rfc822 format available.

Message #26 received at 40927 <at> debbugs.gnu.org (full text, mbox):

From: Jean-Baptiste Note <jean-baptiste.note <at> m4x.org>
To: me <at> tobias.gr, Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 40927 <at> debbugs.gnu.org
Subject: Subject=[bug#40927] [PATCH] Allow resume from swap device during boot
Date: Thu, 07 May 2020 20:58:45 +0000
[Message part 1 (text/plain, inline)]
Dear Tobias,

I hopefully found a way to reply to your message directly. I've checked
resume with your patch -- actually a slightly reworked version of it; a
few remarks:

- I've amended it as per Danny's suggestions regarding documentation location

- I've adjusted it to take the guixy --resume and --noresume options
  rather than the kernel norm resume and noresume. In particular this
  allows me to test the 'kernel' resume path and the 'initrd' resume
  path with the same os definition quickly, only adjusting the GRUB
  cmdline within GRUB.

- I find the added copyright years on top strange -- shouldn't it be
  2020 ? 

The results of my tests are attached as a separate file. In a nutshell:

- I can confirm that on my kernel (default guix linux-libre-5.4) the
  resume= option by itself does not resume at all. This seems to have been
  reported already:
  https://lists.gnu.org/archive/html/bug-guix/2016-11/msg00060.html

- resume-by-uuid from userspace does not work
- resume-by-label from userspace does not work either
(as much as I'd like them to).

(please note that, on top of this, I found no way to specify the swap
devices in the os definition by UUID or LABEL either -- so the
limitation does not bother me in the current state).

Each time these loop for 20 seconds, waiting for the partition to appear
(the partition is there, from dmesg, but somehow is not in the database
looked for in (canonicalize-device-spec spec) -- weird).

Regarding the interface, I'm not sure we should feel bound by the
argument naming convention of the kernel for resume handling from
userspace -- we could leave its own namespace for the kernel to handle,
for those kernels that can, and separate the resume handling as done by
initrd in a separate, more guixy namespace.

Actually I don't really understand why we would want overlapping of
names for two different codepaths; I think separating the names brings
more flexibility (you can control from the GRUB commandline which resume
path you want to take) and less confusion (it is clearer from the
commandline where the resuming will/should be handled).

To summarize:

* I still think that this feature is greatly needed: we can hibernate
  with the current software stack, but the default kernel won't resume,
  leaving the system in a bad state (swap is not activated by shepherd,
  see logs), and we need manual swapon to leave this bad state;

* This version of the patch looks like it can handle UUID and LABEL,
  while it can't -- for reasons that i've not dug;

* Meanwhile I find resume-by-uuid or resume-by-label currently
  pointless, given the limited ways we have to specify swap devices in
  the OS definition file.

I'd be in favor of doing the following:

* Dropping support for UUID and LABEL in the code *OR* signalling
  clearly in some comment that the path is currently non-functional;

* Including the patch ASAP to avoid the really bad state we're in
  currently after a suspend.

* Using the guixy --resume and --noresume options rather than the kernel
  ones

Please let me know how to proceed, and let me know how to handle the
copyright notice.

Kind regards,
Jean-Baptiste

[tests_resume.txt (text/plain, attachment)]
[0001-linux-boot-Resume-from-hibernation.patch (text/x-patch, attachment)]
[signature.asc (application/pgp-signature, inline)]

Reply sent to Tobias Geerinckx-Rice <me <at> tobias.gr>:
You have taken responsibility. (Sun, 09 May 2021 19:35:03 GMT) Full text and rfc822 format available.

Notification sent to Jean-Baptiste Note <jean-baptiste.note <at> m4x.org>:
bug acknowledged by developer. (Sun, 09 May 2021 19:35:03 GMT) Full text and rfc822 format available.

Message #31 received at 40927-done <at> debbugs.gnu.org (full text, mbox):

From: Tobias Geerinckx-Rice <me <at> tobias.gr>
To: 37290-done <at> debbugs.gnu.org, 40927-done <at> debbugs.gnu.org
Subject: Close hibernation bugs
Date: Sun, 09 May 2021 21:34:51 +0200
[Message part 1 (text/plain, inline)]
Hibernation has been supported on Guix Systems for a very long 
time now.

I'm not aware of any, but any failures to hibernate are probably 
configuration or hardware-specific and should be reported as new 
bugs.

Kind regards,

T G-R
[signature.asc (application/pgp-signature, inline)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 07 Jun 2021 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 24 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.