GNU bug report logs - #68504
[PATCH] Add copy-on-write support to scm_copy_file.


Package: guile;

Reported by: Tomas Volf <~@wolfsden.cz>

Date: Tue, 16 Jan 2024 12:49:02 UTC

Severity: normal

Tags: patch

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 68504 in the body.
You can then email your comments to 68504 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guile <at> gnu.org:
bug#68504; Package guile. (Tue, 16 Jan 2024 12:49:02 GMT)

Acknowledgement sent to Tomas Volf <~@wolfsden.cz>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Tue, 16 Jan 2024 12:49:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: bug-guile <at> gnu.org
Cc: Tomas Volf <~@wolfsden.cz>
Subject: [PATCH] Add copy-on-write support to scm_copy_file.
Date: Tue, 16 Jan 2024 13:48:17 +0100
On modern file-systems (BTRFS, ZFS) it is possible to copy a file using
copy-on-write method.  For large files it has the advantage of being
much faster and saving disk space (since identical extents are not
duplicated).  This feature is stable and for example coreutils' `cp'
does use it automatically (see --reflink).

This commit adds support for this feature into our
copy-file (scm_copy_file) procedure.  Same as `cp', it defaults to
'auto, meaning the copy-on-write is attempted, and in case of failure
the regular copy is performed.

No tests are provided, because the behavior depends on the system,
underlying file-system and its configuration.  That makes it challenging
to write a test for it.  Manual testing was performed instead:

    $ btrfs filesystem du /tmp/cow*
         Total   Exclusive  Set shared  Filename
      36.00KiB    36.00KiB       0.00B  /tmp/cow

    $ cat cow-test.scm
    (copy-file "/tmp/cow" "/tmp/cow-unspecified")
    (copy-file "/tmp/cow" "/tmp/cow-always" #:copy-on-write 'always)
    (copy-file "/tmp/cow" "/tmp/cow-auto" #:copy-on-write 'auto)
    (copy-file "/tmp/cow" "/tmp/cow-never" #:copy-on-write 'never)
    (copy-file "/tmp/cow" "/dev/shm/cow-unspecified")
    (copy-file "/tmp/cow" "/dev/shm/cow-auto" #:copy-on-write 'auto)
    (copy-file "/tmp/cow" "/dev/shm/cow-never" #:copy-on-write 'never)
    $ ./meta/guile -s cow-test.scm

    $ btrfs filesystem du /tmp/cow*
         Total   Exclusive  Set shared  Filename
      36.00KiB       0.00B    36.00KiB  /tmp/cow
      36.00KiB       0.00B    36.00KiB  /tmp/cow-always
      36.00KiB       0.00B    36.00KiB  /tmp/cow-auto
      36.00KiB    36.00KiB       0.00B  /tmp/cow-never
      36.00KiB       0.00B    36.00KiB  /tmp/cow-unspecified

    $ sha1sum /tmp/cow* /dev/shm/cow*
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-always
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-auto
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-never
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-unspecified
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-auto
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-never
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-unspecified

This commit also adds two new failure modes for (copy-file).

Failure to copy-on-write when 'always was passed in:

    scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'always)
    ice-9/boot-9.scm:1676:22: In procedure raise-exception:
    In procedure copy-file: copy-on-write failed: Invalid cross-device link

Passing in invalid value for the #:copy-on-write keyword argument:

    scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'nevr)
    ice-9/boot-9.scm:1676:22: In procedure raise-exception:
    In procedure copy-file: invalid value for #:copy-on-write: nevr

* NEWS: Add note for copy-file supporting copy-on-write.
* configure.ac: Check for linux/fs.h.
* doc/ref/posix.texi (File System)[copy-file]: Document the new
signature.
* libguile/filesys.c (clone_file): New function cloning a file using
FICLONE, if supported.
(k_copy_on_write): New keyword.
(sym_always, sym_auto, sym_never): New symbols.
(scm_copy_file): New #:copy-on-write keyword argument.  Attempt
copy-on-write copy by default.
* libguile/filesys.h: Update signature for scm_copy_file.
---
 NEWS               |  9 ++++++
 configure.ac       |  1 +
 doc/ref/posix.texi |  9 +++++-
 libguile/filesys.c | 74 +++++++++++++++++++++++++++++++++++++++-------
 libguile/filesys.h |  2 +-
 5 files changed, 82 insertions(+), 13 deletions(-)

diff --git a/NEWS b/NEWS
index b319404d7..9147098c9 100644
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,15 @@ definitely unused---this is notably the case for modules that are only
 used at macro-expansion time, such as (srfi srfi-26).  In those cases,
 the compiler reports it as "possibly unused".
 
+** copy-file now supports copy-on-write
+
+The copy-file procedure now takes an additional keyword argument,
+#:copy-on-write, specifying whether copy-on-write should be done, if the
+underlying file-system supports it.  Possible values are 'always, 'auto
+and 'never, with 'auto being the default.
+
+This speeds up copying large files a lot while saving the disk space.
+
 * Bug fixes
 
 ** (ice-9 suspendable-ports) incorrect UTF-8 decoding
diff --git a/configure.ac b/configure.ac
index d0a2dc79b..c46586e9b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -418,6 +418,7 @@ AC_SUBST([SCM_I_GSC_HAVE_STRUCT_DIRENT64])
 #   sys/sendfile.h - non-POSIX, found in glibc
 #
 AC_CHECK_HEADERS([complex.h fenv.h io.h memory.h process.h \
+linux/fs.h \
 sys/dir.h sys/ioctl.h sys/select.h \
 sys/time.h sys/timeb.h sys/times.h sys/stdtypes.h sys/types.h \
 sys/utime.h unistd.h utime.h pwd.h grp.h sys/utsname.h \
diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
index fec42d061..d26808d91 100644
--- a/doc/ref/posix.texi
+++ b/doc/ref/posix.texi
@@ -896,10 +896,17 @@ of @code{delete-file}.  Why doesn't POSIX have a @code{rmdirat} function
 for this instead?  No idea!
 @end deffn
 
-@deffn {Scheme Procedure} copy-file oldfile newfile
+@deffn {Scheme Procedure} copy-file @var{oldfile} @var{newfile} @
+       [#:copy-on-write='auto]
 @deffnx {C Function} scm_copy_file (oldfile, newfile)
 Copy the file specified by @var{oldfile} to @var{newfile}.
 The return value is unspecified.
+
+@code{#:copy-on-write} keyword argument determines whether copy-on-write
+copy should be attempted and the behavior in case of failure.  Possible
+values are @code{'always} (attempt the copy-on-write, return error if it
+fails), @code{'auto} (attempt the copy-on-write, fallback to regular
+copy if it fails) and @code{'never} (perform the regular copy).
 @end deffn
 
 @deffn {Scheme Procedure} sendfile out in count [offset]
diff --git a/libguile/filesys.c b/libguile/filesys.c
index 1f0bba556..4fb8b9831 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -67,6 +67,11 @@
 # include <sys/sendfile.h>
 #endif
 
+#if defined(HAVE_SYS_IOCTL_H) && defined(HAVE_LINUX_FS_H)
+# include <linux/fs.h>
+# include <sys/ioctl.h>
+#endif
+
 #include "async.h"
 #include "boolean.h"
 #include "dynwind.h"
@@ -75,6 +80,7 @@
 #include "fports.h"
 #include "gsubr.h"
 #include "iselect.h"
+#include "keywords.h"
 #include "list.h"
 #include "load.h"	/* for scm_i_mirror_backslashes */
 #include "modules.h"
@@ -1255,20 +1261,49 @@ SCM_DEFINE (scm_readlink, "readlink", 1, 0, 0,
 }
 #undef FUNC_NAME
 
-SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
-            (SCM oldfile, SCM newfile),
+static int
+clone_file (int oldfd, int newfd)
+{
+#ifdef FICLONE
+  return ioctl (newfd, FICLONE, oldfd);
+#else
+  (void)oldfd;
+  (void)newfd;
+  errno = EOPNOTSUPP;
+  return -1;
+#endif
+}
+
+SCM_KEYWORD (k_copy_on_write, "copy-on-write");
+SCM_SYMBOL (sym_always, "always");
+SCM_SYMBOL (sym_auto, "auto");
+SCM_SYMBOL (sym_never, "never");
+
+SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 1,
+            (SCM oldfile, SCM newfile, SCM rest),
 	    "Copy the file specified by @var{oldfile} to @var{newfile}.\n"
-	    "The return value is unspecified.")
+	    "The return value is unspecified.\n"
+            "\n"
+            "@code{#:copy-on-write} keyword argument determines whether "
+            "copy-on-write copy should be attempted and the "
+            "behavior in case of failure.  Possible values are "
+            "@code{'always} (attempt the copy-on-write, return error if "
+            "it fails), @code{'auto} (attempt the copy-on-write, "
+            "fallback to regular copy if it fails) and @code{'never} "
+            "(perform the regular copy)."
+            )
 #define FUNC_NAME s_scm_copy_file
 {
   char *c_oldfile, *c_newfile;
   int oldfd, newfd;
   int n, rv;
+  SCM cow = sym_auto;
+  int clone_res;
   char buf[BUFSIZ];
   struct stat_or_stat64 oldstat;
 
   scm_dynwind_begin (0);
-  
+
   c_oldfile = scm_to_locale_string (oldfile);
   scm_dynwind_free (c_oldfile);
   c_newfile = scm_to_locale_string (newfile);
@@ -1292,13 +1327,30 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
       SCM_SYSERROR;
     }
 
-  while ((n = read (oldfd, buf, sizeof buf)) > 0)
-    if (write (newfd, buf, n) != n)
-      {
-	close (oldfd);
-	close (newfd);
-	SCM_SYSERROR;
-      }
+  scm_c_bind_keyword_arguments ("copy-file", rest, 0,
+                                k_copy_on_write, &cow,
+                                SCM_UNDEFINED);
+
+  if (scm_is_eq (cow, sym_always) || scm_is_eq (cow, sym_auto))
+    clone_res = clone_file(oldfd, newfd);
+  else if (scm_is_eq (cow, sym_never))
+    clone_res = -1;
+  else
+    scm_misc_error ("copy-file",
+                    "invalid value for #:copy-on-write: ~S",
+                    scm_list_1 (cow));
+
+  if (scm_is_eq (cow, sym_always) && clone_res)
+    scm_syserror ("copy-file: copy-on-write failed");
+
+  if (clone_res)
+    while ((n = read (oldfd, buf, sizeof buf)) > 0)
+      if (write (newfd, buf, n) != n)
+        {
+          close (oldfd);
+          close (newfd);
+          SCM_SYSERROR;
+        }
   close (oldfd);
   if (close (newfd) == -1)
     SCM_SYSERROR;
diff --git a/libguile/filesys.h b/libguile/filesys.h
index 1ce50d30e..4f620dfef 100644
--- a/libguile/filesys.h
+++ b/libguile/filesys.h
@@ -73,7 +73,7 @@ SCM_API SCM scm_symlink (SCM oldpath, SCM newpath);
 SCM_API SCM scm_symlinkat (SCM dir, SCM oldpath, SCM newpath);
 SCM_API SCM scm_readlink (SCM path);
 SCM_API SCM scm_lstat (SCM str);
-SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile);
+SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile, SCM rest);
 SCM_API SCM scm_mkstemp (SCM tmpl);
 SCM_API SCM scm_mkdtemp (SCM tmpl);
 SCM_API SCM scm_dirname (SCM filename);
-- 
2.41.0
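
For readers who want to try the clone-then-fall-back behaviour outside of
Guile, here is a minimal standalone sketch.  It is not the code from the
patch.  It assumes a POSIX system; the names try_clone, copy_bytes and
copy_auto are hypothetical, and FICLONE is only used when the system
headers define it.  Unlike the loop in the patch, the fallback retries
short writes instead of treating them as an error.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#if defined(__linux__)
# include <linux/fs.h>          /* defines FICLONE on recent kernels */
#endif

/* Ask the file system to make NEWFD share extents with OLDFD.
   Returns 0 on success, -1 with errno set otherwise.  */
static int
try_clone (int oldfd, int newfd)
{
#ifdef FICLONE
  return ioctl (newfd, FICLONE, oldfd);
#else
  (void) oldfd;
  (void) newfd;
  errno = EOPNOTSUPP;
  return -1;
#endif
}

/* Regular copy: read from OLDFD and write everything to NEWFD.  */
static int
copy_bytes (int oldfd, int newfd)
{
  char buf[8192];
  ssize_t n;

  while ((n = read (oldfd, buf, sizeof buf)) > 0)
    {
      ssize_t done = 0;
      while (done < n)
        {
          ssize_t w = write (newfd, buf + done, (size_t) (n - done));
          if (w < 0)
            return -1;
          done += w;
        }
    }
  return n < 0 ? -1 : 0;
}

/* Behaviour corresponding to 'auto: try to clone, and fall back to a
   regular copy if cloning fails for any reason.  */
int
copy_auto (const char *oldpath, const char *newpath)
{
  int oldfd, newfd, rv;

  oldfd = open (oldpath, O_RDONLY);
  if (oldfd < 0)
    return -1;

  newfd = open (newpath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (newfd < 0)
    {
      close (oldfd);
      return -1;
    }

  rv = try_clone (oldfd, newfd);
  if (rv != 0)
    rv = copy_bytes (oldfd, newfd);

  close (oldfd);
  if (close (newfd) < 0)
    rv = -1;
  return rv;
}
```

Calling copy_auto ("/tmp/cow", "/dev/shm/cow-auto") should succeed on any
file system; whether extents end up shared depends on the file system, as
in the btrfs output shown in the commit message.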





Information forwarded to bug-guile <at> gnu.org:
bug#68504; Package guile. (Wed, 24 Jan 2024 10:28:01 GMT)

Message #8 received at 68504 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Tomas Volf <~@wolfsden.cz>
Cc: 68504 <at> debbugs.gnu.org
Subject: Re: bug#68504: [PATCH] Add copy-on-write support to scm_copy_file.
Date: Wed, 24 Jan 2024 11:26:56 +0100
Hi,

Tomas Volf <~@wolfsden.cz> skribis:

> On modern file-systems (BTRFS, ZFS) it is possible to copy a file using
> copy-on-write method.  For large files it has the advantage of being
> much faster and saving disk space (since identical extents are not
> duplicated).  This feature is stable and for example coreutils' `cp'
> does use it automatically (see --reflink).
>
> This commit adds support for this feature into our
> copy-file (scm_copy_file) procedure.  Same as `cp', it defaults to
> 'auto, meaning the copy-on-write is attempted, and in case of failure
> the regular copy is performed.
>
> No tests are provided, because the behavior depends on the system,
> underlying file-system and its configuration.  That makes it challenging
> to write a test for it.  Manual testing was performed instead:
>
>     $ btrfs filesystem du /tmp/cow*
>          Total   Exclusive  Set shared  Filename
>       36.00KiB    36.00KiB       0.00B  /tmp/cow
>
>     $ cat cow-test.scm
>     (copy-file "/tmp/cow" "/tmp/cow-unspecified")
>     (copy-file "/tmp/cow" "/tmp/cow-always" #:copy-on-write 'always)
>     (copy-file "/tmp/cow" "/tmp/cow-auto" #:copy-on-write 'auto)
>     (copy-file "/tmp/cow" "/tmp/cow-never" #:copy-on-write 'never)
>     (copy-file "/tmp/cow" "/dev/shm/cow-unspecified")
>     (copy-file "/tmp/cow" "/dev/shm/cow-auto" #:copy-on-write 'auto)
>     (copy-file "/tmp/cow" "/dev/shm/cow-never" #:copy-on-write 'never)
>     $ ./meta/guile -s cow-test.scm
>
>     $ btrfs filesystem du /tmp/cow*
>          Total   Exclusive  Set shared  Filename
>       36.00KiB       0.00B    36.00KiB  /tmp/cow
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-always
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-auto
>       36.00KiB    36.00KiB       0.00B  /tmp/cow-never
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-unspecified
>
>     $ sha1sum /tmp/cow* /dev/shm/cow*
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-always
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-auto
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-never
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-unspecified
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-auto
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-never
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-unspecified
>
> This commit also adds two new failure modes for (copy-file).
>
> Failure to copy-on-write when 'always was passed in:
>
>     scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'always)
>     ice-9/boot-9.scm:1676:22: In procedure raise-exception:
>     In procedure copy-file: copy-on-write failed: Invalid cross-device link
>
> Passing in invalid value for the #:copy-on-write keyword argument:
>
>     scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'nevr)
>     ice-9/boot-9.scm:1676:22: In procedure raise-exception:
>     In procedure copy-file: invalid value for #:copy-on-write: nevr
>
> * NEWS: Add note for copy-file supporting copy-on-write.
> * configure.ac: Check for linux/fs.h.
> * doc/ref/posix.texi (File System)[copy-file]: Document the new
> signature.
> * libguile/filesys.c (clone_file): New function cloning a file using
> FICLONE, if supported.
> (k_copy_on_write): New keyword.
> (sym_always, sym_auto, sym_never): New symbols.
> (scm_copy_file): New #:copy-on-write keyword argument.  Attempt
> copy-on-write copy by default.
> * libguile/filesys.h: Update signature for scm_copy_file.

The patch looks great (and very useful) to me, modulo one issue:

> -SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile);
> +SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile, SCM rest);

Since this is a public interface, we cannot change this function’s
signature during the 3.0 stable series.

Thus, I would suggest keeping the public ‘scm_copy_file’ unchanged and
internally having a three-argument variant.  The Scheme-level
‘copy-file’ would map to that three-argument variant.  (See
‘scm_pipe’ and ‘scm_accept’ for examples.)

Could you send an updated patch?

BTW, copyright assignment to the FSF is now optional but encouraged.
Please see
<https://lists.gnu.org/archive/html/guile-devel/2022-10/msg00008.html>.

Thanks,
Ludo’.
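
The suggestion above (keep the public two-argument function, add an
internal variant with the extra argument) can be illustrated in plain C
without libguile.  This is only a sketch of the pattern: cow_mode,
copy_file_with_mode, copy_file and last_mode_used are hypothetical names,
and the internal function is a stub that records its mode instead of
copying anything.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the 'always, 'auto and 'never symbols.  */
typedef enum { COW_ALWAYS, COW_AUTO, COW_NEVER } cow_mode;

/* Mode seen by the most recent call; exists only so that the sketch
   can be checked.  */
cow_mode last_mode_used = COW_NEVER;

/* Internal three-argument variant.  New functionality is added here,
   so the public signature below never has to change.  */
int
copy_file_with_mode (const char *oldfile, const char *newfile,
                     cow_mode mode)
{
  if (oldfile == NULL || newfile == NULL)
    return -1;
  last_mode_used = mode;
  /* A real implementation would perform the copy here.  */
  return 0;
}

/* Public entry point: same two-argument signature as before, so
   existing callers keep compiling and linking.  It forwards to the
   internal variant with the default mode.  */
int
copy_file (const char *oldfile, const char *newfile)
{
  return copy_file_with_mode (oldfile, newfile, COW_AUTO);
}
```

Existing callers of copy_file get the default mode without any source or
ABI change, while new callers can reach the extra argument through the
internal variant.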




Information forwarded to bug-guile <at> gnu.org:
bug#68504; Package guile. (Wed, 24 Jan 2024 19:12:02 GMT)

Message #11 received at 68504 <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: 68504 <at> debbugs.gnu.org
Cc: Tomas Volf <~@wolfsden.cz>
Subject: [PATCH v2] Add copy-on-write support to scm_copy_file.
Date: Wed, 24 Jan 2024 20:10:32 +0100
On modern file-systems (BTRFS, ZFS) it is possible to copy a file using
copy-on-write method.  For large files it has the advantage of being
much faster and saving disk space (since identical extents are not
duplicated).  This feature is stable and for example coreutils' `cp'
does use it automatically (see --reflink).

This commit adds support for this feature into our
copy-file (scm_copy_file) procedure.  Same as `cp', it defaults to
'auto, meaning the copy-on-write is attempted, and in case of failure
the regular copy is performed.

No tests are provided, because the behavior depends on the system,
underlying file-system and its configuration.  That makes it challenging
to write a test for it.  Manual testing was performed instead:

    $ btrfs filesystem du /tmp/cow*
         Total   Exclusive  Set shared  Filename
      36.00KiB    36.00KiB       0.00B  /tmp/cow

    $ cat cow-test.scm
    (copy-file "/tmp/cow" "/tmp/cow-unspecified")
    (copy-file "/tmp/cow" "/tmp/cow-always" #:copy-on-write 'always)
    (copy-file "/tmp/cow" "/tmp/cow-auto" #:copy-on-write 'auto)
    (copy-file "/tmp/cow" "/tmp/cow-never" #:copy-on-write 'never)
    (copy-file "/tmp/cow" "/dev/shm/cow-unspecified")
    (copy-file "/tmp/cow" "/dev/shm/cow-auto" #:copy-on-write 'auto)
    (copy-file "/tmp/cow" "/dev/shm/cow-never" #:copy-on-write 'never)
    $ ./meta/guile -s cow-test.scm

    $ btrfs filesystem du /tmp/cow*
         Total   Exclusive  Set shared  Filename
      36.00KiB       0.00B    36.00KiB  /tmp/cow
      36.00KiB       0.00B    36.00KiB  /tmp/cow-always
      36.00KiB       0.00B    36.00KiB  /tmp/cow-auto
      36.00KiB    36.00KiB       0.00B  /tmp/cow-never
      36.00KiB       0.00B    36.00KiB  /tmp/cow-unspecified

    $ sha1sum /tmp/cow* /dev/shm/cow*
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-always
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-auto
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-never
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-unspecified
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-auto
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-never
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-unspecified

This commit also adds two new failure modes for (copy-file).

Failure to copy-on-write when 'always was passed in:

    scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'always)
    ice-9/boot-9.scm:1676:22: In procedure raise-exception:
    In procedure copy-file: copy-on-write failed: Invalid cross-device link

Passing in invalid value for the #:copy-on-write keyword argument:

    scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'nevr)
    ice-9/boot-9.scm:1676:22: In procedure raise-exception:
    In procedure copy-file: invalid value for #:copy-on-write: nevr

* NEWS: Add note for copy-file supporting copy-on-write.
* configure.ac: Check for linux/fs.h.
* doc/ref/posix.texi (File System)[copy-file]: Document the new
signature.
* libguile/filesys.c (clone_file): New function cloning a file using
FICLONE, if supported.
(k_copy_on_write): New keyword.
(sym_always, sym_auto, sym_never): New symbols.
(scm_copy_file2): Renamed from scm_copy_file.  New #:copy-on-write
keyword argument.  Attempt copy-on-write copy by default.
(scm_copy_file): Call scm_copy_file2.
* libguile/filesys.h: Add scm_copy_file2 as SCM_INTERNAL.
---
v2: Introduce scm_copy_file2 in order to preserve backwards compatibility.

 NEWS               |  9 +++++
 configure.ac       |  1 +
 doc/ref/posix.texi |  9 ++++-
 libguile/filesys.c | 82 +++++++++++++++++++++++++++++++++++++++-------
 libguile/filesys.h |  1 +
 5 files changed, 89 insertions(+), 13 deletions(-)

diff --git a/NEWS b/NEWS
index b319404d7..9147098c9 100644
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,15 @@ definitely unused---this is notably the case for modules that are only
 used at macro-expansion time, such as (srfi srfi-26).  In those cases,
 the compiler reports it as "possibly unused".

+** copy-file now supports copy-on-write
+
+The copy-file procedure now takes an additional keyword argument,
+#:copy-on-write, specifying whether copy-on-write should be done, if the
+underlying file-system supports it.  Possible values are 'always, 'auto
+and 'never, with 'auto being the default.
+
+This speeds up copying large files a lot while saving the disk space.
+
 * Bug fixes

 ** (ice-9 suspendable-ports) incorrect UTF-8 decoding
diff --git a/configure.ac b/configure.ac
index d0a2dc79b..c46586e9b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -418,6 +418,7 @@ AC_SUBST([SCM_I_GSC_HAVE_STRUCT_DIRENT64])
 #   sys/sendfile.h - non-POSIX, found in glibc
 #
 AC_CHECK_HEADERS([complex.h fenv.h io.h memory.h process.h \
+linux/fs.h \
 sys/dir.h sys/ioctl.h sys/select.h \
 sys/time.h sys/timeb.h sys/times.h sys/stdtypes.h sys/types.h \
 sys/utime.h unistd.h utime.h pwd.h grp.h sys/utsname.h \
diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
index fec42d061..d26808d91 100644
--- a/doc/ref/posix.texi
+++ b/doc/ref/posix.texi
@@ -896,10 +896,17 @@ of @code{delete-file}.  Why doesn't POSIX have a @code{rmdirat} function
 for this instead?  No idea!
 @end deffn

-@deffn {Scheme Procedure} copy-file oldfile newfile
+@deffn {Scheme Procedure} copy-file @var{oldfile} @var{newfile} @
+       [#:copy-on-write='auto]
 @deffnx {C Function} scm_copy_file (oldfile, newfile)
 Copy the file specified by @var{oldfile} to @var{newfile}.
 The return value is unspecified.
+
+@code{#:copy-on-write} keyword argument determines whether copy-on-write
+copy should be attempted and the behavior in case of failure.  Possible
+values are @code{'always} (attempt the copy-on-write, return error if it
+fails), @code{'auto} (attempt the copy-on-write, fallback to regular
+copy if it fails) and @code{'never} (perform the regular copy).
 @end deffn

 @deffn {Scheme Procedure} sendfile out in count [offset]
diff --git a/libguile/filesys.c b/libguile/filesys.c
index 1f0bba556..5be42b825 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -67,6 +67,11 @@
 # include <sys/sendfile.h>
 #endif

+#if defined(HAVE_SYS_IOCTL_H) && defined(HAVE_LINUX_FS_H)
+# include <linux/fs.h>
+# include <sys/ioctl.h>
+#endif
+
 #include "async.h"
 #include "boolean.h"
 #include "dynwind.h"
@@ -75,6 +80,7 @@
 #include "fports.h"
 #include "gsubr.h"
 #include "iselect.h"
+#include "keywords.h"
 #include "list.h"
 #include "load.h"	/* for scm_i_mirror_backslashes */
 #include "modules.h"
@@ -1255,20 +1261,49 @@ SCM_DEFINE (scm_readlink, "readlink", 1, 0, 0,
 }
 #undef FUNC_NAME

-SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
-            (SCM oldfile, SCM newfile),
+static int
+clone_file (int oldfd, int newfd)
+{
+#ifdef FICLONE
+  return ioctl (newfd, FICLONE, oldfd);
+#else
+  (void)oldfd;
+  (void)newfd;
+  errno = EOPNOTSUPP;
+  return -1;
+#endif
+}
+
+SCM_KEYWORD (k_copy_on_write, "copy-on-write");
+SCM_SYMBOL (sym_always, "always");
+SCM_SYMBOL (sym_auto, "auto");
+SCM_SYMBOL (sym_never, "never");
+
+SCM_DEFINE (scm_copy_file2, "copy-file", 2, 0, 1,
+            (SCM oldfile, SCM newfile, SCM rest),
 	    "Copy the file specified by @var{oldfile} to @var{newfile}.\n"
-	    "The return value is unspecified.")
-#define FUNC_NAME s_scm_copy_file
+	    "The return value is unspecified.\n"
+            "\n"
+            "@code{#:copy-on-write} keyword argument determines whether "
+            "copy-on-write copy should be attempted and the "
+            "behavior in case of failure.  Possible values are "
+            "@code{'always} (attempt the copy-on-write, return error if "
+            "it fails), @code{'auto} (attempt the copy-on-write, "
+            "fallback to regular copy if it fails) and @code{'never} "
+            "(perform the regular copy)."
+            )
+#define FUNC_NAME s_scm_copy_file2
 {
   char *c_oldfile, *c_newfile;
   int oldfd, newfd;
   int n, rv;
+  SCM cow = sym_auto;
+  int clone_res;
   char buf[BUFSIZ];
   struct stat_or_stat64 oldstat;

   scm_dynwind_begin (0);
-
+
   c_oldfile = scm_to_locale_string (oldfile);
   scm_dynwind_free (c_oldfile);
   c_newfile = scm_to_locale_string (newfile);
@@ -1292,13 +1327,30 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
       SCM_SYSERROR;
     }

-  while ((n = read (oldfd, buf, sizeof buf)) > 0)
-    if (write (newfd, buf, n) != n)
-      {
-	close (oldfd);
-	close (newfd);
-	SCM_SYSERROR;
-      }
+  scm_c_bind_keyword_arguments ("copy-file", rest, 0,
+                                k_copy_on_write, &cow,
+                                SCM_UNDEFINED);
+
+  if (scm_is_eq (cow, sym_always) || scm_is_eq (cow, sym_auto))
+    clone_res = clone_file(oldfd, newfd);
+  else if (scm_is_eq (cow, sym_never))
+    clone_res = -1;
+  else
+    scm_misc_error ("copy-file",
+                    "invalid value for #:copy-on-write: ~S",
+                    scm_list_1 (cow));
+
+  if (scm_is_eq (cow, sym_always) && clone_res)
+    scm_syserror ("copy-file: copy-on-write failed");
+
+  if (clone_res)
+    while ((n = read (oldfd, buf, sizeof buf)) > 0)
+      if (write (newfd, buf, n) != n)
+        {
+          close (oldfd);
+          close (newfd);
+          SCM_SYSERROR;
+        }
   close (oldfd);
   if (close (newfd) == -1)
     SCM_SYSERROR;
@@ -1308,6 +1360,12 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
 }
 #undef FUNC_NAME

+SCM
+scm_copy_file (SCM oldfile, SCM newfile)
+{
+  return scm_copy_file2 (oldfile, newfile, SCM_UNSPECIFIED);
+}
+
 SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
 	    (SCM out, SCM in, SCM count, SCM offset),
 	    "Send @var{count} bytes from @var{in} to @var{out}, both of which "
diff --git a/libguile/filesys.h b/libguile/filesys.h
index 1ce50d30e..8e849fe7a 100644
--- a/libguile/filesys.h
+++ b/libguile/filesys.h
@@ -74,6 +74,7 @@ SCM_API SCM scm_symlinkat (SCM dir, SCM oldpath, SCM newpath);
 SCM_API SCM scm_readlink (SCM path);
 SCM_API SCM scm_lstat (SCM str);
 SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile);
+SCM_INTERNAL SCM scm_copy_file2 (SCM oldfile, SCM newfile, SCM rest);
 SCM_API SCM scm_mkstemp (SCM tmpl);
 SCM_API SCM scm_mkdtemp (SCM tmpl);
 SCM_API SCM scm_dirname (SCM filename);
--
2.41.0
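
The three #:copy-on-write values and the two error cases described in
the commit message can be summarised as a small decision function.  The
sketch below is an illustration only, not part of the patch: cow_action
and decide are hypothetical names, modes are passed as C strings rather
than Scheme symbols, and clone_rv stands for the result a clone attempt
would return (0 on success, non-zero on failure).

```c
#include <string.h>

typedef enum
{
  ACT_DONE,                 /* clone succeeded, nothing more to do */
  ACT_REGULAR_COPY,         /* perform the read/write copy */
  ACT_ERR_CLONE_FAILED,     /* "always" was requested but cloning failed */
  ACT_ERR_INVALID_MODE      /* mode is not always, auto or never */
} cow_action;

/* Map a mode and the outcome of a clone attempt to the next step.
   For "never" no clone is attempted, so clone_rv is ignored.  */
cow_action
decide (const char *mode, int clone_rv)
{
  if (strcmp (mode, "never") == 0)
    return ACT_REGULAR_COPY;
  if (strcmp (mode, "always") == 0)
    return clone_rv == 0 ? ACT_DONE : ACT_ERR_CLONE_FAILED;
  if (strcmp (mode, "auto") == 0)
    return clone_rv == 0 ? ACT_DONE : ACT_REGULAR_COPY;
  return ACT_ERR_INVALID_MODE;
}
```

With this mapping, decide ("always", -1) corresponds to the "copy-on-write
failed" error shown above, and decide ("nevr", 0) to the "invalid value
for #:copy-on-write" error.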




Information forwarded to bug-guile <at> gnu.org:
bug#68504; Package guile. (Wed, 24 Jan 2024 19:17:01 GMT)

Message #14 received at 68504 <at> debbugs.gnu.org:

From: Tomas Volf <~@wolfsden.cz>
To: 68504 <at> debbugs.gnu.org
Cc: Tomas Volf <~@wolfsden.cz>
Subject: [PATCH v3] Add copy-on-write support to scm_copy_file.
Date: Wed, 24 Jan 2024 20:14:32 +0100
On modern file-systems (BTRFS, ZFS) it is possible to copy a file using
copy-on-write method.  For large files it has the advantage of being
much faster and saving disk space (since identical extents are not
duplicated).  This feature is stable and for example coreutils' `cp'
does use it automatically (see --reflink).

This commit adds support for this feature into our copy-file procedure.
Same as `cp', it defaults to 'auto, meaning the copy-on-write is
attempted, and in case of failure the regular copy is performed.

No tests are provided, because the behavior depends on the system,
underlying file-system and its configuration.  That makes it challenging
to write a test for it.  Manual testing was performed instead:

    $ btrfs filesystem du /tmp/cow*
         Total   Exclusive  Set shared  Filename
      36.00KiB    36.00KiB       0.00B  /tmp/cow

    $ cat cow-test.scm
    (copy-file "/tmp/cow" "/tmp/cow-unspecified")
    (copy-file "/tmp/cow" "/tmp/cow-always" #:copy-on-write 'always)
    (copy-file "/tmp/cow" "/tmp/cow-auto" #:copy-on-write 'auto)
    (copy-file "/tmp/cow" "/tmp/cow-never" #:copy-on-write 'never)
    (copy-file "/tmp/cow" "/dev/shm/cow-unspecified")
    (copy-file "/tmp/cow" "/dev/shm/cow-auto" #:copy-on-write 'auto)
    (copy-file "/tmp/cow" "/dev/shm/cow-never" #:copy-on-write 'never)
    $ ./meta/guile -s cow-test.scm

    $ btrfs filesystem du /tmp/cow*
         Total   Exclusive  Set shared  Filename
      36.00KiB       0.00B    36.00KiB  /tmp/cow
      36.00KiB       0.00B    36.00KiB  /tmp/cow-always
      36.00KiB       0.00B    36.00KiB  /tmp/cow-auto
      36.00KiB    36.00KiB       0.00B  /tmp/cow-never
      36.00KiB       0.00B    36.00KiB  /tmp/cow-unspecified

    $ sha1sum /tmp/cow* /dev/shm/cow*
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-always
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-auto
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-never
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-unspecified
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-auto
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-never
    4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-unspecified

This commit also adds two new failure modes for (copy-file).

Failure to copy-on-write when 'always was passed in:

    scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'always)
    ice-9/boot-9.scm:1676:22: In procedure raise-exception:
    In procedure copy-file: copy-on-write failed: Invalid cross-device link

Passing in invalid value for the #:copy-on-write keyword argument:

    scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'nevr)
    ice-9/boot-9.scm:1676:22: In procedure raise-exception:
    In procedure copy-file: invalid value for #:copy-on-write: nevr

* NEWS: Add note for copy-file supporting copy-on-write.
* configure.ac: Check for linux/fs.h.
* doc/ref/posix.texi (File System)[copy-file]: Document the new
signature.
* libguile/filesys.c (clone_file): New function cloning a file using
FICLONE, if supported.
(k_copy_on_write): New keyword.
(sym_always, sym_auto, sym_never): New symbols.
(scm_copy_file2): Renamed from scm_copy_file.  New #:copy-on-write
keyword argument.  Attempt copy-on-write copy by default.
(scm_copy_file): Call scm_copy_file2.
* libguile/filesys.h: Add scm_copy_file2 as SCM_INTERNAL.
---
v2: Introduce scm_copy_file2 in order to preserve backwards compatibility.

v3: Remove mention of scm_copy_file from the commit message.

 NEWS               |  9 +++++
 configure.ac       |  1 +
 doc/ref/posix.texi |  9 ++++-
 libguile/filesys.c | 82 +++++++++++++++++++++++++++++++++++++++-------
 libguile/filesys.h |  1 +
 5 files changed, 89 insertions(+), 13 deletions(-)

diff --git a/NEWS b/NEWS
index b319404d7..9147098c9 100644
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,15 @@ definitely unused---this is notably the case for modules that are only
 used at macro-expansion time, such as (srfi srfi-26).  In those cases,
 the compiler reports it as "possibly unused".

+** copy-file now supports copy-on-write
+
+The copy-file procedure now takes an additional keyword argument,
+#:copy-on-write, specifying whether copy-on-write should be done, if the
+underlying file-system supports it.  Possible values are 'always, 'auto
+and 'never, with 'auto being the default.
+
+This speeds up copying large files a lot while saving the disk space.
+
 * Bug fixes

 ** (ice-9 suspendable-ports) incorrect UTF-8 decoding
diff --git a/configure.ac b/configure.ac
index d0a2dc79b..c46586e9b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -418,6 +418,7 @@ AC_SUBST([SCM_I_GSC_HAVE_STRUCT_DIRENT64])
 #   sys/sendfile.h - non-POSIX, found in glibc
 #
 AC_CHECK_HEADERS([complex.h fenv.h io.h memory.h process.h \
+linux/fs.h \
 sys/dir.h sys/ioctl.h sys/select.h \
 sys/time.h sys/timeb.h sys/times.h sys/stdtypes.h sys/types.h \
 sys/utime.h unistd.h utime.h pwd.h grp.h sys/utsname.h \
diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
index fec42d061..d26808d91 100644
--- a/doc/ref/posix.texi
+++ b/doc/ref/posix.texi
@@ -896,10 +896,17 @@ of @code{delete-file}.  Why doesn't POSIX have a @code{rmdirat} function
 for this instead?  No idea!
 @end deffn

-@deffn {Scheme Procedure} copy-file oldfile newfile
+@deffn {Scheme Procedure} copy-file @var{oldfile} @var{newfile} @
+       [#:copy-on-write='auto]
 @deffnx {C Function} scm_copy_file (oldfile, newfile)
 Copy the file specified by @var{oldfile} to @var{newfile}.
 The return value is unspecified.
+
+@code{#:copy-on-write} keyword argument determines whether copy-on-write
+copy should be attempted and the behavior in case of failure.  Possible
+values are @code{'always} (attempt the copy-on-write, return error if it
+fails), @code{'auto} (attempt the copy-on-write, fallback to regular
+copy if it fails) and @code{'never} (perform the regular copy).
 @end deffn

 @deffn {Scheme Procedure} sendfile out in count [offset]
diff --git a/libguile/filesys.c b/libguile/filesys.c
index 1f0bba556..5be42b825 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -67,6 +67,11 @@
 # include <sys/sendfile.h>
 #endif

+#if defined(HAVE_SYS_IOCTL_H) && defined(HAVE_LINUX_FS_H)
+# include <linux/fs.h>
+# include <sys/ioctl.h>
+#endif
+
 #include "async.h"
 #include "boolean.h"
 #include "dynwind.h"
@@ -75,6 +80,7 @@
 #include "fports.h"
 #include "gsubr.h"
 #include "iselect.h"
+#include "keywords.h"
 #include "list.h"
 #include "load.h"	/* for scm_i_mirror_backslashes */
 #include "modules.h"
@@ -1255,20 +1261,49 @@ SCM_DEFINE (scm_readlink, "readlink", 1, 0, 0,
 }
 #undef FUNC_NAME

-SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
-            (SCM oldfile, SCM newfile),
+static int
+clone_file (int oldfd, int newfd)
+{
+#ifdef FICLONE
+  return ioctl (newfd, FICLONE, oldfd);
+#else
+  (void)oldfd;
+  (void)newfd;
+  errno = EOPNOTSUPP;
+  return -1;
+#endif
+}
+
+SCM_KEYWORD (k_copy_on_write, "copy-on-write");
+SCM_SYMBOL (sym_always, "always");
+SCM_SYMBOL (sym_auto, "auto");
+SCM_SYMBOL (sym_never, "never");
+
+SCM_DEFINE (scm_copy_file2, "copy-file", 2, 0, 1,
+            (SCM oldfile, SCM newfile, SCM rest),
 	    "Copy the file specified by @var{oldfile} to @var{newfile}.\n"
-	    "The return value is unspecified.")
-#define FUNC_NAME s_scm_copy_file
+	    "The return value is unspecified.\n"
+            "\n"
+            "@code{#:copy-on-write} keyword argument determines whether "
+            "copy-on-write copy should be attempted and the "
+            "behavior in case of failure.  Possible values are "
+            "@code{'always} (attempt the copy-on-write, return error if "
+            "it fails), @code{'auto} (attempt the copy-on-write, "
+            "fallback to regular copy if it fails) and @code{'never} "
+            "(perform the regular copy)."
+            )
+#define FUNC_NAME s_scm_copy_file2
 {
   char *c_oldfile, *c_newfile;
   int oldfd, newfd;
   int n, rv;
+  SCM cow = sym_auto;
+  int clone_res;
   char buf[BUFSIZ];
   struct stat_or_stat64 oldstat;

   scm_dynwind_begin (0);
-
+
   c_oldfile = scm_to_locale_string (oldfile);
   scm_dynwind_free (c_oldfile);
   c_newfile = scm_to_locale_string (newfile);
@@ -1292,13 +1327,30 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
       SCM_SYSERROR;
     }

-  while ((n = read (oldfd, buf, sizeof buf)) > 0)
-    if (write (newfd, buf, n) != n)
-      {
-	close (oldfd);
-	close (newfd);
-	SCM_SYSERROR;
-      }
+  scm_c_bind_keyword_arguments ("copy-file", rest, 0,
+                                k_copy_on_write, &cow,
+                                SCM_UNDEFINED);
+
+  if (scm_is_eq (cow, sym_always) || scm_is_eq (cow, sym_auto))
+    clone_res = clone_file(oldfd, newfd);
+  else if (scm_is_eq (cow, sym_never))
+    clone_res = -1;
+  else
+    scm_misc_error ("copy-file",
+                    "invalid value for #:copy-on-write: ~S",
+                    scm_list_1 (cow));
+
+  if (scm_is_eq (cow, sym_always) && clone_res)
+    scm_syserror ("copy-file: copy-on-write failed");
+
+  if (clone_res)
+    while ((n = read (oldfd, buf, sizeof buf)) > 0)
+      if (write (newfd, buf, n) != n)
+        {
+          close (oldfd);
+          close (newfd);
+          SCM_SYSERROR;
+        }
   close (oldfd);
   if (close (newfd) == -1)
     SCM_SYSERROR;
@@ -1308,6 +1360,12 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
 }
 #undef FUNC_NAME

+SCM
+scm_copy_file (SCM oldfile, SCM newfile)
+{
+  return scm_copy_file2 (oldfile, newfile, SCM_UNSPECIFIED);
+}
+
 SCM_DEFINE (scm_sendfile, "sendfile", 3, 1, 0,
 	    (SCM out, SCM in, SCM count, SCM offset),
 	    "Send @var{count} bytes from @var{in} to @var{out}, both of which "
diff --git a/libguile/filesys.h b/libguile/filesys.h
index 1ce50d30e..8e849fe7a 100644
--- a/libguile/filesys.h
+++ b/libguile/filesys.h
@@ -74,6 +74,7 @@ SCM_API SCM scm_symlinkat (SCM dir, SCM oldpath, SCM newpath);
 SCM_API SCM scm_readlink (SCM path);
 SCM_API SCM scm_lstat (SCM str);
 SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile);
+SCM_INTERNAL SCM scm_copy_file2 (SCM oldfile, SCM newfile, SCM rest);
 SCM_API SCM scm_mkstemp (SCM tmpl);
 SCM_API SCM scm_mkdtemp (SCM tmpl);
 SCM_API SCM scm_dirname (SCM filename);
--
2.41.0
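
The three #:copy-on-write modes documented in the patch can be sketched outside of Guile as a small standalone C routine. This is only an illustration under stated assumptions, not the committed code: `enum cow_mode`, `try_clone`, `copy_bytes` and `copy_with_mode` are hypothetical names, errors are reported through a return value instead of a Scheme exception, and the output file mode 0644 is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#ifdef __linux__
# include <linux/fs.h>          /* defines FICLONE on recent kernels */
#endif

/* Hypothetical stand-in for the 'always, 'auto and 'never symbols.  */
enum cow_mode { COW_ALWAYS, COW_AUTO, COW_NEVER };

/* Ask the file system to reflink OLDFD into NEWFD.  Returns 0 on
   success, -1 with errno set otherwise.  */
static int
try_clone (int oldfd, int newfd)
{
#ifdef FICLONE
  return ioctl (newfd, FICLONE, oldfd);
#else
  (void) oldfd;
  (void) newfd;
  errno = EOPNOTSUPP;
  return -1;
#endif
}

/* Regular copy through a user-space buffer, retrying short writes.  */
static int
copy_bytes (int oldfd, int newfd)
{
  char buf[BUFSIZ];
  ssize_t n;

  while ((n = read (oldfd, buf, sizeof buf)) > 0)
    for (ssize_t done = 0; done < n; )
      {
        ssize_t w = write (newfd, buf + done, (size_t) (n - done));
        if (w < 0)
          return -1;
        done += w;
      }
  return n < 0 ? -1 : 0;
}

/* Copy OLDFILE to NEWFILE according to MODE.  Returns 0 on success,
   -1 with errno set otherwise.  */
int
copy_with_mode (const char *oldfile, const char *newfile, enum cow_mode mode)
{
  int oldfd, newfd, rv = -1, saved;

  assert (oldfile != NULL && newfile != NULL);
  oldfd = open (oldfile, O_RDONLY);
  if (oldfd < 0)
    return -1;
  newfd = open (newfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (newfd < 0)
    {
      saved = errno;
      close (oldfd);
      errno = saved;
      return -1;
    }

  if (mode != COW_NEVER)
    rv = try_clone (oldfd, newfd);      /* 'always and 'auto */
  if (rv != 0 && mode != COW_ALWAYS)
    rv = copy_bytes (oldfd, newfd);     /* 'auto fallback, and 'never */

  saved = errno;
  close (oldfd);
  if (close (newfd) < 0 && rv == 0)
    {
      rv = -1;
      saved = errno;
    }
  errno = saved;
  return rv;
}
```

Calling `copy_with_mode ("/tmp/cow", "/dev/shm/cow", COW_ALWAYS)` would be expected to fail across file systems, much like the 'always example in the commit message, while COW_AUTO falls back to the buffered copy.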




Information forwarded to bug-guile <at> gnu.org:
bug#68504; Package guile. (Wed, 24 Jan 2024 19:21:02 GMT) Full text and rfc822 format available.

Message #17 received at 68504 <at> debbugs.gnu.org (full text, mbox):

From: Tomas Volf <~@wolfsden.cz>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 68504 <at> debbugs.gnu.org
Subject: Re: bug#68504: [PATCH] Add copy-on-write support to scm_copy_file.
Date: Wed, 24 Jan 2024 20:19:51 +0100
On 2024-01-24 11:26:56 +0100, Ludovic Courtès wrote:
>
> The patch looks great (and very useful) to me, modulo one issue:
>
> > -SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile);
> > +SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile, SCM rest);
>
> Since this is a public interface, we cannot change this function’s
> signature during the 3.0 stable series.
>
> Thus, I would suggest keeping the public ‘scm_copy_file’ unchanged and
> internally having a three-argument variant.  The Scheme-level
> ‘copy-file’ would map to that three-argument variant.  (See
> ‘scm_pipe’ and ‘scm_accept’ for examples.)

That is a very good point, which I did not realize at all.  Thanks to the
examples you provided, it was not that hard to do (well, assuming I did it
right).

> Could you send an updated patch?

Done.  However, rereading it afterwards, I noticed that I had overlooked this
occurrence of scm_copy_file in the commit message:

    This commit adds support for this feature into our
    copy-file (scm_copy_file) procedure.  Same as `cp', it defaults to

So I just sent v3 right after v2, sorry for the noise, should have been more
careful.

>
> BTW, copyright assignment to the FSF is now optional but encouraged.
> Please see
> <https://lists.gnu.org/archive/html/guile-devel/2022-10/msg00008.html>.

Since it is optional, I will opt out of the assignment for now, as I do not
like the concept that much.  I will try to find time to form an opinion based
on facts.

Have a nice day,
Tomas

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
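
The compatibility concern discussed in this message (keep the public two-argument function, add an internal variant with the extra argument) is a general C pattern. A minimal sketch follows, assuming hypothetical names `demo_copy_file` and `demo_copy_file2` that only report what they would do; it is not Guile's actual code, which passes a Scheme rest list rather than an enum.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the 'always, 'auto and 'never symbols.  */
enum cow_mode { COW_ALWAYS, COW_AUTO, COW_NEVER };

/* Internal variant: free to grow new parameters, because nothing
   outside the library is linked against it.  For illustration it only
   prints the request and returns the mode it was given.  */
static int
demo_copy_file2 (const char *oldfile, const char *newfile,
                 enum cow_mode mode)
{
  assert (oldfile != NULL && newfile != NULL);
  printf ("copy %s -> %s (mode %d)\n", oldfile, newfile, (int) mode);
  return (int) mode;
}

/* Public entry point: the two-argument signature that existing callers
   were compiled against stays exactly as it was, and simply supplies
   the default for the new parameter.  */
int
demo_copy_file (const char *oldfile, const char *newfile)
{
  return demo_copy_file2 (oldfile, newfile, COW_AUTO);
}
```

Existing callers of the two-argument function keep compiling and linking unchanged, and they pick up the new default behavior without any source change.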

Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Tue, 12 Mar 2024 13:08:01 GMT) Full text and rfc822 format available.

Notification sent to Tomas Volf <~@wolfsden.cz>:
bug acknowledged by developer. (Tue, 12 Mar 2024 13:08:02 GMT) Full text and rfc822 format available.

Message #22 received at 68504-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Tomas Volf <~@wolfsden.cz>
Cc: 68504-done <at> debbugs.gnu.org
Subject: Re: bug#68504: [PATCH v3] Add copy-on-write support to scm_copy_file.
Date: Tue, 12 Mar 2024 14:06:34 +0100
Hi Tomas,

Tomas Volf <~@wolfsden.cz> skribis:

> On modern file-systems (BTRFS, ZFS) it is possible to copy a file using
> copy-on-write method.  For large files it has the advantage of being
> much faster and saving disk space (since identical extents are not
> duplicated).  This feature is stable and for example coreutils' `cp'
> does use it automatically (see --reflink).
>
> This commit adds support for this feature into our copy-file procedure.
> Same as `cp', it defaults to 'auto, meaning the copy-on-write is
> attempted, and in case of failure the regular copy is performed.
>
> No tests are provided, because the behavior depends on the system,
> underlying file-system and its configuration.  That makes it challenging
> to write a test for it.  Manual testing was performed instead:
>
>     $ btrfs filesystem du /tmp/cow*
>          Total   Exclusive  Set shared  Filename
>       36.00KiB    36.00KiB       0.00B  /tmp/cow
>
>     $ cat cow-test.scm
>     (copy-file "/tmp/cow" "/tmp/cow-unspecified")
>     (copy-file "/tmp/cow" "/tmp/cow-always" #:copy-on-write 'always)
>     (copy-file "/tmp/cow" "/tmp/cow-auto" #:copy-on-write 'auto)
>     (copy-file "/tmp/cow" "/tmp/cow-never" #:copy-on-write 'never)
>     (copy-file "/tmp/cow" "/dev/shm/cow-unspecified")
>     (copy-file "/tmp/cow" "/dev/shm/cow-auto" #:copy-on-write 'auto)
>     (copy-file "/tmp/cow" "/dev/shm/cow-never" #:copy-on-write 'never)
>     $ ./meta/guile -s cow-test.scm
>
>     $ btrfs filesystem du /tmp/cow*
>          Total   Exclusive  Set shared  Filename
>       36.00KiB       0.00B    36.00KiB  /tmp/cow
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-always
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-auto
>       36.00KiB    36.00KiB       0.00B  /tmp/cow-never
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-unspecified
>
>     $ sha1sum /tmp/cow* /dev/shm/cow*
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-always
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-auto
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-never
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-unspecified
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-auto
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-never
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-unspecified
>
> This commit also adds two new failure modes for (copy-file).
>
> Failure to copy-on-write when 'always was passed in:
>
>     scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'always)
>     ice-9/boot-9.scm:1676:22: In procedure raise-exception:
>     In procedure copy-file: copy-on-write failed: Invalid cross-device link
>
> Passing in invalid value for the #:copy-on-write keyword argument:
>
>     scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write 'nevr)
>     ice-9/boot-9.scm:1676:22: In procedure raise-exception:
>     In procedure copy-file: invalid value for #:copy-on-write: nevr
>
> * NEWS: Add note for copy-file supporting copy-on-write.
> * configure.ac: Check for linux/fs.h.
> * doc/ref/posix.texi (File System)[copy-file]: Document the new
> signature.
> * libguile/filesys.c (clone_file): New function cloning a file using
> FICLONE, if supported.
> (k_copy_on_write): New keyword.
> (sym_always, sym_auto, sym_never): New symbols.
> (scm_copy_file2): Renamed from scm_copy_file.  New #:copy-on-write
> keyword argument.  Attempt copy-on-write copy by default.
> (scm_copy_file): Call scm_copy_file2.
> * libguile/filesys.h: Add scm_copy_file2 as SCM_INTERNAL.
> ---
> v2: Introduce scm_copy_file2 in order to preserve backwards compatibility.
>
> v3: Remove mention of scm_copy_file from the commit message.

Finally pushed as e1690f3fd251d69b3687ec12c6f4b41034047f0f.  Note that I
added copyright lines for you, let me know if I got it wrong.

As a followup, we should add support for ‘copy_file_range’ when FICLONE
cannot be used; glibc supports it on all platforms but it returns ENOSYS
on GNU/Hurd currently.

WDYT?

Thank you!

Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#68504; Package guile. (Tue, 12 Mar 2024 23:21:01 GMT) Full text and rfc822 format available.

Message #25 received at 68504 <at> debbugs.gnu.org (full text, mbox):

From: Tomas Volf <~@wolfsden.cz>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 68504 <at> debbugs.gnu.org
Subject: Re: bug#68504: [PATCH v3] Add copy-on-write support to scm_copy_file.
Date: Wed, 13 Mar 2024 00:19:24 +0100
On 2024-03-12 14:06:34 +0100, Ludovic Courtès wrote:
>
> Finally pushed as e1690f3fd251d69b3687ec12c6f4b41034047f0f.  Note that I
> added copyright lines for you, let me know if I got it wrong.

Thank you for merging it, and thanks for the copyright, looks correct :)

> As a followup, we should add support for ‘copy_file_range’ when FICLONE
> cannot be used; glibc supports it on all platforms but it returns ENOSYS
> on GNU/Hurd currently.
>
> WDYT?

Sure, I am willing to do my part.  I managed to find this blog post[0], so after
some minor troubles I did manage to get a VM with GNU/Hurd running.  Next I will
read up on copy_file_range and try to put together a patch.

Just to make sure, your idea here is exactly what?  Always try to use
copy_file_range before the regular copy?  So the flow would be

For 'always case:

    CoW  ---fail-->  FAIL

For 'auto case:

    CoW  ---fail-->  copy_file_range  ---fail-->  current copy  ---fail-->  FAIL

For 'never case:

    copy_file_range  ---fail-->  current copy  ---fail-->  FAIL

Is that an accurate summary?  Or did you mean only as a fallback for the CoW, so
only for 'auto, but not for the 'never?

Tomas

0: https://guix.gnu.org/en/blog/2020/a-hello-world-virtual-machine-running-the-hurd/

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
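
The three flows proposed in this message can be written down as a small decision table. The sketch below is an assumption-laden illustration, not a patch: `plan_steps`, `run_plan`, the `copy_step` enum and the two stub steps are hypothetical names, and the real strategies (FICLONE, copy_file_range, the read/write loop) are replaced by pluggable function pointers so that only the ordering logic is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mode symbols and the copy strategies.  */
enum cow_mode { COW_ALWAYS, COW_AUTO, COW_NEVER };
enum copy_step { STEP_CLONE, STEP_COPY_FILE_RANGE, STEP_READ_WRITE,
                 STEP_COUNT };

/* A strategy returns 0 on success and -1 on failure.  */
typedef int (*copy_fn) (int oldfd, int newfd);

/* Fill STEPS with the strategies to try, in order, and return how many
   there are: 'always tries only the clone, 'auto tries all three,
   'never skips the clone.  */
size_t
plan_steps (enum cow_mode mode, enum copy_step steps[STEP_COUNT])
{
  size_t n = 0;

  if (mode != COW_NEVER)
    steps[n++] = STEP_CLONE;
  if (mode != COW_ALWAYS)
    {
      steps[n++] = STEP_COPY_FILE_RANGE;
      steps[n++] = STEP_READ_WRITE;
    }
  return n;
}

/* Try each planned strategy in turn.  Return 0 as soon as one succeeds,
   -1 if all of them fail.  IMPL is indexed by enum copy_step.  A real
   implementation would also have to cope with a step that fails after
   writing part of the data.  */
int
run_plan (enum cow_mode mode, const copy_fn impl[STEP_COUNT],
          int oldfd, int newfd)
{
  enum copy_step steps[STEP_COUNT];
  size_t n;

  assert (impl != NULL);
  n = plan_steps (mode, steps);
  for (size_t i = 0; i < n; i++)
    if (impl[steps[i]] (oldfd, newfd) == 0)
      return 0;
  return -1;
}

/* Demonstration stubs standing in for real strategies.  */
int
step_fails (int oldfd, int newfd)
{
  (void) oldfd;
  (void) newfd;
  return -1;
}

int
step_succeeds (int oldfd, int newfd)
{
  (void) oldfd;
  (void) newfd;
  return 0;
}
```

With an `impl` table in which only the read/write step succeeds, `run_plan` fails for COW_ALWAYS and succeeds for COW_AUTO and COW_NEVER, matching the diagrams above.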

Information forwarded to bug-guile <at> gnu.org:
bug#68504; Package guile. (Thu, 21 Mar 2024 14:48:02 GMT) Full text and rfc822 format available.

Message #28 received at 68504 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Tomas Volf <~@wolfsden.cz>
Cc: 68504 <at> debbugs.gnu.org
Subject: Re: bug#68504: [PATCH v3] Add copy-on-write support to scm_copy_file.
Date: Thu, 21 Mar 2024 15:23:33 +0100
Hi,

Tomas Volf <~@wolfsden.cz> skribis:

> Sure, I am willing to do my part.  I managed to find this blog post[0], so after
> some minor troubles I did manage to get a VM with GNU/Hurd running.  Next I will
> read up on copy_file_range and try to put together a patch.

It’s really just (service hurd-vm-service-type) on Guix System:

  https://guix.gnu.org/manual/devel/en/html_node/Virtualization-Services.html#The-Hurd-in-a-Virtual-Machine

> Just to make sure, your idea here is exactly what?  Always try to use
> copy_file_range before the regular copy?  So the flow would be
>
> For 'always case:
>
>     CoW  ---fail-->  FAIL
>
> For 'auto case:
>
>     CoW  ---fail-->  copy_file_range  ---fail-->  current copy  ---fail-->  FAIL
>
> For 'never case:
>
>     copy_file_range  ---fail-->  current copy  ---fail-->  FAIL
>
> Is that an accurate summary?

Yes, that’s exactly what I had in mind.

Actually it might be better to use sendfile(2), which is slightly less
generic but otherwise equivalent AFAICS, and which happens to have a
Hurd implementation in glibc.

Thanks for your help!

Ludo’.
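
For reference, an in-kernel copy with sendfile(2) as suggested above might look roughly like the following. This is a sketch assuming glibc's <sys/sendfile.h> and, on Linux, a kernel that accepts a regular file as the output descriptor (2.6.33 or later); `sendfile_all` and `copy_via_sendfile` are hypothetical names, and a caller would still need the read/write loop as a last resort when this returns -1.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Move everything from OLDFD's current offset to NEWFD inside the
   kernel.  Passing NULL as the offset makes sendfile advance OLDFD's
   own file offset, so a fallback loop could resume where this stopped.
   Returns 0 on success, -1 with errno set otherwise.  */
static int
sendfile_all (int oldfd, int newfd)
{
  for (;;)
    {
      ssize_t n = sendfile (newfd, oldfd, NULL, (size_t) 1 << 30);
      if (n == 0)
        return 0;               /* end of input reached */
      if (n < 0 && errno != EINTR)
        return -1;              /* e.g. EINVAL or ENOSYS */
    }
}

/* Copy OLDFILE to NEWFILE using only sendfile.  Returns 0 on success,
   -1 with errno set otherwise.  */
int
copy_via_sendfile (const char *oldfile, const char *newfile)
{
  int oldfd, newfd, rv, saved;

  assert (oldfile != NULL && newfile != NULL);
  oldfd = open (oldfile, O_RDONLY);
  if (oldfd < 0)
    return -1;
  newfd = open (newfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (newfd < 0)
    {
      saved = errno;
      close (oldfd);
      errno = saved;
      return -1;
    }

  rv = sendfile_all (oldfd, newfd);

  saved = errno;
  close (oldfd);
  if (close (newfd) < 0 && rv == 0)
    {
      rv = -1;
      saved = errno;
    }
  errno = saved;
  return rv;
}
```

Unlike FICLONE, this still duplicates the data on disk; it only avoids bouncing each block through a user-space buffer.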




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 19 Apr 2024 11:24:08 GMT) Full text and rfc822 format available.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.