GNU bug report logs - #55724
cp --reflink=always failing when --reflink=auto reflinks successfully on OpenZFS

Package: coreutils

Reported by: Rich <rincebrain <at> gmail.com>

Date: Mon, 30 May 2022 10:58:02 UTC

Severity: normal

To reply to this bug, email your comments to 55724 AT debbugs.gnu.org.

Report forwarded to bug-coreutils <at> gnu.org:
bug#55724; Package coreutils. (Mon, 30 May 2022 10:58:02 GMT)

Acknowledgement sent to Rich <rincebrain <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 30 May 2022 10:58:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Rich <rincebrain <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: cp --reflink=always failing when --reflink=auto reflinks successfully
 on OpenZFS
Date: Mon, 30 May 2022 05:13:05 -0400
Hi!

So, OpenZFS is adding reflink support Soon(tm), including across
filesystems on a pool, which is nice.

Unfortunately, Linux's VFS returns EXDEV when you try FICLONE or
FICLONERANGE (but not copy_file_range) across filesystems, before the
filesystem-specific code is ever consulted, so currently the following
strange behavior occurs:

On coreutils 8.30 or 8.32, cp --reflink=always across filesystems will fail
with EXDEV and --reflink=auto will not reflink (because they're not trying
copy_file_range as a fallback).
On coreutils git, as of b3331d59e, cp --reflink=always across filesystems
will fail with EXDEV without ever getting out of Linux's VFS code, cp
--reflink=auto will reflink silently (since it falls back to
copy_file_range after getting EXDEV), cp --reflink=never will not reflink.

(On the same filesystem, in all of the above versions, cp --reflink=always
and =auto do the same thing and reflink correctly.)
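
For illustration only, the probe behind this behavior can be sketched in
Python. The helper name try_ficlone is hypothetical and is not part of
coreutils or OpenZFS. The constant is the value of FICLONE from <linux/fs.h>
on x86 and most other architectures, written out because Python's fcntl
module does not export it; a few architectures encode ioctl numbers
differently, so treat the number as an assumption.

```python
import errno
import fcntl
import os

# Assumed value of FICLONE, i.e. _IOW(0x94, 9, int), on x86 and most other
# Linux architectures. Python's fcntl module does not provide this constant.
FICLONE = 0x40049409


def try_ficlone(src_path, dst_path, ioctl=fcntl.ioctl):
    """Attempt a whole-file clone of src_path into dst_path.

    Returns None on success, or the errno reported by the ioctl on failure.
    The ioctl parameter exists only so the call can be replaced in tests.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # The ioctl is issued on the destination fd; the source fd is
            # passed as the argument.
            ioctl(dst.fileno(), FICLONE, src.fileno())
        except OSError as exc:
            return exc.errno
    return None
```

On a kernel that rejects the request in the VFS, a source and destination on
different filesystems would be expected to yield errno.EXDEV here, matching
the report above.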

I'm not sure what the "correct" behavior here should be, but at least =auto
working and =always failing seems like a surprising and incorrect outcome
to me, though it's not readily obvious to me how the code "should" flow
instead to avoid that - and since the failure cases happen before calling
into OpenZFS, I don't see any way it could be handled better there.

Happy to point people at the WIP code being used to demonstrate this if it
would be helpful, but this seems OpenZFS-specific only in that no other
filesystem both has this functionality and would hit this case (because,
IIUC, btrfs avoids clone_file failing with EXDEV by pretending they're not
distinct filesystems, and there aren't many other filesystems where
reflinking across filesystems would make sense).

Thanks for any insights,
- Rich

Message #8 received at 55724 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Rich <rincebrain <at> gmail.com>, 55724 <at> debbugs.gnu.org
Subject: Re: bug#55724: cp --reflink=always failing when --reflink=auto
 reflinks successfully on OpenZFS
Date: Mon, 30 May 2022 16:04:07 +0100
Thanks for the clear info.
Yes, this is an awkward one, and I'm not sure cp can do anything about it.
`cp --reflink=always` => ensure we can reflink, or otherwise fail.
Really the kernel has to behave appropriately there
and not do the blanket assumption with EXDEV.
cp can't determine from copy_file_range() whether a reflink
was performed or not, so it wouldn't be appropriate to use with --reflink=always.
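
The distinction can be modelled roughly as follows. This is a simplified
sketch of the behavior discussed in this thread, not cp's actual code;
clone, copy_range and plain_copy are hypothetical callables standing in for
FICLONE, copy_file_range() and a read/write loop, each raising OSError on
failure.

```python
import errno


def copy_with_reflink_mode(mode, clone, copy_range, plain_copy):
    """Return the name of the strategy that ended up doing the copy.

    mode is "always", "auto" or "never". The three callables are
    hypothetical stand-ins and take no arguments.
    """
    if mode == "never":
        # Never share blocks: skip both clone and copy_file_range.
        plain_copy()
        return "plain"
    try:
        clone()
        return "clone"
    except OSError:
        if mode == "always":
            # A clone was required, so report the failure (e.g. EXDEV)
            # instead of falling back to something that may not share blocks.
            raise
    try:
        copy_range()
        # The caller cannot tell whether blocks were actually shared.
        return "copy_file_range"
    except OSError:
        plain_copy()
        return "plain"
```

With a clone callable that fails with EXDEV, "auto" returns
"copy_file_range" while "always" re-raises the error, which is the asymmetry
reported in this bug.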

cheers,
Pádraig

Message #11 received at 55724 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 55724 <at> debbugs.gnu.org, Rich <rincebrain <at> gmail.com>
Subject: Re: bug#55724: cp --reflink=always failing when --reflink=auto
 reflinks successfully on OpenZFS
Date: Mon, 30 May 2022 16:31:40 -0700
On 5/30/22 08:04, Pádraig Brady wrote:
> Really the kernel has to behave appropriately there
> and not do the blanket assumption with EXDEV.

I agree. VFS should be willing to try a cross-filesystem FICLONE. Not 
only does copy_file_range not guarantee cloning; it is less efficient 
even when it does clone, due to the need to find the holes in the source 
file.
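
To illustrate the hole-finding step mentioned here, a rough sketch follows,
assuming Linux and a filesystem that supports SEEK_DATA/SEEK_HOLE. The
function name data_extents is hypothetical; this is not how cp is
implemented, only the general shape of such a scan.

```python
import errno
import os


def data_extents(fd):
    """Yield (offset, length) for each data region of the open file fd.

    Note that this moves the file offset of fd as a side effect.
    """
    end = os.fstat(fd).st_size
    pos = 0
    while pos < end:
        try:
            start = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                # No more data: the rest of the file is a hole.
                return
            raise
        # There is an implicit hole at end of file, so this always succeeds
        # once a data region has been found.
        stop = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, stop - start
        pos = stop
```

A caller using copy_file_range on a sparse file would issue one call per
extent returned by such a scan, whereas a whole-file FICLONE needs no scan
at all.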

Message #14 received at 55724 <at> debbugs.gnu.org:

From: Rich <rincebrain <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 55724 <at> debbugs.gnu.org, Pádraig Brady <P <at> draigbrady.com>
Subject: Re: bug#55724: cp --reflink=always failing when --reflink=auto
 reflinks successfully on OpenZFS
Date: Tue, 31 May 2022 04:10:01 -0400
On Mon, May 30, 2022 at 7:31 PM Paul Eggert <eggert <at> cs.ucla.edu> wrote:

> On 5/30/22 08:04, Pádraig Brady wrote:
> > Really the kernel has to behave appropriately there
> > and not do the blanket assumption with EXDEV.
>
> I agree. VFS should be willing to try a cross-filesystem FICLONE. Not
> only does copy_file_range not guarantee cloning; it is less efficient
> even when it does clone, due to the need to find the holes in the source
> file.
>

I would also agree, it would be nice if those restrictions were removed.

However, it has historically been the experience of both developers and
users of OpenZFS that mentioning using it, finding bugs in other code
because of it, or wanting things for it to LKML or any adjacent mailing
list almost always results in "we don't care about out of tree, non-GPL
code, tough shit" (which is why, for example, OpenZFS now implements its
own FPU save/restore dance on x86).

So, unfortunately, I suspect this situation will continue, and we will just
document the behavior and explain that neither coreutils nor OpenZFS can do
anything about it without violating correctness in other ways.

Thanks for your thoughts,
- Rich

This bug report was last modified 2 years and 177 days ago.