GNU bug report logs - #60455
Missing fallback if copy_file_range returns ENOENT?

Package: coreutils

Reported by: Sam James <sam <at> gentoo.org>

Date: Sat, 31 Dec 2022 17:02:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 60455 in the body.
You can then email your comments to 60455 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#60455; Package coreutils. (Sat, 31 Dec 2022 17:02:02 GMT)

Acknowledgement sent to Sam James <sam <at> gentoo.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 31 Dec 2022 17:02:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Sam James <sam <at> gentoo.org>
To: bug-coreutils <at> gnu.org
Subject: Missing fallback if copy_file_range returns ENOENT?
Date: Sat, 31 Dec 2022 17:00:48 +0000
Hi folks,

Originally reported in Gentoo at https://bugs.gentoo.org/885793.

Frank Limpert reported that when copying large files across CIFS shares,
cp may abort because copy_file_range returns ENOENT sometimes.

This sounds like a suspicious kernel bug if CIFS interactions are sometimes
spuriously giving ENOENT, but I'm wondering if coreutils needs to do
anything to handle this as well.

strace output from his cp invocation: https://bugs.gentoo.org/attachment.cgi?id=842497

Best,
sam

Information forwarded to bug-coreutils <at> gnu.org:
bug#60455; Package coreutils. (Sat, 31 Dec 2022 18:52:01 GMT)

Message #8 received at 60455 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Sam James <sam <at> gentoo.org>, 60455 <at> debbugs.gnu.org
Subject: Re: bug#60455: Missing fallback if copy_file_range returns ENOENT?
Date: Sat, 31 Dec 2022 18:51:12 +0000
On 31/12/2022 17:00, Sam James wrote:
> Hi folks,
> 
> Originally reported in Gentoo at https://bugs.gentoo.org/885793.
> 
> Frank Limpert reported that when copying large files across CIFS shares,
> cp may abort because copy_file_range returns ENOENT sometimes.
> 
> This sounds like a suspicious kernel bug if CIFS interactions are sometimes
> spuriously giving ENOENT, but I'm wondering if coreutils needs to do
> anything to handle this as well.
> 
> strace output from his cp invocation: https://bugs.gentoo.org/attachment.cgi?id=842497

We may be able to fall back, but that depends on whether this errno
can be returned after a partial copy.
If data was partially copied, there is not much we can do.
Now ENOENT is not a documented errno for copy_file_range(),
so I'm not sure what we should do with it.
I didn't see in the bug above whether any data was copied.
Could we get more info about that?
Searching for "cifs STATUS_OBJECT_NAME_NOT_FOUND" indicates we
might be able to retry in this case:
https://lists.samba.org/archive/samba/2017-September/211000.html
I guess we could be defensive and also fstat(dest_fd),
falling back to a standard copy if no data has been transferred yet.
However, note that the above URL suggests this error may
not be specific to copy_file_range() and may just be an intermittent CIFS issue.
I.e., copy_file_range() is just a red herring here,
and this just needs fixing in the kernel or server-side setup.

cheers,
Pádraig
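
The defensive check suggested in the message above can be sketched roughly as follows. This is only an illustration in Python, not the coreutils patch (which is C and is not reproduced in this log). The names `copy_with_fallback`, `range_copy`, and `chunk` are hypothetical, and the sketch handles only ENOENT; any other error is re-raised.

```python
import errno
import os


def copy_with_fallback(src_fd, dst_fd, range_copy=None, chunk=1 << 20):
    """Copy src_fd to dst_fd and return the number of bytes copied.

    Tries range_copy (os.copy_file_range by default, where the platform
    provides it). If that fails with ENOENT before any data has reached
    the destination, falls back to a plain read/write loop.
    """
    if range_copy is None:
        range_copy = getattr(os, "copy_file_range", None)

    total = 0
    while range_copy is not None:
        try:
            n = range_copy(src_fd, dst_fd, chunk)
        except OSError as e:
            # Assumption: falling back is only safe if nothing was copied,
            # so check both our own counter and the destination size.
            nothing_copied = total == 0 and os.fstat(dst_fd).st_size == 0
            if e.errno == errno.ENOENT and nothing_copied:
                break  # switch to the standard copy below
            raise
        if n == 0:
            return total  # end of file reached on the fast path
        total += n

    # Standard copy: read a chunk, then write it out completely.
    while True:
        buf = os.read(src_fd, chunk)
        if not buf:
            return total
        off = 0
        while off < len(buf):
            off += os.write(dst_fd, buf[off:])
        total += len(buf)
```

Passing `range_copy` as a parameter is purely a convenience for exercising the ENOENT path without a CIFS mount; an ENOENT after a partial copy is deliberately left as a hard error, matching the "not much we can do" case above.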





Information forwarded to bug-coreutils <at> gnu.org:
bug#60455; Package coreutils. (Sat, 07 Jan 2023 07:35:02 GMT)

Message #11 received at 60455 <at> debbugs.gnu.org:

From: Sam James <sam <at> gentoo.org>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 60455 <at> debbugs.gnu.org
Subject: Re: bug#60455: Missing fallback if copy_file_range returns ENOENT?
Date: Sat, 7 Jan 2023 07:34:06 +0000

> On 31 Dec 2022, at 18:51, Pádraig Brady <P <at> draigBrady.com> wrote:
> 
> We may be able to fall back, but that depends on whether this errno
> can be returned after a partial copy.
> If data was partially copied, there is not much we can do.
> Now ENOENT is not a documented errno for copy_file_range(),
> so I'm not sure what we should do with it.
> I didn't see in the bug above whether any data was copied.
> Could we get more info about that?

Frank got back to me and said an empty file gets created:
```
# cp /mnt/Backup/EAV/data-eav-eav-aktiv-20221207.dump.xz /mnt/OldBackup/EAV/1
cp: error copying '/mnt/Backup/EAV/data-eav-eav-aktiv-20221207.dump.xz' to '/mnt/OldBackup/EAV/1/data-eav-eav-aktiv-20221207.dump.xz': No such file or directory
# stat /mnt/OldBackup/EAV/1/data-eav-eav-aktiv-20221207.dump.xz
File: /mnt/OldBackup/EAV/1/data-eav-eav-aktiv-20221207.dump.xz
Size: 0 Blocks: 8 IO Block: 1048576 regular empty file
Device: 0,36 Inode: 81611419679 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 0/ root) Gid: ( 16/ cron)
Access: 2023-01-06 21:45:57.070743000 +0100
Modify: 2023-01-06 21:45:57.070743000 +0100
Change: 2023-01-06 21:45:57.070743000 +0100
Birth: 2023-01-06 21:45:57.070743000 +0100
```


Information forwarded to bug-coreutils <at> gnu.org:
bug#60455; Package coreutils. (Sat, 07 Jan 2023 16:26:02 GMT)

Message #14 received at 60455 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Sam James <sam <at> gentoo.org>
Cc: 60455 <at> debbugs.gnu.org
Subject: Re: bug#60455: Missing fallback if copy_file_range returns ENOENT?
Date: Sat, 7 Jan 2023 16:25:07 +0000
On 07/01/2023 07:34, Sam James wrote:
> Frank got back to me and said an empty file gets created:
> ```
> # cp /mnt/Backup/EAV/data-eav-eav-aktiv-20221207.dump.xz /mnt/OldBackup/EAV/1
> cp: error copying '/mnt/Backup/EAV/data-eav-eav-aktiv-20221207.dump.xz' to '/mnt/OldBackup/EAV/1/data-eav-eav-aktiv-20221207.dump.xz': No such file or directory
> # stat /mnt/OldBackup/EAV/1/data-eav-eav-aktiv-20221207.dump.xz
> File: /mnt/OldBackup/EAV/1/data-eav-eav-aktiv-20221207.dump.xz
> Size: 0 Blocks: 8 IO Block: 1048576 regular empty file

OK, then it's probably worth handling in coreutils.
Note I still get the feeling this is a race in CIFS
that is only being made more apparent with copy_file_range(),
but fair enough: this is a regression for users, and
we should be able to cater for it easily enough.

The attached patch does that for ENOENT.

cheers,
Pádraig
[copy-range-cifs.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#60455; Package coreutils. (Sun, 08 Jan 2023 00:53:02 GMT)

Message #17 received at 60455 <at> debbugs.gnu.org:

From: Sam James <sam <at> gentoo.org>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 60455 <at> debbugs.gnu.org
Subject: Re: bug#60455: Missing fallback if copy_file_range returns ENOENT?
Date: Sun, 8 Jan 2023 00:51:55 +0000

> On 7 Jan 2023, at 16:25, Pádraig Brady <P <at> draigBrady.com> wrote:
> 
> OK, then it's probably worth handling in coreutils.
> Note I still get the feeling this is a race in CIFS
> that is only being made more apparent with copy_file_range(),
> but fair enough: this is a regression for users, and
> we should be able to cater for it easily enough.

Total agreement. Thanks, looks good!

Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Sun, 08 Jan 2023 13:46:02 GMT)

Notification sent to Sam James <sam <at> gentoo.org>:
bug acknowledged by developer. (Sun, 08 Jan 2023 13:46:02 GMT)

Message #22 received at 60455-done <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Sam James <sam <at> gentoo.org>
Cc: 60455-done <at> debbugs.gnu.org
Subject: Re: bug#60455: Missing fallback if copy_file_range returns ENOENT?
Date: Sun, 8 Jan 2023 13:45:40 +0000
On 08/01/2023 00:51, Sam James wrote:
> 
> 
>> On 7 Jan 2023, at 16:25, Pádraig Brady <P <at> draigBrady.com> wrote:

>> OK, it's probably worth handling in coreutils then.
>> Note I still get the feeling this is a race in CIFS
>> that is only being made more apparent with copy_file_range(),
>> but fair enough: this is a regression for users, and
>> we should be able to cater for it easily enough.

Or more precisely, ENOENT is unusual for fd operations,
so falling back to a standard copy should in practice be
restricted to this or similar cases.

If this were seen on a single CIFS mount, the fallback may be
less appropriate, as the user may not want to fall back to
a client-side copy when a server-side copy should work.
But in this case of separate mounts, the fallback is appropriate.
I guess we could restrict it to separate device IDs,
but that's probably getting too complicated for this.

> Total agreement. Thanks, looks good!

Pushed.
Marking this as done.

cheers,
Pádraig
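
For illustration only, the device-ID restriction mentioned above (and set aside as probably too complicated) might look something like the sketch below. Nothing in this thread says the pushed change does this. The function names are hypothetical, and the sketch assumes that separate mounts report different `st_dev` values from `fstat()`.

```python
import errno
import os


def fallback_allowed(err, src_dev, dst_dev):
    """Policy only: allow a client-side fallback for ENOENT across devices."""
    return err == errno.ENOENT and src_dev != dst_dev


def fallback_allowed_for_fds(err, src_fd, dst_fd):
    """Same policy, reading the device IDs from the open descriptors."""
    src_dev = os.fstat(src_fd).st_dev
    dst_dev = os.fstat(dst_fd).st_dev
    return fallback_allowed(err, src_dev, dst_dev)
```

With such a check, an ENOENT seen while copying within a single mount would still be reported as an error rather than silently replaced by a client-side copy.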




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 06 Feb 2023 12:24:11 GMT)

This bug report was last modified 1 year and 79 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.