GNU bug report logs - #51433
cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

Reported by: Janne Heß <janne+coreutils <at> hess.ooo>

Date: Wed, 27 Oct 2021 11:56:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 51433 in the body.
You can then email your comments to 51433 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox

Report forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Wed, 27 Oct 2021 11:56:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Janne Heß <janne+coreutils <at> hess.ooo>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 27 Oct 2021 11:56:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Janne Heß <janne+coreutils <at> hess.ooo>
To: bug-coreutils <at> gnu.org
Subject: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Wed, 27 Oct 2021 12:00:06 +0200

Hi everyone,

I packaged coreutils 9.0 for NixOS and we found breakages that seemed to be very random during builds of packages
that use the updated coreutils in their build process. It's really hard to tell the main cause but it seems like the issues
are caused by binaries that are corrupted after cp copied them from /tmp to /nix. The issue arises both when the
directories are on the same filesystem and when /tmp is on tmpfs.
Upon further inspection/bisection we figured out these issues are caused by a6eaee501f6ec0c152abe88640203a64c390993e.
This seems to happen on ZFS and indeed on the main coreutils mailing list there is a ZFS issue linked [1].
The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858 so it doesn't detect this issue anymore,
but the issue still very much happens in the real world.

We have found this to happen while building the completions for a go tool (jx) which seems to be the same
issue as [2]. The tool is built, copied using cp, and called which causes a segfault to happen.

Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
coreutils commit is undone by replacing #ifdef with #if 0 [3].

We have also seen this issue on Darwin when building Alacritty but only happening on some machines
but we were not able to pin it down any further there so this might be related or it might not.

Since the issue is so random, we started wondering if it might be related to -frandom-seed which changes in NixOS
when rebuilding a package [4]. A thing to note here is that Nix does a lot of sandboxing stuff during builds which
includes mount namespaces so a Kernel bug is not out of the question. All of these issues happened during Nix builds,
coreutils 9.0 never made it out of the NixOS staging environment due to the builds breaking. We will probably disable
the new code paths as outlined above so the issue is contained for NixOS users and does not hit any production environments.

[1]: https://github.com/openzfs/zfs/issues/11900
[2]: https://github.com/golang/go/issues/48636
[3]: https://raw.githubusercontent.com/NixOS/nixpkgs/bf0531b4f8a2de4ff2700797fb211a90c951786e/pkgs/tools/misc/coreutils/disable-seek-hole.patch
[4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Wed, 27 Oct 2021 15:37:02 GMT) Full text and rfc822 format available.

Message #8 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Wed, 27 Oct 2021 16:36:29 +0100

On 27/10/2021 11:00, Janne Heß wrote:
> Hi everyone,
> 
> I packaged coreutils 9.0 for NixOS and we found breakages that seemed to be very random during builds of packages
> that use the updated coreutils in their build process. It's really hard to tell the main cause but it seems like the issues
> are caused by binaries that are corrupted after cp copied them from /tmp to /nix. The issue arises both when the
> directories are on the same filesystem and when /tmp is on tmpfs.
> Upon further inspection/bisection we figured out these issues are caused by a6eaee501f6ec0c152abe88640203a64c390993e.
> This seems to happen on ZFS and indeed on the main coreutils mailing list there is a ZFS issue linked [1].
> The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858 so it doesn't detect this issue anymore,
> but the issue still very much happens in the real world.
> 
> We have found this to happen while building the completions for a go tool (jx) which seems to be the same
> issue as [2]. The tool is built, copied using cp, and called which causes a segfault to happen.
> 
> Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
> test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
> coreutils commit is undone by replacing #ifdef with #if 0 [3].
> 
> We have also seen this issue on Darwin when building Alacritty but only happening on some machines
> but we were not able to pin it down any further there so this might be related or it might not.
> 
> Since the issue is so random, we started wondering if it might be related to -frandom-seed which changes in NixOS
> when rebuilding a package [4]. A thing to note here is that Nix does a lot of sandboxing stuff during builds which
> includes mount namespaces so a Kernel bug is not out of the question. All of these issues happened during Nix builds,
> coreutils 9.0 never made it out of the NixOS staging environment due to the builds breaking. We will probably disable
> the new code paths as outlined above so the issue is contained for NixOS users and does not hit any production environments.
> 
> [1]: https://github.com/openzfs/zfs/issues/11900
> [2]: https://github.com/golang/go/issues/48636
> [3]: https://raw.githubusercontent.com/NixOS/nixpkgs/bf0531b4f8a2de4ff2700797fb211a90c951786e/pkgs/tools/misc/coreutils/disable-seek-hole.patch
> [4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263

We know about the ZFS issue with SEEK_HOLE:
https://lists.gnu.org/archive/html/coreutils/2021-10/msg00021.html

I've asked the user having nixos issues on darwin whether they're using the zfs on darwin port,
or at least what file system is being copied from there.

This is awkward to handle unfortunately.
All I can think of now is to identify the file system type for each source file,
and disable SEEK_HOLE on zfs at least.

thanks for the info,
Pádraig

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Thu, 28 Oct 2021 07:57:01 GMT) Full text and rfc822 format available.

Message #11 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Janne Heß <janne+coreutils <at> hess.ooo>
Cc: 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Thu, 28 Oct 2021 00:56:11 -0700

[Message part 1 (text/plain, inline)]

On 10/27/21 03:00, Janne Heß wrote:
> Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
> test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
> coreutils commit is undone by replacing #ifdef with #if 0 [3].

So the problem is not limited to ZFS? Which means that even if we 
implemented Pádraig's suggestion and disabled SEEK_HOLE on zfs, we'd 
still run into problems? That's really puzzling. Particularly since it's 
not clear what program is generating the diagnostic "The service is no 
longer running", or how it's related to GNU cp.

Anyway, the ZFS issue sounds like a serious bug in lseek+SEEK_DATA that 
really needs to be fixed. This is not just a coreutils issue, as other 
programs use SEEK_DATA.

I assume the ZFS bug (if the bug is related to ZFS, anyway) is a race 
condition of some sort; at least, that's what the trace in 
<https://github.com/openzfs/zfs/issues/11900> suggests.

In particular, I was struck that the depthcharge.config file that 'cp' 
was reading from was created by some other process, this way:

[pid 3014182] openat(AT_FDCWD, 
"/build/guybrush/tmp/portage/sys-boot/depthcharge-0.0.1-r3237/image/firmware/guybrush/depthcharge/depthcharge.config", 
O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
[pid 3014182] fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 3014182] ioctl(4, TCGETS, 0x7ffd919d61c0) = -1 ENOTTY 
(Inappropriate ioctl for device)
[pid 3014182] lseek(3, 0, SEEK_CUR)     = 0
[pid 3014182] lseek(3, 0, SEEK_DATA)    = 0
[pid 3014182] lseek(3, 0, SEEK_HOLE)    = 9608
[pid 3014182] copy_file_range(3, [0], 4, [0], 9608, 0) = 9608
[pid 3014182] lseek(3, 0, SEEK_CUR)     = 9608
[pid 3014182] lseek(3, 9608, SEEK_DATA) = -1 ENXIO (No such device or 
address)
[pid 3014182] lseek(3, 0, SEEK_END)     = 9608
[pid 3014182] ftruncate(4, 9608)        = 0
[pid 3014182] close(4)                  = 0

So, one hypothesis is that ZFS's implementation of copy_file_range does 
not set up data structures appropriately for cp's later use of 
lseek+SEEK_DATA when reading depthcharge.config. That is, from cp's 
point of view, the ftruncate(4, 9608) has been executed but the 
copy_file_range(3, [0], 4, [0], 9608, 0) has not been executed yet (it's 
cached somewhere, no doubt).

If my guess is right, then an fdatasync or fsync on cp's input might 
work around  common instances of this ZFS bug. Could you try the 
attached coreutils patch, and see whether it works around the bug? Or 
perhaps change 'fdatasync' with 'fsync' in the attached patch? Thanks.

[0001-cp-attempt-to-work-around-ZFS-bug.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Thu, 28 Oct 2021 13:55:02 GMT) Full text and rfc822 format available.

Message #14 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Janne Heß
 <janne+coreutils <at> hess.ooo>
Cc: 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Thu, 28 Oct 2021 14:54:12 +0100

On 28/10/2021 08:56, Paul Eggert wrote:
> On 10/27/21 03:00, Janne Heß wrote:
>> Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
>> test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
>> coreutils commit is undone by replacing #ifdef with #if 0 [3].
> 
> So the problem is not limited to ZFS? Which means that even if we
> implemented Pádraig's suggestion and disabled SEEK_HOLE on zfs, we'd
> still run into problems? That's really puzzling. Particularly since it's
> not clear what program is generating the diagnostic "The service is no
> longer running", or how it's related to GNU cp.
> 
> Anyway, the ZFS issue sounds like a serious bug in lseek+SEEK_DATA that
> really needs to be fixed. This is not just a coreutils issue, as other
> programs use SEEK_DATA.
> 
> I assume the ZFS bug (if the bug is related to ZFS, anyway) is a race
> condition of some sort; at least, that's what the trace in
> <https://github.com/openzfs/zfs/issues/11900> suggests.
> 
> In particular, I was struck that the depthcharge.config file that 'cp'
> was reading from was created by some other process, this way:
> 
> [pid 3014182] openat(AT_FDCWD,
> "/build/guybrush/tmp/portage/sys-boot/depthcharge-0.0.1-r3237/image/firmware/guybrush/depthcharge/depthcharge.config",
> O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
> [pid 3014182] fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> [pid 3014182] ioctl(4, TCGETS, 0x7ffd919d61c0) = -1 ENOTTY
> (Inappropriate ioctl for device)
> [pid 3014182] lseek(3, 0, SEEK_CUR)     = 0
> [pid 3014182] lseek(3, 0, SEEK_DATA)    = 0
> [pid 3014182] lseek(3, 0, SEEK_HOLE)    = 9608
> [pid 3014182] copy_file_range(3, [0], 4, [0], 9608, 0) = 9608
> [pid 3014182] lseek(3, 0, SEEK_CUR)     = 9608
> [pid 3014182] lseek(3, 9608, SEEK_DATA) = -1 ENXIO (No such device or
> address)
> [pid 3014182] lseek(3, 0, SEEK_END)     = 9608
> [pid 3014182] ftruncate(4, 9608)        = 0
> [pid 3014182] close(4)                  = 0
> 
> So, one hypothesis is that ZFS's implementation of copy_file_range does
> not set up data structures appropriately for cp's later use of
> lseek+SEEK_DATA when reading depthcharge.config. That is, from cp's
> point of view, the ftruncate(4, 9608) has been executed but the
> copy_file_range(3, [0], 4, [0], 9608, 0) has not been executed yet (it's
> cached somewhere, no doubt).
> 
> If my guess is right, then an fdatasync or fsync on cp's input might
> work around  common instances of this ZFS bug. Could you try the
> attached coreutils patch, and see whether it works around the bug? Or
> perhaps change 'fdatasync' with 'fsync' in the attached patch? Thanks.

Further debugging from Nix folks suggest ZFS was in consideration always,
as invalid artifacts were written to a central cache from ZFS backed hosts.
So we should at least change the comment in the patch to only mention ZFS.

Also it seems like fsync() does avoid the ZFS issue as mentioned in:
https://github.com/openzfs/zfs/issues/11900

BTW I'm slightly worried about retrying SEEK_DATA as
FreeBSD 9.1 has a bug with large sparse files at least
where it takes ages for SEEK_DATA to return:
  36.13290615 lseek(3,0x0,SEEK_DATA)         = -32768 (0xffff8000)
If ENXIO is not set in that case, then there is no issue.

Also I'm not sure restricting sync to ENXIO is general enough,
as an strace from a problematic cp, from the github issue above is:
  lseek(3, 0, SEEK_DATA)            = 0
  lseek(3, 0, SEEK_HOLE)            = 131072
  lseek(3, 0, SEEK_SET)             = 0
  read(3, "\177ELF\2\1"..., 131072) = 131072
  write(4, "\177ELF\2\"..., 131072) = 131072
  lseek(3, 131072, SEEK_DATA)       = -1 ENXIO
  ftruncate(4, 3318813)             = 0

For completeness, there is another SEEK_DATA issue on ZFS,
but only a perf one, not a correctness one.
This was a perf issue we saw in our tests and handled with:
https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=v9.0-3-g61c81ffaa
That case is where ZFS has an async operation that needs to complete
before SEEK_DATA can determine a sparse file is empty (return ENXIO).
That seems to be controlled with zfs-dmu-offset-next-sync
One can see that perf issue with:

   rm f3 f4 &&
   truncate -s 1T f3 &&
   sleep 2 &&
   timeout 1 src/cp --reflink=never f3 f4 &&
   echo ok

If you change the `sleep 2` to a `sleep 4`, then "ok" is echoed.
Sometimes `sleep 3` works, so this seems to be the zfs timer.
Changing to a smaller file doesn't seem to change anything.
Using `sync f3`, 'sync .`, or `sync` rather than `sleep 4` doesn't help.

cheers,
Pádraig

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Thu, 28 Oct 2021 19:12:01 GMT) Full text and rfc822 format available.

Message #17 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Thu, 28 Oct 2021 12:11:41 -0700

On 10/28/21 06:54, Pádraig Brady wrote:

> Further debugging from Nix folks suggest ZFS was in consideration always,
> as invalid artifacts were written to a central cache from ZFS backed hosts.
> So we should at least change the comment in the patch to only mention ZFS.

Yes, that sounds reasonable.

This ZFS bug sounds pretty serious, though. Apparently it affects star 
and other programs too. I'm not sure we should attempt to work around it 
in coreutils, if the workarounds penalize everybody not using ZFS.

Is it cheap to check whether a file is actually in a ZFS filesystem? 
(Don't know how this'd work with loopback mounts, NFS, etc.) If so, it 
might be better to simply fdatasync (or even fsync) every input file 
that's on ZFS, until we know the ZFS bugs are fixed.

In theory we could fdatasync/fsync every input file on every platform. 
It'd be a shame to do that, though; that would slow down everybody 
merely to work around this ZFS bug.

> Also it seems like fsync() does avoid the ZFS issue as mentioned in:
> https://github.com/openzfs/zfs/issues/11900

Yes. I'm hoping that fdatasync suffices as it's lighter-weight. But if 
fsync is needed we can use fsync.

> BTW I'm slightly worried about retrying SEEK_DATA as
> FreeBSD 9.1 has a bug with large sparse files at least
> where it takes ages for SEEK_DATA to return:
>    36.13290615 lseek(3,0x0,SEEK_DATA)         = -32768 (0xffff8000)
> If ENXIO is not set in that case, then there is no issue.

Wait - lseek returns a number less than -1?! We could easily check for 
that FreeBSD bug, perhaps as an independent patch; this shouldn't 
require any extra syscalls.

Also please see 
<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256205>. It appears 
that ZFS has significant bugs in this area on FreeBSD, bugs that haven't 
been fixed yet. That bug report does suggest that an fsync (and I hope 
fdatasync) works around the bugs.

> Also I'm not sure restricting sync to ENXIO is general enough,
> as an strace from a problematic cp, from the github issue above is:
>    lseek(3, 0, SEEK_DATA)            = 0
>    lseek(3, 0, SEEK_HOLE)            = 131072
>    lseek(3, 0, SEEK_SET)             = 0
>    read(3, "\177ELF\2\1"..., 131072) = 131072
>    write(4, "\177ELF\2\"..., 131072) = 131072
>    lseek(3, 131072, SEEK_DATA)       = -1 ENXIO
>    ftruncate(4, 3318813)             = 0

How about if we also do an fdatasync+retry after that 2nd lseek yields 
ENXIO? Would that suffice to work around the ZFS bug? Would it be too 
much of a performance penalty for non-ZFS users?

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Thu, 28 Oct 2021 21:00:02 GMT) Full text and rfc822 format available.

Message #20 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Thu, 28 Oct 2021 21:59:03 +0100

On 28/10/2021 20:11, Paul Eggert wrote:
> On 10/28/21 06:54, Pádraig Brady wrote:
> 
>> Further debugging from Nix folks suggest ZFS was in consideration always,
>> as invalid artifacts were written to a central cache from ZFS backed hosts.
>> So we should at least change the comment in the patch to only mention ZFS.
> 
> Yes, that sounds reasonable.
> 
> This ZFS bug sounds pretty serious, though. Apparently it affects star
> and other programs too. I'm not sure we should attempt to work around it
> in coreutils, if the workarounds penalize everybody not using ZFS.
> 
> Is it cheap to check whether a file is actually in a ZFS filesystem?
> (Don't know how this'd work with loopback mounts, NFS, etc.) If so, it
> might be better to simply fdatasync (or even fsync) every input file
> that's on ZFS, until we know the ZFS bugs are fixed.
> 
> In theory we could fdatasync/fsync every input file on every platform.
> It'd be a shame to do that, though; that would slow down everybody
> merely to work around this ZFS bug.
> 
>> Also it seems like fsync() does avoid the ZFS issue as mentioned in:
>> https://github.com/openzfs/zfs/issues/11900
> 
> Yes. I'm hoping that fdatasync suffices as it's lighter-weight. But if
> fsync is needed we can use fsync.
> 
>> BTW I'm slightly worried about retrying SEEK_DATA as
>> FreeBSD 9.1 has a bug with large sparse files at least
>> where it takes ages for SEEK_DATA to return:
>>     36.13290615 lseek(3,0x0,SEEK_DATA)         = -32768 (0xffff8000)
>> If ENXIO is not set in that case, then there is no issue.
> 
> Wait - lseek returns a number less than -1?! We could easily check for
> that FreeBSD bug, perhaps as an independent patch; this shouldn't
> require any extra syscalls.
> 
> Also please see
> <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256205>. It appears
> that ZFS has significant bugs in this area on FreeBSD, bugs that haven't
> been fixed yet. That bug report does suggest that an fsync (and I hope
> fdatasync) works around the bugs.
> 
>> Also I'm not sure restricting sync to ENXIO is general enough,
>> as an strace from a problematic cp, from the github issue above is:
>>     lseek(3, 0, SEEK_DATA)            = 0
>>     lseek(3, 0, SEEK_HOLE)            = 131072
>>     lseek(3, 0, SEEK_SET)             = 0
>>     read(3, "\177ELF\2\1"..., 131072) = 131072
>>     write(4, "\177ELF\2\"..., 131072) = 131072
>>     lseek(3, 131072, SEEK_DATA)       = -1 ENXIO
>>     ftruncate(4, 3318813)             = 0
> 
> How about if we also do an fdatasync+retry after that 2nd lseek yields
> ENXIO? Would that suffice to work around the ZFS bug? Would it be too
> much of a performance penalty for non-ZFS users?

I don't think there is anything special about first or second lseek().
ZFS just seems to be returning ENXIO depending on the write pattern of the file.
Since we currently always have an expected SEEK_DATA ENXIO at end of file,
doing a sync on any ENXIO would be best avoided.

I wonder after getting a SEEK_DATA ENXIO, might we correlate
we really are at end of file by checking SEEK_HOLE also returns ENXIO?
If not we could do the datasync and try SEEK_DATA again.
I've not seen enough syscall traces to be confident of that though.

BTW we only attempt the SEEK_HOLE scanning when our heuristic
of st_blocks being less than st_size indicates there may be holes.
It's a bit surprising that that is the case for these elf binaries.
Perhaps zfs is updating st_blocks async, or perhaps there are
runs of zeros in these files that a linker or whatever is sparsifying?

cheers,
Pádraig

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Sat, 30 Oct 2021 01:25:01 GMT) Full text and rfc822 format available.

Message #23 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Fri, 29 Oct 2021 18:24:38 -0700

[Message part 1 (text/plain, inline)]

On 10/28/21 12:11, Paul Eggert wrote:
> Wait - lseek returns a number less than -1?! We could easily check for 
> that FreeBSD bug, perhaps as an independent patch; this shouldn't 
> require any extra syscalls.

I installed the attached patch to do this. This doesn't fix coreutils 
bug#51433; it merely makes 'cp' and similar programs more likely to 
detect and report the FreeBSD 9.1 bug you mentioned.

[0001-cp-defend-better-against-FreeBSD-9.1-zfs-bug.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Sat, 30 Oct 2021 02:05:02 GMT) Full text and rfc822 format available.

Message #26 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Fri, 29 Oct 2021 19:04:04 -0700

[Message part 1 (text/plain, inline)]

On 10/28/21 13:59, Pádraig Brady wrote:

> I wonder after getting a SEEK_DATA ENXIO, might we correlate
> we really are at end of file by checking SEEK_HOLE also returns ENXIO?

Wouldn't SEEK_HOLE return the current offset, instead of ENXIO? That is, 
if the underlying data structure is wrong and claims that the rest of 
the file is a hole, wouldn't SEEK_HOLE merely repeat that information?

> Perhaps zfs is updating st_blocks async, or perhaps there are
> runs of zeros in these files that a linker or whatever is sparsifying?

Even if st_blocks was out-of-date, that's just a heuristic and the later 
lseeks should still work. I don't think the files contain actual holes.

I don't see an easy workaround for the ZFS bug, unless we want to slow 
down 'cp' for everybody. This really needs to be fixed on the ZFS side.

The attached patch to OpenZFS might work around the bug (I haven't 
tested it; perhaps someone who uses ZFS could give it a try).

[zfs.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Sat, 30 Oct 2021 10:23:01 GMT) Full text and rfc822 format available.

Message #29 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Sat, 30 Oct 2021 11:22:13 +0100

On 30/10/2021 02:24, Paul Eggert wrote:
> On 10/28/21 12:11, Paul Eggert wrote:
>> Wait - lseek returns a number less than -1?! We could easily check for
>> that FreeBSD bug, perhaps as an independent patch; this shouldn't
>> require any extra syscalls.
> 
> I installed the attached patch to do this. This doesn't fix coreutils
> bug#51433; it merely makes 'cp' and similar programs more likely to
> detect and report the FreeBSD 9.1 bug you mentioned.

On further inspection it seems the bug is in truss :/
I.e. the only bug on FreeBSD 9.1 is that the SEEK_DATA takes ages,
but the returned values are correct.
I.e. SEEK_DATA was returning 1099511595008 in this case
which truss was reporting in 32 bit rather than 64 bit.
I see other inconsistencies in truss output,
so it can only be relied on for what syscall was called,
rather than what was passed/returned.

Given this patch doesn't change any operation,
it may be best to revert to avoid the complexity?

cheers,
Pádraig

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Sat, 30 Oct 2021 17:09:02 GMT) Full text and rfc822 format available.

Message #32 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Sat, 30 Oct 2021 10:07:53 -0700

[Message part 1 (text/plain, inline)]

On 10/30/21 03:22, Pádraig Brady wrote:

> Given this patch doesn't change any operation,
> it may be best to revert to avoid the complexity?

Yes, and thanks for checking. I installed the attached.

The main bug of course remains unfixed, as it's a ZFS bug that needs 
fixing in ZFS.

[0001-cp-revert-unnecessary-FreeBSD-workaround.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Sun, 31 Oct 2021 16:14:02 GMT) Full text and rfc822 format available.

Message #35 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Sun, 31 Oct 2021 16:13:45 +0000

[Message part 1 (text/plain, inline)]

On 28/10/2021 20:11, Paul Eggert wrote:
> On 10/28/21 06:54, Pádraig Brady wrote:
> 
>> Further debugging from Nix folks suggest ZFS was in consideration always,
>> as invalid artifacts were written to a central cache from ZFS backed hosts.
>> So we should at least change the comment in the patch to only mention ZFS.
> 
> Yes, that sounds reasonable.
> 
> This ZFS bug sounds pretty serious, though. Apparently it affects star
> and other programs too. I'm not sure we should attempt to work around it
> in coreutils, if the workarounds penalize everybody not using ZFS.

Yes this is an awkward situation.
Though given the frequency of the cp/zfs combo,
I think we should try to provide some mitigation.

> Is it cheap to check whether a file is actually in a ZFS filesystem?
> (Don't know how this'd work with loopback mounts, NFS, etc.) If so, it
> might be better to simply fdatasync (or even fsync) every input file
> that's on ZFS, until we know the ZFS bugs are fixed.

The attached uses statfs()->f_type which is usually available,
to avoid using SEEK_DATA on ZFS.  This should be fairly lightweight I think,
and only used for files that might be sparse.

There may still be issues with NFS backed by ZFS,
but that should be rarer and more centrally managed,
thus more amenable to possible future ZFS fixes, or mitigations.
We could also disable SEEK_DATA for remote file systems,
but that would be a pity since SEEK_DATA seems especially useful there.

cheers,
Pádraig

[zfs-seek-data-skip.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Sun, 31 Oct 2021 16:24:01 GMT) Full text and rfc822 format available.

Message #38 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Sun, 31 Oct 2021 16:23:45 +0000

On 30/10/2021 03:04, Paul Eggert wrote:
> On 10/28/21 13:59, Pádraig Brady wrote:
> 
>> I wonder after getting a SEEK_DATA ENXIO, might we correlate
>> we really are at end of file by checking SEEK_HOLE also returns ENXIO?
> 
> Wouldn't SEEK_HOLE return the current offset, instead of ENXIO? That is,
> if the underlying data structure is wrong and claims that the rest of
> the file is a hole, wouldn't SEEK_HOLE merely repeat that information?

What I was thinking is that SEEK_DATA returning ENXIO suggests end of file.
If we're not actually at EOF a SEEK_HOLE at SEEK_CUR+1 might not indicate ENXIO.
I which case we could do a fdatasync().
All wild speculation though depending on the ZFS bug.
If the bug is fixed and we have a good reproducer, we might
be able to do something like this to cater for existing buggy ZFS installations.

cheers,
Pádraig

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Mon, 01 Nov 2021 05:53:01 GMT) Full text and rfc822 format available.

Message #41 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Sun, 31 Oct 2021 22:52:04 -0700

On 10/31/21 09:13, Pádraig Brady wrote:

> The attached uses statfs()->f_type which is usually available,
> to avoid using SEEK_DATA on ZFS.  This should be fairly lightweight I 
> think,
> and only used for files that might be sparse.

Couldn't we be even lazier about invoking fstatfs? It needs to be 
invoked only if the file appears to be sparse (as you mentioned), and 
also only if lseek+SEEK_DATA fails (with errno==ENXIO) at a position 
less than the file's length, and also at most once per input file 
descriptor.

When copying multiple files (e.g., cp -r) we could also cache the 
previous file's st_dev and ZFS status, so in the typical case we could 
skip fstatfs entirely except for the first sparsish file that satisfies 
the abovementioned criteria.

Also, as I understand it the problem occurs with OpenZFS but not on 
Solaris or its descendants, so we don't need to do any of this if __sun 
is defined.

Also, if we're going to use fstatfs it'd probably be better to hoist all 
the fstatfs portability mess out of stat.c and into a new file (say, 
gl/lib/pstatfs.c) that gives us a portable way of getting statfs-like 
data out of filesystems, and using that new file in both stat.c and copy.c.

One more idea: I've seen reports that the problem goes away if you use 
'read' to read just the first byte of the false hole; that might be a 
simpler workaround than getting into the fstatfs portability mess.

> We could also disable SEEK_DATA for remote file systems,
> but that would be a pity since SEEK_DATA seems especially useful there.

Absolutely.

Don't know if you're following 
<https://github.com/openzfs/zfs/issues/11900> but rincebrain reports a 
working workaround on the OpenZFS side tho more testing needs to be 
done. This OpenZFS bugs affects GNU tar and star (and surely other 
programs) as well as coreutils, and we may be better off overall if we 
merely ask people to fix their ZFS implementation rather than our trying 
to work around the bug in coreutils.

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Mon, 01 Nov 2021 13:28:01 GMT) Full text and rfc822 format available.

Message #44 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Mon, 1 Nov 2021 13:27:15 +0000

On 01/11/2021 05:52, Paul Eggert wrote:
> On 10/31/21 09:13, Pádraig Brady wrote:
> 
>> The attached uses statfs()->f_type which is usually available,
>> to avoid using SEEK_DATA on ZFS.  This should be fairly lightweight I
>> think,
>> and only used for files that might be sparse.
> 
> Couldn't we be even lazier about invoking fstatfs? It needs to be
> invoked only if the file appears to be sparse (as you mentioned), and
> also only if lseek+SEEK_DATA fails (with errno==ENXIO) at a position
> less than the file's length, and also at most once per input file
> descriptor.
> 
> When copying multiple files (e.g., cp -r) we could also cache the
> previous file's st_dev and ZFS status, so in the typical case we could
> skip fstatfs entirely except for the first sparsish file that satisfies
> the abovementioned criteria.

I was thinking that sparse may be enough of a filter,
but these are valid suggestions.

> Also, as I understand it the problem occurs with OpenZFS but not on
> Solaris or its descendants, so we don't need to do any of this if __sun
> is defined.

Oh interesting.
If that was actually the case we could probably default
to allowing SEEK_DATA, and only disallow when f_type == ZFS,
i.e. allow SEEK_DATA on solaris.

> Also, if we're going to use fstatfs it'd probably be better to hoist all
> the fstatfs portability mess out of stat.c and into a new file (say,
> gl/lib/pstatfs.c) that gives us a portable way of getting statfs-like
> data out of filesystems, and using that new file in both stat.c and copy.c.

.. and tail.c. Good suggestion.

> One more idea: I've seen reports that the problem goes away if you use
> 'read' to read just the first byte of the false hole; that might be a
> simpler workaround than getting into the fstatfs portability mess.

Indeed. It would be really nice to get a solution supporting
existing, and future openzfs installations.

>> We could also disable SEEK_DATA for remote file systems,
>> but that would be a pity since SEEK_DATA seems especially useful there.
> 
> Absolutely.
> 
> Don't know if you're following
> <https://github.com/openzfs/zfs/issues/11900> but rincebrain reports a
> working workaround on the OpenZFS side tho more testing needs to be
> done. This OpenZFS bugs affects GNU tar and star (and surely other
> programs) as well as coreutils, and we may be better off overall if we
> merely ask people to fix their ZFS implementation rather than our trying
> to work around the bug in coreutils.

I see a slew of recent messages re workarounds in openzfs.
It's good to see some movement there, as it's definitely
best to address this in zfs.

cheers,
Pádraig

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Wed, 03 Nov 2021 15:39:01 GMT) Full text and rfc822 format available.

Message #47 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Wed, 3 Nov 2021 15:37:58 +0000

On 27/10/2021 11:00, Janne Heß wrote:
> Hi everyone,
> 
> I packaged coreutils 9.0 for NixOS and we found breakages that seemed to be very random during builds of packages
> that use the updated coreutils in their build process. It's really hard to tell the main cause but it seems like the issues
> are caused by binaries that are corrupted after cp copied them from /tmp to /nix. The issue arises both when the
> directories are on the same filesystem and when /tmp is on tmpfs.
> Upon further inspection/bisection we figured out these issues are caused by a6eaee501f6ec0c152abe88640203a64c390993e.
> This seems to happen on ZFS and indeed on the main coreutils mailing list there is a ZFS issue linked [1].
> The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858 so it doesn't detect this issue anymore,
> but the issue still very much happens in the real world.
> 
> We have found this to happen while building the completions for a go tool (jx) which seems to be the same
> issue as [2]. The tool is built, copied using cp, and called which causes a segfault to happen.
> 
> Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
> test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
> coreutils commit is undone by replacing #ifdef with #if 0 [3].
> 
> We have also seen this issue on Darwin when building Alacritty but only happening on some machines
> but we were not able to pin it down any further there so this might be related or it might not.
> 
> Since the issue is so random, we started wondering if it might be related to -frandom-seed which changes in NixOS
> when rebuilding a package [4]. A thing to note here is that Nix does a lot of sandboxing stuff during builds which
> includes mount namespaces so a Kernel bug is not out of the question. All of these issues happened during Nix builds,
> coreutils 9.0 never made it out of the NixOS staging environment due to the builds breaking. We will probably disable
> the new code paths as outlined above so the issue is contained for NixOS users and does not hit any production environments.
> 
> [1]: https://github.com/openzfs/zfs/issues/11900
> [2]: https://github.com/golang/go/issues/48636
> [3]: https://raw.githubusercontent.com/NixOS/nixpkgs/bf0531b4f8a2de4ff2700797fb211a90c951786e/pkgs/tools/misc/coreutils/disable-seek-hole.patch
> [4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263

Looks like there is a WIP fix for OpenZFS mentioned at [1],
where mmap'd regions were not being flushed:
https://github.com/openzfs/zfs/commit/f2eebe07

So this should unblock enabling coreutils 9 at some stage at least.
I've asked at [1] now they know what's going on,
how programs might best distinguish buggy instances of openzfs.

cheers,
Pádraig

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Wed, 03 Nov 2021 17:58:02 GMT) Full text and rfc822 format available.

Message #50 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>,
 51433 <at> debbugs.gnu.org
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Wed, 3 Nov 2021 10:57:28 -0700

On 11/3/21 08:37, Pádraig Brady wrote:
> So this should unblock enabling coreutils 9 at some stage at least.

That's good news.

> I've asked at [1] now they know what's going on,
> how programs might best distinguish buggy instances of openzfs.

Maybe "configure" could do a "modinfo zfs" or a "cat 
/sys/modinfo/zfs/version"? Admittedly that's *quite* a hack and of 
course wouldn't be that reliable.

PS. Is there some way to arrange for this OpenZFS patch to be backported 
to Fedora relatively quickly? (I ask as a Fedora user. :-)

Information forwarded to bug-coreutils <at> gnu.org:
bug#51433; Package coreutils. (Mon, 08 Nov 2021 07:05:02 GMT) Full text and rfc822 format available.

Message #53 received at 51433 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: 51433 <at> debbugs.gnu.org
Cc: Janne Heß <janne+coreutils <at> hess.ooo>
Subject: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Sun, 7 Nov 2021 23:04:27 -0800

It looks like the patch to OpenZFS might not suffice; see:

https://github.com/openzfs/zfs/issues/11900#issuecomment-962812974

It looks like the OpenZFS folks are planning to look at it more this 
week. There is an obvious workaround (though it hurts performance)....

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Fri, 28 Jan 2022 02:45:01 GMT) Full text and rfc822 format available.

Notification sent to Janne Heß <janne+coreutils <at> hess.ooo>:
bug acknowledged by developer. (Fri, 28 Jan 2022 02:45:01 GMT) Full text and rfc822 format available.

Message #58 received at 51433-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: 51433-done <at> debbugs.gnu.org
Cc: Janne Heß <janne+coreutils <at> hess.ooo>
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Thu, 27 Jan 2022 18:43:55 -0800

On 11/7/21 23:04, Paul Eggert wrote:

> https://github.com/openzfs/zfs/issues/11900#issuecomment-962812974

Apparently the OpenZFS bug has been fixed, as behlendorf closed it 20 
days ago.

Since there doesn't seem to be a good way for coreutils to work around 
the bug, and the bug potentially affects all apps that use SEEK_DATA, 
I'm taking the liberty of closing the coreutils bug report. Thanks for 
reporting it.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 25 Feb 2022 12:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 129 days ago.

Previous Next

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #51433 cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

GNU bug report logs - #51433
cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE