GNU bug report logs -
#70877
guix-daemon fails to copy 4+GB file to store
Reported by: Ricardo Wurmus <rekado <at> elephly.net>
Date: Sat, 11 May 2024 10:54:01 UTC
Severity: important
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
Report forwarded to ludo <at> gnu.org, bug-guix <at> gnu.org: bug#70877; Package guix. (Sat, 11 May 2024 10:54:02 GMT)
Acknowledgement sent to Ricardo Wurmus <rekado <at> elephly.net>: New bug report received and forwarded. Copy sent to ludo <at> gnu.org, bug-guix <at> gnu.org. (Sat, 11 May 2024 10:54:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
The guix-daemon's libutil/util.cc uses copy_file_range to copy a
downloaded file into the store. copy_file_range fails on files larger
than 4GB with an error like this:
guix build: error: short write in copy_file_range `15' to `16': No such file or directory
The man page for copy_file_range says that it could return EFBIG when
the range exceeds the maximum range. The daemon code does not check any
limits and will attempt to copy the whole file.
I believe our code ought to check the value of st.size and fall back to
a boring copy if it exceeds some "reasonable" value.
This is where copy_file_range is used:
https://git.savannah.gnu.org/cgit/guix.git/tree/nix/libutil/util.cc#n382
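For reference, the "boring copy" mentioned above could be a plain read/write loop. The following is only a minimal sketch under that assumption: `boringCopy` is a hypothetical name, the 64 KiB buffer size is arbitrary, and this is not the daemon's actual fallback code.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <unistd.h>

// Hypothetical helper: copy everything readable from sourceFd to
// destinationFd with plain read/write calls.
void boringCopy(int sourceFd, int destinationFd)
{
    assert(sourceFd >= 0 && destinationFd >= 0);
    char buffer[64 * 1024];   // arbitrary buffer size
    for (;;) {
        ssize_t n = read(sourceFd, buffer, sizeof buffer);
        if (n == -1) {
            if (errno == EINTR) continue;
            throw std::system_error(errno, std::generic_category(), "read");
        }
        if (n == 0) break;    // end of file
        // 'write' may also write less than asked for, so loop until
        // the whole chunk has been written.
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(destinationFd, buffer + off, n - off);
            if (w == -1) {
                if (errno == EINTR) continue;
                throw std::system_error(errno, std::generic_category(), "write");
            }
            off += w;
        }
    }
}
```

Such a loop never asks the kernel for more than one buffer at a time, so it would not depend on any per-call size limit, at the cost of copying the data through user space.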
Here is a little reproducer:
[bug.scm (text/plain, inline)]
(use-modules (guix download)
             (guix packages)
             (guix build-system trivial))

(package
  (name "chungus")
  (version "1")
  (source
   (origin
     (method url-fetch)
     (uri "http://localhost:1111/chungus")
     (sha256
      (base32 "0nx67d4ls2nfwcfdmg81vf240z6lpwpdqypssr1wzn3hyz4szci4"))))
  (build-system trivial-build-system)
  (home-page "")
  (synopsis "")
  (description "")
  (license #f))
[Message part 3 (text/plain, inline)]
--8<---------------cut here---------------start------------->8---
# generate a big file
dd bs=1M count=4096 if=/dev/zero of=/tmp/chungus
# serve it
guix shell woof -- woof -i 127.0.0.1 -p 1111 -c 1 /tmp/chungus
# build the source derivation
guix build --no-grafts -Sf bug.scm
# observe the error
# guix build: error: short write in copy_file_range `15' to `16': No such file or directory
--8<---------------cut here---------------end--------------->8---
--
Ricardo
Information forwarded to bug-guix <at> gnu.org: bug#70877; Package guix. (Sun, 12 May 2024 07:14:01 GMT)
Message #8 received at 70877 <at> debbugs.gnu.org:
On Sat, May 11, 2024 at 12:52:53PM +0200, Ricardo Wurmus wrote:
> The guix-daemon's libutil/util.cc uses copy_file_range to copy a
> downloaded file into the store. copy_file_range fails on files larger
> than 4GB with an error like this:
>
> guix build: error: short write in copy_file_range `15' to `16': No such file or directory
>
This sounds like a similar failure to bug 65714 that I ran into with
guix copy, but I wasn't able to diagnose it.
https://issues.guix.gnu.org/65714
--
Efraim Flashner <efraim <at> flashner.co.il> רנשלפ םירפא
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 13 May 2024 09:06:01 GMT)
Information forwarded to bug-guix <at> gnu.org: bug#70877; Package guix. (Mon, 13 May 2024 10:11:01 GMT)
Message #13 received at 70877 <at> debbugs.gnu.org:
Hi,
Thanks for the bug report and nice reproducer!
Ricardo Wurmus <rekado <at> elephly.net> skribis:
> The guix-daemon's libutil/util.cc uses copy_file_range to copy a
> downloaded file into the store. copy_file_range fails on files larger
> than 4GB with an error like this:
>
> guix build: error: short write in copy_file_range `15' to `16': No such file or directory
>
> The man page for copy_file_range says that it could return EFBIG when
> the range exceeds the maximum range. The daemon code does not check any
> limits and will attempt to copy the whole file.
>
> I believe our code ought to check the value of st.size and fall back to
> a boring copy if it exceeds some "reasonable" value.
The system call leading to this error message looks like this:
copy_file_range(15, NULL, 16, NULL, 4294967297, 0) = 2147479552
… which is precisely 2 GiB - 4 KiB.
Reading the man page, it’s entirely fine: like ‘write’,
‘copy_file_range’ might copy less than asked for, so it’s really a
mistake of mine to assume that short writes can’t happen. Presumably
there’s an internal limit here we’re reaching that explains why it won’t
copy more than 2 GiB at once.
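The arithmetic behind that figure can be checked with a small sketch. It assumes that no single call transfers more than the 2147479552 bytes observed above; `kObservedCap` and `planCalls` are hypothetical names used only for illustration, and the cap is taken from this report rather than from the man page.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Assumption: the per-call cap observed in this report, 2 GiB - 4 KiB.
constexpr std::int64_t kObservedCap = (std::int64_t{1} << 31) - 4096;
static_assert(kObservedCap == 2147479552, "2 GiB - 4 KiB");

// Return the (requested, copied) pair for each call that a retry loop
// would make to copy `size` bytes when no call transfers more than `cap`.
std::vector<std::pair<std::int64_t, std::int64_t>>
planCalls(std::int64_t size, std::int64_t cap)
{
    assert(size >= 0 && cap > 0);
    std::vector<std::pair<std::int64_t, std::int64_t>> calls;
    for (std::int64_t done = 0; done < size; ) {
        std::int64_t requested = size - done;
        std::int64_t copied = requested < cap ? requested : cap;
        calls.emplace_back(requested, copied);
        done += copied;
    }
    return calls;
}
```

Under that assumption, `planCalls(4294967297, kObservedCap)` yields three calls: the first two each copy 2147479552 bytes, and the last one is asked for and copies the remaining 8193 bytes.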
With the following change, we get:
newfstatat(15, "", {st_mode=S_IFREG|0644, st_size=4294967297, ...}, AT_EMPTY_PATH) = 0
copy_file_range(15, NULL, 16, NULL, 4294967297, 0) = 2147479552
copy_file_range(15, NULL, 16, NULL, 2147487745, 0) = 2147479552
copy_file_range(15, NULL, 16, NULL, 8193, 0) = 8193
fchown(16, 30001, 30000) = 0
Could you confirm that it works for you?
Thanks,
Ludo’.
[0001-daemon-Loop-over-copy_file_range-upon-short-writes.patch (text/x-patch, inline)]
From efd9f3383756df9959651125c0f2e2e769630851 Mon Sep 17 00:00:00 2001
Message-ID: <efd9f3383756df9959651125c0f2e2e769630851.1715594931.git.ludo <at> gnu.org>
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo <at> gnu.org>
Date: Mon, 13 May 2024 12:02:30 +0200
Subject: [PATCH] =?UTF-8?q?daemon:=20Loop=20over=20=E2=80=98copy=5Ffile=5F?=
=?UTF-8?q?range=E2=80=99=20upon=20short=20writes.?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Fixes <https://issues.guix.gnu.org/70877>.
* nix/libutil/util.cc (copyFile): Loop over ‘copy_file_range’ instead of
throwing upon short write.
Reported-by: Ricardo Wurmus <rekado <at> elephly.net>
Change-Id: Id7b8a65ea59006c2d91bc23732309a68665b9ca0
---
nix/libutil/util.cc | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/nix/libutil/util.cc b/nix/libutil/util.cc
index 578d6572934..3206dea11b1 100644
--- a/nix/libutil/util.cc
+++ b/nix/libutil/util.cc
@@ -397,9 +397,14 @@ static void copyFile(int sourceFd, int destinationFd)
     } else {
         if (result < 0)
             throw SysError(format("copy_file_range `%1%' to `%2%'") % sourceFd % destinationFd);
-        if (result < st.st_size)
-            throw SysError(format("short write in copy_file_range `%1%' to `%2%'")
-                           % sourceFd % destinationFd);
+
+        /* If 'copy_file_range' copied less than requested, try again.  */
+        for (ssize_t copied = result; copied < st.st_size; copied += result) {
+            result = copy_file_range(sourceFd, NULL, destinationFd, NULL,
+                                     st.st_size - copied, 0);
+            if (result < 0)
+                throw SysError(format("copy_file_range `%1%' to `%2%'") % sourceFd % destinationFd);
+        }
     }
 }
base-commit: 89cd778f6a45cd9b43a4dc1f236dcd0a87af955c
--
2.41.0
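For readers who want to try the retry idea outside the daemon, here is a minimal standalone sketch of the same technique. It is not the daemon's `copyFile`: `copyWholeFile` and `copyPath` are hypothetical helpers, errors are reported with `std::system_error` instead of `SysError`, and the branch that precedes `} else {` in the patch is not reproduced. Unlike the patch, the sketch also stops when `copy_file_range` returns 0, which is an assumption about how a source file that shrinks during the copy should be handled. It assumes Linux with a glibc that provides `copy_file_range`.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // exposes copy_file_range in <unistd.h>
#endif
#include <cassert>
#include <cerrno>
#include <stdexcept>
#include <string>
#include <system_error>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: copy all of sourceFd to destinationFd, calling
// copy_file_range again whenever it copies less than requested.
void copyWholeFile(int sourceFd, int destinationFd)
{
    struct stat st;
    if (fstat(sourceFd, &st) == -1)
        throw std::system_error(errno, std::generic_category(), "fstat");

    off_t copied = 0;
    while (copied < st.st_size) {
        ssize_t result = copy_file_range(sourceFd, nullptr, destinationFd, nullptr,
                                         st.st_size - copied, 0);
        if (result < 0)
            throw std::system_error(errno, std::generic_category(), "copy_file_range");
        if (result == 0)
            // End of input before st_size bytes were copied.
            throw std::runtime_error("source file truncated during copy");
        copied += result;
    }
}

// Hypothetical convenience wrapper working on file names.
void copyPath(const std::string &from, const std::string &to)
{
    int in = open(from.c_str(), O_RDONLY | O_CLOEXEC);
    if (in == -1)
        throw std::system_error(errno, std::generic_category(), "open " + from);
    int out = open(to.c_str(), O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0600);
    if (out == -1) {
        int saved = errno;
        close(in);
        throw std::system_error(saved, std::generic_category(), "open " + to);
    }
    try {
        copyWholeFile(in, out);
    } catch (...) {
        close(in);
        close(out);
        throw;
    }
    close(in);
    close(out);
}
```

With the reproducer's file, something like `copyPath("/tmp/chungus", "/tmp/chungus.copy")` should then need more than one `copy_file_range` call, which can be observed with `strace`.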
Information forwarded to bug-guix <at> gnu.org: bug#70877; Package guix. (Mon, 13 May 2024 12:11:02 GMT)
Message #16 received at 70877 <at> debbugs.gnu.org:
Ludovic Courtès <ludo <at> gnu.org> writes:
> Could you confirm that it works for you?
I've applied this locally, started the new daemon, and used it to build
the 4+GB source code derivation of a big package that used to fail
before. It works now. Thank you!
--
Ricardo
Information forwarded to bug-guix <at> gnu.org: bug#70877; Package guix. (Mon, 13 May 2024 14:35:02 GMT)
Message #19 received at 70877 <at> debbugs.gnu.org:
Ricardo Wurmus <rekado <at> elephly.net> skribis:
> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Could you confirm that it works for you?
>
> I've applied this locally, started the new daemon, and used it to build
> the 4+GB source code derivation of a big package that used to fail
> before. It works now. Thank you!
Pushed as 7757fdd491862fa5c33f1f894503346b89898a01.
I’ll update the ‘guix’ package to make the fix available.
Thanks for testing!
Ludo’.
Reply sent to Ludovic Courtès <ludo <at> gnu.org>: You have taken responsibility. (Mon, 13 May 2024 16:25:02 GMT)
Notification sent to Ricardo Wurmus <rekado <at> elephly.net>: bug acknowledged by developer. (Mon, 13 May 2024 16:25:02 GMT)
Message #24 received at 70877-done <at> debbugs.gnu.org:
Ludovic Courtès <ludo <at> gnu.org> skribis:
> Pushed as 7757fdd491862fa5c33f1f894503346b89898a01.
>
> I’ll update the ‘guix’ package to make the fix available.
Done in 58be9a79e2862d5fa9842d73f498ce2e5442b9ce.
Ludo'.
Information forwarded to bug-guix <at> gnu.org: bug#70877; Package guix. (Tue, 14 May 2024 22:27:02 GMT)
Message #27 received at 70877 <at> debbugs.gnu.org:
BTW, the newly updated ‘guix’ package is 8% smaller, as a result of
<https://issues.guix.gnu.org/70398>:
--8<---------------cut here---------------start------------->8---
$ guix describe
Generation 302 May 12 2024 23:29:11 (current)
guix 89cd778
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 89cd778f6a45cd9b43a4dc1f236dcd0a87af955c
$ guix size guix |head -2
store item                                                          total    self
/gnu/store/r96xq0064nqf43ygcr7z9lgb18vrd1wa-guix-1.4.0-18.4c94b9e   705.8   400.6  56.8%
$ ./pre-inst-env guix size guix |head -2
store item                                                          total    self
/gnu/store/mcw1d2zy96is5ymjj903i3bi5a0qdwr5-guix-1.4.0-19.7ca9809   673.8   368.7  54.7%
$ git log |head -1
commit 58be9a79e2862d5fa9842d73f498ce2e5442b9ce
--8<---------------cut here---------------end--------------->8---
Ludo’.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 12 Jun 2024 11:24:18 GMT)