GNU bug report logs -
#31332
touch unnecessarily calls dup2
Previous Next
Reported by: John Steele Scott <toojays <at> toojays.net>
Date: Tue, 1 May 2018 15:28:02 UTC
Severity: normal
Tags: notabug
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 31332 in the body.
You can then email your comments to 31332 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-coreutils <at> gnu.org
:
bug#31332
; Package
coreutils
.
(Tue, 01 May 2018 15:28:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
John Steele Scott <toojays <at> toojays.net>
:
New bug report received and forwarded. Copy sent to
bug-coreutils <at> gnu.org
.
(Tue, 01 May 2018 15:28:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From https://stackoverflow.com/questions/40446555/why-does-touch-call-the-dup2-syscall
jscott <at> citra:/tmp$ touch --version | head -1
touch (GNU coreutils) 8.25
jscott <at> citra:/tmp$ strace -ttt touch foo 2>&1 | tail -9
1525170579.952032 open("foo", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
1525170579.952080 dup2(3, 0) = 0
1525170579.952119 close(3) = 0
1525170579.952156 utimensat(0, NULL, NULL, 0) = 0
1525170579.952209 close(0) = 0
1525170579.952257 close(1) = 0
1525170579.952294 close(2) = 0
1525170579.952333 exit_group(0) = ?
1525170579.952450 +++ exited with 0 +++
My analysis from that discussion:
It's a historical artifact.
The open()+dup2() pattern comes from the fd_reopen() function, which is used
by several of the programs in the coreutils code base.
Prior to coreutils commit e373bb1, fd_reopen() did not do open()+dup2(), but
closed the desired file descriptor before opening a new one. This was the
case when touch started using this function at coreutils commit 478bd89. Per
that commit message, the intent was to reduce the number of file descriptors
touch would have open.
In the scheme of things the extra call to dup2() is not a big deal. We've lived
with it for 10 years. On my laptop it would take 25,000 invocations of touch
before those unnecessary dup2() calls add up to a second. But touch is called a
lot, on systems all over the world. So maybe you guys care about this?
Cheers,
John
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#31332
; Package
coreutils
.
(Tue, 30 Oct 2018 03:18:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 31332 <at> debbugs.gnu.org (full text, mbox):
tags 31332 notabug
close 31332
stop
On 2018-05-01 4:38 a.m., John Steele Scott wrote:
> From https://stackoverflow.com/questions/40446555/why-does-touch-call-the-dup2-syscall
>
> jscott <at> citra:/tmp$ touch --version | head -1
> touch (GNU coreutils) 8.25
> jscott <at> citra:/tmp$ strace -ttt touch foo 2>&1 | tail -9
> 1525170579.952032 open("foo", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
> 1525170579.952080 dup2(3, 0) = 0
> 1525170579.952119 close(3) = 0
> 1525170579.952156 utimensat(0, NULL, NULL, 0) = 0
> 1525170579.952209 close(0) = 0
> 1525170579.952257 close(1) = 0
> 1525170579.952294 close(2) = 0
> 1525170579.952333 exit_group(0) = ?
> 1525170579.952450 +++ exited with 0 +++
>
> My analysis from that discussion:
>
> It's a historical artifact.
>
> The open()+dup2() pattern comes from the fd_reopen() function, which is used
> by several of the programs in the coreutils code base.
>
Not exactly an "artifact" - but an indented operation.
The goal of "fd_reopen" is not just to open a file,
but to ensure the returned file descriptor (an "int") has a specific
value.
For example, standard input (STDIN) is typically file descriptor value
0. calling fd_reopen first opens the file (the kernel returns file
descriptor 3), then it ensures file descriptor 0 is associated with the
same file (and then, calling "close(3)" gets rid of the other file
descriptor).
Checking how fd_reopen is used in coreutils, it is almost always used
to ensure STDIN/STDOUT point to the desired file(s):
====
$ git grep fd_reopen
src/csplit.c: if (! STREQ (name, "-") && fd_reopen (STDIN_FILENO, name,
O_RDONLY, 0) < 0)
src/dd.c:/* Restart on EINTR from fd_reopen(). */
src/dd.c:ifd_reopen (int desired_fd, char const *file, int flag, mode_t
mode)
src/dd.c: ret = fd_reopen (desired_fd, file, flag, mode);
src/dd.c: if (ifd_reopen (STDIN_FILENO, input_file, O_RDONLY |
input_flags, 0) < 0)
src/dd.c: || ifd_reopen (STDOUT_FILENO, output_file, O_RDWR |
opts, perms) < 0)
src/dd.c: && (ifd_reopen (STDOUT_FILENO, output_file, O_WRONLY
| opts, perms)
src/nohup.c: if (fd_reopen (STDIN_FILENO, "/dev/null", O_WRONLY, 0)
< 0)
src/nohup.c: ? fd_reopen (STDOUT_FILENO, file, flags, mode)
src/nohup.c: ? fd_reopen (STDOUT_FILENO, in_home,
flags, mode)
src/split.c: && fd_reopen (STDIN_FILENO, infile, O_RDONLY, 0) < 0)
src/stty.c: if (fd_reopen (STDIN_FILENO, device_name, O_RDONLY |
O_NONBLOCK, 0) < 0)
src/touch.c: fd = fd_reopen (STDIN_FILENO, file,
===
This means the rest of the program can just operate on STDIN (or STDOUT).
As such, I'm closing this item.
Discussion can continue by replying to this thread.
-assaf
Added tag(s) notabug.
Request was from
Assaf Gordon <assafgordon <at> gmail.com>
to
control <at> debbugs.gnu.org
.
(Tue, 30 Oct 2018 03:18:02 GMT)
Full text and
rfc822 format available.
bug closed, send any further explanations to
31332 <at> debbugs.gnu.org and John Steele Scott <toojays <at> toojays.net>
Request was from
Assaf Gordon <assafgordon <at> gmail.com>
to
control <at> debbugs.gnu.org
.
(Tue, 30 Oct 2018 03:18:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#31332
; Package
coreutils
.
(Tue, 30 Oct 2018 04:20:01 GMT)
Full text and
rfc822 format available.
Message #15 received at 31332 <at> debbugs.gnu.org (full text, mbox):
Hi Assif,
Thanks for your reply.
On 30/10/18 1:47 pm, Assaf Gordon wrote:
> tags 31332 notabug
> close 31332
> stop
>
> On 2018-05-01 4:38 a.m., John Steele Scott wrote:
>> From https://stackoverflow.com/questions/40446555/why-does-touch-call-the-dup2-syscall
>>
>> jscott <at> citra:/tmp$ touch --version | head -1
>> touch (GNU coreutils) 8.25
>> jscott <at> citra:/tmp$ strace -ttt touch foo 2>&1 | tail -9
>> 1525170579.952032 open("foo", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
>> 1525170579.952080 dup2(3, 0) = 0
>> 1525170579.952119 close(3) = 0
>> 1525170579.952156 utimensat(0, NULL, NULL, 0) = 0
>> 1525170579.952209 close(0) = 0
>> 1525170579.952257 close(1) = 0
>> 1525170579.952294 close(2) = 0
>> 1525170579.952333 exit_group(0) = ?
>> 1525170579.952450 +++ exited with 0 +++
>>
>> My analysis from that discussion:
>>
>> It's a historical artifact.
>>
>> The open()+dup2() pattern comes from the fd_reopen() function, which is used
>> by several of the programs in the coreutils code base.
>>
>
> Not exactly an "artifact" - but an indented operation.
>
I called it a historical artefact because the call to dup2() did not occur when touch started calling fd_reopen(). Prior to http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=e373bb19 fd_reopen() initially did "close(desired_fd); fd = open(...)" which would always do the right thing if stdin was the desired fd. No call to dup2().
> The goal of "fd_reopen" is not just to open a file,
> but to ensure the returned file descriptor (an "int") has a specific
> value.
>
> For example, standard input (STDIN) is typically file descriptor value 0. calling fd_reopen first opens the file (the kernel returns file descriptor 3), then it ensures file descriptor 0 is associated with the same file (and then, calling "close(3)" gets rid of the other file descriptor).
>
> Checking how fd_reopen is used in coreutils, it is almost always used
> to ensure STDIN/STDOUT point to the desired file(s):
> ====
> $ git grep fd_reopen
> src/csplit.c: if (! STREQ (name, "-") && fd_reopen (STDIN_FILENO, name, O_RDONLY, 0) < 0)
> src/dd.c:/* Restart on EINTR from fd_reopen(). */
> src/dd.c:ifd_reopen (int desired_fd, char const *file, int flag, mode_t mode)
> src/dd.c: ret = fd_reopen (desired_fd, file, flag, mode);
> src/dd.c: if (ifd_reopen (STDIN_FILENO, input_file, O_RDONLY | input_flags, 0) < 0)
> src/dd.c: || ifd_reopen (STDOUT_FILENO, output_file, O_RDWR | opts, perms) < 0)
> src/dd.c: && (ifd_reopen (STDOUT_FILENO, output_file, O_WRONLY | opts, perms)
> src/nohup.c: if (fd_reopen (STDIN_FILENO, "/dev/null", O_WRONLY, 0) < 0)
> src/nohup.c: ? fd_reopen (STDOUT_FILENO, file, flags, mode)
> src/nohup.c: ? fd_reopen (STDOUT_FILENO, in_home, flags, mode)
> src/split.c: && fd_reopen (STDIN_FILENO, infile, O_RDONLY, 0) < 0)
> src/stty.c: if (fd_reopen (STDIN_FILENO, device_name, O_RDONLY | O_NONBLOCK, 0) < 0)
> src/touch.c: fd = fd_reopen (STDIN_FILENO, file,
> ===
>
> This means the rest of the program can just operate on STDIN (or STDOUT).
I haven't looked at all those cases, but my point was that the call to dup2() adds an extra 40us to each invocation of touch. In the case of touch, it really only cares about the fd being STDOUT_FILENO or !STDOUT_FILENO; it doesn't need to be STDIN_FILENO exactly. With a small amount of work touch could use open() instead of fd_reopen(). The code would be somewhat simpler and it would save 40us of execution time.
Or maybe it wouldn't, since there's an extra fd open when the program terminates and we need to wait for the kernel to clean that up anyway? I don't know.
Cheers,
John
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#31332
; Package
coreutils
.
(Tue, 30 Oct 2018 05:39:02 GMT)
Full text and
rfc822 format available.
Message #18 received at 31332 <at> debbugs.gnu.org (full text, mbox):
John Steele Scott wrote:
> Prior tohttp://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=e373bb19 fd_reopen() initially did "close(desired_fd); fd = open(...)" which would always do the right thing
No, as the old code did the wrong thing for "touch /dev/stdin", whereas the
current code works.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#31332
; Package
coreutils
.
(Tue, 30 Oct 2018 06:25:01 GMT)
Full text and
rfc822 format available.
Message #21 received at 31332 <at> debbugs.gnu.org (full text, mbox):
On 30/10/18 4:08 pm, Paul Eggert wrote:
> John Steele Scott wrote:
>> Prior tohttp://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=e373bb19 fd_reopen() initially did "close(desired_fd); fd = open(...)" which would always do the right thing
>
> No, as the old code did the wrong thing for "touch /dev/stdin", whereas the current code works.
I stand corrected. Okay, I see that we don't want to close /dev/stdin at the start of touch. Thanks for making me think of this case.
I'd much much obliged if one of you could enlighten me as to why touch needs to treat /dev/stdin as a special case after the file has been opened though. Wouldn't the following work just as well, with one less system call?
--- a/src/touch.c
+++ b/src/touch.c
@@ -132,8 +132,8 @@ touch (const char *file)
else if (! (no_create || no_dereference))
{
/* Try to open FILE, creating it if necessary. */
- fd = fd_reopen (STDIN_FILENO, file,
- O_WRONLY | O_CREAT | O_NONBLOCK | O_NOCTTY, MODE_RW_UGO);
+ fd = open (file,
+ O_WRONLY | O_CREAT | O_NONBLOCK | O_NOCTTY, MODE_RW_UGO);
/* Don't save a copy of errno if it's EISDIR, since that would lead
touch to give a bogus diagnostic for e.g., 'touch /' (assuming
@@ -166,9 +166,9 @@ touch (const char *file)
(no_dereference && fd == -1) ? AT_SYMLINK_NOFOLLOW : 0)
== 0);
- if (fd == STDIN_FILENO)
+ if (fd != STDOUT_FILENO)
{
- if (close (STDIN_FILENO) != 0)
+ if (close (fd) != 0)
{
error (0, errno, _("failed to close %s"), quoteaf (file));
return false;
Cheers,
John
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#31332
; Package
coreutils
.
(Mon, 12 Nov 2018 22:24:01 GMT)
Full text and
rfc822 format available.
Message #24 received at 31332 <at> debbugs.gnu.org (full text, mbox):
John Steele Scott wrote:
> I'd much much obliged if one of you could enlighten me as to why touch needs to treat /dev/stdin as a special case after the file has been opened though. Wouldn't the following work just as well, with one less system call?
>
> --- a/src/touch.c
> +++ b/src/touch.c
> @@ -132,8 +132,8 @@ touch (const char *file)
> else if (! (no_create || no_dereference))
> {
> /* Try to open FILE, creating it if necessary. */
> - fd = fd_reopen (STDIN_FILENO, file,
> - O_WRONLY | O_CREAT | O_NONBLOCK | O_NOCTTY, MODE_RW_UGO);
> + fd = open (file,
> + O_WRONLY | O_CREAT | O_NONBLOCK | O_NOCTTY, MODE_RW_UGO);
>
> /* Don't save a copy of errno if it's EISDIR, since that would lead
> touch to give a bogus diagnostic for e.g., 'touch /' (assuming
> @@ -166,9 +166,9 @@ touch (const char *file)
> (no_dereference && fd == -1) ? AT_SYMLINK_NOFOLLOW : 0)
> == 0);
>
> - if (fd == STDIN_FILENO)
> + if (fd != STDOUT_FILENO)
> {
> - if (close (STDIN_FILENO) != 0)
> + if (close (fd) != 0)
That patch has trouble if 'open' returns 1. The resulting fd is never closed and
for some filesystems (NFS, for example) one needs to close the fd to find out
whether the fdutimensat call actually worked.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#31332
; Package
coreutils
.
(Mon, 12 Nov 2018 22:42:01 GMT)
Full text and
rfc822 format available.
Message #27 received at 31332 <at> debbugs.gnu.org (full text, mbox):
On 13/11/18 8:53 am, Paul Eggert wrote:
> John Steele Scott wrote:
>> I'd much much obliged if one of you could enlighten me as to why touch needs to treat /dev/stdin as a special case after the file has been opened though. Wouldn't the following work just as well, with one less system call?
>>
>> --- a/src/touch.c
>> +++ b/src/touch.c
>> @@ -132,8 +132,8 @@ touch (const char *file)
>> else if (! (no_create || no_dereference))
>> {
>> /* Try to open FILE, creating it if necessary. */
>> - fd = fd_reopen (STDIN_FILENO, file,
>> - O_WRONLY | O_CREAT | O_NONBLOCK | O_NOCTTY, MODE_RW_UGO);
>> + fd = open (file,
>> + O_WRONLY | O_CREAT | O_NONBLOCK | O_NOCTTY, MODE_RW_UGO);
>> /* Don't save a copy of errno if it's EISDIR, since that would lead
>> touch to give a bogus diagnostic for e.g., 'touch /' (assuming
>> @@ -166,9 +166,9 @@ touch (const char *file)
>> (no_dereference && fd == -1) ? AT_SYMLINK_NOFOLLOW : 0)
>> == 0);
>> - if (fd == STDIN_FILENO)
>> + if (fd != STDOUT_FILENO)
>> {
>> - if (close (STDIN_FILENO) != 0)
>> + if (close (fd) != 0)
>
> That patch has trouble if 'open' returns 1. The resulting fd is never closed and for some filesystems (NFS, for example) one needs to close the fd to find out whether the fdutimensat call actually worked.
Thanks Paul,
So if someone did want to wipe out that dup2 call from GNU touch, they'd need to track separately whether they opened the file or whether they are using a passed-in file descriptor, rather than relying on the value of fd to indicate that. Got it.
I'll stop flogging this dead horse now. Thanks both Paul and Assaf for your replies.
Cheers,
John
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Tue, 11 Dec 2018 12:24:04 GMT)
Full text and
rfc822 format available.
This bug report was last modified 5 years and 137 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.