GNU bug report logs - #13539
Use fdopendir, fstatat and readlinkat, for efficiency.

Previous Next

Package: emacs;

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Thu, 24 Jan 2013 08:28:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 13539 in the body.
You can then email your comments to 13539 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#13539; Package emacs. (Thu, 24 Jan 2013 08:28:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Paul Eggert <eggert <at> cs.ucla.edu>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 24 Jan 2013 08:28:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: bug-gnu-emacs <at> gnu.org
Cc: Eli Zaretskii <eliz <at> gnu.org>
Subject: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Thu, 24 Jan 2013 00:26:50 -0800
[Message part 1 (text/plain, inline)]
Attached is a patch that can greatly improve the performance of
directory-files-and-attributes, which I'd like to install
into the trunk.  The key idea is to use fstatat
instead of lstat; this reduces directory-reading
from an O(ND) to an O(N) operation, where N is the number
of directory entries and D is the length of
the directory's name.

This patch works on POSIXish hosts, but it'll no doubt
need a few changes to the MS-Windows makefiles to compile
fdopendir.c, fstatat.c, openat-die.c, save-cwd.c, openat-proc.c,
or whatever subset of these is needed on MS-Windows.
I'll CC: this to Eli to give him a heads-up.

This patch is relative to trunk bzr 111595.
[fdopendir.txt (text/plain, attachment)]

Added tag(s) patch. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Thu, 24 Jan 2013 09:21:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13539; Package emacs. (Thu, 24 Jan 2013 17:30:02 GMT) Full text and rfc822 format available.

Message #10 received at 13539 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 13539 <at> debbugs.gnu.org
Subject: bug#13539: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Thu, 24 Jan 2013 19:28:33 +0200
> Date: Thu, 24 Jan 2013 00:26:50 -0800
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: Eli Zaretskii <eliz <at> gnu.org>
> 
> Attached is a patch that can greatly improve the performance of
> directory-files-and-attributes, which I'd like to install
> into the trunk.  The key idea is to use fstatat
> instead of lstat; this reduces directory-reading
> from an O(ND) to an O(N) operation, where N is the number
> of directory entries and D is the length of
> the directory's name.

Is it really worth our while to spend energy on speeding up an API
that goes mostly unused in Emacs?  Its only heavy user is ls-lisp.el,
which is used on Windows to replace ls in Dired.  But the Windows
build will not be able to benefit from these changes anyway.  Just a
thought...

> This patch works on POSIXish hosts, but it'll no doubt
> need a few changes to the MS-Windows makefiles to compile
> fdopendir.c, fstatat.c, openat-die.c, save-cwd.c, openat-proc.c,
> or whatever subset of these is needed on MS-Windows.
> I'll CC: this to Eli to give him a heads-up.

I'm sorry, but this changeset is extremely unfriendly to the Windows
build.  It relies on the ability to open a directory, which is not
supported by 'open' on Windows; it uses the gnulib dirent replacement,
which is incompatible with the existing implementation in Emacs; it
uses sys/stat.h header and functions in ways that are incompatible
with their replacements in Emacs; and it requires to process many
header files with Sed, which was not required to build Emacs on
Windows until now.  All that to eventually fall back on opendir and
stat/lstat, since the *at functionality cannot be supported on Windows
anyway.  Fixing all this would require a lot of work for absolutely no
gain, which sounds like a waste of energy.

Instead, I suggest to use in dired.c functions that will accept both a
file name and a file descriptor, so that Posix hosts could use the
descriptor and ignore the file name, while on Windows Emacs will
ignore the descriptor and use the file name.  (This requires to retain
the code you suggest to delete that concatenates file names returned
by readdir with the parent directory; we could have that code
#ifdef'ed away for Posix hosts.)  That will still require some
adjustments in the w32 sources, but they will be 2 orders of magnitude
smaller and less complex, and will not hamper Posix hosts in any way.
The only downside will be somewhat more ugly code, but files like
dired.c already have quite a few #ifdef's, not surprisingly.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13539; Package emacs. (Mon, 28 Jan 2013 08:36:02 GMT) Full text and rfc822 format available.

Message #13 received at 13539 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13539 <at> debbugs.gnu.org
Subject: Re: bug#13539: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Mon, 28 Jan 2013 00:34:49 -0800
[Message part 1 (text/plain, inline)]
On 01/24/2013 09:28 AM, Eli Zaretskii wrote:

> Is it really worth our while to spend energy on speeding up an API
> that goes mostly unused in Emacs?  Its only heavy user is ls-lisp.el,

This patch also improves the performance of file-name-completion
-- perhaps you missed that part?  Anyway, on the POSIX side
it's a pretty simple patch to Emacs proper and is an obvious
performance win, and it'd be a shame if it couldn't be put in.

> I suggest to use in dired.c functions that will accept both a
> file name and a file descriptor,

Better than that, w32 can model the POSIX file descriptor by
using an int that represents the directory name.  The
attached patch does that.  It's a bit of a hack, but I don't
see why it wouldn't work.  If it doesn't work, we can fall
back on doing things the Gnulib way.  Gnulib uses 'int' to
emulate "open" directories on MS-Windows, and this works for
many other GNU applications.  It's probably cleaner to do it
that way in the long run, so this patch has a FIXME or two
about this.

In this patch, the MS-Windows port shouldn't need to worry
about the added source files; it should be able to ignore them all.
For example, it needn't worry about lib/dirent.in.h.

I haven't tested it on MS-Windows, though.
[fdopendis.txt (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13539; Package emacs. (Mon, 28 Jan 2013 14:39:02 GMT) Full text and rfc822 format available.

Message #16 received at 13539 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 13539 <at> debbugs.gnu.org
Subject: Re: bug#13539: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Mon, 28 Jan 2013 16:37:48 +0200
> Date: Mon, 28 Jan 2013 00:34:49 -0800
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 13539 <at> debbugs.gnu.org
> 
> > Is it really worth our while to spend energy on speeding up an API
> > that goes mostly unused in Emacs?  Its only heavy user is ls-lisp.el,
> 
> This patch also improves the performance of file-name-completion
> -- perhaps you missed that part?

I just didn't imagine that file-name-completion would need to call
'lstat'.  But now I see that it does, even twice in a row -- only so
it could support FOO/ in completion-ignored-extensions.  So perhaps a
further optimization would be to avoid the call to
file_name_completion_stat if completion-ignored-extensions is devoid
of elements that end in a slash.

> Better than that, w32 can model the POSIX file descriptor by
> using an int that represents the directory name.

Yes, this is indeed much simpler, thanks.  I have only one comment:

> +static DIR *
> +open_directory (char const *name, int *fdp)
> +{
> +  DIR *d;
> +  int fd, opendir_errno;
> +
> +  block_input ();
> +
> +#ifdef DOS_NT
> +  /* Directories cannot be opened, so emulate an open directory with
> +     its name, cast to a pointer.  This is good enough for Emacs.  */
> +  fd = (int) name;
          ^^^^^
I think we should use ptrdiff_t rather than int here, since that could
make a difference in 64-bit builds on Windows.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13539; Package emacs. (Wed, 30 Jan 2013 00:58:02 GMT) Full text and rfc822 format available.

Message #19 received at 13539 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13539 <at> debbugs.gnu.org
Subject: Re: bug#13539: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Tue, 29 Jan 2013 16:56:37 -0800
[Message part 1 (text/plain, inline)]
On 01/28/2013 06:37 AM, Eli Zaretskii wrote:
> perhaps a
> further optimization would be to avoid the call to
> file_name_completion_stat if completion-ignored-extensions is devoid
> of elements that end in a slash.

Yes, something like that might make sense.

>> +  fd = (int) name;
>           ^^^^^
> I think we should use ptrdiff_t rather than int here, since that could
> make a difference in 64-bit builds on Windows.

Unfortunately that won't work because the existing API uses int elsewhere
and we can't easily change int to ptrdiff_t there.
But there's another approach: simply set fd = 0, and use 
dir_pathname inside w32.c.  This should be good enough for what Emacs needs.
Revised patch attached.
[fdopendis.txt (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13539; Package emacs. (Wed, 30 Jan 2013 17:29:01 GMT) Full text and rfc822 format available.

Message #22 received at 13539 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 13539 <at> debbugs.gnu.org
Subject: Re: bug#13539: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Wed, 30 Jan 2013 19:27:32 +0200
> Date: Tue, 29 Jan 2013 16:56:37 -0800
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 13539 <at> debbugs.gnu.org
> 
> But there's another approach: simply set fd = 0, and use 
> dir_pathname inside w32.c.  This should be good enough for what Emacs needs.
> Revised patch attached.

Works for me, thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13539; Package emacs. (Wed, 30 Jan 2013 19:25:01 GMT) Full text and rfc822 format available.

Message #25 received at 13539 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 13539 <at> debbugs.gnu.org
Subject: Re: bug#13539: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Wed, 30 Jan 2013 14:24:06 -0500
>> perhaps a further optimization would be to avoid the call to
>> file_name_completion_stat if completion-ignored-extensions is devoid
>> of elements that end in a slash.
> Yes, something like that might make sense.

Maybe.  But maybe not.  The default value of
completion-ignored-extensions includes elements that end in a slash, and
I think fairly few uses of try-completion rebind it to something else.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13539; Package emacs. (Wed, 30 Jan 2013 20:26:01 GMT) Full text and rfc822 format available.

Message #28 received at 13539 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 13539 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu
Subject: Re: bug#13539: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Wed, 30 Jan 2013 22:24:14 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  13539 <at> debbugs.gnu.org
> Date: Wed, 30 Jan 2013 14:24:06 -0500
> 
> >> perhaps a further optimization would be to avoid the call to
> >> file_name_completion_stat if completion-ignored-extensions is devoid
> >> of elements that end in a slash.
> > Yes, something like that might make sense.
> 
> Maybe.  But maybe not.  The default value of
> completion-ignored-extensions includes elements that end in a slash, and
> I think fairly few uses of try-completion rebind it to something else.

I agree.  It just sounds excessive to invoke lstat/stat _twice_ just
to support this one feature.

Perhaps at least systems that have d_type could avoid one of these 2
calls.  (d_type can easily be added to the Windows emulation.)




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Fri, 01 Feb 2013 06:43:02 GMT) Full text and rfc822 format available.

Notification sent to Paul Eggert <eggert <at> cs.ucla.edu>:
bug acknowledged by developer. (Fri, 01 Feb 2013 06:43:03 GMT) Full text and rfc822 format available.

Message #33 received at 13539-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13539-done <at> debbugs.gnu.org
Subject: Re: bug#13539: Use fdopendir, fstatat and readlinkat, for efficiency.
Date: Thu, 31 Jan 2013 22:42:04 -0800
On 01/30/2013 09:27 AM, Eli Zaretskii wrote:
> Works for me, thanks.

OK, pushed into the trunk as bzr 111646,
and marking this bug as done.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 01 Mar 2013 12:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 11 years and 79 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.