GNU bug report logs -
#54532
Faster sorting
Previous Next
Reported by: Andrew Cohen <acohen <at> ust.hk>
Date: Wed, 23 Mar 2022 00:00:02 UTC
Severity: wishlist
Tags: patch
Fixed in version 29.1
Done: Stefan Kangas <stefan <at> marxist.se>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 54532 in the body.
You can then email your comments to 54532 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Wed, 23 Mar 2022 00:00:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Andrew Cohen <acohen <at> ust.hk>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Wed, 23 Mar 2022 00:00:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Tags: patch
As discussed on emacs-devel I have been working on a replacement for the
current sorting algorithm in Emacs. With the help of Mattias Engdegård
we have made a lot of progress, and the attached patch implements
TIMSORT, the sorting algorithm introduced 20 years ago in python and now
used in Android and many other places. This implementation is pretty
much always 20% to 30% faster than the current version, and in many
cases is an order of magnitude faster. The current Emacs code treats
lists and vectors differently, while the new implementation uses a
common code path. Some benchmarks (times in microseconds; all lists are
length 10K):
| | oldlist | oldvec | tim |
| (make-random-list 10000) | 2790 | 2123 | 1557 |
| (nreverse (make-sorted-list 10000)) | 1417 | 987 | 118 |
| (make-sorted-list 10000) | 1310 | 899 | 116 |
| (make-swapped-list 10000 3) | 1457 | 1019 | 122 |
| (make-plus-list 10000) | 1309 | 899 | 119 |
| (make-onepercent-list 10000) | 1764 | 1272 | 183 |
| (make-constant-list 10000) | 1292 | 888 | 116 |
| (make-evil-list 10000) | 1374 | 946 | 398 |
| (make-block-list 10000 100) | 2235 | 1646 | 919 |
| (make-block-list 10000 10) | 2598 | 1962 | 1451 |
The patch has 4 parts:
1. Add a new `record_unwind_protect_ptr_mark` function for use with C data
structures that use the specpdl for clean-up but also contain possibly
unique references to Lisp objects. This is needed for the dynamic
memory management that the new algorithm uses.
2. The actual sorting change. This removes the old sorting routines and
puts the new implementation in a separate file, sort.c
3. A bunch of new unit tests.
4. An optimization that resolves the sorting comparison symbol into the
corresponding function before starting the sort.
In GNU Emacs 29.0.50 (build 5, x86_64-pc-linux-gnu, GTK+ Version 3.24.33, cairo version 1.16.0)
of 2022-03-22 built on clove
Repository revision: c3c1ee56a44541e1eb2fd7e523f7508fd330d635
Repository branch: scratch/local
System Description: Debian GNU/Linux bookworm/sid
Configured using:
'configure --with-x-toolkit=gtk3 --with-native-compilation --with-pgtk
--with-xwidgets'
[patch.diff (text/patch, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Wed, 23 Mar 2022 12:03:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 54532 <at> debbugs.gnu.org (full text, mbox):
Andrew Cohen <acohen <at> ust.hk> writes:
> | | oldlist | oldvec | tim |
> | (make-random-list 10000) | 2790 | 2123 | 1557 |
> | (nreverse (make-sorted-list 10000)) | 1417 | 987 | 118 |
> | (make-sorted-list 10000) | 1310 | 899 | 116 |
> | (make-swapped-list 10000 3) | 1457 | 1019 | 122 |
> | (make-plus-list 10000) | 1309 | 899 | 119 |
> | (make-onepercent-list 10000) | 1764 | 1272 | 183 |
> | (make-constant-list 10000) | 1292 | 888 | 116 |
> | (make-evil-list 10000) | 1374 | 946 | 398 |
> | (make-block-list 10000 100) | 2235 | 1646 | 919 |
> | (make-block-list 10000 10) | 2598 | 1962 | 1451 |
Wow, great! A tenfold speed increase on (mostly-)sorted lists (which is a
common use case in my experience) is impressive.
Reading the code, I don't really have any comments (but then again, I
don't really understand the timsort algorithm anyway).
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Wed, 23 Mar 2022 13:31:06 GMT)
Full text and
rfc822 format available.
Message #11 received at 54532 <at> debbugs.gnu.org (full text, mbox):
> From: Andrew Cohen <acohen <at> ust.hk>
> Date: Wed, 23 Mar 2022 07:59:11 +0800
>
> 1. Add a new `record_unwind_protect_ptr_mark` function for use with C data
> structures that use the specpdl for clean-up but also contain possibly
> unique references to Lisp objects. This is needed for the dynamic
> memory management that the new algorithm uses.
Can you tell more why this was needed? Emacs has conservative stack
marking, so any Lisp object that is referred by some stack-based
variable should be protected from GC. Why isn't that enough in this
case?
> 4. An optimization that resolves the sorting comparison symbol into the
> corresponding function before starting the sort.
Any reasons this is a separate patch?
Thanks a lot for working on this.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Wed, 23 Mar 2022 13:47:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 54532 <at> debbugs.gnu.org (full text, mbox):
> From: Andrew Cohen <acohen <at> ust.hk>
> Date: Wed, 23 Mar 2022 07:59:11 +0800
>
> The patch has 4 parts:
>
> 1. Add a new `record_unwind_protect_ptr_mark` function for use with C data
> structures that use the specpdl for clean-up but also contain possibly
> unique references to Lisp objects. This is needed for the dynamic
> memory management that the new algorithm uses.
> 2. The actual sorting change. This removes the old sorting routines and
> puts the new implementation in a separate file, sort.c
> 3. A bunch of new unit tests.
> 4. An optimization that resolves the sorting comparison symbol into the
> corresponding function before starting the sort.
Thanks, a few minor stylistic issues:
> +/* BINARYSORT() is a stable binary insertion sort used for sorting the
> + list starting at LO and ending at HI. On entry, LO <= START <= HI,
> + and [LO, START) is already sorted (pass START == LO if you don't
> + know!). Even in case of error, the output slice will be some
> + permutation of the input (nothing is lost or duplicated). */
Our style of such comments is to phrase them as doc strings. Which
means, among other things:
. no need to name the function, since the comment is right in front
of it;
. the style should be imperative mood: "sort the list starting at LO
and ending and HI using a stable binary insertion sort algorithm",
etc.
One other nit: we dislike the FOO() style to mean that FOO is a
function. That's because FOO() looks like a call to a function with
no arguments, which is not what you mean. We use 'FOO' instead. (Not
that you should need to refer to functions too much if you follow our
style conventions.)
Please fix your commentary that documents functions with the above
rules in mind.
> +static ptrdiff_t
> +count_run (merge_state *ms, Lisp_Object *lo, const Lisp_Object *hi, bool *descending)
This line is too long, please break it in two.
> + eassume (lastofs == ofs); /* Then a[ofs-1] < key <= a[ofs]. */
> + return ofs;
> +}
Is this really eassume, or is eassert better here?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Wed, 23 Mar 2022 20:25:02 GMT)
Full text and
rfc822 format available.
Message #17 received at 54532 <at> debbugs.gnu.org (full text, mbox):
> Can you tell more why this was needed? Emacs has conservative stack marking, so any Lisp object that is referred by some stack-based variable should be protected from GC. Why isn't that enough in this case?
Because Lisp values are temporarily moved out to a heap allocated buffer which isn't traversed by the stack scanner. `record_unwind_protect_ptr_mark` can be seen as a generalisation of `record_unwind_protect_ptr` (because that's what it is).
> Is this really eassume, or is eassert better here?
No, eassume is appropriate here. It provides more optimisation opportunities.
In fact, most uses of eassert are better or just as good as eassume as long as there are no function calls or tricky memory dereferences.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Wed, 23 Mar 2022 23:33:01 GMT)
Full text and
rfc822 format available.
Message #20 received at 54532 <at> debbugs.gnu.org (full text, mbox):
>>>>> "EZ" == Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Andrew Cohen <acohen <at> ust.hk> Date: Wed, 23 Mar 2022
>> 07:59:11 +0800
>>
>> The patch has 4 parts:
[...]
EZ> Thanks, a few minor stylistic issues:
[...]
EZ> Please fix your commentary that documents functions with the
EZ> above rules in mind.
Will do.
>> +static ptrdiff_t +count_run (merge_state *ms, Lisp_Object *lo,
>> const Lisp_Object *hi, bool *descending)
EZ> This line is too long, please break it in two.
Yes.
>> + eassume (lastofs == ofs); /* Then a[ofs-1] < key <= a[ofs]. */
>> + return ofs; +}
EZ> Is this really eassume, or is eassert better here?
See the comment by Mattias (this is outside my area of expertise).
--
Andrew Cohen
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Wed, 23 Mar 2022 23:44:01 GMT)
Full text and
rfc822 format available.
Message #23 received at 54532 <at> debbugs.gnu.org (full text, mbox):
>>>>> "EZ" == Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Andrew Cohen <acohen <at> ust.hk> Date: Wed, 23 Mar 2022
>> 07:59:11 +0800
>>
>> 1. Add a new `record_unwind_protect_ptr_mark` function for use
>> with C data structures that use the specpdl for clean-up but also
>> contain possibly unique references to Lisp objects. This is
>> needed for the dynamic memory management that the new algorithm
>> uses.
EZ> Can you tell more why this was needed? Emacs has conservative
EZ> stack marking, so any Lisp object that is referred by some
EZ> stack-based variable should be protected from GC. Why isn't
EZ> that enough in this case?
Mattias responded, but just to recap---during the merging of sublists
additional memory is dynamically xalloc'ed (and later xfree'd) on the
heap, and there can be brief periods where the original Lisp_Objects
exist /only/ in the heap. The alternative (which we mentioned briefly in
earlier discussions) would be to allocate the maximum amount of needed
memory at the outset with SAFE_ALLOCA_LISP. This is what the current
sorting routine does. But for many (maybe most?) actual calls to timsort no
additional memory is needed so the dynamic allocation makes
significantly better use of resources. Mattias explained the issue to me
and once I understood it found it was easy to trigger; there is a
specific test in the unit test patch to check for this issue.
>> 4. An optimization that resolves the sorting comparison symbol
>> into the corresponding function before starting the sort.
EZ> Any reasons this is a separate patch?
Just to make it easier to look at the rather lengthy code by segregating
things that are self-contained. If/when we decide to put this in I
would probably squash down to two (or maybe 3) commits.
Thanks for the thorough look at the code, I continue to learn a lot!
Best,
Andy
--
Andrew Cohen
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 24 Mar 2022 06:43:02 GMT)
Full text and
rfc822 format available.
Message #26 received at 54532 <at> debbugs.gnu.org (full text, mbox):
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Wed, 23 Mar 2022 21:24:16 +0100
> Cc: Andrew Cohen <acohen <at> ust.hk>, 54532 <at> debbugs.gnu.org
>
> > Can you tell more why this was needed? Emacs has conservative stack marking, so any Lisp object that is referred by some stack-based variable should be protected from GC. Why isn't that enough in this case?
>
> Because Lisp values are temporarily moved out to a heap allocated buffer which isn't traversed by the stack scanner. `record_unwind_protect_ptr_mark` can be seen as a generalisation of `record_unwind_protect_ptr` (because that's what it is).
I understand that these objects are on the heap, but AFAIU a reference
to them is on the stack, isn't it?
> > Is this really eassume, or is eassert better here?
>
> No, eassume is appropriate here. It provides more optimisation opportunities.
> In fact, most uses of eassert are better or just as good as eassume as long as there are no function calls or tricky memory dereferences.
I asked about a specific instance, not in general. That instance was
at the end of the function, right before it returns, and I wonder what
kind of optimization opportunities that could present.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 24 Mar 2022 07:24:02 GMT)
Full text and
rfc822 format available.
Message #29 received at 54532 <at> debbugs.gnu.org (full text, mbox):
>>>>> "EZ" == Eli Zaretskii <eliz <at> gnu.org> writes:
[...]
>> Because Lisp values are temporarily moved out to a heap allocated
>> buffer which isn't traversed by the stack
>> scanner. `record_unwind_protect_ptr_mark` can be seen as a
>> generalisation of `record_unwind_protect_ptr` (because that's
>> what it is).
EZ> I understand that these objects are on the heap, but AFAIU a
EZ> reference to them is on the stack, isn't it?
Not always. The algorithm merges sublists A and B (which are adjacent to
each other on the stack). Doing this entirely in place is very slow
(lots of swaps) so we use temp memory of size A (assume A is the shorter
list). The idea is to copy the A list to the heap memory, and then use
the standard merge algorithm between the heap version of A and (stack
version of) B, filling in the space where A used to reside in the
stack. The issue is that since we are filling in areas in the stack that
were once occupied by A we can end up with some of the original
Lisp_Objects residing only in the /copy/ of A on the heap. Once the
merging is done, everyone is at home back on the stack.
Maybe a simple example: A = (A1, A2) and B=(B1, B2) with merged ordering (A1,
B1, A2, B2). Here is the merge sequence:
(A1, A2, B1, B2) (start)
(A1, A2, B1, B2) (copy A1 from the heap to the first space)
(A1, B1, B1, B2) (copy B1 from the stack to the second place)
(A1, B1, A2, B2) (copy A2 from the heap to the third place and finish)
Notice that in the third step A2 exists only in the heap copy.
Best,
Andy
--
Andrew Cohen
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 24 Mar 2022 08:57:02 GMT)
Full text and
rfc822 format available.
Message #32 received at 54532 <at> debbugs.gnu.org (full text, mbox):
> From: Andrew Cohen <acohen <at> ust.hk>
> Cc: Mattias Engdegård <mattiase <at> acm.org>,
> 54532 <at> debbugs.gnu.org
> Date: Thu, 24 Mar 2022 15:22:50 +0800
>
> EZ> I understand that these objects are on the heap, but AFAIU a
> EZ> reference to them is on the stack, isn't it?
>
> Not always. The algorithm merges sublists A and B (which are adjacent to
> each other on the stack). Doing this entirely in place is very slow
> (lots of swaps) so we use temp memory of size A (assume A is the shorter
> list). The idea is to copy the A list to the heap memory, and then use
> the standard merge algorithm between the heap version of A and (stack
> version of) B, filling in the space where A used to reside in the
> stack. The issue is that since we are filling in areas in the stack that
> were once occupied by A we can end up with some of the original
> Lisp_Objects residing only in the /copy/ of A on the heap. Once the
> merging is done, everyone is at home back on the stack.
If this is just temporary, then how about disabling GC (using
inhibit_garbage_collection) during this stage instead? Do we expect
to have a lot of garbage generated while this particular part of the
code runs? Does this code call some Lisp about whose consing behavior
which we cannot assume anything? If we can be certain we won't have a
lot of garbage during this stage, disabling GC is just TRT, IMO.
In general, having marked Lisp objects should be avoided, because
examining those objects in GDB is tricky at best, and because some
functions/macros will hit assertions if passed such an object. Two
examples are ASIZE and ASET. When inside GC, the use of marked
objects is unavoidable, but AFAIU this patch also marks those Lisp
objects while they are in the heap, and GC is not necessarily running,
is that right? So from my POV we should try to avoid this quite
unusual practice, if that's feasible.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 24 Mar 2022 09:18:02 GMT)
Full text and
rfc822 format available.
Message #35 received at 54532 <at> debbugs.gnu.org (full text, mbox):
>>>>> "EZ" == Eli Zaretskii <eliz <at> gnu.org> writes:
[...]
EZ> If this is just temporary, then how about disabling GC (using
EZ> inhibit_garbage_collection) during this stage instead?
I don't see why that wouldn't work, but I would defer to others who
understand these issues better then me.
EZ> Do we expect to have a lot of garbage generated while this
EZ> particular part of the code runs? Does this code call some Lisp
EZ> about whose consing behavior which we cannot assume anything?
EZ> If we can be certain we won't have a lot of garbage during this
EZ> stage, disabling GC is just TRT, IMO.
The only lisp that gets called is the predicate used in the merging
(deciding which element precedes and which follows); in principle this
predicate can be any lisp code so I don't think we can assume much about
its behavior, and while doing a lot of consing is probably rare I don't
think we can rule it out.
EZ> In general, having marked Lisp objects should be avoided,
EZ> because examining those objects in GDB is tricky at best, and
EZ> because some functions/macros will hit assertions if passed such
EZ> an object. Two examples are ASIZE and ASET. When inside GC,
EZ> the use of marked objects is unavoidable, but AFAIU this patch
EZ> also marks those Lisp objects while they are in the heap, and GC
EZ> is not necessarily running, is that right?
No, I think they only get marked during the marking phase of a
GC. Mattias' implementation simply adds a callback to do the actual marking
that is invoked in mark_specpdl:
case SPECPDL_UNWIND_PTR:
if (pdl->unwind_ptr.mark)
pdl->unwind_ptr.mark (pdl->unwind_ptr.arg);
break;
Best,
Andy
--
Andrew Cohen
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 24 Mar 2022 09:38:01 GMT)
Full text and
rfc822 format available.
Message #38 received at 54532 <at> debbugs.gnu.org (full text, mbox):
24 mars 2022 kl. 07.42 skrev Eli Zaretskii <eliz <at> gnu.org>:
> That instance was
> at the end of the function, right before it returns, and I wonder what
> kind of optimization opportunities that could present.
I don't think we need to justify every single `eassume` on the concrete utility for a compiler; in general, the more information we give it, the better code it can produce. It just doesn't hurt to do so.
In fact, the only reason we have `eassert` at all is for assertions that may be time-consuming or otherwise affect the execution (that is, expressions that the compiler just can't optimise away). For anything else, `eassume` is strictly better since it does all that `eassert` does, but with the extra optimisation hints.
Now in this concrete case, we state that `lastofs` and `ofs` are equal at the point when we are about to return `ofs`, and that gives the compiler the option to return `lastofs` instead, should that be more convenient in some way.
The compiler also knows that lastofs >= ofs because of the loop condition, which means that it can deduce that lastofs > ofs can never occur which can have various uses -- for example, in the statement
ptrdiff_t m = lastofs + ((ofs - lastofs) >> 1);
it would know that the argument being shifted is nonnegative, which might be useful in instruction selection. And so on.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 24 Mar 2022 09:56:01 GMT)
Full text and
rfc822 format available.
Message #41 received at 54532 <at> debbugs.gnu.org (full text, mbox):
24 mars 2022 kl. 10.17 skrev Andrew Cohen <acohen <at> ust.hk>:
> No, I think they only get marked during the marking phase of a
> GC. Mattias' implementation simply adds a callback to do the actual marking
> that is invoked in mark_specpdl:
>
> case SPECPDL_UNWIND_PTR:
> if (pdl->unwind_ptr.mark)
> pdl->unwind_ptr.mark (pdl->unwind_ptr.arg);
> break;
Right -- it's really just a generalisation of the existing `record_unwind_protect_ptr` and fills a gap that wasn't properly handled before. The other unwind handlers in the family already take care of marking:
record_unwind_protect:
takes a single Lisp_Object, which is marked.
record_unwind_protect_array:
takes a heap-allocated array of Lisp_Objects, which are all marked.
record_unwind_protect_int:
takes an int, which does not need marking.
and so on. `record_unwind_protect_ptr` stands out because it takes a generic pointer and unwind function, but does not account for the possibility that anything referred to by that pointer may need marking. `record_unwind_protect_ptr_mark` plugs that hole in a very natural way.
I don't think we should disable the GC during sorting; the predicate function is not restricted in what it can do. It could build data structures, throw, call sort on another sequence, switch threads, and so on.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 31 Mar 2022 12:04:01 GMT)
Full text and
rfc822 format available.
Message #44 received at 54532 <at> debbugs.gnu.org (full text, mbox):
Re-skimming this thread, it seems like Eli's questions were answered
(but there were a couple of very minor stylistic issues to be fixed).
So I guess this is ready for merging now (after fixing those couple
issues)? Eli?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 31 Mar 2022 13:59:02 GMT)
Full text and
rfc822 format available.
Message #47 received at 54532 <at> debbugs.gnu.org (full text, mbox):
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: 54532 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
> Date: Thu, 31 Mar 2022 14:03:05 +0200
>
> Re-skimming this thread, it seems like Eli's questions were answered
> (but there were a couple of very minor stylistic issues to be fixed).
>
> So I guess this is ready for merging now (after fixing those couple
> issues)? Eli?
Yes, I think as soon as Andrew comes up with an updated patch, we can
install this.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Thu, 31 Mar 2022 23:48:02 GMT)
Full text and
rfc822 format available.
Message #50 received at 54532 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
>>>>> "EZ" == Eli Zaretskii <eliz <at> gnu.org> writes:
[...]
EZ> Yes, I think as soon as Andrew comes up with an updated patch,
EZ> we can install this.
Attached is the patch for the items that Eli identified earlier
(improvements in comments and breaking one long line). I'll wait for any
further changes/comments. Then I think its best to squash this down to 2
commits: one for the GC marking, and all the rest related to
sorting. (I'm happy to use more commits if you think the predicate
resolution or the unit tests deserve their own patches).
Best,
Andy
[patch.diff (text/x-diff, inline)]
From 551352052021c92bf75001b07ad714454aad7706 Mon Sep 17 00:00:00 2001
From: Andrew G Cohen <cohen <at> andy.bu.edu>
Date: Fri, 1 Apr 2022 07:38:48 +0800
Subject: [PATCH] ; * src/sort.c: Improve comments.
---
src/sort.c | 155 +++++++++++++++++++++++++++--------------------------
1 file changed, 78 insertions(+), 77 deletions(-)
diff --git a/src/sort.c b/src/sort.c
index 48e106e92d..694d826853 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -117,10 +117,10 @@ inorder (const Lisp_Object predicate, const Lisp_Object a, const Lisp_Object b)
}
-/* BINARYSORT() is a stable binary insertion sort used for sorting the
- list starting at LO and ending at HI. On entry, LO <= START <= HI,
- and [LO, START) is already sorted (pass START == LO if you don't
- know!). Even in case of error, the output slice will be some
+/* Sort the list starting at LO and ending at HI using a stable binary
+ insertion sort algorithm. On entry the sublist [LO, START) (with
+ START between LO and HIGH) is known to be sorted (pass START == LO
+ if you are unsure). Even in case of error, the output will be some
permutation of the input (nothing is lost or duplicated). */
static void
@@ -154,17 +154,18 @@ binarysort (merge_state *ms, Lisp_Object *lo, const Lisp_Object *hi,
}
-/* COUNT_RUN() returns the length of the run beginning at LO, in the
- slice [LO, HI) with LO < HI. A "run" is the longest
+/* Find and return the length of the "run" (the longest
non-decreasing sequence or the longest strictly decreasing
sequence, with the Boolean *DESCENDING set to 0 in the former
- case, or to 1 in the latter. The strictness of the definition of
- "descending" is needed so that the caller can safely reverse a
- descending sequence without violating stability (strict > ensures
- there are no equal elements to get out of order). */
+ case, or to 1 in the latter) beginning at LO, in the slice [LO,
+ HI) with LO < HI. The strictness of the definition of
+ "descending" ensures there are no equal elements to get out of
+ order so the caller can safely reverse a descending sequence
+ without violating stability. */
static ptrdiff_t
-count_run (merge_state *ms, Lisp_Object *lo, const Lisp_Object *hi, bool *descending)
+count_run (merge_state *ms, Lisp_Object *lo, const Lisp_Object *hi,
+ bool *descending)
{
Lisp_Object pred = ms->predicate;
@@ -198,23 +199,24 @@ count_run (merge_state *ms, Lisp_Object *lo, const Lisp_Object *hi, bool *descen
}
-/* GALLOP_LEFT() locates the proper position of KEY in a sorted
+/* Locate and return the proper insertion position of KEY in a sorted
vector: if the vector contains an element equal to KEY, return the
position immediately to the left of the leftmost equal element.
- [GALLOP_RIGHT() does the same except returns the position to the
+ [GALLOP_RIGHT does the same except it returns the position to the
right of the rightmost equal element (if any).]
- 'A' is a sorted vector with N elements, starting at A[0]. N must be > 0.
+ 'A' is a sorted vector of N elements. N must be > 0.
- HINT is an index at which to begin the search, 0 <= HINT < N. The closer
- HINT is to the final result, the faster this runs.
+ Elements preceding HINT, a non-negative index less than N, are
+ skipped. The closer HINT is to the final result, the faster this
+ runs.
The return value is the int k in [0, N] such that
A[k-1] < KEY <= a[k]
- pretending that *(A-1) is minus infinity and A[N] is plus infinity. IOW,
- KEY belongs at index k; or, IOW, the first k elements of A should precede
+ pretending that *(A-1) precedes all values and *(A+N) succeeds all
+ values. In other words, the first k elements of A should precede
KEY, and the last N-k should follow KEY. */
static ptrdiff_t
@@ -254,7 +256,7 @@ gallop_left (merge_state *ms, const Lisp_Object key, Lisp_Object *a,
{
/* When key <= a[hint], gallop left, until
a[hint - ofs] < key <= a[hint - lastofs]. */
- const ptrdiff_t maxofs = hint + 1; /* Here &a[0] is lowest. */
+ const ptrdiff_t maxofs = hint + 1; /* Here &a[0] is lowest. */
while (ofs < maxofs)
{
if (inorder (pred, a[-ofs], key))
@@ -283,18 +285,19 @@ gallop_left (merge_state *ms, const Lisp_Object key, Lisp_Object *a,
ptrdiff_t m = lastofs + ((ofs - lastofs) >> 1);
if (inorder (pred, a[m], key))
- lastofs = m + 1; /* Here a[m] < key. */
+ lastofs = m + 1; /* Here a[m] < key. */
else
ofs = m; /* Here key <= a[m]. */
}
- eassume (lastofs == ofs); /* Then a[ofs-1] < key <= a[ofs]. */
+ eassume (lastofs == ofs); /* Then a[ofs-1] < key <= a[ofs]. */
return ofs;
}
-/* GALLOP_RIGHT() is exactly like GALLOP_LEFT(), except that if KEY
- already exists in A[0:N], it finds the position immediately to the
- right of the rightmost equal value.
+/* Locate and return the proper position of KEY in a sorted vector
+ exactly like GALLOP_LEFT, except that if KEY already exists in
+ A[0:N] find the position immediately to the right of the rightmost
+ equal value.
The return value is the int k in [0, N] such that
@@ -315,7 +318,7 @@ gallop_right (merge_state *ms, const Lisp_Object key, Lisp_Object *a,
{
/* When key < a[hint], gallop left until
a[hint - ofs] <= key < a[hint - lastofs]. */
- const ptrdiff_t maxofs = hint + 1; /* Here &a[0] is lowest. */
+ const ptrdiff_t maxofs = hint + 1; /* Here &a[0] is lowest. */
while (ofs < maxofs)
{
if (inorder (pred, key, a[-ofs]))
@@ -338,7 +341,7 @@ gallop_right (merge_state *ms, const Lisp_Object key, Lisp_Object *a,
{
/* When a[hint] <= key, gallop right, until
a[hint + lastofs] <= key < a[hint + ofs]. */
- const ptrdiff_t maxofs = n - hint; /* Here &a[n-1] is highest. */
+ const ptrdiff_t maxofs = n - hint; /* Here &a[n-1] is highest. */
while (ofs < maxofs)
{
if (inorder (pred, key, a[ofs]))
@@ -368,9 +371,9 @@ gallop_right (merge_state *ms, const Lisp_Object key, Lisp_Object *a,
if (inorder (pred, key, a[m]))
ofs = m; /* Here key < a[m]. */
else
- lastofs = m + 1; /* Here a[m] <= key. */
+ lastofs = m + 1; /* Here a[m] <= key. */
}
- eassume (lastofs == ofs); /* Now a[ofs-1] <= key < a[ofs]. */
+ eassume (lastofs == ofs); /* Now a[ofs-1] <= key < a[ofs]. */
return ofs;
}
@@ -411,9 +414,9 @@ merge_markmem (void *arg)
}
-/* CLEANUP_MEM frees all temp storage. If an exception occurs while
- merging it will first relocate any lisp elements in temp storage
- back to the original array. */
+/* Free all temp storage. If an exception occurs while merging,
+ relocate any lisp elements in temp storage back to the original
+ array. */
static void
cleanup_mem (void *arg)
@@ -440,9 +443,9 @@ cleanup_mem (void *arg)
}
-/* MERGE_GETMEM() ensures availability of enough temp memory for NEED
- array slots. Any previously allocated memory is first freed, and a
- cleanup routine is registered to free memory at the very end, or on
+/* Allocate enough temp memory for NEED array slots. Any previously
+ allocated memory is first freed, and a cleanup routine is
+ registered to free memory at the very end of the sort, or on
exception. */
static void
@@ -453,7 +456,7 @@ merge_getmem (merge_state *ms, const ptrdiff_t need)
if (ms->a == ms->temparray)
{
/* We only get here if alloc is needed and this is the first
- time, so we set up the unwind. */
+ time, so we set up the unwind protection. */
specpdl_ref count = SPECPDL_INDEX ();
record_unwind_protect_ptr_mark (cleanup_mem, ms, merge_markmem);
ms->count = count;
@@ -479,10 +482,10 @@ needmem (merge_state *ms, ptrdiff_t na)
}
-/* MERGE_LO() stably merges the NA elements starting at SSA with the
- NB elements starting at SSB = SSA + NA, in-place. NA and NB must
- be positive. We also require that SSA[NA-1] belongs at the end of
- the merge, and should have NA <= NB. */
+/* Stably merge (in-place) the NA elements starting at SSA with the NB
+ elements starting at SSB = SSA + NA. NA and NB must be positive.
+ Require that SSA[NA-1] belongs at the end of the merge, and NA <=
+ NB. */
static void
merge_lo (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na, Lisp_Object *ssb,
@@ -509,9 +512,9 @@ merge_lo (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na, Lisp_Object *ssb,
ptrdiff_t min_gallop = ms->min_gallop;
for (;;)
{
- ptrdiff_t acount = 0; /* This holds the # of consecutive times A won. */
+ ptrdiff_t acount = 0; /* The # of consecutive times A won. */
- ptrdiff_t bcount = 0; /* This holds the # of consecutive times B won. */
+ ptrdiff_t bcount = 0; /* The # of consecutive times B won. */
for (;;)
{
@@ -540,9 +543,10 @@ merge_lo (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na, Lisp_Object *ssb,
}
}
- /* One run is winning so consistently that galloping may be a huge
- win. We try that, and continue galloping until (if ever)
- neither run appears to be winning consistently anymore. */
+ /* One run is winning so consistently that galloping may be a
+ huge speedup. We try that, and continue galloping until (if
+ ever) neither run appears to be winning consistently
+ anymore. */
++min_gallop;
do {
eassume (na > 1 && nb > 0);
@@ -558,8 +562,8 @@ merge_lo (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na, Lisp_Object *ssb,
na -= k;
if (na == 1)
goto CopyB;
- /* While na==0 is impossible now if the comparison function is
- consistent, we shouldn't assume that it is. */
+ /* While na==0 is impossible for a consistent comparison
+ function, we shouldn't assume that it is. */
if (na == 0)
goto Succeed;
}
@@ -603,10 +607,10 @@ merge_lo (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na, Lisp_Object *ssb,
}
-/* MERGE_HI() stably merges the NA elements starting at SSA with the
- NB elements starting at SSB = SSA + NA, in-place. NA and NB must
- be positive. We also require that SSA[NA-1] belongs at the end of
- the merge, and should have NA >= NB. */
+/* Stably merge (in-place) the NA elements starting at SSA with the NB
+ elements starting at SSB = SSA + NA. NA and NB must be positive.
+ Require that SSA[NA-1] belongs at the end of the merge, and NA >=
+ NB. */
static void
merge_hi (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na,
@@ -636,8 +640,8 @@ merge_hi (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na,
ptrdiff_t min_gallop = ms->min_gallop;
for (;;) {
- ptrdiff_t acount = 0; /* This holds the # of consecutive times A won. */
- ptrdiff_t bcount = 0; /* This holds the # of consecutive times B won. */
+ ptrdiff_t acount = 0; /* The # of consecutive times A won. */
+ ptrdiff_t bcount = 0; /* The # of consecutive times B won. */
for (;;) {
eassume (na > 0 && nb > 1);
@@ -666,8 +670,8 @@ merge_hi (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na,
}
/* One run is winning so consistently that galloping may be a huge
- win. Try that, and continue galloping until (if ever) neither
- run appears to be winning consistently anymore. */
+ speedup. Try that, and continue galloping until (if ever)
+ neither run appears to be winning consistently anymore. */
++min_gallop;
do {
eassume (na > 0 && nb > 1);
@@ -701,8 +705,8 @@ merge_hi (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na,
nb -= k;
if (nb == 1)
goto CopyA;
- /* While nb==0 is impossible now if the comparison function
- is consistent, we shouldn't assume that it is. */
+ /* While nb==0 is impossible for a consistent comparison
+ function we shouldn't assume that it is. */
if (nb == 0)
goto Succeed;
}
@@ -730,7 +734,7 @@ merge_hi (merge_state *ms, Lisp_Object *ssa, ptrdiff_t na,
}
-/* MERGE_AT() merges the two runs at stack indices I and I+1. */
+/* Merge the two runs at stack indices I and I+1. */
static void
merge_at (merge_state *ms, const ptrdiff_t i)
@@ -747,9 +751,9 @@ merge_at (merge_state *ms, const ptrdiff_t i)
eassume (na > 0 && nb > 0);
eassume (ssa + na == ssb);
- /* Record the length of the combined runs; if i is the 3rd-last run
- now, also slide over the last run (which isn't involved in this
- merge). The current run i+1 goes away in any case. */
+ /* Record the length of the combined runs. The current run i+1 goes
+ away after the merge. If i is the 3rd-last run now, slide the
+ last run (which isn't involved in this merge) over to i+1. */
ms->pending[i].len = na + nb;
if (i == ms->n - 3)
ms->pending[i + 1] = ms->pending[i + 2];
@@ -779,10 +783,9 @@ merge_at (merge_state *ms, const ptrdiff_t i)
}
-/* POWERLOOP() computes the "power" of the first of two adjacent runs
- begining at index S1, with the first having length N1 and the
- second (starting at index S1+N1) having length N2. The list has
- total length N. */
+/* Compute the "power" of the first of two adjacent runs begining at
+ index S1, with the first having length N1 and the second (starting
+ at index S1+N1) having length N2. The run has total length N. */
static int
powerloop (const ptrdiff_t s1, const ptrdiff_t n1, const ptrdiff_t n2,
@@ -824,11 +827,11 @@ powerloop (const ptrdiff_t s1, const ptrdiff_t n1, const ptrdiff_t n2,
}
-/* FOUND_NEW_RUN() updates the state when a run of length N2 has been
- identified. If there's already a stretch on the stack, apply the
- "powersort" merge strategy: compute the topmost stretch's "power"
- (depth in a conceptual binary merge tree) and merge adjacent runs
- on the stack with greater power. */
+/* Update the state upon identifying a run of length N2. If there's
+ already a stretch on the stack, apply the "powersort" merge
+ strategy: compute the topmost stretch's "power" (depth in a
+ conceptual binary merge tree) and merge adjacent runs on the stack
+ with greater power. */
static void
found_new_run (merge_state *ms, const ptrdiff_t n2)
@@ -851,9 +854,8 @@ found_new_run (merge_state *ms, const ptrdiff_t n2)
}
-/* MERGE_FORCE_COLLAPSE() unconditionally merges all stretches on the
- stack until only one remains, and returns 0 on success. This is
- used at the end of the mergesort. */
+/* Unconditionally merge all stretches on the stack until only one
+ remains. */
static void
merge_force_collapse (merge_state *ms)
@@ -871,9 +873,8 @@ merge_force_collapse (merge_state *ms)
}
-/* MERGE_COMPUTE_MINRUN() computes a good value for the minimum run
- length; natural runs shorter than this are boosted artificially via
- binary insertion.
+/* Compute a good value for the minimum run length; natural runs
+ shorter than this are boosted artificially via binary insertion.
If N < 64, return N (it's too small to bother with fancy stuff).
Otherwise if N is an exact power of 2, return 32. Finally, return
@@ -907,8 +908,8 @@ reverse_vector (Lisp_Object *s, const ptrdiff_t n)
}
}
-/* TIM_SORT sorts the array SEQ with LENGTH elements in the order
- determined by PREDICATE. */
+/* Sort the array SEQ with LENGTH elements in the order determined by
+ PREDICATE. */
void
tim_sort (Lisp_Object predicate, Lisp_Object *seq, const ptrdiff_t length)
--
2.34.1.575.g55b058a8bb
[Message part 3 (text/plain, inline)]
--
Andrew Cohen
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Fri, 01 Apr 2022 06:28:02 GMT)
Full text and
rfc822 format available.
Message #53 received at 54532 <at> debbugs.gnu.org (full text, mbox):
> From: Andrew Cohen <acohen <at> ust.hk>
> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 54532 <at> debbugs.gnu.org
> Date: Fri, 01 Apr 2022 07:47:27 +0800
>
> EZ> Yes, I think as soon as Andrew comes up with an updated patch,
> EZ> we can install this.
>
> Attached is the patch for the items that Eli identified earlier
> (improvements in comments and breaking one long line).
They are fine by me, thanks.
> I'll wait for any
> further changes/comments. Then I think its best to squash this down to 2
> commits: one for the GC marking, and all the rest related to
> sorting. (I'm happy to use more commits if you think the predicate
> resolution or the unit tests deserve their own patches).
I think even a single commit is okay, unless you care about the
history of the development of this.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Tue, 07 Jun 2022 07:07:02 GMT)
Full text and
rfc822 format available.
Message #56 received at 54532 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Andrew Cohen <acohen <at> ust.hk>
>> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 54532 <at> debbugs.gnu.org
>> Date: Fri, 01 Apr 2022 07:47:27 +0800
>>
>> EZ> Yes, I think as soon as Andrew comes up with an updated patch,
>> EZ> we can install this.
>>
>> Attached is the patch for the items that Eli identified earlier
>> (improvements in comments and breaking one long line).
>
> They are fine by me, thanks.
>
>> I'll wait for any
>> further changes/comments. Then I think its best to squash this down to 2
>> commits: one for the GC marking, and all the rest related to
>> sorting. (I'm happy to use more commits if you think the predicate
>> resolution or the unit tests deserve their own patches).
>
> I think even a single commit is okay, unless you care about the
> history of the development of this.
That was 9 weeks ago. Any news here? It would be great to get this
installed.
Thanks in advance.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#54532
; Package
emacs
.
(Tue, 07 Jun 2022 09:08:01 GMT)
Full text and
rfc822 format available.
Message #59 received at 54532 <at> debbugs.gnu.org (full text, mbox):
close 54532 29.1
thanks
Andrew Cohen <acohen <at> ust.hk> writes:
> This has been in master for some time :)
>
> Commit 9ff2f0be32be621 on Apr 4.
Thanks, I'm therefore closing this bug report.
bug marked as fixed in version 29.1, send any further explanations to
54532 <at> debbugs.gnu.org and Andrew Cohen <acohen <at> ust.hk>
Request was from
Stefan Kangas <stefan <at> marxist.se>
to
control <at> debbugs.gnu.org
.
(Tue, 07 Jun 2022 09:08:02 GMT)
Full text and
rfc822 format available.
Changed bug title to 'Faster sorting' from '[PATCH] sorting'
Request was from
Stefan Kangas <stefan <at> marxist.se>
to
control <at> debbugs.gnu.org
.
(Tue, 07 Jun 2022 09:11:02 GMT)
Full text and
rfc822 format available.
Severity set to 'wishlist' from 'normal'
Request was from
Stefan Kangas <stefan <at> marxist.se>
to
control <at> debbugs.gnu.org
.
(Tue, 07 Jun 2022 09:11:02 GMT)
Full text and
rfc822 format available.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Tue, 05 Jul 2022 11:24:07 GMT)
Full text and
rfc822 format available.
This bug report was last modified 2 years and 361 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.