GNU bug report logs -
#48841
fido-mode is slower than ido-mode with similar settings
Reported by: Dmitry Gutov <dgutov <at> yandex.ru>
Date: Sat, 5 Jun 2021 01:40:01 UTC
Severity: normal
Done: João Távora <joaotavora <at> gmail.com>
Bug is archived. No further changes may be made.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 05 Jun 2021 01:40:02 GMT)
Acknowledgement sent to Dmitry Gutov <dgutov <at> yandex.ru>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 05 Jun 2021 01:40:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I'm comparing
ido-mode
with ido-ubiquitous-mode (for support for arbitrary completion
tables), available at
https://github.com/DarwinAwardWinner/ido-completing-read-plus
with (setq ido-enable-flex-matching t), of course
versus
fido-mode
with
(setq icomplete-compute-delay 0)
(setq icomplete-show-matches-on-no-input t)
(setq icomplete-max-delay-chars 0)
These values were chosen to make the behavior as close as possible to ido.
Try something like:
- Start a session with personal config and a number of loaded packages
(so that there are a lot of functions defined in obarray)
- Type 'C-h f'
- Type 'a', then type 'b'.
- Delete 'b', type it again, see how quickly you can make the
completions update.
With ido, the updates seem instant (probably due to some magic in
ido-completing-read-plus); with fido, there is some lag. Not huge, but
easy enough to notice.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 05 Jun 2021 09:36:02 GMT)
Message #8 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> I'm comparing
>
> ido-mode
> with ido-ubiquitous-mode (for support for arbitrary completion
> tables), available at
> https://github.com/DarwinAwardWinner/ido-completing-read-plus
> with (setq ido-enable-flex-matching t), of course
>
> versus
>
> fido-mode
> with
> (setq icomplete-compute-delay 0)
> (setq icomplete-show-matches-on-no-input t)
> (setq icomplete-max-delay-chars 0)
>
> The values chosen for behavior maximally close to ido.
>
> Try something like:
>
> - Start a session with personal config and a number of loaded
> packages (so that there are a lot of functions defined in obarray)
> - Type 'C-h f'
> - Type 'a', then type 'b'.
> - Delete 'b', type it again, see how quickly you can make the
> completions update.
>
> With ido, the updates seem instant (probably due to some magic in
> ido-completing-read-plus); with fido, there is some lag. Not huge, but
> easy enough to notice.
Thanks for the report. Before I try reproducing, can you try with
fido-vertical-mode and tell us if that changes anything? I think I
remember that skipping some suffix-calculation logic saved on a few
traversals of the big list of symbol completions.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 05 Jun 2021 23:03:01 GMT)
Message #11 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 05.06.2021 12:35, João Távora wrote:
> Thanks for the report. Before I try reproducing, can you try with
> fido-vertical-mode and tell us if that changes anything? I think I
> remember that skipping some suffix-calculation logic saved on a few
> traversals of the big list of symbol completions.
Why, yes it is. fido-vertical-mode is definitely snappier with such
settings.
Maybe still not on the level of ido-mode, but at least halfway there,
compared to the "horizontal" version.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 05 Jun 2021 23:21:01 GMT)
Message #14 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> On 05.06.2021 12:35, João Távora wrote:
>> Thanks for the report. Before I try reproducing, can you try with
>> fido-vertical-mode and tell us if that changes anything? I think I
>> remember that skipping some suffix-calculation logic saved on a few
>> traversals of the big list of symbol completions.
>
> Why, yes it is. fido-vertical-mode is definitely snappier with such
> settings.
>
> Maybe still not on the level of ido-mode, but at least halfway there,
> compared to the "horizontal" version.
Yes, that is also my assessment after trying your recipe. fido-mode +
fido-vertical-mode is not quite as snappy as ido-ubiquitous-mode, but
decently close.
My bet is that the remaining lag is due to sorting. In a dumb but
illustrative example, when given the pattern 'fmcro' flex-enabled
ido-mode pops 'flymake--backend-state-p--cmacro' to the top, while fido
mode selects the much more reasonable 'defmacro'.
Now, what I called here the "suffix-calculation logic" is what I also
called the "[mplete] dance" back in the emacs-devel thread. Truth is,
it's always annoyed me in icomplete partially because I don't understand
what it does exactly and how it is supposed to help me. I suppose
Stefan knows best here. Regardless of its use, it seems to require
another try-completion call on all the filtered candidates (a list which
might be very big), so that's probably where the extra lag comes from.
So, in summary, to speed this up for whomever is _not_ using
fido-vertical-mode, either we manage to speed up that part of
icomplete.el, or we get rid of it completely (at least for fido-mode).
For reference, it lives in an "else" branch of one of the "if"s in
icomplete-completions.
João
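As a rough illustration of the 'fmcro' example above, the following sketch asks the flex style for matches among the two candidates mentioned and reads back the `completion-score` text property that the minibuffer.el code quoted later in this thread attaches to each string. The helper name `my-flex-scores` and the two-element candidate list are assumptions made up for the demonstration; the actual scores, and whether the property is set at all, may differ between Emacs versions.

```elisp
(defun my-flex-scores (pattern candidates)
  "Return an alist of (CANDIDATE . SCORE) for flex matches of PATTERN.
SCORE is the `completion-score' text property, or nil when absent."
  (let* ((completion-styles '(flex))
         (all (completion-all-completions
               pattern candidates nil (length pattern)))
         (tail (last all)))
    ;; The result is an improper list whose final cdr is the base size;
    ;; cut it off so that `mapcar' can walk the list.
    (when (consp tail) (setcdr tail nil))
    (mapcar (lambda (str)
              (cons (substring-no-properties str)
                    (get-text-property 0 'completion-score str)))
            all)))

(my-flex-scores "fmcro"
                '("defmacro" "flymake--backend-state-p--cmacro"))
```

Both candidates contain f, m, c, r, o in that order, so both should come back; which one a front end shows first then depends on how it sorts by score.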
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 05 Jun 2021 23:43:01 GMT)
Message #17 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 06.06.2021 02:20, João Távora wrote:
> Now, what I called here the "suffix-calculation logic" is what I also
> called the "[mplete] dance" back in the emacs-devel thread. Truth is,
> it's always annoyed me in icomplete partially because I don't understand
> what it does exactly and how it is supposed to help me. I suppose
> Stefan knows best here. Regardless of its use, it seems to require
> another try-completion call in all the filtered candidates (which might
> be very big) so that's probably where the extra lag comes from.
Shouldn't this logic (whether it's used) be governed by the variable
icomplete-hide-common-prefix?
Which icomplete--fido-mode-setup sets to nil (appropriately, given that
ido-mode does not have this behavior). And looking at its behavior, it
only does the "[mplete] dance" when there is only one match remaining.
Whether to make icomplete-hide-common-prefix affect even the "only one
match" case is a matter of taste: I could personally find an argument
for either choice. But we're seeing performance degradation when there
are many matches, not one.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 00:26:02 GMT)
Message #20 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 06.06.2021 02:20, João Távora wrote:
> My bet is that the remaining lag is due to sorting. In a dumb but
> illustrative example, when given the pattern 'fmcro' flex-enabled
> ido-mode pops 'flymake--backend-state-p--cmacro' to the top, while fido
> mode selects the much more reasonable 'defmacro'.
Perhaps not sorting exactly, but the scoring part? Lowering the
implementation into C might help, we discussed something like that in
the past.
And/or pick a different algorithm. E.g. Jaro-Winkler, which apparently
is used in a lot of "fuzzy matching" implementations out there, it's
pretty fast.
I took an example like
(setq s (all-completions "" obarray))
(setq ss (cl-delete-if-not (lambda (s) (string-match-p "a" s)) s))
then
(benchmark 1 '(completion-all-completions "a" ss nil 1))
prints 0.180s here, whereas a "pure Ruby" implementation of Jaro-Winkler
takes about 0.060s on the exact same set of strings. But perhaps Ruby is
just faster than Elisp, I don't have a good comparison.
(The only J-W implementation in Elisp I have found yet --
https://github.com/rdiankov/emacs-config/blob/master/.emacs-lisp/auto-complete-1.3.1/fuzzy.el#L70
-- is slower than the current scoring algo).
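The measurement described in this message could be repeated along the following lines. This is only a sketch: the helper name `my-bench-flex`, the explicit binding of `completion-styles` to `(flex)`, and the default of 10 repetitions are assumptions, not part of the original recipe.

```elisp
(require 'benchmark)
(require 'cl-lib)

(defun my-bench-flex (input &optional repetitions)
  "Time flex completion of INPUT against all symbol names containing it.
Return the list (ELAPSED GC-COUNT GC-ELAPSED) from `benchmark-run'."
  (let* ((completion-styles '(flex))
         ;; `benchmark-run' only accepts a literal number or a symbol as
         ;; its repetition count, hence the separate variable.
         (n (or repetitions 10))
         (names (all-completions "" obarray))
         ;; Non-destructive filter, so NAMES itself is left intact.
         (subset (cl-remove-if-not
                  (lambda (s) (string-match-p (regexp-quote input) s))
                  names)))
    (benchmark-run n
      (completion-all-completions input subset nil (length input)))))

(my-bench-flex "a")
```

The elapsed time covers all repetitions together, so divide by the repetition count to compare against a single-run figure such as the 0.180s quoted above.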
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 02:36:01 GMT)
Message #23 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Stefan knows best here. Regardless of its use, it seems to require
> another try-completion call in all the filtered candidates (which might
> be very big) so that's probably where the extra lag comes from.
IIRC the `try-completion` call is performed on the list of possible
completions rather than on the original completion table, so it should
be quite fast. I'd be surprised if it is a significant portion of the
overall time.
Stefan
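For readers unfamiliar with it, `try-completion` called on a plain list of strings behaves as in the small sketch below. The candidate strings are made up for illustration; note that it works on prefixes only and does not consult `completion-styles`.

```elisp
;; Longest common prefix of every candidate that starts with "ap".
(try-completion "ap" '("apple" "application" "banana"))     ; "appl"
;; A unique, exact match yields t.
(try-completion "banana" '("apple" "application" "banana")) ; t
;; No candidate starts with "x", so the result is nil.
(try-completion "x" '("apple" "application" "banana"))      ; nil
```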
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 06:56:01 GMT)
Message #26 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> On 06.06.2021 02:20, João Távora wrote:
>> My bet is that the remaining lag is due to sorting. In a dumb but
>> illustrative example, when given the pattern 'fmcro' flex-enabled
>> ido-mode pops 'flymake--backend-state-p--cmacro' to the top, while fido
>> mode selects the much more reasonable 'defmacro'.
>
> Perhaps not sorting exactly, but the scoring part? Lowering the
> implementation into C might help, we discussed something like that in
> the past.
Perhaps, all could be measured. But I also remember explaining that
scoring is basically free. The flex search algorithm is
greedy already, it doesn't backtrack. Given a pattern 'foo' against
'fabrobazo', it takes 9 steps to see that it matches. I can't see any
other way to improve that, short of something like a totally different
string implementation. The scoring's numeric calculations at each step
are trivial.
One way to verify this is to do the scoring, but simply disregard it for
sorting purposes.
> And/or pick a different algorithm. E.g. Jaro-Winkler, which apparently
> is used in a lot of "fuzzy matching" implementations out there, it's
> pretty fast.
That may be useful, but for other purposes. If I understand correctly,
Jaro-Winkler is for finding the distance between two arbitrary strings,
where the first is not a subsequence of the second. I bet Google uses
stuff like that when you accidentally transpose characters. Flex just
gives up. Those other algos still catch the match (and Google
then probably NL-scours your most intimate fears and checks with your
local dictator before showing you typing suggestions)
> I took an example like
>
> (setq s (all-completions "" obarray))
> (setq ss (cl-delete-if-not (lambda (s) (string-match-p "a" s)) s))
>
> then
>
> (benchmark 1 '(completion-all-completions "a" ss nil 1))
>
> prints 0.180s here, whereas a "pure Ruby" implementation of
> Jaro-Winkler takes about 0.060s on the exact same set of strings. But
> perhaps Ruby is just faster than Elisp, I don't have a good
> comparison.
Go ahead and kill the scoring calculation altogether in
completion-pcm--hilit-commonality. I bet it won't make a difference.
In fact, for that experiment, try a simple substring search. I bet
you're just seeing an inferior GC at work, or a string implementation
that's optimized for other stuff that Ruby's can't do, like
propertization. Try making Ruby strings that mimic Elisp's if you've time
to spare...
> (The only J-W implementation in Elisp I have found yet --
> https://github.com/rdiankov/emacs-config/blob/master/.emacs-lisp/auto-complete-1.3.1/fuzzy.el#L70
> -- is slower than the current scoring algo).
There you have it.
João
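The greedy, non-backtracking check described in this message ('foo' against 'fabrobazo' in 9 steps) can be sketched as a single left-to-right pass. This is an illustration of the idea only, under the assumption that a plain subsequence test is what is meant; the real flex style builds a pattern and matches it with the regexp engine, and `my-flex-subsequence-p` is a hypothetical name.

```elisp
(defun my-flex-subsequence-p (pattern string)
  "Return non-nil if PATTERN's characters occur in STRING in order.
Makes one pass over STRING and never backtracks."
  (let ((i 0)
        (n (length pattern)))
    (dotimes (j (length string))
      ;; Advance in PATTERN each time its next character is seen.
      (when (and (< i n) (= (aref pattern i) (aref string j)))
        (setq i (1+ i))))
    (= i n)))

(my-flex-subsequence-p "foo" "fabrobazo")   ; non-nil, after 9 steps
(my-flex-subsequence-p "foo" "fabrobaz")    ; nil, the second "o" is missing
```

The cost is linear in the length of the candidate, which is the basis for the claim that there is little left to gain in the matching step itself.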
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 07:00:01 GMT)
Message #29 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>> Stefan knows best here. Regardless of its use, it seems to require
>> another try-completion call in all the filtered candidates (which might
>> be very big) so that's probably where the extra lag comes from.
>
> IIRC the `try-completion` call is performed on the list of possible
> completions rather than on the original completion table, so it should
> be quite fast. I'd be surprised if it is a significant portion of the
> overall time.
Very true, but here's the surprise: In the flex style, there are a _lot_
of "possible completions" for the null or very short patterns. So those
calculations -- which were more than certainly thought up for prefix-ish
styles -- are quite slow (and also quite useless for flex). At least
that's my theory.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 16:56:02 GMT)
Message #32 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 06.06.2021 09:59, João Távora wrote:
> Very true, but here's the surprise: In the flex style, there are a _lot_
> of "possible completions" for the null or very short patterns. So those
> calculations -- which were more than certainly thought up for prefix-ish
> styles -- are quite slow (and also quite useless for flex). At least
> that's my theory.
try-completion doesn't trigger any completion style machinery; only
completion-try-completion does.
And are we talking about the 'try-completion' call which is guarded with
(when icomplete-hide-common-prefix ...)?
icomplete--fido-mode-setup sets that variable to nil.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 17:56:01 GMT)
Message #35 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Very true, but here's the surprise: In the flex style, there are a _lot_
> of "possible completions" for the null or very short patterns. So those
> calculations -- which were more than certainly thought up for prefix-ish
> styles -- are quite slow (and also quite useless for flex). At least
> that's my theory.
In the very worst possible case, `try-completion` will be just as slow
as the original computation of the set of possible completions. So at
most it will double the total time (and this assumes we do basically
nothing else than a single call to `all-completions` to get the set of
candidates and then display them).
In practice I'd be surprised if it ever reaches the 20% mark of the time spent.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 18:38:02 GMT)
Message #38 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Sun, Jun 6, 2021, 17:55 Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> On 06.06.2021 09:59, João Távora wrote:
> > Very true, but here's the surprise: In the flex style, there are a _lot_
> > of "possible completions" for the null or very short patterns. So those
> > calculations -- which were more than certainly thought up for prefix-ish
> > styles -- are quite slow (and also quite useless for flex). At least
> > that's my theory.
>
> try-completion doesn't trigger any completion style machinery; only
> completion-try-completion does.
>
I have no idea if the completion style stuff is related. Just that the else
branch is there to calculate some 'determ' thing, and a cursory look
revealed try-completion calls being passed 'comps', or 'completions'.
Presumably lots of data given short flex style patterns. No idea what it
accomplishes, as I said.
Bottom line is that something (TM) happened to speed up the whole thing
when I skipped over that whole part. I had vertical mode basically visually
equivalent to vertical, but quite slower. After skipping that part they
became practically equivalent. And you yourself witnessed this when
switching to vertical mode, which is when the skip is made.
I'll check later in the week, away from my computer now.
> And are we talking about the 'try-completion' call which is guarded with
> (when icomplete-hide-common-prefix ...)?
>
No idea
>
> icomplete--fido-mode-setup sets that variable to nil.
>
Maybe.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 21:34:02 GMT)
Message #41 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Sun, Jun 6, 2021, 18:55 Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> In practice I'd be surprised if it ever reaches the 20% mark of the time
> spent.
Personally, I'm usually surprised if less than 80% of my estimates aren't
totally off. :) Anyway, if not try-completion like I theorized, it should
be reasonably easy to pinpoint: something non-fido-essential in that else
branch is causing a real slowdown.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 22:21:01 GMT)
Message #44 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 06.06.2021 09:54, João Távora wrote:
>> Perhaps not sorting exactly, but the scoring part? Lowering the
>> implementation into C might help, we discussed something like that in
>> the past.
>
> Perhaps, all could be measured. But I also remember explaining that
> scoring is basically free. The flex search algorithm is
> greedy already, it doesn't backtrack. Given a pattern 'foo' against
> 'fabrobazo', it takes 9 steps to see that it matches. I can't see any
> other way to improve that, short of something like a totally different
> string implementation. The scoring's numeric calculations at each step
> are trivial.
Thanks, if it's indeed that fast, I might have a use for it as well, in
a different context. ;-)
>> And/or pick a different algorithm. E.g. Jaro-Winkler, which apparently
>> is used in a lot of "fuzzy matching" implementations out there, it's
>> pretty fast.
>
> That may be useful, but for other purposes. If I understand correctly,
> Jaro-Winkler is for finding the distance between two arbitrary strings,
> where the first is not a subsequence of the second. I bet Google uses
> stuff like that when you accidentally transpose characters. Flex just
> gives up. Those other algos still catch the match (and Google
> then probably NL-scours your most intimate fears and checks with your
> local dictator before showing you typing suggestions)
I'm not crazy about shuffling completion, but some people did indeed ask
for it. The filtering step has to be more complex, though (you can't
just construct a straightforward regexp to filter with).
>> I took an example like
>>
>> (setq s (all-completions "" obarray))
>> (setq ss (cl-delete-if-not (lambda (s) (string-match-p "a" s)) s))
>>
>> then
>>
>> (benchmark 1 '(completion-all-completions "a" ss nil 1))
>>
>> prints 0.180s here, whereas a "pure Ruby" implementation of
>> Jaro-Winkler takes about 0.060s on the exact same set of strings. But
>> perhaps Ruby is just faster than Elisp, I don't have a good
>> comparison.
>
> Go ahead and kill the scoring calculation altogether in
> completion-pcm--hilit-commonality. I bet it won't make a difference.
> In fact, for that experiment, try a simple substring search.
Same result, indeed. We should note, though, that
completion-pcm--hilit-commonality has some steps that were added
together with the 'flex' style, for it to work.
> I bet
> you're just seeing an inferior GC at work, or a string implementation
> that's optimized for other stuff that Ruby's can't do, like
> propertization. Try making Ruby strings that mimic Elisp's if you've time
> to spare...
I did some instrumenting, replacing (completion-pcm--hilit-commonality
pattern all) inside completion-flex-all-completions with
(benchmark-progn (completion-pcm--hilit-commonality pattern all)).
Recompiled between each change (interpreted mode gives very different
numbers).
Unmodified, the call takes ~85ms:
Elapsed time: 0.085520s (0.068406s in 4 GCs)
If I comment everything inside its first lambda except the returned
value (making the function the same as #'identity), the time goes down
to <1ms.
Uncomment the 'copy-sequence' and 'string-match' calls (which one might
suspect of garbage generation):
Elapsed time: 0.006380s
Tried binding gc-cons-threshold to a high value, and even calling
garbage-collect-maybe (or not): that speeds it up, but adds some
unpredictable GC pauses later (though it would be nice to be able to
consolidate the pauses into one collection pass).
Long story short, the patch I just installed, to reuse the match data,
brings the runtime down to
Elapsed time: 0.066388s (0.050087s in 3 GCs)
Tried other things like moving the update-score-and-face lambda out of
the mapcar loop - that didn't move the needle. Someone more familiar with
the code might get further. But perhaps it's just the cost of executing
this logic in Lisp 12000 times, and doing some of that in C would be the
next step.
And a weird part: replacing all repeated (length str) calls with a
reference to an existing local binding makes it *slower* (back to the
original performance). Check this out:
diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index d5a0118b7c..d7102245a2 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -3544,7 +3544,7 @@ completion-pcm--hilit-commonality
score-numerator (+ score-numerator (- b a)))
(unless (or (= a last-b)
(zerop last-b)
- (= a (length str)))
+ (= a end))
(setq
score-denominator (+ score-denominator
1
@@ -3562,12 +3562,12 @@ completion-pcm--hilit-commonality
;; for that extra bit of match (bug#42149).
(unless (= from match-end)
(funcall update-score-and-face from match-end))
- (if (> (length str) pos)
+ (if (> end pos)
(add-face-text-property
pos (1+ pos)
'completions-first-difference
nil str))
- (unless (zerop (length str))
+ (unless (zerop end)
(put-text-property
0 1 'completion-score
(/ score-numerator (* end (1+ score-denominator)) 1.0)
str)))
@@ -3980,7 +3980,7 @@ completion-flex-all-completions
string table pred point
#'completion-flex--make-flex-pattern)))
(when all
- (nconc (completion-pcm--hilit-commonality pattern all)
+ (nconc (benchmark-progn (completion-pcm--hilit-commonality pattern all))
(length prefix))))))
;; Initials completion
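The effect of garbage collection on such timings can be explored with a sketch like the one below, which runs the same allocation-heavy work once normally and once with collection deferred. The helper name `my-bench-copies` and the choice of `copy-sequence` over all symbol names as the workload are assumptions for illustration; they stand in for the per-candidate copying done in completion-pcm--hilit-commonality.

```elisp
(require 'benchmark)

(defun my-bench-copies (strings)
  "Copy each of STRINGS once; return (ELAPSED GC-COUNT GC-ELAPSED)."
  (benchmark-run 1 (mapcar #'copy-sequence strings)))

(let ((strings (all-completions "" obarray)))
  (list
   ;; Normal run: the collector may kick in while the copies are made.
   (cons 'default (my-bench-copies strings))
   ;; Same work with collection deferred; the garbage is still there
   ;; and gets collected at some later point.
   (cons 'deferred (let ((gc-cons-threshold most-positive-fixnum))
                     (my-bench-copies strings)))))
```

Comparing the GC-COUNT and GC-ELAPSED fields of the two results shows how much of the elapsed time is collection rather than the work itself, which is the split reported in the "Elapsed time" lines above.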
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 06 Jun 2021 22:22:01 GMT)
Message #47 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 06.06.2021 21:37, João Távora wrote:
> Bottom line is that something (TM) happened to speed up the whole thing
> when I skipped over that whole part. I had vertical mode basically
> visually equivalent to vertical, but quite slower. After skipping that
> part they became practically equivalent. And you yourself witnessed this
> when switching to vertical mode, which is when the skip is made.
Yep.
> I'll check later in the week, away from my computer now.
Looking forward to it.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 07 Jun 2021 01:37:02 GMT)
Message #50 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Sun, Jun 6, 2021, 23:20 Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>
> >> And/or pick a different algorithm. E.g. Jaro-Winkler, which apparently
> >> is used in a lot of "fuzzy matching" implementations out there, it's
> >> pretty fast.
> >
> > That may be useful, but for other purposes. If I understand correctly,
> > Jaro-Winkler is for finding the distance between two arbitrary strings,
> > where the first is not a subsequence of the second. I bet Google uses
> > stuff like that when you accidentally transpose characters. Flex just
> > gives up. Those other algos still catch the match (and Google
> > then probably NL-scours your most intimate fears and checks with your
> > local dictator before showing you typing suggestions)
>
Meant to write ML as in machine learning, not NL.
> I'm not crazy about shuffling completion, but some people did indeed ask
> for it. The filtering step has to be more complex, though (you can't
> just construct a straightforward regexp to filter with).
>
I think you calculate the distance using one of these fancy multi-surname
algorithms, then apply a cutoff somewhere.
> Same result, indeed. We should note, though, that
> completion-pcm--hilit-commonality has some steps that were added
> together with the 'flex' style, for it to work.
>
But nothing algorithmically aberrant, I think.
> I did some instrumenting, replacing (completion-pcm--hilit-commonality
> pattern all) inside completion-flex-all-completions with
> (benchmark-progn (completion-pcm--hilit-commonality pattern all)).
> Recompiled between each change (interpreted mode gives very different
> numbers).
>
> Unmodified, the call takes ~85ms:
>
> Elapsed time: 0.085520s (0.068406s in 4 GCs)
>
By the way, I think you should be running benchmarks multiple times to get
times in the seconds range, and reduce noise. Multiple levels of CPU cache
and other factors like thermal throttling may skew results when running just
one time.
> If I comment everything inside its first lambda except the returned
> value (making the function the same as #'identity), the time goes down
> to <1ms.
>
> Uncomment the 'copy-sequence' and 'string-match' calls (which one might
> suspect of garbage generation):
>
> Elapsed time: 0.006380s
>
> Tried binding gc-cons-threshold to a high value, and even calling
> garbage-collect-maybe (or not): that speeds it up, but adds some
> unpredictable GC pauses later (though it would be nice to be able to
> consolidate the pauses into one collection pass).
>
Maybe in Elisp that's a good idea; in other Lisps and other languages,
second-guessing the GC is a bad idea. I hear ours is so basic that it
might indeed be reasonable.
> Long story short, the patch I just installed, to reuse the match data,
> brings the runtime down to
>
> Elapsed time: 0.066388s (0.050087s in 3 GCs)
>
That's nice! But are you sure you're not seeing noise, too?
> Tried other things like moving the update-score-and-face lambda out of
> the mapcar loop - that didn't move the needle.
If a lambda doesn't capture anything from inside the loop, only one copy of it
is ever made, I think. So it doesn't surprise me.
> And a weird part: replacing all repeated (length str) calls with a
> reference to an existing local binding makes it *slower* (back to the
> original performance).
Might be noise, or you might be thrashing the CPU caches, who knows? If the
string length is on the same cache line as the contents of the string
you're reading, then evicting that to go read the value of a boxed integer
somewhere radically different is slow. Just speculation of course. Might
just be noise or something else entirely.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 07 Jun 2021 01:56:02 GMT)
Message #53 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Sun, Jun 6, 2021, 23:21 Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> On 06.06.2021 21:37, João Távora wrote:
> > Bottom line is that something (TM) happened to speed up the whole thing
> > when I skipped over that whole part. I had vertical mode basically
> > visually equivalent to vertical, but quite slower. After skipping that
> > part they became practically equivalent. And you yourself witnessed this
> > when switching to vertical mode, which is when the skip is made.
>
> Yep.
>
By the way, earlier I meant to write:
"I had vertical mode basically visually equivalent to vertico.el, but quite
slower. After skipping that part they became practically equivalent. "
autocorrect on my phone had other ideas...
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 07 Jun 2021 02:06:02 GMT)
Message #56 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 07.06.2021 02:49, João Távora wrote:
> Same result, indeed. We should note, though, that
> completion-pcm--hilit-commonality has some steps that were added
> together with the 'flex' style, for it to work.
>
>
> But nothing algorithmically aberrant, I think.
Just some stuff adding work to GC, I think.
> By the way, I think you should be running benchmarks multiple times to
> get times in the seconds range, and reduce noise. Multiple levels of CPU
> cache and other factors like thermal throttling may skew results when
> running just one time.
Yeah, I repeat the action with each version for like a few dozen times,
until I see the numbers stabilize, or just take the average.
> Tried binding gc-cons-threshold to a high value, and even calling
> garbage-collect-maybe (or not): that speeds it up, but adds some
> unpredictable GC pauses later (though it would be nice to be able to
> consolidate the pauses into one collection pass).
>
>
> Maybe in Elisp that's a good idea, in other lisps and other languages,
> second-guessing the GC is a bad idea. I hear ours is so basic that
> indeed it might be reasonable.
I never get good results with that.
> Long story short, the patch I just installed, to reuse the match data,
> brings the runtime down to
>
> Elapsed time: 0.066388s (0.050087s in 3 GCs)
>
>
> That's nice! but are you sure you're not seeing noise, too?
Pretty sure.
> Tried other things like moving the update-score-and-face lambda out of
> the mapcar loop - that didn't move the needle.
>
>
> If a lambda is non capturing of stuff inside the loop, only one copy of
> it is ever made, I think. So it doesn't surprise me.
update-score-and-face references both variables in its closest binding
form (score-numerator, score-denominator) and the parameter of its
containing lambda (str).
Maybe moving all of them to parameters and return values (making it a
static function and having the caller manage state) would help, I
haven't tried that exactly.
> And a weird part: replacing all repeated (length str) calls with a
> reference to an existing local binding makes it *slower* (back to the
> original performance).
>
>
> Might be noise, or you might be thrashing of CPU caches, who knows? If
> the string length is on the same cache line as the contents of the
> string you're reading, then evicting that to go read the value of a
> boxed integer somewhere radically different is slow.
But the string value is boxed as well, right? So the code needs to
follow one indirection either way. Perhaps there's also overhead in
looking up the lexical scope.
I also tried using the new-and-shiny length> and length=. This simply
made no measurable difference.
> Just speculation of
> course. Might just be noise or something else entirely.
This is highly reproducible. On my machine, at least.
Given how weird it is, I wouldn't just write about it without
recompiling, restarting Emacs and measuring the scenario several times,
with different versions of code.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#48841
; Package
emacs
.
(Mon, 07 Jun 2021 08:54:02 GMT)
Full text and
rfc822 format available.
Message #59 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Jun 7, 2021, 01:11 Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> On 07.06.2021 02:49, João Távora wrote:
>
> > Same result, indeed. We should note, though, that
> > completion-pcm--hilit-commonality has some steps that were added
> > together with the 'flex' style, for it to work.
> >
> >
> > But nothing algorithmically aberrant, I think.
>
> Just some stuff adding work to GC, I think.
>
O(n) stuff being a property and a number on each string, small in
comparison to the string.
> Maybe moving all of them to parameters and return values (making it a
> static function and having the caller manage state) would help, I
> haven't tried that exactly.
>
Normally, in those adventures you end up with the same allocations
somewhere else, and uglier code. But you can try.
> Might be noise, or you might be thrashing of CPU caches, who knows? If
> > the string length is on the same cache line as the contents of the
> > string you're reading, then evicting that to go read the value of a
> > boxed integer somewhere radically different is slow.
>
> But the string value is boxed as well, right?
The key is locality. If the string length and data happen to live nearby in
memory (in the same box, so to speak), there's a decent chance that reading
one brings the other into the cache, and you get a nice hit depending on
your subsequent operation.
Here I'm just speculating, as I said. In managed languages such as Lisps,
it's somewhat unpredictable. It's also always hardware dependent. Though
given C/C++, a known processor and the right application, this will make a
world of difference, and will yield truly "weird" results (which aren't
weird at all after you understand the logic). Like, for example a vector
being much better at sorted insertion than a linked list. (!) Look it up.
Bjarne Stroustrup has one of those talks.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 11 Jun 2021 02:20:02 GMT)
Message #62 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 07.06.2021 11:52, João Távora wrote:
> Maybe moving all of them to parameters and return values (making it a
> static function and having the caller manage state) would help, I
> haven't tried that exactly.
>
>
> Normally, in those adventures you end up with the same allocations
> somewhere else, and uglier code. But you can try.
I gave it a try, with little success (patch attached, for posterity).
Since there are no multiple-value returns in Lisp, I had to define a
container for the three values.
The performance is basically the same, which seems to indicate that
either Elisp has to allocate very little for a compiled lambda code, or
it's optimized out (which would also make sense: the only thing
necessary for it is a new container for the current scope).
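A rough sketch of what such a container could look like, assuming a `cl-defstruct` and a deliberately simplified scoring rule. The struct, its slot names and `my/update-score` are hypothetical; this is not the attached patch and not the scoring formula used in minibuffer.el:

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)

;; Hypothetical container for the state that the closure would otherwise
;; capture.  Slot names are illustrative only.
(cl-defstruct (my/score-state (:constructor my/score-state-create))
  (numerator 0)
  (denominator 0)
  (last-b 0))

(defun my/update-score (state a b)
  "Fold the match range from A to B into STATE and return STATE.
Simplified rule: the numerator grows by the match length, and the
denominator grows by one for every gap between consecutive matches."
  (cl-incf (my/score-state-numerator state) (- b a))
  (let ((last-b (my/score-state-last-b state)))
    (unless (or (zerop last-b) (= a last-b))
      (cl-incf (my/score-state-denominator state))))
  (setf (my/score-state-last-b state) b)
  state)

;; The caller owns the state and passes it in explicitly.
(let ((state (my/score-state-create)))
  (dolist (range '((0 . 2) (2 . 3) (5 . 7)))
    (my/update-score state (car range) (cdr range)))
  (list (my/score-state-numerator state)
        (my/score-state-denominator state)))
```

The struct is still one allocation per scored string, which is consistent with the observation that this rearrangement does not change the timings much.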
> > Might be noise, or you might be thrashing of CPU caches, who knows? If
> > the string length is on the same cache line as the contents of the
> > string you're reading, then evicting that to go read the value of a
> > boxed integer somewhere radically different is slow.
>
> But the string value is boxed as well, right?
>
>
> The key is locality. If the string length and data happen to live nearby
> in memory (in the same box, so to speak), there's a decent chance that
> reading one brings the other into the cache, and you get a nice hit
> depending on your subsequent operation.
>
> Here I'm just speculating, as I said. In managed languages such as
> Lisps, it's somewhat unpredictable. It's also always hardware dependent.
> Though given C/C++, a known processor and the right application, this
> will make a world of difference, and will yield truly "weird" results
> (which aren't weird at all after you understand the logic). Like, for
> example a vector being much better at sorted insertion than a linked
> list. (!) Look it up. Bjarne Stroustrup has one of those talks.
When you have to do some work, better memory locality can indeed change
a lot. But in this case we have an already computed value vs. something
the code still needs to compute, however fast that is.
Accessing function arguments must currently be much faster than looking
up the current scope defined with 'let'.
Anyway, looking at what else could be removed, now that the extra
allocation in 'match-data' is gone, what really speeds it up 2x-11x
(depending on whether GC kicks in, but it more often doesn't), is
commenting out the line:
(setq str (copy-sequence str))
So if it were possible to rearrange completion-pcm--hilit-commonality
not to have to modify the strings (probably removing the function
altogether?), that would improve the potential performance of c-a-p-f
quite a bit, for fido-mode and other frontends (depending on how much
overhead the other layers add).
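To get a feel for how much of the cost comes from the copies alone, here is a small sketch that walks the symbol names in obarray with and without `copy-sequence`. The helper `my/collect` is hypothetical, and the experiment leaves out the regexp matching and propertizing that the real function also does:

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)

(defun my/collect (strings copy)
  "Return a fresh list of STRINGS, copying each string when COPY is non-nil."
  (mapcar (lambda (s) (if copy (copy-sequence s) s)) strings))

;; Each `benchmark-run' result is (ELAPSED GC-COUNT GC-ELAPSED), so the
;; GC pressure caused by the copies shows up in the last two numbers.
(let ((names (all-completions "" obarray)))
  (list :count        (length names)
        :with-copy    (benchmark-run 10 (my/collect names t))
        :without-copy (benchmark-run 10 (my/collect names nil))))
```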
Ultimately, the scoring information doesn't have to live in the text
properties. For sorting, the frontend could allocate a hash table, then
ask the [backends? styles?] for completion scores on each item and sort
based on that. Since faces are needed only for the completions that are
currently displayed, even having to repeat the regexp matching stuff for
each of them later would be no big deal performance-wise, compared to
the current approach.
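The hash-table idea could be sketched roughly as follows. The scoring function `my/toy-score` is a made-up stand-in (prefix match, shorter candidates score higher), not the flex style's real scoring, and `my/sort-by-score` is a hypothetical frontend helper:

```elisp
;;; -*- lexical-binding: t -*-

(defun my/toy-score (pattern candidate)
  "Return a toy score for CANDIDATE against PATTERN.
Hypothetical stand-in for whatever a completion style would report."
  (if (string-prefix-p pattern candidate)
      (/ (float (length pattern)) (max 1 (length candidate)))
    0.0))

(defun my/sort-by-score (pattern candidates)
  "Return CANDIDATES sorted by descending score for PATTERN.
Scores live in a hash table, so the candidate strings are neither
copied nor modified."
  (let ((scores (make-hash-table :test #'equal :size (length candidates))))
    (dolist (c candidates)
      (puthash c (my/toy-score pattern c) scores))
    ;; `sort' is destructive on lists, so sort a copy of the list (the
    ;; strings themselves are still shared with the caller).
    (sort (copy-sequence candidates)
          (lambda (a b) (> (gethash a scores) (gethash b scores))))))

(my/sort-by-score "fo" '("foobar" "bar" "foo"))
```

Faces would then be computed only for the handful of candidates that are actually displayed.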
Anyway, these are musings for the much-discussed future iteration of the
API. With the current version, and tied by backward compatibility, it
might be possible to wring 10ms of improvement by consolidating text
property changes somehow, but likely no more than that.
Looking forward to your analysis of fido-vertical-mode's performance
improvement over the "normal" one.
[completion-pcm-score-struct.diff (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 11 Jun 2021 17:10:02 GMT)
Message #65 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> On 07.06.2021 11:52, João Távora wrote:
>
>> Maybe moving all of them to parameters and return values (making it a
>> static function and having the caller manage state) would help, I
>> haven't tried that exactly.
>> Normally, in those adventures you end up with the same allocations
>> somewhere else, and uglier code. But you can try.
>
> I gave it a try, with little success (patch attached, for
> posterity). Since there's no multiple value returns in Lisp, I had to
> define a container for the three values.
And if there were multiple values, you can bet the container for them
wouldn't be free ;-)
> The performance is basically the same, which seems to indicate that
> either Elisp has to allocate very little for a compiled lambda code,
> or it's optimized out (which would also make sense: the only thing
> necessary for it is a new container for the current scope).
Which lambda are we talking about? Is it `update-score-and-face`? If
so, I would guess that the capture of `score-denominator` is what takes
space, and that space is no bigger than another variable in that let
scope.
>> Though given C/C++, a known processor and the right application,
>> this will make a world of difference, and will yield truly "weird"
>> results (which aren't weird at all after you understand the
>> logic). Like, for example a vector being much better at sorted
>> insertion than a linked list. (!) Look it up. Bjarne Stroustrup has
>> one of those talks.
> When you have to do some work, better memory locality can indeed
> change a lot. But in this case we have an already computed value
> vs. something the code still needs to compute, however fast that is.
But `length` of a string, in any sane string implementation, _is_
accessing "an already computed value", which likely lives right beside
the data. In Emacs, it seems to be two pointers (8 bytes) apart from
the data. In a system with 64-byte L1/2/3 cache lines it still
theoretically makes up to 52 bytes come in "for free" after you read the
length. But to be honest I tried a bit and haven't come up with
benchmarks to help confirm -- or help dispel -- this theory. Maybe you
can distill your "weird" experiment down to a code snippet?
> Accessing function arguments must be currently much faster than
> looking up the current scope defined with 'let'.
In a compiled CL system, I would expect the former to use the stack and
the latter to use the heap, but it wouldn't make any difference in reading
the variable's value, I think. But Elisp is byte-compiled, not natively
compiled (except for the new native compilation, which I haven't tried),
and I don't understand how the byte-compiler chooses byte-codes, so all
bets are off.
> Anyway, looking at what else could be removed, now that the extra
> allocation in 'match-data' is gone, what really speeds it up 2x-11x
> (depending on whether GC kicks in, but it more often doesn't), is
> commenting out the line:
>
> (setq str (copy-sequence str))
>
> So if it were possible to rearrange completion-pcm--hilit-commonality
> not to have to modify the strings (probably removing the function
> altogether?), that would improve the potential performance of c-a-p-f
> quite a bit, for fido-mode and other frontends (depending on how much
> overhead the other layers add).
Very interesting. I don't know what the matter is with modifying the
string itself. Is it because we want to protect its 'face' property?
Maybe, but what's the harm in changing it? Maybe Stefan knows. Stefan,
are you reading this far?
If we do want to protect the shared 'face' property -- and only 'face'
-- then we could very well add some other property about face that the
frontend could read "just in time" before it itself makes a copy of the
string to display to the user.
This technique appears to be slightly simpler than using the hash-table
indirection you propose (we would need something like that if, for some
reason, we absolutely could not touch the string's property list.)
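A minimal sketch of that "other property" technique, under the assumption that the match ranges are stored under a private property and only turned into faces on a copy at display time. The property name `my-completion-match-ranges` and both helpers are hypothetical:

```elisp
;;; -*- lexical-binding: t -*-

(defun my/record-match (str ranges)
  "Store RANGES, a list of (START . END) pairs, on STR and return STR.
This mutates STR in place but leaves its `face' property untouched."
  (put-text-property 0 (length str) 'my-completion-match-ranges ranges str)
  str)

(defun my/fontify-for-display (str)
  "Return a copy of STR with its recorded match ranges highlighted.
Only the copy gets a `face' property; STR itself stays face-free."
  (let ((copy (copy-sequence str)))
    (dolist (range (get-text-property 0 'my-completion-match-ranges str))
      (add-face-text-property (car range) (cdr range)
                              'completions-common-part nil copy))
    copy))

;; The frontend would call `my/fontify-for-display' only for the few
;; candidates it is about to show.
(let ((candidate (my/record-match (concat "find-" "file") '((0 . 4)))))
  (my/fontify-for-display candidate))
```

The copy still happens, but once per displayed candidate instead of once per matching candidate.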
> Anyway, these are musings for the much-discussed future iteration of
> the API. With the current version, and tied by backward compatibility,
Maybe I'm missing something, but I don't see why my above idea requires
changing _that_ much of the API (a bit would change yes). It's a matter
of letting frontends opt-out of the current readily-available
face-propertized completions and opt-into a display-time facility that
does this propertization.
But if the speedup is big, I'd revisit the rationale for requiring those
copies to be performed in the first place. In my (very brief) testing
it doesn't hurt a bit to remove it.
> Looking forward to your analysis of fido-vertical-mode's performance
> improvement over the "normal" one.
Will take a look now.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 11 Jun 2021 22:36:01 GMT)
Message #68 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 11.06.2021 20:09, João Távora wrote:
>> The performance is basically the same, which seems to indicate that
>> either Elisp has to allocate very little for a compiled lambda code,
>> or it's optimized out (which would also make sense: the only thing
>> necessary for it is a new container for the current scope).
>
> Which lambda are we talking about? Is it `update-score-and-face`? If
> so, I would guess that the capture of `score-denominator` is what takes
> space, and that space is no bigger than another variable in that let
> scope.
Yes, that one.
> But `length` of a string, in any sane string implementation, _is_
> accessing "an already computed value", which likely lives right beside
> the data. In Emacs, it seems to be two pointers (8 bytes) apart from
> the data. In a system with 64-byte L1/2/3 cache lines it still
> theoretically makes up to 52 bytes come in "for free" after you read the
> length. But to be honest I tried a bit and haven't come up with
> benchmarks to help confirm -- or help dispel -- this theory. Maybe you
> can distill your "weird" experiment down to a code snippet?
I've tried reproducing the effect with a small snippet, and failed. :-(
>> Anyway, looking at what else could be removed, now that the extra
>> allocation in 'match-data' is gone, what really speeds it up 2x-11x
>> (depending on whether GC kicks in, but it more often doesn't), is
>> commenting out the line:
>>
>> (setq str (copy-sequence str))
>>
>> So if it were possible to rearrange completion-pcm--hilit-commonality
>> not to have to modify the strings (probably removing the function
>> altogether?), that would improve the potential performance of c-a-p-f
>> quite a bit, for fido-mode and other frontends (depending on how much
>> overhead the other layers add).
>
> Very interesting. I don't know what the matter is with modifying the
> string itself. Is it because we want to protect its 'face' property?
> Maybe, but what's the harm in changing it?
I imagine it's just a "correctness" thing. Text properties are part of
the string's identity. We add text properties, so we make a copy because
we don't own the original list (it might be saved to some constant and
also used for, I don't know, IMenu items?)
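The ownership concern can be illustrated with a small sketch: adding a face to a string changes that string object for everyone who holds a reference to it, while a copy made beforehand is unaffected. The function name is hypothetical:

```elisp
;;; -*- lexical-binding: t -*-

(defun my/face-leak-demo ()
  "Show that text properties added to a string are visible via every reference."
  (let* ((shared (concat "find-" "file")) ; freshly allocated string
         (alias shared)                   ; second reference, same object
         (copy (copy-sequence shared)))   ; independent copy, made first
    (add-face-text-property 0 4 'bold nil shared)
    (list :alias (get-text-property 0 'face alias)
          :copy  (get-text-property 0 'face copy))))

(my/face-leak-demo)
```

Here `alias` stands for any other holder of the original string, such as a cached completion table.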
> If we do want to protect the shared 'face' property -- and only 'face'
> -- then we could very well add some other property about face that the
> frontend could read "just in time" before it itself makes a copy of the
> string to display to the user.
Yes, it's an option (though a less elegant one): apply some namespaced
text properties with the necessary data. And then we'd also be able to
fontify "just in time".
Do we have any "frozen strings" in Emacs, which absolutely could not be
modified? Do we plan to?
> This technique appears to be slightly simpler than using the hash-table
> indirection you propose (we would need something like that if, for some
> reason, we absolutely could not touch the string's property list.)
I disagree it's a simpler technique, but it would indeed be a simpler
change, based on the current implementation.
>> Anyway, these are musings for the much-discussed future iteration of
>> the API. With the current version, and tied by backward compatibility,
>
> Maybe I'm missing something, but I don't see why my above idea requires
> changing _that_ much of the API (a bit would change yes). It's a matter
> of letting frontends opt-out of the current readily-available
> face-propertized completions and opt-into a display-time facility that
> does this propertization.
Even your version is a breaking enough change to be a pain, but possibly
not beneficial enough to bother all consumers with, until we also add
some more awaited features, I guess.
But I don't mind it myself, and I'm happy to update Company. Either way it's
a step forward.
> But if the speedup is big, I'd revisit the rationale for requiring those
> copies to be performed in the first place.
With fido-vertical-mode, and with that particular input, it's
Elapsed time: 0.130773s (0.031547s in 1 GCs)
without copy-sequence, and
Elapsed time: 0.169842s (0.069740s in 4 GCs)
with it. Not game changing, but definitely measurable.
> In my (very brief) testing
> it doesn't hurt a bit to remove it.
Same. But it likely depends on where the strings came from. In the most
usual case, of course, they are created at runtime.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 11 Jun 2021 22:42:01 GMT)
Message #71 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 12.06.2021 01:34, Dmitry Gutov wrote:
> With fido-vertical-mode, and with that particular input, it's
>
> Elapsed time: 0.130773s (0.031547s in 1 GCs)
>
> without copy-sequence, and
>
> Elapsed time: 0.169842s (0.069740s in 4 GCs)
>
> with it. Not game changing, but definitely measurable.
it = icomplete-completions' runtime
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 11 Jun 2021 23:26:01 GMT)
Message #74 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> Looking forward to your analysis of fido-vertical-mode's performance
> improvement over the "normal" one.
So, I benchmarked before and after this patch to icomplete.el:
diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index 08b4ef2030..3561ebfa04 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -858,16 +858,8 @@ icomplete-completions
;; removing making `comps' a proper list.
(base-size (prog1 (cdr last)
(if last (setcdr last nil))))
- (most-try
- (if (and base-size (> base-size 0))
- (completion-try-completion
- name candidates predicate (length name) md)
- ;; If the `comps' are 0-based, the result should be
- ;; the same with `comps'.
- (completion-try-completion
- name comps nil (length name) md)))
- (most (if (consp most-try) (car most-try)
- (if most-try (car comps) "")))
+ (most-try nil)
+ (most "")
;; Compare name and most, so we can determine if name is
;; a prefix of most, or something else.
(compare (compare-strings name nil nil
The patch itself nullifies the calculation of the 'determ' thing that I
and presumably some other users don't value that much. It doesn't
affect fido-mode's basic functionality.
How did I benchmark? Well, to measure the delay the user experiences
until all completions are presented I had to take out the
`while-no-input` in icomplete-exhibit so that this test would work:
;; After the form, type C-u C-x C-e C-m in quick succession
(benchmark-run (completing-read "bla" obarray))
If I don't remove this `while-no-input`, icomplete will not lose time
showing all the completions and will instead select just the first one.
That's a very nice feature for actual use, but for this benchmark that
is not what I want: I want to measure the time to show all the
completions.
Then, the times reported by benchmark-run are the same ones the user
sees if he waits to see the completions. Now the values:
Before my patch:
(1.802209488 5 1.3678843490000077)
(1.609066281 4 1.1170432569999775)
(1.878972079 5 1.3725165670000479)
(1.901952581 5 1.3979494059999524)
(1.820800064 5 1.3283940110000003)
After the patch:
(0.552051921 1 0.3079724459999511)
(0.58396499 1 0.3038616050000087)
(0.861106587 2 0.6046198220000178)
(0.611551175 1 0.30275532399997473)
(0.62500199 1 0.3160454470000218)
Without the patch but with icomplete-vertical-mode:
(0.645366711 1 0.3412892389999911)
(0.6256968110000001 1 0.3234302760000105)
(0.9716317630000001 2 0.6676939319999633)
(0.6414442749999999 1 0.3325084230000357)
(0.627684562 1 0.32241421699995954)
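For comparing the three runs, the elapsed times (the first element of each `benchmark-run` result above) can be averaged with a few lines of Elisp. The variable and function names below are hypothetical; the numbers are copied from the lists above:

```elisp
;;; -*- lexical-binding: t -*-

(defvar my/elapsed-before
  '(1.802209488 1.609066281 1.878972079 1.901952581 1.820800064)
  "Elapsed seconds before the patch.")

(defvar my/elapsed-after
  '(0.552051921 0.58396499 0.861106587 0.611551175 0.62500199)
  "Elapsed seconds after the patch.")

(defvar my/elapsed-vertical
  '(0.645366711 0.6256968110000001 0.9716317630000001
    0.6414442749999999 0.627684562)
  "Elapsed seconds without the patch but with icomplete-vertical-mode.")

(defun my/mean (numbers)
  "Return the arithmetic mean of NUMBERS as a float."
  (/ (apply #'+ numbers) (float (length numbers))))

(list :before   (my/mean my/elapsed-before)
      :after    (my/mean my/elapsed-after)
      :vertical (my/mean my/elapsed-vertical)
      :speedup  (/ (my/mean my/elapsed-before) (my/mean my/elapsed-after)))
```

With only five samples per configuration, and one run in each of the faster sets hitting a second GC, the means are a rough guide only.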
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 12 Jun 2021 00:44:01 GMT)
Message #77 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 12.06.2021 02:24, João Távora wrote:
> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>
>> Looking forward to your analysis of fido-vertical-mode's performance
>> improvement over the "normal" one.
>
> So, I benchmarked before and after this patch to icomplete.el:
>
> diff --git a/lisp/icomplete.el b/lisp/icomplete.el
> index 08b4ef2030..3561ebfa04 100644
> --- a/lisp/icomplete.el
> +++ b/lisp/icomplete.el
> @@ -858,16 +858,8 @@ icomplete-completions
> ;; removing making `comps' a proper list.
> (base-size (prog1 (cdr last)
> (if last (setcdr last nil))))
> - (most-try
> - (if (and base-size (> base-size 0))
> - (completion-try-completion
> - name candidates predicate (length name) md)
> - ;; If the `comps' are 0-based, the result should be
> - ;; the same with `comps'.
> - (completion-try-completion
> - name comps nil (length name) md)))
> - (most (if (consp most-try) (car most-try)
> - (if most-try (car comps) "")))
> + (most-try nil)
> + (most "")
> ;; Compare name and most, so we can determine if name is
> ;; a prefix of most, or something else.
> (compare (compare-strings name nil nil
All right, so this is not about try-completion, it's about
completion-try-completion. That makes sense.
> The patch itself nullifies the calculation of the 'determ' thing that I
> and presumably some other users don't value that much. It doesn't
> affect fido-mode's basic functionality.
>
> How did I benchmark? Well, to measure the delay the user experiences
> until all completions are presented I had to take out the
> `while-no-input` in icomplete-exhibit so that this test would work:
>
> ;; After the form, type C-u C-x C-e C-m in quick succession
> (benchmark-run (completing-read "bla" obarray))
>
> If I don't remove this `while-no-input`, icomplete will not lose time
> showing all the completions and will instead select just the first one.
> That's a very nice feature for actual use, but for this benchmark that
> is not what I want: I want to measure the time to show all the
> completions.
Did the same, can repro.
> Then, the times presented by benchmark-run are the same that the user
> sees if he waits to see the completions. Now the values:
>
> Before my patch:
>
> (1.802209488 5 1.3678843490000077)
> (1.609066281 4 1.1170432569999775)
> (1.878972079 5 1.3725165670000479)
> (1.901952581 5 1.3979494059999524)
> (1.820800064 5 1.3283940110000003)
>
> After the patch:
>
> (0.552051921 1 0.3079724459999511)
> (0.58396499 1 0.3038616050000087)
> (0.861106587 2 0.6046198220000178)
> (0.611551175 1 0.30275532399997473)
> (0.62500199 1 0.3160454470000218)
I get
(0.377195885 10 0.24448539800000013)
before and
(0.245218061 6 0.1390041310000001)
after. A solid improvement.
BTW, if I just stick benchmark-progn around icomplete-completions like
diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index 08b4ef2030..b9fe3e1836 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -678,12 +678,13 @@ icomplete-exhibit
;; seems to trigger it fairly often!
(while-no-input-ignore-events '(selection-request))
(text (while-no-input
- (icomplete-completions
- field-string
- (icomplete--completion-table)
- (icomplete--completion-predicate)
- (if (window-minibuffer-p)
- (eq minibuffer--require-match t)))))
+ (benchmark-progn
+ (icomplete-completions
+ field-string
+ (icomplete--completion-table)
+ (icomplete--completion-predicate)
+ (if (window-minibuffer-p)
+ (eq minibuffer--require-match t))))))
(buffer-undo-list t)
deactivate-mark)
;; Do nothing if while-no-input was aborted.
...it reports
Elapsed time: 0.329006s (0.246073s in 10 GCs)
vs
Elapsed time: 0.169200s (0.113762s in 5 GCs)
I suppose the 40-70ms difference is due to delay in typing.
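For readers who have not used it, `benchmark-progn` evaluates its body, prints an "Elapsed time: ..." message like the ones quoted above, and returns the body's value, so it can wrap a form without changing what the surrounding code sees. A small standalone sketch, with a hypothetical function name:

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)

(defun my/timed-completions (prefix)
  "Return the symbol names starting with PREFIX, reporting the time taken."
  (benchmark-progn
    (all-completions prefix obarray)))

;; The return value is still the list of completions.
(length (my/timed-completions "icomplete-"))
```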
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 13 Jun 2021 14:30:02 GMT)
Message #80 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> On 12.06.2021 02:24, João Távora wrote:
>> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>>
>>> Looking forward to your analysis of fido-vertical-mode's performance
>>> improvement over the "normal" one.
>> So, I benchmarked before and after this patch to icomplete.el:
>> diff --git a/lisp/icomplete.el b/lisp/icomplete.el
>> index 08b4ef2030..3561ebfa04 100644
>> --- a/lisp/icomplete.el
>> +++ b/lisp/icomplete.el
>> @@ -858,16 +858,8 @@ icomplete-completions
>> ;; removing making `comps' a proper list.
>> (base-size (prog1 (cdr last)
>> (if last (setcdr last nil))))
>> - (most-try
>> - (if (and base-size (> base-size 0))
>> - (completion-try-completion
>> - name candidates predicate (length name) md)
>> - ;; If the `comps' are 0-based, the result should be
>> - ;; the same with `comps'.
>> - (completion-try-completion
>> - name comps nil (length name) md)))
>> - (most (if (consp most-try) (car most-try)
>> - (if most-try (car comps) "")))
>> + (most-try nil)
>> + (most "")
>> ;; Compare name and most, so we can determine if name is
>> ;; a prefix of most, or something else.
>> (compare (compare-strings name nil nil
>
> All right, so this is not about try-completion, it's about
> completion-try-completion. That makes sense.
Yeah, to be honest, once I'm done actually using these functions I
immediately evict the differences between try-completion,
completion-try-completion, try-try-completion-completion, or any of
these yoda-speak variations from my mental cache.
What I meant here is that there was something apparently useless and slow (to
fido-mode at least) going on in that else branch.
> Elapsed time: 0.329006s (0.246073s in 10 GCs)
>
> vs
>
> Elapsed time: 0.169200s (0.113762s in 5 GCs)
>
> I suppose the 40-70ms difference is due to delay in typing.
No idea. In my (slower?) system, I typed C-u C-x C-e C-m pretty fast.
Presumably the C-m goes in before pp-eval-last-sexp has a chance to read
more input so I wouldn't think it's a delay in typing. I could
investigate, but since your measurements confirm the same tendency
anyway, I think this simple patch is what's needed to close this issue.
diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index 08b4ef2030..5d37f47e7d 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -859,13 +859,14 @@ icomplete-completions
(base-size (prog1 (cdr last)
(if last (setcdr last nil))))
(most-try
- (if (and base-size (> base-size 0))
- (completion-try-completion
- name candidates predicate (length name) md)
- ;; If the `comps' are 0-based, the result should be
- ;; the same with `comps'.
- (completion-try-completion
- name comps nil (length name) md)))
+ (and (not fido-mode) ; Fido avoids these expensive calculations.
+ (if (and base-size (> base-size 0))
+ (completion-try-completion
+ name candidates predicate (length name) md)
+ ;; If the `comps' are 0-based, the result should be
+ ;; the same with `comps'.
+ (completion-try-completion
+ name comps nil (length name) md))))
(most (if (consp most-try) (car most-try)
(if most-try (car comps) "")))
;; Compare name and most, so we can determine if name is
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 13 Jun 2021 14:56:01 GMT)
Message #83 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
>> Very interesting. I don't know what the matter is with modifying
>> the string itself. Is it because we want to protect its 'face' property?
>> Maybe, but what's the harm in changing it?
>
> I imagine it's just a "correctness" thing. Text properties are part of
> the string's identity. We add text properties, so we make a copy
> because we don't own the original list (it might be saved to some
> constant and also used for, I don't know, IMenu items?)
I just confirmed it's not for correctness. If one foregoes the copy, in
fido-mode a C-h f followed by some input followed by C-g and an M-x and
no input (empty pattern) will show interesting results, i.e. a list of
propertized strings when nothing should be propertized.
But maybe that's a question of removing the propertization when the
pattern is empty? I'll try later.
>> If we do want to protect the shared 'face' property -- and only 'face'
>> -- then we could very well add some other property about face that the
>> frontend could read "just in time" before it itself makes a copy of the
>> string to display to the user.
>
> Yes, it's an option (though a less elegant one): apply some namespaced
> text properties with the necessary data. And then we'd also be able to
> fontify "just in time".
>
> Do we have any "frozen strings" in Emacs, which absolutely could not
> be modified? Do we plan to?
Immutable strings? And error when one tries to? Or just ignore the
modification? And how would that help here?
> I disagree it's a simpler technique, but it would indeed be a simpler
> change, based on the current implementation.
simpler means simpler in my book :-)
> But I don't mind it myself, and happy to update Company. Either way
> it's a step forward.
If Company and fido-mode and a couple more outside the core/Elpa are all
that's needed, it's probably warranted. But there are so many frontends
right now, I don't know... We'd need some "opt into the optimization",
I think.
>> But if the speedup is big, I'd revisit the rationale for requiring those
>> copies to be performed in the first place.
>
> With fido-vertical-mode, and with that particular input, it's
>
> Elapsed time: 0.130773s (0.031547s in 1 GCs)
>
> without copy-sequence, and
>
> Elapsed time: 0.169842s (0.069740s in 4 GCs)
>
> with it. Not game changing, but definitely measurable.
I don't have these results. I tried the previous experiment:
;; C-u C-x C-e C-m in quick succession
(benchmark-run (completing-read "bla" obarray))
And turned icomplete.el's while-no-input into a progn.
In an Emacs -Q where (length (all-completions "" obarray)) is about
20000.
;; with copy
(0.39647753 6 0.22811240199999983)
(0.431950471 7 0.263988651)
(0.451116177 6 0.2249686070000001)
;; without copy, small but measurable speedup
(0.29890632 2 0.08419541699999966)
(0.293501099 2 0.08622194699999985)
(0.306566633 3 0.0853211100000002)
In a loaded Emacs where (length (all-completions "" obarray)) is 64554
;; with copy
(2.869362171 6 2.3882547280000495)
(2.909661303 6 2.4209153659999743)
(2.845522439 6 2.3638140250000106)
;; without copy. Huge speedup.
(0.79817337 1 0.4526993239999797)
(0.8231736510000001 1 0.4752496449999626)
(0.719004478 1 0.4016016420000028)
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 14 Jun 2021 00:09:02 GMT)
Message #86 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 13.06.2021 17:29, João Távora wrote:
> Yeah, to be honest, once I'm done actually using these functions I
> immediately evict the differences between try-completion,
> completion-try-completion, try-try-completion-completion, or any of
> these yoda-speak variations from my mental cache.
>
> What I meant here is that there was something apparently useless and slow (to
> fido-mode at least) going on in that else branch.
And there it was!
>> Elapsed time: 0.329006s (0.246073s in 10 GCs)
>>
>> vs
>>
>> Elapsed time: 0.169200s (0.113762s in 5 GCs)
>>
>> I suppose the 40-70ms difference is due to delay in typing.
>
> No idea. In my (slower?) system, I typed C-u C-x C-e C-m pretty fast.
> Presumably the C-m goes in before pp-eval-last-sexp has a chance to read
> more input so I wouldn't think it's a delay in typing.
Some input latency must be there, of course it depends on how fast Emacs
is handling the previous events, how fast the machine is, etc.
Anyway, I was just describing an easier way to benchmark the same effect
(without having to hit the key sequence very quickly). Hope you or
someone else finds it useful.
> I could
> investigate, but since your measurements confirm the same tendency
> anyway, I think this simple patch is what's needed to close this issue.
Haven't tested it myself, but it looks like it will almost certainly work.
> diff --git a/lisp/icomplete.el b/lisp/icomplete.el
> index 08b4ef2030..5d37f47e7d 100644
> --- a/lisp/icomplete.el
> +++ b/lisp/icomplete.el
> @@ -859,13 +859,14 @@ icomplete-completions
> (base-size (prog1 (cdr last)
> (if last (setcdr last nil))))
> (most-try
> - (if (and base-size (> base-size 0))
> - (completion-try-completion
> - name candidates predicate (length name) md)
> - ;; If the `comps' are 0-based, the result should be
> - ;; the same with `comps'.
> - (completion-try-completion
> - name comps nil (length name) md)))
> + (and (not fido-mode) ; Fido avoids these expensive calculations.
Perhaps predicate it on the value of icomplete-hide-common-prefix instead?
fido-mode sets it to nil, and this way we retain a better level of
abstraction, and better backward compatibility for vanilla
icomplete-mode users.
Next step might be switching this var's default value to nil.
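As a sketch of that suggestion (an illustration, not the change that was eventually installed), the guard could be pulled into a hypothetical helper that skips the expensive call whenever `icomplete-hide-common-prefix` is nil. Only the 0-based `comps` branch from the diff above is shown:

```elisp
;;; -*- lexical-binding: t -*-
(require 'icomplete)

(defun my/icomplete-most-try (name comps &optional md)
  "Return the `completion-try-completion' result for NAME over COMPS.
Return nil without doing the work when `icomplete-hide-common-prefix'
is nil.  Hypothetical helper, not part of icomplete.el."
  (and icomplete-hide-common-prefix
       (completion-try-completion name comps nil (length name) md)))

;; With the variable bound to nil, as fido-mode sets it, nothing is computed.
(let ((icomplete-hide-common-prefix nil))
  (my/icomplete-most-try "fo" '("foobar" "foobaz")))
```

Keying on the user option rather than on `fido-mode` means plain icomplete-mode users who set the option to nil get the same saving.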
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 14 Jun 2021 00:17:02 GMT)
Message #89 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Jun 14, 2021 at 1:08 AM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> Anyway, I was just describing an easier way to benchmark the same effect
> (without having to hit the key sequence very quickly). Hope you or
> someone else find it useful.
Yes. It is useful. I didn't know about benchmark-progn.
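For readers who have not used it either, here is a minimal sketch of `benchmark-progn'. It is not the form used earlier in this thread: the helper name `my-sketch-time-completions' is made up, and it only times listing the candidates, not the completion UI itself.

```elisp
(require 'benchmark)

(defun my-sketch-time-completions (prefix)
  "Return the number of function names in `obarray' starting with PREFIX.
The elapsed time is reported in the echo area by `benchmark-progn'."
  ;; `benchmark-progn' messages "Elapsed time: ..." and returns the
  ;; value of its last body form.
  (benchmark-progn
    (length (all-completions prefix obarray #'fboundp))))

;; Hypothetical usage: evaluate and watch the echo area.
(my-sketch-time-completions "ab")
```

Because the body runs without any keyboard input, the timing does not depend on how fast the keys are typed.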
> > I could
> > investigate, but since your measurements confirm the same tendency
> > anyway, I think this simple patch is what's needed to close this issue.
>
> Haven't tested it myself, but it looks like it will almost certainly work.
>
> > diff --git a/lisp/icomplete.el b/lisp/icomplete.el
> > index 08b4ef2030..5d37f47e7d 100644
> > --- a/lisp/icomplete.el
> > +++ b/lisp/icomplete.el
> > @@ -859,13 +859,14 @@ icomplete-completions
> > (base-size (prog1 (cdr last)
> > (if last (setcdr last nil))))
> > (most-try
> > - (if (and base-size (> base-size 0))
> > - (completion-try-completion
> > - name candidates predicate (length name) md)
> > - ;; If the `comps' are 0-based, the result should be
> > - ;; the same with `comps'.
> > - (completion-try-completion
> > - name comps nil (length name) md)))
> > + (and (not fido-mode) ; Fido avoids these expensive calculations.
>
> Perhaps predicate it on the value of icomplete-hide-common-prefix instead?
>
> fido-mode sets it to nil, and this way we retain a better level of
> abstraction, and better backward compatibility for vanilla
> icomplete-mode users.
This is a good idea, the level of abstraction. But what is this
"common prefix" anyway? Is it the same as the "determ"
thing, or the "[mplete...] dance" as I called it earlier? Shouldn't
fido-mode then _hide_ it?
I'm confused, but if you're not, go ahead and make that more
abstract change instead of relying on fido-mode.
> Next step might be switching this var's default value to nil.
Also confused, but no objections.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Thu, 17 Jun 2021 02:25:02 GMT)
Message #92 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 14.06.2021 03:16, João Távora wrote:
>> Perhaps predicate it on the value of icomplete-hide-common-prefix instead?
>>
>> fido-mode sets it to nil, and this way we retain a better level of
>> abstraction, and better backward compatibility for vanilla
>> icomplete-mode users.
> This is a good idea, the level of abstraction. But what is this
> "common prefix" anyway? Is it the same as the "determ"
> thing, or the "[mplete...] dance" as I called it earlier? Shouldn't
> fido-mode then _hide_ it?
>
> I'm confused, but if you're not, go ahead and make that more
> abstract change instead of relying on fido-mode.
So... it's a bit more complex than that. The 'most' value computes the
biggest "fuzzy" match (taking into account completion styles) and bases
the resulting display of the "single match" on that.
Before your patch the output could look like:
starfshe|(...t-file-process-shell-command) [Matched]
with the patch it's much less informative:
starfshe| [Matched]
...so it has value, whether the variable I mentioned above is t or nil.
It seems there are two ways to proceed from here:
- Just alter the printing logic in the "single match" case to print the
match text in full if it's not equal to the input string. I haven't
puzzled out the logic doing that yet.
- Try to keep the current behavior while avoiding the duplicate work.
About the latter option: the result of that most-try stuff is only
useful when there is only one match, right? But that work is performed
unconditionally.
Unless I'm missing something and the value does see some use in the
multiple-matches situations, the patch below both keeps the current
behavior and gives the same performance improvement:
diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index 08b4ef2030..fc88e2a3e0 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -859,13 +859,14 @@ icomplete-completions
              (base-size (prog1 (cdr last)
                           (if last (setcdr last nil))))
              (most-try
-              (if (and base-size (> base-size 0))
+              (unless (cdr comps)
+                (if (and base-size (> base-size 0))
+                    (completion-try-completion
+                     name candidates predicate (length name) md)
+                  ;; If the `comps' are 0-based, the result should be
+                  ;; the same with `comps'.
                   (completion-try-completion
-                   name candidates predicate (length name) md)
-                ;; If the `comps' are 0-based, the result should be
-                ;; the same with `comps'.
-                (completion-try-completion
-                 name comps nil (length name) md)))
+                   name comps nil (length name) md))))
              (most (if (consp most-try) (car most-try)
                      (if most-try (car comps) "")))
              ;; Compare name and most, so we can determine if name is
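The guard in the patch can be tried in isolation. The following is only a sketch under simplifying assumptions: `my-sketch-most-try' is a made-up name, COMPS is a plain list of strings with no base size, and the predicate and metadata arguments are left out.

```elisp
(defun my-sketch-most-try (name comps)
  "Return the `completion-try-completion' result for NAME, or nil.
The call is skipped when COMPS holds more than one candidate."
  ;; (cdr comps) is nil only when COMPS has at most one element,
  ;; so the expensive call runs only in the single-match case.
  (unless (cdr comps)
    (completion-try-completion name comps nil (length name))))

;; Hypothetical usage:
(my-sketch-most-try "fo" '("foo" "bar")) ; two candidates, call skipped
(my-sketch-most-try "fo" '("foo"))       ; one candidate, call performed
```

The check itself costs one `cdr', so the multiple-match path no longer pays for `completion-try-completion' at all.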
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Thu, 17 Jun 2021 02:37:02 GMT)
Message #95 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 13.06.2021 17:55, João Távora wrote:
> I just confirmed it's not for correctness. If one foregoes the copy, in
> fido-mode a C-h f followed by some input followed by C-g and an M-x and
> no input (empty pattern) will show interesting results, i.e. a list of
> propertized strings when nothing should be propertized.
That's also an example of correctness problem, just the more obvious
kind. It probably happens in a number of situations, e.g. all (?) uses
of completion-table-with-cache.
> But maybe that's a question of removing the propertization when the
> pattern is empty? I'll try later.
That must add some performance penalty as well, for extra mutation. And
wouldn't cover some potential other uses of the same set of strings. In
IMenu, maybe?
The latter is hypothetical, of course.
>> Do we have any "frozen strings" in Emacs, which absolutely could not
>> be modified? Do we plan to?
>
> Immutable strings? And error when one tries to? Or just ignore the
> modification? And how would that help here?
It wouldn't help. It would do the opposite: basically forbid the
approach you suggest, mutation without copying.
>> I disagree it's a simpler technique, but it would indeed be a simpler
>> change, based on the current implementation.
>
> simpler means simpler in my book :-)
One is simpler diff, another is simpler resulting code. Both have their
upsides.
>> But I don't mind it myself, and happy to update Company. Either way
>> it's a step forward.
>
> If Company and fido-mode and a couple more outside the core/Elpa are all
> that's needed, it's probably warranted. But there are so many frontends
> right now, I don't know... We'd need some "opt into the optimization",
> I think.
Since all other users are third-party (and thus have short release
cycles), it shouldn't be too much of a problem. Some highlighting code
would start to fail, but probably without disastrous results. And then
people will issue updates to look for some new property when the old
expected ones are all missing.
>>> But if the speedup is big, I'd revisit the rationale for requiring those
>>> copies to be performed in the first place.
>>
>> With fido-vertical-mode, and with that particular input, it's
>>
>> Elapsed time: 0.130773s (0.031547s in 1 GCs)
>>
>> without copy-sequence, and
>>
>> Elapsed time: 0.169842s (0.069740s in 4 GCs)
>>
>> with it. Not game changing, but definitely measurable.
>
> I don't have these results. I tried the previous experiment:
>
> ;; C-u C-x C-e C-m in quick succession
> (benchmark-run (completing-read "bla" obarray))
>
> And turned icomplete.el's while-no-input into a progn.
>
> In an Emacs -Q where (length (all-completions "" obarray)) is about
> 20000.
>
> ;; with copy
> (0.39647753 6 0.22811240199999983)
> (0.431950471 7 0.263988651)
> (0.451116177 6 0.2249686070000001)
>
> ;; without copy, small but measurable speedup
> (0.29890632 2 0.08419541699999966)
> (0.293501099 2 0.08622194699999985)
> (0.306566633 3 0.0853211100000002)
>
> In a loaded Emacs where (length (all-completions "" obarray)) is 64554
>
> ;; with copy
> (2.869362171 6 2.3882547280000495)
> (2.909661303 6 2.4209153659999743)
> (2.845522439 6 2.3638140250000106)
>
> ;; without copy. Huge speedup.
> (0.79817337 1 0.4526993239999797)
> (0.8231736510000001 1 0.4752496449999626)
> (0.719004478 1 0.4016016420000028)
Even better.
My current session has 37559 symbols, so it's somewhere in the middle.
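To get a rough feel for the copying cost alone on a given session, something like the following could be used. It is a sketch, not the measurement quoted above: `my-sketch-copy-cost' is a made-up helper, and it times only `copy-sequence' over the symbol names, without any completion style or UI involved.

```elisp
(require 'benchmark)

(defun my-sketch-copy-cost ()
  "Time walking all symbol names with and without copying each string.
Each entry is (LABEL ELAPSED GC-COUNT GC-ELAPSED), using the
values returned by `benchmark-run'."
  (let ((names (all-completions "" obarray)))
    (list
     ;; Allocates a fresh string per candidate.
     (cons 'with-copy
           (benchmark-run 1 (mapcar #'copy-sequence names)))
     ;; Same traversal and list allocation, but no string copies.
     (cons 'without-copy
           (benchmark-run 1 (mapcar #'identity names))))))

;; Hypothetical usage:
(my-sketch-copy-cost)
```

The numbers depend on the size of `obarray' and on the GC state of the session, so they are only comparable within one session.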
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Thu, 17 Jun 2021 21:23:02 GMT)
Message #98 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
>>> I disagree it's a simpler technique, but it would indeed be a simpler
>>> change, based on the current implementation.
>> simpler means simpler in my book :-)
>
> One is simpler diff, another is simpler resulting code. Both have
> their upsides.
Oh, you meant The Big Redesign? I'm a fan of that too, not only
here but constantly, everywhere... That indeed means simpler resulting
code in the abstract. Problem is that it also means different resulting
code to different people. But it is definitely doable.
>>> But I don't mind it myself, and happy to update Company. Either way
>>> it's a step forward.
>> If Company and fido-mode and a couple more outside the core/Elpa are
>> all
>> that's needed, it's probably warranted. But there are so many frontends
>> right now, I don't know... We'd need some "opt into the optimization",
>> I think.
>
> Since all other users are third-party (and thus have short release
> cycles), it shouldn't be too much of a problem. Some highlighting code
> would start to fail, but probably without disastrous results. And then
> people will issue updates to look for some new property when the old
> expected ones are all missing.
OK. I can live with that rationale. So what are the places to touch
that "we" control?
- icomplete.el? for fido-mode & friends
- minibuffer.el, for the *Completions* buffer
- company.el
- Any notable others?
>> ;; with copy
>> (2.869362171 6 2.3882547280000495)
>> (2.909661303 6 2.4209153659999743)
>> (2.845522439 6 2.3638140250000106)
>> ;; without copy. Huge speedup.
>> (0.79817337 1 0.4526993239999797)
>> (0.8231736510000001 1 0.4752496449999626)
>> (0.719004478 1 0.4016016420000028)
>
> Even better.
>
> My current session has 37559 symbols, so it's somewhere in the middle.
Yes, this is a big performance bottleneck. But I wonder if tweaking GC
parameters would help here. I know nothing of Emacs GC parameters.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Thu, 17 Jun 2021 21:30:02 GMT)
Message #101 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> On 14.06.2021 03:16, João Távora wrote:
>>> Perhaps predicate it on the value of icomplete-hide-common-prefix instead?
>>>
>>> fido-mode sets it to nil, and this way we retain a better level of
>>> abstraction, and better backward compatibility for vanilla
>>> icomplete-mode users.
>> This is a good idea, the level of abstraction. But what is this
>> "common prefix" anyway? Is it the same as the "determ"
>> thing, or the "[mplete...] dance" as I called it earlier? Shouldn't
>> fido-mode then _hide_ it?
>> I'm confused, but if you're not, go ahead and make that more
>> abstract change instead of relying on fido-mode.
>
> So... it's a bit more complex than that.
Yes, my patch broke the things you mentioned.
> It seems there are two ways to proceed from here:
>
> - Just alter the printing logic in the "single match" case to print
> the match text in full if it's not equal to the input string. I
> haven't puzzled out the logic doing that yet.
>
> - Try to keep the current behavior while avoiding the duplicate work.
Both sound absolutely fine to me.
> About the latter option: the result of that most-try stuff is only
> useful when there is only one match, right?
No idea, but maybe.
> Unless I'm missing something and the value does see some use in the
> multiple-matches situations, the patch below both keeps the current
> behavior and gives the same performance improvement:
That'd be fantastic, but I doubt you'd be keeping the exact same
behaviour. I never understood it -- that's the thing here -- but I
think that completion-try-completion is doing more stuff when multiple
candidates matched by a pattern happen to share the same prefix or
suffix or something like that. I might be completely wrong, tho.
But really if you make this patch conditional to fido-mode or that other
var that you think is more abstract, I think it's fine and it's a very
clear win. I really doubt that the tiny number of fido-mode users care
about that behaviour anyway, but I'm sure they'll appreciate the
considerable speedup.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 04 Jul 2021 01:43:02 GMT)
Message #104 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 18.06.2021 00:29, João Távora wrote:
>> Unless I'm missing something and the value does see some use in the
>> multiple-matches situations, the patch below both keeps the current
>> behavior and gives the same performance improvement:
>
> That'd be fantastic, but I doubt you'd be keeping the exact same
> behaviour. I never understood it -- that's the thing here -- but I
> think that completion-try-completion is doing more stuff when multiple
> candidates matched by a pattern happen to share the same prefix or
> suffix or something like that. I might be completely wrong, tho.
Turns out that indeed the logic is used in the "multiple matches" case:
when icomplete-hide-common-prefix is non-nil.
Meaning, with icomplete-mode but not with fido-mode.
> But really if you make this patch conditional to fido-mode or that other
> var that you think is more abstract, I think it's fine and it's a very
> clear win. I really doubt that the tiny number of fido-mode users care
> about that behaviour anyway, but I'm sure they'll appreciate the
> considerable speedup.
So I have done the above. There is no change in observable behavior
either (AFAICS), so it's win-win.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 04 Jul 2021 01:54:02 GMT)
Message #107 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 18.06.2021 00:21, João Távora wrote:
> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>
>>>> I disagree it's a simpler technique, but it would indeed be a simpler
>>>> change, based on the current implementation.
>>> simpler means simpler in my book :-)
>>
>> One is simpler diff, another is simpler resulting code. Both have
>> their upsides.
>
> Oh, you meant the The Big Redesign? I'm a fan of that too, not only
> here but constantly, everywhere... That indeed means simpler resulting
> code in abstract. Problem is that also means different resulting code
> to different people. But is definitely doable.
I meant a particular change (not modifying the strings at all in
advance) which would be bigger and indeed might better fit The Big Redesign.
>>>> But I don't mind it myself, and happy to update Company. Either way
>>>> it's a step forward.
>>> If Company and fido-mode and a couple more outside the core/Elpa are
>>> all
>>> that's needed, it's probably warranted. But there are so many frontends
>>> right now, I don't know... We'd need some "opt into the optimization",
>>> I think.
>>
>> Since all other users are third-party (and thus have short release
>> cycles), it shouldn't be too much of a problem. Some highlighting code
>> would start to fail, but probably without disastrous results. And then
>> people will issue updates to look for some new property when the old
>> expected ones are all missing.
>
> OK. I can live with that rationale. So what are the places to touch
> that "we" control?
>
> - icomplete.el? for fido-mode & friends
> - minibuffer.el, for the *Completions* buffer
> - company.el
> - Any notable others?
corfu, consult, etc? Probably Ivy too. All of these are in GNU ELPA.
BTW, I think Daniel had some ideas about applying the face property
lazily as well. I can't find the particular discussion now, but perhaps
he can add to this discussion as well.
>>> ;; with copy
>>> (2.869362171 6 2.3882547280000495)
>>> (2.909661303 6 2.4209153659999743)
>>> (2.845522439 6 2.3638140250000106)
>>> ;; without copy. Huge speedup.
>>> (0.79817337 1 0.4526993239999797)
>>> (0.8231736510000001 1 0.4752496449999626)
>>> (0.719004478 1 0.4016016420000028)
>>
>> Even better.
>>
>> My current session has 37559 symbols, so it's somewhere in the middle.
>
> Yes, this is a big performance bottleneck. But i wonder if tweaking GC
> parameter would help here. I know nothing of Emacs GC parameters.
I think the current understanding is that by raising gc-cons-threshold we
exchange the number of GC pauses for larger latencies. I suppose one
could tune that value to a particular workload such that the 4 GCs in a
row that I had will turn into just one (and thus save on re-scanning the
heap 3 times), but data sets change.
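A quick way to experiment with that trade-off is to let-bind the threshold around the allocation-heavy part and count the collections. This is only a sketch: `my-sketch-gc-count' is a made-up helper, the 64 MiB value is an arbitrary assumption, and `gc-cons-percentage' (left at its current value here) also influences when GC runs.

```elisp
(require 'benchmark)

(defun my-sketch-gc-count (threshold)
  "Return how many GCs run while copying all symbol names.
`gc-cons-threshold' is bound to THRESHOLD for the duration."
  (let ((gc-cons-threshold threshold)
        (names (all-completions "" obarray)))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (nth 1 (benchmark-run 1 (mapcar #'copy-sequence names)))))

;; Hypothetical usage: the default threshold versus a larger one.
(list (my-sketch-gc-count 800000)
      (my-sketch-gc-count (* 64 1024 1024)))
```

Any value that works well here is tied to the current number of symbols, which is the "data sets change" caveat.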
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Wed, 07 Jul 2021 08:57:02 GMT)
Message #110 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 7/4/21 3:53 AM, Dmitry Gutov wrote:
>> - icomplete.el? for fido-mode & friends
>> - minibuffer.el, for the *Completions* buffer
>> - company.el
>> - Any notable others?
>
> corfu, consult, etc? Probably Ivy too. All of these are in GNU ELPA.
>
> BTW, I think Daniel had some ideas about applying the face property
> lazily as well. I can't find the particular discussion now, but perhaps
> he can add to this discussion as well.
Yes, Vertico and Corfu apply highlighting lazily. This leads to
significant performance wins. See `vertico--all-completions` in
https://github.com/minad/vertico/blob/main/vertico.el#L243-L279 and
bug#47711.
The technique I am using in Vertico and Corfu retains backward
compatibility, such that the strings are returned unmodified by the
completion style. Highlighting is applied lazily by copying the
candidate strings and mutating the copies. For now I am relying on advices.
One could add an optional argument (or dynamically bound variable) to
completion styles which tell the completion style to opt out of copying
the candidates and the highlighting.
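As a rough illustration of the copy-then-mutate idea described above (this is not the Vertico or Corfu code): the helper name `my-sketch-highlight-displayed' is made up, and highlighting a fixed-length prefix with the `completions-common-part' face stands in for whatever the completion style would really highlight.

```elisp
(require 'seq)

(defun my-sketch-highlight-displayed (candidates prefix-length count)
  "Return highlighted copies of the first COUNT strings in CANDIDATES.
The first PREFIX-LENGTH characters of each copy get the
`completions-common-part' face; CANDIDATES itself is left untouched."
  (mapcar (lambda (candidate)
            ;; Copy only the strings that will be shown, then
            ;; mutate the copy rather than the shared original.
            (let ((copy (copy-sequence candidate)))
              (add-face-text-property
               0 (min prefix-length (length copy))
               'completions-common-part nil copy)
              copy))
          (seq-take candidates count)))

;; Hypothetical usage: three candidates, only two are displayed.
(my-sketch-highlight-displayed
 (list "find-file" "find-library" "find-tag") 4 2)
```

The allocation cost then scales with the number of displayed candidates instead of the number of matches.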
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Wed, 11 Aug 2021 14:18:01 GMT)
Message #113 received at 48841 <at> debbugs.gnu.org (full text, mbox):
I prepared a patch which provides the API
`completion-filter-completions`. This function supports deferred
highlighting and returns additional data with the list of matching
completion candidates. The API supersedes the existing function
`completion-all-completions`.
The main goal of the new API is to avoid expensive string allocations
and highlighting during completion. This is particularly relevant for
continuously updating completion UIs like Icomplete or Vertico.
Furthermore the end position of the completion boundaries is returned
with the completion results. This information is not provided by the
existing `completion-all-completions` API.
See also the relevant bugs bug#47711 and bug#48841. I am looking forward
to your feedback. Thank you!
Daniel Mendler
[0001-Add-new-completion-filter-completions-API-and-deferr.patch (text/x-diff, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Wed, 11 Aug 2021 16:12:02 GMT)
Message #116 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/11/21 4:16 PM, Daniel Mendler wrote:
> I prepared a patch which provides the API
> `completion-filter-completions`. This function supports deferred
> highlighting and returns additional data with the list of matching
> completion candidates. The API supersedes the existing function
> `completion-all-completions`.
>
> The main goal of the new API is to avoid expensive string allocations
> and highlighting during completion. This is particularly relevant for
> continuously updating completion UIs like Icomplete or Vertico.
> Furthermore the end position of the completion boundaries is returned
> with the completion results. This information is not provided by the
> existing `completion-all-completions` API.
>
> See also the relevant bugs bug#47711 and bug#48841. I am looking forward
> to your feedback. Thank you!
There are currently two issues with the patch with regards to backward
compatibility. Fortunately they are fixable with a little effort.
1. I would like to deprecate `completion-score' or remove it altogether,
but unfortunately `completion-score' is used in the wild. In order to
preserve `completion-score', bind `completion--filter-completions' in
the highlighting functions. Add `completion-score' in
`completion-pcm--hilit-commonality' when
`completion--filter-completions' is nil.
2. In `completion--nth-completion' set `completion--filter-completions'
to nil, unless `(memq style '(emacs21 emacs22 basic
partial-completion initials flex))' such that custom completion
styles which wrap the completion functions don't see the new return
value format, except if the custom style opts in explicitly by
binding `completion--filter-completions'. An alternative criterion is
`(memq fun '(completion-emacs22-all-completions) ...)'. Unfortunately
this approach will still not work if the user has advised a
`completion-x-all-completions' function. The only 100% safe approach
seems to transparently redirect calls to
`completion-x-all-completions' to `completion--x-filter-completions',
which returns the results in the new format.
With these changes the patch should be 100% backward compatible.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Wed, 11 Aug 2021 16:18:02 GMT)
Message #119 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Perhaps you should first provide a patch with these 2 "little effort" changes,
(that are presumably also backward compatible and don't affect the API) by
themselves. Reading about these complex ideas isn't as clear as seeing
them in actual code.
Then it'll be easier to evaluate the merits of the patch you proposed in
your first email.
João
On Wed, Aug 11, 2021 at 5:11 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
>
> On 8/11/21 4:16 PM, Daniel Mendler wrote:
> > I prepared a patch which provides the API
> > `completion-filter-completions`. This function supports deferred
> > highlighting and returns additional data with the list of matching
> > completion candidates. The API supersedes the existing function
> > `completion-all-completions`.
> >
> > The main goal of the new API is to avoid expensive string allocations
> > and highlighting during completion. This is particularly relevant for
> > continuously updating completion UIs like Icomplete or Vertico.
> > Furthermore the end position of the completion boundaries is returned
> > with the completion results. This information is not provided by the
> > existing `completion-all-completions` API.
> >
> > See also the relevant bugs bug#47711 and bug#48841. I am looking forward
> > to your feedback. Thank you!
>
> There are currently two issues with the patch with regards to backward
> compatibility. Fortunately they are fixable with a little effort.
>
> 1. I would like to deprecate `completion-score' or remove it altogether,
> but unfortunately `completion-score' is used in the wild. In order to
> preserve `completion-score', bind `completion--filter-completions' in
> the highlighting functions. Add `completion-score' in
> `completion-pcm--hilit-commonality' when
> `completion--filter-completions' is nil.
>
> 2. In `completion--nth-completion' set `completion--filter-completions'
> to nil, unless `(memq style '(emacs21 emacs22 basic
> partial-completion initials flex))' such that custom completion
> styles which wrap the completion functions don't see the new return
> value format, except if the custom style opts in explicitly by
> binding `completion--filter-completions'. An alternative criterion is
> `(memq fun '(completion-emacs22-all-completions) ...)'. Unfortunately
> this approach will still not work if the user has advised a
> `completion-x-all-completions' function. The only 100% safe approach
> seems to transparently redirect calls to
> `completion-x-all-completions' to `completion--x-filter-completions',
> which returns the results in the new format.
>
> With these changes the patch should be 100% backward compatible.
>
> Daniel
--
João Távora
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Thu, 12 Aug 2021 08:01:02 GMT)
Message #122 received at 48841 <at> debbugs.gnu.org (full text, mbox):
[I removed emacs-devel from the CC list, please don't cross-post to
both emacs-devel and the bug tracker.]
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Wed, 11 Aug 2021 16:16:57 +0200
> Cc: 48841 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
> João Távora <joaotavora <at> gmail.com>,
> Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
>
> I prepared a patch which provides the API
> `completion-filter-completions`. This function supports deferred
> highlighting and returns additional data with the list of matching
> completion candidates. The API supersedes the existing function
> `completion-all-completions`.
Thanks. The discussion of this is still going on, and I don't
consider myself an expert in this area of Emacs, so below please find
only comments for minor formatting and documentation aspects.
> Add a new `completion-filter-completions` API, which supersedes
> `completion-all-completions`. The new API returns the matching
> completion candidates and additional data. The return value is an
> alist, with the keys `completions`, `base`, `end` and `highlight`.
> The API can be extended in a backward compatible way later on thanks
> to the use of an alist as return value.
Please don't use Markdown-style quoting `like this` in our comments
and log messages. We quote 'like this' in these places.
> The `completions` value is the list of completion strings *without*
> applied highlighting. The completion strings are returned unmodified,
> which avoids allocations and results in performance gains for
This is unclear: how can you return a list of strings which you
produce without allocating the strings?
> The value `base` is the base position of the completion.
"Base position" where, or relative to what object?
> Correspondingly the value `end` specifies the end position of the
> completion counted from the beginning of the input string.
So the base position is also relative to the beginning of the input
string? If so, please say so explicitly.
> Finally the
> `highlight` value is a function taking a list of completion strings
> and returns a new list of new strings with highlighting applied.
If you say "taking a list...", then for consistent style please also
say "...and returning a new list...".
> A continuously updating UI can use the highlighting function to apply
> highlighting only to the visible completions.
Not, "visible", but "displayed", right? So I'd rephrase
...to apply highlighting only to the completions that are actually
displayed.
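For concreteness, the alist described in the quoted log message, and its conversion back to a list with the base size in the last cdr, might look roughly like this. It is a sketch only: the sample data and the names `my-sketch-result' and `my-sketch-alist-to-list' are invented for illustration, and the stand-in highlight function merely copies the strings.

```elisp
(defvar my-sketch-result
  `((completions . ("find-file" "find-file-other-window"))
    (base . 0)
    (end . 4)
    ;; Stand-in for a real highlighting function: it returns a new
    ;; list of fresh copies and applies no faces.
    (highlight . ,(lambda (strings) (mapcar #'copy-sequence strings))))
  "Made-up example of the alist return value discussed here.")

(defun my-sketch-alist-to-list (result)
  "Convert the alist RESULT to a completion list ending in the base size.
The `highlight' function is applied eagerly to all `completions'."
  (nconc (funcall (cdr (assq 'highlight result))
                  (cdr (assq 'completions result)))
         (cdr (assq 'base result))))

;; Hypothetical usage:
(my-sketch-alist-to-list my-sketch-result)
```

A UI that wants deferred highlighting would instead call the `highlight' function only on the candidates it is about to display.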
> (completion-basic-all-completions,
> completion-emacs21-all-completions,
> completion-emacs22-all-completions): Use it.
That's incorrect format, I guess you did it by hand rather than
letting change-log-mode do it for you? The correct format looks like
this:
(completion-basic-all-completions)
(completion-emacs21-all-completions)
(completion-emacs22-all-completions): use it.
IOW, each line ends with a closing parenthesis, not a comma.
> +(defvar completion--filter-completions nil
> + "Enable the new completions return value format.
Using genitive construction should be limited to just 2 words. Here
you have 4, which produces an awkward, hard to interpret phrase.
Suggest to reword:
If non-nil, return completions in `completion-filter-completions' format.
Note that I also dropped the "new" part (which generally doesn't stand
the test of time), and instead gave a hint as to what that format is.
Btw, why is this an internal variable? Shouldn't all completion
styles ideally support it? If so, it should be a public variable,
documented in the ELisp manual. And the name should also end with -p,
since it's a boolean. How about completion-filter-completions-format-p?
> + New completion style functions may always return their
> +results in the new alist format, since `completion-all-completions'
> +transparently converts back to the old improper list of completions
> +with base size in the last cdr.")
"may" and "always" are a kind of contradiction.
Also, I'd drop the "improper" part, as it sounds like a derogatory
adjective.
> (defun completion-try-completion (string table pred point &optional metadata)
> "Try to complete STRING using completion table TABLE.
> Only the elements of table that satisfy predicate PRED are considered.
> -POINT is the position of point within STRING.
> -The return value can be either nil to indicate that there is no completion,
> -t to indicate that STRING is the only possible completion,
> -or a pair (NEWSTRING . NEWPOINT) of the completed result string together with
> -a new position for point."
> +POINT is the position of point within STRING. The return value can be
> +either nil to indicate that there is no completion, t to indicate that
> +STRING is the only possible completion, or a pair (NEWSTRING . NEWPOINT)
> +of the completed result string together with a new position for point.
> +The METADATA may be modified by the completion style."
Here you changed whitespace by filling, and that ruined the
intentionally formatted doc string, which made it easy to find the
various forms of the return value and the important parts of the doc
string. Please keep the original formatting.
> (defun completion-all-completions (string table pred point &optional metadata)
> "List the possible completions of STRING in completion table TABLE.
> Only the elements of table that satisfy predicate PRED are considered.
> -POINT is the position of point within STRING.
> -The return value is a list of completions and may contain the base-size
> -in the last `cdr'."
> - ;; FIXME: We need to additionally return the info needed for the
> - ;; second part of completion-base-position.
> - (completion--nth-completion 2 string table pred point metadata))
> +POINT is the position of point within STRING. The return value is a
> +list of completions and may contain the base-size in the last `cdr'.
> +The METADATA may be modified by the completion style. This function
> +has been superseded by `completion-filter-completions', which returns
> +richer information and supports deferred candidate highlighting."
Likewise here.
Also, the "This function has been superseded..." part should be a new
paragraph, so that it stands out. (And I'm not yet sure we indeed
want to say "superseded" here, but that's part of the on-going
discussion. maybe use a more neutral language here, like "See also".)
> + (if (and result (consp (car result)))
> + ;; Give the completion styles some freedom!
> + ;; If they are targeting Emacs 28 upwards only, they
> + ;; may always return a result with deferred
> + ;; highlighting. We convert back to the old format
> + ;; here by applying the highlighting eagerly.
"May always" again. How about "can always" instead?
> + (nconc (funcall (cdr (assq 'highlight result))
> + (cdr (assq 'completions result)))
> + (cdr (assq 'base result)))
> + result)))
> +
> +(defun completion-filter-completions (string table pred point metadata)
> + "Filter the possible completions of STRING in completion table TABLE.
Is "filter" really the right word here (in the doc string)? "Filter"
means you take a sequence and produce another sequence with some
members removed. That's not what this API does, is it? Suggest to
use a different name, like completion-completions-alist or
completion-all-completions-as-alist.
> +Only the elements of table that satisfy predicate PRED are considered.
> +POINT is the position of point within STRING. The METADATA may be
> +modified by the completion style. The return value is a alist with
> +the keys:
> +
> +- base: Base position of the completion (from the start of STRING)
"Base" here means the beginning? If so, why not call it "beg" or
somesuch?
> +This function supersedes the function `completion-all-completions'."
Again, "supersedes" is too strong, IMO. I would say "is a variant of"
instead, and explain why this variant could be better suited to some
use cases. IOW, explain the upsides (and downsides, if any), and let
the programmers decide whether they want this, instead of more-or-less
forcing them to use it.
> + ;; Deferred highlighting has been requested, but the completion
> + ;; style returned a non-deferred result. Convert the result to the
^^
two spaces between sentences, please.
> + ;; new alist format.
"New" is not a good word here.
> + ;; added by the completion machinery.
Please start comments with a capital letter.
> +(defun completion--deferred-hilit (completions prefix-len base end)
> + "Return completions in old format or new alist format.
> +If `completion--filter-completions' is non-nil use the new format."
Again, please don't use "old" and "new" here, but instead describe
explicitly the differences between them, or provide a hyperlink to
where that is described.
> + ;; Apply highlighting
Please end each sentence in a comment with a period.
> +(defun completion-pcm--deferred-hilit (pattern completions base end)
> + "Return completions in old format or new alist format.
> +If `completion--filter-completions' is non-nil use the new format."
"Old" and "new" again.
> (defun completion-pcm--hilit-commonality (pattern completions)
> "Show where and how well PATTERN matches COMPLETIONS.
> PATTERN, a list of symbols and strings as seen
> `completion-pcm--merge-completions', is assumed to match every
> string in COMPLETIONS. Return a deep copy of COMPLETIONS where
> -each string is propertized with `completion-score', a number
> -between 0 and 1, and with faces `completions-common-part',
> +each string is propertized with faces `completions-common-part',
> `completions-first-difference' in the relevant segments."
Are we really losing the completion-score property here? If so, why?
> + ;; If `pattern' doesn't have an explicit trailing any,
This is confusing: what do you mean by "explicit trailing any" in the
context of patterns?
> +(defun completion--flex-score (pattern completions)
> + "Compute how well PATTERN matches COMPLETIONS.
> +PATTERN, a list of strings is assumed to match every string in
> +COMPLETIONS.
Is PATTERN really a list? It would be strange for a list to be called
PATTERN, and how can a list "match every string in COMPLETIONS"?
> Return a copy of COMPLETIONS where each element is
> +a pair of a score and the completion string.
What is "the completion string" in this case? Is it the same string
from COMPLETIONS, or is it something else? The doc string leaves that
unclear.
> The score lies in
> +the range between -1 and 0, where -1 corresponds to the full
> +match."
What score could a partial match have, and what is the meaning of the
numerical value for a partial match?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Thu, 12 Aug 2021 08:48:02 GMT)
Message #125 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Eli, thank you for your feedback and for pointing me to the mode which
helps with the formatting. I will address the documentation and
formatting issues as soon as the discussion has concluded.
Below I answer a few of your questions about technical details.
>> The `completions` value is the list of completion strings *without*
>> applied highlighting. The completion strings are returned unmodified,
>> which avoids allocations and results in performance gains for
>
> This is unclear: how can you return a list of strings which you
> produce without allocating the strings?
The function 'completion-filter-completions' receives a completion table
as argument. The strings produced by this table are returned
unmodified, but of course the completion table has to produce them. For
a static completion table (e.g., in the simplest case a list of strings)
the completion table itself will not allocate strings. In this scenario
'completion-filter-completions' will not perform any string allocations,
only the list will be allocated. This is what leads to major
performance gains.
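The allocation argument above can be illustrated with a minimal sketch. The two `my-` helpers are hypothetical and are not part of minibuffer.el or of the patch; they only contrast returning the table's own string objects with returning fresh copies.

```elisp
;; Sketch only: `my-filter-unmodified' and `my-filter-with-copies' are
;; hypothetical helpers, not functions from minibuffer.el or the patch.
(require 'seq)

(defun my-filter-unmodified (prefix candidates)
  "Return the elements of CANDIDATES that start with PREFIX, uncopied."
  (seq-filter (lambda (cand) (string-prefix-p prefix cand)) candidates))

(defun my-filter-with-copies (prefix candidates)
  "Like `my-filter-unmodified', but return a fresh copy of each match."
  (mapcar #'copy-sequence (my-filter-unmodified prefix candidates)))

(let* ((table (list "apropos" "append" "car"))
       (plain (my-filter-unmodified "ap" table))
       (copied (my-filter-with-copies "ap" table)))
  (list (eq (car plain) (car table))    ; same string object, nothing allocated
        (eq (car copied) (car table))   ; distinct object, newly allocated
        (equal plain copied)))          ; the contents are equal either way
```

In the first case only the result list is consed; in the second, one string is allocated per match on every filtering pass.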
>> +(defvar completion--filter-completions nil
>> + "Enable the new completions return value format.
>
> Btw, why is this an internal variable? Shouldn't all completion
> styles ideally support it? If so, it should be a public variable,
> documented in the ELisp manual. And the name should also end with -p,
> since it's a boolean. How about completion-filter-completions-format-p?
(As I understood the style guide '-p' is not a good idea for boolean
variables, since a value is not a predicate in a strict sense.)
To address your technical comment - this variable is precisely what one
of the technical difficulties mentioned in my other mail is about. The
question is how we can retain backward compatibility of the completion
style 'all' functions, e.g., 'completion-basic-all-completions', while
still allowing the function to return the newly introduced alist format
with more data, which enables 'completion-filter-completions' to perform
the efficient deferred highlighting.
> Also, the "This function has been superseded..." part should be a new
> paragraph, so that it stands out. (And I'm not yet sure we indeed
> want to say "superseded" here, but that's part of the on-going
> discussion. Maybe use more neutral language here, like "See also".)
The new API 'completion-filter-completions' will substitute the existing
API 'completion-all-completions'. I just didn't go as far as
deprecating the 'completion-all-completions' API right away, but we
could also do this.
> Is "filter" really the right word here (in the doc string)? "Filter"
> means you take a sequence and produce another sequence with some
> members removed. That's not what this API does, is it? Suggest to
> use a different name, like completion-completions-alist or
> completion-all-completions-as-alist.
"Filter" seems like exactly the right word to me. The function takes a
list of strings (or a completion table) and returns a subset of matching
completion strings without further modifications to the strings. See
above what I wrote about allocations.
>> +Only the elements of table that satisfy predicate PRED are considered.
>> +POINT is the position of point within STRING. The METADATA may be
>> +modified by the completion style. The return value is a alist with
>> +the keys:
>> +
>> +- base: Base position of the completion (from the start of STRING)
>
> "Base" here means the beginning? If so, why not call it "beg" or
> somesuch?
Base position is a fixed term which is already used in minibuffer.el for
completions. See also 'completion-base-position' for example.
> Are we really losing the completion-score property here? If so, why?
Yes, the property is removed in the current patch. It is not actually
used for anything in the new implementation. But it is possible to
restore the property such that 'completion-all-completions' always
returns scored candidates as it does now. See my other mail regarding
the caveats of the current patch.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Thu, 12 Aug 2021 09:25:02 GMT)
Message #128 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/11/21 6:11 PM, Daniel Mendler wrote:
> 2. In `completion--nth-completion' set `completion--filter-completions'
> to nil, unless `(memq style '(emacs21 emacs22 basic
> partial-completion initials flex))' such that custom completion
> styles which wrap the completion functions don't see the new return
> value format, except if the custom style opts in explicitly by
> binding `completion--filter-completions'. An alternative criterion is
> `(memq fun '(completion-emacs22-all-completions) ...)'. Unfortunately
> this approach will still not work if the user has advised a
> `completion-x-all-completions' function. The only 100% safe approach
> seems to transparently redirect calls to
> `completion-x-all-completions' to `completion--x-filter-completions',
> which returns the results in the new format.
I attached two patch variants which can be placed on top of my previous
patch to improve the backward compatibility of the internal API.
Variant 1: Set 'completion--return-alist-flag' only for the existing
completion styles, such that they transparently upgrade to the alist
return format. If the variable is not set, the completion styles return
the result as plain list retaining backward compatibility. The variable
is purely for internal use, new completion styles should return their
results as an alist on Emacs 28 and newer.
Variant 2: Add an optional argument FILTER to each of the completion
styles 'all' functions, e.g., 'completion-basic-all-completions'. In
'completion--nth-completion' try to call the function with the
additional FILTER argument to upgrade to the alist return format. If
this fails with a 'wrong-number-of-arguments' error, retry again without
the argument.
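The retry logic described for Variant 2 could look roughly like the following sketch. It is not the code from the attachment; every `my-` name is hypothetical, and the alist shape is simplified to two keys.

```elisp
;; Sketch only: all `my-' names are hypothetical stand-ins for
;; completion style functions; this is not the attached patch.
(defun my-old-style-all (string candidates)
  "Two-argument style function: return the matches as a plain list."
  (all-completions string candidates))

(defun my-new-style-all (string candidates &optional filter)
  "Style function that returns an alist when FILTER is non-nil."
  (let ((matches (all-completions string candidates)))
    (if filter
        (list (cons 'base 0) (cons 'completions matches))
      matches)))

(defun my-call-style (fun string candidates)
  "Call FUN with the extra FILTER argument, retrying without it."
  (condition-case nil
      (funcall fun string candidates t)
    (wrong-number-of-arguments
     (funcall fun string candidates))))

(list (my-call-style #'my-old-style-all "ap" '("apropos" "car"))
      (my-call-style #'my-new-style-all "ap" '("apropos" "car")))
```

One caveat of this approach, visible in the sketch: the handler also catches a `wrong-number-of-arguments` error signaled from inside the style function itself, in which case the function body runs twice.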
Daniel
[variant1-restrict.el (text/plain, attachment)]
[variant2-argument.el (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 13 Aug 2021 10:39:01 GMT)
Message #131 received at 48841 <at> debbugs.gnu.org (full text, mbox):
I attached the overhauled patch, which addresses most of the comments by
Eli. In comparison to my last patch, the patch is fully backward
compatible and preserves all existing tests. As before, there are tests
which check the new functionality for each existing completion style.
Daniel
[0001-Add-new-completion-filter-completions-API-and-deferr.patch (text/x-diff, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 13 Aug 2021 10:58:02 GMT)
Message #134 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> In comparison to my last patch, the patch is fully backward
> compatible and preserves all existing tests.
This is a very good thing (the fact that the patch is fully backward compatible,
I mean).
It is quite a large patch that touches many completion internals. I'd like
some time to look it over.
I've read the discussion and am indeed aware of some non-negligible
performance problems in the flex and pcm completion styles since
they need to copy strings around. Other -- completely different --
performance problems affect fido-mode specifically (but not
fido-vertical-mode, curiously).
Dmitry and I discussed this in
bug#48841: fido-mode is slower than ido-mode with similar settings
There was also talk of removing the string copying with minimal (but not null)
backward compatibility breakage. I recall Dmitry saying it was easy to fix on
the completion frontend side. Many such frontends live in Emacs or GNU Elpa.
On the other hand, the patch that we (or at least I) envisioned in that
discussion was almost certainly much, much simpler than the one being presented
here, and thus much easier to reason about and discuss.
But to avoid comparing apples to oranges, I would like you to summarize exactly,
perhaps in the form of code snippets and/or benchmarks, what problems
your large patch solves. State the problem(s) first, then the solution
(to each).
If there are multiple problems, then there's a good chance that multiple patches,
each addressing one of them, are preferred.
Thank you very much.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 13 Aug 2021 11:22:02 GMT)
Message #137 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/13/21 12:56 PM, João Távora wrote:
> I've read the discussion and am indeed aware of some non-negligible
> performance problems in the flex and pcm completion styles since
> they need to copy strings around. Other -- completely different --
> performance problems affect fido-mode specifically (but not
> fido-vertical-mode, curiously).
>
> In some conversation with Dmitry
>
> bug#48841: fido-mode is slower than ido-mode with similar settings
>
> We discussed this.
I've read the discussion. You are probably aware of my efforts in
Vertico to implement deferred highlighting. The patch I implemented
here implements the deferred highlighting in a clean way.
> There was also talk of removing the string copying with minimal (but not null)
> backward compatibility breakage. I recall Dmitry saying it was easy to fix on
> the completion frontend side. Many such frontends live in Emacs or GNU Elpa.
> On the other hand, the patch that we (or at least I) envisioned in that
> discussion was almost certainly much, much simpler than the one being presented
> here, and thus much easier to reason about and discuss.
No, this is not the case. There is no simple fix of the allocation issue
on the frontend side. The existing API `completion-all-completions`
necessarily has to allocate all the strings in order to attach
highlighting and scoring. The new API solves this in a clean way by
both deferring highlighting and scoring.
I claim that my patch is easy to reason about and refactors the existing
code to address the exact problem we are having. Please take some time
in reviewing it.
> But to avoid comparing apples to oranges, I would like you to summarize exactly,
> perhaps in the form of code snippets and/or benchmarks, what problems
> your large patch solves. State the problem(s) first, then the solution
> (to each).
The main problem is that `completion-all-completions` allocates all the
strings every time the completions are filtered. This is the same
performance issue you encountered in fido-mode/icomplete-mode.
The second problem addressed by the new API
`completion-filter-completions` is that `completion-all-completions` is
limited in what it can return. For example it cannot return the end
position of the completion. This is also solved by the new API. The
new API is a clean extensible way forward.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 13 Aug 2021 12:07:02 GMT)
Message #140 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Fri, Aug 13, 2021 at 12:21 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> No, this is not the case. There is no simple fix of the allocation issue
> on the frontend side.
I didn't claim that. At all. I claimed that the frontends that would be
affected by the (small) backend patch are easy to adapt. I think
you completely read past my idea.
> The existing API `completion-all-completions`
> necessarily has to allocate all the strings in order to attach
> highlighting and scoring. The new API solves this in a clean way by
> both deferring highlighting and scoring.
I'm not sure you understand my alternative idea. As far as I
understand (and have actually measured) the lines:
;; Don't modify the string itself.
(setq str (copy-sequence str))
in minibuffer.el, in the function completion-pcm--hilit-commonality,
are the cause of the problem that _I am talking about_ and that
I have actually measured. Again you may be referring to a
_different_ problem that I am unaware of.
If one removes these lines, the process becomes much faster, but there is a
problem with highlighting. My idea is indeed to defer highlighting by not
setting the 'face property directly on that shared string, but some other
property that is read later from the shared string by compliant frontends.
If you have understood this idea, can you comment on it?
(Preferably in terms of less adjectification regarding "cleanliness", but in
terms of actual drawbacks/advantages?)
The drawback that I can see in it is that frontends directly relying on 'face
are broken by that patch. But according to Dmitry (and I tend to agree), it's
quite easy to address those frontends. Most of them live in Emacs core or
GNU Elpa.
The advantage that I see is that those adaptations apart, it is a small
localized and effective change.
> I claim that my patch is easy to reason about and refactors the existing
> code to address the exact problem we are having. Please take some time
> in reviewing it.
I am already taking some time. I need your assistance in explaining the
problems first. I take into account your claims of cleanliness and elegance,
but in terms of their power of persuasion, they are much more limited
than hard material evidence.
> The main problem is that `completion-all-completions` allocates all the
> strings every time the completions are filtered. This is the same
> performance issue you encountered in fido-mode/icomplete-mode.
OK. I encountered at least two different performance problems there, with
quite different causes. So let's stick to the string-allocation problem. Post
a code snippet that demonstrates the problem the way you see it/experience it?
Some benchmark code would be very welcome. You can probably grab my
benchmarking code from that other bug.
Then it becomes easy to study multiple solutions to that problem and
choose the best one!
> The second problem addressed by the new API
> `completion-filter-completions` is that `completion-all-completions` is
> limited in what it can return. For example it cannot return the end
> position of the completion.
And why is this a problem? Can you post an example of something you'd
like to do, but can't? Regardless, it does seem indeed like a "second" problem
(as you state) so perhaps something that can be addressed separately.
Is your particular solution to this second problem instrumental in solving
the "main problem"?
> This is also solved by the new API. The new API is a clean extensible way forward.
I understand you've put time and effort into producing this work. We are
all indebted and I promise to read it. But every API writer in the history of
programming has claimed those things and reality often shows otherwise.
So it's not that your work can't be those things you claim, maybe it is, but
generally the larger and broader the work the harder it is to reason about.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 13 Aug 2021 12:24:02 GMT)
Message #143 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/13/21 2:05 PM, João Távora wrote:
>> The existing API `completion-all-completions`
>> necessarily has to allocate all the strings in order to attach
>> highlighting and scoring. The new API solves this in a clean way by
>> both deferring highlighting and scoring.
>
> I'm not sure you understand my alternative idea. As far as I
> understand (and have actually measured) the lines:
>
> ;; Don't modify the string itself.
> (setq str (copy-sequence str))
>
> in minibuffer.el, in the function completion-pcm--hilit-commonality
>
> Are the cause of the problem that _I am talking about_ and that
> I have actually measured. Again you may be referring to a
> _different_ problem that I am unaware of.
You are right that the call to `copy-sequence` is a major bottleneck in
the filtering. However you are wrong that this line can simply be
removed/disabled and the candidates can be modified. The API guarantees
and has always guaranteed that the candidate strings are not mutated.
It is important to keep this property since this will preclude many bugs
due to string mutation. By separating the filtering and mutation
(highlighting, scoring) my patch addresses the problem at hand in the
proper way.
Note that the UI also has no possibility to opt-out of the mutation.
The UI is actually not the one being concerned about the mutation here,
it is the backends (completion tables), which produce the strings. If
one starts mutating these strings you will see bugs cropping up
throughout Emacs where shared strings suddenly have spurious additional
properties due to the completion filtering.
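A minimal sketch of the kind of leak described above, under the assumption that a completion table hands out the very string objects it stores. The variable `my-shared-candidates` and the property name `my-match` are hypothetical.

```elisp
;; Sketch only: `my-shared-candidates' and the `my-match' property are
;; hypothetical; they stand in for a table's strings and a style's marker.
(defvar my-shared-candidates
  (mapcar #'copy-sequence '("find-file" "find-library")))

;; Mutating in place: the property lands on the table's own string.
(let ((cand (car my-shared-candidates)))
  (put-text-property 0 4 'my-match t cand))

;; Copying first: the table's string stays pristine.
(let ((copy (copy-sequence (cadr my-shared-candidates))))
  (put-text-property 0 4 'my-match t copy))

(list (get-text-property 0 'my-match (car my-shared-candidates))   ; t
      (get-text-property 0 'my-match (cadr my-shared-candidates))) ; nil
```

Every later consumer of the first string sees the extra property, whether or not it took part in the completion session that added it.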
Mutation would be a reasonable choice here if the problem could not be
solved in a proper way. But in fact it can be solved in a proper way
without mutating the strings at all as my patch shows.
> If one removes these lines, the process becomes much faster, but there is a
> problem with highlighting. My idea is indeed to defer highlighting by not
> setting the 'face property directly on that shared string, but some other
> property that is read later from the shared string by compliant frontends.
This solution is much more ad-hoc and you still mutate the string which
is not allowed.
> The advantage that I see is that those adaptations apart, it is a small
> localized and effective change.
Note that your idea also does not address the other issues which are
addressed by my patch. The new API `completion-filter-completions`
returns data which hasn't been available before, e.g., the end position,
which cannot be fixed given the existing API.
>> The main problem is that `completion-all-completions` allocates all the
>> strings every time the completions are filtered. This is the same
>> performance issue you encountered in fido-mode/icomplete-mode.
>
> OK. I encountered at least two different performance problems there, with
> quite different causes. So let's stick to the string-allocation problem. Post
> a code snippet that demonstrates the problem the way you see it/experience it?
You can try my Vertico completion UI, which is available on GNU ELPA.
It implements deferred highlighting and there the performance difference
is perceivable. Currently Vertico uses an advice-based hack to avoid
the over-eager string-allocations and the highlighting.
>> The second problem addressed by the new API
>> `completion-filter-completions` is that `completion-all-completions` is
>> limited in what it can return. For example it cannot return the end
>> position of the completion.
>
> And why is this a problem? Can you post an example of something you'd
> like to do, but can't? Regardless, it does seem indeed like a "second" problem
> (as you state) so perhaps something that can be addressed separately.
Please look at the FIXMEs in minibuffer.el which address this.
Currently only the beginning position of the completion boundary is
returned, which is only half of the information.
> I understand you've put time and effort into producing this work. We are
> all indebted and I promise to read it. But every API writer in history of
> programming has claimed those things and reality often shows otherwise.
> So it's not that your work can't be those things you claim, maybe it is, but
> generally the larger and broader the work the harder it is to reason about.
I stand by my claim and I also stand by the claim that
removing/disabling `copy-sequence` is not a proper way to address the
issues at hand and will introduce many bugs in the long run. Please
take your time to look at the patch in earnest. I would also like to
see others chime in here with their opinion.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 13 Aug 2021 12:38:02 GMT)
Message #146 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> It is important to keep this property since this will preclude many bugs
> due to string mutation.
I am aware of this, of course. Can you give examples of these "many bugs"?
Perhaps other than the one I already described and addressed?
> By separating the filtering and mutation
> (highlighting, scoring) my patch addresses the problem at hand in the
> proper way.
> [ ... ]
> Mutation would be a reasonable choice here if the problem could not be
> solved in a proper way. But in fact it can be solved in a proper way
> without mutating the strings at all as my patch shows.
"proper" is just a reasonably empty adjective. There are different ways to
go about this, of course. What's "proper" and isn't is hard to debate
objectively.
> This solution is much more ad-hoc and you still mutate the string which
> is not allowed.
It's also difficult to debate "ad-hoc" or not. If you've studied the problem,
what makes you say that mutating the string (in this case, adding a
'completion--style-face' property to it) is not allowed? What negative things
would derive from it?
> > The advantage that I see is that those adaptations apart, it is a small
> > localized and effective change.
>
> Note that your idea also does not address the other issues which are
> addressed by my patch.
That's for sure. My patch idea addresses only that single problem.
I think this is a good property of patches: to solve one thing, not many.
We can make more patches to solve other problems, once we
identify them clearly.
> The new API `completion-filter-completions`
> returns data which hasn't been available before, e.g., the end position,
> which cannot be fixed given the existing API.
>
> >> The main problem is that `completion-all-completions` allocates all the
> >> strings every time the completions are filtered. This is the same
> >> performance issue you encountered in fido-mode/icomplete-mode.
> >
> > OK. I encountered at least two different performance problems there, with
> > quite different causes. So let's stick to the string-allocation problem. Post
> > a code snippet that demonstrates the problem the way you see it/experience it?
Look, one needs to evaluate things quantitatively. Your patch is not
to Vertico, it's to Emacs. I'm concerned with changes to Emacs and their
effect on all completion frontends. So trying Vertico isn't very useful.
If you're solving a performance problem (and it seems that you are, among
other things) we really need benchmarks, a description of an experiment whose
results can be reproduced independently. It's the normal scientific method.
Something like:
"before my patch, this code takes 123 seconds to run, after my patch it
takes 12."
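As a rough idea of what such a reproducible experiment might look like, here is a minimal sketch using `benchmark-run`. It only compares walking a candidate list with and without `copy-sequence`; it does not exercise either patch, and the `my-bench-` names and the candidate count are arbitrary assumptions.

```elisp
;; Sketch only: measures raw string copying, not either patch.
;; All `my-bench-' names and the 50000 figure are arbitrary assumptions.
(require 'benchmark)

(defvar my-bench-candidates
  (let (acc)
    (dotimes (i 50000)
      (push (format "candidate-%d" i) acc))
    acc)
  "Synthetic candidate strings used by the benchmark sketch.")

(defun my-bench-copy ()
  "Walk the candidates, allocating a copy of each string."
  (mapcar #'copy-sequence my-bench-candidates))

(defun my-bench-no-copy ()
  "Walk the candidates, consing only the result list."
  (mapcar #'identity my-bench-candidates))

;; Each entry is (ELAPSED-SECONDS GC-COUNT GC-SECONDS) for 20 repetitions.
(list (benchmark-run 20 (my-bench-copy))
      (benchmark-run 20 (my-bench-no-copy)))
```

Reporting all three numbers matters here, since the cost of extra allocation often shows up as garbage collection time rather than in the loop itself.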
> >> The second problem addressed by the new API
> >> `completion-filter-completions` is that `completion-all-completions` is
> >> limited in what it can return. For example it cannot return the end
> >> position of the completion.
> >
> > And why is this a problem? Can you post an example of something you'd
> > like to do, but can't? Regardless, it does seem indeed like a "second" problem
> > (as you state) so perhaps something that can be addressed separately.
>
> Please look at the FIXMEs in minibuffer.el which address this.
> Currently only the beginning position of the completion boundary is
> returned, which is only half of the information.
OK. It does seem like a separate problem, so maybe open a new bug for it?
João Távora
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Fri, 13 Aug 2021 12:57:01 GMT)
Message #149 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/13/21 2:37 PM, João Távora wrote:
> On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
>
>> It is important to keep this property since this will preclude many bugs
>> due to string mutation.
>
> I am aware of this, of course. Can you give examples of these "many bugs"?
> Perhaps other than the one I already described and addressed?
No, João, this is not how it goes. I don't have to prove to you that
your idea introduces bugs. You have to show that mutation of the
completion table strings (which are not supposed to be mutated) will not
lead to bugs, which are hard to find.
In contrast with the new API `completion-filter-completions` this entire
class of bugs is avoided by construction of the API. Furthermore the
`completion-filter-completions` API is easy to use in comparison to your
idea, where "compliant" backends have to apply string manipulations to
apply the highlighting and revert the strings back to their old pristine
state. The only thing the API user has to do is to call the `highlight`
function returned in the alist by `completion-filter-completions`.
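The usage pattern described above might look like the following sketch. `my-filter-completions` is a hypothetical stand-in that only mimics the alist keys named in the patch (`base`, `completions`, `highlight`); it is not the proposed `completion-filter-completions`, and the prefix-only highlighting is a simplification.

```elisp
;; Sketch only: `my-filter-completions' and `my-highlight-prefix' are
;; hypothetical; they mimic the alist shape discussed in this thread.
(require 'seq)

(defun my-highlight-prefix (len strings)
  "Return copies of STRINGS with the first LEN characters highlighted."
  (mapcar (lambda (str)
            (let ((copy (copy-sequence str)))
              (put-text-property 0 len 'face 'completions-common-part copy)
              copy))
          strings))

(defun my-filter-completions (prefix candidates)
  "Return an alist with the keys `base', `completions' and `highlight'."
  (list (cons 'base 0)
        (cons 'completions (all-completions prefix candidates))
        (cons 'highlight
              (apply-partially #'my-highlight-prefix (length prefix)))))

(let* ((table (mapcar #'copy-sequence
                      '("find-file" "find-library" "forward-char")))
       (result (my-filter-completions "find-" table))
       (all (cdr (assq 'completions result)))
       (visible (seq-take all 1))       ; only what the UI will display
       (shown (funcall (cdr (assq 'highlight result)) visible)))
  (list (length all)
        (get-text-property 0 'face (car shown))    ; highlighted copy
        (get-text-property 0 'face (car table))))  ; original untouched
```

The point of the design is that copying and propertizing happen only for the candidates a frontend actually displays, while the full match list keeps the table's original strings.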
>> By separating the filtering and mutation
>> (highlighting, scoring) my patch addresses the problem at hand in the
>> proper way.
>> [ ... ]
>> Mutation would be a reasonable choice here if the problem could not be
>> solved in a proper way. But in fact it can be solved in a proper way
>> without mutating the strings at all as my patch shows.
>
> "proper" is just a reasonably empty adjective. There are different ways to
> go about this, of course. What's "proper" and isn't is hard to debate
> objectively.
You are contradicting yourself here. You agree that string mutation is
better avoided. If we define "proper" as "avoids string mutation where
this is easily possible", then my patch implements a proper solution to
the problem.
>>> The advantage that I see is that those adaptations apart, it is a small
>>> localized and effective change.
>>
>> Note that your idea also does not address the other issues which are
>> addressed by my patch.
>
> That's for sure. My patch idea addresses only that single problem.
> I think this is a good property of patches: to solve one thing, not many.
No, this is not necessarily true. This is only good if the problem is
solved in a way which is future proof. The idea of mutating the strings
is a hack and not a solution. In contrast, I am presenting a
future-proof new API as solution which addresses multiple problems. If
you look at the patch, only 196 new lines are added to minibuffer.el.
Furthermore the patch adds 213 lines of new tests.
> Look, one needs to evaluate things quantitatively. Your patch is not
> to Vertico, it's to Emacs. I'm concerned with changes to Emacs and their
> effect on all completion frontends. So trying Vertico isn't very useful.
>
> If you're solving a performance problem (and it seems that you are, among
> other things) we really need benchmarks, a description of an experiment whose
> results can be reproduced independently. It's the normal scientific method.
João, you don't have to lecture me on these things. Of course I can
provide such numbers. You cannot reasonably make the claim that
`copy-sequence` is the problem and at the same time claim that my patch
does not solve the performance issues, when in fact my patch avoids this
exact string copying.
>>>> The second problem addressed by the new API
>>>> `completion-filter-completions` is that `completion-all-completions` is
>>>> limited in what it can return. For example it cannot return the end
>>>> position of the completion.
>>>
>>> And why is this a problem? Can you post an example of something you'd
>>> like to do, but can't? Regardless, it does seem indeed like a "second" problem
>>> (as you state) so perhaps something that can be addressed separately.
>>
>> Please look at the FIXMEs in minibuffer.el which address this.
>> Currently only the beginning position of the completion boundary is
>> returned, which is only half of the information.
>
> OK. It does seem like a separate problem, so maybe open a new bug for it?
There is already a FIXME in minibuffer.el, so I assume Stefan Monnier is
well aware of these issues. It is an additional win of the new API that
such open problems can be fixed too. As I see it, a new API is the way
to go here.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Fri, 13 Aug 2021 13:37:02 GMT) Full text and rfc822 format available.
Message #152 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Fri, Aug 13, 2021 at 1:56 PM Daniel Mendler
<mail <at> daniel-mendler.de> wrote:
>
> On 8/13/21 2:37 PM, João Távora wrote:
> > On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> >
> >> It is important to keep this property since this will preclude many bugs
> >> due to string mutation.
> >
> > I am aware of this, of course. Can you give examples of these "many bugs"?
> > Perhaps other than the one I already described and addressed?
>
> No, João, this is not how it goes. I don't have to prove to you that
> your idea introduces bugs.
So you just say it and I have to believe it? Then I could say the same to
you, right? I won't of course, that would be silly.
> You have to show that mutation of the
> completion table strings (which are not supposed to be mutated) will not
> lead to bugs, which are hard to find.
>
> In contrast with the new API `completion-filter-completions` this entire
> class of bugs is avoided by construction of the API. Furthermore the
> `completion-filter-completions` API is easy to use in comparison to your
> idea, where "compliant" backends have to apply string manipulations to
> apply the highlighting and revert the strings back to their old pristine
> state. The only thing the API user has to do is to call the `highlight`
> function returned in the alist by `completion-filter-completions`.
>
> >> By separating the filtering and mutation
> >> (highlighting, scoring) my patch addresses the problem at hand in the
> >> proper way.
> >> [ ... ]
> >> Mutation would be a reasonable choice here if the problem could not be
> >> solved in a proper way. But in fact it can be solved in a proper way
> >> without mutating the strings at all as my patch shows.
> >
> > "proper" is just a reasonably empty adjective. There are different ways to
> > go about this, of course. What's "proper" and what isn't is hard to debate
> > objectively.
>
> You are contradicting yourself here. You agree that string mutation is
> better avoided. If we define "proper" as "avoids string mutation if this
> is easily possible", then my patch implements a proper solution to the
> problem.
I didn't say it's better avoided, though of course I will avoid _any_ change if
I can. I said I have identified one drawback with doing it. Then I have addressed
that drawback. So that's what I said.
I am unaware of _other_ drawbacks. They might exist, but I am unaware of
them. Perhaps you are, and indeed you state they exist, but you refuse to
let me know about them. Or perhaps others know of them and will let me know.
In my long-running discussion with Dmitry they were not presented (again,
except for the one I identified).
> > That's for sure. My patch idea addresses only that single problem.
> > I think this is a good property of patches: to solve one thing, not many.
> No, this is not necessarily true. This is only good if the problem is
> solved in a way which is future proof.
OK, but what thing of the future, real or academic, do you envision that
would bring back the problem, or create other problems?
> The idea of mutating the strings is a hack and not a solution.
Without facts to back it up, I have to take this as gratuitous disparagement.
Nicht so gut.
> In contrast, I am presenting a
> future-proof new API as solution which addresses multiple problems.
That's the issue. The completion system is very complex and there are many
good ideas, different, floated by many people. But if you make a patch to
address "multiple" fuzzily-described problems, it's hard to judge how good
your ideas even are! Maybe they are indeed very good, I never said
they weren't. No need to get worked up about it!
Again, my proposal is to first focus on the performance problems caused by
string allocation. _That_ problem is well understood, at least by me (but it
would help to settle on convenient benchmarks understood by others, too).
Then we can go from there.
> If you look at the patch, only 196 new lines are added to minibuffer.el.
> Furthermore the patch adds 213 lines of new tests.
It's a large patch, over 1000 lines. One does not review a patch merely by
looking at lines added, when one needs to read much more, to understand
implications, etc. It needs documentation, for one, much more than just
docstrings, on how to use the new API.
> João, you don't have to lecture me on these things. Of course I can
> provide such numbers.
Then please do! Not meaning to lecture you, just that your suggestion that
I try Vertico UI as a substitution for these numbers seemed completely
misguided. So if you have them (or "can provide them") let's see them.
That's all I'm asking, preferably as an Emacs -Q recipe.
> You cannot reasonably make the claim that
> `copy-sequence` is the problem and at the same time claim that my patch
> does not solve the performance issues, when in fact my patch avoids this
> exact string copying.
I didn't say it didn't solve them! Now, where did I say that? I would like to
see a benchmark so that I can witness it _and_ study alternative solutions. With
that, there's a better chance that I will be persuaded there are none as elegant,
clean, proper, pure, etc. as yours!
Maybe others review patches on other aspects; that's fine. Maybe
others will. Eli reviewed on minor formatting and documentation aspects.
I review them on substance, using numbers and conducting my own
experiments and tests. This takes time and help from the scientist on the
other end.
Simple and in summary, let's hope your next reply has some benchmarks
so we can make progress.
Thanks,
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Fri, 13 Aug 2021 14:04:02 GMT) Full text and rfc822 format available.
Message #155 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/13/21 3:36 PM, João Távora wrote:
>> You are contradicting yourself here. You agree that string mutation is
>> better avoided. If we define "proper" as "avoids string mutation if this
>> is easily possible", then my patch implements a proper solution to the
>> problem.
>
> I didn't say it's better avoided, though of course I will avoid _any_ change if
> I can. I said I have identified one drawback with doing it. Then I have addressed
> that drawback. So that's what I said.
>
> I am unaware of _other_ drawbacks. They might exist, but I am unaware of
> them. Perhaps you are, and indeed you state they exist, but you refuse to
> let me know about them. Or perhaps others know of them and will let me know.
> In my long-running discussion with Dmitry they were not presented (again,
> except for the one I identified).
In the discussion with Dmitry, I already pointed out that there is an
alternative principled approach implemented by my Vertico UI, which is
in fact the same approach as implemented in this patch.
If there are other useful conclusions from the discussion I will adopt
them here for this patch.
>>> That's for sure. My patch idea addresses only that single problem.
>>> I think this is a good property of patches: to solve one thing, not many.
>> No, this is not necessarily true. This is only good if the problem is
>> solved in a way which is future proof.
>
> OK, but what thing of the future, real or academic, do you envision that
> would bring back the problem, or create other problems?
>
>> The idea of mutating the strings is a hack and not a solution.
>
> Without facts to back it up, I have to take this as gratuitous disparagement.
> Nicht so gut.
João, your whole answers are "nicht so gut" or not useful. What is your
point? Please give constructive technical feedback instead of such
empty phrases.
>> In contrast, I am presenting a
>> future-proof new API as solution which addresses multiple problems.
>
> That's the issue. The completion system is very complex and there are many
> good ideas, different, floated by many people. But if you make a patch to
> address "multiple" fuzzily-described problems, it's hard to judge how good
> your ideas even are! Maybe they are indeed very good, I never said
> they weren't. No need to get worked up about it!
>
> Again, my proposal is to first focus on the performance problems caused by
> string allocation. _That_ problem is well understood, at least by me (but it
> would help to settle on convenient benchmarks understood by others, too).
> Then we can go from there.
No, it is not the correct approach to fix larger issues by applying
localized patches. We both have identified the string allocations and
highlighting as the problem. My patch resolves the problem by exposing
just the right pieces of the already existing completion machinery. More
about this below.
>> If you look at the patch, only 196 new lines are added to minibuffer.el.
>> Furthermore the patch adds 213 lines of new tests.
>
> It's a large patch, over 1000 lines. One does not review a patch merely by
> looking at lines added, when one needs to read much more, to understand
> implications, etc. It needs documentation, for one, much more than just
> docstrings, on how to use the new API.
I suggest you take a step back here and try to understand the high-level
idea first. It seems that you are misjudging the complexity of the
patch. The minibuffer completion machinery is already constructed such
that filtering and highlighting are separate.
If you look at `completion-basic-all-completions` for example, there is
first a filtering step and then the highlighting is applied in a second
step by the function `completion-hilit-commonality`. This separation
exists for all completion styles.
My patch does nothing else than separating these two processing steps.
The new API `completion-filter-completions` returns the filtered list
and a function to apply highlighting afterwards only to the actually
displayed candidates where highlighting is needed.
In contrast your idea totally misses this.
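The separation described above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the code from the patch:
`my-api-filter-completions` and `my-api-highlight` are hypothetical names, only
prefix matching is handled, and the alist carries just the `base`,
`completions` and `highlight` keys mentioned in this thread.

```elisp
(require 'seq)

(defun my-api-highlight (prefix strings)
  "Return fresh copies of STRINGS with PREFIX highlighted.
Hypothetical helper; the originals in STRINGS are never modified."
  (mapcar (lambda (s)
            (let ((copy (copy-sequence s)))
              (put-text-property 0 (length prefix)
                                 'face 'completions-common-part copy)
              copy))
          strings))

(defun my-api-filter-completions (prefix candidates)
  "Hypothetical stand-in, NOT the function from the patch.
Return an alist with the keys `base', `completions' and `highlight'.
The `completions' value holds the matching elements of CANDIDATES
unmodified; `highlight' is a function to call on the displayed subset."
  (list (cons 'base 0)
        (cons 'completions
              (seq-filter (lambda (c) (string-prefix-p prefix c))
                          candidates))
        (cons 'highlight (apply-partially #'my-api-highlight prefix))))

;; Filter everything, but highlight only the one candidate that is shown.
(let* ((result (my-api-filter-completions "ca" (list "car" "cadr" "cdr")))
       (shown (seq-take (alist-get 'completions result) 1)))
  (funcall (alist-get 'highlight result) shown))
```

The frontend decides how many candidates to pass to the `highlight` function,
so the copying cost is tied to what is displayed rather than to the size of
the table.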
> Maybe others review patches on other aspects that's fine. Maybe
> others will. Eli reviewed on minor formatting and documentation aspects.
I am looking forward to more reviews by other people.
Your desire for benchmarks is understandable, but I doubt that it will
lead to progress in the discussion here and I doubt that it will
convince you.
The outcome of the benchmark is the following - my patch only filters
and does not mutate the strings, so it will be slightly faster than your
idea where the strings are mutated first and afterwards the mutation has
to be undone again. However the mutations are of course not expensive,
so the differences will be small. The discussion we should be having
here is about technical details and internals and not about the numbers
which won't give any guidance in this case regarding the correct API design.
You seem to always come back to the "scientific method". Note that there
is not only statistics, there is also "scientific reasoning" and
mathematics, which allow one to reason about transformations and draw
conclusions from that. If you don't do this, you are only doing half of
the science.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Fri, 13 Aug 2021 14:12:02 GMT) Full text and rfc822 format available.
Message #158 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Fri, Aug 13, 2021 at 3:03 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> > Without facts to back it up, I have to take this as gratuitous disparagement.
> > Nicht so gut.
>
> João, your whole answers are "nicht so gut" or not useful. What is your
> point? Please give constructive technical feedback instead of such
> empty phrases.
Look, you disparaged an idea of mine without absolutely any facts. I don't think
that's good. "Nicht so gut" was a lighthearted way of pointing it out.
Lighten up.
Post the benchmarks you say you have and stop the pompous handwaving.
Bye,
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Fri, 13 Aug 2021 14:38:02 GMT) Full text and rfc822 format available.
Message #161 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/13/21 4:11 PM, João Távora wrote:
> On Fri, Aug 13, 2021 at 3:03 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
>
>>> Without facts to back it up, I have to take this as gratuitous disparagement.
>>> Nicht so gut.
>>
>> João, your whole answers are "nicht so gut" or not useful. What is your
>> point? Please give constructive technical feedback instead of such
>> empty phrases.
>
> Look, you disparaged an idea of mine without absolutely any facts. I don't think
> that's good. "Nicht so gut" was a lighthearted way of pointing it out.
> Lighten up.
> Post the benchmarks you say you have and stop the pompous handwaving.
João, the way you argue is not in any way "lighthearted". It also
depends on what the other party receives as the message. And here you
just repeat this style by calling my reasoning "pompous handwaving".
This is not a fair way to discuss. In contrast my arguments were
generally of a technical nature. I propose we both calm down a bit and
let others chime in here.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Sat, 14 Aug 2021 02:48:02 GMT) Full text and rfc822 format available.
Message #164 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Hi folks,
Sorry I'm late to this party.
On 13.08.2021 16:36, João Távora wrote:
>>>> By separating the filtering and mutation
>>>> (highlighting, scoring) my patch addresses the problem at hand in the
>>>> proper way.
>>>> [ ... ]
>>>> Mutation would be a reasonable choice here if the problem could not be
>>>> solved in a proper way. But in fact it can be solved in a proper way
>>>> without mutating the strings at all as my patch shows.
>>> "proper" is just a reasonably empty adjective. There are different ways to
>>> go about this, of course. What's "proper" and what isn't is hard to debate
>>> objectively.
>> You are contradicting yourself here. You agree that string mutation is
>> better avoided. If we define "proper" as "avoids string mutation if this
>> is easily possible", then my patch implements a proper solution to the
>> problem.
> I didn't say it's better avoided, though of course I will avoid _any_ change if
> I can. I said I have identified one drawback with doing it. Then I have addressed
> that drawback. So that's what I said.
>
> I am unaware of _other_ drawbacks. They might exist, but I am unaware of
> them. Perhaps you are, and indeed you state they exist, but you refuse to
> let me know about them. Or perhaps others know of them and will let me know.
> In my long-running discussion with Dmitry they were not presented (again,
> except for the one I identified).
I thought I explained the problem with this previously.
It's basically this: we cannot mutate what we don't own. Across all the
completion functions out there, there will be some that return "shared"
strings (meaning, not copied or newly allocated) from their completion
tables. And modifying them is bad, with consequences which can present
themselves in unexpected, often subtle ways.
Since up until now completion-pcm--hilit-commonality copied all strings
before modifying, completion tables such as described (with "shared"
strings) have all been "legal". Suddenly deciding to stop supporting
them would be a major API breakage with consequences that are hard to
predict. And while I perhaps agree that it's an inconvenience, I don't
think it's a choice we can simply make at this stage in c-a-p-f's
development.
To give an example of a subtle consequence:
1. (setq s (symbol-name 'car))
2. (put-text-property 1 3 'face 'error s)
3. Switch to a buffer in fundamental mode
4. (insert (symbol-name 'car)) --> see the error face in the buffer
Now imagine that some completion table collects symbol names by passing
obarray through #'symbol-name rather than #'all-completions, and voila,
if the completion machinery adds properties (any properties, not just
face) to those strings, you have just modified a bunch of global values.
That's not good.
And in the example above, the values are those that
lispref/objects.texi says we should not change (though it gives
(symbol-name 'cons) as an example). "Not mutable", in its parlance. IIRC
the related discussions mentioned that modifying such values could lead
to a segfault in some previous Emacs versions. Maybe not anymore, but
it's still not a good idea.
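The same effect can be reproduced without touching obarray. A minimal sketch,
assuming a hypothetical table `my-shared-names` whose strings are handed out
without copying; every `my-` name below is made up for illustration and none of
this is taken from minibuffer.el.

```elisp
;; Hypothetical names throughout; nothing here comes from minibuffer.el.
(defvar my-shared-names (mapcar #'copy-sequence '("car" "cdr"))
  "Strings owned by a pretend completion table.")

(defun my-highlight-in-place (strings)
  "Add a face to each of STRINGS, modifying the caller's objects."
  (dolist (s strings strings)
    (put-text-property 0 1 'face 'error s)))

(defun my-highlight-copies (strings)
  "Return fresh copies of STRINGS with a face added; STRINGS stay untouched."
  (mapcar (lambda (s)
            (let ((copy (copy-sequence s)))
              (put-text-property 0 1 'face 'error copy)
              copy))
          strings))

;; Copying leaves the table's strings pristine ...
(my-highlight-copies my-shared-names)
(get-text-property 0 'face (car my-shared-names))   ; nil

;; ... while in-place highlighting leaks into every later user of them.
(my-highlight-in-place my-shared-names)
(get-text-property 0 'face (car my-shared-names))   ; error
```

After the second call, anything that later inserts `(car my-shared-names)` into
a buffer gets the face along with the text, which is the subtle consequence
described above.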
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Sat, 14 Aug 2021 02:56:02 GMT) Full text and rfc822 format available.
Message #167 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Aside from the mutability/ownership issue,
On 13.08.2021 15:05, João Távora wrote:
> If one removes these lines, the process becomes much faster, but there is a
> problem with highlighting. My idea is indeed to defer highlighting by not
> setting the 'face property directly on that shared string, but some other
> property that is read later from the shared string by compliant frontends.
I haven't done any direct benchmarking, but I'm pretty sure that this
approach cannot, by definition, be as fast as the non-mutating one.
Because you go through the whole list and mutate all of its elements,
you have to perform a certain bit of work W x N times, where N is the
size of the whole list.
Whereas the deferred-mutation approach will only have to do its bit
(which is likely more work, like, WWW) only 20 times, or 100 times, or
however many completions are displayed. And this is usually negligible.
However big the difference is going to be, I can't say in advance, of
course, or whether it's going to be shadowed by some other performance
costs. But the non-mutating approach should have the best optimization
potential when the list is long.
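One way to put numbers on this estimate is a small `benchmark-run` recipe. A
sketch only: the candidate list, the count of 20 displayed candidates and all
`my-bench-` names are assumptions made for illustration, and no timings are
claimed here.

```elisp
(require 'benchmark)
(require 'seq)

(defvar my-bench-candidates
  (let (acc)
    (dotimes (i 20000)
      (push (format "candidate-%d" i) acc))
    (nreverse acc))
  "Hypothetical candidate list standing in for a large completion table.")

(defun my-bench-highlight (strings)
  "Return highlighted copies of STRINGS (one unit of work per string)."
  (mapcar (lambda (s)
            (let ((copy (copy-sequence s)))
              (put-text-property 0 1 'face 'bold copy)
              copy))
          strings))

(defun my-bench-eager ()
  "Highlight every candidate: work proportional to the whole list."
  (benchmark-run 10
    (my-bench-highlight my-bench-candidates)))

(defun my-bench-deferred ()
  "Highlight only the first 20 candidates, as a frontend might display."
  (benchmark-run 10
    (my-bench-highlight (seq-take my-bench-candidates 20))))

;; Each call returns (ELAPSED GC-COUNT GC-ELAPSED); compare the two lists.
(list (my-bench-eager) (my-bench-deferred))
```

Running this from `emacs -Q` keeps personal configuration out of the
measurement. It isolates only the copy-and-propertize step, so it says nothing
about filtering or sorting costs.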
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Sat, 14 Aug 2021 03:12:02 GMT) Full text and rfc822 format available.
Message #170 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Hi Daniel,
I haven't yet read the patch in detail, but it sounds like a move in the
right direction (even if it doesn't include the long-overdue overhaul of
the whole API).
A few notes on the new stuff:
> Finally the
> `highlight` value is a function taking a list of completion strings
> and returns a new list of new strings with highlighting applied.
First of all, I'd really like it to be a function that applies to
individual completion strings, not the whole collection. That would make
it much easier to use in company-capf without having to rewrite a lot of
code in the presentation layer.
Second, perhaps instead of modifying the strings themselves it could
return some data (like a list of faces-intervals tuples) that could be
used to do so?
Again, in company-capf we end up parsing the face properties in the
string, so those modifications are just extra work for the CPU which we
could eliminate.
This is less critical, though.
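To make the second suggestion concrete, here is a minimal sketch of highlight
data returned separately from the string. The (START END FACE) tuple shape and
the names `my-prefix-intervals` and `my-apply-intervals` are assumptions for
illustration; only prefix matching is covered.

```elisp
(defun my-prefix-intervals (prefix candidate)
  "Return highlight data for CANDIDATE as a list of (START END FACE).
Return nil when PREFIX is empty or does not match.  Hypothetical helper."
  (when (and (not (string= prefix ""))
             (string-prefix-p prefix candidate))
    (list (list 0 (length prefix) 'completions-common-part))))

(defun my-apply-intervals (candidate intervals)
  "Return a copy of CANDIDATE with INTERVALS applied as `face' properties."
  (let ((copy (copy-sequence candidate)))
    (dolist (iv intervals)
      (put-text-property (nth 0 iv) (nth 1 iv) 'face (nth 2 iv) copy))
    copy))

;; A frontend can consume the data directly ...
(my-prefix-intervals "ca" "car")
;; ... or turn it into a propertized copy when it needs one.
(my-apply-intervals "car" (my-prefix-intervals "ca" "car"))
```

A frontend that draws its own overlays or popup could use the tuples as they
are and skip both the string copy and the later parsing of face properties.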
On 11.08.2021 19:11, Daniel Mendler wrote:
> There are currently two issues with the patch with regards to backward
> compatibility. Fortunately they are fixable with a little effort.
>
> 1. I would like to deprecate `completion-score' or remove it altogether,
> but unfortunately `completion-score' is used in the wild. In order to
> preserve `completion-score', bind `completion--filter-completions' in
> the highlighting functions. Add `completion-score' in
> `completion-pcm--hilit-commonality' when
> `completion--filter-completions' is nil.
And third: I think completion-score could ultimately use the same
treatment as 'highlight'. Meaning, being returned up the stack together
with completions, so other bits of code could look up those values.
I don't have a clear picture of this yet, but see the recently filed
bug#49888. If we want to be able to combine matching scores with recency
scores, simply sorting the completions after matching is not going to
cut it.
Not sure if this is something we can tackle now, but keeping this
possible evolution in mind could help us make good choices in the
current migration.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Sat, 14 Aug 2021 06:28:02 GMT) Full text and rfc822 format available.
Message #173 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Cc: 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru, joaotavora <at> gmail.com,
> monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Thu, 12 Aug 2021 10:47:17 +0200
>
> >> The `completions` value is the list of completion strings *without*
> >> applied highlighting. The completion strings are returned unmodified,
> >> which avoids allocations and results in performance gains for
> >
> > This is unclear: how can you return a list of strings which you
> > produce without allocating the strings?
>
> The function 'completion-filter-completions' receives a completion table
> as argument. The strings produced by this table are returned
> unmodified, but of course the completion table has to produce them. For
> a static completion table (e.g., in the simplest case a list of strings)
> the completion table itself will not allocate strings. In this scenario
> 'completion-filter-completions' will not perform any string allocations,
> only the list will be allocated. This is what leads to major
> performance gains.
My point was that at least some of this should be in the description,
otherwise it will leave the reader wondering.
> >> +(defvar completion--filter-completions nil
> >> + "Enable the new completions return value format.
> >
> > Btw, why is this an internal variable? Shouldn't all completion
> > styles ideally support it? If so, it should be a public variable,
> > documented in the ELisp manual. And the name should also end with -p,
> > since it's a boolean. How about completion-filter-completions-format-p?
>
> (As I understood the style guide '-p' is not a good idea for boolean
> variables, since a value is not a predicate in a strict sense.)
>
> To address your technical comment - this variable is precisely what one
> of the technical difficulties mentioned in my other mail is about. The
> question is how we can retain backward compatibility of the completion
> style 'all' functions, e.g., 'completion-basic-all-completions', while
> still allowing the function to return the newly introduced alist format
> with more data, which enables 'completion-filter-completions' to perform
> the efficient deferred highlighting.
I understand, but given that we provide this for other packages, it
shouldn't be an internal variable.
> > Also, the "This function has been superseded..." part should be a new
> > paragraph, so that it stands out. (And I'm not yet sure we indeed
> > want to say "superseded" here, but that's part of the on-going
> > discussion. Maybe use more neutral language here, like "See also".)
>
> The new API 'completion-filter-completions' will substitute the existing
> API 'completion-all-completions'.
That's your hope, and I understand. But we as a project didn't yet
decide to deprecate the original APIs, so talking about superseding is
premature.
> > Is "filter" really the right word here (in the doc string)? "Filter"
> > means you take a sequence and produce another sequence with some
> > members removed. That's not what this API does, is it? Suggest to
> > use a different name, like completion-completions-alist or
> > completion-all-completions-as-alist.
>
> "Filter" seems like exactly the right word to me. The function takes a
> list of strings (or a completion table) and returns a subset of matching
> completion strings without further modifications to the strings. See
> above what I wrote about allocations.
But the name says "filter completions". Which would mean you take a
list of completions and filter out some of them. A completion table
is much more general object than a list of strings. Thus, I think
using "filter" here is sub-optimal.
> >> +Only the elements of table that satisfy predicate PRED are considered.
> >> +POINT is the position of point within STRING. The METADATA may be
> >> +modified by the completion style. The return value is a alist with
> >> +the keys:
> >> +
> >> +- base: Base position of the completion (from the start of STRING)
> >
> > "Base" here means the beginning? If so, why not call it "beg" or
> > somesuch?
>
> Base position is a fixed term which is already used in minibuffer.el for
> completions. See also 'completion-base-position' for example.
Well, we don't have to keep bad habits indefinitely. It's okay to
lose them and use better terminology. Or at least to explain that
terminology in parentheses the first time it is used in some context.
> > Are we really losing the completion-score property here? If so, why?
>
> Yes, the property is removed in the current patch. It is not actually
> used for anything in the new implementation. But it is possible to
> restore the property such that 'completion-all-completions' always
> returns scored candidates as it does now. See my other mail regarding
> the caveats of the current patch.
I'd prefer not to lose existing features, because that'd potentially
make the changes backward-incompatible.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Sat, 14 Aug 2021 06:46:02 GMT) Full text and rfc822 format available.
Message #176 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, João Távora
> <joaotavora <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
> Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Fri, 13 Aug 2021 12:38:28 +0200
>
> I attached the overhauled patch, which addresses most of the comments by
> Eli. In comparison to my last patch, the patch is fully backward
> compatible and preserves all existing tests. As before, there are tests
> which check the new functionality for each existing completion style.
Thanks. You were faster than me, so I sent a few more comments to the
old patch today.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Sat, 14 Aug 2021 07:03:02 GMT) Full text and rfc822 format available.
Message #179 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Fri, 13 Aug 2021 14:56:38 +0200
> Cc: 47711 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>,
> 48841 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
>
> On 8/13/21 2:37 PM, João Távora wrote:
> > On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> >
> >> It is important to keep this property since this will preclude many bugs
> >> due to string mutation.
> >
> > I am aware of this, of course. Can you give examples of these "many bugs"?
> > Perhaps other than the one I already described and addressed?
>
> No, João, this is not how it goes. I don't have to prove to you that
> your idea introduces bugs. You have to show that mutation of the
> completion table strings (which are not supposed to be mutated) will not
> lead to bugs, which are hard to find.
Please calm down, both of you. No one has to prove anything to anyone
here, that's not how Emacs development works. We need to see which
idea is better, and if none is significantly better, we will probably
have both of them living side by side.
And while asking for an example of potential bugs is reasonable,
asking for a proof that a change will NOT lead to bugs isn't. So how
about a couple of examples where having original strings unchanged is
important, which could then be discussed?
> >> Note that your idea also does not address the other issues which are
> >> addressed by my patch.
> >
> > That's for sure. My patch idea addresses only that single problem.
> > I think this is a good property of patches: to solve one thing, not many.
>
> No, this is not necessarily true. This is only good if the problem is
> solved in a way which is future proof. The idea of mutating the strings
> is a hack and not a solution.
Just to make sure we are on the same page: adding a text property to a
string doesn't mutate a string. Lisp programs that process these
strings will not necessarily see any difference, and displaying those
strings will also not show any difference if the property is not
related to display. So the assumption that seems to be made here,
that adding a property is the same as mutating a string, is IMO
inaccurate if not incorrect.
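For reference, the distinction can be checked directly: `equal` ignores text
properties when comparing strings, while `equal-including-properties` does
not. A small sketch (the function name `my-compare-after-propertizing` is
illustrative only):

```elisp
(defun my-compare-after-propertizing ()
  "Compare a string with a literal after adding a text property to it."
  (let* ((plain (copy-sequence "car"))
         (shared plain))                ; second reference to the same object
    (put-text-property 0 3 'face 'error shared)
    (list (equal plain "car")                         ; characters unchanged
          (eq plain shared)                           ; still one object
          (equal-including-properties plain "car")))) ; properties now differ

(my-compare-after-propertizing)   ; ⇒ (t t nil)
```

So code that only compares or searches the characters sees no difference,
whereas code that inserts or displays the string does see the added property.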
And once again: please tone down your responses, both of you. TIA.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Sat, 14 Aug 2021 07:14:02 GMT) Full text and rfc822 format available.
Message #182 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 14 Aug 2021 05:47:43 +0300
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
> 47711 <at> debbugs.gnu.org
>
> I thought I explained the problem with this previously.
>
> It's basically this: we cannot mutate what we don't own. Across all the
> completion functions out there, there will be some that return "shared"
> strings (meaning, not copied or newly allocated) from their completion
> tables. And modifying them is bad, with consequences which can present
> themselves in unexpected, often subtle ways.
>
> Since up until now completion-pcm--hilit-commonality copied all strings
> before modifying, completion tables such as described (with "shared"
> strings) have all been "legal". Suddenly deciding to stop supporting
> them would be a major API breakage with consequences that are hard to
> predict. And while I perhaps agree that it's an inconvenience, I don't
> think it's a choice we can simply make at this stage in c-a-p-f's
> development.
This sounds like an argument against Daniel's approach as well, no?
Because if a completion API returns strings it "doesn't own", there
will be restrictions on Lisp programs that use those strings, because
those Lisp programs previously could do anything they liked with those
strings, and now they cannot. Or am I missing something?
> 1. (setq s (symbol-name 'car))
>
> 2. (put-text-property 1 3 'face 'error s)
>
> 3. Switch to a buffer in fundamental mode
>
> 4. (insert (symbol-name 'car)) --> see the error face in the buffer
>
> Now imagine that some completion table collects symbol names by passing
> obarray through #'symbol-name rather than #'all-completions, and voila,
> if the completion machinery adds properties (any properties, not just
> face) to those strings, you have just modified a bunch of global values.
> That's not good.
How is this different from Daniel's proposal of returning the original
strings? AFAIU, he just shifts the responsibility from the completion
code to the caller of the completion code, but basically leaves the
problem still very much real and pretty much in our face.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs.
(Sat, 14 Aug 2021 07:17:02 GMT) Full text and rfc822 format available.
Message #185 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 14 Aug 2021 05:55:17 +0300
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
> 47711 <at> debbugs.gnu.org
>
> On 13.08.2021 15:05, João Távora wrote:
> > If one removes these lines, the process becomes much faster, but there is a
> > problem with highlighting. My idea is indeed to defer highlighting by not
> > setting the 'face property directly on that shared string, but some
> > other property
> > that is read later from the shared string by compliant frontends.
>
> I haven't done any direct benchmarking, but I'm pretty sure that this
> approach cannot, by definition, be as fast as the non-mutating one.
Daniel seems to think otherwise, AFAIU.
> Because you go through the whole list and mutate all of its elements,
> you have to perform a certain bit of work W x N times, where N is the
> size of the whole list.
>
> Whereas the deferred-mutation approach will only have to do its bit
> (which is likely more work, like, WWW) only 20 times, or 100 times, or
> however many completions are displayed. And this is usually negligible.
>
> However big the difference is going to be, I can't say in advance, of
> course, or whether it's going to be shadowed by some other performance
> costs. But the non-mutating approach should have the best optimization
> potential when the list is long.
So I guess the suggestion to have a benchmark is still useful, because
the estimations of which approach has better performance vary between
you three. So maybe producing such benchmarks would be a good step?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 14 Aug 2021 08:24:02 GMT) Full text and rfc822 format available.
Message #188 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> Aside from the mutability/ownership issue,
>
> On 13.08.2021 15:05, João Távora wrote:
>> If one removes these lines, the process becomes much faster, but there is a
>> problem with highlighting. My idea is indeed to defer highlighting by not
>> setting the 'face property directly on that shared string, but some
>> other property
>> that is read later from the shared string by compliant frontends.
>
> I haven't done any direct benchmarking, but I'm pretty sure that this
> approach cannot, by definition, be as fast as the non-mutating one.
>
> Because you go through the whole list and mutate all of its elements,
> you have to perform a certain bit of work W x N times, where N is the
> size of the whole list.
Let's call W the work that you perform N times in this situation. In
the non-mutating approach, let's call it Z. So
W <= Z, because Z not only propertizes the string with a calculation of
faces but _also copies its character contents_.
Also I think it's better to start talking about copying rather than mutating.
As Eli points out, putting a text property in a string (my idea) is not
equivalent to "mutating" it.
> Whereas the deferred-mutation approach will only have to do its bit
> (which is likely more work, like, WWW) only 20 times, or 100 times, or
> however many completions are displayed. And this is usually
> negligible.
I think you're falling into the same fallacy you briefly fell into in the other
bug report. The flex and pcm styles (meaning
completion-pcm--hilit-commonality) have to go through all the completions
when deciding the score to attribute to each completion that we already
know matches the pattern. That's because this scoring is essential to
sorting. That's a given in both scenarios, copying and non-copying.
Then, it's true that only a very small set of those eventually have to
be displayed to the user, depending on where she wants her
scrolling page to be. So that's when you have to apply 'face' to, say
20 strings, and that can indeed be pretty fast. But where does the
information come from?
- Currently, it comes from the string's 'face' itself, which was copied
entirely.
- In the non-copying approach, it must come from somewhere else. One
idea is that it comes from a new "private" property 'lazy-face', also
in the string itself, but which was _not_ copied. Another idea is
just to remember the pattern and re-match it to those 20 strings.
I think the second alternative is always faster.
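A minimal sketch of the "remember the pattern and re-match" idea, not taken from any of the patches under discussion: all candidates are filtered without copying, and only the few that will be displayed are copied and given a `face'. The function name `my/lazy-highlight-demo' and the use of a plain regexp in place of a real completion pattern are assumptions made for illustration.

```elisp
(require 'cl-lib)
(require 'seq)

(defun my/lazy-highlight-demo (regexp candidates shown)
  "Filter CANDIDATES by REGEXP without copying; highlight only SHOWN of them.
Return fresh, `face'-propertized copies of the first SHOWN matches."
  (let* ((case-fold-search nil)
         ;; Pass over all N candidates: no copying, no property changes.
         (matches (cl-remove-if-not (lambda (c) (string-match-p regexp c))
                                    candidates))
         ;; Only this small slice is going to be displayed.
         (visible (seq-take matches shown)))
    (mapcar (lambda (c)
              ;; Copy just the displayed string, then re-match to find
              ;; the region that should carry the face.
              (let ((copy (copy-sequence c)))
                (when (string-match regexp copy)
                  (add-face-text-property (match-beginning 0) (match-end 0)
                                          'completions-common-part nil copy))
                copy))
            visible)))
```

With this shape, the number of calls to `copy-sequence' depends on SHOWN rather than on the length of CANDIDATES.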
> However big the difference is going to be, I can't say in advance, of
> course, or whether it's going to be shadowed by some other performance
> costs. But the non-mutating approach should have the best optimization
> potential when the list is long.
Don't think so. I'm doing benchmarks, will post soon.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 14 Aug 2021 09:49:02 GMT) Full text and rfc822 format available.
Message #191 received at 48841 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:
>> > On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
>> >
>> >> It is important to keep this property since this will preclude many bugs
>> >> due to string mutation.
>> > I am aware of this, of course. Can you give examples of these "many bugs"?
>> > Perhaps other than the one I already described and addressed?
>> No, João, this is not how it goes. I don't have to prove to you that
>> your idea introduces bugs. You have to show that mutation of the
>> completion table strings (which are not supposed to be mutated) will not
>> lead to bugs, which are hard to find.
> Please calm down, both of you. No one has to prove anything to anyone
> here, that's not how Emacs development works. We need to see which
> idea is better, and if none is significantly better, we will probably
> have both of them living side by side.
>
> And while asking for an example of potential bugs is reasonable,
> asking for a proof that a change will NOT lead to bugs isn't.
As far as I remember, I have done the first. I found bugs and addressed
them. I cannot _prove_ that my change will not lead to bugs indeed:
no-one can with any change. I've just stated repeatedly that I'm not
aware of any such bugs. I can understand "intuition" for bugs to a
certain extent (everyone has intuition), but this intuition must always
resolve into actual reality to be useful in the end.
> So how
> about a couple of examples where having original strings unchanged is
> important, which could then be discussed?
Good idea, so in the absence of any controlled benchmarks I did some of
my own, using the most controlled environment I could devise. I start
Emacs like so:
~/Source/Emacs/emacs/src/emacs -Q -f fido-mode -f fido-vertical-mode -l ~/tmp/benchmark.el ~/tmp/benchmark.el
I prime the obarray with lots of symbols to make completion purposely
slow:
(require 'cl-lib)
(cl-loop repeat 300000 do (intern (symbol-name (gensym))))
I attach the file. Then I try a run of 10 invocations of
;; Press C-u C-x C-e C-m quickly to produce a sample.
(benchmark-run (completing-read "" obarray))
This, I think, is a good representation of the responsiveness of the
completion system. It always prints well after I finish typing, so I
don't think I'm inducing any artificial slowdowns while it waits for my
input. When not measuring quantitatively, I also feel the difference in
responsiveness between different approaches.
Summarized results with an assortment of Emacs builds.
- the current master (254dc6ab4ca8e6a549a795f9eaf45378ce51b61f).
20.25 seconds total
- Applying Daniel's patch over 254dc6.
23.41 seconds total
- The theoretical best situation where we don't highlight in
completion-pcm--hilit-commonality (like 254dc6, but just removed
the copy-sequence)
10.70 seconds total
- Experimental patch published in
scratch/icomplete-lazy-highlight-attempt-2 (not finished, still
needs a way for frontends to opt into the optimization).
10.80 seconds total
I invite you all to reproduce these results.
In conclusion, I don't think Daniel's patch is going in the right
direction, *performance-wise*, for the kind of responsiveness scenarios
that I am concerned with, and which were discussed with Dmitry in
bug#48841. It seems to slow down the process by about 15% (23.41 vs. 20.25 seconds).
Note 1: there may be *other* performance scenarios that I am not aware
of, where Daniel's patch excels. I've requested these benchmarks,
regrettably without any success.
Note 2: this doesn't mean that there aren't *other* merits to Daniel's patch,
but I have not understood these yet. That is due to the stated fact
that the patch is very long, and seems to comprise performance
improvements, refactorings, and API redesign. It has no documentation
in the manual and/or examples on how to use the new API.
>> >> Note that your idea also does not address the other issues which are
>> >> addressed by my patch.
>> >
>> > That's for sure. My patch idea addresses only that single problem.
>> > I think this is a good property of patches: to solve one thing, not many.
>>
>> No, this is not necessarily true. This is only good if the problem is
>> solved in a way which is future proof. The idea of mutating the strings
>> is a hack and not a solution.
>
> Just to make sure we are on the same page: adding a text property to a
> string doesn't mutate a string. Lisp programs that process these
> strings will not necessarily see any difference, and displaying those
> strings will also not show any difference if the property is not
> related to display. So the assumption that seems to be made here,
> that adding a property is the same as mutating a string, is IMO
> inaccurate if not incorrect.
Yes, in Lisp it is very common to attach a "private" property to an
object. If no-one else knows about the existence of that property, then
attaching it is not harmful. Generally, of course: there are situations
where adding a private property brings side-effects to other parts of
the code. But IMO that other code is in the wrong, not the one that
adds properties.
Also, to be clear, attaching a different property (as in, not 'face') to
the completion string is only _one_ of the ways to bypass
copying. According to my measurements, performance doesn't seem to be
decided by property attachments, but by copying or not copying of the
character data of said strings in completion-pcm--hilit-commonality.
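To get a rough feel for the cost of copying alone, one could time it in isolation. This is only a sketch with made-up data (short numeric strings), not the benchmark described above, and absolute numbers will vary by machine and build. The name `my/copy-cost-demo' is hypothetical.

```elisp
(require 'benchmark)

(defun my/copy-cost-demo (n)
  "Time 10 passes over N strings, with and without copying each one.
Return a list (WITH-COPY WITHOUT-COPY) of elapsed seconds."
  (let ((strings (mapcar #'number-to-string (number-sequence 1 n))))
    (list
     ;; Each pass allocates a fresh copy of every string.
     (car (benchmark-run 10 (mapc #'copy-sequence strings)))
     ;; Each pass only walks the list.
     (car (benchmark-run 10 (mapc #'identity strings))))))

(my/copy-cost-demo 100000)
```

The first figure includes allocation and any garbage collection triggered by the copies, which is part of what a user would experience as lag.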
João
[benchmark.el (application/emacs-lisp, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 14 Aug 2021 10:37:02 GMT) Full text and rfc822 format available.
Message #194 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> It's basically this: we cannot mutate what we don't own. Across all of
> completion functions out there, there will be such that return
> "shared" strings (meaning, not copied or newly allocated) from their
> completion tables. And modifying them is bad, with consequences which
> can present themselves in unexpected, often subtle ways.
I agree with this premise. But would you call putting a uniquely named
text property in them a modification or mutation of said strings? I
don't.
> Since up until now completion-pcm--hilit-commonality copied all
> strings before modifying, completion tables such as described (with
> "shared" strings) have all been "legal".
Again, if I take one of these shared strings, in whichever environment
running right now, and I secretly attach a private "joaot/blargh"
property to it, there is very very low likelihood that that will hurt
anybody.
You seem to be worrying about re-setting the 'face' property (a public
property par excellence) and that's the very same bug I experienced and
have described earlier. It's not even a hard bug to see. Just remove the
copy-sequence in `completion-pcm--hilit-commonality' and see strange
stuff happening.
But if you set some other property, _that_ bug _doesn't_ occur. Do some
other bugs occur? I don't know, I don't think we'll ever know, for
_any_ change.
Furthermore, there are other ways to forego the copying in
`completion-pcm--hilit-commonality' and not even touch _ANY_ string
property.
> Suddenly deciding to stop supporting them would be a major API
> breakage with consequences that are hard to predict. And while I
> perhaps agree that it's an inconvenience, I don't think it's a choice
> we can simply make at this stage in c-a-p-f's development.
>
> To give an example of a subtle consequence:
>
> 1. (setq s (symbol-name 'car))
>
> 2. (put-text-property 1 3 'face 'error s)
>
> 3. Switch to a buffer in fundamental mode
>
> 4. (insert (symbol-name 'car)) --> see the error face in the buffer
It's not even subtle :-) Yes, this is what I have seen from the beginning in
bug#48841. I think it was even I who reported it to you.
The principle to follow can be summarized as this: "Don't touch values
of properties you don't own in objects you don't own."
So just don't touch the 'face' property in things you don't own! But
feel free to touch the "dmitry/blargh" property even in objects you
don't own.
So 'c-p--h-c' doesn't "own" face. So it must either create an object
that it owns or set something that it does own. 'completion-score' is
"owned" by 'c-p--h-c'. Only it can write it (though others can read
it).
> Now imagine that some completion table collects symbol names by
> passing obarray through #'symbol-name rather than #'all-completions,
> and voila, if the completion machinery adds properties (any
> properties, not just face) to those strings, you have just modified a
> bunch of global values. That's not good.
Why? Maybe I'm missing something. Why is adding properties -- that
no-one but the completion machinery knows about -- to those shared
strings "not good"? What bad thing can happen if I do?
> And in the example above, the values are those that the
> lispref/objects.texi says we should not change (though it gives
> (symbol-name 'cons) as example). "Not mutable", in its parlance. IIRC
> the related discussions mentioned that modifying such values could
> lead to a segfault in some previous Emacs versions. Maybe not anymore,
> but it's still not a good idea.
You're extrapolating "change" to also include attaching properties to
symbols. I think that document just means that you can't do stuff like
(aset "cons" 0 ?z)
or
(aset (symbol-name 'cons) 0 ?z)
I don't think it means you can't
(put-text-property 0 1 'joaot/muahahah 42 (symbol-name 'cons))
But maybe Eli or someone else more knowledgeable in the deep internals
of Emacs can correct me.
If indeed I'm wrong, there are other ways to forego the copying in
`c-p--hilit-commonality' and still not incur any such "mutation".
We must keep our eyes on the prize: copying -- not property-attaching --
is the real bummer here.
scratch/icomplete-lazy-highlight-attempt-2, although still incomplete,
is one such approach, though it still sets `completion-score` on the
"shared" string, used later for sorting. But also that could be
prevented (again, only if it turns out to be actually problematic IMO).
João
PS: Maybe I've not stated it clearly enough: I *don't* object to -- or
endorse -- Daniel's patch. My point was solely that it mixes too many
things for me to be intellectually able to review its functional merits,
and that those things should be separated into multiple problems and
patches to make this evaluation easier. Maybe someone with superior
intellectual capacity can review it -- on substance -- as it stands.
See my other reply containing benchmarks. Daniel's patch doesn't
perform well there, but for all I know, it can co-exist with my
non-copying approach, and we can all have our cake.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 14 Aug 2021 11:23:02 GMT) Full text and rfc822 format available.
Message #197 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 14.08.2021 10:12, Eli Zaretskii wrote:
>> From: Dmitry Gutov <dgutov <at> yandex.ru>
>> Date: Sat, 14 Aug 2021 05:47:43 +0300
>> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
>> 47711 <at> debbugs.gnu.org
>>
>> I thought I explained the problem with this previously.
>>
>> It's basically this: we cannot mutate what we don't own. Across all of
>> completion functions out there, there will be such that return "shared"
>> strings (meaning, not copied or newly allocated) from their completion
>> tables. And modifying them is bad, with consequences which can present
>> themselves in unexpected, often subtle ways.
>>
>> Since up until now completion-pcm--hilit-commonality copied all strings
>> before modifying, completion tables such as described (with "shared"
>> strings) have all been "legal". Suddenly deciding to stop supporting
>> them would be a major API breakage with consequences that are hard to
>> predict. And while I perhaps agree that it's an inconvenience, I don't
>> think it's a choice we can simply make at this stage in c-a-p-f's
>> development.
>
> This sounds like an argument against Daniel's approach as well, no?
> Because if a completion API returns strings it "doesn't own", there
> will be restrictions on Lisp programs that use those strings, because
> those Lisp programs previously could do anything they liked with those
> strings, and now they cannot. Or am I missing something?
Good question. It is not.
The completion tables described above have all been doing "legal"
things, in our general understanding.
But any callers of completion-all-completions were never really allowed
to modify the returned strings (those still were strings that code
"doesn't own").
Of course, some of those callers (I don't know any, though) might have
taken advantage of being able to modify the strings with impunity
because of completion-all-completions' implementation detail, but
they'll have a chance to clean up their act when switching to
completion-filter-completions.
>> 1. (setq s (symbol-name 'car))
>>
>> 2. (put-text-property 1 3 'face 'error s)
>>
>> 3. Switch to a buffer in fundamental mode
>>
>> 4. (insert (symbol-name 'car)) --> see the error face in the buffer
>>
>> Now imagine that some completion table collects symbol names by passing
>> obarray through #'symbol-name rather than #'all-completions, and voila,
>> if the completion machinery adds properties (any properties, not just
>> face) to those strings, you have just modified a bunch of global values.
>> That's not good.
>
> How is this different from Daniel's proposal of returning the original
> strings? AFAIU, he just shifts the responsibility from the completion
> code to the caller of the completion code, but basically leaves the
> problem still very much real and pretty much into our face.
This is a shift of responsibility in the right direction. The callers
might as well do the string copying when needed, but the fact of the
matter is, they usually only need to "copy" 20-100 strings (or however
many are displayed), if they need to modify them at all. That's where we
win: copying 100 strings is better than copying 10000.
Gotta run now, will reply to other email later.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 14 Aug 2021 11:30:02 GMT) Full text and rfc822 format available.
Message #200 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> From: João Távora <joaotavora <at> gmail.com>
> Date: Sat, 14 Aug 2021 11:36:32 +0100
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
> Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
> 47711 <at> debbugs.gnu.org
>
> > And in the example above, the values are those that the
> > lispref/objects.texi says we should not change (though it gives
> > (symbol-name 'cons) as example). "Not mutable", in its parlance. IIRC
> > the related discussions mentioned that modifying such values could
> > lead to a segfault in some previous Emacs versions. Maybe not anymore,
> > but it's still not a good idea.
>
> You're extrapolating "change" to also include attaching properties to
> symbols. I think that document just means that you can't do stuff like
>
> (aset "cons" 0 ?z)
>
> or
>
> (aset (symbol-name 'cons) 0 ?z)
>
> I don't think it means you can't
>
> (put-text-property 0 1 'joaot/muahahah 42 (symbol-name 'cons))
>
> But maybe Eli or someone else more knowledgeable in the deep internals
> of Emacs can correct me.
Text properties are stored separately from the string, so I don't
think adding properties can in general be referred to as "change".
Whether in some particular situation that could count as a "change"
depends on that situation and on the particular property, of course.
I'm not sure in the context of completion there's any reason to count
as "change" adding properties that don't affect display.
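A small illustration of this point, with a made-up property name (`my/private-tag') and a freshly allocated string standing in for one that a completion table hands out. Ordinary comparisons ignore text properties, so code that does not look for the property sees no difference:

```elisp
(defun my/private-property-demo ()
  "Attach a private text property to a string and compare it afterwards."
  (let ((shared (copy-sequence "car")))
    (put-text-property 0 1 'my/private-tag 42 shared)
    (list (equal shared "car")                         ; ignores text properties
          (string= shared "car")                       ; likewise
          (get-text-property 0 'my/private-tag shared) ; only visible if asked for
          (equal-including-properties shared "car")))) ; this one does notice

(my/private-property-demo)
;; returns (t t 42 nil)
```

Whether that counts as a "change" therefore depends on which of these questions the surrounding code asks, which matches the caveat about particular situations and particular properties.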
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 14 Aug 2021 12:13:02 GMT) Full text and rfc822 format available.
Message #203 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> Text properties are stored separately from the string, so I don't
> think adding properties can in general be referred to as "change".
>
> Whether in some particular situation that could count as a "change"
> depends on that situation and on the particular property, of course.
> I'm not sure in the context of completion there's any reason to count
> as "change" adding properties that don't affect display.
It is a destructive change, but we may just declare that completion
functions are allowed to destructively change the inputs in certain very
prescribed ways. I'd rather avoid that, though, if at all possible,
because it may lead to subtle bugs all over the place.
Stefan just reminded me (in a different bug report) that we've long
meant to extend the text property machinery with a "namespace" or
"plane" concept. The impetus for this is really the font locking
machinery which wants to manage some text properties that other packages
also want to manage.
The idea is that the display machinery would combine all the planes
before displaying, but each package would just manage its own "plane".
If we had something like this, then using this mechanism in the
completion context would make sense -- we could then say that completion
isn't allowed to alter anything except text properties in its private
plane.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 14 Aug 2021 12:41:01 GMT) Full text and rfc822 format available.
Message #206 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: João Távora <joaotavora <at> gmail.com>,
> mail <at> daniel-mendler.de,
> dgutov <at> yandex.ru, monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org,
> 47711 <at> debbugs.gnu.org
> Date: Sat, 14 Aug 2021 14:12:31 +0200
>
> Stefan just reminded me (in a different bug report) that we've long
> meant to extend the text property machinery with a "namespace" or
> "plane" concept. The impetus for this is really the font locking
> machinery which wants to manage some text properties that other packages
> also want to manage.
>
> The idea is that the display machinery would combine all the planes
> before displaying, but each package would just manage its own "plane".
> If we had something like this, then using this mechanism in the
> completion context would make sense -- we could then say that completion
> isn't allowed to alter anything except text properties in its private
> plane.
How can that work if at display time all the "planes" are combined?
It would mean that the code which produced the original strings will
get them displayed differently after completion.
Anyway, I'm not sure why you are talking about display here: the
properties which are supposed to store the information about
completion aren't supposed to affect display.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sat, 14 Aug 2021 13:30:03 GMT) Full text and rfc822 format available.
Message #209 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> How can that work if at display time all the "planes" are combined?
> It would mean that the code which produced the original strings will
> get them displayed differently after completion.
That's in the font-lock context, where font-lock will do faces on its
"plane" while other packages can do faces on their own "planes", and
they'll be combined on display.
It's not relevant in the context of this bug report, but I thought I'd
just mention the general design.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 15 Aug 2021 00:04:02 GMT) Full text and rfc822 format available.
Message #212 received at 48841 <at> debbugs.gnu.org (full text, mbox):
João Távora <joaotavora <at> gmail.com> writes:
> in the absence of any controlled benchmarks I did some of
> my own, using the most controlled environment I could devise. I start
> Emacs like so:
>
> ~/Source/Emacs/emacs/src/emacs -Q -f fido-mode -f fido-vertical-mode -l ~/tmp/benchmark.el ~/tmp/benchmark.el
I have now tweaked the benchmark slightly to make it easier to evaluate
speed qualitatively. Here's what I've been using.
(require 'cl-lib)
(fido-mode 1)
(fido-vertical-mode 1)
;; Introduce 150 000 new functions to really slow things down.
;; Probably more than most non-Spacemacs people will have :-)
(defmacro lotsoffunctions ()
  `(progn
     ,@(cl-loop repeat 150000
                collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))
(lotsoffunctions)
(when nil
  ;; Press C-u C-x C-e C-m quickly to produce a quantitative sample
  (benchmark-run (completing-read "" obarray))
  ;; Or just press C-h f to experience how fast/slow completion is.
  )
The results are the same as the ones I reported in the previous email.
I've also cleaned up my previous patch of the
scratch/icomplete-lazy-highlight-attempt-2 branch slightly. It is now
fully opt-in for frontends and completion-styles, so the backward
compatibility problems which I speculated about seem to have been exaggerated.
I'm still studying it for flaws, but anyone can have a look. And, of
course, there are many different ways to realize the "opt-in" for
frontends/styles. I just chose the one that seemed the simplest given
the current completion framework.
The performance is still very good, it reduces the usual waiting time in
long lists of completions to about half of what it currently is.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Sun, 15 Aug 2021 18:34:02 GMT) Full text and rfc822 format available.
Message #215 received at 48841 <at> debbugs.gnu.org (full text, mbox):
[I've removed bug#47711 from the list, since I haven't read the bug.
This is only directly concerned with this report bug#48841 about speed
differences between fido-mode and ido-mode.]
João Távora <joaotavora <at> gmail.com> writes:
> scratch/icomplete-lazy-highlight-attempt-2, although still incomplete,
> is one such approach, though it still sets `completion-score` on the
> "shared" string, used later for sorting. But also that could be
> prevented (again, only if it turns out to be actually problematic
> IMO).
I have tested the patch more thoroughly now, and have not found any
problems. It's now opt-in for both frontends and completion styles.
Used four combinations:
- For adhering styles (substring, pcm and flex) and adhering frontends
(icomplete-mode and fido-mode).
- For adhering styles with non-adhering but style-respecting frontends
(company and "vanilla" completion).
- Non-adhering styles with adhering frontends, also no problems.
- Non-adhering styles and non-adhering frontends, also no problems.
As for performance, I'm using the usual simple benchmark from Emacs -Q
(require 'cl-lib)
(fido-mode 1)
(fido-vertical-mode 1)
(defmacro lotsoffunctions ()
  `(progn
     ,@(cl-loop repeat 300000
                collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))
(lotsoffunctions)
I then press C-u C-x C-e C-m
(benchmark-run (completing-read "" obarray))
To get a quantitative benchmark or just C-h f to get a qualitative
one. Without the optimization this takes about 5s to evaluate; with the
optimization it is usually around 2.6s.
I also tested ido-mode on this, with
(ido-mode)
(ido-ubiquitous-mode)
(setq ido-enable-flex-matching t)
But it seems with ido-ubiquitous-mode, C-h f gives up around 20000
functions. So I tested around that mark and found:
* ido-mode to be minutely faster still in displaying the first set
though it isn't as well sorted by recency, size and alphanumericity.
However, I don't know if I'm seeing ido-mode correctly.
* ido-mode to be slower in responding to quick input (C-h f a), for
example. There's some flickering. It's also problematic when
quitting with C-g (almost hanging sometimes).
All in all I'm satisfied with the speed increase imparted to fido-mode
and fido-vertical-mode. In particular, the sluggishness reported for
short inputs (which makes the flex style consider a great deal of
matches) seems to be completely gone.
I'll let people try this out and review the patch, which is after my
sig.
If there's one thing I'm not crazy about, it's the opt-in interface for
frontends which requires them to somehow explain to the completion
machinery what constitutes a completion "session".
That, of course, is essential to allow any caches to be invalidated. I
don't know if the current completion framework has any better mechanism
than the one explained in the docstring of `completion-lazy-hilit'.
Maybe Stefan can speak to that, maybe the table "metadata" is a good
place, but that object seems complicated to access and manipulate.
Other than that detail, the fact that the opt-in is just a variable and
a function call seems simple enough, in my opinion.
Another note: the actual cache implementation is done with "private",
non-display-related string properties on non-copied strings. That's
somewhat of a bad practice to some of us, but not to others. I haven't
seen any evidence of mischief, real or academic, but if it ever comes
forward, the implementation can use some off-string caching mechanism
(likely just a hash-table).
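For reference, an off-string cache along those lines might look like the sketch below. It is not part of the patch; the names `my/hilit-cache', `my/remember-hilit' and `my/recall-hilit' are hypothetical. A weak-key, `eq'-tested table keeps per-string data without touching the strings themselves, and lets entries go away once the strings are garbage-collected.

```elisp
(defvar my/hilit-cache (make-hash-table :test 'eq :weakness 'key)
  "Hypothetical side table mapping string objects to highlight data.")

(defun my/remember-hilit (str data)
  "Associate DATA with the string object STR, leaving STR untouched."
  (puthash str data my/hilit-cache))

(defun my/recall-hilit (str)
  "Return the data remembered for the string object STR, or nil."
  (gethash str my/hilit-cache))
```

Because the table is keyed by object identity, two distinct strings with the same characters get separate entries, which mirrors what a per-string text property would give.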
João
diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index cd1979d04a..d69cb7568d 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -494,6 +494,7 @@ icomplete-minibuffer-setup
(setq-local icomplete--initial-input (icomplete--field-string))
(setq-local completion-show-inline-help nil)
(setq icomplete--scrolled-completions nil)
+ (setq completion-lazy-hilit (cl-gensym))
(use-local-map (make-composed-keymap icomplete-minibuffer-map
(current-local-map)))
(add-hook 'post-command-hook #'icomplete-post-command-hook nil t)
@@ -800,7 +801,9 @@ icomplete--render-vertical
(cl-return-from icomplete--render-vertical
(concat
" \n"
- (mapconcat #'identity torender icomplete-separator))))
+ (mapconcat #'identity
+ (mapcar #'completion-lazy-hilit torender)
+ icomplete-separator))))
for (comp prefix) in triplets
maximizing (length prefix) into max-prefix-len
maximizing (length comp) into max-comp-len
@@ -812,7 +815,7 @@ icomplete--render-vertical
(cl-loop for (comp prefix suffix) in triplets
concat prefix
concat (make-string (- max-prefix-len (length prefix)) ? )
- concat comp
+ concat (completion-lazy-hilit comp)
concat (make-string (- max-comp-len (length comp)) ? )
concat suffix
concat icomplete-separator))))
@@ -962,7 +965,8 @@ icomplete-completions
(if (< prospects-len prospects-max)
(push comp prospects)
(setq limit t)))
- (setq prospects (nreverse prospects))
+ (setq prospects
+ (nreverse (mapcar #'completion-lazy-hilit prospects)))
;; Decorate first of the prospects.
(when prospects
(let ((first (copy-sequence (pop prospects))))
diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index f335a9e13b..c21f234053 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -3512,6 +3512,54 @@ flex-score-match-tightness
than the latter (which has two \"holes\" and three
one-letter-long matches).")
+(defvar-local completion-lazy-hilit nil
+ "If non-nil, request lazy highlighting of completion matches.
+
+Completion-presenting frontends may opt to bind this variable to
+a unique non-nil value in the context of completion-producing
+calls (such as `completion-all-sorted-completions'). This hints
+the intervening completion styles that they do not need to
+propertize completion strings with the `face' property.
+
+When doing so, it is the frontend -- not the style -- who becomes
+responsible for `face'-propertizing the completion matches meant
+to be displayed to the user, frequently a small subset of all
+completion matches. This can be done by calling the function
+`completion-lazy-hilit' which returns a `face'-propertized
+string.
+
+The value stored in this variable by the completion frontend must
+be unique to each completion attempt/session. For instance,
+frontends which utilize the minibuffer as the locus of completion
+may set it to a buffer-local value returned by `gensym'. For
+frontends operating within a recursive command loop, let-binding
+it to `gensym' is appropriate.
+
+Note that the optimization enabled by this variable is only actually
+performed by some completion styles. To others, it is a harmless
+and useless hint. To author a completion style that takes
+advantage of this, look in the source of
+`completion-pcm--hilit-commonality' for ideas.")
+
+(defun completion-lazy-hilit (str)
+ "Return a copy of completion STR that is `face'-propertized.
+See documentation for variable `completion-lazy-hilit' for more
+details."
+ (let* ((str (copy-sequence str))
+ (data (get-text-property 0 'completion-lazy-hilit-data str))
+ (re (and
+ completion-lazy-hilit
+ (eq completion-lazy-hilit (car data)) (cdr data)))
+ (md (and re (string-match re str) (cddr (match-data t))))
+ (me (and md (match-end 0)))
+ (from 0))
+ (while md
+ (add-face-text-property from (pop md) 'completions-common-part nil str)
+ (setq from (pop md)))
+ (unless (or (not me) (= from me))
+ (add-face-text-property from me 'completions-common-part nil str))
+ str))
+
(defun completion-pcm--hilit-commonality (pattern completions)
"Show where and how well PATTERN matches COMPLETIONS.
PATTERN, a list of symbols and strings as seen
@@ -3527,8 +3575,10 @@ completion-pcm--hilit-commonality
last-md)
(mapcar
(lambda (str)
- ;; Don't modify the string itself.
- (setq str (copy-sequence str))
+ (unless completion-lazy-hilit
+ ;; Make a copy of `str' since in this case we're about to
+ ;; `face'-propertize it.
+ (setq str (copy-sequence str)))
(unless (string-match re str)
(error "Internal error: %s does not match %s" re str))
(let* ((pos (if point-idx (match-beginning point-idx) (match-end 0)))
@@ -3576,9 +3626,10 @@ completion-pcm--hilit-commonality
(update-score-and-face
(lambda (a b)
"Update score and face given match range (A B)."
- (add-face-text-property a b
- 'completions-common-part
- nil str)
+ (unless completion-lazy-hilit
+ (add-face-text-property a b
+ 'completions-common-part
+ nil str))
(setq
score-numerator (+ score-numerator (- b a)))
(unless (or (= a last-b)
@@ -3601,7 +3652,10 @@ completion-pcm--hilit-commonality
;; for that extra bit of match (bug#42149).
(unless (= from match-end)
(funcall update-score-and-face from match-end))
- (if (> (length str) pos)
+ (put-text-property 0 1 'completion-lazy-hilit-data
+ (cons completion-lazy-hilit re) str)
+ (if (and (> (length str) pos)
+ (not completion-lazy-hilit))
(add-face-text-property
pos (1+ pos)
'completions-first-difference
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 03:18:02 GMT)
Message #218 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 14.08.2021 14:29, Eli Zaretskii wrote:
> Text properties are stored separately from the string, so I don't
> think adding properties can in general be referred to as "change".
Are you thinking of C strings?
Lisp strings carry text properties in addition to the array of
characters. It doesn't really matter where in the memory the properties
and the characters reside.
> Whether in some particular situation that could count as a "change"
> depends on that situation and on the particular property, of course.
I was talking in the general sense: modifying a value.
One can talk about whether a certain modification matters in certain
situations, but that's not the way to discount a general principle.
> I'm not sure in the context of completion there's any reason to count
> as "change" adding properties that don't affect display.
For the context in question, whether the properties affect display is
not particularly important. Properties affecting display just make it
easier to notice that something's wrong. Bugs involving other properties
should be more difficult to investigate.
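
To make the point about Lisp strings concrete, here is a minimal sketch (not part of the original message). The function name `my/text-property-demo' and the property `my-hypothetical-prop' are made up for illustration; the other calls are standard Emacs Lisp.

```elisp
(defun my/text-property-demo ()
  "Show that a text property is attached to the string object itself.
Hypothetical helper, for illustration only."
  (let* ((original (copy-sequence "car")) ; fresh string we are free to modify
         (alias original))                ; second reference to the same object
    ;; Attach a property through one reference...
    (put-text-property 0 1 'my-hypothetical-prop 42 original)
    (list
     ;; ...and it is visible through the other reference.
     (get-text-property 0 'my-hypothetical-prop alias)
     ;; `equal' compares only the characters of strings,
     (equal alias "car")
     ;; while `equal-including-properties' also compares the properties.
     (equal-including-properties alias "car"))))

(my/text-property-demo)
;; => (42 t nil)
```

So code that only uses `equal' will not notice the change, while any code that inspects properties on the shared object will.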
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 03:22:02 GMT)
Message #221 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 14.08.2021 15:12, Lars Ingebrigtsen wrote:
> It is a destructive change, but we may just declare that completion
> functions are allowed to destructively change the inputs in certain very
> prescribed ways. I'd rather avoid that, though, if at all possible,
> because it may lead to subtle bugs all over the place.
That would be a breaking change, but it's a possibility, of course, if
we can't find a better way to implement this.
> Stefan just reminded me (in a different bug report) that we've long
> meant to extend the text property machinery with a "namespace" or
> "plane" concept. The impetus for this is really the font locking
> machinery which wants to manage some text properties that other packages
> also want to manage.
"Planes" for text properties are just prefixed properties, I guess?
That's different from the situation with font-lock.
> The idea is that the display machinery would combine all the planes
> before displaying, but each package would just manage its own "plane".
> If we had something like this, then using this mechanism in the
> completion context would make sense -- we could then say that completion
> isn't allowed to alter anything except text properties in its private
> plane.
Yes, if the code makes sure to only use prefixed properties, that would
limit the damage. It could still affect repeated (parallel?) uses of the
same values in the same piece of code.
And even if the effects are usually not serious, are we really okay with
evaluating (symbol-name 'car) someday and seeing lots of properties
attached to it?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 03:28:02 GMT)
Message #224 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 14.08.2021 10:01, Eli Zaretskii wrote:
> Just to make sure we are on the same page: adding a text property to a
> string doesn't mutate a string. Lisp programs that process these
> strings will not necessarily see any difference, and displaying those
> strings will also not show any difference if the property is not
> related to display. So the assumption that seems to be made here,
> that adding a property is the same as mutating a string, is IMO
> inaccurate if not incorrect.
This is nonsense.
A program won't necessarily see a difference in *any* changed value, as
long as some part of it stays the same.
I can zero out the tail of a string, and have a program that only looks
at its first few characters. It wouldn't mean that a string hasn't changed.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 03:28:03 GMT)
Message #227 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 4:21 AM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> And even if the effects are usually not serious, are we really okay with
> evaluating (symbol-name 'car) someday and seeing lots of properties
> attached to it?
I wouldn't mind that at all. For me, it's quite the same as evaluating
(symbol-plist 'car) and seeing (is-vehicle t number-of-wheels 4) along with
all the other byte-compilation stuff already there.
João Távora
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 03:33:02 GMT)
Message #230 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 06:27, João Távora wrote:
> I wouldn't mind that at all. For me, it's quite the same as evaluating
> (symbol-plist 'car) and seeing (is-vehicle t number-of-wheels 4) along with
> all the other byte-compilation stuff already there.
Those serve a real purpose, not just work as an accidental cache for
some earlier computation.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 03:49:02 GMT)
Message #233 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 14.08.2021 11:23, João Távora wrote:
> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>
>> Aside from the mutability/ownership issue,
>>
>> On 13.08.2021 15:05, João Távora wrote:
>>> If one removes these lines, the process becomes much faster, but there is a
>>> problem with highlighting. My idea is indeed to defer highlighting by not
>>> setting the 'face property directly on that shared string, but some
>>> other property
>>> that is read later from the shared string by compliant frontends.
>>
>> I haven't done any direct benchmarking, but I'm pretty sure that this
>> approach cannot, by definition, be as fast as the non-mutating one.
>>
>> Because you go through the whole list and mutate all of its elements,
>> you have to perform a certain bit of work W x N times, where N is the
>> size of the whole list.
>
> Let's call W the work that you perform N times in this situation. In
> the non-mutating case, let's call it Z. So
>
> W <= Z, because Z not only propertizes the string with a calculation of
> faces but _also copies its character contents_.
As I pointed out later in the email you're replying to, copying won't
happen N times.
> Also I think it's better to start talking about copying rather than mutating.
> As Eli points out, putting a text property in a string (my idea) is not
> equivalent to "mutating" it.
In common industry terms, that's mutation. Lisp strings are not C
strings, they are aggregate objects.
>> Whereas the deferred-mutation approach will only have to do its bit
>> (which is likely more work, like, WWW) only 20 times, or 100 times, or
>> however many completions are displayed. And this is usually
>> negligible.
>
> I think you're falling into the same fallacy you briefly fell into in the
> other bug report. The flex and pcm styles (meaning
> completion-pcm--hilit-commonality) have to go through all the completions
> when deciding the score to attribute to each completion that we already
> know matches the pattern. That's because this scoring is essential to
> sorting. That's a given in both scenarios, copying and non-copying.
First of all, note that scoring is only essential to the 'flex' style.
Whereas the improvements we're discussing should benefit all, and can be
more pronounced if the scoring doesn't need to be performed.
But ok, let's talk about flex in particular.
> Then, it's true that only a very small set of those eventually have to
> be displayed to the user, depending on where she wants her
> scrolling page to be. So that's when you have to apply 'face' to, say
> 20 strings, and that can indeed be pretty fast. But where does the
> information come from?
>
> - Currently, it comes from the string's 'face' itself, which was copied
> entirely.
>
> - In the non-copying approach, it must come from somewhere else. One
> idea is that it comes from a new "private" property 'lazy-face', also
> in the string itself, but which was _not_ copied. Another idea is
> just to remember the pattern and re-match it to those 20 strings.
Let's say that the cost to compute the score (on one completion) is S.
And the cost to highlight it is H. The cost to copy a string is C (that
would be amortized cost, including the subsequent GCs).
The current algorithm costs: N x (C + S + H)
C is unavoidable because of the existing API guarantees.
A non-mutating algorithm can cost:
N x S (for sorting)
+
100 x (C + S + H) (let's say we didn't even cache the scoring info)
...where 100 is the number of displayed completions (the number is
usually lower).
As we measured previously, copying is quite expensive. Even in the
above, not-caching approach we win ((N - 100) x (C + H)), and, okay,
lose 100 x S. For high values of N, it should be a clear win.
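
The arithmetic above can be checked with a small sketch (not part of the original message). The function names are hypothetical and the unit costs in the example call are made-up numbers chosen only to exercise the formulas, not measurements.

```elisp
(defun my/cost-eager (n c s h)
  "Cost of copying (C), scoring (S) and highlighting (H) all N completions."
  (* n (+ c s h)))

(defun my/cost-deferred (n d c s h)
  "Cost of scoring all N completions, then redoing C + S + H for the D shown."
  (+ (* n s) (* d (+ c s h))))

(defun my/cost-saving (n d c s h)
  "Difference between the eager and the deferred cost.
Algebraically this is (N - D) x (C + H) - D x S."
  (- (my/cost-eager n c s h) (my/cost-deferred n d c s h)))

;; Made-up unit costs: C = 3, S = 2, H = 1, with N = 20000 and D = 100.
(my/cost-saving 20000 100 3 2 1)
;; => 79400, i.e. (20000 - 100) x (3 + 1) - 100 x 2
```

The saving grows linearly with N, while the extra scoring cost depends only on the number of displayed completions.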
> I think the second alternative is always faster.
>
>> However big the difference is going to be, I can't say in advance, of
>> course, or whether it's going to be shadowed by some other performance
>> costs. But the non-mutating approach should have the best optimization
>> potential when the list is long.
>
> Don't think so. I'm doing benchmarks, will post soon.
I'm guessing you just skip the C step in your benchmarks? Which is
something that breaks our current contract.
Still, Daniel's patch should provide a comparable performance
improvement. If you're saying it doesn't give numbers as good, I'll have
to give it a more thorough read and testing tomorrow to comment on that.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 03:55:02 GMT)
Message #236 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 4:31 AM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>
> On 16.08.2021 06:27, João Távora wrote:
> > I wouldn't mind that at all. For me, it's quite the same as evaluating
> > (symbol-plist 'car) and seeing (is-vehicle t number-of-wheels 4) along with
> > all the other byte-compilation stuff already there.
>
> Those serve a real purpose, not just work as an accidental cache for
> some earlier computation.
Caches also serve "a real purpose". The gv-expander there
would be the "cache of an earlier computation". And I'm not
sure what "accidental" means, but if it means "implementation
detail for something I don't care about", I agree `completion-score`
is "accidental". Should it be called
`completion-score-internal-cache-dont-look-here`?
Maybe.
Bottom line here is that an outside observer has no clue, and
shouldn't need or want to have a clue, about what "foreign" properties
attached to strings or symbols mean. This is why Eli says, and
I agree, that if the property isn't display related, it's all good. No one
but the setter and reader of that particular property minds. The CL
systems I've worked with use package qualifiers to fine-grain this
even further, but they use the same principle. That Elisp allows a
string property list doesn't really make a difference IMO.
And none of this really really matters to the discussion. If we absolutely
had to store these associations away from the string plist, (for
some aesthetic reason, I guess), we could just use hash-tables.
Then we could return the string unchanged (and uncopied) and largely
keep performance intact.
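
A minimal sketch of that hash-table alternative (not part of the original message, and not code from either patch) could look like the following. All `my/' names are hypothetical. The table is keyed by string identity (`eq') and is weak on its keys, so an entry does not by itself keep a string alive.

```elisp
(defvar my/completion-side-table
  (make-hash-table :test #'eq :weakness 'key)
  "Hypothetical side table mapping completion strings to private data.")

(defun my/remember-match-data (str data)
  "Associate DATA with the string object STR without touching STR."
  (puthash str data my/completion-side-table)
  str)

(defun my/recall-match-data (str)
  "Return the data previously associated with the string object STR."
  (gethash str my/completion-side-table))

(defun my/side-table-demo ()
  "Return the recalled data and the (still empty) properties of the string."
  (let ((s "car"))
    (my/remember-match-data s '(0 1))
    (list (my/recall-match-data s)
          (text-properties-at 0 s))))

(my/side-table-demo)
;; => ((0 1) nil)
```

Because the lookup is by identity, a copy of the string made later will not find the entry; the frontend would have to look up the original object before copying it.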
But why do it, since a string plist is such a nice place to do these
associations and there's -- apart from your aesthetics considerations
-- 0 drawbacks identified?
João Távora
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 04:01:02 GMT)
Message #239 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 06:53, João Távora wrote:
> But why do it, since a string plist is such a nice place to do these
> associations and there's -- apart from your aesthetics considerations
> -- 0 drawbacks identified?
You read all the explanations, and THAT'S your conclusion?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 04:21:02 GMT)
Message #242 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> As I pointed out later in the email you're replying to, copying won't
> happen N times.
_Currently_, as in origin/master, it happens N times.
In my patch, when a frontend adheres to the thing, it happens D times
where D is the amount of strings you need to display. I guess if you do
the work to adapt a frontend to work with the new API proposed in
Daniel's patch it will be about the same (though likely in many more
lines of Elisp).
> First of all, note that scoring is only essential to the 'flex'
> style. Whereas the improvements we're discussing should benefit all,
> and can be more pronounced if the scoring doesn't need to be performed.
Yep, I agree. But I don't see why the same principles I espouse --
which really amount to: let the style know it doesn't need to
face-propertize -- can't be applied to other styles that don't need
scoring. Although those don't seem to suffer from any performance
problems, at least I haven't seen any complaints/reports/measurements
like you did for bug#48841.
> But ok, let's talk about flex in particular.
Yes, I think that is important since it is the style known to be least
performant by its very lax nature.
> I'm guessing you just skip the C step in your benchmarks? Which is
> something that breaks our current contract.
Right. Skipping the 'C = Copying step' is the whole point. It breaks
our contract because the completion styles currently promise to
"face"-propertize the string. So this is why I propose to let the
completion-style know that it doesn't need to. When it is told of that,
it is relieved of the necessity of copying the string. It is the
frontend that will do that copy just before face-propertizing and
displaying the string. As you note, and reality shows, that's much
faster. There is no disagreement here.
> Still, Daniel's patch should provide a comparable performance
Kind of; from what I've read, it _should_ open the way for that to
happen. From what I understand, you must then change the frontend (in a
big way?) to stop using completion-all-completions and start using the
new thing. That work has not been done, as far as I know. Whereas in
my proposal (which is now a patch posted to bug#48841) you change the
frontend in a very minor way, and that work _has_ been done.
Icomplete was very easy to adapt. I can try adapting company soon.
In practice, we can't kill off completion-all-completions and start
everyone on completion-filter-completions (if that's what it's called).
So if the latter does turn out to be a step in the right direction (I'm
mostly waiting on Stefan to chime in), then I also don't see why we
couldn't have, as Eli suggested, both strategies for lazy highlighting
at some point in the future.
> improvement. If you're saying it doesn't give numbers as good, I'll
> have to give it a more thorough read and testing tomorrow to comment
> on that.
It's not me who is saying it, it's my Emacs :-) The real problem is that
with Daniel's patch, the frontends using the current API (as in
icomplete/fido) measurably become _slower_. Though not by much (around
10%), it is still a shame.
Yes, do your testing and please, as always, try to report as
quantitatively as possible, so that we can really compare apples to
apples.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 04:26:02 GMT)
Message #245 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> On 16.08.2021 06:53, João Távora wrote:
>> But why do it, since a string plist is such a nice place to do these
>> associations and there's -- apart from your aesthetics considerations
>> -- 0 drawbacks identified?
>
> You read all the explanations, and THAT'S your conclusion?
Yes, I currently conclude that there are 0 drawbacks identified to
creating, _via_ string properties, an association of, say, property
'joaot/answer' with value '42' to _any_ string in my current and future
Emacs runtimes.
I conclude that because --- apart from your aesthetics considerations
("do we really want to see...") --- I have not seen identification of
such drawbacks. Have I missed any?
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 08:48:02 GMT)
Message #248 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/14/21 9:01 AM, Eli Zaretskii wrote:
> Just to make sure we are on the same page: adding a text property to a
> string doesn't mutate a string. Lisp programs that process these
> strings will not necessarily see any difference, and displaying those
> strings will also not show any difference if the property is not
> related to display. So the assumption that seems to be made here,
> that adding a property is the same as mutating a string, is IMO
> inaccurate if not incorrect.
While I agree that adding text properties is not mutation of the string
text itself, it still mutates the string data structure. Dmitry made a
good point about this: if a completion table uses obarray as its
backend, you suddenly attach text properties to symbol
names. This is not a good idea. I actually had such an issue once in a
package I developed, where I accidentally attached text properties (via
the mutating put-text-property API) to some strings I didn't own. This
led to unexpected side effects in other places where I didn't expect
the strings to have properties attached.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 08:49:02 GMT)
Message #251 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/14/21 9:12 AM, Eli Zaretskii wrote:
>> Since up until now completion-pcm--hilit-commonality copied all strings
>> before modifying, completion tables such as described (with "shared"
>> strings) have all been "legal". Suddenly deciding to stop supporting
>> them would be a major API breakage with consequences that are hard to
>> predict. And while I perhaps agree that it's an inconvenience, I don't
>> think it's a choice we can simply make at this stage in c-a-p-f's
>> development.
>
> This sounds like an argument against Daniel's approach as well, no?
> Because if a completion API returns strings it "doesn't own", there
> will be restrictions on Lisp programs that use those strings, because
> those Lisp programs previously could do anything they liked with those
> strings, and now they cannot. Or am I missing something?
No, in my patch the displayed candidate strings are still copied before
the text properties are attached. The strings are kept intact as they
are now.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 08:54:02 GMT)
Message #254 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/16/21 6:20 AM, João Távora wrote:
> It's not me who is saying it, it's my Emacs :-) The real problem is that
> with Daniel's patch, the frontends using the current API (as in
> icomplete/fido) measurably become _slower_. Though not by much (around
> 10%), it is still a shame.
I have to check this. I claim that 'completion-all-completions' should
not get slower with my patch. If it does indeed get slower, as your benchmark
shows, this should be fixed and can be fixed, since I am doing
nothing other than decomposing the highlighting and filtering
processes, which are already present in the current machinery. The
amount of work stays the same. However, when the new
'completion-filter-completions' API is used, the filtering will get much
faster since no highlighting and copying takes place.
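
The split between filtering and highlighting can be sketched as follows (not part of the original message). This is only an illustration of the idea under simplifying assumptions: it uses plain prefix matching via `all-completions' rather than the proposed 'completion-filter-completions', whose interface exists only in the patch under discussion, and all `my/' names are hypothetical.

```elisp
(require 'seq)

(defun my/highlight-prefix (str prefix)
  "Return a copy of STR with its first (length PREFIX) characters highlighted.
The original STR is left untouched."
  (let ((copy (copy-sequence str)))
    (add-face-text-property 0 (length prefix)
                            'completions-common-part nil copy)
    copy))

(defun my/filter-then-highlight (prefix candidates display-count)
  "Filter CANDIDATES by PREFIX, then highlight only DISPLAY-COUNT of them.
Return a cons (ALL . SHOWN): ALL is the filtered list, SHOWN the
highlighted copies meant for display."
  (let* ((all (all-completions prefix candidates))
         (shown (mapcar (lambda (s) (my/highlight-prefix s prefix))
                        (seq-take all display-count))))
    (cons all shown)))

;; Example call: two candidates match, only one is copied and highlighted.
(my/filter-then-highlight "ca" '("car" "cdr" "cadr" "cons") 1)
```

Only the strings that end up on screen are copied and propertized; the rest of the filtered list is passed along as is.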
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 09:09:02 GMT)
Message #257 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/16/21 6:25 AM, João Távora wrote:
> Yes, I currently conclude that there are 0 drawbacks identified to
> creating, _via_ string properties, an association of, say, property
> 'joaot/answer' with value '42' to _any_ string in my current and future
> Emacs runtimes.
>
> I conclude that because --- apart from your aesthetics considerations
> ("do we really want to see...") --- I have not seen identification of
> such drawbacks. Have I missed any?
There are serious drawbacks to attaching "private" string properties to
arbitrary strings. For one, it seriously complicates debugging if there
are suddenly string properties attached to symbol names. It also leads
to a potential memory leak. But even if we could choose this approach
with global side effects, I don't see a reason to do it given the
approach I am proposing in my patch, which avoids these problems entirely.
1. My approach decomposes the already existing completion machinery into
two steps: 1. filtering and 2. highlighting. It does not change the
fundamentals of the completion machinery. This decomposition of the
existing infrastructure is also what leads to the small number of added
lines to minibuffer.el, even if the patch itself is not as small as we
would like to have it.
2. Why take the chances of introducing potentially harmful global side
effects by attaching string properties to arbitrary strings if we can
avoid it easily?
3. The `completion-filter-completions` API is the fastest possible API
for filtering since it does not change the strings at all, it does not
attach string properties or modify the strings in any other way. By
construction, it is a pure function which only filters the list of
candidates. This makes the function easy to use and easy to reason
about. An API which adds private properties to the strings cannot be as
fast and non-intrusive.
4. If 'completion-all-completions' does indeed get slower thanks to my
patch, it is a performance regression of my patch. I will fix this. And
I thank you João for bringing this to my attention. However one should
also consider that in the end, 'fido-mode' and 'icomplete-mode' should
move to the new API 'completion-filter-completions' such that they
benefit from the fast filtering and only copy and highlight the actually
displayed strings. Given this, a potential regression of
'completion-all-completions' would matter less for the incrementally
updating UIs. But of course I feel the same way as João - a performance
regression in the API 'completion-all-completions' is unacceptable. It
will be fixed.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 09:43:02 GMT)
Message #260 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Hello Eli,
here I respond to the comments you sent out after I've already sent the
overhauled patch.
On 8/14/21 8:27 AM, Eli Zaretskii wrote:
>> The function 'completion-filter-completions' receives a completion table
>> as argument. The strings produced by this table are returned
>> unmodified, but of course the completion table has to produce them. For
>> a static completion table (e.g., in the simplest case a list of strings)
>> the completion table itself will not allocate strings. In this scenario
>> 'completion-filter-completions' will not perform any string allocations,
>> only the list will be allocated. This is what leads to major
>> performance gains.
>
> My point was that at least some of this should be in the description,
> otherwise it will leave the reader wondering.
I agree with this. The documentation should be improved.
>>>> +(defvar completion--filter-completions nil
>>>> + "Enable the new completions return value format.
>>>
>>> Btw, why is this an internal variable? Shouldn't all completion
>>> styles ideally support it? If so, it should be a public variable,
>>> documented in the ELisp manual. And the name should also end with -p,
>>> since it's a boolean. How about completion-filter-completions-format-p?
>>
>> (As I understood the style guide '-p' is not a good idea for boolean
>> variables, since a value is not a predicate in a strict sense.)
>>
>> To address your technical comment - this variable is precisely what one
>> of the technical difficulties mentioned in my other mail is about. The
>> question is how we can retain backward compatibility of the completion
>> style 'all' functions, e.g., 'completion-basic-all-completions', while
>> still allowing the function to return the newly introduced alist format
>> with more data, which enables 'completion-filter-completions' to perform
>> the efficient deferred highlighting.
>
> I understand, but given that we provide this for other packages, it
> shouldn't be an internal variable.
No, we explicitly don't provide this variable for other packages. It is
explicitly only meant to be used by the existing completion styles
emacs21, emacs22, basic, substring, partial-completion, initials and
flex to opt in to the alist return format in a backward-compatible way
that preserves the calling convention. The idea is to keep the existing APIs
fully backward compatible.
Other packages should select the format returned from the completion
styles differently. They should return the alist format on Emacs 28 or
if the 'completion-filter-completions' API is present. In the not so
near future external packages which support only Emacs 28 and upwards
will then only return the alist format and don't have to perform any
detection anymore.
>>> Also, the "This function has been superseded..." part should be a new
>>> paragraph, so that it stands out. (And I'm not yet sure we indeed
>>> want to say "superseded" here, but that's part of the on-going
>>> discussion. maybe use a more neutral language here, like "See also".)
>>
>> The new API 'completion-filter-completions' will substitute the existing
>> API 'completion-all-completions'.
>
> That's your hope, and I understand. But we as a project didn't yet
> decide to deprecate the original APIs, so talking about superseding is
> premature.
It is not the hope - it is the explicit goal. The API has been designed
to replace the existing API 'completion-all-completions'. We can of
course tone this down. However I, as a package author, would appreciate
if Emacs tells me when a newer API aims to replace another API and when
the documentation is explicit about it. Of course if you decide to have
the doc strings written in a different tone, I will adapt my patch
accordingly. Here I am just explaining why I chose the word "superseded"
instead of a more neutral word.
>>> Is "filter" really the right word here (in the doc string)? "Filter"
>>> means you take a sequence and produce another sequence with some
>>> members removed. That's not what this API does, is it? Suggest to
>>> use a different name, like completion-completions-alist or
>>> completion-all-completions-as-alist.
>>
>> "Filter" seems like exactly the right word to me. The function takes a
>> list of strings (or a completion table) and returns a subset of matching
>> completion strings without further modifications to the strings. See
>> above what I wrote about allocations.
>
> But the name says "filter completions". Which would mean you take a
> list of completions and filter out some of them. A completion table
> is much more general object than a list of strings. Thus, I think
> using "filter" here is sub-optimal.
Okay, you are right about this. But I think even if the name
'completion-filter-completions' is not 100% precise it still conveys
what the API is about. 'completion-completions-alist' or
'completion-all-completions-as-alist' are valid names of course, but I
dislike them for their verbosity. Already 'completion-all-completions'
is quite verbose. A strong argument to use this long name is that the
completion style functions are still called
'completion-basic-all-completions' etc. But if we accept that the new
API 'completion-filter-completions' will actually supersede the existing
API 'completion-all-completions' it makes sense to use a name which will
not hurt our eyes in the long run. However this is of course just a
personal preference of mine. I don't want to spend much time on name
bikeshedding discussions. If you decide on a name, I will adapt my patch
accordingly.
>>>> +Only the elements of table that satisfy predicate PRED are considered.
>>>> +POINT is the position of point within STRING. The METADATA may be
>>>> +modified by the completion style. The return value is an alist with
>>>> +the keys:
>>>> +
>>>> +- base: Base position of the completion (from the start of STRING)
>>>
>>> "Base" here means the beginning? If so, why not call it "beg" or
>>> somesuch?
>>
>> Base position is a fixed term which is already used in minibuffer.el for
>> completions. See also 'completion-base-position' for example.
>
> Well, we don't have to keep bad habits indefinitely. It's okay to
> lose them and use better terminology. Or at least to explain that
> terminology in parentheses the first time it is used in some context.
Okay, I agree. However I tried to avoid including superfluous changes
with my patch set. We should add more context and documentation and then
rename the variables in another patch if we decide that we want to go
through with it.
>>> Are we really losing the completion-score property here? If so, why?
>>
>> Yes, the property is removed in the current patch. It is not actually
>> used for anything in the new implementation. But it is possible to
>> restore the property such that 'completion-all-completions' always
>> returns scored candidates as it does now. See my other mail regarding
>> the caveats of the current patch.
>
> I'd prefer not to lose existing features, because that'd potentially
> make the changes backward-incompatible.
The overhauled patch (version 2) ensures that no features are lost. The
patch is fully backward compatible. There is a potential performance
regression in 'completion-all-completions' as identified by João. I have
yet to confirm this regression. In any case, it should be fixable since
the refactored 'completion-all-completions' API does precisely the same
amount of work as it does now.
Furthermore, more documentation should be added to my patch. As of now
'completion-all-completions' is not mentioned in the info manual and
'completion-filter-completions' is also not added there. We may want to
improve the documentation of that. But for now I would like to limit my
documentation improvements to expanding the doc strings of the functions
involved in my patch.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 10:17:01 GMT)
Message #263 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 10:09 AM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> There are serious drawbacks of attaching "private" string properties to
> arbitrary strings. For one, it complicates debugging seriously if there
> are suddenly string properties attached to symbol names. It also leads
> to a potential memory leak.
Please, in the name of the sanity of this discussion, justify these two
statements with examples or follow them with a clause like "because...".
> 3. The `completion-filter-completions` API is the fastest possible API
Again that's quite a statement that I cannot evaluate the veracity of
without hard proof.
What I _can_ evaluate is the material you have put forward. Using your
patch and your Vertico completion software, version 0.14, the very
latest one, I try
    emacs -Q -f package-initialize benchmark.el
where benchmark.el is:
    (require 'cl-lib)                   ; for `cl-loop'
    (setq completion-styles '(flex))
    ;; Define 300000 trivial functions so that obarray is large.
    (defmacro heyhey ()
      `(progn
         ,@(cl-loop repeat 300000
                    collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))
    (heyhey)
I then turn on vertico-mode and press C-h f. It takes about 4-5 seconds.
It's *faster* than if I do the same with fido-vertical-mode and the current
master, but is noticeably *slower* than if I do the same with the patch
provided earlier and available at scratch/icomplete-lazy-highlight-attempt-2.
Unfortunately, I cannot measure quantitatively here, because I don't
know how to tell Vertico to wait until it gets the correct result.
In other words, take this form:
(completing-read "bla" obarray)<cursor here>
if you type C-u C-x C-e C-m veeery s-l-o-w-l-y in Vertico, it prints,
correctly, the character "%". But if you evaluate it quickly wrapped
in a benchmark-run, it returns immediately and prints "".
In fido-mode, it always blocks until it prints the correct result
and the time it took it to achieve that result. Not questioning if this is
a bug in Vertico, but it would help if it could do the same, or be
configured to do the same, so that we can measure.
Without that, we can't evaluate the speed of Vertico where,
presumably, the fastest API in the world is being used right now.
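As a rough aid for the measurement problem described above, the filtering backend alone can be timed without any frontend by calling 'completion-all-completions' directly inside 'benchmark-run'. This is only a sketch under stated assumptions: it needs Emacs 27 or later for the flex style, it measures the style machinery rather than the interactive round-trip, and the input "hey", the 'fboundp' predicate and the name demo-bench-result are arbitrary choices made for illustration.

```elisp
(require 'benchmark)

;; Time one flex-style filtering pass over obarray, with no UI involved.
;; The input "hey" and the `fboundp' predicate are illustrative
;; assumptions; they are not part of either patch under discussion.
(defvar demo-bench-result
  (let ((completion-styles '(flex)))
    (benchmark-run 1
      (completion-all-completions "hey" obarray #'fboundp 3))))

;; `benchmark-run' returns (ELAPSED-SECONDS GC-COUNT GC-SECONDS).
(message "elapsed: %.3fs, GCs: %d"
         (car demo-bench-result) (cadr demo-bench-result))
```

Because no minibuffer is involved, the call blocks until the candidate list is complete, so the elapsed time is comparable between builds.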
> 4. If 'completion-all-completions' does indeed get slower thanks to my
> patch, it is a performance regression of my patch. I will fix this. And
> I thank you João for bringing this to my attention. However one should
> also consider that in the end, 'fido-mode' and 'icomplete-mode' should
> move to the new API 'completion-filter-completions' such that they
> benefit from the fast filtering and only copy and highlight the actually
> displayed strings.
Maybe they will, maybe they will. But it's still quite early to decide if
we're going to move all frontends to that API. There's no manual
documentation for it. Conceivably, if you appreciate your API so, you
could demonstrate to us in practice how easy it is to use by providing
a separate patch that adapts icomplete-mode and fido-mode to use it.
Then, I or other fido-mode users could test it for a while, evaluating
its speed and correctness. If it performs well and the completion
API architects have a good outlook for it, I see no reason why it
shouldn't be merged and eventually supersede the old one.
In the meantime, there is a patch with a measured and documented
performance boost where fido-mode and icomplete-mode
opt in to a new `completion-lazy-hilit` feature in the "old" API with
a total of four 1-line changes. That patch lives in the branch
scratch/icomplete-lazy-highlight-attempt-2.
I think we should move to that, solving the bug#48841 while we
do the evaluation of all aspects of your contributions.
João Távora
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 10:54:02 GMT)
Message #266 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/16/21 12:15 PM, João Távora wrote:
> On Mon, Aug 16, 2021 at 10:09 AM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
>
>> There are serious drawbacks of attaching "private" string properties to
>> arbitrary strings. For one, it complicates debugging seriously if there
>> are suddenly string properties attached to symbol names. It also leads
>> to a potential memory leak.
>
> Please, in the name of the sanity of this discussion, justify these two
> statements with examples or follow them with a clause like "because...".
João, I am giving hard examples here. How are the "memory leak" and the
verbose debugging output caused by the attached string properties not
examples? You always repeat your demands but whatever I write
it is not satisfactory for you. Is it possible to convince you? Can you
try to interpret my arguments in a positive and constructive light
somehow and fill them with meaning such that it makes sense to you? My
goal is not to be right here just for the sake of being right. As Eli
said, nobody has to prove anything.
>> 3. The `completion-filter-completions` API is the fastest possible API
>
> Again that's quite a statement that I cannot evaluate the veracity of
> without hard proof.
As I said, I will ensure that my API does not introduce performance
regressions. And since my new API performs strictly less work than your
proposal it will necessarily be faster if you consider only the
filtering, which is what matters for incrementally updating UIs.
>> 4. If 'completion-all-completions' does indeed get slower thanks to my
>> patch, it is a performance regression of my patch. I will fix this. And
>> I thank you João for bringing this to my attention. However one should
>> also consider that in the end, 'fido-mode' and 'icomplete-mode' should
>> move to the new API 'completion-filter-completions' such that they
>> benefit from the fast filtering and only copy and highlight the actually
>> displayed strings.
>
> Maybe they will, maybe they will. But it's still quite early to decide if
> we're going to move all frontends to that API. There's no manual
> documentation for it. Conceivably, if you appreciate your API so, you
> could demonstrate in practice us how easy it is to use by providing
> a separate patch that adapts icomplete-mode and fido-mode to use it.
There is also no manual documentation of 'completion-all-completions'.
Of course you are right that it is early to make these decisions, but
the new API 'completion-filter-completions' is designed in a way to
allow adoption by 'icomplete-mode'. It is important to design the API
such that adoption is possible. I have a patch for Vertico which shows
this. I can provide patches for 'icomplete-mode' separately later on.
> In the meantime, there is a patch with a measured and documented
> performance boost where fido-mode and icomplete-mode
> opt in to a new `completion-lazy-hilit` feature in the "old" API with
> a total of four 1-line changes. That patch lives in the branch
> scratch/icomplete-lazy-highlight-attempt-2.
As argued multiple times here now, the change you are proposing is
fragile. It will lead to problems later on. The goal is not to find the
smallest change which leads to a performance boost, all API violations
allowed. Attaching "private" string properties to arbitrary strings is
an API violation which we will regret later and which will make Emacs
harder to debug and harder to use.
> I think we should move to that, solving the bug#48841 while we
> do the evaluation of all aspects of your contributions.
No, we should not merge this problematic patch of yours. See the many
arguments against this proposal.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 11:38:01 GMT)
Message #269 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 11:53 AM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> >> There are serious drawbacks of attaching "private" string properties to
> >> arbitrary strings. For one, it complicates debugging seriously if there
> >> are suddenly string properties attached to symbol names. It also leads
> >> to a potential memory leak.
> > Please, in the name of the sanity of this discussion, justify these two
> > statements with examples or follow them with a clause like "because...".
> João, I am giving hard examples here.
If I say to you: "It's quite obvious your patch breaks all Git repositories
in Kathmandu, Nepal" I am expected to demonstrate how a patch to
Emacs leads to an obscure security flaw in the Linux operating system,
that tickles a butterfly at a certain angle that causes an earthquake
in the Kathmandu data center, literally breaking their Git repositories.
This is the kind of statement you are making to me and the kind
of logical reasoning I'm looking for.
Alternatively, you can provide an actual experiment I can run in
my computer that demonstrates the bug.
I am not a native English speaker, and maybe you don't understand
my language. Another way to explain what I am talking about is to talk about
"bug reproduction". You say there's a bug in my patch, I am asking you
for a "bug reproduction recipe" as defined by most, if not all, the results
you get by searching "bug reproduction recipe" in the Google search engine.
> goal is not to be right here just for the sake of being right. As Eli
> said, nobody has to prove anything.
This is clearly not what he said.
> >> 3. The `completion-filter-completions` API is the fastest possible API
> >
> > Again that's quite a statement that I cannot evaluate the veracity of
> > without hard proof.
>
> As I said, I will ensure that my API does not introduce performance
> regressions.
Not only that: to establish the veracity of that statement you would
need a much more demanding proof, one that somehow evaluates all
possible APIs and compellingly demonstrates that yours triumphs.
> I have a patch for Vertico which shows
> this. I can provide patches for 'icomplete-mode' separately later on.
Yes, please do. The earlier, the better.
> No, we should not merge this problematic patch of yours. See the many
> arguments against this proposal.
I'm sorry to speak this child-like language, but a problem is a "bad thing".
An undesirable thing that happens when presumably safe and good
action(s) is taken by some user. Can you explain how, given my patch,
a user would take a sequence of innocent actions that would lead to a
"bad thing" that would _not_ happen if the same sequence of innocent
actions were taken in a version of Emacs without the patch applied?
That, to me, is what constitutes "a bug/problem in the patch".
Let me give you an example: if I make a patch that deletes `/lisp` in
the Emacs source tree, the innocent action "make" would probably
not work. That would be the problem/"bad thing"/bug in that patch.
We cannot proceed with this discussion without these explanations. Mind
you, I'm not stating, because it is impossible to prove, that my
patch cannot possibly generate problems, subtle or obvious (that, by
the way, is my interpretation of what Eli meant). But since you so
confidently state that it does, it's quite reasonable that I ask you
for examples that demonstrate it.
Once you do demonstrate these bugs, it's reasonable I will go about
fixing them. Exactly as you say you are going to do. I demonstrated
with code and numbers a regression in _your_ patch, and you say you are
going to fix it. That's great, and that's the way it should be. But
you possibly wouldn't go about fixing it if I hadn't demonstrated the
regression compellingly, just as I can't go about fixing a "memory leak"
or "debugging difficulties" if you don't explain what these things mean
to you or how my patch causes them.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 11:48:02 GMT)
Message #272 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Cc: mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org,
> 47711 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Mon, 16 Aug 2021 06:17:32 +0300
>
> On 14.08.2021 14:29, Eli Zaretskii wrote:
> > Text properties are stored separately from the string, so I don't
> > think adding properties can in general be referred to as "change".
>
> Are you thinking of C strings?
No, about the implementation of Lisp strings in Emacs.
> Lisp strings carry text properties in addition to the array of
> characters. It doesn't really matter where in the memory the properties
> and the characters reside.
Well, it does, at least in some situations. The string text is not
affected, and so the code which processes the string will not notice
that it has a property about which that code has no idea. Only
properties that are known to the processing code can affect it;
non-standard properties private to some other code will generally pass
unnoticed.
> > Whether in some particular situation that could count as a "change"
> > depends on that situation and on the particular property, of course.
>
> I was talking in the general sense: modifying a value.
>
> One can talk about whether a certain modification matters in certain
> situations, but that's not the way to discount a general principle.
I didn't want to start a general philosophical discussion about string
mutability. I hoped to provide input of specific practical use in the
context of this discussion. If what I said is not useful, just
disregard it.
> > I'm not sure in the context of completion there's any reason to count
> > as "change" adding properties that don't affect display.
>
> For the context in question, whether the properties affect display is
> not particularly important. Properties affecting display just make it
> easier to notice that something's wrong. Bugs involving other properties
> should be more difficult to investigate.
Once again, if some code invents its private property, not used
anywhere else and not documented anywhere else, then putting such a
property on a string has very high chances of going unnoticed. I hope
this consideration helps this discussion, because saying that
properties change a string blurs the distinction between actually
changing the string text or its properties that affect many parts in
Emacs, and adding some obscure property that is not known to anyone.
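The distinction drawn above between a string's text and its properties can be checked directly. The following is a minimal sketch; the property name demo-private and the function name are hypothetical and stand in for any property "private to some other code".

```elisp
(defun demo-property-visibility ()
  "Compare a plain string with a copy carrying a private property."
  (let ((plain "foobar")
        ;; `propertize' returns a copy, so PLAIN itself is untouched.
        (tagged (propertize "foobar" 'demo-private t)))
    (list (equal plain tagged)                       ; compares text only
          (string= plain tagged)                     ; compares text only
          (equal-including-properties plain tagged)  ; text and properties
          (get-text-property 0 'demo-private tagged))))

(demo-property-visibility)
;; => (t t nil t)
```

Code that compares or processes only the text sees no difference; only code that explicitly asks about properties, as 'equal-including-properties' or 'get-text-property' do, can tell the two strings apart.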
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 11:49:02 GMT)
Message #275 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Cc: 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, joaotavora <at> gmail.com,
> 48841 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Mon, 16 Aug 2021 06:26:58 +0300
>
> On 14.08.2021 10:01, Eli Zaretskii wrote:
>
> > Just to make sure we are on the same page: adding a text property to a
> > string doesn't mutate a string. Lisp programs that process these
> > strings will not necessarily see any difference, and displaying those
> > strings will also not show any difference if the property is not
> > related to display. So the assumption that seems to be made here,
> > that adding a property is the same as mutating a string, is IMO
> > inaccurate if not incorrect.
>
> This is nonsense.
>
> A program won't necessarily see a difference in *any* changed value, as
> long as some part of it stays the same.
>
> I can zero out the tail of a string, and have a program that only looks
> at its first few characters. It wouldn't mean that a string hasn't changed.
You are not making any sense.
Anyway, if what I wrote doesn't help, feel free to disregard it.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 11:58:02 GMT)
Message #278 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Cc: joaotavora <at> gmail.com, monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org,
> 47711 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 10:48:19 +0200
>
> On 8/14/21 9:12 AM, Eli Zaretskii wrote:
> >> Since up until now completion-pcm--hilit-commonality copied all strings
> >> before modifying, completion tables such as described (with "shared"
> >> strings) have all been "legal". Suddenly deciding to stop supporting
> >> them would be a major API breakage with consequences that are hard to
> >> predict. And while I perhaps agree that it's an inconvenience, I don't
> >> think it's a choice we can simply make at this stage in c-a-p-f's
> >> development.
> >
> > This sounds like an argument against Daniel's approach as well, no?
> > Because if a completion API returns strings it "doesn't own", there
> > will be restrictions on Lisp programs that use those strings, because
> > those Lisp programs previously could do anything they liked with those
> > strings, and now they cannot. Or am I missing something?
>
> No, in my patch the displayed candidate strings are still copied before
> the text properties are attached. The strings are kept intact as they
> are now.
I was talking about the infrastructure that produces the completion
candidates, not about the application that uses them. My point is
that your approach requires the applications using the candidates to
copy them, whereas previously they could use them without copying.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 12:03:02 GMT)
Message #281 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 12:57 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> I was talking about the infrastructure that produces the completion
> candidates, not about the application that uses them. My point is
> that your approach requires the applications using the candidates to
> copy them, whereas previously they could use them without copying.
If it helps, I think that that is true of all alternatives presented
so far (though I haven't read the big patch fully yet). The difference is
that the consumers who copy the candidate strings will only copy a much
smaller number, typically only the ones that need to be displayed.
Whereas currently, all candidate strings are copied, displayed or not.
João Távora
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 12:07:02 GMT)
Message #284 received at 48841 <at> debbugs.gnu.org (full text, mbox):
João, the discussion is clearly not progressing. I propose that we both
take a step back and let the Emacs maintainers, who participated in this
discussion, decide on how to proceed. It seems to me that all arguments
and data have been presented and there is no need for further
reiterations in more and more colorful language. I would also like to
point out that implying that I don't understand your language is
borderline acceptable. I understand the discussion very well, but I
don't understand why you are using these unfair and invalid means of
discussion.
For example there could be these decision outcomes:
1. The information presented up to now does not allow the maintainers to
make a decision. For example the maintainers may ask for further
clarification from you, João, or they may ask for benchmarks from me or
proof that my patch does not lead to performance regressions.
2. The maintainers decide that no patch should be merged.
3. The maintainers decide that your patch will be merged. I will accept
this decision.
4. The maintainers decide that my patch will be merged. You will accept
this decision.
5. The maintainers decide that both patches will be merged such that
both approaches will be supported. We both will accept this decision.
I want to summarize the situation in the following:
The patches in question address a performance issue in the current
completion machinery which is caused by over-eager copying of the
completion candidate strings and over-eager application of the
highlighting to all candidate strings. For incrementally updating UIs it
would be sufficient to only copy and highlight the strings which are
actually going to be displayed.
My patch takes the approach to expose the existing two-step completion
process, which consists of filtering and highlighting. By returning the
filtered completion strings and a highlighting function this two-step
process is decomposed and the caller of the API has the ability to call
the highlighting function on only the displayed subset of completion
candidates. I argue that exposing the filtering and highlighting
processing steps is the logical and natural conclusion of the existing
machinery.
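The two-step idea described above can be sketched with existing primitives. This is not the code of either patch: the demo- functions are hypothetical, plain prefix matching via 'all-completions' stands in for a real completion style, and the highlighting is reduced to a single face.

```elisp
(require 'seq)

(defun demo-filter (input table)
  "Step 1: return the candidates in TABLE that start with INPUT.
The returned strings may be shared with TABLE and are not modified."
  (all-completions input table))

(defun demo-highlight (input candidates)
  "Step 2: return highlighted copies of CANDIDATES.
The part matching INPUT gets the `completions-common-part' face."
  (mapcar (lambda (cand)
            (let ((copy (copy-sequence cand)))
              (add-face-text-property 0 (length input)
                                      'completions-common-part nil copy)
              copy))
          candidates))

(defun demo-visible-candidates (input table count)
  "Filter TABLE by INPUT, then copy and highlight only COUNT candidates."
  (demo-highlight input (seq-take (demo-filter input table) count)))
```

With a large table and a small COUNT, only COUNT strings are ever copied and propertized, which is the saving both patches aim for.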
My patch is fully backward compatible and aims to not introduce any
regressions (also no performance regressions) to the existing API.
Furthermore my patch adheres to the current guarantees given by the
existing 'completion-all-completions' API. The completion strings
provided by the completion backend are not mutated in any way, no string
properties are attached. Since the API 'completion-filter-completions'
proposed in my patch does the minimal amount of work necessary (only
filtering), if no highlighting is requested, I argue that the new API is
the most efficient possible API, given the current constraints.
Furthermore since I am introducing a new API, outstanding issues can be
solved which could not be solved until now given the constraints of the
existing 'completion-all-completions' API. In particular the new API
'completion-filter-completions' returns additional data like the end
position of the completion boundaries. Until now the end position was
not made available and 'completion-base-position' just used the length
of the input to guess the end position. In a strict sense this guess is
incorrect and there is a FIXME in minibuffer.el, mentioning this issue.
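For readers unfamiliar with the terminology, the existing function 'completion-boundaries' shows what a base (start) position and an end position mean. The sketch below uses only that current function, not the proposed API; the inputs are made up for illustration.

```elisp
;; For a plain list of strings the whole input is the completion field:
;; START is 0 and END is the length of the text after point.
(completion-boundaries "foo" '("foobar" "foobaz") nil "bar")
;; => (0 . 3)

;; For file names the field is delimited by the nearest directory
;; separators: START indexes into the text before point ("/usr/sh"),
;; END indexes into the text after point ("are/doc").
(completion-boundaries "/usr/sh" #'completion-file-name-table nil "are/doc")
;; => (5 . 3)
```

In the second call only "share" is the field being completed, so taking the length of the whole input as the end position would be wrong whenever text follows point.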
The downside of my patch is that it is a large patch. While it adds only
196 lines of code, which is not much and expected given that it only
reshuffles the existing machinery, it is still a large patch in total.
On the other side, João's patch avoids the complication of adhering to
the existing guarantees of the APIs and takes the liberty of attaching
"private" string properties to the completion strings of the completion
table backend. I argue that attaching the string properties is a
violation of the guarantees of the existing API and violates the
expectations of the existing many completion tables. One very severe
scenario is when obarray is used as completion table, since then each
symbol name gets a private property attached. I argue that such global
side effects like attaching string properties to all completion
candidates should better be avoided. There is the issue that the
attached string properties are a potential memory leak. When dumping the
string representation of symbol names, the symbols will have additional
properties which will complicate the debugging experience. Furthermore
it will lead to confusion since the global side effect during completion
will suddenly have an influence on symbols which have nothing to do
with completion. The big advantage of João's patch is that it
is very limited in scope and very simple. However I argue that this
simplicity is hard-won and we will regret this approach later due to the
global side effects.
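The mechanism behind the symbol-name concern can be reproduced in isolation. The sketch below is an illustration only: it uses an uninterned symbol so that obarray is not touched, and demo-private is a hypothetical property name, not the one used in João's patch. It shows that the property persists and appears in printed output; it does not by itself show whether that is harmful in practice.

```elisp
(defun demo-tag-symbol-name ()
  "Attach a private property to a symbol's name and inspect the result."
  (let* ((sym (make-symbol (copy-sequence "heyhey-demo")))
         (name (symbol-name sym)))
    ;; This mutates the string object that the symbol itself holds.
    (put-text-property 0 (length name) 'demo-private t name)
    (list
     ;; The property is still present on a later lookup.
     (get-text-property 0 'demo-private (symbol-name sym))
     ;; Printing the name now uses the #("..." ...) property syntax.
     (prin1-to-string (symbol-name sym))
     ;; Printing the symbol itself is unaffected.
     (prin1-to-string sym))))
```

Calling (demo-tag-symbol-name) returns the property value, the printed name with its property list, and the unchanged printed symbol, in that order.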
Therefore I conclude that the two-step process proposed in my patch,
which does not introduce problematic global side effects is the better
approach forward. Furthermore a new API is needed such that more
completion data can be returned, e.g., the completion end position. One
could even return additional match data in the future given that the new
API 'completion-filter-completions' is extensible thanks to its alist
return value.
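The extensibility argument can be illustrated with a small consumer of such an alist. Only the key 'base' appears in the quoted doc string; the keys 'end' and 'completions', the key 'future-key' and the function name are assumptions made for this sketch.

```elisp
(defun demo-describe-result (result)
  "Summarize RESULT, an alist shaped like the proposed return value.
Only the keys this function knows about are read; others are ignored."
  (format "base=%s end=%s count=%d"
          (alist-get 'base result)
          ;; A missing key falls back to a default instead of failing.
          (alist-get 'end result 'unknown)
          (length (alist-get 'completions result))))

(demo-describe-result '((base . 0)
                        (completions "foo" "bar")
                        (future-key . 42)))
;; => "base=0 end=unknown count=2"
```

A consumer written this way keeps working when new keys are added later, which a positional return value would not allow.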
João, please feel free to also present your closing arguments here.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 12:09:03 GMT)
Message #287 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/16/21 1:57 PM, Eli Zaretskii wrote:
> I was talking about the infrastructure that produces the completion
> candidates, not about the application that uses them. My point is
> that your approach requires the applications using the candidates to
> copy them, whereas previously they could use them without copying.
Okay, I understand. Yes, in my patch the strings returned by
'completion-filter-completions' must not be mutated by the API consumer
directly. This should be documented clearly, but it is not unexpected.
For example the API 'all-completions' which one can use to obtain the
strings from a completion table also requires the caller to not mutate
the returned strings.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 12:19:02 GMT)
Message #290 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> João, please feel free to also present your closing arguments here.
Sorry, I don't "close" arguments like this.
I hope you can provide:
* the fixes to the regression identified
* the benchmarks you say you have
* the patches to icomplete.el that show how it uses your new API
* the demonstrations of the bugs you accuse my patch of suffering from
Thanks.
On Mon, Aug 16, 2021 at 1:05 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
>
> João, the discussion is clearly not progressing. I propose that we both
> take a step back and let the Emacs maintainers, who participated in this
> discussion, decide on how to proceed. It seems to me that all arguments
> and data has been presented and there is no need for further
> reiterations in more and more colorful language. I would also like to
> point out that implying that I don't understand your language is
> borderline acceptable. I understand the discussion very well, but I
> don't understand why you are using these unfair and invalid means of
> discussion.
>
> For example there could be these decision outcomes:
>
> 1. The information presented up to now does not allow the maintainers to
> make a decision. For example the maintainers may ask for further
> clarification from you, João, or they may ask for benchmarks from me or
> a prove that my patch does not lead to performance regressions.
>
> 2. The maintainers decide that no patch should be merged.
>
> 2. The maintainers decide that your patch will be merged. I will accept
> this decision.
>
> 3. The maintainers decide that my patch will be merged. You will accept
> this decision.
>
> 4. The maintainers decide that both patches will be merged such that
> both approaches will be supported. We both will accept this decision.
>
> I want to summarize the situation in the following:
>
> The patches in question address a performance issue in the current
> completion machinery which is caused by over-eager copying of the
> completion candidate strings and over-eager application of the
> highlighting to all candidate strings. For incrementally updating UIs it
> would be sufficient to only copy and highlight the strings which are
> actually going to be displayed.
>
> My patch takes the approach to expose the existing two-step completion
> process, which consists of filtering and highlighting. By returning the
> filtered completion strings and a highlighting function this two-step
> process is decomposed and the caller of the API has the ability to call
> the highlighting function on only the displayed subset of completion
> candidates. I argue that exposing the filtering and highlighting
> procession steps is the logical and natural conclusion of the existing
> machinery.
>
> My patch is fully backward compatible and aims to not introduce any
> regressions (also no performance regressions) to the existing API.
> Furthermore my patch adheres to the current guarantees given by the
> existing 'completion-all-completions' API. The completion strings
> provided by the completion backend are not mutated in any way, no string
> properties are attached. Since the API 'completion-filter-completions'
> proposed in my patch does the minimal amount of work necessary (only
> filtering), if no highlighting is requested, I argue that the new API is
> the most efficient possible API, given the current constraints.
>
> Furthermore since I am introducing a new API, outstanding issues can be
> solved which could not be solved until now given the constraints of the
> existing 'completion-all-completions' API. In particular the new API
> 'completion-filter-completions' API returns additional data like the end
> position of the completion boundaries. Until now the end position was
> not made available and 'completion-base-position' just used the length
> of the input to guess the end position. In a strict sense this guess is
> incorrect and there is a FIXME in minibuffer.el, mentioning this issue.
>
> The downside of my patch is that it is a large patch. While it adds only
> 196 lines of code, which is not much and expected given that it only
> reshuffles the existing machinery, it is still a large patch in total.
>
> On the other side, João's patch avoids the complication of adhering to
> the existing guarantees of the APIs and takes the liberty of attaching
> "private" string properties to the completion strings of the completion
> table backend. I argue that attaching the string properties is a
> violation of the guarantees of the existing API and violates the
> expectations of the existing many completion tables. One very severe
> scenario is when obarray is used as completion table, since then each
> symbol name gets a private property attached. I argue that such global
> side effects like attaching string properties to all completion
> candidates should better be avoided. There is the issue that the
> attached string properties are a potential memory leak. When dumping the
> string representation of symbol names, the symbols will have additional
> properties which will complicate the debugging experience. Furthermore
> it will lead to confusion since the global side effect during completion
> will suddenly have an influence on symbols which have nothing to do
> with completion. The big advantage of João's patch is that it
> is very limited in scope and very simple. However I argue that this
> simplicity is hard-won and we will regret this approach later due to the
> global side effects.
>
> Therefore I conclude that the two-step process proposed in my patch,
> which does not introduce problematic global side effects is the better
> approach forward. Furthermore a new API is needed such that more
> completion data can be returned, e.g., the completion end position. One
> could even return additional match data in the future given that the new
> API 'completion-filter-completions' is extensible thanks to its alist
> return value.
>
> João, please feel free to also present your closing arguments here.
>
> Daniel
--
João Távora
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 12:20:02 GMT)
Message #293 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> From: João Távora <joaotavora <at> gmail.com>
> Date: Mon, 16 Aug 2021 13:02:04 +0100
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Dmitry Gutov <dgutov <at> yandex.ru>,
> Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
>
> On Mon, Aug 16, 2021 at 12:57 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > I was talking about the infrastructure that produces the completion
> > candidates, not about the application that uses them. My point is
> > that your approach requires the applications using the candidates to
> > copy them, whereas previously they could use them without copying.
>
> If it helps, I think that that is true of all alternatives presented
> so far
Yes, I know. I was comparing the proposed alternatives to what we
have now, and specifically because Dmitry mentioned this aspect.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 12:40:03 GMT)
Message #296 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
> 47711 <at> debbugs.gnu.org, 48841 <at> debbugs.gnu.org,
> Stefan Monnier <monnier <at> iro.umontreal.ca>, Eli Zaretskii <eliz <at> gnu.org>
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 12:52:58 +0200
>
> On 8/16/21 12:15 PM, João Távora wrote:
> > On Mon, Aug 16, 2021 at 10:09 AM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> >
> >> There are serious drawbacks of attaching "private" string properties to
> >> arbitrary strings. For one, it complicates debugging seriously if there
> >> are suddenly string properties attached to symbol names. It also leads
> >> to a potential memory leak.
> >
> > Please, in the name of the sanity of this discussion, justify these two
> > statements with examples or follow them with a clause like "because...".
>
> João, I am giving hard examples here. What is not an example about
> "memory leak" or making debugging output verbose thanks to the attached
> string properties?
FWIW, I also don't understand how adding properties could cause a
memory leak. When a string is GCed, its properties get GCed as well,
all of them. Am I missing something?
As to more difficult debugging, I think adding a couple of properties
that have simple structure will not impair debugging too much.
Strings with many properties are not uncommon in Emacs, so we already
have to deal with that.
> As I said, I will ensure that my API does not introduce performance
> regressions. And since my new API performs strictly less work than your
> proposal it will necessarily be faster if you consider only the
> filtering, which is what matters for incrementally updating UIs.
I would indeed suggest both to make sure there's no performance
regressions, and would like to see some data similar to what João
presented, which backs up your assessments about your proposal being
faster. Since performance is the main motivation for these changes, I
think it's important for us to be on the same page wrt facts related
to performance, before we make the decision how to proceed.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 12:45:02 GMT)
Message #299 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
> 47711 <at> debbugs.gnu.org, 48841 <at> debbugs.gnu.org,
> Stefan Monnier <monnier <at> iro.umontreal.ca>, Eli Zaretskii <eliz <at> gnu.org>
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 14:05:54 +0200
>
> João, the discussion is clearly not progressing. I propose that we both
> take a step back and let the Emacs maintainers, who participated in this
> discussion, decide on how to proceed. It seems to me that all arguments
> and data have been presented and there is no need for further
> reiterations in more and more colorful language.
As I wrote elsewhere, I'd like to see the performance aspects of this
to be presented from both sides, and agreed upon. I don't think we
can make the decision before we have performance data we all agree
about. The other pros and cons are all of qualitative nature, and
involve intuition, personal experience, and personal preferences, so
each one will have their own balance. But performance is both basic
and quantitative, and we should have the facts and agree on them.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 12:50:02 GMT)
Message #302 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 8/16/21 2:39 PM, Eli Zaretskii wrote:
>> João, I am giving hard examples here. What is not an example about
>> "memory leak" or making debugging output verbose thanks to the attached
>> string properties?
>
> FWIW, I also don't understand how adding properties could cause a
> memory leak. When a string is GCed, its properties get GCed as well,
> all of them. Am I missing something?
If you add string properties to all symbol names this memory will stay
alive for much longer than necessary. It is not a memory leak in the
strongest sense. The memory is still reachable, but there is still no
need to keep the string properties allocated. This is comparable to
memory leaks in other GCed languages where memory is also kept alive for
longer than necessary.
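The lifetime concern above can be illustrated with a minimal sketch. It is not taken from either patch: the property name `my-demo-score' and the symbol name are hypothetical, and it assumes that `symbol-name' hands out the symbol's own name string rather than a copy.

```elisp
;; Illustration only; `my-demo-score' and the symbol name are made up.
(defun my-demo-property-survives-p ()
  "Return non-nil if a property put on a symbol's name is seen later."
  (let ((sym (intern "my-demo-symbol-for-bug48841")))
    ;; Attach a property to the name string, as a completion style might
    ;; do to a candidate taken straight from obarray.
    (put-text-property 0 1 'my-demo-score 0.5 (symbol-name sym))
    ;; Unrelated code asking for the name later sees the property too,
    ;; assuming `symbol-name' keeps returning the same string object.
    (get-text-property 0 'my-demo-score (symbol-name sym))))
```

Under that assumption the property lives as long as the symbol stays interned, which is the sense in which the memory is kept "longer than necessary".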
> As to more difficult debugging, I think adding a couple of properties
> that have simple structure will not impair debugging too much.
> Strings with many properties are not uncommon in Emacs, so we already
> have to deal with that.
I disagree with that. We are talking about adding string properties to
every symbol name. This is a global side effect and different from
adding string properties to a set of isolated strings in a controlled
manner. I also don't understand why one would even want to take any
chances here given that the feature can be implemented in a way which
avoids this global side effect entirely as my patch shows.
>> As I said, I will ensure that my API does not introduce performance
>> regressions. And since my new API performs strictly less work than your
>> proposal it will necessarily be faster if you consider only the
>> filtering, which is what matters for incrementally updating UIs.
>
> I would indeed suggest both to make sure there's no performance
> regressions, and would like to see some data similar to what João
> presented, which backs up your assessments about your proposal being
> faster. Since performance is the main motivation for these changes, I
> think it's important for us to be on the same page wrt facts related
> to performance, before we make the decision how to proceed.
I will prepare some benchmarks.
Daniel
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 13:00:02 GMT)
Message #305 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Cc: joaotavora <at> gmail.com, dgutov <at> yandex.ru, monnier <at> iro.umontreal.ca,
> 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 11:42:22 +0200
>
> >> To address your technical comment - this variable is precisely what one
> >> of the technical difficulties mentioned in my other mail is about. The
> >> question is how we can retain backward compatibility of the completion
> >> style 'all' functions, e.g., 'completion-basic-all-completions', while
> >> still allowing the function to return the newly introduced alist format
> >> with more data, which enables 'completion-filter-completions' to perform
> >> the efficient deferred highlighting.
> >
> > I understand, but given that we provide this for other packages, it
> > shouldn't be an internal variable.
>
> No, we explicitly don't provide this variable for other packages. It is
> explicitly only meant to be used for the existing completion styles
> emacs21, emacs22, basic, substring, partial-completion, initials and
> flex to opt-in in a backward-compatible/calling convention preserving
> way to the alist return format. The idea is to keep the existing APIs
> fully backward compatible.
>
> Other packages should select the format returned from the completion
> styles differently. They should return the alist format on Emacs 28 or
> if the API 'completion-filter-completions' is present. In the not so
> near future external packages which support only Emacs 28 and upwards
> will then only return the alist format and don't have to perform any
> detection anymore.
What if some package outside minibuffer.el will want to control the
format of the returned value, for some reason, like support for old
Emacs versions? Are we going to disallow that?
> >> The new API 'completion-filter-completions' will substitute the existing
> >> API 'completion-all-completions'.
> >
> > That's your hope, and I understand. But we as a project didn't yet
> > decide to deprecate the original APIs, so talking about superseding is
> > premature.
>
> It is not the hope - it is the explicit goal. The API has been designed
> to replace the existing API 'completion-all-completions'.
A goal is not a fact. Until that goal is reached, we cannot in good
faith tell people an API is superseded.
> We can of
> course tone this down. However I, as a package author, would appreciate
> if Emacs tells me when a newer API aims to replace another API and when
> the documentation is explicit about it.
That's understood, and when we make that decision, we will of course
announce it. But we didn't do so yet, and this discussion is not even
about that decision. It could be, for example, that both APIs will
live side by side until we decide whether to deprecate the old one.
> Of course if you decide to have
> the doc strings written in a different tone, I will adapt my patch
> accordingly. Here I am just explaining why I chose the word "superseded"
> instead of a more neutral word.
I understand your motivation, I'm just saying that we cannot announce
deprecation before we actually decide to deprecate.
> > But the name says "filter completions". Which would mean you take a
> > list of completions and filter out some of them. A completion table
> > is much more general object than a list of strings. Thus, I think
> > using "filter" here is sub-optimal.
>
> Okay, you are right about this. But I think even if the name
> 'completion-filter-completions' is not 100% precise it still conveys
> what the API is about. 'completion-completions-alist' or
> 'completion-all-completions-as-alist' are valid names of course, but I
> dislike them for their verbosity. Already 'completion-all-completions'
> is quite verbose. A strong argument to use this long name is that the
> completion style functions are still called
> 'completion-basic-all-completions' etc. But if we accept that the new
> API 'completion-filter-completions' will actually supersede the existing
> API 'completion-all-completions' it makes sense to use a name which will
> not hurt our eyes in the long run. However this is of course just a
> personal preference of mine. I don't want to spend much time with name
> bikeshedding discussions. If you decide on a name, I will adapt my patch
> accordingly.
Emacs is frequently accused of having names that are hard to
discover. The only time where we can improve that is when a symbol is
introduced, because later it's impossible for compatibility reasons.
So I'd like to come up with a good name before we install the changes.
That said, I'll let others chime in and agree or disagree with the
name you've chosen.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 13:22:02 GMT)
Message #308 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> Cc: joaotavora <at> gmail.com, 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
> larsi <at> gnus.org, monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 14:49:45 +0200
>
> On 8/16/21 2:39 PM, Eli Zaretskii wrote:
> >> João, I am giving hard examples here. What is not an example about
> >> "memory leak" or making debugging output verbose thanks to the attached
> >> string properties?
> >
> > FWIW, I also don't understand how adding properties could cause a
> > memory leak. When a string is GCed, its properties get GCed as well,
> > all of them. Am I missing something?
>
> If you add string properties to all symbol names this memory will stay
> alive for much longer than necessary.
That's a very extreme example, something that I wouldn't expect a Lisp
program to do, without removing the properties shortly thereafter.
And even that isn't a leak.
Note that we already put all kinds of properties (although not text
properties) on symbols.
> > As to more difficult debugging, I think adding a couple of properties
> > that have simple structure will not impair debugging too much.
> > Strings with many properties are not uncommon in Emacs, so we already
> > have to deal with that.
>
> I disagree with that. We are talking about adding string properties to
> every symbol name. This is a global side effect and different from
> adding string properties to a set of isolated strings in a controlled
> manner. I also don't understand why one would even want to take any
> chances here given that the feature can be implemented in a way which
> avoids this global side effect entirely as my patch shows.
I understand your aversion to such global effects, but I was talking
specifically about the debugging difficulties.
> > I would indeed suggest both to make sure there's no performance
> > regressions, and would like to see some data similar to what João
> > presented, which backs up your assessments about your proposal being
> > faster. Since performance is the main motivation for these changes, I
> > think it's important for us to be on the same page wrt facts related
> > to performance, before we make the decision how to proceed.
>
> I will prepare some benchmarks.
Thank you.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 13:39:02 GMT)
Message #311 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 14:46, Eli Zaretskii wrote:
>>> Text properties are stored separately from the string, so I don't
>>> think adding properties can in general be referred to as "change".
>>
>> Are you thinking of C strings?
>
> No, about the implementation of Lisp strings in Emacs.
I was talking about their behavior.
>> Lisp strings carry text properties in addition to the array of
>> characters. It doesn't really matter where in the memory the properties
>> and the characters reside.
>
> Well, it does, at least in some situations. The string text is not
> affected, and so the code which processes the string will not notice
> that it has a property about which that code has no idea. Only
> properties that are known to the processing code can affect it;
> non-standard properties private to some other code will generally pass
> unnoticed.
I don't think anybody was suggesting that changing text properties
changes the character codes inside the "C string" part of the Lisp string.
>>> I'm not sure in the context of completion there's any reason to count
>>> as "change" adding properties that don't affect display.
>>
>> For the context in question, whether the properties affect display is
>> not particularly important. Properties affecting display just make it
>> easier to notice that something's wrong. Bugs involving other properties
>> should be more difficult to investigate.
>
> Once again, if some code invents its private property, not used
> anywhere else and not documented anywhere else, then putting such a
> property on a string has very high chances of going unnoticed. I hope
> this consideration helps this discussion, because saying that
> properties change a string blurs the distinction between actually
> changing the string text or its properties that affect many parts in
> Emacs, and adding some obscure property that is not known to anyone.
What muddies the water is arguing against a solid engineering principle
with statements like "those mutations are not mutations".
Yes, when the properties are prefixed, the damage is reduced. Even then,
that increases the possibility of introducing bugs in the very code that
sets those properties (like having different code paths where one branch
sets them and another does not; forgetting to clear them in the other
branch; having subsequent code use the property values set by some
previous invocation of the code in question where it took another
branch; not to mention the potential troubles with parallel execution,
which is not a real concern these days, but we're designing for years
ahead, and someday it can be). Memory leaks, too.
Our completion pipeline has multiple interchangeable/pluggable parts, so
we have to be on the lookout even for problems which do not reproduce
with stock Emacs, and that requires solid abstractions.
And speaking of "only private properties", the completion-score property
can be used by downstream code, with all the associated potential for
trouble.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 13:43:02 GMT)
Message #314 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 2:38 PM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> And speaking of "only private properties", the completion-score property
> can be used by downstream code, with all the associated potential for
> trouble.
That's true. When I created it, I meant for it to be private, I think,
but indeed did forget to mark it as such. It is not documented anywhere
but that hasn't stopped anyone in the past, indeed. Can you point to
place(s) where it is indeed used other than the flex machinery of
`minibuffer.el`? Thanks.
João Távora
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 14:01:02 GMT)
Message #317 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 16:21, Eli Zaretskii wrote:
>>> FWIW, I also don't understand how adding properties could cause a
>>> memory leak. When a string is GCed, its properties get GCed as well,
>>> all of them. Am I missing something?
>>
>> If you add string properties to all symbol names this memory will stay
>> alive for much longer than necessary.
>
> That's a very extreme example, something that I wouldn't expect a Lisp
> program to do, without removing the properties shortly thereafter.
And that *will* happen with Joao's approach, as soon as some package
implements a Lisp completion backend in a certain (legal) fashion.
Or using one of a few different fashions, actually.
> And even that isn't a leak.
>
> Note that we already put all kinds of properties (although not text
> properties) on symbols.
Those do not, generally, change over time.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 14:15:03 GMT)
Message #320 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 16:41, João Távora wrote:
> It is not documented anywhere
> but that hasn't stopped anyone in the past, indeed. Can you point to
> place(s) where it is indeed used other than the flex machinery of
> `minibuffer.el`? Thanks.
Try either of these:
https://github.com/rustify-emacs/fuz.el/blob/master/helm-fuz.el#L228
https://github.com/emacs-helm/helm/blob/master/helm-utils.el
And I'm considering using it in company-sort-by-occurrence, to make sure
that flex sorting is at least semi-honored there (or create a variation
of that transformer). For that to happen, the possible score values and
their meanings will need to be documented, though.
The main scenario (and source of the completion-score property) I have
in mind is not related to fido-mode or flex, but the users can always
put flex into completion-styles by default, which affects company-capf.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 14:21:01 GMT)
Message #323 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> On 16.08.2021 16:21, Eli Zaretskii wrote:
>
>>>> FWIW, I also don't understand how adding properties could cause a
>>>> memory leak. When a string is GCed, its properties get GCed as well,
>>>> all of them. Am I missing something?
>>>
>>> If you add string properties to all symbol names this memory will stay
>>> alive for much longer than necessary.
>> That's a very extreme example, something that I wouldn't expect a Lisp
>> program to do, without removing the properties shortly thereafter.
>
> And that *will* happen with Joao's approach, as soon as some package
> implements a Lisp completion backend in a certain (legal) fashion.
There is no leak, not in the strong or weak sense. There is a constant
usage memory footprint proportional to the size of obarray, yes, but the
factor isn't huge. Off the top of my head, I think it's about two
conses and a fp number for each sym. Does anyone know how much that is
or can teach me how to measure? Thanks.
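One rough way to put a number on that footprint is to read the per-object sizes that `garbage-collect' reports. The sketch below is only an estimate under stated assumptions: the function name is hypothetical, and it assumes each affected string carries one interval, a two-cons property list and one float, which may not match what either patch actually allocates.

```elisp
;; Rough estimate only; `my-demo-estimate-property-bytes' is a made-up
;; helper, and the per-string layout (one interval, two conses, one
;; float) is an assumption, not a measurement of any patch.
(defun my-demo-estimate-property-bytes (nstrings)
  "Estimate bytes used by one float-valued property on NSTRINGS strings."
  (let* ((stats (garbage-collect))
         ;; Each entry of STATS has the form (NAME SIZE USED FREE).
         (size-of (lambda (type) (nth 1 (assq type stats)))))
    (* nstrings
       (+ (funcall size-of 'intervals)
          (* 2 (funcall size-of 'conses))
          (funcall size-of 'floats)))))

;; Example: the 300000 generated functions used in the benchmarks.
(my-demo-estimate-property-bytes 300000)
```

The result depends on the build (32-bit versus 64-bit, for instance), so it is best read as an order of magnitude.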
Anyway the current situation is constant copies of strings that put
pressure on the GC, as the benchmarks show.
Anyhoo, in the interest of placating this string property bogeyman, I've
prepared a version of my patch that is based on weak-keyed hash tables.
Slightly slower, but not much. Here are the usual benchmarks:
(require 'cl-lib)

(defmacro heyhey ()
  `(progn
     ,@(cl-loop repeat 300000
                collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))

(setq completion-styles '(flex))
(heyhey)

(when nil
  ;; Press C-u C-x C-e C-m quickly to produce a sample
  (benchmark-run (completing-read "" obarray))

  ;; my patch with private string properties
  (3.596972924 4 1.125298095999998)
  (3.584963294 4 1.1266740010000014)
  (3.4622216089999998 4 1.0924070069999985)
  (3.565632813 4 1.1066678320000012)
  (3.456130054 4 1.099950519)
  (3.49538751 4 1.1085273779999998)
  (3.4882531059999997 4 1.0928655200000001)
  (3.526581152 4 1.0909935229999999)
  (3.710919876 4 1.13883417)
  (3.576690379 4 1.09514685)

  ;; my patch with no string properties (global weak hts)
  ;; Probably the extra gc sweeps are paranoid cleaning up of the
  ;; hash tables.
  (3.981110008 7 1.6466288340000013)
  (3.819598429 7 1.5200578379999996)
  (3.823931386 7 1.5175787589999992)
  (3.9161236720000003 7 1.6184865899999998)
  (3.835148066 7 1.5686207249999988)
  (3.791906221 7 1.5481051090000015)
  (3.798378812 7 1.5164137029999996)
  (4.049880173 7 1.7670989089999996)
  (3.716469474 6 1.3442434509999996)
  (3.422806969 6 1.3272816180000002)

  ;; current master
  (5.405534502 12 2.8778620629999994)
  (5.038353216999999 12 2.553688440000002)
  (4.94358915 12 2.4917956500000003)
  (4.950710861 12 2.4638737510000013)
  (5.0242796929999995 12 2.5226992029999984)
  (5.020964648 12 2.495171900999999)
  (4.914088866 12 2.4218276250000024)
  (5.003779622 12 2.502260272000001)
  (4.969347707 12 2.4814790469999988)
  (5.376038238 11 2.565757513000001)

  ;; didn't bother with daniel's patch as I've already shown it to be
  ;; about 10% slower than current master.
  )
The patch lives in the branch
scratch/icomplete-lazy-highlight-no-string-props. It's a bit more
complicated to follow, but not much if you understand hash tables. The
interface to icomplete.el is completely unchanged.
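For readers who have not used weak hash tables, the general idea can be sketched as follows. This is not the code in the branch: the variable and function names are hypothetical, and the sketch only shows how a score can be kept beside a candidate string, keyed by object identity, without touching the string's properties.

```elisp
;; Sketch of the general technique, not the branch's implementation.
;; All `my-demo-' names are hypothetical.
(defvar my-demo-scores (make-hash-table :test #'eq :weakness 'key)
  "Side table mapping candidate strings (compared with `eq') to scores.
Because the keys are weak, an entry can be collected once its string
is no longer referenced from anywhere else.")

(defun my-demo-set-score (candidate score)
  "Record SCORE for the string object CANDIDATE."
  (puthash candidate score my-demo-scores))

(defun my-demo-get-score (candidate)
  "Return the score recorded for CANDIDATE, or nil if there is none."
  (gethash candidate my-demo-scores))
```

The trade-off is a hash lookup per candidate in place of a property lookup, and entries for strings that are never freed (such as symbol names) stay in the table just as the properties would.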
All in all, still a very good improvement over the current situation,
and I think I can make it faster.
(Though really do consider Eli's arguments the fastest approach)
>> And even that isn't a leak.
>> Note that we already put all kinds of properties (although not text
>> properties) on symbols.
>
> Those do not, generally, change over time.
Neither does this one! At least in size, which is the thing that
matters. So in terms of "negative" consequences it's exactly
equivalent. Read the patch; it will be obvious, I think.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 14:27:02 GMT)
Message #326 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 14:37, João Távora wrote:
> I am not a native English speaker, and maybe you don't understand
> my language. Another way to explain what I am talking about is to talk about
> "bug reproduction". You say there's a bug in my patch, I am asking you
> for a "bug reproduction recipe" as defined by most, if not all, the results
> you get by searching "bug reproduction recipe" in the Google search engine.
I hope you, or at least others here, can someday see and understand that
asking to prove standard engineering practices from the first
principles, time and time again in various discussions, is not a way to
encourage good atmosphere or promote project participation.
Can you really not imagine a buggy scenario coming from a combination of
downstream uses of 'completion-score' property, different completion
styles (some setting it, and some not), and a completion table that
either uses global string values outright, or caches them for the
duration of the current command?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 14:31:02 GMT)
Message #329 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 3:26 PM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>
> On 16.08.2021 14:37, João Távora wrote:
> > I am not a native English speaker, and maybe you don't understand
> > my language. Another way to explain what I am talking about is to talk about
> > "bug reproduction". You say there's a bug in my patch, I am asking you
> > for a "bug reproduction recipe" as defined by most, if not all, the results
> > you get by searching "bug reproduction recipe" in the Google search engine.
>
> I hope you, or at least others here, can someday see and understand that
> asking to prove standard engineering practices from the first
> principles, time and time again in various discussions, is not a way to
> encourage good atmosphere or promote project participation.
>
> Can you really not imagine a buggy scenario coming from a combination of
> downstream uses of 'completion-score' property, different completion
> styles (some setting it, and some not), and a completion table that
> either uses global string values outright, or caches them for the
> duration of the current command?
I don't. Please prime my imagination with some illustration based on
your fertile imagination of these things and the patches I have provided.
Oh and spare me the lectures. Thanks.
João
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 14:34:02 GMT)
Message #332 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 17:20, João Távora wrote:
>>>>> FWIW, I also don't understand how adding properties could cause a
>>>>> memory leak. When a string is GCed, its properties get GCed as well,
>>>>> all of them. Am I missing something?
>>>>
>>>> If you add string properties to all symbol names this memory will stay
>>>> alive for much longer than necessary.
>>> That's a very extreme example, something that I wouldn't expect a Lisp
>>> program to do, without removing the properties shortly thereafter.
>>
>> And that *will* happen with Joao's approach, as soon as some package
>> implements a Lisp completion backend in a certain (legal) fashion.
>
> There is no leak, not in the strong or weak sense.
Eli already said that, in a sentence that I also quoted. And still: "I
wouldn't expect a Lisp program to do <so>".
> There is a constant
> usage memory footprint proportional to the size of obarray, yes, but the
> factor isn't huge. Off the top of my head, I think it's about two
> conses and a fp number for each sym. Does anyone know how much that is
> or can teach me how to measure? Thanks.
If we say that your approach is legal, those are only "two conses and a
number" coming from minibuffer.el. But since other packages will also be
allowed to do that, the factor will only be limited by the amount of
installed packages.
> Anyway the current situation is constant copies of strings that put
> pressure on the GC, as the benchmarks show.
>
> Anyhoo, in the interest of placating this string property bogeyman, I've
> prepared a version of my patch that is based on weak-keyed hash tables.
> Slightly slower, but not much. Here are the usual benchmarks:
Cool, I'll take a look, thanks.
>>> And even that isn't a leak.
>>> Note that we already put all kinds of properties (although not text
>>> properties) on symbols.
>>
>> Those do not, generally, change over time.
>
> Neither does this one! At least in size, which is the thing that
> matters. So in terms of "negative" consequences it's exactly
> equivalent. Read the patch; it will be obvious, I think.
I was talking about the values of the properties, not the size in memory.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#48841; Package emacs. (Mon, 16 Aug 2021 14:38:02 GMT)
Message #335 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 3:33 PM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> I was talking about the values of the properties, not the size in memory.
Then what's the problem if the value of a property that is an implementation
detail changes? What do you (meaning the user of Emacs) care, ultimately?
João
(Mon, 16 Aug 2021 14:49:02 GMT)
Message #338 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 17:36, João Távora wrote:
> On Mon, Aug 16, 2021 at 3:33 PM Dmitry Gutov<dgutov <at> yandex.ru> wrote:
>
>> I was talking about the values of the properties, not the size in memory.
> Then what's the problem if the value of a property that is an implementation
> detail changes? What do you (meaning the user of Emacs) care, ultimately?
You said "we already have global symbol properties". I pointed out the
differences.
(Mon, 16 Aug 2021 17:01:02 GMT)
Message #341 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Mon, Aug 16, 2021 at 3:47 PM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>
> On 16.08.2021 17:36, João Távora wrote:
> > On Mon, Aug 16, 2021 at 3:33 PM Dmitry Gutov<dgutov <at> yandex.ru> wrote:
> >
> >> I was talking about the values of the properties, not the size in memory.
> > Then what's the problem if the value of a property that is an implementation
> > detail changes? What do you (meaning the user of Emacs) care, ultimately?
>
> You said "we already have global symbol properties". I pointed out the
> differences.
Eli said that, I think. Anyway, it doesn't present any kind of problem
whether their values change or not, as long as the space they
occupy doesn't.
João
(Mon, 16 Aug 2021 18:26:02 GMT)
Message #344 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
>> prepared a version of my patch that is based on weak-keyed hash tables.
>> Slightly slower, but not much. Here are the usual benchmarks:
>
> Cool, I'll take a look, thanks.
I've made it faster, now very close to the string-propertizing approach,
itself very close to the theoretical best (which is no copy, no
highlight). See the tip of the
scratch/icomplete-lazy-highlight-no-string-props branch, which I had to
rewrite (some Git flub-up). All benchmarks after sig.
João
(require 'cl-lib)
;; Introduce 300 000 new symbols to slow things down. Probably more
;; than most non-Spacemacs people will have :-)
;; (setq ido-enable-flex-matching t)
;; (ido-mode)
;; (ignore-errors (ido-ubiquitous-mode))
;; (fido-mode)
;; (fido-vertical-mode)
;; (vertico-mode)
;; (hash-table-keys completion--get-lazy-highlight-cache)
(defmacro heyhey ()
  `(progn
     ,@(cl-loop repeat 300000
                collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))
;; (setq completion-styles '(substring))
(setq completion-styles '(flex))
(heyhey)
(setq icomplete-show-matches-on-no-input t)
(symbol-name 'mouse-kill)
(when nil
;; Press C-u C-x C-e C-m quickly to produce a sample. Always discard
;; the first sample in the session, it tends to have an extra GC and be
;; slower.
(benchmark-run (completing-read "" obarray))
;; don't use string props
(2.848873438 6 0.8307729419999994)
(2.848416202 6 0.8370667889999996)
(2.786944063 6 0.8230433460000004)
(2.7815761840000004 6 0.819654023)
(2.6929080819999998 5 0.7036257240000001)
;; string props
(2.630354852 4 0.7071441910000011)
(2.594761891 4 0.7082679669999994)
(2.589480755 4 0.7112978109999997)
(2.661196709 4 0.7130021060000011)
(2.844372962 4 0.7378870879999999)
;; master
(3.6339847759999997 12 1.601142523)
(3.757091055 12 1.6231055449999996)
(3.785980977 12 1.6333413839999995)
(3.716144927 12 1.6100998260000008)
(3.808275042 11 1.611891043)
;; these next two are not comparable to the above, because in
;; ab23fa4eb22f6557414724769958a63f1c59b49a I added sorting to flex
;; which changes results, and Daniel's patch no longer applies
;; cleanly.
;; daniel's patch
(3.420418068 10 1.451012855)
(3.603226896 10 1.672325507)
(3.501318685 10 1.6150095739999992)
(3.659821971 10 1.6580361760000004)
(3.624743674 10 1.657498823)
;; master just before daniel's patch (254dc6ab4c)
(2.611424665 10 1.5267066549999981)
(2.48811409 10 1.486639387000002)
(2.472587389 10 1.479865191)
(2.543277273 10 1.510667634999999)
(2.546243312 10 1.4986345790000009)
)
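As a reading aid for the samples above: `benchmark-run` returns a list of the total elapsed time in seconds, the number of garbage collections that ran, and the time spent in garbage collection. The sketch below splits one such sample into those parts and derives the time spent outside the GC. The helper name `my-describe-sample` is hypothetical and is not part of the benchmark file.

```elisp
;; Hypothetical helper, not part of the original benchmark file.
(defun my-describe-sample (sample)
  "Split a `benchmark-run' SAMPLE into its parts.
SAMPLE is a list (TOTAL-SECONDS GC-COUNT GC-SECONDS).  Return a
plist that also carries the time spent outside garbage collection."
  (let ((total (nth 0 sample))
        (gcs (nth 1 sample))
        (gc-time (nth 2 sample)))
    (list :total total
          :gcs gcs
          :gc-time gc-time
          :non-gc (- total gc-time))))

;; One of the "master" samples listed above.
(my-describe-sample '(3.6339847759999997 12 1.601142523))
```

Read this way, the "master" samples spend both more time in the GC and more GC cycles than either lazy-highlighting variant, which matches the point made earlier in the thread about string copies putting pressure on the GC.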
(Tue, 17 Aug 2021 02:09:01 GMT)
Message #347 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On 16.08.2021 21:25, João Távora wrote:
> I've made it faster, now very close to the string-propertizing approach,
> itself very close to the theoretical best (which is no copy, no
> highlight). See the tip of the
> scratch/icomplete-lazy-highlight-no-string-props branch, which I had to
> rewrite (some Git flub-up). All benchmarks after sig.
Thanks. I've read it now.
This implementation style (quick exfiltration via a dynamic var with
some special-cased logic) reminds me of the recent changes to eldoc,
really not my cup of tea.
At the very least, though, you have done the work of proving that the
no-string-propertization approach can be just as fast. Thank you for that.
A hash table with :test 'eq is a good choice. I'd be happy to try to
tweak it further, but it also seems that at this point we can transition
to the discussion about what kind of implementation style we want, since
the performance is proven to be more or less on par.
Though of course that should start with an alternative patch which adds
icomplete support as well (either Daniel does it, or I'll give it a try).
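A minimal sketch of the idea behind a weak-keyed hash table with :test 'eq, assuming hypothetical names (`my-score-cache`, `my-remember-score`, `my-recall-score`); it is not the code from the scratch branch. The table associates data with the candidate string object itself, so the string is never modified, and an entry can be collected once nothing else references its key.

```elisp
;; Hypothetical sketch, not the code from the branch under discussion.
(defvar my-score-cache
  (make-hash-table :test #'eq :weakness 'key)
  "Map candidate string objects to scores.
Keys are compared with `eq' and held weakly, so an entry goes away
once its key string is garbage collected.")

(defun my-remember-score (candidate score)
  "Record SCORE for the string object CANDIDATE without modifying it."
  (puthash candidate score my-score-cache))

(defun my-recall-score (candidate)
  "Return the score recorded for CANDIDATE, or nil if there is none."
  (gethash candidate my-score-cache))

(let ((cand (symbol-name 'car)))
  (my-remember-score cand 0.5)
  (list (my-recall-score cand)                  ; same object: found
        (my-recall-score (copy-sequence cand))  ; equal copy: not found
        ;; The string itself carries no `completion-score' property.
        (get-text-property 0 'completion-score cand)))
```

Because lookup is by object identity, this only works while the frontend keeps handing back the same string objects it received, which is the case when no copies are made.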
(Tue, 17 Aug 2021 09:00:03 GMT)
Message #350 received at 48841 <at> debbugs.gnu.org (full text, mbox):
Dmitry Gutov <dgutov <at> yandex.ru> writes:
> On 16.08.2021 21:25, João Távora wrote:
>> I've made it faster, now very close to the string-propertizing approach,
>> itself very close to the theoretical best (which is no copy, no
>> highlight). See the tip of the
>> scratch/icomplete-lazy-highlight-no-string-props branch, which I had to
>> rewrite (some Git flub-up). All benchmarks after sig.
>
> Thanks. I've read it now.
>
> This implementation style (quick exfiltration via a dynamic var with
> some special-cased logic) reminds me of the recent changes to eldoc,
> really not my cup of tea.
I'm sorry, but I'm not drinking from your herbarium. Googling for
"exfiltration" brings up "malware" and data security. Why is mine
"quick" at that? Is this some kind of metaphor? And what does "some
special-cased logic" refer to exactly? I see as much similarity to
Eldoc as I do to the Sistine Chapel.
The only thing I understand, I think, is "dynamic var". If you mean the
variable 'completion-lazy-hilit', notice it is not necessarily used as
a dynamic var (in fact in icomplete.el it's just a buffer-local var). As
I explained elsewhere, if the completion machinery had a reliable
abstraction for "session" I would use that.
I don't think it does, does it? Currently, it's the frontend who holds
that knowledge. It will either have an object representing it (maybe a
fancy CLOS thing); or a stack frame with some kind of command loop; or, in
the case of icomplete, a minibuffer session delineated by
kill-all-local-variables.
So, for icomplete.el, setting that variable buffer-locally is the
appropriate thing. For the command-loopy frontend, dynamically binding
it will be. For the objecty frontend, the object itself is probably a
good value for completion-lazy-hilit.
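A minimal sketch of the first two cases, using a hypothetical stand-in variable `my-lazy-hilit` rather than the real one from the patch: the same special variable can be set buffer-locally by a minibuffer-style frontend or bound dynamically by a command-loop-style one, and code that reads it does not need to know which.

```elisp
;; Hypothetical stand-in for the variable discussed above.
(defvar my-lazy-hilit nil
  "Non-nil means the frontend asked for deferred highlighting.")

(defun my-lazy-p ()
  "Return the value of `my-lazy-hilit' visible in the current context."
  my-lazy-hilit)

;; Minibuffer-style frontend: the buffer-local value lives as long as
;; the buffer's local variables do.
(with-temp-buffer
  (setq-local my-lazy-hilit t)
  (my-lazy-p))                          ; non-nil inside this buffer

;; Command-loop-style frontend: the dynamic binding lasts for the
;; extent of the `let' form.
(let ((my-lazy-hilit t))
  (my-lazy-p))                          ; non-nil inside the `let'

(my-lazy-p)                             ; nil again outside both
```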
For completion-capf, if you cared to optimize it with this stuff, it
will likely be ... something something.
Anyway, the "implementation style" I went for is speed, brevity and a
decent docstring.
And it'd be a bit shorter if it used string properties...
> At the very least, though, you have done the work of proving that the
> no-string-propertization approach can be just as fast. Thank you for
> that.
You're welcome. Not really just as fast, but in the big-O ballpark, of
course.
I had hoped to show also that the particular choice of global structure
for string/symbol/whatever association is irrelevant.
I'm still missing the imminent catastrophe (that is so clear to you) of
the put-text-property approach. I'd like these slower and more complex
techniques to appease more than superstition.
> A hash table with :test 'eq is a good choice. I'd be happy to try to
> tweak it further, but it also seems that at this point we can
> transition to the discussion about what kind of implementation style
> we want, since the performance is proven to be more or less on par.
>
> Though of course that should start with an alternative patch which
> adds icomplete support as well (either Daniel does it, or I'll give it
> a try).
I'm curious to see those, yes. But as Eli pointed out, two different
APIs will need to cohabitate, since the new API won't kill off the old.
To be very clear, I'm interested in the performance of Daniel's patch,
not really in insufferable claims of its beauty and virginity.
minibuffer.el is a great big mess, I'll leave it to the Great Designers
of the Big Redesign, godspeed to them.
Currently, I just want changes to not assassinate, in speed or form,
icomplete.el or the flex completion style, two fundamental daily drivers
to my work, and others' work. So if/when Daniel's patch doesn't do any
of that (it seems that it currently does), I'll be all for it.
João
(Tue, 17 Aug 2021 11:50:02 GMT)
Message #353 received at 48841 <at> debbugs.gnu.org (full text, mbox):
> From: João Távora <joaotavora <at> gmail.com>
> Date: Tue, 17 Aug 2021 09:59:25 +0100
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, larsi <at> gnus.org,
> monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
>
> I'm sorry, but I'm not drinking from your herbarium.
Once again, I'm asking everyone to please remove the emotional and
sarcastic parts from the exchange. It is not helping to have
constructive discussions.
(Tue, 17 Aug 2021 11:53:02 GMT)
Message #356 received at 48841 <at> debbugs.gnu.org (full text, mbox):
On Tue, Aug 17, 2021 at 12:49 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > From: João Távora <joaotavora <at> gmail.com>
> > Date: Tue, 17 Aug 2021 09:59:25 +0100
> > Cc: Daniel Mendler <mail <at> daniel-mendler.de>, larsi <at> gnus.org,
> > monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
> >
> > I'm sorry, but I'm not drinking from your herbarium.
>
> Once again, I'm asking everyone to please remove the emotional and
> sarcastic parts from the exchange. It is not helping to have
> constructive discussions.
I thought it was rather appropriate for the "my cup of tea" line :-) But
I get the message, and I apologize.
João
Reply sent to João Távora <joaotavora <at> gmail.com>:
You have taken responsibility.
(Wed, 25 Aug 2021 15:44:02 GMT)
Notification sent to Dmitry Gutov <dgutov <at> yandex.ru>:
bug acknowledged by developer.
(Wed, 25 Aug 2021 15:44:02 GMT)
Message #361 received at 48841-done <at> debbugs.gnu.org (full text, mbox):
João Távora <joaotavora <at> gmail.com> writes:
> [I've removed bug#47711 from the list, since I haven't read the bug.
> This is only directly concerned with this report bug#48841 about speed
> differences between fido-mode and ido-mode.]
>
> João Távora <joaotavora <at> gmail.com> writes:
>
>> scratch/icomplete-lazy-highlight-attempt-2, although still incomplete,
>> is one such approach, though it still sets `completion-score` on the
>> "shared" string, used later for sorting. But also that could be
>> prevented (again, only if it turns out to be actually problematic
>> IMO).
>
> I have tested the patch more thoroughly now, and have not found any
> problems.
As I wait for genuine reports or explanations of the much dramatized
problems in the above patch, I've pushed a much simpler patch that has a
dramatic beneficial effect: simply don't do any copying, highlighting or
scoring if the pcm-style pattern (used by the styles 'flex', 'substring'
and others) is empty.
This more than halves the waiting time for the candidate display when
the pattern is empty. As far as I can tell, `fido-mode` is now faster
than `ido-mode` and so I'm marking this bug closed.
Of course, when there is a pattern of a single character or more, the
icomplete waiting times using my earlier 'completion-lazy-highlight'
patch are still around 70% of the current master. But those times are
always quite a bit shorter than the empty-pattern case. I'll wait a bit for
the alternatives presumably being worked on before pushing that or
something based on it.
João
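The empty-pattern shortcut described above can be sketched as follows. This is a simplified illustration, not the pushed patch: the function name `my-highlight-all` is hypothetical, and a literal substring stands in for the pcm-style pattern. The point is only the early return, which hands back the original list untouched when there is nothing to match.

```elisp
;; Hypothetical simplification: a literal substring replaces the
;; pcm-style pattern used by the real completion styles.
(defun my-highlight-all (pattern candidates)
  "Return CANDIDATES with the first occurrence of PATTERN highlighted.
When PATTERN is empty, return CANDIDATES itself: no copying, no
highlighting, no scoring."
  (if (string= pattern "")
      candidates
    (mapcar
     (lambda (cand)
       ;; Highlight a copy so the original string stays unmodified.
       (let ((copy (copy-sequence cand))
             (start (string-match-p (regexp-quote pattern) cand)))
         (when start
           (add-face-text-property start (+ start (length pattern))
                                   'completions-common-part nil copy))
         copy))
     candidates)))

;; With an empty pattern the very same list object comes back.
(let ((cands (list "foobar" "barfoo")))
  (eq (my-highlight-all "" cands) cands))
```

With a large obarray the empty-pattern case is the one that touches every candidate, so skipping the per-candidate copy there removes the most work.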
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org.
(Thu, 23 Sep 2021 11:24:06 GMT)
This bug report was last modified 2 years and 207 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.