GNU bug report logs - #47711
27.1; Deferred highlighting support in `completion-all-completions', `vertico--all-completions`

Package: emacs

Reported by: Daniel Mendler <mail <at> daniel-mendler.de>

Date: Sun, 11 Apr 2021 20:52:01 UTC

Severity: normal

Found in version 27.1

Done: Daniel Mendler <mail <at> daniel-mendler.de>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 47711 in the body.
You can then email your comments to 47711 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sun, 11 Apr 2021 20:52:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Daniel Mendler <mail <at> daniel-mendler.de>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 11 Apr 2021 20:52:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: bug-gnu-emacs <at> gnu.org
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: 27.1; Deferred highlighting support in `completion-all-completions',
 `vertico--all-completions`
Date: Sun, 11 Apr 2021 22:51:14 +0200

Emacs lacks a way to defer completion highlighting when computing
completions via `completion-all-completions'. This feature is
important for the performance of completion UIs when the set of all
completions is much larger than the set of completions that are
displayed.

The Vertico package defers highlighting by advising the
`completion*-hilit-*' functions.

(declare-function orderless-highlight-matches "ext:orderless")
(defun vertico--all-completions (&rest args)
  "Compute all completions for ARGS with deferred highlighting."
  (cl-letf* ((orig-pcm (symbol-function #'completion-pcm--hilit-commonality))
             (orig-flex (symbol-function #'completion-flex-all-completions))
             ((symbol-function #'completion-flex-all-completions)
              (lambda (&rest args)
                ;; Unfortunately for flex we have to undo the deferred
                ;; highlighting, since flex uses the completion-score for
                ;; sorting, which is applied during highlighting.
                (cl-letf (((symbol-function #'completion-pcm--hilit-commonality)
                           orig-pcm))
                  (apply orig-flex args))))
             ;; Defer the following highlighting functions
             (hl #'identity)
             ((symbol-function #'completion-hilit-commonality)
              (lambda (cands prefix &optional base)
                (setq hl (lambda (x)
                           (nconc (completion-hilit-commonality x prefix base)
                                  nil)))
                (and cands (nconc cands base))))
             ((symbol-function #'completion-pcm--hilit-commonality)
              (lambda (pattern cands)
                (setq hl (lambda (x)
                           (completion-pcm--hilit-commonality pattern x)))
                cands))
             ((symbol-function #'orderless-highlight-matches)
              (lambda (pattern cands)
                (setq hl (lambda (x)
                           (orderless-highlight-matches pattern x)))
                cands)))
    (cons (apply #'completion-all-completions args) hl)))

The function `vertico--all-completions` returns the list of completions
together with a highlighting function, which can then be used to
highlight the completions on the fly. It is a prototype of what
improved functionality in Emacs could look like.

(completion-all-completions STRING TABLE PRED POINT &optional METADATA DEFER-HL)

or

(completion-all-completions-defer-hl STRING TABLE PRED POINT &optional METADATA)

If DEFER-HL=t, then the function returns the completions and a
highlighting function. One may consider returning a triple of base,
completions and highlighting function. Internally, the completion styles
should be adapted to support deferred highlighting.

It could be that this feature becomes less needed with the introduction
of gccemacs in the future.
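
The idea of filtering first and highlighting only what is displayed can be sketched as follows. This is a minimal illustration, not the Vertico code and not a proposed Emacs API: `demo-deferred-completions' is a hypothetical helper, and it assumes a plain list of strings as the completion table and prefix matching only.

```elisp
;; Sketch only: `demo-deferred-completions' is a hypothetical helper,
;; not part of Emacs, Vertico or Corfu.
(require 'seq)

(defun demo-deferred-completions (input candidates limit)
  "Return (MATCHES . SHOWN) for INPUT against the string list CANDIDATES.
MATCHES holds every match without highlighting.  SHOWN holds
highlighted copies of at most LIMIT leading matches."
  (let* ((matches (all-completions input candidates))
         ;; Only the candidates that would be displayed are copied
         ;; and propertized.
         (shown (completion-hilit-commonality
                 (seq-take matches limit)
                 (length input))))
    (cons matches shown)))
```

With a table of many thousands of candidates and a LIMIT of ten or so, only those few strings are copied and given face properties; the rest of MATCHES is left untouched.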

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sun, 18 Apr 2021 21:27:02 GMT) Full text and rfc822 format available.

Message #8 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: Acknowledgement (27.1; Deferred highlighting support
 in `completion-all-completions', `vertico--all-completions`)
Date: Sun, 18 Apr 2021 23:26:44 +0200

Deferred highlighting is also useful for completion-at-point; see
`corfu--all-completions` in the ELPA Corfu package, which uses the same
method as Vertico.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 07 Jul 2021 08:57:01 GMT) Full text and rfc822 format available.

Message #11 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Dmitry Gutov <dgutov <at> yandex.ru>, João Távora
 <joaotavora <at> gmail.com>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: fido-mode is slower than ido-mode with similar settings
Date: Wed, 7 Jul 2021 10:56:05 +0200

On 7/4/21 3:53 AM, Dmitry Gutov wrote:
>> - icomplete.el? for fido-mode & friends
>> - minibuffer.el, for the *Completions* buffer
>> - company.el
>> - Any notable others?
> 
> corfu, consult, etc? Probably Ivy too. All of these are in GNU ELPA.
> 
> BTW, I think Daniel had some ideas about applying the face property 
> lazily as well. I can't find the particular discussion now, but perhaps 
> he can add to this discussion as well.

Yes, Vertico and Corfu apply highlighting lazily. This leads to
significant performance wins. See `vertico--all-completions` in
https://github.com/minad/vertico/blob/main/vertico.el#L243-L279 and
bug#47711.

The technique I am using in Vertico and Corfu retains backward
compatibility, such that the strings are returned unmodified by the
completion style. Highlighting is applied lazily by copying the
candidate strings and mutating the copies. For now I am relying on advices.

One could add an optional argument (or dynamically bound variable) to
completion styles which tells the completion style to opt out of copying
the candidates and the highlighting.

Daniel
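
The dynamically bound variable mentioned above could work roughly like this. The sketch is an assumption about one possible shape: `demo-defer-highlighting' and `demo-style-all-completions' are hypothetical names, and the "style" here is just prefix matching over a list of strings.

```elisp
;; Sketch only: both names below are hypothetical.
(defvar demo-defer-highlighting nil
  "When non-nil, the demo style skips copying and highlighting.")

(defun demo-style-all-completions (string candidates)
  "Return the matches for STRING in the string list CANDIDATES.
Highlight copies of the matches unless `demo-defer-highlighting'
is non-nil, in which case the matches are returned unmodified."
  (let ((matches (all-completions string candidates)))
    (if demo-defer-highlighting
        matches
      (completion-hilit-commonality matches (length string)))))
```

A UI that wants deferred highlighting would call the style inside `(let ((demo-defer-highlighting t)) ...)'; callers that do nothing keep receiving highlighted copies, which is the backward-compatible default.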

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 11 Aug 2021 14:18:01 GMT) Full text and rfc822 format available.

Message #14 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: "emacs-devel <at> gnu.org" <emacs-devel <at> gnu.org>
Cc: João Távora <joaotavora <at> gmail.com>,
 47711 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: [PATCH] Add new `completion-filter-completions` API and deferred
 highlighting
Date: Wed, 11 Aug 2021 16:16:57 +0200

[Message part 1 (text/plain, inline)]

I prepared a patch which provides the API
`completion-filter-completions`. This function supports deferred
highlighting and returns additional data with the list of matching
completion candidates. The API supersedes the existing function
`completion-all-completions`.

The main goal of the new API is to avoid expensive string allocations
and highlighting during completion. This is particularly relevant for
continuously updating completion UIs like Icomplete or Vertico.
Furthermore the end position of the completion boundaries is returned
with the completion results. This information is not provided by the
existing `completion-all-completions` API.

See also the relevant bugs bug#47711 and bug#48841. I am looking forward
to your feedback. Thank you!

Daniel Mendler

[0001-Add-new-completion-filter-completions-API-and-deferr.patch (text/x-diff, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 11 Aug 2021 16:12:01 GMT) Full text and rfc822 format available.

Message #17 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: "emacs-devel <at> gnu.org" <emacs-devel <at> gnu.org>
Cc: 48841 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: [PATCH] Add new `completion-filter-completions` API and deferred
 highlighting
Date: Wed, 11 Aug 2021 18:11:21 +0200

On 8/11/21 4:16 PM, Daniel Mendler wrote:
> I prepared a patch which provides the API
> `completion-filter-completions`. This function supports deferred
> highlighting and returns additional data with the list of matching
> completion candidates. The API supersedes the existing function
> `completion-all-completions`.
> 
> The main goal of the new API is to avoid expensive string allocations
> and highlighting during completion. This is particularly relevant for
> continuously updating completion UIs like Icomplete or Vertico.
> Furthermore the end position of the completion boundaries is returned
> with the completion results. This information is not provided by the
> existing `completion-all-completions` API.
> 
> See also the relevant bugs bug#47711 and bug#48841. I am looking forward
> to your feedback. Thank you!

There are currently two issues with the patch with regards to backward
compatibility. Fortunately they are fixable with a little effort.

1. I would like to deprecate `completion-score' or remove it altogether,
   but unfortunately `completion-score' is used in the wild. In order to
   preserve `completion-score', bind `completion--filter-completions' in
   the highlighting functions. Add `completion-score' in
   `completion-pcm--hilit-commonality' when
   `completion--filter-completions' is nil.

2. In `completion--nth-completion' set `completion--filter-completions'
   to nil, unless `(memq style '(emacs21 emacs22 basic
   partial-completion initials flex))' such that custom completion
   styles which wrap the completion functions don't see the new return
   value format, except if the custom style opts in explicitly by
   binding `completion--filter-completions'. An alternative criterion is
   `(memq fun '(completion-emacs22-all-completions) ...)'. Unfortunately
   this approach will still not work if the user has advised a
   `completion-x-all-completions' function. The only 100% safe approach
   seems to transparently redirect calls to
   `completion-x-all-completions' to `completion--x-filter-completions',
   which returns the results in the new format.

With these changes the patch should be 100% backward compatible.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 11 Aug 2021 16:18:02 GMT) Full text and rfc822 format available.

Message #20 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47711 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 "emacs-devel <at> gnu.org" <emacs-devel <at> gnu.org>
Subject: Re: [PATCH] Add new `completion-filter-completions` API and deferred
 highlighting
Date: Wed, 11 Aug 2021 17:17:07 +0100

Perhaps you should first provide a patch with these 2 "little effort"
changes (that are presumably also backward compatible and don't affect
the API) by themselves.  Reading about these complex ideas isn't as
clear as seeing them in actual code.

Then it'll be easier to evaluate the merits of the patch you proposed in
your first email.

João

-- 
João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 12 Aug 2021 08:01:01 GMT) Full text and rfc822 format available.

Message #23 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#48841: [PATCH] Add new `completion-filter-completions` API and
 deferred highlighting
Date: Thu, 12 Aug 2021 11:00:11 +0300

[I removed emacs-devel from the CC list, please don't cross-post to
both emacs-devel and the bug tracker.]

> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Wed, 11 Aug 2021 16:16:57 +0200
> Cc: 48841 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
>  João Távora <joaotavora <at> gmail.com>,
>  Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
> 
> I prepared a patch which provides the API
> `completion-filter-completions`. This function supports deferred
> highlighting and returns additional data with the list of matching
> completion candidates. The API supersedes the existing function
> `completion-all-completions`.

Thanks.  The discussion of this is still going on, and I don't
consider myself an expert in this area of Emacs, so below please find
only comments for minor formatting and documentation aspects.

> Add a new `completion-filter-completions` API, which supersedes
> `completion-all-completions`.  The new API returns the matching
> completion candidates and additional data.  The return value is an
> alist, with the keys `completions`, `base`, `end` and `highlight`.
> The API can be extended in a backward compatible way later on thanks
> to the use of an alist as return value.

Please don't use Markdown-style quoting `like this` in our comments
and log messages.  We quote 'like this' in these places.

> The `completions` value is the list of completion strings *without*
> applied highlighting.  The completion strings are returned unmodified,
> which avoids allocations and results in performance gains for

This is unclear: how can you return a list of strings which you
produce without allocating the strings?

>         The value `base` is the base position of the completion.

"Base position" where, or relative to what object?

> Correspondingly the value `end` specifies the end position of the
> completion counted from the beginning of the input string.

So the base position is also relative to the beginning of the input
string?  If so, please say so explicitly.

>                                                    Finally the
> `highlight` value is a function taking a list of completion strings
> and returns a new list of new strings with highlighting applied.

If you say "taking a list...", then for consistent style please also
say "...and returning a new list...".

> A continuously updating UI can use the highlighting function to apply
> highlighting only to the visible completions.

Not, "visible", but "displayed", right?  So I'd rephrase

  ...to apply highlighting only to the completions that are actually
  displayed.

> (completion-basic-all-completions,
> completion-emacs21-all-completions,
> completion-emacs22-all-completions): Use it.

That's incorrect format, I guess you did it by hand rather than
letting change-log-mode do it for you?  The correct format looks like
this:

  (completion-basic-all-completions)
  (completion-emacs21-all-completions)
  (completion-emacs22-all-completions): Use it.

IOW, each line ends with a closing parenthesis, not a comma.

> +(defvar completion--filter-completions nil
> +  "Enable the new completions return value format.

Using genitive construction should be limited to just 2 words.  Here
you have 4, which produces an awkward, hard to interpret phrase.
Suggest to reword:

  If non-nil, return completions in `completion-filter-completions' format.

Note that I also dropped the "new" part (which generally doesn't stand
the test of time), and instead gave a hint as to what that format is.

Btw, why is this an internal variable?  Shouldn't all completion
styles ideally support it?  If so, it should be a public variable,
documented in the ELisp manual.  And the name should also end with -p,
since it's a boolean.  How about completion-filter-completions-format-p?

> +            New completion style functions may always return their
> +results in the new alist format, since `completion-all-completions'
> +transparently converts back to the old improper list of completions
> +with base size in the last cdr.")

"may" and "always" are a kind of contradiction.

Also, I'd drop the "improper" part, as it sounds like a derogatory
adjective.

>  (defun completion-try-completion (string table pred point &optional metadata)
>    "Try to complete STRING using completion table TABLE.
>  Only the elements of table that satisfy predicate PRED are considered.
> -POINT is the position of point within STRING.
> -The return value can be either nil to indicate that there is no completion,
> -t to indicate that STRING is the only possible completion,
> -or a pair (NEWSTRING . NEWPOINT) of the completed result string together with
> -a new position for point."
> +POINT is the position of point within STRING.  The return value can be
> +either nil to indicate that there is no completion, t to indicate that
> +STRING is the only possible completion, or a pair (NEWSTRING . NEWPOINT)
> +of the completed result string together with a new position for point.
> +The METADATA may be modified by the completion style."

Here you changed whitespace by filling, and that ruined the
intentionally formatted doc string, which made it easy to find the
various forms of the return value and the important parts of the doc
string.  Please keep the original formatting.

>  (defun completion-all-completions (string table pred point &optional metadata)
>    "List the possible completions of STRING in completion table TABLE.
>  Only the elements of table that satisfy predicate PRED are considered.
> -POINT is the position of point within STRING.
> -The return value is a list of completions and may contain the base-size
> -in the last `cdr'."
> -  ;; FIXME: We need to additionally return the info needed for the
> -  ;; second part of completion-base-position.
> -  (completion--nth-completion 2 string table pred point metadata))
> +POINT is the position of point within STRING.  The return value is a
> +list of completions and may contain the base-size in the last `cdr'.
> +The METADATA may be modified by the completion style.  This function
> +has been superseded by `completion-filter-completions', which returns
> +richer information and supports deferred candidate highlighting."

Likewise here.

Also, the "This function has been superseded..." part should be a new
paragraph, so that it stands out.  (And I'm not yet sure we indeed
want to say "superseded" here, but that's part of the on-going
discussion.  Maybe use more neutral language here, like "See also".)

> +    (if (and result (consp (car result)))
> +        ;; Give the completion styles some freedom!
> +        ;; If they are targeting Emacs 28 upwards only, they
> +        ;; may always return a result with deferred
> +        ;; highlighting.  We convert back to the old format
> +        ;; here by applying the highlighting eagerly.

"May always" again.  How about "can always" instead?

> +        (nconc (funcall (cdr (assq 'highlight result))
> +                        (cdr (assq 'completions result)))
> +               (cdr (assq 'base result)))
> +      result)))
> +
> +(defun completion-filter-completions (string table pred point metadata)
> +  "Filter the possible completions of STRING in completion table TABLE.

Is "filter" really the right word here (in the doc string)?  "Filter"
means you take a sequence and produce another sequence with some
members removed.  That's not what this API does, is it?  Suggest to
use a different name, like completion-completions-alist or
completion-all-completions-as-alist.

> +Only the elements of table that satisfy predicate PRED are considered.
> +POINT is the position of point within STRING.  The METADATA may be
> +modified by the completion style.  The return value is an alist with
> +the keys:
> +
> +- base: Base position of the completion (from the start of STRING)

"Base" here means the beginning?  If so, why not call it "beg" or
somesuch?

> +This function supersedes the function `completion-all-completions'."

Again, "supersedes" is too strong, IMO.  I would say "is a variant of"
instead, and explain why this variant could be better suited to some
use cases.  IOW, explain the upsides (and downsides, if any), and let
the programmers decide whether they want this, instead of more-or-less
forcing them to use it.

> +        ;; Deferred highlighting has been requested, but the completion
> +        ;; style returned a non-deferred result. Convert the result to the
                                                  ^^
two spaces between sentences, please.

> +        ;; new alist format.

"New" is not a good word here.

> +      ;; added by the completion machinery.

Please start comments with a capital letter.

> +(defun completion--deferred-hilit (completions prefix-len base end)
> +  "Return completions in old format or new alist format.
> +If `completion--filter-completions' is non-nil use the new format."

Again, please don't use "old" and "new" here, but instead describe
explicitly the differences between them, or provide a hyperlink to
where that is described.

> +                      ;; Apply highlighting

Please end each sentence in a comment with a period.

> +(defun completion-pcm--deferred-hilit (pattern completions base end)
> +  "Return completions in old format or new alist format.
> +If `completion--filter-completions' is non-nil use the new format."

"Old" and "new" again.

>  (defun completion-pcm--hilit-commonality (pattern completions)
>    "Show where and how well PATTERN matches COMPLETIONS.
>  PATTERN, a list of symbols and strings as seen
>  `completion-pcm--merge-completions', is assumed to match every
>  string in COMPLETIONS.  Return a deep copy of COMPLETIONS where
> -each string is propertized with `completion-score', a number
> -between 0 and 1, and with faces `completions-common-part',
> +each string is propertized with faces `completions-common-part',
>  `completions-first-difference' in the relevant segments."

Are we really losing the completion-score property here?  If so, why?

> +           ;; If `pattern' doesn't have an explicit trailing any,

This is confusing: what do you mean by "explicit trailing any" in the
context of patterns?

> +(defun completion--flex-score (pattern completions)
> +  "Compute how well PATTERN matches COMPLETIONS.
> +PATTERN, a list of strings is assumed to match every string in
> +COMPLETIONS.

Is PATTERN really a list?  It would be strange for a list to be called
PATTERN, and how can a list "match every string in COMPLETIONS"?

>               Return a copy of COMPLETIONS where each element is
> +a pair of a score and the completion string.

What is "the completion string" in this case? is it the same string
from COMPLETIONS, or is it something else?  The doc string leaves that
unclear.

>                                                The score lies in
> +the range between -1 and 0, where -1 corresponds to the full
> +match."

What score could a partial match have, and what is the meaning of the
numerical value for a partial match?
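
The alist return format discussed in this review (keys `completions', `base', `end' and `highlight') and its conversion back to the list form with the base size in the last cdr can be sketched as follows. This is not the code from the patch: the `demo-' functions are hypothetical, the table is assumed to be a plain list of strings, and `base' is simply taken to be 0.

```elisp
;; Sketch only: all `demo-' names are hypothetical.
(defun demo--highlight (prefix-len strings)
  "Return highlighted copies of STRINGS for a prefix of PREFIX-LEN chars."
  (completion-hilit-commonality strings prefix-len))

(defun demo-completions-alist (string candidates)
  "Return an alist describing the matches for STRING in CANDIDATES."
  (let ((prefix-len (length string)))
    (list (cons 'completions (all-completions string candidates))
          (cons 'base 0)
          (cons 'end prefix-len)
          ;; The highlighting is packaged as a function, not applied.
          (cons 'highlight (apply-partially #'demo--highlight prefix-len)))))

(defun demo-alist-to-legacy (result)
  "Convert RESULT to a list of highlighted strings.
The base size is stored in the last cdr, as in the existing format."
  (let ((completions (cdr (assq 'completions result))))
    (when completions
      (nconc (funcall (cdr (assq 'highlight result)) completions)
             (cdr (assq 'base result))))))
```

A caller that only needs a handful of candidates can apply the `highlight' function to just those, while `demo-alist-to-legacy' shows how eager highlighting recovers the shape that existing callers expect.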

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 12 Aug 2021 08:48:01 GMT) Full text and rfc822 format available.

Message #26 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#48841: [PATCH] Add new `completion-filter-completions` API
 and deferred highlighting
Date: Thu, 12 Aug 2021 10:47:17 +0200

Eli, thank you for your feedback and for pointing me to the mode which
helps with the formatting.  I will address the documentation and
formatting issues as soon as the discussion has concluded.

Below I answer a few of your questions about technical details.

>> The `completions` value is the list of completion strings *without*
>> applied highlighting.  The completion strings are returned unmodified,
>> which avoids allocations and results in performance gains for
> 
> This is unclear: how can you return a list of strings which you
> produce without allocating the strings?

The function 'completion-filter-completions' receives a completion table
as argument.  The strings produced by this table are returned
unmodified, but of course the completion table has to produce them.  For
a static completion table (e.g., in the simplest case a list of strings)
the completion table itself will not allocate strings.  In this scenario
'completion-filter-completions' will not perform any string allocations,
only the list will be allocated.  This is what leads to major
performance gains.
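
The allocation claim can be checked with a small experiment. The sketch below assumes a static list of strings as the table; `demo-allocation-check' is a hypothetical helper written for this illustration.

```elisp
;; Sketch only: `demo-allocation-check' is a hypothetical helper.
(defun demo-allocation-check ()
  "Return (REUSED . COPIED) for a small static completion table.
REUSED is non-nil if filtering returns the table's own string objects.
COPIED is non-nil if highlighting returns fresh string objects."
  (let* ((table (list "alpha" "alpine" "beta"))
         (matches (all-completions "al" table))
         (highlighted (completion-hilit-commonality matches 2)))
    (cons (eq (car matches) (car table))
          (not (eq (car highlighted) (car matches))))))
```

Filtering a list table hands back the table's own strings, so only the result list is newly allocated; it is the highlighting step that copies each string before adding face properties.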

>> +(defvar completion--filter-completions nil
>> +  "Enable the new completions return value format.
>
> Btw, why is this an internal variable?  Shouldn't all completion
> styles ideally support it?  If so, it should be a public variable,
> documented in the ELisp manual.  And the name should also end with -p,
> since it's a boolean.  How about completion-filter-completions-format-p?

(As I understand the style guide, '-p' is not a good idea for boolean
variables, since a value is not a predicate in the strict sense.)

To address your technical comment - this variable is precisely what one
of the technical difficulties mentioned in my other mail is about.  The
question is how we can retain backward compatibility of the completion
style 'all' functions, e.g., 'completion-basic-all-completions', while
still allowing the function to return the newly introduced alist format
with more data, which enables 'completion-filter-completions' to perform
the efficient deferred highlighting.

> Also, the "This function has been superseded..." part should be a new
> paragraph, so that it stands out.  (And I'm not yet sure we indeed
> want to say "superseded" here, but that's part of the on-going
> discussion.  Maybe use more neutral language here, like "See also".)

The new API 'completion-filter-completions' will replace the existing
API 'completion-all-completions'.  I just didn't go as far as
deprecating the 'completion-all-completions' API right away, but we
could also do this.

> Is "filter" really the right word here (in the doc string)?  "Filter"
> means you take a sequence and produce another sequence with some
> members removed.  That's not what this API does, is it?  Suggest to
> use a different name, like completion-completions-alist or
> completion-all-completions-as-alist.

"Filter" seems like exactly the right word to me.  The function takes a
list of strings (or a completion table) and returns a subset of matching
completion strings without further modifications to the strings. See
above what I wrote about allocations.

>> +Only the elements of table that satisfy predicate PRED are considered.
>> +POINT is the position of point within STRING.  The METADATA may be
>> +modified by the completion style.  The return value is an alist with
>> +the keys:
>> +
>> +- base: Base position of the completion (from the start of STRING)
> 
> "Base" here means the beginning?  If so, why not call it "beg" or
> somesuch?

Base position is a fixed term which is already used in minibuffer.el for
completions.  See also 'completion-base-position' for example.

> Are we really losing the completion-score property here?  If so, why?

Yes, the property is removed in the current patch.  It is not actually
used for anything in the new implementation.  But it is possible to
restore the property such that 'completion-all-completions' always
returns scored candidates as it does now.  See my other mail regarding
the caveats of the current patch.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 12 Aug 2021 09:25:02 GMT) Full text and rfc822 format available.

Message #29 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: 47711 <at> debbugs.gnu.org, 48841 <at> debbugs.gnu.org
Cc: Eli Zaretskii <eliz <at> gnu.org>, Dmitry Gutov <dgutov <at> yandex.ru>,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: [PATCH] Add new `completion-filter-completions` API and deferred
 highlighting
Date: Thu, 12 Aug 2021 11:24:22 +0200

[Message part 1 (text/plain, inline)]

On 8/11/21 6:11 PM, Daniel Mendler wrote:
> 2. In `completion--nth-completion' set `completion--filter-completions'
>    to nil, unless `(memq style '(emacs21 emacs22 basic
>    partial-completion initials flex))' such that custom completion
>    styles which wrap the completion functions don't see the new return
>    value format, except if the custom style opts in explicitly by
>    binding `completion--filter-completions'. An alternative criterion is
>    `(memq fun '(completion-emacs22-all-completions) ...)'. Unfortunately
>    this approach will still not work if the user has advised a
>    `completion-x-all-completions' function. The only 100% safe approach
>    seems to transparently redirect calls to
>    `completion-x-all-completions' to `completion--x-filter-completions',
>    which returns the results in the new format.

I attached two patch variants which can be placed on top of my previous
patch to improve the backward compatibility of the internal API.

Variant 1: Set 'completion--return-alist-flag' only for the existing
completion styles, such that they transparently upgrade to the alist
return format.  If the variable is not set, the completion styles return
the result as a plain list, retaining backward compatibility.  The variable
is purely for internal use; new completion styles should return their
results as an alist on Emacs 28 and newer.

Variant 2: Add an optional argument FILTER to each of the completion
styles' 'all' functions, e.g., 'completion-basic-all-completions'.  In
'completion--nth-completion' try to call the function with the
additional FILTER argument to upgrade to the alist return format.  If
this fails with a 'wrong-number-of-arguments' error, retry without
the argument.
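
As a rough illustration of the retry logic in Variant 2, here is a
minimal sketch.  It is not the code from the attached
variant2-argument.el; the helper name `my/call-style-fun' is
hypothetical and only shows the fallback pattern.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/call-style-fun (fun string table pred point &optional filter)
  "Call the style function FUN, passing FILTER only if FUN accepts it."
  (condition-case nil
      ;; First try the new five-argument calling convention.
      (funcall fun string table pred point filter)
    (wrong-number-of-arguments
     ;; FUN only takes the four classic arguments: retry without FILTER.
     (funcall fun string table pred point))))
```

One caveat of this pattern is that a `wrong-number-of-arguments' error
signaled from inside FUN itself would also trigger the retry.  Checking
`func-arity' before the call might be an alternative.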

Daniel

[variant1-restrict.el (text/plain, attachment)]

[variant2-argument.el (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 10:39:01 GMT) Full text and rfc822 format available.

Message #32 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: 47711 <at> debbugs.gnu.org, 48841 <at> debbugs.gnu.org
Cc: Eli Zaretskii <eliz <at> gnu.org>, Dmitry Gutov <dgutov <at> yandex.ru>,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: [PATCH VERSION 2] Add new `completion-filter-completions` API and
 deferred highlighting
Date: Fri, 13 Aug 2021 12:38:28 +0200

[Message part 1 (text/plain, inline)]

I attached the overhauled patch, which addresses most of the comments by
Eli.  In comparison to my last patch, the patch is fully backward
compatible and preserves all existing tests.  As before, there are tests
which check the new functionality for each existing completion style.

Daniel

[0001-Add-new-completion-filter-completions-API-and-deferr.patch (text/x-diff, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 10:58:02 GMT) Full text and rfc822 format available.

Message #35 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Dmitry Gutov <dgutov <at> yandex.ru>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: [PATCH VERSION 2] Add new `completion-filter-completions` API and
 deferred highlighting
Date: Fri, 13 Aug 2021 11:56:59 +0100

> In comparison to my last patch, the patch is fully backward
> compatible and preserves all existing tests.

This is a very good thing (the fact that the patch is fully backward
compatible, I mean).

It is quite a large patch that touches many completion internals.  I'd like
some time to look it over.

I've read the discussion and am indeed aware of some non-negligible
performance problems in the flex and pcm completion styles since
they need to copy strings around.  Other -- completely different --
performance problems affect fido-mode specifically (but not
fido-vertical-mode, curiously).

In some conversation with Dmitry

  bug#48841: fido-mode is slower than ido-mode with similar settings

We discussed this.

There was also talk of removing the string copying with minimal (but not null)
backward compatibility breakage.  I recall Dmitry saying it was easy to fix on
the completion frontend side.  Many such frontends live in Emacs or GNU Elpa.
On the other hand, the patch that we (or at least I) envisioned in that
discussion was almost certainly much, much simpler than the one being presented
here, and thus much easier to reason about and discuss.

But to avoid comparing apples to oranges, I would like you to summarize,
perhaps in the form of code snippets and/or benchmarks, exactly what problems
your large patch solves.  State the problem(s) first, then the solution (to
each).  If there are multiple problems, then there's a good chance that
multiple patches that address each of these are preferred.

Thank you very much.
João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 11:22:02 GMT) Full text and rfc822 format available.

Message #38 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 13:21:32 +0200

On 8/13/21 12:56 PM, João Távora wrote:
> I've read the discussion and am indeed aware of some non-negligible
> performance problems in the flex and pcm completion styles since
> they need to copy strings around.  Other -- completely different --
> performance problems affect fido-mode specifically (but not
> fido-vertical-mode, curiously).
> 
> In some conversation with Dmitry
> 
>   bug#48841: fido-mode is slower than ido-mode with similar settings
> 
> We discussed this.

I've read the discussion.  You are probably aware of my efforts in
Vertico to implement deferred highlighting.  The patch I implemented
here implements the deferred highlighting in a clean way.

> There was also talk of removing the string copying with minimal (but not null)
> backward compatibility breakage.  I recall Dmitry saying it was easy to fix on
> the completion frontend side.  Many such frontends live in Emacs or GNU Elpa.
> On the other hand, the patch that we (or at least I) envisioned in that
> discussion was almost certainly much, much simpler than the one being presented
> here, and thus much easier to reason about and discuss.

No, this is not the case. There is no simple fix of the allocation issue
on the frontend side.  The existing API `completion-all-completions`
necessarily has to allocate all the strings in order to attach
highlighting and scoring.  The new API solves this in a clean way by
both deferring highlighting and scoring.

I claim that my patch is easy to reason about and refactors the existing
code to address the exact problem we are having. Please take some time
in reviewing it.

> But to avoid comparing apples to oranges, I would like you to summarize,
> perhaps in the form of code snippets and/or benchmarks, exactly what problems
> your large patch solves.  State the problem(s) first, then the solution (to
> each).

The main problem is that `completion-all-completions` allocates all the
strings every time the completions are filtered.  This is the same
performance issue you encountered in fido-mode/icomplete-mode.

The second problem addressed by the new API
`completion-filter-completions` is that `completion-all-completions` is
limited in what it can return.  For example it cannot return the end
position of the completion.  This is also solved by the new API.  The
new API is a clean extensible way forward.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 12:07:01 GMT) Full text and rfc822 format available.

Message #41 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 13:05:47 +0100

On Fri, Aug 13, 2021 at 12:21 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:

> No, this is not the case. There is no simple fix of the allocation issue
> on the frontend side.

I didn't claim that. At all. I claimed that the frontends that would be
affected by the (small)  backend patch are easy to adapt.  I think
you completely read past my idea.

> The existing API `completion-all-completions`
> necessarily has to allocate all the strings in order to attach
> highlighting and scoring.  The new API solves this in a clean way by
> both deferring highlighting and scoring.

I'm not sure you understand my alternative idea.  As far as I
understand (and have actually measured) the lines:

   ;; Don't modify the string itself.
   (setq str (copy-sequence str))

in minibuffer.el, in the function completion-pcm--hilit-commonality

Are the cause of the problem that _I am talking about_ and that
I have actually measured.  Again you may be referring to a
_different_ problem that I am unaware of.

If one removes these lines, the process becomes much faster, but there is a
problem with highlighting.  My idea is indeed to defer highlighting by not
setting the 'face property directly on that shared string, but some other
property that is read later from the shared string by compliant frontends.

If you have understood this idea, can you comment on it?
(Preferably in terms of less adjectification regarding "cleanliness", but in
terms of actual drawbacks/advantages?)

The drawback that I can see in it is that frontends directly relying on 'face
are broken by that patch.  But according to Dmitry (and I tend to agree), it's
quite easy to address those frontends.  Most of them live in Emacs core or
GNU Elpa.

The advantage that I see is that those adaptations apart, it is a small
localized and effective change.

> I claim that my patch is easy to reason about and refactors the existing
> code to address the exact problem we are having. Please take some time
> in reviewing it.

I am already taking some time. I need your assistance in explaining the
problems first. I take into account your claims of cleanliness and elegance,
but in terms of their power of persuasion, they are much more limited
than hard material evidence.

> The main problem is that `completion-all-completions` allocates all the
> strings every time the completions are filtered.  This is the same
> performance issue you encountered in fido-mode/icomplete-mode.

OK. I encountered at least two different performance problems there, with
quite different causes. So let's stick to the string-allocation problem.  Post
a code snippet that demonstrates the problem the way you see it/experience it?

Some benchmark code would be very welcome.  You can probably grab my
benchmarking code from that other bug.

Then it becomes easy to study multiple solutions to that problem and
choose the best one!

> The second problem addressed by the new API
> `completion-filter-completions` is that `completion-all-completions` is
> limited in what it can return.  For example it cannot return the end
> position of the completion.

And why is this a problem? Can you post an example of something you'd
like to do, but can't?  Regardless, it does seem indeed like a "second" problem
(as you state) so perhaps something that can be addressed separately.

Is your particular solution to this second problem instrumental in solving
the "main problem"?

> This is also solved by the new API.  The new API is a clean extensible way forward.

I understand you've put time and effort into producing this work. We are
all indebted and I promise to read it. But every API writer in history of
programming has claimed those things and reality often shows otherwise.
So it's not that your work can't be those things you claim, maybe it is, but
generally the larger and broader the work the harder it is to reason about.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 12:24:01 GMT) Full text and rfc822 format available.

Message #44 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 14:22:48 +0200

On 8/13/21 2:05 PM, João Távora wrote:
>> The existing API `completion-all-completions`
>> necessarily has to allocate all the strings in order to attach
>> highlighting and scoring.  The new API solves this in a clean way by
>> both deferring highlighting and scoring.
> 
> I'm not sure you understand my alternative idea.  As far as I
> understand (and have actually measured) the lines:
> 
>    ;; Don't modify the string itself.
>    (setq str (copy-sequence str))
> 
> in minibuffer.el, in the function completion-pcm--hilit-commonality
> 
> Are the cause of the problem that _I am talking about_ and that
> I have actually measured.  Again you may be referring to a
> _different_ problem that I am unaware of.

You are right that the call to `copy-sequence` is a major bottleneck in
the filtering.  However you are wrong that this line can simply be
removed/disabled and the candidates can be modified.  The API guarantees
and has always guaranteed that the candidate strings are not mutated.
It is important to keep this property since this will preclude many bugs
due to string mutation.  By separating the filtering and mutation
(highlighting, scoring) my patch addresses the problem at hand in the
proper way.

Note that the UI also has no possibility to opt-out of the mutation.
The UI is actually not the one being concerned about the mutation here,
it is the backends (completion tables), which produce the strings.  If
one starts mutating these strings you will see bugs cropping up
throughout Emacs where shared strings suddenly have spurious additional
properties due to the completion filtering.
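
The effect can be reproduced with a few lines of Emacs Lisp.  This is a
minimal sketch; the function name is hypothetical and the string merely
stands in for one owned by a completion table.

```elisp
(defun my/demo-shared-string-mutation ()
  "Return the `face' of a shared string before and after in-place highlighting."
  (let* ((shared (copy-sequence "find-file")) ; stands in for a table-owned string
         (copy (copy-sequence shared)))
    ;; Highlighting a copy leaves the shared string untouched.
    (put-text-property 0 4 'face 'completions-common-part copy)
    (list (get-text-property 0 'face shared)
          ;; Highlighting in place changes what every other holder sees.
          (progn
            (put-text-property 0 4 'face 'completions-common-part shared)
            (get-text-property 0 'face shared)))))

(my/demo-shared-string-mutation) ; => (nil completions-common-part)
```

The first element shows that the copy protected the shared string; the
second shows the property leaking once the string is modified directly.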

Mutation would be a reasonable choice here if the problem could not be
solved in a proper way.  But in fact it can be solved in a proper way
without mutating the strings at all as my patch shows.

> If one removes these lines, the process becomes much faster, but there is a
> problem with highlighting.  My idea is indeed to defer highlighting by not
> setting the 'face property directly on that shared string, but some other
> property that is read later from the shared string by compliant frontends.

This solution is much more ad-hoc and you still mutate the string which
is not allowed.

> The advantage that I see is that those adaptations apart, it is a small
> localized and effective change.

Note that your idea also does not address the other issues which are
addressed by my patch.  The new API `completion-filter-completions`
returns data which hasn't been available before, e.g., the end position,
which cannot be fixed given the existing API.

>> The main problem is that `completion-all-completions` allocates all the
>> strings every time the completions are filtered.  This is the same
>> performance issue you encountered in fido-mode/icomplete-mode.
> 
> OK. I encountered at least two different performance problems there, with
> quite different causes. So let's stick to the string-allocation problem.  Post
> a code snippet that demonstrates the problem the way you see it/experience it?

You can try my Vertico completion UI, which is available on GNU ELPA.
It implements deferred highlighting and there the performance difference
is perceivable.  Currently Vertico uses an advice-based hack to avoid
the over-eager string-allocations and the highlighting.

>> The second problem addressed by the new API
>> `completion-filter-completions` is that `completion-all-completions` is
>> limited in what it can return.  For example it cannot return the end
>> position of the completion.
> 
> And why is this a problem? Can you post an example of something you'd
> like to do, but can't?  Regardless, it does seem indeed like a "second" problem
> (as you state) so perhaps something that can be addressed separately.

Please look at the FIXMEs in minibuffer.el which address this.
Currently only the beginning position of the completion boundary is
returned, which is only half of the information.

> I understand you've put time and effort into producing this work. We are
> all indebted and I promise to read it. But every API writer in history of
> programming has claimed those things and reality often shows otherwise.
> So it's not that your work can't be those things you claim, maybe it is, but
> generally the larger and broader the work the harder it is to reason about.

I stand by my claim and I also stand by the claim that
removing/disabling `copy-sequence` is not a proper way to address the
issues at hand and will introduce many bugs in the long run.  Please
take your time to look at the patch in earnest.  I would also like to
see others chime in here with their opinion.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 12:38:02 GMT) Full text and rfc822 format available.

Message #47 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 13:37:38 +0100

On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:

> It is important to keep this property since this will preclude many bugs
> due to string mutation.

I am aware of this, of course.  Can you give examples of these "many bugs"?
Perhaps other than the one I already described and addressed?

> By separating the filtering and mutation
> (highlighting, scoring) my patch addresses the problem at hand in the
> proper way.
>[ ... ]
> Mutation would be a reasonable choice here if the problem could not be
> solved in a proper way.  But in fact it can be solved in a proper way
> without mutating the strings at all as my patch shows.

"proper" is just a reasonably empty adjective.  There are different ways to
go about this, of course.  What's "proper" and what isn't is hard to debate
objectively.

> This solution is much more ad-hoc and you still mutate the string which
> is not allowed.

It's also difficult to debate "ad-hoc" or not.  If you've studied the problem,
what makes you say that mutating the string (in this case, adding a
'completion--style-face' property to it) is not allowed?  What negative things
would derive from it?

> > The advantage that I see is that those adaptations apart, it is a small
> > localized and effective change.
>
> Note that your idea also does not address the other issues which are
> addressed by my patch.

That's for sure.  My patch idea addresses only that single problem.
I think this is a good property of patches: to solve one thing, not many.

We can make more patches to solve other problems, once we
identify them clearly.

> The new API `completion-filter-completions`
> returns data which hasn't been available before, e.g., the end position,
> which cannot be fixed given the existing API.
>
> >> The main problem is that `completion-all-completions` allocates all the
> >> strings every time the completions are filtered.  This is the same
> >> performance issue you encountered in fido-mode/icomplete-mode.
> >
> > OK. I encountered at least two different performance problems there, with
> > quite different causes. So let's stick to the string-allocation problem.  Post
> > a code snippet that demonstrates the problem the way you see it/experience it?

Look, one needs to evaluate things quantitatively. Your patch is not
to Vertico, it's to Emacs. I'm concerned with changes to Emacs and their
effect on all completion frontends.  So trying Vertico isn't very useful.

If you're solving a performance problem (and it seems that you are, among
other things) we really need benchmarks, a description of an experiment whose
results can be reproduced independently. It's the normal scientific method.

Something like:

"before my patch, this code takes 123 seconds to run, after my patch it
takes 12."
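
A reproducible measurement of the kind requested here could look roughly
like the following sketch.  It assumes an `emacs -Q' session; the
function name, the synthetic candidate strings and the repetition count
are made up for illustration, and no timings are claimed.

```elisp
(require 'benchmark)

;; Hypothetical benchmark helper, for illustration only.
(defun my/bench-all-completions (n style)
  "Time `completion-all-completions' over N synthetic candidates using STYLE.
Return the list (ELAPSED GC-COUNT GC-ELAPSED) produced by `benchmark-run'."
  (let ((table (mapcar (lambda (i) (format "candidate-%06d" i))
                       (number-sequence 1 n)))
        ;; Restrict completion to the single style under test.
        (completion-styles (list style))
        (completion-category-defaults nil)
        (completion-category-overrides nil))
    ;; Start from a clean heap so GC counts are comparable between runs.
    (garbage-collect)
    (benchmark-run 10
      (completion-all-completions "cand" table nil 4))))
```

Evaluating, say, (my/bench-all-completions 100000 'flex) before and
after applying a patch would give numbers that others can reproduce and
compare.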

> >> The second problem addressed by the new API
> >> `completion-filter-completions` is that `completion-all-completions` is
> >> limited in what it can return.  For example it cannot return the end
> >> position of the completion.
> >
> > And why is this a problem? Can you post an example of something you'd
> > like to do, but can't?  Regardless, it does seem indeed like a "second" problem
> > (as you state) so perhaps something that can be addressed separately.
>
> Please look at the FIXMEs in minibuffer.el which address this.
> Currently only the beginning position of the completion boundary is
> returned, which is only half of the information.

OK. It does seem like a separate problem, so maybe open a new bug for it?

João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 12:58:02 GMT) Full text and rfc822 format available.

Message #50 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 14:56:38 +0200

On 8/13/21 2:37 PM, João Távora wrote:
> On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> 
>> It is important to keep this property since this will preclude many bugs
>> due to string mutation.
> 
> I am aware of this, of course.  Can you give examples of these "many bugs"?
> Perhaps other than the one I already described and addressed?

No, João, this is not how it goes.  I don't have to prove to you that
your idea introduces bugs.  You have to show that mutation of the
completion table strings (which are not supposed to be mutated) will not
lead to bugs, which are hard to find.

In contrast with the new API `completion-filter-completions` this entire
class of bugs is avoided by construction of the API.  Furthermore the
`completion-filter-completions` API is easy to use in comparison to your
idea, where "compliant" backends have to apply string manipulations to
apply the highlighting and revert the strings back to their old pristine
state.  The only thing the API user has to do is to call the `highlight`
function returned in the alist by `completion-filter-completions`.
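
To make the shape of such an API concrete, here is a self-contained
sketch.  It is not the implementation from the patch:
`my/filter-completions' and `my/highlight-prefix' are hypothetical
names, only prefix matching is performed, and the alist keys
`completions' and `highlight' are assumed for illustration, following
the description above.

```elisp
(require 'seq)

;; Hypothetical helper: highlight copies, never the original strings.
(defun my/highlight-prefix (len strings)
  "Return copies of STRINGS with their first LEN characters highlighted."
  (mapcar (lambda (s)
            (let ((copy (copy-sequence s)))
              (add-face-text-property 0 len 'completions-common-part nil copy)
              copy))
          strings))

;; Hypothetical stand-in for the proposed API.
(defun my/filter-completions (prefix candidates)
  "Return an alist with the CANDIDATES matching PREFIX and a highlighter.
The matching strings are returned unmodified and uncopied."
  `((completions . ,(seq-filter (lambda (c) (string-prefix-p prefix c))
                                candidates))
    (highlight . ,(apply-partially #'my/highlight-prefix (length prefix)))))

;; A UI would highlight only the candidates it actually displays:
(let* ((result (my/filter-completions
                "fi" '("find-file" "fill-paragraph" "dired")))
       (all (alist-get 'completions result)))
  (funcall (alist-get 'highlight result) (seq-take all 1)))
```

In this sketch only the displayed candidates are copied, so the cost of
highlighting no longer depends on the total number of matches.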

>> By separating the filtering and mutation
>> (highlighting, scoring) my patch addresses the problem at hand in the
>> proper way.
>> [ ... ]
>> Mutation would be a reasonable choice here if the problem could not be
>> solved in a proper way.  But in fact it can be solved in a proper way
>> without mutating the strings at all as my patch shows.
> 
> "proper" is just a reasonably empty adjective.  There are different ways to
> go about this, of course.  What's "proper" and isn't is hard to debate
> objectively.

You are contradicting yourself here.  You agree that string mutation is
better avoided.  If we define "proper" as "avoids string mutation if this
is easily possible", then my patch implements a proper solution to the
problem.

>>> The advantage that I see is that those adaptations apart, it is a small
>>> localized and effective change.
>>
>> Note that your idea also does not address the other issues which are
>> addressed by my patch.
> 
> That's for sure.  My patch idea addresses only that single problem.
> I think this is a good property of patches: to solve one thing, not many.

No, this is not necessarily true.  This is only good if the problem is
solved in a way which is future proof.  The idea of mutating the strings
is a hack and not a solution. In contrast, I am presenting a
future-proof new API as solution which addresses multiple problems.  If
you look at the patch, only 196 new lines are added to minibuffer.el.
Furthermore the patch adds 213 lines of new tests.

> Look, one needs to evaluate things quantitatively. Your patch is not
> to Vertico, it's to Emacs. I'm concerned with changes to Emacs and their
> effect on all completion frontends.  So trying Vertico isn't very useful.
> 
> If you're solving a performance problem (and it seems that you are, among
> other things) we really need benchmarks, a description of an experiment whose
> results can be reproduced independently. It's the normal scientific method.

João, you don't have to lecture me on these things.  Of course I can
provide such numbers.  You cannot reasonably make the claim that
`copy-sequence` is the problem and at the same time claim that my patch
does not solve the performance issues, when in fact my patch avoids this
exact string copying.

>>>> The second problem addressed by the new API
>>>> `completion-filter-completions` is that `completion-all-completions` is
>>>> limited in what it can return.  For example it cannot return the end
>>>> position of the completion.
>>>
>>> And why is this a problem? Can you post an example of something you'd
>>> like to do, but can't?  Regardless, it does seem indeed like a "second" problem
>>> (as you state) so perhaps something that can be addressed separately.
>>
>> Please look at the FIXMEs in minibuffer.el which address this.
>> Currently only the beginning position of the completion boundary is
>> returned, which is only half of the information.
> 
> OK. It does seem like a separate problem, so maybe open a new bug for it?

There is already a FIXME in minibuffer.el, so I assume Stefan Monnier is
well aware of these issues.  It is an additional win of the new API that
such open problems can be fixed too.  As I see it, a new API is the way
to go here.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 13:37:02 GMT) Full text and rfc822 format available.

Message #53 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 14:36:21 +0100

On Fri, Aug 13, 2021 at 1:56 PM Daniel Mendler
<mail <at> daniel-mendler.de> wrote:
>
> On 8/13/21 2:37 PM, João Távora wrote:
> > On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> >
> >> It is important to keep this property since this will preclude many bugs
> >> due to string mutation.
> >
> > I am aware of this, of course.  Can you give examples of these "many bugs"?
> > Perhaps other than the one I already described and addressed?
>
> No, João, this is not how it goes.  I don't have to prove to you that
> your idea introduces bugs.

So you just say it and I have to believe it?  Then I could say the same to
you, right?  I won't of course, that would be silly.

> You have to show that mutation of the
> completion table strings (which are not supposed to be mutated) will not
> lead to bugs, which are hard to find.
>
> In contrast with the new API `completion-filter-completions` this entire
> class of bugs is avoided by construction of the API.  Furthermore the
> `completion-filter-completions` API is easy to use in comparison to your
> idea, where "compliant" backends have to apply string manipulations to
> apply the highlighting and revert the strings back to their old pristine
> state.  The only thing the API user has to do is to call the `highlight`
> function returned in the alist by `completion-filter-completions`.
>
> >> By separating the filtering and mutation
> >> (highlighting, scoring) my patch addresses the problem at hand in the
> >> proper way.
> >> [ ... ]
> >> Mutation would be a reasonable choice here if the problem could not be
> >> solved in a proper way.  But in fact it can be solved in a proper way
> >> without mutating the strings at all as my patch shows.
> >
> > "proper" is just a reasonably empty adjective.  There are different ways to
> > go about this, of course.  What's "proper" and isn't is hard to debate
> > objectively.
>
> You are contradicting yourself here. You agree that string mutation is
> better avoided. If we define "proper" as avoids string mutation if this
> is easily possible, then my patch implements a proper solution to the
> problem.

I didn't say it's better avoided, though of course I will avoid _any_ change if
I can.  I said I have identified one drawback with doing it.  Then I have
addressed that drawback.  So that's what I said.

I am unaware of _other_ drawbacks.  They might exist, but I am unaware of
them.  Perhaps you are, and indeed you state they exist, but you refuse to
let me know about them.  Or perhaps others know of them and will let me know.
In my long-running discussion with Dmitry they were not presented (again,
except for the one I identified).

> > That's for sure.  My patch idea addresses only that single problem.
> > I think this is a good property of patches: to solve one thing, not many.
> No, this is not necessarily true.  This is only good if the problem is
> solved in a way which is future proof.

OK, but what thing of the future, real or academic, do you envision that
would bring back the problem, or create other problems?

> The idea of mutating the strings is a hack and not a solution.

Without facts to back it up, I have to take this as gratuitous disparagement.
Nicht so gut.

> In contrast, I am presenting a
> future-proof new API as solution which addresses multiple problems.

That's the issue.  The completion system is very complex and there are many
good ideas, different, floated by many people. But if you make a patch to
address "multiple" fuzzily-described problems, it's hard to judge how good
your ideas even are! Maybe they are indeed very good,  I never said
they weren't.  No need to get worked up about it!

Again, my proposal is to first focus on the performance problems caused by
string allocation.  _That_ problem is well understood, at least by me (but it
would help to settle on convenient benchmarks understood by others, too).
Then we can go from there.

> you look at the patch, only 196 new lines are added to minibuffer.el.
> Furthermore the patch adds 213 lines of new tests.

It's a large patch, over 1000 lines.  One does not review a patch merely by
looking at lines added, when one needs to read much more, to understand
implications, etc.  It needs documentation, for one, much more than just
docstrings, on how to use the new API.

> João, you don't have to lecture me on these things.  Of course I can
> provide such numbers.

Then please do!  I don't mean to lecture you; it's just that your suggestion
that I try the Vertico UI as a substitute for these numbers seemed completely
misguided.  So if you have them (or "can provide them"), let's see them.
That's all I'm asking, preferably as an Emacs -Q recipe.

> You cannot reasonably make the claim that
> `copy-sequence` is the problem and at the same time claim that my patch
> does not solve the performance issues, when in fact my patch avoids this
> exact string copying.

I didn't say it didn't solve them!  Now, where did I say that?  I would like to
see a benchmark so that I can witness it _and_ study alternative solutions.
With that, there's a better chance that I will be persuaded there are none as
elegant, clean, proper, pure, etc. as yours!

Maybe others review patches on other aspects; that's fine.  Maybe
others will. Eli reviewed on minor formatting and documentation aspects.
I review them on substance, using numbers and conducting my own
experiments and tests. This takes time and help from the scientist on the
other end.

Simple and in summary, let's hope your next reply has some benchmarks
so we can make progress.

Thanks,
João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 14:04:02 GMT) Full text and rfc822 format available.

Message #56 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 16:03:21 +0200

On 8/13/21 3:36 PM, João Távora wrote:
>> You are contradicting yourself here. You agree that string mutation is
>> better avoided. If we define "proper" as avoiding string mutation where
>> this is easily possible, then my patch implements a proper solution to
>> the problem.
> 
> I didn't say it's better avoided, though of course I will avoid _any_ change if
> I can. I said I have identified one drawback with doing it.  Then I have
> addressed that drawback. So that's what I said.
> 
> I am unaware of _other_ drawbacks.  They might exist, but I am unaware of
> them.  Perhaps you are, and indeed you state they exist, but you refuse to
> let me know about them.  Or perhaps others know of them and will let me know.
> In my long-running discussion with Dmitry they were not presented (again,
> except for the one I identified).

In the discussion with Dmitry, I already pointed out that there is an
alternative principled approach implemented by my Vertico UI, which is
in fact the same approach as implemented in this patch.

If there are other useful conclusions from the discussion I will adopt
them here for this patch.

>>> That's for sure.  My patch idea addresses only that single problem.
>>> I think this is a good property of patches: to solve one thing, not many.
>> No, this is not necessarily true.  This is only good if the problem is
>> solved in a way which is future proof.
> 
> OK, but what thing of the future, real or academic, do you envision that
> would bring back the problem, or create other problems?
> 
>> The idea of mutating the strings is a hack and not a solution.
> 
> Without facts to back it up, I have to take this as gratuitous disparagement.
> Nicht so gut.

João, your whole answers are "nicht so gut" or not useful.  What is your
point?  Please give constructive technical feedback instead of such
empty phrases.

>> In contrast, I am presenting a
>> future-proof new API as solution which addresses multiple problems.
> 
> That's the issue.  The completion system is very complex and there are many
> good ideas, different, floated by many people. But if you make a patch to
> address "multiple" fuzzily-described problems, it's hard to judge how good
> your ideas even are! Maybe they are indeed very good,  I never said
> they weren't.  No need to get worked up about it!
> 
> Again, my proposal is to first focus on the performance problems caused by
> string allocation.  _That_ problem is well understood, at least by me (but it
> would help to settle on convenient benchmarks understood by others, too).
> Then we can go from there.

No, it is not the correct approach to fix larger issues by applying
localized patches.  We both have identified the string allocations and
highlighting as problem.  My patch resolves the problem, by exposing
just the right pieces of the already existing completion machinery. More
about this below.

>> you look at the patch, only 196 new lines are added to minibuffer.el.
>> Furthermore the patch adds 213 lines of new tests.
>
> It's a large patch, over 1000 lines.  One does not review a patch merely
> by looking at lines added, when one needs to read much more, to understand
> implications, etc.  It needs documentation, for one, much more than just
> docstrings, on how to use the new API.

I suggest you take a step back here and try to understand the high-level
idea first.  It seems that you are misjudging the complexity of the
patch.  The minibuffer completion machinery is already constructed such
that filtering and highlighting are separate.

If you look at `completion-basic-all-completions` for example, there is
first a filtering step and then the highlighting is applied in a second
step by the function `completion-hilit-commonality`.  This separation
exists for all completion styles.

My patch does nothing else than separating these two processing steps.
The new API `completion-filter-completions` returns the filtered list
and a function to apply highlighting afterwards only to the actually
displayed candidates where highlighting is needed.
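
The two-step idea described above can be illustrated with a minimal
sketch.  Everything here is an assumption made for illustration: the names
`my-filter-completions' and `my-highlight-candidates', the alist keys, and
the plain substring matching are hypothetical and are not the code from the
patch.

```elisp
(require 'seq)

;; Hypothetical helper: highlighting works on copies, never on the originals.
(defun my-highlight-candidates (input shown)
  "Return propertized copies of the strings in SHOWN, marking INPUT."
  (let ((regexp (regexp-quote input))
        (case-fold-search nil))
    (mapcar (lambda (cand)
              (let ((copy (copy-sequence cand)))
                (when (string-match regexp copy)
                  (put-text-property (match-beginning 0) (match-end 0)
                                     'face 'completions-common-part copy))
                copy))
            shown)))

;; Hypothetical filter step: returns the original string objects untouched,
;; plus a function that applies highlighting to whatever list it is given.
(defun my-filter-completions (input candidates)
  "Return an alist with the CANDIDATES containing INPUT and a highlighter."
  (let ((regexp (regexp-quote input))
        (case-fold-search nil))
    (list (cons 'completions
                (seq-filter (lambda (cand) (string-match-p regexp cand))
                            candidates))
          (cons 'highlight
                (apply-partially #'my-highlight-candidates input)))))
```

A UI would call the filter once for the whole table and then pass only the
visible slice, for example `(seq-take matches 20)', to the function stored
under `highlight'.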

In contrast your idea totally misses this.

> Maybe others review patches on other aspects; that's fine.  Maybe
> others will. Eli reviewed on minor formatting and documentation aspects.

I am looking forward to more reviews by other people.

Your desire for benchmarks is understandable, but I doubt that it will
lead to progress in the discussion here and I doubt that it will
convince you.

The outcome of the benchmark is the following - my patch only filters
and does not mutate the strings, so it will be slightly faster than your
idea where the strings are mutated first and afterwards the mutation has
to be undone again.  However the mutations are of course not expensive,
so the differences will be small.  The discussion we should be having
here is about technical details and internals and not about the numbers
which won't give any guidance in this case regarding the correct API design.

You seem to always come back to the "scientific method". Note that there
is not only statistics, there is also "scientific reasoning" and
mathematics, which allow one to reason about transformations and draw
conclusions from that.  If you don't do this, you are only doing half of
the science.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 14:12:01 GMT) Full text and rfc822 format available.

Message #59 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 15:11:27 +0100

On Fri, Aug 13, 2021 at 3:03 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:

> > Without facts to back it up, I have to take this as gratuitous disparagement.
> > Nicht so gut.
>
> João, your whole answers are "nicht so gut" or not useful.  What is your
> point?  Please give constructive technical feedback instead of such
> empty phrases.

Look, you disparaged an idea of mine without absolutely any facts. I don't think
that's good. "Nicht so gut" was a lighthearted way of pointing it out.
Lighten up.
Post the benchmarks you say you have and stop the pompous handwaving.

Bye,
João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 13 Aug 2021 14:38:02 GMT) Full text and rfc822 format available.

Message #62 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Fri, 13 Aug 2021 16:37:38 +0200

On 8/13/21 4:11 PM, João Távora wrote:
> On Fri, Aug 13, 2021 at 3:03 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> 
>>> Without facts to back it up, I have to take this as gratuitous disparagement.
>>> Nicht so gut.
>>
>> João, your whole answers are "nicht so gut" or not useful.  What is your
>> point?  Please give constructive technical feedback instead of such
>> empty phrases.
> 
> Look, you disparaged an idea of mine without absolutely any facts. I don't think
> that's good. "Nicht so gut" was a lighthearted way of pointing it out.
> Lighten up.
> Post the benchmarks you say you have and stop the pompous handwaving.

João, the way you argue is not in any way "lighthearted".  It also
depends on what the other party receives as the message.  And here you
just repeat this style by calling my reasoning "pompous handwaving".
This is not a fair way to discuss.  In contrast my arguments were
generally of a technical nature.  I propose we both calm down a bit and
let others chime in here.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 02:48:02 GMT) Full text and rfc822 format available.

Message #65 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>,
 Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 05:47:43 +0300

Hi folks,

Sorry I'm late to this party.

On 13.08.2021 16:36, João Távora wrote:
>>>> By separating the filtering and mutation
>>>> (highlighting, scoring) my patch addresses the problem at hand in the
>>>> proper way.
>>>> [ ... ]
>>>> Mutation would be a reasonable choice here if the problem could not be
>>>> solved in a proper way.  But in fact it can be solved in a proper way
>>>> without mutating the strings at all as my patch shows.
>>> "proper" is just a reasonably empty adjective.  There are different ways to
>>> go about this, of course.  What's "proper" and isn't is hard to debate
>>> objectively.
>> You are contradicting yourself here. You agree that string mutation is
>> better avoided. If we define "proper" as avoiding string mutation where
>> this is easily possible, then my patch implements a proper solution to
>> the problem.
> I didn't say it's better avoided, though of course I will avoid _any_ change if
> I can. I said I have identified one drawback with doing it.  Then I have
> addressed that drawback. So that's what I said.
> 
> I am unaware of _other_ drawbacks.  They might exist, but I am unaware of
> them.  Perhaps you are, and indeed you state they exist, but you refuse to
> let me know about them.  Or perhaps others know of them and will let me know.
> In my long-running discussion with Dmitry they were not presented (again,
> except for the one I identified).

I thought I explained the problem with this previously.

It's basically this: we cannot mutate what we don't own. Across all the
completion functions out there, there will be some that return "shared"
strings (meaning, not copied or newly allocated) from their completion
tables. And modifying them is bad, with consequences which can present
themselves in unexpected, often subtle ways.

Since up until now completion-pcm--hilit-commonality copied all strings
before modifying, completion tables such as described (with "shared"
strings) have all been "legal". Suddenly deciding to stop supporting
them would be a major API breakage with consequences that are hard to
predict. And while I perhaps agree that it's an inconvenience, I don't
think it's a choice we can simply make at this stage in c-a-p-f's
development.

To give an example of a subtle consequence:

  1. (setq s (symbol-name 'car))

  2. (put-text-property 1 3 'face 'error s)

  3. Switch to a buffer in fundamental mode

  4. (insert (symbol-name 'car)) --> see the error face in the buffer

Now imagine that some completion table collects symbol names by passing 
obarray through #'symbol-name rather than #'all-completions, and voila, 
if the completion machinery adds properties (any properties, not just 
face) to those strings, you have just modified a bunch of global values. 
That's not good.

And in the example above, the values are those that the 
lispref/objects.texi says we should not change (though it gives 
(symbol-name 'cons) as example). "Not mutable", in its parlance. IIRC 
the related discussions mentioned that modifying such values could lead 
to a segfault in some previous Emacs versions. Maybe not anymore, but 
it's still not a good idea.
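
The ownership point can be sketched without touching any real symbol
names.  The table `my-shared-table' and both helper functions below are
hypothetical; the freshly allocated strings merely stand in for values that
some other part of Emacs also holds on to.

```elisp
;; Hypothetical completion table whose strings are shared with its owner.
(defvar my-shared-table (list (copy-sequence "car") (copy-sequence "cdr"))
  "Illustrative table; its strings stand in for shared values.")

(defun my-highlight-in-place (string)
  "Add a face to STRING itself, which changes the owner's object too."
  (put-text-property 1 3 'face 'error string)
  string)

(defun my-highlight-copy (string)
  "Return a propertized copy of STRING, leaving STRING untouched."
  (let ((copy (copy-sequence string)))
    (put-text-property 1 3 'face 'error copy)
    copy))

;; Copying keeps the table clean:
;;   (my-highlight-copy (car my-shared-table))
;; Highlighting in place leaks the face into the table itself:
;;   (my-highlight-in-place (cadr my-shared-table))
```

After the second call, every later use of the table's "cdr" string, such as
inserting it into a buffer, carries the face along.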

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 02:56:02 GMT) Full text and rfc822 format available.

Message #68 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>,
 Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 05:55:17 +0300

Aside from the mutability/ownership issue,

On 13.08.2021 15:05, João Távora wrote:
> If one removes these lines, the process becomes much faster, but there is a
> problem with highlighting.  My idea is indeed to defer highlighting by not
> setting the 'face property directly on that shared string, but some
> other property
> that is read later from the shared string by compliant frontends.

I haven't done any direct benchmarking, but I'm pretty sure that this 
approach cannot, by definition, be as fast as the non-mutating one.

Because you go through the whole list and mutate all of its elements, 
you have to perform a certain bit of work W x N times, where N is the 
size of the whole list.

Whereas the deferred-mutation approach will only have to do its bit 
(which is likely more work, like, WWW) only 20 times, or 100 times, or 
however many completions are displayed. And this is usually negligible.

However big the difference is going to be, I can't say in advance, of 
course, or whether it's going to be shadowed by some other performance 
costs. But the non-mutating approach should have the best optimization 
potential when the list is long.
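
A rough way to check this estimate is a small timing harness like the one
below.  It is only a sketch under stated assumptions: the function names,
the `my-deferred-face' property and the generated candidate strings are all
hypothetical, it measures only the highlighting step and not filtering or
sorting, and the numbers will depend on the machine and Emacs build.

```elisp
(require 'benchmark)
(require 'seq)

(defun my-mark-all-in-place (candidates)
  "Put a hypothetical `my-deferred-face' property on all CANDIDATES."
  (dolist (cand candidates)
    (put-text-property 0 (length cand) 'my-deferred-face
                       'completions-common-part cand)))

(defun my-highlight-shown (candidates shown)
  "Return face-propertized copies of the first SHOWN strings in CANDIDATES."
  (mapcar (lambda (cand) (propertize cand 'face 'completions-common-part))
          (seq-take candidates shown)))

(defun my-compare-approaches (n shown)
  "Time both approaches on N generated candidates, displaying SHOWN of them.
Each entry is (LABEL ELAPSED GC-COUNT GC-ELAPSED), from `benchmark-run'."
  (let ((candidates (mapcar (lambda (i) (format "candidate-%d" i))
                            (number-sequence 1 n))))
    (list (cons 'mark-all
                (benchmark-run 10 (my-mark-all-in-place candidates)))
          (cons 'highlight-shown
                (benchmark-run 10 (my-highlight-shown candidates shown))))))
```

Evaluating something like `(my-compare-approaches 100000 20)' in an
`emacs -Q' session gives one data point for the W x N versus "WWW x 20"
comparison.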

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 03:12:02 GMT) Full text and rfc822 format available.

Message #71 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Daniel Mendler <mail <at> daniel-mendler.de>,
 "emacs-devel <at> gnu.org" <emacs-devel <at> gnu.org>
Cc: João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: [PATCH] Add new `completion-filter-completions` API
 and deferred highlighting
Date: Sat, 14 Aug 2021 06:11:26 +0300

Hi Daniel,

I haven't yet read the patch in detail, but it sounds like a move in the 
right direction (even if it doesn't include the long-overdue overhaul of 
the whole API).

A few notes on the new stuff:

> Finally the
> `highlight` value is a function taking a list of completion strings
> and returns a new list of new strings with highlighting applied.

First of all, I'd really like it to be a function that applies to 
individual completion strings, not the whole collection. That would make 
it much easier to use in company-capf without having to rewrite a lot of 
code in the presentation layer.

Second, perhaps instead of modifying the strings themselves it could 
return some data (like a list of faces-intervals tuples) that could be 
used to do so?

Again, in company-capf we end up parsing the face properties in the 
string, so those modifications are just extra work for CPU which we 
could eliminate.

This is less critical, though.
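
To make the second suggestion concrete, here is a minimal sketch of a
per-candidate function that returns match data instead of a propertized
string.  The name `my-match-intervals', the (START END FACE) tuple layout
and the regexp-based matching are assumptions for illustration only; no
completion style currently exposes such a function.

```elisp
(defun my-match-intervals (regexp candidate)
  "Return a list of (START END FACE) for matches of REGEXP in CANDIDATE.
CANDIDATE itself is neither copied nor modified."
  (let ((case-fold-search nil)
        (start 0)
        (intervals nil))
    (while (and (< start (length candidate))
                (string-match regexp candidate start))
      (let ((beg (match-beginning 0))
            (end (match-end 0)))
        ;; Record only non-empty matches.
        (when (< beg end)
          (push (list beg end 'completions-common-part) intervals))
        ;; Step past empty matches to guarantee progress.
        (setq start (max end (1+ beg)))))
    (nreverse intervals)))

;; (my-match-intervals "fi" "find-file")
;; => ((0 2 completions-common-part) (5 7 completions-common-part))
```

A frontend could then apply the intervals with its own faces or overlays,
without first parsing `face' properties back out of a string.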

On 11.08.2021 19:11, Daniel Mendler wrote:
> There are currently two issues with the patch with regards to backward
> compatibility. Fortunately they are fixable with a little effort.
> 
> 1. I would like to deprecate `completion-score' or remove it altogether,
>     but unfortunately `completion-score' is used in the wild. In order to
>     preserve `completion-score', bind `completion--filter-completions' in
>     the highlighting functions. Add `completion-score' in
>     `completion-pcm--hilit-commonality' when
>     `completion--filter-completions' is nil.

And third: I think completion-score could ultimately use the same 
treatment as 'highlight'. Meaning, being returned up the stack together 
with completions, so other bits of code could look up those values.

I don't have a clear picture of this yet, but see the recently filed 
bug#49888. If we want to be able to combine matching scores with recency 
scores, simply sorting the completions after matching is not going to 
cut it.

Not sure if this is something we can tackle now, but keeping this 
possible evolution in mind could help us make good choices in the 
current migration.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 06:28:02 GMT) Full text and rfc822 format available.

Message #74 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#48841: [PATCH] Add new `completion-filter-completions` API
 and deferred highlighting
Date: Sat, 14 Aug 2021 09:27:00 +0300

> Cc: 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru, joaotavora <at> gmail.com,
>  monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Thu, 12 Aug 2021 10:47:17 +0200
> 
> >> The `completions` value is the list of completion strings *without*
> >> applied highlighting.  The completion strings are returned unmodified,
> >> which avoids allocations and results in performance gains for
> > 
> > This is unclear: how can you return a list of strings which you
> > produce without allocating the strings?
> 
> The function 'completion-filter-completions' receives a completion table
> as argument.  The strings produced by this table are returned
> unmodified, but of course the completion table has to produce them.  For
> a static completion table (e.g., in the simplest case a list of strings)
> the completion table itself will not allocate strings.  In this scenario
> 'completion-filter-completions' will not perform any string allocations,
> only the list will be allocated.  This is what leads to major
> performance gains.

My point was that at least some of this should be in the description,
otherwise it will leave the reader wondering.

> >> +(defvar completion--filter-completions nil
> >> +  "Enable the new completions return value format.
> >
> > Btw, why is this an internal variable?  Shouldn't all completion
> > styles ideally support it?  If so, it should be a public variable,
> > documented in the ELisp manual.  And the name should also end with -p,
> > since it's a boolean.  How about completion-filter-completions-format-p?
> 
> (As I understood the style guide '-p' is not a good idea for boolean
> variables, since a value is not a predicate in a strict sense.)
> 
> To address your technical comment - this variable is precisely what one
> of the technical difficulties mentioned in my other mail is about.  The
> question is how we can retain backward compatibility of the completion
> style 'all' functions, e.g., 'completion-basic-all-completions', while
> still allowing the function to return the newly introduced alist format
> with more data, which enables 'completion-filter-completions' to perform
> the efficient deferred highlighting.

I understand, but given that we provide this for other packages, it
shouldn't be an internal variable.

> > Also, the "This function has been superseded..." part should be a new
> > paragraph, so that it stands out.  (And I'm not yet sure we indeed
> > want to say "superseded" here, but that's part of the on-going
> > discussion.  maybe use a more neutral language here, like "See also".)
> 
> The new API 'completion-filter-completions' will substitute the existing
> API 'completion-all-completions'.

That's your hope, and I understand.  But we as a project didn't yet
decide to deprecate the original APIs, so talking about superseding is
premature.

> > Is "filter" really the right word here (in the doc string)?  "Filter"
> > means you take a sequence and produce another sequence with some
> > members removed.  That's not what this API does, is it?  Suggest to
> > use a different name, like completion-completions-alist or
> > completion-all-completions-as-alist.
> 
> "Filter" seems like exactly the right word to me.  The function takes a
> list of strings (or a completion table) and returns a subset of matching
> completion strings without further modifications to the strings. See
> above what I wrote about allocations.

But the name says "filter completions".  Which would mean you take a
list of completions and filter out some of them.  A completion table
is much more general object than a list of strings.  Thus, I think
using "filter" here is sub-optimal.

> >> +Only the elements of table that satisfy predicate PRED are considered.
> >> +POINT is the position of point within STRING.  The METADATA may be
> >> +modified by the completion style.  The return value is a alist with
> >> +the keys:
> >> +
> >> +- base: Base position of the completion (from the start of STRING)
> > 
> > "Base" here means the beginning?  If so, why not call it "beg" or
> > somesuch?
> 
> Base position is a fixed term which is already used in minibuffer.el for
> completions.  See also 'completion-base-position' for example.

Well, we don't have to keep bad habits indefinitely.  It's okay to
lose them and use better terminology.  Or at least to explain that
terminology in parentheses the first time it is used in some context.

> > Are we really losing the completion-score property here?  If so, why?
> 
> Yes, the property is removed in the current patch.  It is not actually
> used for anything in the new implementation.  But it is possible to
> restore the property such that 'completion-all-completions' always
> returns scored candidates as it does now.  See my other mail regarding
> the caveats of the current patch.

I'd prefer not to lose existing features, because that'd potentially
make the changes backward-incompatible.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 06:46:02 GMT) Full text and rfc822 format available.

Message #77 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: dgutov <at> yandex.ru, monnier <at> iro.umontreal.ca, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: [PATCH VERSION 2] Add new `completion-filter-completions` API and
 deferred highlighting
Date: Sat, 14 Aug 2021 09:45:09 +0300

> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, João Távora
>  <joaotavora <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
>  Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Fri, 13 Aug 2021 12:38:28 +0200
> 
> I attached the overhauled patch, which addresses most of the comments by
> Eli.  In comparison to my last patch, the patch is fully backward
> compatible and preserves all existing tests.  As before, there are tests
> which check the new functionality for each existing completion style.

Thanks.  You were faster than me, so I sent a few more comments to the
old patch today.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 07:03:01 GMT) Full text and rfc822 format available.

Message #80 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 10:01:48 +0300

> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Fri, 13 Aug 2021 14:56:38 +0200
> Cc: 47711 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>,
>  48841 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
> 
> On 8/13/21 2:37 PM, João Távora wrote:
> > On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> > 
> >> It is important to keep this property since this will preclude many bugs
> >> due to string mutation.
> > 
> > I am aware of this, of course.  Can you give examples of these "many bugs"?
> > Perhaps other than the one I already described and addressed?
> 
> No, João, this is not how it goes.  I don't have to prove to you that
> your idea introduces bugs.  You have to show that mutation of the
> completion table strings (which are not supposed to be mutated) will not
> lead to bugs, which are hard to find.

Please calm down, both of you.  No one has to prove anything to anyone
here, that's not how Emacs development works.  We need to see which
idea is better, and if none is significantly better, we will probably
have both of them living side by side.

And while asking for an example of potential bugs is reasonable,
asking for a proof that a change will NOT lead to bugs isn't.  So how
about a couple of examples where having original strings unchanged is
important, which could then be discussed?

> >> Note that your idea also does not address the other issues which are
> >> addressed by my patch.
> > 
> > That's for sure.  My patch idea addresses only that single problem.
> > I think this is a good property of patches: to solve one thing, not many.
> 
> No, this is not necessarily true.  This is only good if the problem is
> solved in a way which is future proof.  The idea of mutating the strings
> is a hack and not a solution.

Just to make sure we are on the same page: adding a text property to a
string doesn't mutate a string.  Lisp programs that process these
strings will not necessarily see any difference, and displaying those
strings will also not show any difference if the property is not
related to display.  So the assumption that seems to be made here,
that adding a property is the same as mutating a string, is IMO
inaccurate if not incorrect.
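
What a Lisp program does and does not see after a property is added can be
sketched as follows.  The function name and the `my-match' property are
hypothetical; the comparison functions are the standard ones.

```elisp
(defun my-compare-after-propertizing ()
  "Compare a plain string with a copy carrying a non-display property."
  (let* ((plain (copy-sequence "car"))
         (marked (copy-sequence plain)))
    (put-text-property 0 3 'my-match t marked)
    (list (equal plain marked)          ; `equal' ignores text properties
          (string= plain marked)        ; so does `string='
          (equal-including-properties plain marked))))

;; (my-compare-after-propertizing) => (t t nil)
```

So code that compares or hashes the strings by content is unaffected, while
code that inspects or inserts the string object does observe the property.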

And once again: please tone down your responses, both of you.  TIA.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 07:14:02 GMT) Full text and rfc822 format available.

Message #83 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: mail <at> daniel-mendler.de, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 10:12:54 +0300

> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 14 Aug 2021 05:47:43 +0300
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
>  47711 <at> debbugs.gnu.org
> 
> I thought I explained the problem with this previously.
> 
> It's basically this: we cannot mutate what we don't own. Across all of 
> completion functions out there, there will be such that return "shared" 
> strings (meaning, not copied or newly allocated) from their completion 
> tables. And modifying them is bad, with consequences which can present 
> themselves in unexpected, often subtle ways.
> 
> Since up until now completion-pcm--hilit-commonality copied all strings 
> before modifying, completion tables such as described (with "shared" 
> strings) have all been "legal". Suddenly deciding to stop supporting 
> them would be a major API breakage with consequences that are hard to 
> predict. And while I perhaps agree that it's an inconvenience, I don't 
> think it's a choice we can simply make at this stage in c-a-p-f's 
> development.

This sounds like an argument against Daniel's approach as well, no?
Because if a completion API returns strings it "doesn't own", there
will be restrictions on Lisp programs that use those strings, because
those Lisp programs previously could do anything they liked with those
strings, and now they cannot.  Or am I missing something?

>    1. (setq s (symbol-name 'car))
> 
>    2. (put-text-property 1 3 'face 'error s)
> 
>    3. Switch to a buffer in fundamental mode
> 
>    4. (insert (symbol-name 'car)) --> see the error face in the buffer
> 
> Now imagine that some completion table collects symbol names by passing 
> obarray through #'symbol-name rather than #'all-completions, and voila, 
> if the completion machinery adds properties (any properties, not just 
> face) to those strings, you have just modified a bunch of global values. 
> That's not good.

How is this different from Daniel's proposal of returning the original
strings?  AFAIU, he just shifts the responsibility from the completion
code to the caller of the completion code, but basically leaves the
problem still very much real and pretty much into our face.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 07:17:01 GMT) Full text and rfc822 format available.

Message #86 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: mail <at> daniel-mendler.de, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 10:16:08 +0300

> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 14 Aug 2021 05:55:17 +0300
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
>  47711 <at> debbugs.gnu.org
> 
> On 13.08.2021 15:05, João Távora wrote:
> > If one removes these lines, the process becomes much faster, but there is a
> > problem with highlighting.  My idea is indeed to defer highlighting by not
> > setting the 'face property directly on that shared string, but some
> > other property
> > that is read later from the shared string by compliant frontends.
> 
> I haven't done any direct benchmarking, but I'm pretty sure that this 
> approach cannot, by definition, be as fast as the non-mutating one.

Daniel seems to think otherwise, AFAIU.

> Because you go through the whole list and mutate all of its elements, 
> you have to perform a certain bit of work W x N times, where N is the 
> size of the whole list.
> 
> Whereas the deferred-mutation approach will only have to do its bit 
> (which is likely more work, like, WWW) only 20 times, or 100 times, or 
> however many completions are displayed. And this is usually negligible.
> 
> However big the difference is going to be, I can't say in advance, of 
> course, or whether it's going to be shadowed by some other performance 
> costs. But the non-mutating approach should have the best optimization 
> potential when the list is long.

So I guess the suggestion to have a benchmark is still useful, because
the estimations of which approach has better performance vary between
you three.  So maybe producing such benchmarks would be a good step?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 08:24:01 GMT) Full text and rfc822 format available.

Message #89 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 09:23:25 +0100

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> Aside from the mutability/ownership issue,
>
> On 13.08.2021 15:05, João Távora wrote:
>> If one removes these lines, the process becomes much faster, but there is a
>> problem with highlighting.  My idea is indeed to defer highlighting by not
>> setting the 'face property directly on that shared string, but some
>> other property
>> that is read later from the shared string by compliant frontends.
>
> I haven't done any direct benchmarking, but I'm pretty sure that this
> approach cannot, by definition, be as fast as the non-mutating one.
>
> Because you go through the whole list and mutate all of its elements,
> you have to perform a certain bit of work W x N times, where N is the
> size of the whole list.

Let's call W the work that you perform N times in this situation.  In
the non-mutating approach, let's call it Z.  So

W <= Z, because Z not only propertizes the string with a calculation of
faces but _also copies its character contents_.

Also, I think it's better to talk about copying rather than mutating.
As Eli points out, putting a text property in a string (my idea) is not
equivalent to "mutating" it.

> Whereas the deferred-mutation approach will only have to do its bit
> (which is likely more work, like, WWW) only 20 times, or 100 times, or
> however many completions are displayed. And this is usually
> negligible.

I think you're falling into the same fallacy you briefly fell into in
the other bug report.  The flex and pcm styles (meaning
completion-pcm--hilit-commonality) have to go through all the completions
when deciding the score to attribute to each completion that we already
know matches the pattern.  That's because this scoring is essential to
sorting.  That's a given in both scenarios, copying and non-copying.

Then, it's true that only a very small set of those eventually have to
be displayed to the user, depending on where she wants her
scrolling page to be.  So that's when you have to apply 'face' to, say,
20 strings, and that can indeed be pretty fast.  But where does the
information come from?

- Currently, it comes from the string's 'face' itself, which was copied
  entirely.

- In the non-copying approach, it must come from somewhere else.  One
  idea is that it comes from a new "private" property 'lazy-face', also
  in the string itself, but which was _not_ copied.  Another idea is
  just to remember the pattern and re-match it to those 20 strings.

I think the second alternative is always faster.
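
To make the second alternative concrete, here is a minimal sketch, not
the code from any patch in this thread.  It assumes the pattern is a
plain regexp, and the names `my/filter' and `my/highlight-page' are
hypothetical.  Filtering touches no string at all; only the page that
is about to be displayed is copied and given a 'face'.

```elisp
(require 'seq)

(defun my/filter (regexp candidates)
  "Return the CANDIDATES matching REGEXP, without copying or propertizing."
  (seq-filter (lambda (c) (string-match-p regexp c)) candidates))

(defun my/highlight-page (regexp page)
  "Return highlighted copies of the strings in PAGE by re-matching REGEXP.
Only the strings in PAGE are copied; the originals stay untouched."
  (mapcar (lambda (c)
            (let ((copy (copy-sequence c)))
              (when (string-match regexp copy)
                (add-face-text-property (match-beginning 0) (match-end 0)
                                        'completions-common-part nil copy))
              copy))
          page))
```

A frontend would then call something like
(my/highlight-page "foo" (seq-take (my/filter "foo" all) 20)), so the
copying cost depends on the page size and not on the length of the
whole list.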

> However big the difference is going to be, I can't say in advance, of
> course, or whether it's going to be shadowed by some other performance
> costs. But the non-mutating approach should have the best optimization
> potential when the list is long.

Don't think so.  I'm doing benchmarks, will post soon.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 09:49:02 GMT) Full text and rfc822 format available.

Message #92 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, dgutov <at> yandex.ru,
 monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 10:48:28 +0100

[Message part 1 (text/plain, inline)]

Eli Zaretskii <eliz <at> gnu.org> writes:

>> > On Fri, Aug 13, 2021 at 1:22 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
>> > 
>> >> It is important to keep this property since this will preclude many bugs
>> >> due to string mutation.
>> > I am aware of this, of course.  Can you give examples of these "many bugs"?
>> > Perhaps other than the one I already described and addressed?
>> No, João, this is not how it goes.  I don't have to prove to you that
>> your idea introduces bugs.  You have to show that mutation of the
>> completion table strings (which are not supposed to be mutated) will not
>> lead to bugs, which are hard to find.
> Please calm down, both of you.  No one has to prove anything to anyone
> here, that's not how Emacs development works.  We need to see which
> idea is better, and if none is significantly better, we will probably
> have both of them living side by side.
>
> And while asking for an example of potential bugs is reasonable,
> asking for a proof that a change will NOT lead to bugs isn't.

As far as I remember, I have done the first. I found bugs and addressed
them. I cannot _prove_ that my change will not lead to bugs indeed:
no-one can with any change.  I've just stated repeatedly that I'm not
aware of any such bugs.  I can understand "intuition" for bugs to a
certain extent (everyone has intuition), but this intuition must always
resolve into actual reality to be useful in the end.

> So how
> about a couple of examples where having original strings unchanged is
> important, which could then be discussed?

Good idea, so in the absence of any controlled benchmarks I did some of
my own, using the most controlled environment I could devise.  I start
Emacs like so:

   ~/Source/Emacs/emacs/src/emacs -Q -f fido-mode -f fido-vertical-mode -l ~/tmp/benchmark.el ~/tmp/benchmark.el

I prime the obarray with lots of symbols to make completion purposely
slow:

   (require 'cl-lib)
   (cl-loop repeat 300000 do (intern (symbol-name (gensym))))

I attach the file. Then I try a run of 10 invocations of

  ;; Press C-u C-x C-e C-m quickly to produce a sample.  
  (benchmark-run (completing-read "" obarray))

This, I think, is a good representation of the responsiveness of the
completion system.  It always prints well after I finish typing, so I
don't think I'm inducing any artificial slowdowns while it waits for my
input.  When not measuring quantitatively, I also feel the difference in
responsiveness between different approaches.

Summarized results with an assortment of Emacs builds.

   - the current master (254dc6ab4ca8e6a549a795f9eaf45378ce51b61f).

     20.25 seconds total

   - Applying Daniel's patch over 254dc6.

     23.41 seconds total

   - The theoretical best situation where we don't highlight in
     completion-pcm--hilit-commonality (like 254dc6, but just removed
     the copy-sequence)

     10.70 seconds total

   - Experimental patch published in
     scratch/icomplete-lazy-highlight-attempt-2 (not finished, still
     needs a way for frontends to opt into the optimization).

     10.80 seconds total

I invite you all to reproduce these results.

In conclusion, I don't think Daniel's patch is going in the right
direction, *performance-wise*, for the kind of responsiveness scenarios
that I am concerned with, and which were discussed with Dmitry in
bug#48841.  It seems to slow down the process by about 15% (23.41 vs.
20.25 seconds).

Note 1: there may be *other* performance scenarios that I am not aware
of, where Daniel's patch excels.  I've requested these benchmarks,
regrettably without any success.

Note 2: this doesn't mean that there aren't *other* merits to Daniel's patch,
but I have not understood these yet.  That is due to the stated fact
that the patch is very long, and seems to comprise performance
improvements, refactorings, and API redesign.  It has no documentation
in the manual and/or examples on how to use the new API.

>> >> Note that your idea also does not address the other issues which are
>> >> addressed by my patch.
>> > 
>> > That's for sure.  My patch idea addresses only that single problem.
>> > I think this is a good property of patches: to solve one thing, not many.
>> 
>> No, this is not necessarily true.  This is only good if the problem is
>> solved in a way which is future proof.  The idea of mutating the strings
>> is a hack and not a solution.
>
> Just to make sure we are on the same page: adding a text property to a
> string doesn't mutate a string.  Lisp programs that process these
> strings will not necessarily see any difference, and displaying those
> strings will also not show any difference if the property is not
> related to display.  So the assumption that seems to be made here,
> that adding a property is the same as mutating a string, is IMO
> inaccurate if not incorrect.

Yes, in Lisp it is very common to attach a "private" property to an
object.  If no-one else knows about the existence of that property, then
attaching it is not harmful. Generally, of course: there are situations
where adding a private property brings side-effects to other parts of
the code.  But IMO that other code is in the wrong, not the one that
adds properties.

Also, to be clear, attaching a different property (as in, not 'face') to
the completion string is only _one_ of the ways to bypass
copying.  According to my measurements, performance doesn't seem to be
decided by property attachments, but by copying or not copying of the
character data of said strings in completion-pcm--hilit-commonality.
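
For anyone who wants to isolate just that cost, the following is a
rough micro-benchmark sketch.  It is not the benchmark.el attached to
this message; the function names and the `my/score' property are
hypothetical, and the timings will vary with the machine and the build.

```elisp
(require 'benchmark)

(defun my/make-candidates (n)
  "Return a list of N freshly allocated candidate strings."
  (let (out)
    (dotimes (i n (nreverse out))
      (push (format "candidate-%06d" i) out))))

(defun my/bench-copy (cands)
  "Copy every string in CANDS and set `face' on the copy."
  (dolist (c cands)
    (let ((copy (copy-sequence c)))
      (put-text-property 0 1 'face 'completions-common-part copy))))

(defun my/bench-private (cands)
  "Set only a private property on each string in CANDS, without copying."
  (dolist (c cands)
    (put-text-property 0 1 'my/score 1.0 c)))

;; Elapsed seconds for one pass of each approach over 100 000 strings.
(let ((cands (my/make-candidates 100000)))
  (list :copy (car (benchmark-run 1 (my/bench-copy cands)))
        :private (car (benchmark-run 1 (my/bench-private cands)))))
```

This only measures the per-string work, so it says nothing about
matching, scoring, or sorting, which both approaches have to do anyway.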

João

[benchmark.el (application/emacs-lisp, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 10:37:01 GMT) Full text and rfc822 format available.

Message #95 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 11:36:32 +0100

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> It's basically this: we cannot mutate what we don't own. Across all of
> completion functions out there, there will be such that return
> "shared" strings (meaning, not copied or newly allocated) from their
> completion tables. And modifying them is bad, with consequences which
> can present themselves in unexpected, often subtle ways.

I agree with this premise.  But would you call putting a uniquely named
text property in them a modification or mutation of said strings?  I
don't.

> Since up until now completion-pcm--hilit-commonality copied all
> strings before modifying, completion tables such as described (with
> "shared" strings) have all been "legal".

Again, if I take one of these shared strings, in whichever environment
is running right now, and I secretly attach a private "joaot/blargh"
property to it, there is a very, very low likelihood that it will hurt
anybody.

You seem to be worrying about re-setting the 'face' property (a public
property par excellence) and that's the very same bug I experienced and
described earlier.  It's not even a hard bug to see.  Just remove the
copy-sequence in `completion-pcm--hilit-commonality' and see strange
stuff happening.

But if you set some other property, _that_ bug _doesn't_ occur.  Do some
other bugs occur?  I don't know, I don't think we'll ever know, for
_any_ change.

Furthermore, there are other ways to forego the copying in
`completion-pcm--hilit-commonality' and not even touch _ANY_ string
property.

> Suddenly deciding to stop supporting them would be a major API
> breakage with consequences that are hard to predict. And while I
> perhaps agree that it's an inconvenience, I don't think it's a choice
> we can simply make at this stage in c-a-p-f's development.
>
> To give an example of a subtle consequence:
>
>   1. (setq s (symbol-name 'car))
>
>   2. (put-text-property 1 3 'face 'error s)
>
>   3. Switch to a buffer in fundamental mode
>
>   4. (insert (symbol-name 'car)) --> see the error face in the buffer

It's not even subtle :-) Yes, this is what I have seen from the
beginning in bug#48841.  I think it was even I who reported it to you.

The principle to follow can be summarized as this: "Don't touch values
of properties you don't own in objects you don't own."

So just don't touch the 'face' property in things you don't own!  But
feel free to touch the "dmitry/blargh" property even in objects you
don't own.

So 'c-p--h-l' doesn't "own" face.  So it must either create an object
that it owns or set something that it does own.  'completion-score' is
"owned" by 'c-p--h-l'.  Only it can write it (though others can read
it).
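
A small sketch of that principle, under the assumption that
`symbol-name' hands back the symbol's own name string and not a copy
(which is what makes these strings "shared" in the first place).  The
symbol and the `my/private' property are hypothetical.

```elisp
(defun my/shared-name-demo ()
  "Attach a private property to a symbol's name and report what changed."
  (let* ((sym (intern "my/demo-shared-symbol"))
         (name (symbol-name sym)))
    ;; Touch only a property we own; never `face'.
    (put-text-property 0 1 'my/private 42 name)
    (list :same-object (eq name (symbol-name sym))
          :private (get-text-property 0 'my/private (symbol-name sym))
          :face (get-text-property 0 'face (symbol-name sym))
          :equal (equal (symbol-name sym) "my/demo-shared-symbol"))))
```

The private property is visible to anyone who asks for it by name, but
'face' stays unset, so inserting the name in a buffer shows nothing
unusual.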

> Now imagine that some completion table collects symbol names by
> passing obarray through #'symbol-name rather than #'all-completions,
> and voila, if the completion machinery adds properties (any
> properties, not just face) to those strings, you have just modified a
> bunch of global values. That's not good.

Why?  Maybe I'm missing something.  Why is adding properties -- that
no-one but the completion machinery knows about -- to those shared
strings "not good"?  What bad thing can happen if I do?

> And in the example above, the values are those that the
> lispref/objects.texi says we should not change (though it gives
> (symbol-name 'cons) as example). "Not mutable", in its parlance. IIRC
> the related discussions mentioned that modifying such values could
> lead to a segfault in some previous Emacs versions. Maybe not anymore,
> but it's still not a good idea.

You're extrapolating "change" to also include attaching properties to
symbols.  I think that document just means that you can't do stuff like

    (aset "cons" 0 ?z)

or

    (aset (symbol-name 'cons) 0 ?z)

I don't think it means you can't

    (put-text-property 0 1 'joaot/muahahah 42 (symbol-name 'cons))

But maybe Eli or someone else more knowledgeable in the deep internals
of Emacs can correct me.

If indeed I'm wrong, there are other ways to forego the copying in
`c-p--hilit-commonality' and still not incur any such "mutation".
We must keep our eyes on the prize: copying -- not property-attaching --
is the real bummer here.

scratch/icomplete-lazy-highlight-attempt-2, although still incomplete,
is one such approach, though it still sets `completion-score` on the
"shared" string, used later for sorting.  But also that could be
prevented (again, only if it turns out to be actually problematic IMO).

João

PS: Maybe I've not stated it clearly enough: I *don't* object to -- or
endorse -- Daniel's patch.  My point was solely that it mixes too many
things for me to be intellectually able to review its functional merits,
and that those things should be separated into multiple problems and
patches to make this evaluation easier.  Maybe someone with superior
intellectual capacity can review it -- on substance -- as it stands.

See my other reply containing benchmarks.  Daniel's patch doesn't
perform well there, but for all I know, it can co-exist with my
non-copying approach, and we can all have our cake.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 11:23:02 GMT) Full text and rfc822 format available.

Message #98 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: mail <at> daniel-mendler.de, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 14:22:04 +0300

On 14.08.2021 10:12, Eli Zaretskii wrote:
>> From: Dmitry Gutov <dgutov <at> yandex.ru>
>> Date: Sat, 14 Aug 2021 05:47:43 +0300
>> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
>>   47711 <at> debbugs.gnu.org
>>
>> I thought I explained the problem with this previously.
>>
>> It's basically this: we cannot mutate what we don't own. Across all of
>> completion functions out there, there will be such that return "shared"
>> strings (meaning, not copied or newly allocated) from their completion
>> tables. And modifying them is bad, with consequences which can present
>> themselves in unexpected, often subtle ways.
>>
>> Since up until now completion-pcm--hilit-commonality copied all strings
>> before modifying, completion tables such as described (with "shared"
>> strings) have all been "legal". Suddenly deciding to stop supporting
>> them would be a major API breakage with consequences that are hard to
>> predict. And while I perhaps agree that it's an inconvenience, I don't
>> think it's a choice we can simply make at this stage in c-a-p-f's
>> development.
> 
> This sounds like an argument against Daniel's approach as well, no?
> Because if a completion API returns strings it "doesn't own", there
> will be restrictions on Lisp programs that use those strings, because
> those Lisp programs previously could do anything they liked with those
> strings, and now they cannot.  Or am I missing something?

Good question. It is not.

The completion tables described above have all been doing "legal" 
things, in our general understanding.

But any callers of completion-all-completions were never really allowed 
to modify the returned strings (those still were strings that code 
"doesn't own").

Of course, some of those callers (I don't know any, though) might have 
taken advantage of being able to modify the strings with impunity 
because of completion-all-completions' implementation detail, but 
they'll have a chance to clean up their act when switching to 
completion-filter-completions.

>>     1. (setq s (symbol-name 'car))
>>
>>     2. (put-text-property 1 3 'face 'error s)
>>
>>     3. Switch to a buffer in fundamental mode
>>
>>     4. (insert (symbol-name 'car)) --> see the error face in the buffer
>>
>> Now imagine that some completion table collects symbol names by passing
>> obarray through #'symbol-name rather than #'all-completions, and voila,
>> if the completion machinery adds properties (any properties, not just
>> face) to those strings, you have just modified a bunch of global values.
>> That's not good.
> 
> How is this different from Daniel's proposal of returning the original
> strings?  AFAIU, he just shifts the responsibility from the completion
> code to the caller of the completion code, but basically leaves the
> problem still very much real and pretty much into our face.

This is a shift of responsibility in the right direction. The callers 
might as well do the string copying when needed, but the fact of the 
matter is, they usually only need to "copy" 20-100 strings (or however 
many is displayed), if they need to modify them at all. That's where we 
win: copying 100 strings is better than copying 10000.

Gotta run now, will reply to other email later.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 11:30:02 GMT) Full text and rfc822 format available.

Message #101 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: mail <at> daniel-mendler.de, 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 14:29:03 +0300

> From: João Távora <joaotavora <at> gmail.com>
> Date: Sat, 14 Aug 2021 11:36:32 +0100
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
>  Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
>  47711 <at> debbugs.gnu.org
> 
> > And in the example above, the values are those that the
> > lispref/objects.texi says we should not change (though it gives
> > (symbol-name 'cons) as example). "Not mutable", in its parlance. IIRC
> > the related discussions mentioned that modifying such values could
> > lead to a segfault in some previous Emacs versions. Maybe not anymore,
> > but it's still not a good idea.
> 
> You're extrapolating "change" to also include attaching properties to
> symbols.  I think that document just means that you can't do stuff like
> 
>     (aset "cons" 0 ?z)
> 
> or
> 
>     (aset (symbol-name 'cons) 0 ?z)
> 
> I don't think it means you can't
> 
>     (put-text-property 0 1 'joaot/muahahah 42 (symbol-name 'cons))
> 
> But maybe Eli or someone else more knowledgeable in the deep internals
> of Emacs can correct me.

Text properties are stored separately from the string, so I don't
think adding properties can in general be referred to as "change".

Whether in some particular situation that could count as a "change"
depends on that situation and on the particular property, of course.
I'm not sure in the context of completion there's any reason to count
as "change" adding properties that don't affect display.
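
As a small illustration of why most Lisp code will not notice such a
property, here is a sketch with a hypothetical `my/private' property.
`equal' and `string=' compare only the characters, while
`equal-including-properties' is the predicate that does see the
difference.

```elisp
(defun my/property-visibility-demo ()
  "Compare a plain string with one carrying a private text property."
  (let* ((plain "cons")
         (tagged (propertize "cons" 'my/private 42)))
    (list (equal plain tagged)                         ; characters only
          (string= plain tagged)                       ; characters only
          (equal-including-properties plain tagged)))) ; properties too
```

So code that wants to be affected by the property has to go out of its
way to look for it.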

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 12:13:02 GMT) Full text and rfc822 format available.

Message #104 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: mail <at> daniel-mendler.de,
 João Távora <joaotavora <at> gmail.com>, 48841 <at> debbugs.gnu.org,
 dgutov <at> yandex.ru, monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 14:12:31 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

> Text properties are stored separately from the string, so I don't
> think adding properties can in general be referred to as "change".
>
> Whether in some particular situation that could count as a "change"
> depends on that situation and on the particular property, of course.
> I'm not sure in the context of completion there's any reason to count
> as "change" adding properties that don't affect display.

It is a destructive change, but we may just declare that completion
functions are allowed to destructively change the inputs in certain very
prescribed ways.  I'd rather avoid that, though, if at all possible,
because it may lead to subtle bugs all over the place.

Stefan just reminded me (in a different bug report) that we've long
meant to extend the text property machinery with a "namespace" or
"plane" concept.  The impetus for this is really the font locking
machinery which wants to manage some text properties that other packages
also want to manage.

The idea is that the display machinery would combine all the planes
before displaying, but each package would just manage its own "plane".
If we had something like this, then using this mechanism in the
completion context would make sense -- we could then say that completion
isn't allowed to alter anything except text properties in its private
plane.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 12:41:01 GMT) Full text and rfc822 format available.

Message #107 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: mail <at> daniel-mendler.de, joaotavora <at> gmail.com, 48841 <at> debbugs.gnu.org,
 dgutov <at> yandex.ru, monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 15:39:51 +0300

> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: João Távora <joaotavora <at> gmail.com>,
>   mail <at> daniel-mendler.de,
>   dgutov <at> yandex.ru,  monnier <at> iro.umontreal.ca,  48841 <at> debbugs.gnu.org,
>   47711 <at> debbugs.gnu.org
> Date: Sat, 14 Aug 2021 14:12:31 +0200
> 
> Stefan just reminded me (in a different bug report) that we've long
> meant to extend the text property machinery with a "namespace" or
> "plane" concept.  The impetus for this is really the font locking
> machinery which wants to manage some text properties that other packages
> also want to manage.
> 
> The idea is that the display machinery would combine all the planes
> before displaying, but each package would just manage its own "plane".
> If we had something like this, then using this mechanism in the
> completion context would make sense -- we could then say that completion
> isn't allowed to alter anything except text properties in its private
> plane.

How can that work if at display time all the "planes" are combined?
It would mean that the code which produced the original strings will
get them displayed differently after completion.

Anyway, I'm not sure why you are talking about display here: the
properties which are supposed to store the information about
completion aren't supposed to affect display.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 14 Aug 2021 13:30:02 GMT) Full text and rfc822 format available.

Message #110 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: mail <at> daniel-mendler.de, joaotavora <at> gmail.com, 48841 <at> debbugs.gnu.org,
 dgutov <at> yandex.ru, monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sat, 14 Aug 2021 15:29:03 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

> How can that work if at display time all the "planes" are combined?
> It would mean that the code which produced the original strings will
> get them displayed differently after completion.

That's in the font-lock context, where font-lock will do faces on its
"plane" while other packages can do faces on their own "planes", and
they'll be combined on display.

It's not relevant in the context of this bug report, but I thought I'd
just mention the general design.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sun, 15 Aug 2021 00:04:01 GMT) Full text and rfc822 format available.

Message #113 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, dgutov <at> yandex.ru,
 monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Sun, 15 Aug 2021 01:03:18 +0100

João Távora <joaotavora <at> gmail.com> writes:

> in the absence of any controlled benchmarks I did some of
> my own, using the most controlled environment I could devise.  I start
> Emacs like so:
>
>    ~/Source/Emacs/emacs/src/emacs -Q -f fido-mode -f fido-vertical-mode -l ~/tmp/benchmark.el ~/tmp/benchmark.el

I have now tweaked the benchmark slightly to make it easier to evaluate
speed qualitatively.  Here's what I've been using.

   (require 'cl-lib)

   (fido-mode 1)
   (fido-vertical-mode 1)

   ;; Introduce 150 000 new functions to really slow things down.
   ;; Probably more than most non-Spacemacs people will have :-)
   (defmacro lotsoffunctions ()
     `(progn
        ,@(cl-loop repeat 150000
                  collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))

   (lotsoffunctions)

   (when nil
     ;; Press C-u C-x C-e C-m quickly to produce a quantitative sample
     (benchmark-run (completing-read "" obarray))

     ;; Or just press C-h f to experience how fast/slow completion is.
     )

The results are the same as the ones I reported in the previous email.

I've also cleaned up my previous patch of the
scratch/icomplete-lazy-highlight-attempt-2 branch slightly.  It is now
fully opt-in for frontends and completion-styles, so the backward
compatibility problems which I speculated about seem to have been
exaggerated.

I'm still studying it for flaws, but anyone can have a look.  And, of
course, there are many different ways to realize the "opt-in" for
frontends/styles.  I just chose the one that seemed the simplest given
the current completion framework.

The performance is still very good, it reduces the usual waiting time in
long lists of completions to about half of what it currently is.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 03:18:01 GMT) Full text and rfc822 format available.

Message #116 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, João Távora
 <joaotavora <at> gmail.com>
Cc: mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 06:17:32 +0300

On 14.08.2021 14:29, Eli Zaretskii wrote:
> Text properties are stored separately from the string, so I don't
> think adding properties can in general be referred to as "change".

Are you thinking of C strings?

Lisp strings carry text properties in addition to the array of 
characters. It doesn't really matter where in the memory the properties 
and the characters reside.

> Whether in some particular situation that could count as a "change"
> depends on that situation and on the particular property, of course.

I was talking in the general sense: modifying a value.

One can talk about whether a certain modification matters in certain 
situations, but that's not the way to discount a general principle.

> I'm not sure in the context of completion there's any reason to count
> as "change" adding properties that don't affect display.

For the context in question, whether the properties affect display is 
not particularly important. Properties affecting display just make it 
easier to notice that something's wrong. Bugs involving other properties 
should be more difficult to investigate.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 03:22:02 GMT) Full text and rfc822 format available.

Message #119 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Lars Ingebrigtsen <larsi <at> gnus.org>, Eli Zaretskii <eliz <at> gnu.org>
Cc: mail <at> daniel-mendler.de, 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 48841 <at> debbugs.gnu.org, João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 06:21:16 +0300

On 14.08.2021 15:12, Lars Ingebrigtsen wrote:
> It is a destructive change, but we may just declare that completion
> functions are allowed to destructively change the inputs in certain very
> prescribed ways.  I'd rather avoid that, though, if at all possible,
> because it may lead to subtle bugs all over the place.

That would be a breaking change, but it's a possibility, of course.

If we couldn't find a better way to implement this.

> Stefan just reminded me (in a different bug report) that we've long
> meant to extend the text property machinery with a "namespace" or
> "plane" concept.  The impetus for this is really the font locking
> machinery which wants to manage some text properties that other packages
> also want to manage.

"Planes" for text properties are just prefixed properties, I guess? 
That's different from the situation with font-lock.

> The idea is that the display machinery would combine all the planes
> before displaying, but each package would just manage its own "plane".
> If we had something like this, then using this mechanism in the
> completion context would make sense -- we could then say that completion
> isn't allowed to alter anything except text properties in its private
> plane.

Yes, if the code makes sure to only use prefixed properties, that would 
limit the damage. It could still affect repeated (parallel?) uses of the 
same values in the same piece of code.

And even if the effects are usually not serious, are we really okay with 
evaluating (symbol-name 'car) someday and seeing lots of properties 
attached to it?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 03:28:01 GMT) Full text and rfc822 format available.

Message #122 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, Daniel Mendler <mail <at> daniel-mendler.de>
Cc: joaotavora <at> gmail.com, monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 06:26:58 +0300

On 14.08.2021 10:01, Eli Zaretskii wrote:

> Just to make sure we are on the same page: adding a text property to a
> string doesn't mutate a string.  Lisp programs that process these
> strings will not necessarily see any difference, and displaying those
> strings will also not show any difference if the property is not
> related to display.  So the assumption that seems to be made here,
> that adding a property is the same as mutating a string, is IMO
> inaccurate if not incorrect.

This is nonsense.

A program won't necessarily see a difference in *any* changed value, as 
long as some part of it stays the same.

I can zero out the tail of a string, and have a program that only looks 
at its first few characters. It wouldn't mean that a string hasn't changed.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 03:28:02 GMT) Full text and rfc822 format available.

Message #125 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Lars Ingebrigtsen <larsi <at> gnus.org>, Eli Zaretskii <eliz <at> gnu.org>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 04:27:24 +0100

On Mon, Aug 16, 2021 at 4:21 AM Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> And even if the effects are usually not serious, are we really okay with
> evaluating (symbol-name 'car) someday and seeing lots of properties
> attached to it?

I wouldn't mind that at all.  For me, it's quite the same as evaluating
(symbol-plist 'car) and seeing (is-vehicle t number-of-wheels 4) along with
all the other byte-compilation stuff already there.

João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 03:33:02 GMT) Full text and rfc822 format available.

Message #128 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 06:31:54 +0300

On 16.08.2021 06:27, João Távora wrote:
> I wouldn't mind that at all.  For me, it's quite the same as evaluating
> (symbol-plist 'car) and seeing (is-vehicle t number-of-wheels 4) along with
> all the other byte-compilation stuff already there.

Those serve a real purpose, not just work as an accidental cache for 
some earlier computation.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 03:49:02 GMT) Full text and rfc822 format available.

Message #131 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 06:48:32 +0300

On 14.08.2021 11:23, João Távora wrote:
> Dmitry Gutov <dgutov <at> yandex.ru> writes:
> 
>> Aside from the mutability/ownership issue,
>>
>> On 13.08.2021 15:05, João Távora wrote:
>>> If one removes these lines, the process becomes much faster, but there is a
>>> problem with highlighting.  My idea is indeed to defer highlighting by not
>>> setting the 'face property directly on that shared string, but some
>>> other property
> >>> that is read later from the shared string by compliant frontends.
>>
>> I haven't done any direct benchmarking, but I'm pretty sure that this
>> approach cannot, by definition, be as fast as the non-mutating one.
>>
>> Because you go through the whole list and mutate all of its elements,
>> you have to perform a certain bit of work W x N times, where N is the
>> size of the whole list.
> 
> Let's call W the work that you perform N times in this situation.  In
> the non-mutating case, let's call it Z.  So
> 
> W <= Z, because Z not only propertizes the string with a calculation of
> faces but _also copies its character contents_.

As I pointed out later in the email you're replying to, copying won't 
happen N times.

> Also I think it's better to start talking about copying rather than
> mutating.  As Eli points out, putting a text property in a string (my
> idea) is not equivalent to "mutating" it.

In common industry terms, that's mutation. Lisp strings are not C 
strings, they are aggregate objects.

>> Whereas the deferred-mutation approach will only have to do its bit
>> (which is likely more work, like, WWW) only 20 times, or 100 times, or
>> however many completions are displayed. And this is usually
>> negligible.
> 
> I think you're falling into the same fallacy you briefly fell into in
> the other bug report.  The flex and pcm styles (meaning
> completion-pcm--hilit-commonality) have to go through all the completions
> when deciding the score to attribute to each completion that we already
> know matches the pattern.  That's because this scoring is essential to
> sorting.  That's a given in both scenarios, copying and non-copying.

First of all, note that scoring is only essential to the 'flex' style. 
Whereas the improvements we're discussing should benefit all, and can be 
more pronounced if the scoring doesn't need to be performed.

But ok, let's talk about flex in particular.

> Then, it's true that only a very small set of those eventually have to
> be displayed to the user, depending on where she wants her
> scrolling page to be.  So that's when you have to apply 'face' to, say
> 20 strings, and that can indeed be pretty fast.  But where does the
> information come from?
> 
> - Currently, it comes from the string's 'face' itself, which was copied
>    entirely.
> 
> - In the non-copying approach, it must come from somewhere else.  One
>    idea is that it comes from a new "private" property 'lazy-face', also
>    in the string itself, but which was _not_ copied.  Another idea is
>    just to remember the pattern and re-match it to those 20 strings.

Let's say that the cost to compute the score (on one completion) is S. 
And the cost to highlight it is H. The cost to copy a string is C (that 
would be amortized cost, including the subsequent GCs).

The current algorithm costs: N x (C + S + H)

C is unavoidable because of the existing API guarantees.

A non-mutating algorithm can cost:

  N x S (for sorting)

  +

  100 x (C + S + H) (let's say we didn't even cache the scoring info)

...where 100 is the number of displayed completions (the number is 
usually lower).

As we measured previously, copying is quite expensive. Even in the 
above, not-caching approach we win ((N - 100) x (C + H)), and, okay, 
lose 100 x S. For high values of N, it should be a clear win.
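
To make the comparison concrete, here is a small sketch of the two cost 
formulas. The function names are hypothetical and the unit costs are 
made-up numbers, not measurements:

```elisp
;; Hypothetical helpers; N = all candidates, D = displayed candidates,
;; C = cost to copy, S = cost to score, H = cost to highlight.
(defun demo-cost-current (n c s h)
  "Cost of copying, scoring and highlighting all N candidates."
  (* n (+ c s h)))

(defun demo-cost-deferred (n d c s h)
  "Cost of scoring N candidates, then fully processing only D of them."
  (+ (* n s) (* d (+ c s h))))

;; Invented unit costs, chosen only to exercise the formulas.
(let ((n 10000) (d 100) (c 5) (s 2) (h 3))
  (- (demo-cost-current n c s h)
     (demo-cost-deferred n d c s h)))
;; => 79000, i.e. (N - D) x (C + H) - D x S
```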

> I think the second alternative is always faster.
> 
>> However big the difference is going to be, I can't say in advance, of
>> course, or whether it's going to be shadowed by some other performance
>> costs. But the non-mutating approach should have the best optimization
>> potential when the list is long.
> 
> Don't think so.  I'm doing benchmarks, will post soon.

I'm guessing you just skip the C step in your benchmarks? Which is 
something that breaks our current contract.

Still, Daniel's patch should provide a comparable performance 
improvement. If you're saying it doesn't give numbers as good, I'll have 
to give it a more thorough read and testing tomorrow to comment on that.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 03:55:01 GMT) Full text and rfc822 format available.

Message #134 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 04:53:59 +0100

On Mon, Aug 16, 2021 at 4:31 AM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>
> On 16.08.2021 06:27, João Távora wrote:
> > I wouldn't mind that at all.  For me, it's quite the same as evaluating
> > (symbol-plist 'car) and seeing (is-vehicle t number-of-wheels 4) along with
> > all the other byte-compilation stuff already there.
>
> Those serve a real purpose, not just work as an accidental cache for
> some earlier computation.

Caches also serve "a real purpose".  The gv-expander there
would be the "cache of an earlier computation". And I'm not
sure what "accidental" means, but if it means "implementation
detail for something I don't care about", I agree `completion-score`
is "accidental".  Should it be called
`completion-score-internal-cache-dont-look-here`?
Maybe.

Bottom line here is that an outside observer has no clue, and
shouldn't need or want to have a clue, about what "foreign" properties
attached to strings or symbols mean.  This is why Eli says, and
I agree, that if the property isn't display related, it's all good.  No-one
but the setter and reader of that particular property minds.  The CL
systems I've worked with use package qualifiers to fine-grain this
even further, but they use the same principle. That Elisp allows a
string property list doesn't really make a difference IMO.

And none of this really really matters to the discussion.  If we absolutely
had to store these associations away from the string plist, (for
some aesthetic reason, I guess), we could just use hash-tables.

Then we could return the string unchanged (and uncopied) and largely
keep performance intact.
But why do it, since a string plist is such a nice place to do these
associations and there's -- apart from your aesthetics considerations
-- 0 drawbacks identified?
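
For reference, the hash-table alternative mentioned above could look
roughly like this.  It is only a sketch: the table and function names
are hypothetical and not part of any posted patch.

```elisp
;; Hypothetical side table.  Keys are compared with `eq' (object
;; identity) and held weakly, so an entry can be collected once the
;; candidate string itself is no longer referenced elsewhere.
(defvar demo-score-table (make-hash-table :test #'eq :weakness 'key)
  "Maps candidate string objects to their completion scores.")

(defun demo-set-score (candidate score)
  "Remember SCORE for CANDIDATE without touching CANDIDATE itself."
  (puthash candidate score demo-score-table))

(defun demo-get-score (candidate)
  "Return the score recorded for CANDIDATE, or nil."
  (gethash candidate demo-score-table))

(let ((cand (copy-sequence "foobar")))
  (demo-set-score cand 0.75)
  (list (demo-get-score cand)
        (text-properties-at 0 cand)))
;; => (0.75 nil): the score is found and the string carries no properties
```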

João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 04:01:01 GMT) Full text and rfc822 format available.

Message #137 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 06:59:52 +0300

On 16.08.2021 06:53, João Távora wrote:
> But why do it, since a string plist is such a nice place to do these
> associations and there's -- apart from your aesthetics considerations
> -- 0 drawbacks identified?

You read all the explanations, and THAT'S your conclusion?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 04:21:01 GMT) Full text and rfc822 format available.

Message #140 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 05:20:25 +0100

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> As I pointed out later in the email you're replying to, copying won't
> happen N times.

_Currently_, as in origin/master,  it happens N times.

In my patch, when a frontend adheres to the thing, it happens D times
where D is the amount of strings you need to display.  I guess if you do
the work to adapt a frontend to work with the new API proposed in
Daniel's patch it will be about the same (though likely in many more
lines of Elisp).

> First of all, note that scoring is only essential to the 'flex'
> style. Whereas the improvements we're discussing should benefit all,
> and can be more pronounced if the scoring doesn't need to be performed.

Yep, I agree.  But I don't see why the same principles I espouse --
which really amount to: let the style know it doesn't need to
face-propertize -- can't be applied to other styles that don't need
scoring.  Although those don't seem to suffer from any performance
problems, at least I haven't seen any complaints/reports/measurements
like you did for bug#48841.

> But ok, let's talk about flex in particular.

Yes, I think that is important since it is the style known to be least
performant by its very lax nature.

> I'm guessing you just skip the C step in your benchmarks? Which is
> something that breaks our current contract.

Right.  Skipping the 'C = Copying step' is the whole point.  It breaks
our contract because the completion styles currently promise to
"face"-propertize the string.  So this is why I propose to let the
completion-style know that it doesn't need to.  When it is told of that,
it is relieved of the necessity of copying the string.  It is the
frontend that will do that copy just before face-propertizing and
displaying the string.  As you note, and reality shows, that's much
faster.  There is no disagreement here.
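
A minimal sketch of this division of labour follows.  It is not the code
from the patch posted to bug#48841; the property name `demo-lazy-face'
and both function names are hypothetical.

```elisp
;; Hypothetical names throughout; this only illustrates the idea.
(defun demo-style-annotate (candidate ranges)
  "Record match RANGES on CANDIDATE itself, without copying it.
RANGES is a list of (START . END) pairs.  CANDIDATE must be non-empty."
  (put-text-property 0 1 'demo-lazy-face ranges candidate)
  candidate)

(defun demo-frontend-render (candidate)
  "Return a copy of CANDIDATE with its recorded ranges highlighted."
  ;; `copy-sequence' copies the text properties along with the characters.
  (let ((copy (copy-sequence candidate)))
    (dolist (range (get-text-property 0 'demo-lazy-face copy))
      (add-face-text-property (car range) (cdr range) 'bold nil copy))
    copy))
```

The style calls something like `demo-style-annotate' on every matching
candidate, while the frontend calls `demo-frontend-render' only on the
handful of strings it is about to display.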

> Still, Daniel's patch should provide a comparable performance

Kind of, from what I've read, it _should_ open the way for that to
happen.  From what I understand, you must then change the frontend (in a
big way?) to stop using completion-all-completions and start using the
new thing.  That work has not been done, as far as I know.  Whereas in
my proposal (which is now a patch posted to bug#48841) you change the
frontend in a very minor way, and that work _has_ been done.

Icomplete was very easy to adapt.  I can try adapting company soon.

In practice, we can't kill off completion-all-completions and start
everyone on completion-filter-completions (if that's what it's called).
So if the latter does turn out to be a step in the right direction (I'm
mostly waiting on Stefan to chime in), then I also don't see why we
couldn't have, as Eli suggested, both strategies for lazy highlighting
at some point in the future.

> improvement. If you're saying it doesn't give numbers as good, I'll
> have to give it a more thorough read and testing tomorrow to comment
> on that.

It's not me who is saying it, it's my Emacs :-) The real problem is that
with Daniel's patch, the frontends using the current API (as in
icomplete/fido) measurably become _slower_.  Though not by much (around
10%), it is still a shame.

Yes, do your testing and please, as always, try to report as
quantitatively as possible, so that we can really compare apples to
apples.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 04:26:01 GMT) Full text and rfc822 format available.

Message #143 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 05:25:44 +0100

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> On 16.08.2021 06:53, João Távora wrote:
>> But why do it, since a string plist is such a nice place to do these
>> associations and there's -- apart from your aesthetics considerations
>> -- 0 drawbacks identified?
>
> You read all the explanations, and THAT'S your conclusion?

Yes, I currently conclude that there are 0 drawbacks identified to
creating, _via_ string properties, an association of, say, property
'joaot/answer' with value '42' to _any_ string in my current and future
Emacs runtimes.

I conclude that because --- apart from your aesthetics considerations
("do we really want to see...") --- I have not seen identification of
such drawbacks.  Have I missed any?

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 08:48:02 GMT) Full text and rfc822 format available.

Message #146 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 10:47:06 +0200

On 8/14/21 9:01 AM, Eli Zaretskii wrote:
> Just to make sure we are on the same page: adding a text property to a
> string doesn't mutate a string.  Lisp programs that process these
> strings will not necessarily see any difference, and displaying those
> strings will also not show any difference if the property is not
> related to display.  So the assumption that seems to be made here,
> that adding a property is the same as mutating a string, is IMO
> inaccurate if not incorrect.

While I agree that adding text properties is not mutation of the string
text itself it still mutates the string data structure. Dmitry made a
good point about this - if a completion table uses obarray as backend to
the completion table you suddenly attach text properties to symbol
names. This is not a good idea. I actually had such an issue once in a
package I developed, where I accidentally attached text properties (via
the mutating put-text-property API) to some strings I didn't own. This
led to unexpected side effects at other places where I didn't expect
the strings to have properties attached.
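
A small sketch of the obarray case, under the assumption that
`symbol-name' hands back the symbol's name string itself rather than a
copy (the symbol and property names are made up):

```elisp
;; Hypothetical demonstration; do not do this to real symbols.
(defun demo-leak-property-into-symbol-name ()
  "Attach a property to a symbol's name and read it back later."
  (let* ((sym (intern (copy-sequence "demo-leak-symbol")))
         (name (symbol-name sym)))      ; assumed to be shared, not copied
    ;; A completion style that propertizes its candidates in place
    ;; would do the equivalent of this to every matching symbol name.
    (put-text-property 0 (length name) 'demo-prop t name)
    ;; A later, unrelated caller asks for the name again.
    (get-text-property 0 'demo-prop (symbol-name sym))))
```

If the assumption holds, the later caller sees the property even though
it never asked for it.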

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 08:49:01 GMT) Full text and rfc822 format available.

Message #149 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>, Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com, 48841 <at> debbugs.gnu.org,
 monnier <at> iro.umontreal.ca
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 10:48:19 +0200

On 8/14/21 9:12 AM, Eli Zaretskii wrote:
>> Since up until now completion-pcm--hilit-commonality copied all strings 
>> before modifying, completion tables such as described (with "shared" 
>> strings) have all been "legal". Suddenly deciding to stop supporting 
>> them would be a major API breakage with consequences that are hard to 
>> predict. And while I perhaps agree that it's an inconvenience, I don't 
>> think it's a choice we can simply make as this stage in c-a-p-f's 
>> development.
> 
> This sounds like an argument against Daniel's approach as well, no?
> Because if a completion API returns strings it "doesn't own", there
> will be restrictions on Lisp programs that use those strings, because
> those Lisp programs previously could do anything they liked with those
> strings, and now they cannot.  Or am I missing something?

No, in my patch the displayed candidate strings are still copied before
the text properties are attached. The strings are kept intact as they
are now.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 08:54:01 GMT) Full text and rfc822 format available.

Message #152 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>,
 Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 10:53:31 +0200

On 8/16/21 6:20 AM, João Távora wrote:
> It's not me who is saying it, it's my Emacs :-) The real problem is that
> with Daniel's patch, the frontends using the current API (as in
> icomplete/fido) measurably become _slower_.  Though not by much (around
> 10%), it is still a shame.

I have to check this. I claim that 'completion-all-completions' should
not get slower with my patch. If it does indeed get slower as your
benchmark shows, this should be fixed and can be fixed since I am doing
nothing other than decomposing the highlighting and filtering
processes, which are already present in the current machinery. The
amount of work stays the same. However in the case the new
'completion-filter-completions' API is used, the filtering will get much
faster since no highlighting and copying takes place.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 09:09:02 GMT) Full text and rfc822 format available.

Message #155 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>,
 Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 11:08:31 +0200

On 8/16/21 6:25 AM, João Távora wrote:
> Yes, I currently conclude that there are 0 drawbacks identified to
> creating, _via_ string properties, an association of, say, property
> 'joaot/answer' with value '42' to _any_ string in my current and future
> Emacs runtimes.
> 
> I conclude that because --- apart from your aesthetics considerations
> ("do we really want to see...") --- I have not seen identification of
> such drawbacks.  Have I missed any?

There are serious drawbacks of attaching "private" string properties to
arbitrary strings. For one, it seriously complicates debugging if there
are suddenly string properties attached to symbol names. It also leads
to a potential memory leak. But even if we could choose this approach
with global side-effects, I don't see a reason to do it given the
approach I am proposing in my patch, which avoids these problems entirely.

1. My approach decomposes the already existing completion machinery into
two steps: 1. filtering and 2. highlighting. It does not change the
fundamentals of the completion machinery. This decomposition of the
existing infrastructure is also what leads to the small number of added
lines to minibuffer.el, even if the patch itself is not as small as we
would like to have it.

2. Why take the chances of introducing potentially harmful global side
effects by attaching string properties to arbitrary strings if we can
avoid it easily?

3. The `completion-filter-completions` API is the fastest possible API
for filtering since it does not change the strings at all, it does not
attach string properties or modify the strings in any other way. By
construction, it is a pure function which only filters the list of
candidates. This makes the function easy to use and easy to reason
about. An API which adds private properties to the strings cannot be as
fast and non-intrusive.

4. If 'completion-all-completions' does indeed get slower thanks to my
patch, it is a performance regression of my patch. I will fix this. And
I thank you João for bringing this to my attention. However one should
also consider that in the end, 'fido-mode' and 'icomplete-mode' should
move to the new API 'completion-filter-completions' such that they
benefit from the fast filtering and only copy and highlight the actually
displayed strings. Given this a potential regression of
'completion-all-completions' would matter less for the incrementally
updating UIs. But of course I feel the same way as João - a performance
regression in the API 'completion-all-completions' is unacceptable. It
will be fixed.
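
To illustrate the decomposition described in points 1 and 3, here is a
minimal sketch with a plain prefix match.  The function names are
hypothetical and this is not the code from my patch:

```elisp
(require 'seq)

;; Step 1 (hypothetical): pure filtering, no copying, no properties.
(defun demo-filter (prefix candidates)
  "Return the elements of CANDIDATES that start with PREFIX, uncopied."
  (let (result)
    (dolist (cand candidates (nreverse result))
      (when (string-prefix-p prefix cand)
        (push cand result)))))

;; Step 2 (hypothetical): copy and highlight only what is displayed.
(defun demo-highlight (prefix candidates count)
  "Return highlighted copies of the first COUNT elements of CANDIDATES."
  (mapcar (lambda (cand)
            (let ((copy (copy-sequence cand)))
              (put-text-property 0 (length prefix) 'face 'bold copy)
              copy))
          (seq-take candidates count)))
```

A UI would call the first function on the whole table and the second
one only on the candidates that fit on screen, so the strings owned by
the completion table are never modified.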

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 09:43:02 GMT) Full text and rfc822 format available.

Message #158 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, dgutov <at> yandex.ru
Subject: Re: bug#48841: [PATCH] Add new `completion-filter-completions` API
 and deferred highlighting
Date: Mon, 16 Aug 2021 11:42:22 +0200

Hello Eli,

here I respond to the comments you sent out after I've already sent the
overhauled patch.

On 8/14/21 8:27 AM, Eli Zaretskii wrote:
>> The function 'completion-filter-completions' receives a completion table
>> as argument.  The strings produced by this table are returned
>> unmodified, but of course the completion table has to produce them.  For
>> a static completion table (e.g., in the simplest case a list of strings)
>> the completion table itself will not allocate strings.  In this scenario
>> 'completion-filter-completions' will not perform any string allocations,
>> only the list will be allocated.  This is what leads to major
>> performance gains.
> 
> My point was that at least some of this should be in the description,
> otherwise it will leave the reader wondering.

I agree with this. The documentation should be improved.

>>>> +(defvar completion--filter-completions nil
>>>> +  "Enable the new completions return value format.
>>>
>>> Btw, why is this an internal variable?  Shouldn't all completion
>>> styles ideally support it?  If so, it should be a public variable,
>>> documented in the ELisp manual.  And the name should also end with -p,
>>> since it's a boolean.  How about completion-filter-completions-format-p?
>>
>> (As I understood the style guide '-p' is not a good idea for boolean
>> variables, since a value is not a predicate in a strict sense.)
>>
>> To address your technical comment - this variable is precisely what one
>> of the technical difficulties mentioned in my other mail is about.  The
>> question is how we can retain backward compatibility of the completion
>> style 'all' functions, e.g., 'completion-basic-all-completions', while
>> still allowing the function to return the newly introduced alist format
>> with more data, which enables 'completion-filter-completions' to perform
>> the efficient deferred highlighting.
> 
> I understand, but given that we provide this for other packages, it
> shouldn't be an internal variable.

No, we explicitly don't provide this variable for other packages. It is
explicitly only meant to be used for the existing completion styles
emacs21, emacs22, basic, substring, partial-completion, initials and
flex to opt-in in a backward-compatible/calling convention preserving
way to the alist return format. The idea is to keep the existing APIs
fully backward compatible.

Other packages should select the format returned from the completion
styles differently. They should return the alist format on Emacs 28 or
if the 'completion-filter-completions' API is present. In the not so
near future external packages which support only Emacs 28 and upwards
will then only return the alist format and don't have to perform any
detection anymore.
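
A rough sketch of such a detection in an external package.  The function
name is hypothetical, and 'completion-filter-completions' is the API
proposed in this patch, so the check only succeeds where it exists:

```elisp
;; Hypothetical helper for an external completion style.
(defun demo-style-return-format ()
  "Return `alist' if the new filtering API is available, else `list'."
  (if (fboundp 'completion-filter-completions)
      'alist
    'list))
```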

>>> Also, the "This function has been superseded..." part should be a new
>>> paragraph, so that it stands out.  (And I'm not yet sure we indeed
>>> want to say "superseded" here, but that's part of the on-going
>>> discussion.  maybe use a more neutral language here, like "See also".)
>>
>> The new API 'completion-filter-completions' will substitute the existing
>> API 'completion-all-completions'.
> 
> That's your hope, and I understand.  But we as a project didn't yet
> decide to deprecate the original APIs, so talking about superseding is
> premature.

It is not the hope - it is the explicit goal. The API has been designed
to replace the existing API 'completion-all-completions'. We can of
course tone this down. However I, as a package author, would appreciate
if Emacs tells me when a newer API aims to replace another API and when
the documentation is explicit about it. Of course if you decide to have
the doc strings written in a different tone, I will adapt my patch
accordingly. Here I am just explaining why I chose the word "superseded"
instead of a more neutral word.

>>> Is "filter" really the right word here (in the doc string)?  "Filter"
>>> means you take a sequence and produce another sequence with some
>>> members removed.  That's not what this API does, is it?  Suggest to
>>> use a different name, like completion-completions-alist or
>>> completion-all-completions-as-alist.
>>
>> "Filter" seems like exactly the right word to me.  The function takes a
>> list of strings (or a completion table) and returns a subset of matching
>> completion strings without further modifications to the strings. See
>> above what I wrote about allocations.
> 
> But the name says "filter completions".  Which would mean you take a
> list of completions and filter out some of them.  A completion table
> is much more general object than a list of strings.  Thus, I think
> using "filter" here is sub-optimal.

Okay, you are right about this. But I think even if the name
'completion-filter-completions' is not 100% precise it still conveys
what the API is about. 'completion-completions-alist' or
'completion-all-completions-as-alist' are valid names of course, but I
dislike them for their verbosity. Already 'completion-all-completions'
is quite verbose. A strong argument to use this long name is that the
completion style functions are still called
'completion-basic-all-completions' etc. But if we accept that the new
API 'completion-filter-completions' will actually supersede the existing
API 'completion-all-completions' it makes sense to use a name which will
not hurt our eyes in the long run. However this is of course just a
personal preference of mine. I don't want to spend much time on name
bikeshedding discussions. If you decide on a name, I will adapt my patch
accordingly.

>>>> +Only the elements of table that satisfy predicate PRED are considered.
>>>> +POINT is the position of point within STRING.  The METADATA may be
>>>> +modified by the completion style.  The return value is an alist with
>>>> +the keys:
>>>> +
>>>> +- base: Base position of the completion (from the start of STRING)
>>>
>>> "Base" here means the beginning?  If so, why not call it "beg" or
>>> somesuch?
>>
>> Base position is a fixed term which is already used in minibuffer.el for
>> completions.  See also 'completion-base-position' for example.
> 
> Well, we don't have to keep bad habits indefinitely.  It's okay to
> lose them and use better terminology.  Or at least to explain that
> terminology in parentheses the first time it is used in some context.

Okay, I agree. However I tried to avoid including superfluous changes
with my patch set. We should add more context and documentation and then
rename the variables in another patch if we decide that we want to go
through with it.

>>> Are we really losing the completion-score property here?  If so, why?
>>
>> Yes, the property is removed in the current patch.  It is not actually
>> used for anything in the new implementation.  But it is possible to
>> restore the property such that 'completion-all-completions' always
>> returns scored candidates as it does now.  See my other mail regarding
>> the caveats of the current patch.
> 
> I'd prefer not to lose existing features, because that'd potentially
> make the changes backward-incompatible.

The overhauled patch (version 2) ensures that no features are lost. The
patch is fully backward compatible. There is a potential performance
regression in 'completion-all-completions' as identified by João. I have
yet to confirm this regression. In any case, it should be fixable since
the refactored 'completion-all-completions' API does precisely the same
amount of work as it does now.

Furthermore more documentation should be added to my patch. As of now
'completion-all-completions' is not mentioned in the info manual and
'completion-filter-completions' is also not added there. We may want to
improve the documentation of that. But for now I would like to limit my
documentation improvements to expanding the doc strings of the functions
involved in my patch.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 10:16:02 GMT) Full text and rfc822 format available.

Message #161 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 47711 <at> debbugs.gnu.org,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 11:15:33 +0100

On Mon, Aug 16, 2021 at 10:09 AM Daniel Mendler <mail <at> daniel-mendler.de> wrote:

> There are serious drawbacks of attaching "private" string properties to
> arbitrary strings. For one, it complicates debugging seriously if there
> are suddenly string properties attached to symbol names. It also leads
> to a potential memory leak.

Please, in the name of the sanity of this discussion, justify these two
statements with examples or follow them with a clause like "because...".

> 3. The `completion-filter-completions` API is the fastest possible API

Again that's quite a statement that I cannot evaluate the veracity of
without hard proof.

What I _can_ evaluate is the material you have put forward, and using
your patch and your Vertico completion software, version 0.14, the very
latest one. I try

   emacs -Q -f package-initialize benchmark.el

where benchmark.el is:

   (setq completion-styles '(flex))
   (defmacro heyhey ()
     `(progn
        ,@(cl-loop repeat 300000
           collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))
   (heyhey)

I then turn on vertico-mode and press C-h f.  It takes about 4-5 seconds.
It's *faster* than if I do the same with fido-vertical-mode and the current
master, but is noticeably *slower* than if I do the same with the patch
provided earlier and available at scratch/icomplete-lazy-highlight-attempt-2.

Unfortunately, I cannot measure quantitatively here, because I don't
know how to tell Vertico to wait until it gets the correct result.
In other words, take this form:

    (completing-read "bla" obarray)<cursor here>

if you type C-u C-x C-e C-m veeery s-l-o-w-l-y in Vertico, it prints,
correctly, the character "%".  But if you evaluate it quickly wrapped
in a benchmark-run, it returns immediately and prints "".

In fido-mode, it always blocks until it prints the correct result
and the time it took it to achieve that result.  Not questioning if this is
a bug in Vertico, but it would help if it could do the same, or be
configured to do the same, so that we can measure.

Without that, we can't evaluate the speed of Vertico where,
presumably, the fastest API in the world is being used right now.
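
A frontend-independent number can still be obtained by timing the
existing API directly.  The following is only a sketch: the helper name
is made up for illustration, and it measures
`completion-all-completions' alone, so it says nothing about the code
path of Vertico or of any other UI.

```elisp
(require 'benchmark)

(defun my-sketch-time-all-completions (input)
  "Return the seconds needed to compute all completions of INPUT.
Hypothetical helper for illustration.  It completes against `obarray'
with the `flex' style and times `completion-all-completions' only."
  (let ((completion-styles '(flex)))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (car (benchmark-run 1
           (completion-all-completions
            input obarray nil (length input))))))

;; Example call: (my-sketch-time-all-completions "heyhey")
```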

> 4. If 'completion-all-completions' does indeed get slower thanks to my
> patch, it is a performance regression of my patch. I will fix this. And
> I thank you João for bringing this to my attention. However one should
> also consider that in the end, 'fido-mode' and 'icomplete-mode' should
> move to the new API 'completion-filter-completions' such that they
> benefit from the fast filtering and only copy and highlight the actually
> displayed strings.

Maybe they will, maybe they will.  But it's still quite early to decide if
we're going to move all frontends to that API.  There's no manual
documentation for it. Conceivably, if you appreciate your API so, you
could demonstrate to us in practice how easy it is to use by providing
a separate patch that adapts icomplete-mode and fido-mode to use it.

Then, I or other fido-mode users could test it for a while, evaluating
its speed and correctness.  If it performs well and the completion
API architects have a good outlook for it, I see no reason why it
shouldn't be merged and eventually supersede the old one.

In the meantime, there is a patch with a measured and documented
performance boost where fido-mode and icomplete-mode move
opt-in to a new `completion-lazy-hilit` feature in the "old" API with
a total of four 1-line changes.  That patch lives in the branch
scratch/icomplete-lazy-highlight-attempt-2.

I think we should move to that, solving the bug#48841 while we
do the evaluation of all aspects of your contributions.

João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 10:54:02 GMT) Full text and rfc822 format available.

Message #164 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Eli Zaretskii <eliz <at> gnu.org>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 12:52:58 +0200

On 8/16/21 12:15 PM, João Távora wrote:
> On Mon, Aug 16, 2021 at 10:09 AM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> 
>> There are serious drawbacks of attaching "private" string properties to
>> arbitrary strings. For one, it complicates debugging seriously if there
>> are suddenly string properties attached to symbol names. It also leads
>> to a potential memory leak.
> 
> Please, in the name of the sanity of this discussion, justify these two
> statements with examples or follow them with a clause like "because...".

João, I am giving hard examples here. What is not an example about
"memory leak" or making debugging output verbose thanks to the attached
string properties? You always repeat your demands but whatever I write
it is not satisfactory for you. Is it possible to convince you? Can you
try to interpret my arguments in a positive and constructive light
somehow and fill them with meaning such that it makes sense to you? My
goal is not to be right here just for the sake of being right. As Eli
said, nobody has to prove anything.

>> 3. The `completion-filter-completions` API is the fastest possible API
> 
> Again that's quite a statement that I cannot evaluate the veracity of
> without hard proof.

As I said, I will ensure that my API does not introduce performance
regressions. And since my new API performs strictly less work than your
proposal it will necessarily be faster if you consider only the
filtering, which is what matters for incrementally updating UIs.

>> 4. If 'completion-all-completions' does indeed get slower thanks to my
>> patch, it is a performance regression of my patch. I will fix this. And
>> I thank you João for bringing this to my attention. However one should
>> also consider that in the end, 'fido-mode' and 'icomplete-mode' should
>> move to the new API 'completion-filter-completions' such that they
>> benefit from the fast filtering and only copy and highlight the actually
>> displayed strings.
> 
> Maybe they will, maybe they will.  But it's still quite early to decide if
> we're going to move all frontends to that API.  There's no manual
> documentation for it. Conceivably, if you appreciate your API so, you
> could demonstrate to us in practice how easy it is to use by providing
> a separate patch that adapts icomplete-mode and fido-mode to use it.

There is also no manual documentation of 'completion-all-completions'.
Of course you are right that it is early to make these decisions, but
the new API 'completion-filter-completions' is designed in a way to
allow adoption by 'icomplete-mode'. It is important to design the API
such that adoption is possible. I have a patch for Vertico which shows
this. I can provide patches for 'icomplete-mode' separately later on.

> In the meantime, there is a patch with a measured and documented
> performance boost where fido-mode and icomplete-mode move
> opt-in to a new `completion-lazy-hilit` feature in the "old" API with
> a total of four 1-line changes.  That patch lives in the branch
> scratch/icomplete-lazy-highlight-attempt-2.

As argued multiple times here now, the change you are proposing is
fragile. It will lead to problems later on. The goal is not to find the
smallest change which leads to a performance boost, all API violations
allowed. Attaching "private" string properties to arbitrary strings is
an API violation which we will regret later and which will make Emacs
harder to debug and harder to use.

> I think we should move to that, solving the bug#48841 while we
> do the evaluation of all aspects of your contributions.

No, we should not merge this problematic patch of yours. See the many
arguments against this proposal.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 11:38:01 GMT) Full text and rfc822 format available.

Message #167 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Eli Zaretskii <eliz <at> gnu.org>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 12:37:17 +0100

On Mon, Aug 16, 2021 at 11:53 AM Daniel Mendler <mail <at> daniel-mendler.de> wrote:

> >> There are serious drawbacks of attaching "private" string properties to
> >> arbitrary strings. For one, it complicates debugging seriously if there
> >> are suddenly string properties attached to symbol names. It also leads
> >> to a potential memory leak.
> > Please, in the name of the sanity of this discussion, justify these two
> > statements with examples or follow them with a clause like "because...".
> João, I am giving hard examples here.

If I say to you:  "It's quite obvious your patch breaks all Git repositories
in Kathmandu, Nepal"  I am expected to demonstrate how a patch to
Emacs leads to an obscure security flaw in the Linux operating system,
that tickles a butterfly at a certain angle that causes an earthquake
in the Kathmandu data center, literally breaking their Git repositories.

This is the kind of statement you are making to me and the kind
of logical reasoning I'm looking for.

Alternatively, you can provide an actual experiment I can run in
my computer that demonstrates the bug.

I am not a native English speaker, and maybe you don't understand
my language.  Another way to explain what I am talking about is to talk about
"bug reproduction".  You say there's a bug in my patch, I am asking you
for a "bug reproduction recipe" as defined by most,  if not all, the results
you get by searching "bug reproduction recipe" in the Google search engine.

> goal is not to be right here just for the sake of being right. As Eli
> said, nobody has to prove anything.

This is clearly not what he said.

> >> 3. The `completion-filter-completions` API is the fastest possible API
> >
> > Again that's quite a statement that I cannot evaluate the veracity of
> > without hard proof.
>
> As I said, I will ensure that my API does not introduce performance
> regressions.

Not only that, to establish the veracity of that statement you would need
some much more demanding proof, which is to somehow be able to
evaluate all possible APIs and compellingly demonstrate that yours
triumphs.

>  I have a patch for Vertico which shows
> this. I can provide patches for 'icomplete-mode' separately later on.

Yes, please do. The earlier, the better.

> No, we should not merge this problematic patch of yours. See the many
> arguments against this proposal.

I'm sorry to speak this child-like language, but a problem is a "bad thing".
An undesirable thing that happens when presumably safe and good
action(s) is taken by some user.  Can you explain how, given my patch,
a user would take a sequence of innocent actions that would lead to a
"bad thing" that would _not_ happen if the same sequence of innocent
actions were taken  in a version of Emacs without the patch applied?
That, to me, is what constitutes "a bug/problem in the patch".

Let me give you an example: if I make a patch that deletes `/lisp` in
the Emacs source tree, the innocent action "make"  would probably
not work.  That would be the problem/"bad thing"/bug in that patch.

We cannot proceed with this discussion without these explanations.  Mind
you, I'm not stating, because it is impossible to prove, that my
patch cannot possibly generate problems, subtle or obvious (that, by
the way, is my interpretation of what Eli meant). But since you so
confidently state that it does, it's quite reasonable that I ask you
for examples that demonstrate it.

Once you do demonstrate these bugs, it's reasonable I will go about
fixing them.  Exactly as you say you are going to do.  I demonstrated
with code and numbers a regression in _your_ patch, and you say you are
going to fix it.  That's great, and that's the way it should be.  But
you possibly wouldn't go about fixing it if I hadn't demonstrated the
regression compellingly, just as I can't go about fixing a "memory leak"
or "debugging difficulties" if you don't explain what these things mean
to you or how my patch causes them.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 11:48:02 GMT) Full text and rfc822 format available.

Message #170 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: mail <at> daniel-mendler.de, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 14:46:58 +0300

> Cc: mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org,
>  47711 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Mon, 16 Aug 2021 06:17:32 +0300
> 
> On 14.08.2021 14:29, Eli Zaretskii wrote:
> > Text properties are stored separately from the string, so I don't
> > think adding properties can in general be referred to as "change".
> 
> Are you thinking of C strings?

No, about the implementation of Lisp strings in Emacs.

> Lisp strings carry text properties in addition to the array of 
> characters. It doesn't really matter where in the memory the properties 
> and the characters reside.

Well, it does, at least in some situations.  The string text is not
affected, and so the code which processes the string will not notice
that it has a property about which that code has no idea.  Only
properties that are known to the processing code can affect it;
non-standard properties private to some other code will generally pass
unnoticed.

> > Whether in some particular situation that could count as a "change"
> > depends on that situation and on the particular property, of course.
> 
> I was talking in the general sense: modifying a value.
> 
> One can talk about whether a certain modification matters in certain 
> situations, but that's not the way to discount a general principle.

I didn't want to start a general philosophical discussion about string
mutability.  I hoped to provide input of specific practical use in the
context of this discussion.  If what I said is not useful, just
disregard it.

> > I'm not sure in the context of completion there's any reason to count
> > as "change" adding properties that don't affect display.
> 
> For the context in question, whether the properties affect display is 
> not particularly important. Properties affecting display just make it 
> easier to notice that something's wrong. Bugs involving other properties 
> should be more difficult to investigate.

Once again, if some code invents its private property, not used
anywhere else and not documented anywhere else, then putting such a
property on a string has very high chances of going unnoticed.  I hope
this consideration helps this discussion, because saying that
properties change a string blurs the distinction between actually
changing the string text or its properties that affect many parts in
Emacs, and adding some obscure property that is not known to anyone.
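
The distinction discussed here can be seen in a few lines.  This is a
sketch only; the property name `my-sketch-private' is invented for the
example.  `equal' and `string=' compare only the text of two strings,
whereas `equal-including-properties' also compares their text
properties.

```elisp
(defun my-sketch-compare-tagged (text)
  "Compare TEXT with a copy that carries a made-up private property.
Return a list of four values: the results of `equal', `string=' and
`equal-including-properties', and the property value on the copy."
  (let ((tagged (copy-sequence text)))
    ;; `my-sketch-private' is a hypothetical property name.
    (put-text-property 0 (length tagged) 'my-sketch-private t tagged)
    (list (equal text tagged)
          (string= text tagged)
          (equal-including-properties text tagged)
          (get-text-property 0 'my-sketch-private tagged))))

;; (my-sketch-compare-tagged "forward-char")  ; → (t t nil t)
```

Code that only looks at the text sees no difference, while code that
inspects or prints the properties does.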

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 11:49:01 GMT) Full text and rfc822 format available.

Message #173 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: mail <at> daniel-mendler.de, joaotavora <at> gmail.com, monnier <at> iro.umontreal.ca,
 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 14:48:39 +0300

> Cc: 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, joaotavora <at> gmail.com,
>  48841 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Mon, 16 Aug 2021 06:26:58 +0300
> 
> On 14.08.2021 10:01, Eli Zaretskii wrote:
> 
> > Just to make sure we are on the same page: adding a text property to a
> > string doesn't mutate a string.  Lisp programs that process these
> > strings will not necessarily see any difference, and displaying those
> > strings will also not show any difference if the property is not
> > related to display.  So the assumption that seems to be made here,
> > that adding a property is the same as mutating a string, is IMO
> > inaccurate if not incorrect.
> 
> This is nonsense.
> 
> A program won't necessarily see a difference in *any* changed value, as 
> long as some part of it stays the same.
> 
> I can zero out the tail of a string, and have a program that only looks 
> at its first few characters. It wouldn't mean that a string hasn't changed.

You are not making any sense.

Anyway, if what I wrote doesn't help, feel free to disregard it.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 11:58:01 GMT) Full text and rfc822 format available.

Message #176 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, dgutov <at> yandex.ru
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 14:57:01 +0300

> Cc: joaotavora <at> gmail.com, monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org,
>  47711 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 10:48:19 +0200
> 
> On 8/14/21 9:12 AM, Eli Zaretskii wrote:
> >> Since up until now completion-pcm--hilit-commonality copied all strings 
> >> before modifying, completion tables such as described (with "shared" 
> >> strings) have all been "legal". Suddenly deciding to stop supporting 
> >> them would be a major API breakage with consequences that are hard to 
> >> predict. And while I perhaps agree that it's an inconvenience, I don't 
> >> think it's a choice we can simply make at this stage in c-a-p-f's 
> >> development.
> > 
> > This sounds like an argument against Daniel's approach as well, no?
> > Because if a completion API returns strings it "doesn't own", there
> > will be restrictions on Lisp programs that use those strings, because
> > those Lisp programs previously could do anything they liked with those
> > strings, and now they cannot.  Or am I missing something?
> 
> No, in my patch the displayed candidate strings are still copied before
> the text properties are attached. The strings are kept intact as they
> are now.

I was talking about the infrastructure that produces the completion
candidates, not about the application that uses them.  My point is
that your approach requires the applications using the candidates to
copy them, whereas previously they could use them without copying.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 12:03:02 GMT) Full text and rfc822 format available.

Message #179 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, 47711 <at> debbugs.gnu.org,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 13:02:04 +0100

On Mon, Aug 16, 2021 at 12:57 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> I was talking about the infrastructure that produces the completion
> candidates, not about the application that uses them.  My point is
> that your approach requires the applications using the candidates to
> copy them, whereas previously they could use them without copying.

If it helps, I think that that is true of all alternatives presented
so far (though I haven't read the big patch fully yet). The difference is
that the consumers who copy the candidate strings will only copy a much
smaller number, typically only the ones that need to be displayed.
Whereas currently, all candidate strings are copied, displayed or not.

João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 12:07:01 GMT) Full text and rfc822 format available.

Message #182 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Eli Zaretskii <eliz <at> gnu.org>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 14:05:54 +0200

João, the discussion is clearly not progressing. I propose that we both
take a step back and let the Emacs maintainers, who participated in this
discussion, decide on how to proceed. It seems to me that all arguments
and data have been presented and there is no need for further
reiterations in more and more colorful language. I would also like to
point out that implying that I don't understand your language is
borderline acceptable. I understand the discussion very well, but I
don't understand why you are using these unfair and invalid means of
discussion.

For example there could be these decision outcomes:

1. The information presented up to now does not allow the maintainers to
make a decision. For example the maintainers may ask for further
clarification from you, João, or they may ask for benchmarks from me or
a proof that my patch does not lead to performance regressions.

2. The maintainers decide that no patch should be merged.

3. The maintainers decide that your patch will be merged. I will accept
this decision.

4. The maintainers decide that my patch will be merged. You will accept
this decision.

5. The maintainers decide that both patches will be merged such that
both approaches will be supported. We both will accept this decision.

I want to summarize the situation in the following:

The patches in question address a performance issue in the current
completion machinery which is caused by over-eager copying of the
completion candidate strings and over-eager application of the
highlighting to all candidate strings. For incrementally updating UIs it
would be sufficient to only copy and highlight the strings which are
actually going to be displayed.

My patch takes the approach to expose the existing two-step completion
process, which consists of filtering and highlighting. By returning the
filtered completion strings and a highlighting function this two-step
process is decomposed and the caller of the API has the ability to call
the highlighting function on only the displayed subset of completion
candidates. I argue that exposing the filtering and highlighting
processing steps is the logical and natural conclusion of the existing
machinery.
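
The two-step idea can be sketched with existing primitives only.  This
is not the code from the patch: the function names are hypothetical,
the matching is plain prefix matching via `all-completions', and the
`bold' face merely stands in for real match highlighting.

```elisp
(require 'seq)

(defun my-sketch-filter (input table)
  "Step 1: return the candidates in TABLE that start with INPUT.
The returned strings are not copied and must not be modified."
  (all-completions input table))

(defun my-sketch-highlight (input candidates)
  "Step 2: return highlighted copies of CANDIDATES.
Each candidate is copied first, so the strings owned by the
completion table stay untouched."
  (mapcar (lambda (cand)
            (let ((copy (copy-sequence cand)))
              (put-text-property 0 (length input) 'face 'bold copy)
              copy))
          candidates))

(defun my-sketch-displayed (input table count)
  "Filter TABLE by INPUT, then copy and highlight only COUNT candidates."
  (my-sketch-highlight input
                       (seq-take (my-sketch-filter input table) count)))

;; Example call: (my-sketch-displayed "fo" '("foo" "fob" "bar") 1)
```

Only the COUNT displayed candidates are copied and given a face; the
remaining matches are left exactly as the table returned them.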

My patch is fully backward compatible and aims to not introduce any
regressions (also no performance regressions) to the existing API.
Furthermore my patch adheres to the current guarantees given by the
existing 'completion-all-completions' API. The completion strings
provided by the completion backend are not mutated in any way, no string
properties are attached. Since the API 'completion-filter-completions'
proposed in my patch does the minimal amount of work necessary (only
filtering), if no highlighting is requested, I argue that the new API is
the most efficient possible API, given the current constraints.

Furthermore since I am introducing a new API, outstanding issues can be
solved which could not be solved until now given the constraints of the
existing 'completion-all-completions' API. In particular the new API
'completion-filter-completions' returns additional data like the end
position of the completion boundaries. Until now the end position was
not made available and 'completion-base-position' just used the length
of the input to guess the end position. In a strict sense this guess is
incorrect and there is a FIXME in minibuffer.el, mentioning this issue.
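
To illustrate what the start and end of the completion boundaries mean,
here is a small sketch using the existing function
`completion-boundaries' with the file name completion table.  The
wrapper name is made up for the example, and it is not how the patch
computes the end position.

```elisp
(defun my-sketch-file-boundaries (before after)
  "Return the completion boundaries for file name text around point.
BEFORE is the text before point, AFTER the text after it.  The result
is (START . END): START counts from the beginning of BEFORE, END
counts from the beginning of AFTER."
  (completion-boundaries before #'completion-file-name-table nil after))

;; (my-sketch-file-boundaries "/usr/sh" "are/doc")  ; → (5 . 3)
;; The field being completed is "sh" + "are": it starts after "/usr/"
;; and ends before the next "/" following point.
```

The length of the input alone cannot yield that END value, because it
depends on the text after point.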

The downside of my patch is that it is a large patch. While it adds only
196 lines of code, which is not much and expected given that it only
reshuffles the existing machinery, it is still a large patch in total.

On the other side, João's patch avoids the complication of adhering to
the existing guarantees of the APIs and takes the liberty of attaching
"private" string properties to the completion strings of the completion
table backend. I argue that attaching the string properties is a
violation of the guarantees of the existing API and violates the
expectations of the existing many completion tables. One very severe
scenario is when obarray is used as completion table, since then each
symbol name gets a private property attached. I argue that such global
side effects like attaching string properties to all completion
candidates should better be avoided. There is the issue that the
attached string properties are a potential memory leak. When dumping the
string representation of symbol names, the symbols will have additional
properties which will complicate the debugging experience. Furthermore
it will lead to confusion since the global side effect during completion
will suddenly have an influence on symbols which have nothing to do
with completion. The big advantage of João's patch is that it
is very limited in scope and very simple. However I argue that this
simplicity comes at a price and we will regret this approach later due to the
global side effects.

Therefore I conclude that the two-step process proposed in my patch,
which does not introduce problematic global side effects is the better
approach forward. Furthermore a new API is needed such that more
completion data can be returned, e.g., the completion end position. One
could even return additional match data in the future given that the new
API 'completion-filter-completions' is extensible thanks to its alist
return value.

João, please feel free to also present your closing arguments here.

Daniel

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 12:09:02 GMT) Full text and rfc822 format available.

Message #185 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, dgutov <at> yandex.ru
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 14:08:33 +0200

On 8/16/21 1:57 PM, Eli Zaretskii wrote:
> I was talking about the infrastructure that produces the completion
> candidates, not about the application that uses them.  My point is
> that your approach requires the applications using the candidates to
> copy them, whereas previously they could use them without copying.

Okay, I understand. Yes, in my patch the strings returned by
'completion-filter-completions' must not be mutated by the API consumer
directly. This should be documented clearly, but it is not unexpected.
For example the API 'all-completions' which one can use to obtain the
strings from a completion table also requires the caller to not mutate
the returned strings.

Daniel
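
As an illustration of the "do not mutate the returned strings" rule, here is
a minimal sketch of how a consumer might decorate a candidate obtained from
`all-completions'. The table contents are made up; the point is only that
the copy is propertized and the string held by the table is left alone.

```elisp
(let* ((table (list "alpha" "alphabet" "beta"))
       (matches (all-completions "al" table))
       ;; Decorate a copy, never the string handed out by the table.
       (shown (copy-sequence (car matches))))
  (put-text-property 0 2 'face 'bold shown)
  (list (text-properties-at 0 shown)          ; properties on the copy
        (text-properties-at 0 (car table))))  ; the table's own string
```

This should evaluate to ((face bold) nil): the copy carries the face and the
string in the table carries no properties.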

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 12:19:01 GMT) Full text and rfc822 format available.

Message #188 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Eli Zaretskii <eliz <at> gnu.org>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 13:17:56 +0100

> João, please feel free to also present your closing arguments here.

Sorry, I don't "close" arguments like this.

I hope you can provide:

* the fixes to the regression identified
* the benchmarks you say you have
* the patches to icomplete.el that show how it uses your new API
* the demonstrations of the bugs you accuse my patch of suffering from

Thanks.

On Mon, Aug 16, 2021 at 1:05 PM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
>
> João, the discussion is clearly not progressing. I propose that we both
> take a step back and let the Emacs maintainers, who participated in this
> discussion, decide on how to proceed. It seems to me that all arguments
> and data have been presented and there is no need for further
> reiterations in more and more colorful language. I would also like to
> point out that implying that I don't understand your language is
> borderline acceptable. I understand the discussion very well, but I
> don't understand why you are using these unfair and invalid means of
> discussion.
>
> For example there could be these decision outcomes:
>
> 1. The information presented up to now does not allow the maintainers to
> make a decision. For example the maintainers may ask for further
> clarification from you, João, or they may ask for benchmarks from me or
> a proof that my patch does not lead to performance regressions.
>
> 2. The maintainers decide that no patch should be merged.
>
> 3. The maintainers decide that your patch will be merged. I will accept
> this decision.
>
> 4. The maintainers decide that my patch will be merged. You will accept
> this decision.
>
> 5. The maintainers decide that both patches will be merged such that
> both approaches will be supported. We both will accept this decision.
>
> I want to summarize the situation in the following:
>
> The patches in question address a performance issue in the current
> completion machinery which is caused by over-eager copying of the
> completion candidate strings and over-eager application of the
> highlighting to all candidate strings. For incrementally updating UIs it
> would be sufficient to only copy and highlight the strings which are
> actually going to be displayed.
>
> My patch takes the approach to expose the existing two-step completion
> process, which consists of filtering and highlighting. By returning the
> filtered completion strings and a highlighting function this two-step
> process is decomposed and the caller of the API has the ability to call
> the highlighting function on only the displayed subset of completion
> candidates. I argue that exposing the filtering and highlighting
> processing steps is the logical and natural conclusion of the existing
> machinery.
>
> My patch is fully backward compatible and aims to not introduce any
> regressions (also no performance regressions) to the existing API.
> Furthermore my patch adheres to the current guarantees given by the
> existing 'completion-all-completions' API. The completion strings
> provided by the completion backend are not mutated in any way, no string
> properties are attached. Since the API 'completion-filter-completions'
> proposed in my patch does the minimal amount of work necessary (only
> filtering), if no highlighting is requested, I argue that the new API is
> the most efficient possible API, given the current constraints.
>
> Furthermore since I am introducing a new API, outstanding issues can be
> solved which could not be solved until now given the constraints of the
> existing 'completion-all-completions' API. In particular the new API
> 'completion-filter-completions' returns additional data like the end
> position of the completion boundaries. Until now the end position was
> not made available and 'completion-base-position' just used the length
> of the input to guess the end position. In a strict sense this guess is
> incorrect and there is a FIXME in minibuffer.el, mentioning this issue.
>
> The downside of my patch is that it is a large patch. While it adds only
> 196 lines of code, which is not much and expected given that it only
> reshuffles the existing machinery, it is still a large patch in total.
>
> On the other side, João's patch avoids the complication of adhering to
> the existing guarantees of the APIs and takes the liberty of attaching
> "private" string properties to the completion strings of the completion
> table backend. I argue that attaching the string properties is a
> violation of the guarantees of the existing API and violates the
> expectations of the existing many completion tables. One very severe
> scenario is when obarray is used as completion table, since then each
> symbol name gets a private property attached. I argue that such global
> side effects like attaching string properties to all completion
> candidates should better be avoided. There is the issue that the
> attached string properties are a potential memory leak. When dumping the
> string representation of symbol names, the symbols will have additional
> properties which will complicate the debugging experience. Furthermore
> it will lead to confusion since the global side effect during completion
> will suddenly have an influence on symbols which have nothing to do
> with completion. The big advantage of João's patch is that it
> is very limited in scope and very simple. However I argue that this
> simplicity is hard-won and we will regret this approach later due to the
> global side effects.
>
> Therefore I conclude that the two-step process proposed in my patch,
> which does not introduce problematic global side effects is the better
> approach forward. Furthermore a new API is needed such that more
> completion data can be returned, e.g., the completion end position. One
> could even return additional match data in the future given that the new
> API 'completion-filter-completions' is extensible thanks to its alist
> return value.
>
> João, please feel free to also present your closing arguments here.
>
> Daniel



-- 
João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 12:20:01 GMT) Full text and rfc822 format available.

Message #191 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: mail <at> daniel-mendler.de, 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 15:19:17 +0300

> From: João Távora <joaotavora <at> gmail.com>
> Date: Mon, 16 Aug 2021 13:02:04 +0100
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Dmitry Gutov <dgutov <at> yandex.ru>, 
> 	Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
> 
> On Mon, Aug 16, 2021 at 12:57 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> > I was talking about the infrastructure that produces the completion
> > candidates, not about the application that uses them.  My point is
> > that your approach requires the applications using the candidates to
> > copy them, whereas previously they could use them without copying.
> 
> If it helps, I think that that is true of all alternatives presented
> so far

Yes, I know.  I was comparing the proposed alternatives to what we
have now, and specifically because Dmitry mentioned this aspect.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 12:40:02 GMT) Full text and rfc822 format available.

Message #194 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
 joaotavora <at> gmail.com, larsi <at> gnus.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 15:39:33 +0300

> Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
>  47711 <at> debbugs.gnu.org, 48841 <at> debbugs.gnu.org,
>  Stefan Monnier <monnier <at> iro.umontreal.ca>, Eli Zaretskii <eliz <at> gnu.org>
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 12:52:58 +0200
> 
> On 8/16/21 12:15 PM, João Távora wrote:
> > On Mon, Aug 16, 2021 at 10:09 AM Daniel Mendler <mail <at> daniel-mendler.de> wrote:
> > 
> >> There are serious drawbacks of attaching "private" string properties to
> >> arbitrary strings. For one, it complicates debugging seriously if there
> >> are suddenly string properties attached to symbol names. It also leads
> >> to a potential memory leak.
> > 
> > Please, in the name of the sanity of this discussion, justify these two
> > statements with examples or follow them with a clause like "because...".
> 
> João, I am giving hard examples here. What is not an example about
> "memory leak" or making debugging output verbose thanks to the attached
> string properties?

FWIW, I also don't understand how adding properties could cause a
memory leak.  When a string is GCed, its properties get GCed as well,
all of them.  Am I missing something?

As to more difficult debugging, I think adding a couple of properties
that have simple structure will not impair debugging too much.
Strings with many properties are not uncommon in Emacs, so we already
have to deal with that.

> As I said, I will ensure that my API does not introduce performance
> regressions. And since my new API performs strictly less work than your
> proposal it will necessarily be faster if you consider only the
> filtering, which is what matters for incrementally updating UIs.

I would indeed suggest both to make sure there's no performance
regressions, and would like to see some data similar to what João
presented, which backs up your assessments about your proposal being
faster.  Since performance is the main motivation for these changes, I
think it's important for us to be on the same page wrt facts related
to performance, before we make the decision how to proceed.

Thanks.
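
For readers who want to gather numbers of the kind requested here, a rough
sketch using `benchmark-run' follows. It does not exercise either patch:
`my-eager-highlight' is a hypothetical stand-in that copies and propertizes
candidates, and the figure of 20 displayed candidates is an arbitrary
assumption. Each `benchmark-run' form returns a list of elapsed time, number
of GCs, and GC time.

```elisp
(require 'benchmark)
(require 'seq)

;; Hypothetical stand-in for eager behavior: copy and propertize each match.
(defun my-eager-highlight (matches)
  (mapcar (lambda (s)
            (let ((copy (copy-sequence s)))
              (put-text-property 0 (min 1 (length copy)) 'face 'bold copy)
              copy))
          matches))

(list
 ;; Filtering only.
 (benchmark-run 10 (all-completions "" obarray))
 ;; Filtering plus copy-and-highlight of every candidate.
 (benchmark-run 10 (my-eager-highlight (all-completions "" obarray)))
 ;; Filtering plus copy-and-highlight of the first 20 candidates only.
 (benchmark-run 10 (my-eager-highlight
                    (seq-take (all-completions "" obarray) 20))))
```

Comparing the GC counts in the second and third results gives a first
impression of how much of the cost comes from copying and highlighting
candidates that are never shown.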

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 12:45:01 GMT) Full text and rfc822 format available.

Message #197 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
 joaotavora <at> gmail.com, larsi <at> gnus.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 15:43:59 +0300

> Cc: Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
>  47711 <at> debbugs.gnu.org, 48841 <at> debbugs.gnu.org,
>  Stefan Monnier <monnier <at> iro.umontreal.ca>, Eli Zaretskii <eliz <at> gnu.org>
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 14:05:54 +0200
> 
> João, the discussion is clearly not progressing. I propose that we both
> take a step back and let the Emacs maintainers, who participated in this
> discussion, decide on how to proceed. It seems to me that all arguments
> and data have been presented and there is no need for further
> reiterations in more and more colorful language.

As I wrote elsewhere, I'd like to see the performance aspects of this
to be presented from both sides, and agreed upon.  I don't think we
can make the decision before we have performance data we all agree
about.  The other pros and cons are all of qualitative nature, and
involve intuition, personal experience, and personal preferences, so
each one will have their own balance.  But performance is both basic
and quantitative, and we should have the facts and agree on them.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 12:50:01 GMT) Full text and rfc822 format available.

Message #200 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
 joaotavora <at> gmail.com, larsi <at> gnus.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 14:49:45 +0200

On 8/16/21 2:39 PM, Eli Zaretskii wrote:
>> João, I am giving hard examples here. What is not an example about
>> "memory leak" or making debugging output verbose thanks to the attached
>> string properties?
> 
> FWIW, I also don't understand how adding properties could cause a
> memory leak.  When a string is GCed, its properties get GCed as well,
> all of them.  Am I missing something?

If you add string properties to all symbol names this memory will stay
alive for much longer than necessary. It is not a memory leak in the
strongest sense. The memory is still reachable, but there is still no
need to keep the string properties allocated. This is comparable to
memory leaks in other GCed languages where memory is also kept alive for
longer than necessary.

> As to more difficult debugging, I think adding a couple of properties
> that have simple structure will not impair debugging too much.
> Strings with many properties are not uncommon in Emacs, so we already
> have to deal with that.

I disagree with that. We are talking about adding string properties to
every symbol name. This is a global side effect and different from
adding string properties to a set of isolated strings in a controlled
manner. I also don't understand why one would even want to take any
chances here given that the feature can be implemented in a way which
avoids this global side effect entirely as my patch shows.

>> As I said, I will ensure that my API does not introduce performance
>> regressions. And since my new API performs strictly less work than your
>> proposal it will necessarily be faster if you consider only the
>> filtering, which is what matters for incrementally updating UIs.
> 
> I would indeed suggest both to make sure there's no performance
> regressions, and would like to see some data similar to what João
> presented, which backs up your assessments about your proposal being
> faster.  Since performance is the main motivation for these changes, I
> think it's important for us to be on the same page wrt facts related
> to performance, before we make the decision how to proceed.

I will prepare some benchmarks.

Daniel
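
The retention effect described above can be observed with a small
experiment. This is only a sketch: the symbol and property names are made
up, and it assumes that `symbol-name' hands back the name string itself
rather than a copy, which I believe holds for the Emacs versions discussed
in this thread.

```elisp
;; `my-hypothetical-demo-symbol' and `my-private-score' are made-up names.
(let* ((sym (intern "my-hypothetical-demo-symbol"))
       (name (symbol-name sym)))
  ;; Attach a "private" property to the name string.
  (put-text-property 0 (length name) 'my-private-score 0.5 name)
  ;; Ask for the name again, as unrelated code might do much later.
  (prog1 (text-properties-at 0 (symbol-name sym))
    ;; Clean up so the experiment leaves no trace behind.
    (set-text-properties 0 (length name) nil name)
    (unintern sym obarray)))
```

If the assumption holds, the returned property list is non-nil: the property
stays attached for as long as the symbol is interned, unless some code
removes it explicitly.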

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 13:00:02 GMT) Full text and rfc822 format available.

Message #203 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, dgutov <at> yandex.ru
Subject: Re: bug#48841: [PATCH] Add new `completion-filter-completions` API
 and deferred highlighting
Date: Mon, 16 Aug 2021 15:58:47 +0300

> Cc: joaotavora <at> gmail.com, dgutov <at> yandex.ru, monnier <at> iro.umontreal.ca,
>  48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 11:42:22 +0200
> 
> >> To address your technical comment - this variable is precisely what one
> >> of the technical difficulties mentioned in my other mail is about.  The
> >> question is how we can retain backward compatibility of the completion
> >> style 'all' functions, e.g., 'completion-basic-all-completions', while
> >> still allowing the function to return the newly introduced alist format
> >> with more data, which enables 'completion-filter-completions' to perform
> >> the efficient deferred highlighting.
> > 
> > I understand, but given that we provide this for other packages, it
> > shouldn't be an internal variable.
> 
> No, we explicitly don't provide this variable for other packages. It is
> explicitly only meant to be used for the existing completion styles
> emacs21, emacs22, basic, substring, partial-completion, initials and
> flex to opt-in in a backward-compatible/calling convention preserving
> way to the alist return format. The idea is to keep the existing APIs
> fully backward compatible.
> 
> Other packages should select the format returned from the completion
> styles differently. They should return the alist format on Emacs 28 or
> if the API 'completion-filter-completions' is present. In the not so
> near future external packages which support only Emacs 28 and upwards
> will then only return the alist format and don't have to perform any
> detection anymore.

What if some package outside minibuffer.el will want to control the
format of the returned value, for some reason, like support for old
Emacs versions?  Are we going to disallow that?

> >> The new API 'completion-filter-completions' will substitute the existing
> >> API 'completion-all-completions'.
> > 
> > That's your hope, and I understand.  But we as a project didn't yet
> > decide to deprecate the original APIs, so talking about superseding is
> > premature.
> 
> It is not the hope - it is the explicit goal. The API has been designed
> to replace the existing API 'completion-all-completions'.

A goal is not a fact.  Until that goal is reached, we cannot in good
faith tell people an API is superseded.

> We can of
> course tone this down. However I, as a package author, would appreciate
> if Emacs tells me when a newer API aims to replace another API and when
> the documentation is explicit about it.

That's understood, and when we make that decision, we will of course
announce it.  But we didn't do so yet, and this discussion is not even
about that decision.  It could be, for example, that both APIs will
live side by side until we decide whether to deprecate the old one.

> Of course if you decide to have
> the doc strings written in a different tone, I will adapt my patch
> accordingly. Here I am just explaining why I chose the word "superseded"
> instead of a more neutral word.

I understand your motivation, I'm just saying that we cannot announce
deprecation before we actually decide to deprecate.

> > But the name says "filter completions".  Which would mean you take a
> > list of completions and filter out some of them.  A completion table
> > is much more general object than a list of strings.  Thus, I think
> > using "filter" here is sub-optimal.
> 
> Okay, you are right about this. But I think even if the name
> 'completion-filter-completions' is not 100% precise it still conveys
> what the API is about. 'completion-completions-alist' or
> 'completion-all-completions-as-alist' are valid names of course, but I
> dislike them for their verbosity. Already 'completion-all-completions'
> is quite verbose. A strong argument to use this long name is that the
> completion style functions are still called
> 'completion-basic-all-completions' etc. But if we accept that the new
> API 'completion-filter-completions' will actually supersede the existing
> API 'completion-all-completions' it makes sense to use a name which will
> not hurt our eyes in the long run. However this is of course just a
> personal preference of mine. I don't want to spend much time with name
> bikeshedding discussions. If you decide on a name, I will adapt my patch
> accordingly.

Emacs is frequently accused of having names that are hard to
discover.  The only time where we can improve that is when a symbol is
introduced, because later it's impossible for compatibility reasons.
So I'd like to come up with a good name before we install the changes.

That said, I'll let others chime in and agree or disagree with the
name you've chosen.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 13:22:02 GMT) Full text and rfc822 format available.

Message #206 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
 joaotavora <at> gmail.com, larsi <at> gnus.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 16:21:17 +0300

> Cc: joaotavora <at> gmail.com, 48841 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
>  larsi <at> gnus.org, monnier <at> iro.umontreal.ca, 47711 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 16 Aug 2021 14:49:45 +0200
> 
> On 8/16/21 2:39 PM, Eli Zaretskii wrote:
> >> João, I am giving hard examples here. What is not an example about
> >> "memory leak" or making debugging output verbose thanks to the attached
> >> string properties?
> > 
> > FWIW, I also don't understand how adding properties could cause a
> > memory leak.  When a string is GCed, its properties get GCed as well,
> > all of them.  Am I missing something?
> 
> If you add string properties to all symbol names this memory will stay
> alive for much longer than necessary.

That's a very extreme example, something that I wouldn't expect a Lisp
program to do, without removing the properties shortly thereafter.
And even that isn't a leak.

Note that we already put all kinds of properties (although not text
properties) on symbols.

> > As to more difficult debugging, I think adding a couple of properties
> > that have simple structure will not impair debugging too much.
> > Strings with many properties are not uncommon in Emacs, so we already
> > have to deal with that.
> 
> I disagree with that. We are talking about adding string properties to
> every symbol name. This is a global side effect and different from
> adding string properties to a set of isolated strings in a controlled
> manner. I also don't understand why one would even want to take any
> chances here given that the feature can be implemented in a way which
> avoids this global side effect entirely as my patch shows.

I understand your aversion to such global effects, but I was talking
specifically about the debugging difficulties.

> > I would indeed suggest both to make sure there's no performance
> > regressions, and would like to see some data similar to what João
> > presented, which backs up your assessments about your proposal being
> > faster.  Since performance is the main motivation for these changes, I
> > think it's important for us to be on the same page wrt facts related
> > to performance, before we make the decision how to proceed.
> 
> I will prepare some benchmarks.

Thank you.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 13:39:02 GMT) Full text and rfc822 format available.

Message #209 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: mail <at> daniel-mendler.de, 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 48841 <at> debbugs.gnu.org, joaotavora <at> gmail.com
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 16:38:19 +0300

On 16.08.2021 14:46, Eli Zaretskii wrote:

>>> Text properties are stored separately from the string, so I don't
>>> think adding properties can in general be referred to as "change".
>>
>> Are you thinking of C strings?
> 
> No, about the implementation of Lisp strings in Emacs.

I was talking about their behavior.

>> Lisp strings carry text properties in addition to the array of
>> characters. It doesn't really matter where in the memory the properties
>> and the characters reside.
> 
> Well, it does, at least in some situations.  The string text is not
> affected, and so the code which processes the string will not notice
> that it has a property about which that code has no idea.  Only
> properties that are known to the processing code can affect it;
> non-standard properties private to some other code will generally pass
> unnoticed.

I don't think anybody was suggesting that changing text properties 
changes the character codes inside the "C string" part of the Lisp string.

>>> I'm not sure in the context of completion there's any reason to count
>>> as "change" adding properties that don't affect display.
>>
>> For the context in question, whether the properties affect display is
>> not particularly important. Properties affecting display just make it
>> easier to notice that something's wrong. Bugs involving other properties
>> should be more difficult to investigate.
> 
> Once again, if some code invents its private property, not used
> anywhere else and not documented anywhere else, then putting such a
> property on a string has very high chances of going unnoticed.  I hope
> this consideration helps this discussion, because saying that
> properties change a string blurs the distinction between actually
> changing the string text or its properties that affect many parts in
> Emacs, and adding some obscure property that is not known to anyone.

What muddies the water is arguing against a solid engineering principle 
with statements like "those mutations are not mutations".

Yes, when the properties are prefixed, the damage is reduced. Even then, 
that increases the possibility of introducing bugs in the very code that 
sets those properties (like having different code paths where one branch 
sets them and another does not; forgetting to clear them in the other 
branch; having subsequent code use the property values set by some 
previous invocation of the code in question where it took another 
branch; not to mention the potential troubles with parallel execution, 
which is not a real concern these days, but we're designing for years 
ahead, and someday it can be). Memory leaks, too.

Our completion pipeline has multiple interchangeable/pluggable parts, so 
we have to be on the lookout even for problems which do not reproduce 
with stock Emacs, and that requires solid abstractions.

And speaking of "only private properties", the completion-score property 
can be used by downstream code, with all the associated potential for 
trouble.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 13:43:02 GMT) Full text and rfc822 format available.

Message #212 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 14:41:42 +0100

On Mon, Aug 16, 2021 at 2:38 PM Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> And speaking of "only private properties", the completion-score property
> can be used by downstream code, with all the associated potential for
> trouble.

That's true.  When I created it, I meant for it to be private, I think,
but indeed did forget to mark it as such.  It is not documented anywhere
but that hasn't stopped anyone in the past, indeed.  Can you point to
place(s) where it is indeed used other than the flex machinery of
`minibuffer.el`?  Thanks.

João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 14:01:02 GMT) Full text and rfc822 format available.

Message #215 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, Daniel Mendler <mail <at> daniel-mendler.de>
Cc: larsi <at> gnus.org, 47711 <at> debbugs.gnu.org, joaotavora <at> gmail.com,
 48841 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 17:00:09 +0300

On 16.08.2021 16:21, Eli Zaretskii wrote:

>>> FWIW, I also don't understand how adding properties could cause a
>>> memory leak.  When a string is GCed, its properties get GCed as well,
>>> all of them.  Am I missing something?
>>
>> If you add string properties to all symbol names this memory will stay
>> alive for much longer than necessary.
> 
> That's a very extreme example, something that I wouldn't expect a Lisp
> program to do, without removing the properties shortly thereafter.

And that *will* happen with Joao's approach, as soon as some package 
implements a Lisp completion backend in a certain (legal) fashion.

Or using one of a few different fashions, actually.

> And even that isn't a leak.
> 
> Note that we already put all kinds of properties (although not text
> properties) on symbols.

Those do not, generally, change over time.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 14:15:02 GMT) Full text and rfc822 format available.

Message #218 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 17:14:14 +0300

On 16.08.2021 16:41, João Távora wrote:
> It is not documented anywhere
> but that hasn't stopped anyone in the past, indeed.  Can you point to
> place(s) where it is indeed used other than the flex machinery of
> `minibuffer.el`?  Thanks.

Try either of these:

https://github.com/rustify-emacs/fuz.el/blob/master/helm-fuz.el#L228
https://github.com/emacs-helm/helm/blob/master/helm-utils.el

And I'm considering using it in company-sort-by-occurrence, to make sure 
that flex sorting is at least semi-honored there (or create a variation 
of that transformer). For that to happen, the possible score values and 
their meanings will need to be documented, though.

The main scenario (and source of the completion-score property) I have 
in mind is not related to fido-mode or flex, but the users can always 
put flex into completion-styles by default, which affects company-capf.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 14:21:01 GMT) Full text and rfc822 format available.

Message #221 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, monnier <at> iro.umontreal.ca,
 48841 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, larsi <at> gnus.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 15:20:36 +0100

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> On 16.08.2021 16:21, Eli Zaretskii wrote:
>
>>>> FWIW, I also don't understand how adding properties could cause a
>>>> memory leak.  When a string is GCed, its properties get GCed as well,
>>>> all of them.  Am I missing something?
>>>
>>> If you add string properties to all symbol names this memory will stay
>>> alive for much longer than necessary.
>> That's a very extreme example, something that I wouldn't expect a
>> Lisp
>> program to do, without removing the properties shortly thereafter.
>
> And that *will* happen with Joao's approach, as soon as some package
> implements a Lisp completion backend in a certain (legal) fashion.

There is no leak, not in the strong or weak sense. There is a constant
memory footprint proportional to the size of obarray, yes, but the
factor isn't huge.  Off the top of my head, I think it's about two
conses and a fp number for each sym.  Does anyone know how much that is
or can teach me how to measure?  Thanks.

Anyway the current situation is constant copies of strings that put
pressure on the GC, as the benchmarks show.

Anyhoo, in the interest of placating this string property bogeyman, I've
prepared a version of my patch that is based on weak-keyed hash tables.
Slightly slower, but not much. Here are the usual benchmarks:

   (defmacro heyhey ()
     `(progn
        ,@(cl-loop repeat 300000
                  collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))
   (setq completion-styles '(flex))
   (heyhey)
   (when nil
     ;; Press C-u C-x C-e C-m quickly to produce a sample
     (benchmark-run (completing-read "" obarray))
    
     ;; my patch with private string properties
     (3.596972924 4 1.125298095999998)
     (3.584963294 4 1.1266740010000014)
     (3.4622216089999998 4 1.0924070069999985)
     (3.565632813 4 1.1066678320000012)
     (3.456130054 4 1.099950519)
     (3.49538751 4 1.1085273779999998)
     (3.4882531059999997 4 1.0928655200000001)
     (3.526581152 4 1.0909935229999999)
     (3.710919876 4 1.13883417)
     (3.576690379 4 1.09514685)
    
     ;; my patch with no string properties (global weak hts)
     ;; Probably the extra gc sweeps are paranoid cleaning up of the
     ;; hash tables.
     (3.981110008 7 1.6466288340000013)
     (3.819598429 7 1.5200578379999996)
     (3.823931386 7 1.5175787589999992)
     (3.9161236720000003 7 1.6184865899999998)
     (3.835148066 7 1.5686207249999988)
     (3.791906221 7 1.5481051090000015)
     (3.798378812 7 1.5164137029999996)
     (4.049880173 7 1.7670989089999996)
     (3.716469474 6 1.3442434509999996)
     (3.422806969 6 1.3272816180000002)
    
     ;; current master
     (5.405534502 12 2.8778620629999994)
     (5.038353216999999 12 2.553688440000002)
     (4.94358915 12 2.4917956500000003)
     (4.950710861 12 2.4638737510000013)
     (5.0242796929999995 12 2.5226992029999984)
     (5.020964648 12 2.495171900999999)
     (4.914088866 12 2.4218276250000024)
     (5.003779622 12 2.502260272000001)
     (4.969347707 12 2.4814790469999988)
     (5.376038238 11 2.565757513000001)
    
     ;; didn't bother with daniel's patch as I've already shown it to be
     ;; about 10% slower than current master.
     
     )

The patch lives in the branch
scratch/icomplete-lazy-highlight-no-string-props.  It's a bit more
complicated to follow, but not much if you understand hash tables.  The
interface to icomplete.el is completely unchanged.
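
For readers who have not looked at the branch, the weak-keyed side-table idea can be sketched roughly as follows.  This is only an illustration under stated assumptions, not the code on the branch: the `sketch-` names are hypothetical, and a plain score stands in for whatever the real patch records.

```elisp
;; Hypothetical sketch: keep per-candidate data in a side table instead
;; of in text properties on the candidate strings themselves.
(defvar sketch-score-table
  (make-hash-table :test #'eq :weakness 'key)
  "Side table mapping candidate strings (compared by identity) to scores.
Because the keys are weak, an entry can be collected once its string is
no longer referenced from anywhere else.")

(defun sketch-put-score (candidate score)
  "Record SCORE for CANDIDATE without touching its text properties."
  (puthash candidate score sketch-score-table))

(defun sketch-get-score (candidate)
  "Return the score recorded for CANDIDATE, or nil if there is none."
  (gethash candidate sketch-score-table))
```

The `eq` test means the lookup is keyed on the string object itself, so two distinct strings with the same contents get separate entries.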

All in all, still a very good improvement over the current situation,
and I think I can make it faster.

(Though really do consider Eli's arguments the fastest approach)

>> And even that isn't a leak.
>> Note that we already put all kind of properties (although not text
>> properties) on symbols.
>
> Those do not, generally, change over time.

Neither does this one! At least in size, which is the thing that
matters.  So in terms of "negative" consequences it's exactly
equivalent. Read the patch; it will be obvious, I think.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 14:27:02 GMT) Full text and rfc822 format available.

Message #224 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>,
 Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 17:26:44 +0300

On 16.08.2021 14:37, João Távora wrote:
> I am not a native English speaker, and maybe you don't understand
> my language.  Another way to explain what I am talking about is to talk about
> "bug reproduction".  You say there's a bug in my patch, I am asking you
> for a "bug reproduction recipe" as defined by most,  if not all, the results
> you get by searching "bug reproduction recipe" in the Google search engine.

I hope you, or at least others here, can someday see and understand that 
asking to prove standard engineering practices from the first 
principles, time and time again in various discussions, is not a way to 
encourage good atmosphere or promote project participation.

Can you really not imagine a buggy scenario coming from a combination of 
downstream uses of 'completion-score' property, different completion 
styles (some setting it, and some not), and a completion table that 
either uses global string values outright, or caches them for the 
duration of the current command?
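
One possible shape of such a scenario, as a minimal sketch only: both "styles" below are hypothetical stand-ins, not functions from minibuffer.el, and the score value is made up.

```elisp
;; Hypothetical sketch of a stale `completion-score' on shared strings.
(defun sketch-style-with-score (strings)
  "Pretend style that tags each of STRINGS in place with a score."
  (dolist (s strings strings)
    (put-text-property 0 1 'completion-score 0.9 s)))

(defun sketch-style-without-score (strings)
  "Pretend style that only filters and never sets a score."
  strings)

(defun sketch-scores-seen (strings)
  "Return the `completion-score' a downstream consumer reads off STRINGS."
  (mapcar (lambda (s) (get-text-property 0 'completion-score s))
          strings))
```

If a completion table hands out the same string objects to both styles, a consumer calling `sketch-scores-seen` after `sketch-style-without-score` still reads the scores left behind by the earlier call.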

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 14:31:01 GMT) Full text and rfc822 format available.

Message #227 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Lars Ingebrigtsen <larsi <at> gnus.org>, Eli Zaretskii <eliz <at> gnu.org>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 15:29:57 +0100

On Mon, Aug 16, 2021 at 3:26 PM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>
> On 16.08.2021 14:37, João Távora wrote:
> > I am not a native English speaker, and maybe you don't understand
> > my language.  Another way to explain what I am talking about is to talk about
> > "bug reproduction".  You say there's a bug in my patch, I am asking you
> > for a "bug reproduction recipe" as defined by most,  if not all, the results
> > you get by searching "bug reproduction recipe" in the Google search engine.
>
> I hope you, or at least others here, can someday see and understand that
> asking to prove standard engineering practices from the first
> principles, time and time again in various discussions, is not a way to
> encourage good atmosphere or promote project participation.
>
> Can you really not imagine a buggy scenario coming from a combination of
> downstream uses of 'completion-score' property, different completion
> styles (some setting it, and some not), and a completion table that
> either uses global string values outright, or caches them for the
> duration of the current command?

I don't. Please prime my imagination with some illustration based on
your fertile imagination of these things and the patches I have provided.
Oh and spare me the lectures.  Thanks.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 14:34:02 GMT) Full text and rfc822 format available.

Message #230 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, monnier <at> iro.umontreal.ca,
 48841 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, larsi <at> gnus.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 17:33:20 +0300

On 16.08.2021 17:20, João Távora wrote:

>>>>> FWIW, I also don't understand how adding properties could cause a
>>>>> memory leak.  When a string is GCed, its properties get GCed as well,
>>>>> all of them.  Am I missing something?
>>>>
>>>> If you add string properties to all symbol names this memory will stay
>>>> alive for much longer than necessary.
>>> That's a very extreme example, something that I wouldn't expect a
>>> Lisp
>>> program to do, without removing the properties shortly thereafter.
>>
>> And that *will* happen with Joao's approach, as soon as some package
>> implements a Lisp completion backend in a certain (legal) fashion.
> 
> There is no leak, not in the strong or weak sense.

Eli already said that, in a sentence that I also quoted. And still: "I 
wouldn't expect a Lisp program to do <so>".

> There is a constant
> memory footprint proportional to the size of obarray, yes, but the
> factor isn't huge.  Off the top of my head, I think it's about two
> conses and a fp number for each sym.  Does anyone know how much that is
> or can teach me how to measure?  Thanks.

If we say that your approach is legal, those are only "two conses and a 
number" coming from minibuffer.el. But since other packages will also be 
allowed to do that, the factor will only be limited by the amount of 
installed packages.

> Anyway the current situation is constant copies of strings that put
> pressure on the GC, as the benchmarks show.
> 
> Anyhoo, in the interest of placating this string property bogeyman, I've
> prepared a version of my patch that is based on weak-keyed hash tables.
> Slightly slower, but not much. Here are the usual benchmarks:

Cool, I'll take a look, thanks.

>>> And even that isn't a leak.
>>> Note that we already put all kind of properties (although not text
>>> properties) on symbols.
>>
>> Those do not, generally, change over time.
> 
> Neither does this one! At least in size, which is the thing that
> matters.  So in terms of "negative" consequences it's exactly
> equivalent. Read the patch; it will be obvious, I think.

I was talking about the values of the properties, not the size in memory.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 14:38:02 GMT) Full text and rfc822 format available.

Message #233 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Eli Zaretskii <eliz <at> gnu.org>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 15:36:42 +0100

On Mon, Aug 16, 2021 at 3:33 PM Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> I was talking about the values of the properties, not the size in memory.

Then what's the problem if the value of a property that is an implementation
detail changes? What do you (meaning the user of Emacs) care, ultimately?

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 14:49:02 GMT) Full text and rfc822 format available.

Message #236 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 17:47:56 +0300

On 16.08.2021 17:36, João Távora wrote:
> On Mon, Aug 16, 2021 at 3:33 PM Dmitry Gutov<dgutov <at> yandex.ru>  wrote:
> 
>> I was talking about the values of the properties, not the size in memory.
> Then what's the problem if the value of a property that is an implementation
> detail changes? What do you (meaning the user of Emacs) care, ultimately?

You said "we already have global symbol properties". I pointed out the 
differences.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 17:01:01 GMT) Full text and rfc822 format available.

Message #239 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 17:59:55 +0100

On Mon, Aug 16, 2021 at 3:47 PM Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>
> On 16.08.2021 17:36, João Távora wrote:
> > On Mon, Aug 16, 2021 at 3:33 PM Dmitry Gutov<dgutov <at> yandex.ru>  wrote:
> >
> >> I was talking about the values of the properties, not the size in memory.
> > Then what's the problem if the value of a property that is an implementation
> > detail changes? What do you (meaning the user of Emacs) care, ultimately?
>
> You said "we already have global symbol properties". I pointed out the
> differences.

Eli said that, I think. Anyway,  it doesn't present any kind of problem
whether their values change or not, as long as the space they
occupy doesn't.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 16 Aug 2021 18:26:02 GMT) Full text and rfc822 format available.

Message #242 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, monnier <at> iro.umontreal.ca,
 48841 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, larsi <at> gnus.org,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 19:25:00 +0100

Dmitry Gutov <dgutov <at> yandex.ru> writes:

>> prepared a version of my patch that is based on weak-keyed hash tables.
>> Slightly slower, but not much. Here are the usual benchmarks:
>
> Cool, I'll take a look, thanks.

I've made it faster, now very close to the string-propertizing approach,
itself very close to the theoretical best (which is no copy, no
highlight).  See the tip of the
scratch/icomplete-lazy-highlight-no-string-props branch, which I had to
rewrite (some Git flub-up).  All benchmarks after sig.

João

(require 'cl-lib)

;; Introduce 300 000 new symbols to slow things down.  Probably more
;; than most non-Spacemacs people will have :-)

;; (setq ido-enable-flex-matching t)
;; (ido-mode)
;; (ignore-errors (ido-ubiquitous-mode))
;; (fido-mode)
;; (fido-vertical-mode)
;; (vertico-mode)

;; (hash-table-keys completion--get-lazy-highlight-cache)

(defmacro heyhey ()
  `(progn
     ,@(cl-loop repeat 300000
	       collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))
;; (setq completion-styles '(substring))
(setq completion-styles '(flex))
(heyhey)
(setq icomplete-show-matches-on-no-input t)

(symbol-name 'mouse-kill)

(when nil
  ;; Press C-u C-x C-e C-m quickly to produce a sample. Always discard
  ;; the first sample in the session, it tends to have an extra GC and be
  ;; slower.
  (benchmark-run (completing-read "" obarray))

  ;; don't use string props
  (2.848873438 6 0.8307729419999994)
  (2.848416202 6 0.8370667889999996)
  (2.786944063 6 0.8230433460000004)
  (2.7815761840000004 6 0.819654023)
  (2.6929080819999998 5 0.7036257240000001)

  ;; string props
  (2.630354852 4 0.7071441910000011)
  (2.594761891 4 0.7082679669999994)
  (2.589480755 4 0.7112978109999997)
  (2.661196709 4 0.7130021060000011)
  (2.844372962 4 0.7378870879999999)

  ;; master
  (3.6339847759999997 12 1.601142523)
  (3.757091055 12 1.6231055449999996)
  (3.785980977 12 1.6333413839999995)
  (3.716144927 12 1.6100998260000008)
  (3.808275042 11 1.611891043)

  ;; these next two are not comparable to the above, because in
  ;; ab23fa4eb22f6557414724769958a63f1c59b49a I added sorting to flex
  ;; which changes results, and Daniel's patch no longer applies
  ;; cleanly.

  ;; daniel's patch
  (3.420418068 10 1.451012855)
  (3.603226896 10 1.672325507)
  (3.501318685 10 1.6150095739999992)
  (3.659821971 10 1.6580361760000004)
  (3.624743674 10 1.657498823)

  ;; master just before daniel's patch (254dc6ab4c)
  (2.611424665 10 1.5267066549999981)
  (2.48811409 10 1.486639387000002)
  (2.472587389 10 1.479865191)
  (2.543277273 10 1.510667634999999)
  (2.546243312 10 1.4986345790000009)

  )

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 17 Aug 2021 02:09:01 GMT) Full text and rfc822 format available.

Message #245 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, larsi <at> gnus.org,
 monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Tue, 17 Aug 2021 05:08:15 +0300

On 16.08.2021 21:25, João Távora wrote:
> I've made it faster, now very close to the string-propertizing approach,
> itself very close to the theoretical best (which is no copy, no
> highlight).  See the tip of the
> scratch/icomplete-lazy-highlight-no-string-props branch, which I had to
> rewrite (some Git flub-up).  All benchmarks after sig.

Thanks. I've read it now.

This implementation style (quick exfiltration via a dynamic var with 
some special-cased logic) reminds me of the recent changes to eldoc, 
really not my cup of tea.

At the very least, though, you have done the work of proving that the 
no-string-propertization approach can be just as fast. Thank you for that.

A hash table with :test 'eq is a good choice. I'd be happy to try to 
tweak it further, but it also seems that at this point we can transition 
to the discussion about what kind of implementation style we want, since 
the performance is proven to be more or less on par.

Though of course that should start with an alternative patch which adds 
icomplete support as well (either Daniel does it, or I'll give it a try).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 17 Aug 2021 09:00:02 GMT) Full text and rfc822 format available.

Message #248 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, larsi <at> gnus.org,
 monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new
 `completion-filter-completions` API and deferred highlighting
Date: Tue, 17 Aug 2021 09:59:25 +0100

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> On 16.08.2021 21:25, João Távora wrote:
>> I've made it faster, now very close to the string-propertizing approach,
>> itself very close to the theoretical best (which is no copy, no
>> highlight).  See the tip of the
>> scratch/icomplete-lazy-highlight-no-string-props branch, which I had to
>> rewrite (some Git flub-up).  All benchmarks after sig.
>
> Thanks. I've read it now.
>
> This implementation style (quick exfiltration via a dynamic var with
> some special-cased logic) reminds me of the recent changes to eldoc,
> really not my cup of tea.

I'm sorry, but I'm not drinking from your herbarium.  Googling for
"exfiltration" brings up "malware" and data security.  Why is mine
"quick" at that?  Is this some kind of metaphor?  And what does "some
special-cased logic" refer to exactly?  I see as much similarity to
Eldoc as I do to the Sistine Chapel.

The only thing I understand, I think, is "dynamic var".  If you mean the
variable 'completion-lazy-hilit', notice it is not necessarily used as
dynamic var (in fact in icomplete.el it's just a buffer-local var).  As
I explained elsewhere, if the completion machinery had a reliable
abstraction for "session" I would use that.  

I don't think it does, does it?  Currently, it's the frontend who holds
that knowledge.  It will either have an object representing it (maybe a
fancy CLOS thing); or a stack frame with some kind of command loop; or, in
the case of icomplete, a minibuffer session delineated by
kill-all-local-variables.

So, for icomplete.el, setting that variable buffer-locally is the
appropriate thing.  For the command-loopy frontend, dynamically binding
it will be.  For the objecty frontend, the object itself is probably a
good value for completion-lazy-hilit.
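
The first two cases can be sketched like this.  It is a rough illustration only: `sketch-lazy-hilit` is a hypothetical stand-in for the variable, and `sketch-candidates` is a made-up consumer, not anything in minibuffer.el.

```elisp
;; Hypothetical sketch: two ways a frontend can scope a "session" flag.
(defvar sketch-lazy-hilit nil
  "Stand-in for a lazy-highlighting flag; non-nil means defer.")

(defun sketch-candidates ()
  "Report whether highlighting would be deferred in the current context."
  (if sketch-lazy-hilit 'deferred 'eager))

(defun sketch-minibuffer-frontend ()
  "Frontend whose session is a buffer: set the flag buffer-locally."
  (with-temp-buffer
    (setq-local sketch-lazy-hilit t)
    (sketch-candidates)))

(defun sketch-command-loop-frontend ()
  "Frontend whose session is a stack frame: bind the flag dynamically."
  (let ((sketch-lazy-hilit t))
    (sketch-candidates)))
```

In both cases the flag stops applying when the session ends (the buffer is killed, or the `let' exits), and callers outside the session see the global default.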

For completion-capf, if you cared to optimize it with this stuff, it
will likely be ... something something.

Anyway, the "implementation style" I went for is speed, brevity and a
decent docstring.

And it'd be a bit shorter if it used string properties...

> At the very least, though, you have done the work of proving that the
> no-string-propertization approach can be just as fast. Thank you for
> that.

You're welcome.  Not really just as fast, but in the big-O ballpark, of
course.

I had hoped to show also that the particular choice of global structure
for string/symbol/whatever association is irrelevant.

I'm still missing the imminent catastrophe (that is so clear to you) of
the put-text-property approach.  I'd like these slower and more complex
techniques to appease more than superstition.

> A hash table with :test 'eq is a good choice. I'd be happy to try to
> tweak it further, but it also seems that at this point we can
> transition to the discussion about what kind of implementation style
> we want, since the performance is proven to be more or less on par.
>
> Though of course that should start with an alternative patch which
> adds icomplete support as well (either Daniel does it, or I'll give it
> a try).

I'm curious to see those, yes.  But as Eli pointed out, two different
APIs will need to cohabitate since the new API won't kill off the old.

To be very clear, I'm interested in the performance of Daniel's patch,
not really in insufferable claims of its beauty and virginity.

minibuffer.el is a great big mess, I'll leave it to the Great Designers
of the Big Redesign, godspeed to them.

Currently, I just want changes to not assassinate, in speed or form,
icomplete.el or the flex completion style, two fundamental daily drivers
to my work, and others' work.  So if/when Daniel's patch doesn't do any
of that (it seems that it currently does), I'll be all for it.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 17 Aug 2021 11:50:01 GMT) Full text and rfc822 format available.

Message #251 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org,
 dgutov <at> yandex.ru, larsi <at> gnus.org, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Tue, 17 Aug 2021 14:48:55 +0300

> From: João Távora <joaotavora <at> gmail.com>
> Date: Tue, 17 Aug 2021 09:59:25 +0100
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, larsi <at> gnus.org,
>  monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
> 
> I'm sorry, but I'm not drinking from your herbarium.

Once again, I'm asking everyone to please remove the emotional and
sarcastic parts from the exchange.  It is not helping to have
constructive discussions.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 17 Aug 2021 11:53:02 GMT) Full text and rfc822 format available.

Message #254 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 48841 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Tue, 17 Aug 2021 12:52:35 +0100

On Tue, Aug 17, 2021 at 12:49 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > From: João Távora <joaotavora <at> gmail.com>
> > Date: Tue, 17 Aug 2021 09:59:25 +0100
> > Cc: Daniel Mendler <mail <at> daniel-mendler.de>, larsi <at> gnus.org,
> >  monnier <at> iro.umontreal.ca, 48841 <at> debbugs.gnu.org, 47711 <at> debbugs.gnu.org
> >
> > I'm sorry, but I'm not drinking from your herbarium.
>
> Once again, I'm asking everyone to please remove the emotional and
> sarcastic parts from the exchange.  It is not helping to have
> constructive discussions.

I thought it was rather appropriate for the "my cup of tea" line :-)  But
I get the message, and I apologize.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 24 Oct 2023 22:27:02 GMT) Full text and rfc822 format available.

Message #257 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Daniel Mendler <mail <at> daniel-mendler.de>,
 Stefan Monnier <monnier <at> IRO.UMontreal.CA>, João Távora
 <joaotavora <at> gmail.com>
Cc: 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Wed, 25 Oct 2023 01:25:23 +0300

[Message part 1 (text/plain, inline)]

Hi all!

Time flies, doesn't it?

On 14/08/2021 10:16, Eli Zaretskii wrote:
>>> If one removes these lines, the process becomes much faster, but there is a
>>> problem with highlighting.  My idea is indeed to defer highlighting by not
>>> setting the 'face property directly on that shared string, but some
>>> other property
>>> that is read later from the shared string by compliant frontends.
>>
>> I haven't done any direct benchmarking, but I'm pretty sure that this
>> approach cannot, by definition, be as fast as the non-mutating one.
> 
> Daniel seems to think otherwise, AFAIU.
> 
>> Because you go through the whole list and mutate all of its elements,
>> you have to perform a certain bit of work W x N times, where N is the
>> size of the whole list.
>>
>> Whereas the deferred-mutation approach will only have to do its bit
>> (which is likely more work, like, WWW) only 20 times, or 100 times, or
>> however many completions are displayed. And this is usually negligible.
>>
>> However big the difference is going to be, I can't say in advance, of
>> course, or whether it's going to be shadowed by some other performance
>> costs. But the non-mutating approach should have the best optimization
>> potential when the list is long.
> 
> So I guess the suggestion to have a benchmark is still useful, because
> the estimations of which approach has better performance vary between
> you three.  So maybe producing such benchmarks would be a good step?

To cross this out from my TODO, I spent most of the day rebasing both of 
the proposed patches (one of them longer than the other) -- one from an 
attachment here and another from a commit inside the 
scratch/icomplete-lazy-highlight-attempt-2 branch, porting icomplete to 
Daniel's new completion-filter-completions API (*), and benchmarking.

(*) Included in the attached patch: it needed changing just two lines 
inside icomplete, but also new variable completion-all-sorted-highlight 
and updates to completion--cache-all-sorted-completions and 
completion-all-sorted-completions.

Both rebased patches are attached to this email for your convenience.

AFAICT, the results confirmed my expectations quoted above.
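
The "W x N versus only the displayed candidates" argument quoted above can be sketched roughly as follows.  This is an illustration under assumptions, not code from either patch: the `sketch-` functions are hypothetical, and a plain regexp match stands in for the real completion styles.

```elisp
(require 'seq)

(defun sketch-filter (regexp candidates)
  "Return the members of CANDIDATES matching REGEXP, unmodified.
This is the cheap per-candidate work done for the whole list."
  (let (result)
    (dolist (c candidates (nreverse result))
      (when (string-match-p regexp c)
        (push c result)))))

(defun sketch-highlight (regexp candidate)
  "Return a highlighted copy of CANDIDATE; the original is untouched."
  (let ((copy (copy-sequence candidate)))
    (when (string-match regexp copy)
      (add-face-text-property (match-beginning 0) (match-end 0)
                              'completions-common-part nil copy))
    copy))

(defun sketch-display (regexp candidates limit)
  "Filter all CANDIDATES, but copy and highlight only the first LIMIT."
  (mapcar (lambda (c) (sketch-highlight regexp c))
          (seq-take (sketch-filter regexp candidates) limit)))
```

Here the copying and face work happens at most LIMIT times, however long the filtered list is.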

Using Joao's benchmark, with setup:

  (defmacro lotsoffunctions ()
    `(progn
       ,@(cl-loop repeat 150000
                  collect `(defun ,(intern (symbol-name (gensym "heyhey"))) () 42))))

  (lotsoffunctions)

I ran the comparisons for empty and non-empty inputs.

With no characters typed:

  (benchmark-run 1
    (let ((completion-styles '(flex))
          (completion-lazy-hilit (cl-gensym)) ; might not be defined
          )
      ;; Uncomment one of the lines below, depending on the patch used.
      ;; (completion-all-completions "" obarray 'fboundp 0 nil)
      ;; (completion-filter-completions "" obarray 'fboundp 0 nil)
      ))

master => 0.066
lazy-hilit => 0.045
filter-and-defer => 0.041 (but more often ~0.110 including GC, somehow)

With one character typed:

  (benchmark-run 1
    (let ((completion-styles '(flex))
          (completion-lazy-hilit (cl-gensym)) ; might not be defined
          )
      ;; Uncomment one of the lines below, depending on the patch used.
      ;; (completion-all-completions "h" obarray 'fboundp 1 nil)
      ;; (completion-filter-completions "h" obarray 'fboundp 1 nil)
      ))

master => 0.824
lazy-hilit => 0.395
filter-and-defer => 0.125 (!)

This more or less translates into the improvement in speed of 
fido-vertical-mode, according to my benchmark-progn call inside 
icomplete-exhibit (included in both attached patches for convenience). 
For non-empty inputs (h or hh or hhe, to match the generated functions), 
filter-and-defer is about 1.5x faster than lazy-hilit, like 0.450s vs 
0.640s.

lazy-hilit is slightly faster than filter-and-defer with the empty input 
(like 380ms vs 420ms), and I'm not yet sure why, but it's the scenario 
with 0 highlighting (and so no flex scoring/sorting). Perhaps some 
short-circuiting can be added somewhere to reach parity, or it's the 
cost of extra branching somewhere for backward compatibility (which 
could be removed in the future). Worth additional study.

[0001-Add-new-completion-filter-completions-API-and-deferr-v3.diff (text/x-patch, attachment)]

[completion-lazy-hilit.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 25 Oct 2023 17:51:02 GMT) Full text and rfc822 format available.

Message #260 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Wed, 25 Oct 2023 18:52:26 +0100

[Message part 1 (text/plain, inline)]

Dmitry Gutov <dmitry <at> gutov.dev> writes:

> Hi all!
>
> Time flies, doesn't it?

Indeed.  First, thanks for working on this.  I tried your patches and
could reproduce all your numbers with a recent master (643c67cf239).

After checking the 2021 version again (254dc6ab4) I've discovered that
2023 Emacs doesn't respond as well to the lazy-hilit patch as 2021
Emacs.  It's as if the cost of string copies (that the patch optimized)
has gone down significantly.  In addition, in 2021, we never had anyone
perform the needed changes for icomplete.el to work with Daniel's API.
It was only known that without those changes, icomplete.el actually
performed worse under the new patch.

So, backed by the new ability to conduct good benchmarks, I looked at
the problem anew.  I gained some insight into it, and came up with a
new "lazy-hilit" patch which performs just as well as, if not slightly
better than, Daniel's, while keeping the changes to lisp/minibuffer.el
much more minimal and not adding a replacement for the longstanding
completion-all-completions.

Before I go into benchmarks, it's obvious to me that "lazy" or
"deferred" highlighting mean basically the same thing, which is "late"
or "just-in-time" highlighting.

I also think, as I did in 2021, that we should be careful to separate
what are performance-motivated changes from style-motivated changes.
The former are easy to discuss objectively, while the latter are much,
much more subjective.

Whatever the outcome, we're on our way to at least a faster
icomplete.el, which is of course good.

So here are the benchmarks.  The setup is as follows.  We start Emacs
like:

   src/emacs -nw -Q -f fido-vertical-mode -l ~/Downloads/benchmark.el

And make sure to put 300 000 symbols in the obarray.  The symbols are
prefixed "yoyo" deliberately.

    (cl-loop repeat 300000 do (intern (symbol-name (gensym "yoyo"))))

First a micro-benchmark:

   ;; Daniel's patch worked by Dmitry (v3)
   (benchmark-run 50
    (let ((completion-styles '(flex)))
      (completion-filter-completions "" obarray 'fboundp 0 nil)
      (completion-filter-completions "yo" obarray 'fboundp 0 nil)
      (completion-filter-completions "yoo" obarray 'fboundp 0 nil)
      ));; => (12.192422429999999 3 0.107881004)

  ;; lazy-hilit v4 patch attached in this email
  (benchmark-run 50
    (let ((completion-styles '(flex))
          (completion-lazy-hilit (cl-gensym)))
      (completion-all-completions "" obarray 'fboundp 0 nil)
      (completion-all-completions "yo" obarray 'fboundp 0 nil)
      (completion-all-completions "yoo" obarray 'fboundp 0 nil)
      ));; => (12.267915333 4 0.14799709099999991)
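
For reading the result triples above: `benchmark-run' returns a list of
the total elapsed seconds, the number of garbage collections, and the
seconds spent in GC.  A minimal sketch for separating out the GC time and
comparing two runs follows.  The `my-' helper names are hypothetical and
are not part of either patch.

```elisp
(defun my-bench-non-gc (result)
  "Return the elapsed seconds in RESULT minus the time spent in GC.
RESULT is a list (ELAPSED GC-COUNT GC-ELAPSED) as returned by
`benchmark-run'."
  (- (nth 0 result) (nth 2 result)))

(defun my-bench-slowdown (baseline candidate)
  "Return how much slower CANDIDATE is than BASELINE, in percent.
Both arguments are result lists as returned by `benchmark-run'.
Only the total elapsed time (the first element) is compared."
  (* 100.0 (/ (- (nth 0 candidate) (nth 0 baseline))
              (nth 0 baseline))))

;; Comparing the two result triples quoted above:
;; (my-bench-slowdown '(12.192422429999999 3 0.107881004)
;;                    '(12.267915333 4 0.14799709099999991))
```

With the two triples above, the slowdown comes out at roughly 0.6
percent, which is the figure used in the conclusions of this message.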

Now, tests specific to icomplete.el using Dmitry's instrumentation.
This is the "yoyo" test.  Evaluate:

   (completing-read "" obarray)

This starts a fido-vertical-mode minibuffer.  First type "yo", then
repeatedly insert and backspace a "o" to make "yoo" and "yo" in
alternating fashion.

The number of matches should be exactly 300696 for "yoo" and 301721 for
"yo".  Highlighting should be correct, of course.

Observe the results printed by the instrumentation in `icomplete.el`.
Collect them from *Messages* after 18 alternations.

The results:

  ;; with Daniel's patch to minibuffer
  ;; 
  ;; Elapsed time: 0.967481s (0.401923s in 8 GCs)
  ;; Elapsed time: 0.703229s (0.252157s in 5 GCs)
  ;; Elapsed time: 0.945053s (0.401540s in 8 GCs)
  ;; Elapsed time: 0.721198s (0.252657s in 5 GCs)
  ;; Elapsed time: 0.951377s (0.394238s in 8 GCs)
  ;; Elapsed time: 0.699232s (0.254524s in 5 GCs)
  ;; Elapsed time: 0.940497s (0.400292s in 8 GCs)
  ;; Elapsed time: 0.709986s (0.253635s in 5 GCs)
  ;; Elapsed time: 0.943063s (0.399020s in 8 GCs)
  ;; Elapsed time: 0.720825s (0.251619s in 5 GCs)
  ;; Elapsed time: 0.972146s (0.407665s in 8 GCs)
  ;; Elapsed time: 0.709619s (0.255678s in 5 GCs)
  ;; Elapsed time: 0.947108s (0.397916s in 8 GCs)
  ;; Elapsed time: 0.727231s (0.254040s in 5 GCs)
  ;; Elapsed time: 0.966196s (0.398492s in 8 GCs)
  ;; Elapsed time: 0.701558s (0.252168s in 5 GCs)
  ;; Elapsed time: 0.936269s (0.388110s in 8 GCs)
  ;; Elapsed time: 0.694050s (0.249759s in 5 GCs)

  ;; with my lazy-hilit patch worked minimally by Dmitry
  ;; 
  ;; Elapsed time: 1.779906s (0.975332s in 15 GCs)
  ;; Elapsed time: 1.342160s (0.490314s in 5 GCs)
  ;; Elapsed time: 1.235759s (0.420019s in 4 GCs)
  ;; Elapsed time: 1.363909s (0.519521s in 5 GCs)
  ;; Elapsed time: 1.175773s (0.423938s in 4 GCs)
  ;; Elapsed time: 1.340017s (0.508744s in 5 GCs)
  ;; Elapsed time: 1.124552s (0.404149s in 4 GCs)
  ;; Elapsed time: 1.327419s (0.499433s in 5 GCs)
  ;; Elapsed time: 1.121927s (0.400499s in 4 GCs)
  ;; Elapsed time: 1.308526s (0.493652s in 5 GCs)
  ;; Elapsed time: 1.159132s (0.404612s in 4 GCs)
  ;; Elapsed time: 1.323803s (0.500754s in 5 GCs)
  ;; Elapsed time: 1.128562s (0.406496s in 4 GCs)
  ;; Elapsed time: 1.345577s (0.503971s in 5 GCs)
  ;; Elapsed time: 1.121691s (0.401876s in 4 GCs)
  ;; Elapsed time: 1.304913s (0.492255s in 5 GCs)
  ;; Elapsed time: 1.141926s (0.399154s in 4 GCs)
  ;; Elapsed time: 1.312480s (0.498205s in 5 GCs)
  ;; Elapsed time: 1.125095s (0.403174s in 4 GCs)
  ;; Elapsed time: 1.332119s (0.503671s in 5 GCs)
  ;; Elapsed time: 1.131561s (0.402268s in 4 GCs)

  ;; New lazy-hilit patch attached:
  ;;
  ;; Elapsed time: 0.902985s (0.224307s in 3 GCs)
  ;; Elapsed time: 0.696391s (0.079687s in 1 GCs)
  ;; Elapsed time: 0.896176s (0.219964s in 3 GCs)
  ;; Elapsed time: 0.648318s (0.074765s in 1 GCs)
  ;; Elapsed time: 0.906288s (0.221534s in 3 GCs)
  ;; Elapsed time: 0.679141s (0.079102s in 1 GCs)
  ;; Elapsed time: 0.889320s (0.222668s in 3 GCs)
  ;; Elapsed time: 0.690199s (0.076926s in 1 GCs)
  ;; Elapsed time: 0.912206s (0.220297s in 3 GCs)
  ;; Elapsed time: 0.675524s (0.078875s in 1 GCs)
  ;; Elapsed time: 0.907111s (0.226627s in 3 GCs)
  ;; Elapsed time: 0.697139s (0.079571s in 1 GCs)
  ;; Elapsed time: 0.906650s (0.220946s in 3 GCs)
  ;; Elapsed time: 0.683808s (0.078001s in 1 GCs)
  ;; Elapsed time: 0.912471s (0.221977s in 3 GCs)
  ;; Elapsed time: 0.699742s (0.079009s in 1 GCs)
  ;; Elapsed time: 0.897566s (0.217786s in 3 GCs)
  ;; Elapsed time: 0.659043s (0.076046s in 1 GCs)

  Daniel+Dmitry patch v3:        0.8308954444444444s avg
  Old lazy-hilit patch:          1.2390678333333334s avg
  New lazy-hilit patch attached: 0.7922265555555554s avg
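
The averages above can be recomputed from the instrumentation output.
The following is a minimal sketch, assuming the lines have the
"Elapsed time: N.NNNNNNs" shape shown in the listings.  The function
name `my-average-elapsed' is hypothetical.

```elisp
(defun my-average-elapsed (text)
  "Return the mean of all \"Elapsed time: Ns\" values found in TEXT.
Return nil if TEXT contains no such value."
  (let ((start 0)
        (sum 0.0)
        (count 0))
    ;; Walk through TEXT, collecting the number before each "s".
    (while (string-match "Elapsed time: \\([0-9.]+\\)s" text start)
      (setq sum (+ sum (string-to-number (match-string 1 text)))
            count (1+ count)
            start (match-end 0)))
    (and (> count 0) (/ sum count))))

;; Possible usage after a test session:
;; (my-average-elapsed
;;  (with-current-buffer (messages-buffer) (buffer-string)))
```

Note that this averages every matching line in the text, so *Messages*
should be cleared (or the relevant region copied out) between runs of
different patches.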

In conclusion:

* I think the two approaches are basically evenly matched in terms of
  performance, at least for this symbol completion scenario.

* In the completion-{sorted|all}-completions micro-benchmark my patch
  does very marginally worse (0.6%).  Probably because of the use of a
  hash table.  I believe I can fix this, though.

* In the icomplete.el usability test, my new patch probably does
  slightly better (4.6%).  Probably because it doesn't recalculate the
  regexp from the pattern every time it needs to late highlight.

* My patch doesn't suffer from the 'completion--unquoted' property
  complication in Daniel+Dmitry's patch.  It's possible/likely that the
  additional memory needed by this property will introduce an additional
  slowdown which isn't visible in the simpler symbol completion
  scenario.

Both patches propose what are effectively extensions to the large, old
and complicated completion API, in my case an additional variable to
bind (or set), in Daniel's patch a brand new API entry point and the
deprecation of an existing one.
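
To make the "additional variable to bind" variant concrete, here is a
minimal sketch of a frontend that asks for all completions but
highlights only the ones it will display.  It assumes the variable and
function `completion-lazy-hilit' from the attached patch.  The function
name `my-frontend-visible-completions' is hypothetical, and the code
falls back to the unmodified strings when the patch is not applied.

```elisp
(require 'seq)

;; Declared here because the variable only exists once the patch is applied.
(defvar completion-lazy-hilit)

(defun my-frontend-visible-completions (string table pred n)
  "Return the first N completions of STRING in TABLE, highlighted.
PRED is the usual completion predicate.  When the completion style
honors the lazy hint, only the returned strings get `face' properties."
  (let* ((completion-lazy-hilit t)      ; the hint to the style
         (all (completion-all-completions
               string table pred (length string)))
         (tail (last all)))
    ;; The last cdr of the result holds the base size; drop it.
    (when (consp tail) (setcdr tail nil))
    (mapcar (lambda (candidate)
              ;; Use the string as-is when the helper from the patch
              ;; is not available.
              (if (fboundp 'completion-lazy-hilit)
                  (completion-lazy-hilit candidate)
                candidate))
            (seq-take all n))))

;; Possible usage: highlight only the 10 candidates that fit on screen.
;; (my-frontend-visible-completions "yo" obarray #'fboundp 10)
```

Styles that ignore the hint still return fully highlighted strings, so
the result is the same apart from the time spent.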

The benchmarks show that Daniel's patch is not absolutely necessary to
reap the benefits of deferred/lazy/late/just-in-time highlighting.

Looking at the two patches side-by-side it seems evident to me that one
patch is much simpler than the other.  The "maybe-alist-maybe-not" flag
dedicated to completely changing the meaning of a number of
minibuffer.el functions while keeping backward compatibility is one such
item of complexity.

Therefore, since we can come up with simpler alternatives that bring
these now well-understood benefits, it won't surprise anyone that I
think we should go for the simpler choice.

João

[0001-Allow-completion-frontends-to-highlight-completion-s.patch (text/x-patch, inline)]

From 78057f40e53f39f0b26f4b9bf5d950b72f1c3d99 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jo=C3=A3o=20T=C3=A1vora?= <joaotavora <at> gmail.com>
Date: Wed, 25 Oct 2023 13:45:01 +0100
Subject: [PATCH] Allow completion frontends to highlight completion strings
 just-in-time

This allows completion-pcm--hilit-commonality to be sped up
substantially.

Introduce a new variable completion-lazy-hilit that allows
completion frontends to opt in to a time-saving optimization by some
completion styles, such as the 'flex' and 'pcm' styles.

The variable must be bound or set by the frontend to a unique value
around a completion attempt/session.  See completion-lazy-hilit
docstring for more info.

* lisp/icomplete.el (icomplete-minibuffer-setup): Set completion-lazy-hilit.
(icomplete--render-vertical): Call completion-lazy-hilit.
(icomplete-completions): Call completion-lazy-hilit.

* lisp/minibuffer.el (completion-lazy-hilit): New variable.
(completion-lazy-hilit)
(completion--hilit-from-re): New functions.
(completion--lazy-hilit-table): New variable.
(completion--flex-score-1): New helper.
(completion-pcm--hilit-commonality): Use completion-lazy-hilit.
---
 lisp/icomplete.el  |   9 +-
 lisp/minibuffer.el | 262 +++++++++++++++++++++++++++++----------------
 2 files changed, 173 insertions(+), 98 deletions(-)

diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index e6fdd1f1836..a9ac0b3f040 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -545,6 +545,7 @@ icomplete-minibuffer-setup
     (setq-local icomplete--initial-input (icomplete--field-string))
     (setq-local completion-show-inline-help nil)
     (setq icomplete--scrolled-completions nil)
+    (setq completion-lazy-hilit (cl-gensym))
     (use-local-map (make-composed-keymap icomplete-minibuffer-map
     					 (current-local-map)))
     (add-hook 'post-command-hook #'icomplete-post-command-hook nil t)
@@ -754,12 +755,13 @@ icomplete-exhibit
                            (overlay-end rfn-eshadow-overlay)))
           (let* ((field-string (icomplete--field-string))
                  (text (while-no-input
+                         (benchmark-progn
                          (icomplete-completions
                           field-string
                           (icomplete--completion-table)
                           (icomplete--completion-predicate)
                           (if (window-minibuffer-p)
-                              (eq minibuffer--require-match t)))))
+                              (eq minibuffer--require-match t))))))
                  (buffer-undo-list t)
                  deactivate-mark)
             ;; Do nothing if while-no-input was aborted.
@@ -901,7 +903,7 @@ icomplete--render-vertical
                                 'icomplete-selected-match 'append comp)
      collect (concat prefix
                      (make-string (- max-prefix-len (length prefix)) ? )
-                     comp
+                     (completion-lazy-hilit comp)
                      (make-string (- max-comp-len (length comp)) ? )
                      suffix)
      into lines-aux
@@ -1067,7 +1069,8 @@ icomplete-completions
                   (if (< prospects-len prospects-max)
                       (push comp prospects)
                     (setq limit t)))
-                (setq prospects (nreverse prospects))
+                (setq prospects
+                      (nreverse (mapcar #'completion-lazy-hilit prospects)))
                 ;; Decorate first of the prospects.
                 (when prospects
                   (let ((first (copy-sequence (pop prospects))))
diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index 2120e31775e..4591f1145c8 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -3749,108 +3749,180 @@ flex-score-match-tightness
 than the latter (which has two \"holes\" and three
 one-letter-long matches).")
 
+(defvar-local completion-lazy-hilit nil
+  "If non-nil, request completion lazy highlighting.
+
+Completion-presenting frontends may opt to bind this variable to
+a unique non-nil value in the context of completion-producing
+calls (such as `completion-all-sorted-completions').  This hints
+the intervening completion styles that they do not need to
+propertize completion strings with the `face' property.
+
+When doing so, it is the frontend -- not the style -- who becomes
+responsible for `face'-propertizing only the completion strings
+that are meant to be displayed to the user.  This can be done by
+calling the function `completion-lazy-hilit' which returns a
+`face'-propertized string.
+
+The value stored in this variable by the completion frontend
+should be unique to each completion attempt or session that
+utilizes the same completion style in `completion-styles-alist'.
+For frontends using the minibuffer as the locus of completion
+calls and display, setting it to a buffer-local value given by
+`gensym' is appropriate.  For frontends operating entirely in a
+single command, let-binding it to `gensym' is appropriate.
+
+Note that the optimization enabled by this variable is only actually
+performed by some completion styles.  To others, it is a harmless
+and useless hint.  To author a completion style that takes
+advantage of this, look in the source of
+`completion-pcm--hilit-commonality'.")
+
+(defun completion-lazy-hilit (str)
+  "Return a copy of completion STR that is `face'-propertized.
+See documentation for variable `completion-lazy-hilit' for more
+details."
+  (completion--hilit-from-re
+   (copy-sequence str)
+   (gethash completion-lazy-hilit completion--lazy-hilit-table)))
+
+(defun completion--hilit-from-re (string regexp)
+  "Fontify STRING with `completions-common-part' using REGEXP."
+  (let* ((md (and regexp (string-match regexp string) (cddr (match-data t))))
+         (me (and md (match-end 0)))
+         (from 0))
+    (while md
+      (add-face-text-property from (pop md) 'completions-common-part nil string)
+      (setq from (pop md)))
+    (unless (or (not me) (= from me))
+      (add-face-text-property from me 'completions-common-part nil string))
+    string))
+
+(defun completion--flex-score-1 (md match-end len)
+  "Compute matching score of completion.
+The score lies in the range between 0 and 1, where 1 corresponds to
+the full match.
+MD is the match data.
+MATCH-END is the end of the match.
+LEN is the length of the completion string."
+  (let* ((from 0)
+         ;; To understand how this works, consider these simple
+         ;; ascii diagrams showing how the pattern "foo"
+         ;; flex-matches "fabrobazo", "fbarbazoo" and
+         ;; "barfoobaz":
+
+         ;;      f abr o baz o
+         ;;      + --- + --- +
+
+         ;;      f barbaz oo
+         ;;      + ------ ++
+
+         ;;      bar foo baz
+         ;;          +++
+
+         ;; "+" indicates parts where the pattern matched.  A
+         ;; "hole" in the middle of the string is indicated by
+         ;; "-".  Note that there are no "holes" near the edges
+         ;; of the string.  The completion score is a number
+         ;; bound by (0..1] (i.e., larger than (but not equal
+         ;; to) zero, and smaller or equal to one): the higher
+         ;; the better and only a perfect match (pattern equals
+         ;; string) will have score 1.  The formula takes the
+         ;; form of a quotient.  For the numerator, we use the
+         ;; number of +, i.e. the length of the pattern.  For
+         ;; the denominator, it first computes
+         ;;
+         ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
+         ;;
+         ;; , for each hole "i" of length "Li", where tightness
+         ;; is given by `flex-score-match-tightness'.  The
+         ;; final value for the denominator is then given by:
+         ;;
+         ;;    (SUM_across_i(hole_i_contrib) + 1) * len
+         ;;
+         ;; , where "len" is the string's length.
+         (score-numerator 0)
+         (score-denominator 0)
+         (last-b 0))
+    (while md
+      (let ((a from)
+            (b (pop md)))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b))
+      (setq from (pop md)))
+    ;; If `pattern' doesn't have an explicit trailing any, the
+    ;; regex `re' won't produce match data representing the
+    ;; region after the match.  We need to account to account
+    ;; for that extra bit of match (bug#42149).
+    (unless (= from match-end)
+      (let ((a from)
+            (b match-end))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b)))
+    (/ score-numerator (* len (1+ score-denominator)) 1.0)))
+
+(defvar completion--lazy-hilit-table (make-hash-table :weakness 'key))
+
 (defun completion-pcm--hilit-commonality (pattern completions)
   "Show where and how well PATTERN matches COMPLETIONS.
 PATTERN, a list of symbols and strings as seen
 `completion-pcm--merge-completions', is assumed to match every
-string in COMPLETIONS.  Return a deep copy of COMPLETIONS where
-each string is propertized with `completion-score', a number
-between 0 and 1, and with faces `completions-common-part',
-`completions-first-difference' in the relevant segments."
+string in COMPLETIONS.
+
+If `completion-lazy-hilit' is nil, return a deep copy of
+COMPLETIONS where each string is propertized with
+`completion-score', a number between 0 and 1, and with faces
+`completions-common-part', `completions-first-difference' in the
+relevant segments.
+
+Else, if `completion-lazy-hilit' is non-nil, return COMPLETIONS where
+each string now has a `completion-score' property and no
+highlighting."
   (cond
    ((and completions (cl-loop for e in pattern thereis (stringp e)))
     (let* ((re (completion-pcm--pattern->regex pattern 'group))
-           (point-idx (completion-pcm--pattern-point-idx pattern))
-           (case-fold-search completion-ignore-case)
-           last-md)
-      (mapcar
-       (lambda (str)
-	 ;; Don't modify the string itself.
-         (setq str (copy-sequence str))
-         (unless (string-match re str)
-           (error "Internal error: %s does not match %s" re str))
-         (let* ((pos (if point-idx (match-beginning point-idx) (match-end 0)))
-                (match-end (match-end 0))
-                (md (cddr (setq last-md (match-data t last-md))))
-                (from 0)
-                (end (length str))
-                ;; To understand how this works, consider these simple
-                ;; ascii diagrams showing how the pattern "foo"
-                ;; flex-matches "fabrobazo", "fbarbazoo" and
-                ;; "barfoobaz":
-
-                ;;      f abr o baz o
-                ;;      + --- + --- +
-
-                ;;      f barbaz oo
-                ;;      + ------ ++
-
-                ;;      bar foo baz
-                ;;          +++
-
-                ;; "+" indicates parts where the pattern matched.  A
-                ;; "hole" in the middle of the string is indicated by
-                ;; "-".  Note that there are no "holes" near the edges
-                ;; of the string.  The completion score is a number
-                ;; bound by (0..1] (i.e., larger than (but not equal
-                ;; to) zero, and smaller or equal to one): the higher
-                ;; the better and only a perfect match (pattern equals
-                ;; string) will have score 1.  The formula takes the
-                ;; form of a quotient.  For the numerator, we use the
-                ;; number of +, i.e. the length of the pattern.  For
-                ;; the denominator, it first computes
-                ;;
-                ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
-                ;;
-                ;; , for each hole "i" of length "Li", where tightness
-                ;; is given by `flex-score-match-tightness'.  The
-                ;; final value for the denominator is then given by:
-                ;;
-                ;;    (SUM_across_i(hole_i_contrib) + 1) * len
-                ;;
-                ;; , where "len" is the string's length.
-                (score-numerator 0)
-                (score-denominator 0)
-                (last-b 0)
-                (update-score-and-face
-                 (lambda (a b)
-                   "Update score and face given match range (A B)."
-                   (add-face-text-property a b
-                                           'completions-common-part
-                                           nil str)
-                   (setq
-                    score-numerator   (+ score-numerator (- b a)))
-                   (unless (or (= a last-b)
-                               (zerop last-b)
-                               (= a (length str)))
-                     (setq
-                      score-denominator (+ score-denominator
-                                           1
-                                           (expt (- a last-b 1)
-                                                 (/ 1.0
-                                                    flex-score-match-tightness)))))
-                   (setq
-                    last-b              b))))
-           (while md
-             (funcall update-score-and-face from (pop md))
-             (setq from (pop md)))
-           ;; If `pattern' doesn't have an explicit trailing any, the
-           ;; regex `re' won't produce match data representing the
-           ;; region after the match.  We need to account to account
-           ;; for that extra bit of match (bug#42149).
-           (unless (= from match-end)
-             (funcall update-score-and-face from match-end))
-           (if (> (length str) pos)
-               (add-face-text-property
-                pos (1+ pos)
-                'completions-first-difference
-                nil str))
-           (unless (zerop (length str))
-             (put-text-property
-              0 1 'completion-score
-              (/ score-numerator (* end (1+ score-denominator)) 1.0) str)))
-         str)
-       completions)))
+           last-md
+           (score (lambda (str)
+                    (unless (string-match re str)
+                      (error "Internal error: %s does not match %s" re str))
+                    (let* ((match-end (match-end 0))
+                           (md (cddr (setq last-md (match-data t last-md)))))
+                      (completion--flex-score-1 md match-end (length str))))))
+      (cond (completion-lazy-hilit
+             (puthash completion-lazy-hilit re completion--lazy-hilit-table)
+             (mapc (lambda (str)
+                     (put-text-property 0 1 'completion-score (funcall score str) str))
+                   completions))
+            (t
+             (mapcar
+              (lambda (str)
+                (setq str (copy-sequence str))
+                (put-text-property 0 1 'completion-score (funcall score str) str)
+                (completion--hilit-from-re str re)
+                str)
+              completions)))))
    (t completions)))
 
 (defun completion-pcm--find-all-completions (string table pred point
-- 
2.39.2
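
The scoring formula described in the comments of
`completion--flex-score-1' can be tried on the three strings from the
ASCII diagrams.  The following is a standalone sketch and not the code
from the patch: it takes the matched ranges as a list of (START . END)
pairs instead of match data, and takes the tightness as an argument,
assumed here to default to 3.  The name `my-flex-score' is hypothetical.

```elisp
(defun my-flex-score (ranges len &optional tightness)
  "Return a flex score for a match described by RANGES.
RANGES is a list of (START . END) pairs, in order, covering the
matched parts of a string of length LEN.  TIGHTNESS plays the role
of `flex-score-match-tightness' and defaults to 3 (an assumption)."
  (let ((tightness (or tightness 3))
        (numerator 0)
        (denominator 0)
        (last-end 0))
    (dolist (range ranges)
      (let ((a (car range))
            (b (cdr range)))
        ;; The numerator counts matched characters.
        (setq numerator (+ numerator (- b a)))
        ;; Each hole strictly inside the string adds
        ;; 1 + (hole-length - 1)^(1/tightness) to the denominator.
        (unless (or (= a last-end) (zerop last-end) (= a len))
          (setq denominator
                (+ denominator 1
                   (expt (- a last-end 1) (/ 1.0 tightness)))))
        (setq last-end b)))
    (/ (float numerator) (* len (1+ denominator)))))

;; The three matches of "foo" from the diagrams:
;; (my-flex-score '((3 . 6)) 9)                 ; "barfoobaz"
;; (my-flex-score '((0 . 1) (7 . 9)) 9)         ; "fbarbazoo"
;; (my-flex-score '((0 . 1) (4 . 5) (8 . 9)) 9) ; "fabrobazo"
```

With these inputs "barfoobaz" (no holes) scores highest, followed by
"fbarbazoo" (one hole) and then "fabrobazo" (two holes).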

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 25 Oct 2023 20:54:02 GMT) Full text and rfc822 format available.

Message #263 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 47711 <at> debbugs.gnu.org, Daniel Mendler <mail <at> daniel-mendler.de>
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Wed, 25 Oct 2023 16:50:29 -0400

This sounds fairly reasonable: the worst-case breakage seems to be that
we may occasionally lose highlighting because the var was non-nil at the
wrong time.

Sidenote: since the hash-table uses `eq` we don't need to use `gensym`,
we can use something like `cons` instead, which is cheaper and doesn't
risk making its way into the `obarray`.
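
A minimal sketch of that point, assuming an `eq'-keyed table like the
one in the patch: freshly allocated cons cells are distinct keys even
when they are `equal', so they can serve as session tokens.  The name
`my-session-token-demo' and the stored strings are hypothetical.

```elisp
(defun my-session-token-demo ()
  "Show that fresh cons cells work as unique `eq' hash-table keys.
Return a list of the two stored values and whether the keys are `eq'."
  (let ((table (make-hash-table :test 'eq :weakness 'key))
        (session-a (list 'session))   ; fresh cons cell, never interned
        (session-b (list 'session)))  ; `equal' to session-a but not `eq'
    (puthash session-a "regexp for session A" table)
    (puthash session-b "regexp for session B" table)
    (list (gethash session-a table)
          (gethash session-b table)
          (eq session-a session-b))))

;; (my-session-token-demo)
;; => ("regexp for session A" "regexp for session B" nil)
```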

> +(defun completion-lazy-hilit (str)
> +  "Return a copy of completion STR that is `face'-propertized.
> +See documentation for variable `completion-lazy-hilit' for more
> +details."
> +  (completion--hilit-from-re
> +   (copy-sequence str)
> +   (gethash completion-lazy-hilit completion--lazy-hilit-table)))

Hmm... in order to get the right result you need to call
`completion-lazy-hilit` sometime after calling
`completion-all-completions` and before the next call to
`completion-all-completions` done with the same value of
`completion-lazy-hilit`, right?

So how important is it to use a hash-table rather than a variable
holding just "the info about the last call to
`completion-all-completions`"?

> +           last-md
> +           (score (lambda (str)
> +                    (unless (string-match re str)
> +                      (error "Internal error: %s does not match %s" re str))
> +                    (let* ((match-end (match-end 0))
> +                           (md (cddr (setq last-md (match-data t last-md)))))
> +                      (completion--flex-score-1 md match-end (length str))))))
> +      (cond (completion-lazy-hilit
> +             (puthash completion-lazy-hilit re completion--lazy-hilit-table)
> +             (mapc (lambda (str)
> +                     (put-text-property 0 1 'completion-score (funcall score str) str))
> +                   completions))
> +            (t
> +             (mapcar
> +              (lambda (str)
> +                (setq str (copy-sequence str))
> +                (put-text-property 0 1 'completion-score (funcall score str) str)
> +                (completion--hilit-from-re str re)
> +                str)
> +              completions)))))

How much more expensive is it to replace the 

    (mapc (lambda (str)
            (put-text-property 0 1 'completion-score (funcall score str) str))
          completions))

with something like

    (let ((tail `(completion-lazy-hilit (completion--hilit-from-re ,re))))
      (mapc (lambda (str)
              (add-text-properties
               0 1 `(completion-score ,(funcall score str) ,@tail) str))
            completions))

and then get rid of the hash-table altogether?
                 

        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 25 Oct 2023 21:04:02 GMT) Full text and rfc822 format available.

Message #266 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 47711 <at> debbugs.gnu.org, Daniel Mendler <mail <at> daniel-mendler.de>
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Wed, 25 Oct 2023 22:02:23 +0100

[Message part 1 (text/plain, inline)]

On Wed, Oct 25, 2023, 21:52 Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:

>
>
> and then get rid of the hash-table altogether?
>

Stefan,  again we think alike, I'm taking care of that, the hash table and
the gensym aren't needed at all.

>

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 25 Oct 2023 22:11:02 GMT) Full text and rfc822 format available.

Message #269 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 47711 <at> debbugs.gnu.org, Daniel Mendler <mail <at> daniel-mendler.de>
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Wed, 25 Oct 2023 23:12:28 +0100

[Message part 1 (text/plain, inline)]

João Távora <joaotavora <at> gmail.com> writes:

>>  and then get rid of the hash-table altogether?
>
> Stefan, again we think alike, I'm taking care of that, the hash table
> and the gensym aren't needed at all.

Here's a revised patch without the hash table (actually a patch on top
of the other one, but I'm sending both of them again).

completion-lazy-hilit is simply bound in icomplete.el's
icomplete-exhibit.

I don't recall exactly why I used to set it in the hook to a gensym in
2021.  I think it was because I was setting face properties on reused
strings (not just score) and that caused problems when switching
completion styles.  But I don't think that can happen anymore, so no
need for that.

>> Hmm... in order to get the right result you need to call
>> `completion-lazy-hilit` sometime after calling
>> `completion-all-completions` and before the next call to
>> `completion-all-completions` done with the same value of
>> `completion-lazy-hilit`, right?

Right, it used to be something like that, but not anymore, I think.  Now
the semantics could be described informally as

   "called by the frontend in the hopes that the style got the hint,
    which will speed things up significantly -- but if the hint wasn't
    caught that's OK too".

>> How much more expensive is it to replace the 

>>     (mapc (lambda (str)
>>             (put-text-property 0 1 'completion-score (funcall score str) str))
>>           completions))

>> with something like

>>     (let ((tail `(completion-lazy-hilit (completion--hilit-from-re ,re))))
>>       (mapc (lambda (str)
>>               (add-text-properties
>>                0 1 `(completion-score ,(funcall score str) ,@tail) str))
>>             completions))

I get the idea, I think, but I suspect it would be somewhat more
expensive, as it is similar to my earlier patch which stored more stuff
in every string, which Dmitry (and then I) measured to be slower.  But
feel free to experiment, of course.

João

[0001-Allow-completion-frontends-to-highlight-completion-s.patch (text/x-patch, inline)]

From 09ac2a87cb95d31acdefa3b4920449da2cb848fb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jo=C3=A3o=20T=C3=A1vora?= <joaotavora <at> gmail.com>
Date: Wed, 25 Oct 2023 13:45:01 +0100
Subject: [PATCH 1/2] Allow completion frontends to highlight completion
 strings just-in-time

This allows completion-pcm--hilit-commonality to be sped up
substantially.

Introduce a new variable completion-lazy-hilit that allows
completion frontends to opt in to a time-saving optimization offered by
some completion styles, such as the 'flex' and 'pcm' styles.

The variable must be bound or set by the frontend to a unique value
around a completion attempt/session.  See completion-lazy-hilit
docstring for more info.

* lisp/icomplete.el (icomplete-minibuffer-setup): Set completion-lazy-hilit.
(icomplete--render-vertical): Call completion-lazy-hilit.
(icomplete-completions): Call completion-lazy-hilit.

* lisp/minibuffer.el (completion-lazy-hilit): New variable.
(completion-lazy-hilit)
(completion--hilit-from-re): New functions.
(completion--lazy-hilit-table): New variable.
(completion--flex-score-1): New helper.
(completion-pcm--hilit-commonality): Use completion-lazy-hilit.
---
 lisp/icomplete.el  |   9 +-
 lisp/minibuffer.el | 262 +++++++++++++++++++++++++++++----------------
 2 files changed, 173 insertions(+), 98 deletions(-)

diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index e6fdd1f1836..a9ac0b3f040 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -545,6 +545,7 @@ icomplete-minibuffer-setup
     (setq-local icomplete--initial-input (icomplete--field-string))
     (setq-local completion-show-inline-help nil)
     (setq icomplete--scrolled-completions nil)
+    (setq completion-lazy-hilit (cl-gensym))
     (use-local-map (make-composed-keymap icomplete-minibuffer-map
     					 (current-local-map)))
     (add-hook 'post-command-hook #'icomplete-post-command-hook nil t)
@@ -754,12 +755,13 @@ icomplete-exhibit
                            (overlay-end rfn-eshadow-overlay)))
           (let* ((field-string (icomplete--field-string))
                  (text (while-no-input
+                         (benchmark-progn
                          (icomplete-completions
                           field-string
                           (icomplete--completion-table)
                           (icomplete--completion-predicate)
                           (if (window-minibuffer-p)
-                              (eq minibuffer--require-match t)))))
+                              (eq minibuffer--require-match t))))))
                  (buffer-undo-list t)
                  deactivate-mark)
             ;; Do nothing if while-no-input was aborted.
@@ -901,7 +903,7 @@ icomplete--render-vertical
                                 'icomplete-selected-match 'append comp)
      collect (concat prefix
                      (make-string (- max-prefix-len (length prefix)) ? )
-                     comp
+                     (completion-lazy-hilit comp)
                      (make-string (- max-comp-len (length comp)) ? )
                      suffix)
      into lines-aux
@@ -1067,7 +1069,8 @@ icomplete-completions
                   (if (< prospects-len prospects-max)
                       (push comp prospects)
                     (setq limit t)))
-                (setq prospects (nreverse prospects))
+                (setq prospects
+                      (nreverse (mapcar #'completion-lazy-hilit prospects)))
                 ;; Decorate first of the prospects.
                 (when prospects
                   (let ((first (copy-sequence (pop prospects))))
diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index 2120e31775e..4591f1145c8 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -3749,108 +3749,180 @@ flex-score-match-tightness
 than the latter (which has two \"holes\" and three
 one-letter-long matches).")
 
+(defvar-local completion-lazy-hilit nil
+  "If non-nil, request completion lazy hilighting.
+
+Completion-presenting frontends may opt to bind this variable to
+a unique non-nil value in the context of completion-producing
+calls (such as `completion-all-sorted-completions').  This hints
+the intervening completion styles that they do not need to
+propertize completion strings with the `face' property.
+
+When doing so, it is the frontend -- not the style -- who becomes
+responsible for `face'-propertizing only the completion strings
+that are meant to be displayed to the user.  This can be done by
+calling the function `completion-lazy-hilit' which returns a
+`face'-propertized string.
+
+The value stored in this variable by the completion frontend
+should be unique to each completion attempt or session that
+utilizes the same completion style in `completion-styles-alist'.
+For frontends using the minibuffer as the locus of completion
+calls and display, setting it to a buffer-local value given by
+`gensym' is appropriate.  For frontends operating entirely in a
+single command, let-binding it to `gensym' is appropriate.
+
+Note that the optimization enabled by variable is only actually
+performed some completions styles.  To others, it is a harmless
+and useless hint.  To author a completion style that takes
+advantage of this, look in the source of
+`completion-pcm--hilit-commonality'.")
+
+(defun completion-lazy-hilit (str)
+  "Return a copy of completion STR that is `face'-propertized.
+See documentation for variable `completion-lazy-hilit' for more
+details."
+  (completion--hilit-from-re
+   (copy-sequence str)
+   (gethash completion-lazy-hilit completion--lazy-hilit-table)))
+
+(defun completion--hilit-from-re (string regexp)
+  "Fontify STRING with `completions-common-part' using REGEXP."
+  (let* ((md (and regexp (string-match regexp string) (cddr (match-data t))))
+         (me (and md (match-end 0)))
+         (from 0))
+    (while md
+      (add-face-text-property from (pop md) 'completions-common-part nil string)
+      (setq from (pop md)))
+    (unless (or (not me) (= from me))
+      (add-face-text-property from me 'completions-common-part nil string))
+    string))
+
+(defun completion--flex-score-1 (md match-end len)
+  "Compute matching score of completion.
+The score lies in the range between 0 and 1, where 1 corresponds to
+the full match.
+MD is the match data.
+MATCH-END is the end of the match.
+LEN is the length of the completion string."
+  (let* ((from 0)
+         ;; To understand how this works, consider these simple
+         ;; ascii diagrams showing how the pattern "foo"
+         ;; flex-matches "fabrobazo", "fbarbazoo" and
+         ;; "barfoobaz":
+
+         ;;      f abr o baz o
+         ;;      + --- + --- +
+
+         ;;      f barbaz oo
+         ;;      + ------ ++
+
+         ;;      bar foo baz
+         ;;          +++
+
+         ;; "+" indicates parts where the pattern matched.  A
+         ;; "hole" in the middle of the string is indicated by
+         ;; "-".  Note that there are no "holes" near the edges
+         ;; of the string.  The completion score is a number
+         ;; bound by (0..1] (i.e., larger than (but not equal
+         ;; to) zero, and smaller or equal to one): the higher
+         ;; the better and only a perfect match (pattern equals
+         ;; string) will have score 1.  The formula takes the
+         ;; form of a quotient.  For the numerator, we use the
+         ;; number of +, i.e. the length of the pattern.  For
+         ;; the denominator, it first computes
+         ;;
+         ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
+         ;;
+         ;; , for each hole "i" of length "Li", where tightness
+         ;; is given by `flex-score-match-tightness'.  The
+         ;; final value for the denominator is then given by:
+         ;;
+         ;;    (SUM_across_i(hole_i_contrib) + 1) * len
+         ;;
+         ;; , where "len" is the string's length.
+         (score-numerator 0)
+         (score-denominator 0)
+         (last-b 0))
+    (while md
+      (let ((a from)
+            (b (pop md)))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b))
+      (setq from (pop md)))
+    ;; If `pattern' doesn't have an explicit trailing any, the
+    ;; regex `re' won't produce match data representing the
+    ;; region after the match.  We need to account to account
+    ;; for that extra bit of match (bug#42149).
+    (unless (= from match-end)
+      (let ((a from)
+            (b match-end))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b)))
+    (/ score-numerator (* len (1+ score-denominator)) 1.0)))
+
+(defvar completion--lazy-hilit-table (make-hash-table :weakness 'key))
+
 (defun completion-pcm--hilit-commonality (pattern completions)
   "Show where and how well PATTERN matches COMPLETIONS.
 PATTERN, a list of symbols and strings as seen
 `completion-pcm--merge-completions', is assumed to match every
-string in COMPLETIONS.  Return a deep copy of COMPLETIONS where
-each string is propertized with `completion-score', a number
-between 0 and 1, and with faces `completions-common-part',
-`completions-first-difference' in the relevant segments."
+string in COMPLETIONS.
+
+If `completion-lazy-hilit' is nil, return a deep copy of
+COMPLETIONS where each string is propertized with
+`completion-score', a number between 0 and 1, and with faces
+`completions-common-part', `completions-first-difference' in the
+relevant segments.
+
+Else, if `completion-lazy-hilit' is t, return COMPLETIONS where
+each string now has a `completion-score' property and no
+highlighting."
   (cond
    ((and completions (cl-loop for e in pattern thereis (stringp e)))
     (let* ((re (completion-pcm--pattern->regex pattern 'group))
-           (point-idx (completion-pcm--pattern-point-idx pattern))
-           (case-fold-search completion-ignore-case)
-           last-md)
-      (mapcar
-       (lambda (str)
-	 ;; Don't modify the string itself.
-         (setq str (copy-sequence str))
-         (unless (string-match re str)
-           (error "Internal error: %s does not match %s" re str))
-         (let* ((pos (if point-idx (match-beginning point-idx) (match-end 0)))
-                (match-end (match-end 0))
-                (md (cddr (setq last-md (match-data t last-md))))
-                (from 0)
-                (end (length str))
-                ;; To understand how this works, consider these simple
-                ;; ascii diagrams showing how the pattern "foo"
-                ;; flex-matches "fabrobazo", "fbarbazoo" and
-                ;; "barfoobaz":
-
-                ;;      f abr o baz o
-                ;;      + --- + --- +
-
-                ;;      f barbaz oo
-                ;;      + ------ ++
-
-                ;;      bar foo baz
-                ;;          +++
-
-                ;; "+" indicates parts where the pattern matched.  A
-                ;; "hole" in the middle of the string is indicated by
-                ;; "-".  Note that there are no "holes" near the edges
-                ;; of the string.  The completion score is a number
-                ;; bound by (0..1] (i.e., larger than (but not equal
-                ;; to) zero, and smaller or equal to one): the higher
-                ;; the better and only a perfect match (pattern equals
-                ;; string) will have score 1.  The formula takes the
-                ;; form of a quotient.  For the numerator, we use the
-                ;; number of +, i.e. the length of the pattern.  For
-                ;; the denominator, it first computes
-                ;;
-                ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
-                ;;
-                ;; , for each hole "i" of length "Li", where tightness
-                ;; is given by `flex-score-match-tightness'.  The
-                ;; final value for the denominator is then given by:
-                ;;
-                ;;    (SUM_across_i(hole_i_contrib) + 1) * len
-                ;;
-                ;; , where "len" is the string's length.
-                (score-numerator 0)
-                (score-denominator 0)
-                (last-b 0)
-                (update-score-and-face
-                 (lambda (a b)
-                   "Update score and face given match range (A B)."
-                   (add-face-text-property a b
-                                           'completions-common-part
-                                           nil str)
-                   (setq
-                    score-numerator   (+ score-numerator (- b a)))
-                   (unless (or (= a last-b)
-                               (zerop last-b)
-                               (= a (length str)))
-                     (setq
-                      score-denominator (+ score-denominator
-                                           1
-                                           (expt (- a last-b 1)
-                                                 (/ 1.0
-                                                    flex-score-match-tightness)))))
-                   (setq
-                    last-b              b))))
-           (while md
-             (funcall update-score-and-face from (pop md))
-             (setq from (pop md)))
-           ;; If `pattern' doesn't have an explicit trailing any, the
-           ;; regex `re' won't produce match data representing the
-           ;; region after the match.  We need to account to account
-           ;; for that extra bit of match (bug#42149).
-           (unless (= from match-end)
-             (funcall update-score-and-face from match-end))
-           (if (> (length str) pos)
-               (add-face-text-property
-                pos (1+ pos)
-                'completions-first-difference
-                nil str))
-           (unless (zerop (length str))
-             (put-text-property
-              0 1 'completion-score
-              (/ score-numerator (* end (1+ score-denominator)) 1.0) str)))
-         str)
-       completions)))
+           last-md
+           (score (lambda (str)
+                    (unless (string-match re str)
+                      (error "Internal error: %s does not match %s" re str))
+                    (let* ((match-end (match-end 0))
+                           (md (cddr (setq last-md (match-data t last-md)))))
+                      (completion--flex-score-1 md match-end (length str))))))
+      (cond (completion-lazy-hilit
+             (puthash completion-lazy-hilit re completion--lazy-hilit-table)
+             (mapc (lambda (str)
+                     (put-text-property 0 1 'completion-score (funcall score str) str))
+                   completions))
+            (t
+             (mapcar
+              (lambda (str)
+                (setq str (copy-sequence str))
+                (put-text-property 0 1 'completion-score (funcall score str) str)
+                (completion--hilit-from-re str re)
+                str)
+              completions)))))
    (t completions)))
 
 (defun completion-pcm--find-all-completions (string table pred point
-- 
2.39.2

[0002-Replace-lazy-hilit-hash-table-with-something-simpler.patch (text/x-patch, inline)]

From 5c4449dc578903314f400461d13c4c08e02a18ef Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jo=C3=A3o=20T=C3=A1vora?= <joaotavora <at> gmail.com>
Date: Wed, 25 Oct 2023 22:36:15 +0100
Subject: [PATCH 2/2] Replace lazy hilit hash table with something simpler

* lisp/icomplete.el (icomplete-minibuffer-setup): Don't set
completion-lazy-hilit.
(icomplete-exhibit): Set it here.

* lisp/minibuffer.el (completion-lazy-hilit): Rework docstring.
(completion-lazy-hilit-fn): Rework from completion--lazy-hilit-re.
(completion--lazy-hilit-table): Delete.
(completion-pcm--hilit-commonality): Rework.
---
 lisp/icomplete.el  |  4 ++--
 lisp/minibuffer.el | 42 +++++++++++++++++++++---------------------
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index a9ac0b3f040..3e888c8b06a 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -545,7 +545,6 @@ icomplete-minibuffer-setup
     (setq-local icomplete--initial-input (icomplete--field-string))
     (setq-local completion-show-inline-help nil)
     (setq icomplete--scrolled-completions nil)
-    (setq completion-lazy-hilit (cl-gensym))
     (use-local-map (make-composed-keymap icomplete-minibuffer-map
     					 (current-local-map)))
     (add-hook 'post-command-hook #'icomplete-post-command-hook nil t)
@@ -723,7 +722,8 @@ icomplete-exhibit
              ;; Check if still in the right buffer (bug#61308)
              (or (window-minibuffer-p) completion-in-region--data)
              (icomplete-simple-completing-p)) ;Shouldn't be necessary.
-    (let ((saved-point (point)))
+    (let ((saved-point (point))
+          (completion-lazy-hilit t))
       (save-excursion
         (goto-char (icomplete--field-end))
         ;; Insert the match-status information:
diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index 4591f1145c8..4a727615afb 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -1234,6 +1234,7 @@ completion-all-completions
 POINT is the position of point within STRING.
 The return value is a list of completions and may contain the base-size
 in the last `cdr'."
+  (setq completion-lazy-hilit-fn nil)
   ;; FIXME: We need to additionally return the info needed for the
   ;; second part of completion-base-position.
   (completion--nth-completion 2 string table pred point metadata))
@@ -3753,24 +3754,16 @@ completion-lazy-hilit
   "If non-nil, request completion lazy hilighting.
 
 Completion-presenting frontends may opt to bind this variable to
-a unique non-nil value in the context of completion-producing
-calls (such as `completion-all-sorted-completions').  This hints
-the intervening completion styles that they do not need to
-propertize completion strings with the `face' property.
+non-nil value in the context of completion-producing calls (such
+as `completion-all-sorted-completions').  This hints the
+intervening completion styles that they do not need to propertize
+completion strings with the `face' property.
 
 When doing so, it is the frontend -- not the style -- who becomes
 responsible for `face'-propertizing only the completion strings
-that are meant to be displayed to the user.  This can be done by
-calling the function `completion-lazy-hilit' which returns a
-`face'-propertized string.
-
-The value stored in this variable by the completion frontend
-should be unique to each completion attempt or session that
-utilizes the same completion style in `completion-styles-alist'.
-For frontends using the minibuffer as the locus of completion
-calls and display, setting it to a buffer-local value given by
-`gensym' is appropriate.  For frontends operating entirely in a
-single command, let-binding it to `gensym' is appropriate.
+that are meant to be displayed to the user.  This is done by
+calling `completion-lazy-hilit' on each such string, which
+produces the suitably propertized string.
 
 Note that the optimization enabled by variable is only actually
 performed some completions styles.  To others, it is a harmless
@@ -3778,13 +3771,21 @@ completion-lazy-hilit
 advantage of this, look in the source of
 `completion-pcm--hilit-commonality'.")
 
+(defvar completion-lazy-hilit-fn nil
+  "Used by completions styles to honouring `completion-lazy-hilit'.
+When a given style wants to enable support for
+`completion-lazy-hilit' (which see), that style should set this
+variable to a function of one argument, a fresh string to be
+displayed to the user.  The function is responsible for
+destructively highlighting the string.")
+
 (defun completion-lazy-hilit (str)
   "Return a copy of completion STR that is `face'-propertized.
 See documentation for variable `completion-lazy-hilit' for more
 details."
-  (completion--hilit-from-re
-   (copy-sequence str)
-   (gethash completion-lazy-hilit completion--lazy-hilit-table)))
+  (if (and completion-lazy-hilit completion-lazy-hilit-fn)
+      (funcall completion-lazy-hilit-fn (copy-sequence str))
+    str))
 
 (defun completion--hilit-from-re (string regexp)
   "Fontify STRING with `completions-common-part' using REGEXP."
@@ -3883,8 +3884,6 @@ completion--flex-score-1
          last-b              b)))
     (/ score-numerator (* len (1+ score-denominator)) 1.0)))
 
-(defvar completion--lazy-hilit-table (make-hash-table :weakness 'key))
-
 (defun completion-pcm--hilit-commonality (pattern completions)
   "Show where and how well PATTERN matches COMPLETIONS.
 PATTERN, a list of symbols and strings as seen
@@ -3911,7 +3910,8 @@ completion-pcm--hilit-commonality
                            (md (cddr (setq last-md (match-data t last-md)))))
                       (completion--flex-score-1 md match-end (length str))))))
       (cond (completion-lazy-hilit
-             (puthash completion-lazy-hilit re completion--lazy-hilit-table)
+             (setq completion-lazy-hilit-fn
+                   (lambda (str) (completion--hilit-from-re str re)))
              (mapc (lambda (str)
                      (put-text-property 0 1 'completion-score (funcall score str) str))
                    completions))
-- 
2.39.2

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 26 Oct 2023 21:47:01 GMT) Full text and rfc822 format available.

Message #272 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 47711 <at> debbugs.gnu.org, Daniel Mendler <mail <at> daniel-mendler.de>
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Thu, 26 Oct 2023 22:49:07 +0100

[Message part 1 (text/plain, inline)]

João Távora <joaotavora <at> gmail.com> writes:

> So, backed by the new ability to conduct good benchmarks, I looked at
> the problem anew. I found some insight in the problem, and came up with
> a new "lazy-hilit" patch which performs just as well, if not slightly
> better, than Daniel's, while keeping the changes to lisp/minibuffer.el
> much more minimal and not adding replacement for the longstanding
> completion-all-completions.

Working on this a bit more, I've now been able to optimize the "lazy
hilit" patch even further by recognizing that in many situations we
don't need to match the "PCM" pattern to each string twice.  The first
time we do it, we can calculate a score immediately and store it in the
string.  The average response times for the "yo-yoo" test described
previously were:

   master:                 2.604s
   2021 lazy-hilit patch:  1.240s
   2023 Daniel+Dmitry:     0.831s
   2023 lazy-hilit v1:     0.792s

And the new one:

   2023 lazy-hilit v2:     0.518s

I'm now keeping my work in a branch called
feature/completion-lazy-hilit, but I still attach the diff here.
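
As an illustration of the score that gets stored on each string, here
is a simplified, self-contained sketch of the formula documented in the
patch comments.  Unlike `completion--flex-score-1', it takes the
matched segments as an explicit list of (START . END) pairs instead of
match data; the function name and that input shape are assumptions made
for the example.

```elisp
;; Sketch only: `my-flex-score' is a hypothetical simplification.
(defun my-flex-score (segments len &optional tightness)
  "Score a match given SEGMENTS, a list of (START . END) matched ranges.
LEN is the length of the candidate string.  TIGHTNESS plays the
role of `flex-score-match-tightness' and defaults to 3."
  (let ((tightness (or tightness 3))
        (numerator 0)
        (denominator 0)
        (last-end 0))
    (dolist (seg segments)
      (let ((a (car seg))
            (b (cdr seg)))
        ;; The numerator counts matched characters.
        (setq numerator (+ numerator (- b a)))
        ;; Each interior hole of length Li contributes
        ;; 1 + (Li - 1)^(1/tightness) to the denominator.
        (unless (or (= a last-end) (zerop last-end) (= a len))
          (setq denominator
                (+ denominator
                   1
                   (expt (- a last-end 1) (/ 1.0 tightness)))))
        (setq last-end b)))
    (/ (float numerator) (* len (1+ denominator)))))

;; "foo" against the three strings from the patch comments:
(my-flex-score '((3 . 6)) 9)                  ; "barfoobaz", no holes
(my-flex-score '((0 . 1) (7 . 9)) 9)          ; "fbarbazoo", one hole
(my-flex-score '((0 . 1) (4 . 5) (8 . 9)) 9)  ; "fabrobazo", two holes
```

With the default tightness the one-hole match scores higher than the
two-hole match, and only a pattern equal to the whole string reaches 1.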

João

PS: for the flex style in particular, there are even more optimizations
possible.  For example one could take advantage of the fact that in
flex, a longer pattern should always yield a subset of the completions
produced by a shorter pattern in the same completion session.  But this
requires a solid concept of a "completion session".
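
The subset property can be seen with a toy flex matcher.  The sketch
below is not how the flex style is implemented; `my-flex-regexp' and
`my-flex-filter' are hypothetical helpers that merely interleave ".*"
between the pattern's characters, and the example assumes the longer
pattern extends the shorter one over the same candidate list.

```elisp
;; Sketch only: a toy flex matcher, not the real `flex' style.
(require 'seq)

(defun my-flex-regexp (pattern)
  "Return a regexp matching PATTERN's characters in order, with gaps."
  (mapconcat (lambda (ch) (regexp-quote (char-to-string ch)))
             pattern
             ".*"))

(defun my-flex-filter (pattern candidates)
  "Return the members of CANDIDATES that flex-match PATTERN."
  (let ((re (my-flex-regexp pattern))
        (case-fold-search nil))
    (seq-filter (lambda (cand) (string-match-p re cand)) candidates)))

;; Narrowing the survivors of "fo" with "foo" gives the same list as
;; filtering the full candidate list with "foo".
(let* ((all '("foo" "fabrobazo" "fob" "bar"))
       (after-fo (my-flex-filter "fo" all)))
  (equal (my-flex-filter "foo" after-fo)
         (my-flex-filter "foo" all)))
```

A frontend or style that knew it was still in the same session could
therefore feed the previous result back in instead of the whole table,
which is where the need for a session concept comes from.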

[lazy-hilit-2023-v2.diff (text/x-patch, inline)]

diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index e6fdd1f1836..3e888c8b06a 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -722,7 +722,8 @@ icomplete-exhibit
              ;; Check if still in the right buffer (bug#61308)
              (or (window-minibuffer-p) completion-in-region--data)
              (icomplete-simple-completing-p)) ;Shouldn't be necessary.
-    (let ((saved-point (point)))
+    (let ((saved-point (point))
+          (completion-lazy-hilit t))
       (save-excursion
         (goto-char (icomplete--field-end))
         ;; Insert the match-status information:
@@ -754,12 +755,13 @@ icomplete-exhibit
                            (overlay-end rfn-eshadow-overlay)))
           (let* ((field-string (icomplete--field-string))
                  (text (while-no-input
+                         (benchmark-progn
                          (icomplete-completions
                           field-string
                           (icomplete--completion-table)
                           (icomplete--completion-predicate)
                           (if (window-minibuffer-p)
-                              (eq minibuffer--require-match t)))))
+                              (eq minibuffer--require-match t))))))
                  (buffer-undo-list t)
                  deactivate-mark)
             ;; Do nothing if while-no-input was aborted.
@@ -901,7 +903,7 @@ icomplete--render-vertical
                                 'icomplete-selected-match 'append comp)
      collect (concat prefix
                      (make-string (- max-prefix-len (length prefix)) ? )
-                     comp
+                     (completion-lazy-hilit comp)
                      (make-string (- max-comp-len (length comp)) ? )
                      suffix)
      into lines-aux
@@ -1067,7 +1069,8 @@ icomplete-completions
                   (if (< prospects-len prospects-max)
                       (push comp prospects)
                     (setq limit t)))
-                (setq prospects (nreverse prospects))
+                (setq prospects
+                      (nreverse (mapcar #'completion-lazy-hilit prospects)))
                 ;; Decorate first of the prospects.
                 (when prospects
                   (let ((first (copy-sequence (pop prospects))))
diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index 2120e31775e..b38eb49aba8 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -1234,6 +1234,7 @@ completion-all-completions
 POINT is the position of point within STRING.
 The return value is a list of completions and may contain the base-size
 in the last `cdr'."
+  (setq completion-lazy-hilit-fn nil)
   ;; FIXME: We need to additionally return the info needed for the
   ;; second part of completion-base-position.
   (completion--nth-completion 2 string table pred point metadata))
@@ -3720,21 +3721,33 @@ completion-pcm--all-completions
 
     ;; Use all-completions to do an initial cull.  This is a big win,
     ;; since all-completions is written in C!
-    (let* (;; Convert search pattern to a standard regular expression.
-	   (regex (completion-pcm--pattern->regex pattern))
-           (case-fold-search completion-ignore-case)
-           (completion-regexp-list (cons regex completion-regexp-list))
-	   (compl (all-completions
-                   (concat prefix
-                           (if (stringp (car pattern)) (car pattern) ""))
-		   table pred)))
-      (if (not (functionp table))
-	  ;; The internal functions already obeyed completion-regexp-list.
-	  compl
-	(let ((poss ()))
-	  (dolist (c compl)
-	    (when (string-match-p regex c) (push c poss)))
-	  (nreverse poss))))))
+    (let* ((case-fold-search completion-ignore-case)
+           (completion-regexp-list (cons
+                                    ;; Convert search pattern to a
+                                    ;; standard regular expression.
+                                    (completion-pcm--pattern->regex pattern)
+                                    completion-regexp-list))
+	   (completions (all-completions
+                         (concat prefix
+                                 (if (stringp (car pattern)) (car pattern) ""))
+		         table pred)))
+      (cond ((or (not (functionp table))
+                 (cl-loop for e in pattern never (stringp e)))
+	     ;; The internal functions already obeyed completion-regexp-list.
+	     completions)
+            (t
+             ;; The pattern has something interesting to match, in
+             ;; which case we take the opportunity to add an early
+             ;; completion-score cookie to each completion.
+             (cl-loop with re = (completion-pcm--pattern->regex pattern 'group)
+                      for orig in completions
+                      for comp = (copy-sequence orig)
+                      for score = (completion--flex-score comp re t)
+                      when score
+                      do (put-text-property 0 1 'completion-score
+                                      score
+                                      comp)
+                      and collect comp))))))
 
 (defvar flex-score-match-tightness 3
   "Controls how the `flex' completion style scores its matches.
@@ -3749,108 +3762,195 @@ flex-score-match-tightness
 than the latter (which has two \"holes\" and three
 one-letter-long matches).")
 
+(defvar-local completion-lazy-hilit nil
+  "If non-nil, request completion lazy highlighting.
+
+Completion-presenting frontends may opt to bind this variable to
+non-nil value in the context of completion-producing calls (such
+as `completion-all-sorted-completions').  This hints the
+intervening completion styles that they do not need to
+fontify (i.e. propertize with the `face' property) completion
+strings with highlights of the matching parts.
+
+When doing so, it is the frontend -- not the style -- who becomes
+responsible this fontification.  The frontend binds this variable
+to non-nil, and calls the function with the same name
+`completion-lazy-hilit' on each completion string that is to be
+displayed to the user.
+
+Note that only some completion styles take advantage of this
+variable for optimization purposes.  Other styles will ignore the
+hint and greedily fontify as usual.  It is still safe for a
+frontend to call `completion-lazy-hilit' in these situations.
+
+To author a completion style that takes advantage see
+`completion-lazy-hilit-fn' and look in the source of
+`completion-pcm--hilit-commonality'.")
+
+(defvar completion-lazy-hilit-fn nil
+  "Used by completions styles to honouring `completion-lazy-hilit'.
+When a given style wants to enable support for
+`completion-lazy-hilit' (which see), that style should set this
+variable to a function of one argument, a fresh string to be
+displayed to the user.  The function is responsible for
+destructively highlighting the string.")
+
+(defun completion-lazy-hilit (str)
+  "Return a copy of completion STR that is `face'-propertized.
+See documentation for variable `completion-lazy-hilit' for more
+details."
+  (if (and completion-lazy-hilit completion-lazy-hilit-fn)
+      (funcall completion-lazy-hilit-fn (copy-sequence str))
+    str))
+
+(defun completion--hilit-from-re (string regexp)
+  "Fontify STRING with `completions-common-part' using REGEXP."
+  (let* ((md (and regexp (string-match regexp string) (cddr (match-data t))))
+         (me (and md (match-end 0)))
+         (from 0))
+    (while md
+      (add-face-text-property from (pop md) 'completions-common-part nil string)
+      (setq from (pop md)))
+    (unless (or (not me) (= from me))
+      (add-face-text-property from me 'completions-common-part nil string))
+    string))
+
+(defun completion--flex-score-1 (md-groups match-end len)
+  "Compute matching score of completion.
+The score lies in the range between 0 and 1, where 1 corresponds to
+the full match.
+MD-GROUPS is the \"group\"  part of the match data.
+MATCH-END is the end of the match.
+LEN is the length of the completion string."
+  (let* ((from 0)
+         ;; To understand how this works, consider these simple
+         ;; ascii diagrams showing how the pattern "foo"
+         ;; flex-matches "fabrobazo", "fbarbazoo" and
+         ;; "barfoobaz":
+
+         ;;      f abr o baz o
+         ;;      + --- + --- +
+
+         ;;      f barbaz oo
+         ;;      + ------ ++
+
+         ;;      bar foo baz
+         ;;          +++
+
+         ;; "+" indicates parts where the pattern matched.  A
+         ;; "hole" in the middle of the string is indicated by
+         ;; "-".  Note that there are no "holes" near the edges
+         ;; of the string.  The completion score is a number
+         ;; bound by (0..1] (i.e., larger than (but not equal
+         ;; to) zero, and smaller or equal to one): the higher
+         ;; the better and only a perfect match (pattern equals
+         ;; string) will have score 1.  The formula takes the
+         ;; form of a quotient.  For the numerator, we use the
+         ;; number of +, i.e. the length of the pattern.  For
+         ;; the denominator, it first computes
+         ;;
+         ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
+         ;;
+         ;; , for each hole "i" of length "Li", where tightness
+         ;; is given by `flex-score-match-tightness'.  The
+         ;; final value for the denominator is then given by:
+         ;;
+         ;;    (SUM_across_i(hole_i_contrib) + 1) * len
+         ;;
+         ;; , where "len" is the string's length.
+         (score-numerator 0)
+         (score-denominator 0)
+         (last-b 0))
+    (while (and md-groups (car md-groups))
+      (let ((a from)
+            (b (pop md-groups)))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b))
+      (setq from (pop md-groups)))
+    ;; If `pattern' doesn't have an explicit trailing any, the
+    ;; regex `re' won't produce match data representing the
+    ;; region after the match.  We need to account for that
+    ;; extra bit of match (bug#42149).
+    (unless (= from match-end)
+      (let ((a from)
+            (b match-end))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b)))
+    (/ score-numerator (* len (1+ score-denominator)) 1.0)))
+
+(defvar completion--flex-score-last-md nil
+  "Helper variable for `completion--flex-score'.")
+
+(defun completion--flex-score (str re &optional dont-error)
+  "Compute flex score of completion STR based on RE.
+If DONT-ERROR, just return nil if RE doesn't match STR."
+  (cond ((string-match re str)
+         (let* ((match-end (match-end 0))
+                (md (cddr
+                     (setq
+                      completion--flex-score-last-md
+                      (match-data t completion--flex-score-last-md)))))
+           (completion--flex-score-1 md match-end (length str))))
+        ((not dont-error)
+         (error "Internal error: %s does not match %s" re str))))
+
 (defun completion-pcm--hilit-commonality (pattern completions)
   "Show where and how well PATTERN matches COMPLETIONS.
 PATTERN, a list of symbols and strings as seen
 `completion-pcm--merge-completions', is assumed to match every
-string in COMPLETIONS.  Return a deep copy of COMPLETIONS where
-each string is propertized with `completion-score', a number
-between 0 and 1, and with faces `completions-common-part',
-`completions-first-difference' in the relevant segments."
+string in COMPLETIONS.
+
+If `completion-lazy-hilit' is nil, return a deep copy of
+COMPLETIONS where each string is propertized with
+`completion-score', a number between 0 and 1, and with faces
+`completions-common-part', `completions-first-difference' in the
+relevant segments.
+
+Else, if `completion-lazy-hilit' is t, return COMPLETIONS where
+each string now has a `completion-score' property and no
+highlighting."
   (cond
    ((and completions (cl-loop for e in pattern thereis (stringp e)))
     (let* ((re (completion-pcm--pattern->regex pattern 'group))
-           (point-idx (completion-pcm--pattern-point-idx pattern))
-           (case-fold-search completion-ignore-case)
-           last-md)
-      (mapcar
-       (lambda (str)
-	 ;; Don't modify the string itself.
-         (setq str (copy-sequence str))
-         (unless (string-match re str)
-           (error "Internal error: %s does not match %s" re str))
-         (let* ((pos (if point-idx (match-beginning point-idx) (match-end 0)))
-                (match-end (match-end 0))
-                (md (cddr (setq last-md (match-data t last-md))))
-                (from 0)
-                (end (length str))
-                ;; To understand how this works, consider these simple
-                ;; ascii diagrams showing how the pattern "foo"
-                ;; flex-matches "fabrobazo", "fbarbazoo" and
-                ;; "barfoobaz":
-
-                ;;      f abr o baz o
-                ;;      + --- + --- +
-
-                ;;      f barbaz oo
-                ;;      + ------ ++
-
-                ;;      bar foo baz
-                ;;          +++
-
-                ;; "+" indicates parts where the pattern matched.  A
-                ;; "hole" in the middle of the string is indicated by
-                ;; "-".  Note that there are no "holes" near the edges
-                ;; of the string.  The completion score is a number
-                ;; bound by (0..1] (i.e., larger than (but not equal
-                ;; to) zero, and smaller or equal to one): the higher
-                ;; the better and only a perfect match (pattern equals
-                ;; string) will have score 1.  The formula takes the
-                ;; form of a quotient.  For the numerator, we use the
-                ;; number of +, i.e. the length of the pattern.  For
-                ;; the denominator, it first computes
-                ;;
-                ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
-                ;;
-                ;; , for each hole "i" of length "Li", where tightness
-                ;; is given by `flex-score-match-tightness'.  The
-                ;; final value for the denominator is then given by:
-                ;;
-                ;;    (SUM_across_i(hole_i_contrib) + 1) * len
-                ;;
-                ;; , where "len" is the string's length.
-                (score-numerator 0)
-                (score-denominator 0)
-                (last-b 0)
-                (update-score-and-face
-                 (lambda (a b)
-                   "Update score and face given match range (A B)."
-                   (add-face-text-property a b
-                                           'completions-common-part
-                                           nil str)
-                   (setq
-                    score-numerator   (+ score-numerator (- b a)))
-                   (unless (or (= a last-b)
-                               (zerop last-b)
-                               (= a (length str)))
-                     (setq
-                      score-denominator (+ score-denominator
-                                           1
-                                           (expt (- a last-b 1)
-                                                 (/ 1.0
-                                                    flex-score-match-tightness)))))
-                   (setq
-                    last-b              b))))
-           (while md
-             (funcall update-score-and-face from (pop md))
-             (setq from (pop md)))
-           ;; If `pattern' doesn't have an explicit trailing any, the
-           ;; regex `re' won't produce match data representing the
-           ;; region after the match.  We need to account to account
-           ;; for that extra bit of match (bug#42149).
-           (unless (= from match-end)
-             (funcall update-score-and-face from match-end))
-           (if (> (length str) pos)
-               (add-face-text-property
-                pos (1+ pos)
-                'completions-first-difference
-                nil str))
-           (unless (zerop (length str))
-             (put-text-property
-              0 1 'completion-score
-              (/ score-numerator (* end (1+ score-denominator)) 1.0) str)))
-         str)
-       completions)))
+           (score-maybe (lambda (str)
+                          (unless (get-text-property 0 'completion-score str)
+                            (put-text-property 0 1 'completion-score
+                                               (completion--flex-score str re)
+                                               str)))))
+      (cond (completion-lazy-hilit
+             (setq completion-lazy-hilit-fn
+                   (lambda (str) (completion--hilit-from-re str re)))
+             (mapc score-maybe completions))
+            (t
+             (mapcar
+              (lambda (str)
+                (setq str (copy-sequence str))
+                (funcall score-maybe str)
+                (completion--hilit-from-re str re)
+                str)
+              completions)))))
    (t completions)))
 
 (defun completion-pcm--find-all-completions (string table pred point
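
The scoring formula described in the comment of `completion--flex-score-1'
can be checked by hand with a small standalone sketch.  This is not the
patch's code: `my/flex-score-from-holes' is a hypothetical helper that
takes the hole lengths directly instead of reading match data, and it
assumes the default tightness of 3.

```elisp
(defun my/flex-score-from-holes (pattern-len hole-lengths len &optional tightness)
  "Return the score given by the formula in the comment above.
PATTERN-LEN is the number of matched characters, HOLE-LENGTHS is
the list of interior hole lengths, and LEN is the length of the
candidate string.  TIGHTNESS defaults to 3 (an assumption that
mirrors the default of `flex-score-match-tightness')."
  (let ((tightness (or tightness 3))
        (sum 0.0))
    ;; hole_i_contrib = 1 + (Li-1)^(1/tightness)
    (dolist (l hole-lengths)
      (setq sum (+ sum 1 (expt (- l 1) (/ 1.0 tightness)))))
    ;; score = pattern-len / ((SUM + 1) * len)
    (/ pattern-len (* len (1+ sum)))))

;; The three diagrams for the pattern "foo":
(list (my/flex-score-from-holes 3 '(3 3) 9)   ; "fabrobazo", two holes
      (my/flex-score-from-holes 3 '(6) 9)     ; "fbarbazoo", one hole
      (my/flex-score-from-holes 3 nil 9))     ; "barfoobaz", no holes
```

The three values come out in increasing order, and only a candidate equal
to the pattern (no holes, LEN equal to PATTERN-LEN) scores exactly 1.0.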

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 26 Oct 2023 23:12:01 GMT) Full text and rfc822 format available.

Message #275 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 02:10:48 +0300

Hi Joao,

Thanks for the updates and the numbers.

On 27/10/2023 00:49, João Távora wrote:
> João Távora<joaotavora <at> gmail.com>  writes:
> 
>> So, backed by the new ability to conduct good benchmarks, I looked at
>> the problem anew. I found some insight in the problem, and came up with
>> a new "lazy-hilit" patch which performs just as well, if not slightly
>> better, than Daniel's, while keeping the changes to lisp/minibuffer.el
>> much more minimal and not adding replacement for the longstanding
>> completion-all-completions.
> Working on this a bit more, I've now been able to optimize the "lazy
> hilit" patch even further by recognizing that in many situations we
> don't need to match the "PCM" pattern to each string twice.  The first
> time we do it, we can calculate a score immediately and store it in the
> string.  The average response times for the "yo-yoo" test described
> previously:

This latest change in particular regressed this related scenario:

;; Use 'set' to ensure that the variables are bound.
(cl-loop repeat 300000 do (set (intern (symbol-name (gensym "yoyo"))) 4))

C-h v <then type y o <backspace> o <backspace> ...>

It increased the timings for that scenario from ~0.600s (with either of 
the latest filter-deferred patches and your previous version) to ~1s.

My understanding is it's due to the judicious call (copy-sequence orig) 
that you added before 'put-text-property' is called. While it seems like 
a good idea to preserve the original value, when almost all of obarray 
matches the current input (which is the current scenario), a lot of 
strings will be copied.

But (completing-read "" obarray) is not affected, and I don't see why. 
Maybe because it re-sorts less than icomplete? And the consing triggers 
the GC threshold less often (thanks to the other changes).
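
A rough way to see the cost of the extra copy in isolation is sketched
below.  The helper names (`my/score-copies', `my/score-in-place'), the
dummy score 0.5 and the 100000 generated strings are all assumptions for
illustration; this is not the code path in the patch, only the two
strategies side by side.

```elisp
(require 'cl-lib)
(require 'benchmark)

(defun my/score-copies (strings)
  "Return fresh copies of STRINGS, each with a dummy `completion-score'."
  (mapcar (lambda (s)
            (let ((copy (copy-sequence s)))
              (put-text-property 0 1 'completion-score 0.5 copy)
              copy))
          strings))

(defun my/score-in-place (strings)
  "Destructively add a dummy `completion-score' to each of STRINGS."
  (dolist (s strings strings)
    (put-text-property 0 1 'completion-score 0.5 s)))

(let ((strings (cl-loop for i below 100000
                        collect (format "yoyo%d" i))))
  ;; `benchmark-run' returns (ELAPSED GC-RUNS GC-ELAPSED).
  (message "copying:  %S" (benchmark-run 1 (my/score-copies strings)))
  (message "in place: %S" (benchmark-run 1 (my/score-in-place strings))))
```

The copying variant allocates one new string per candidate, so the GC
columns of the two results are the ones worth comparing.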

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 26 Oct 2023 23:25:01 GMT) Full text and rfc822 format available.

Message #278 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 00:27:12 +0100

[Message part 1 (text/plain, inline)]

Dmitry Gutov <dmitry <at> gutov.dev> writes:

> My understanding is it's due to the judicious call (copy-sequence
> orig) that you added before 'put-text-property' is called. While it
> seems like a good idea to preserve the original value, when almost all
> of obarray matches the current input (which is the current scenario),
> a lot of strings will be copied.

You're right, I reproduced the regression.  I thought I had taken out
the copy-sequence, but I had left it in.  At an earlier stage I suspected
that I needed the copy, but I don't think I do.  Please try this new
patch, which removes it.  I've also pushed it to the
feature/completion-lazy-hilit branch.

João

[lazy-hilit-2023-v3.diff (text/x-patch, inline)]

diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index e6fdd1f1836..3e888c8b06a 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -722,7 +722,8 @@ icomplete-exhibit
              ;; Check if still in the right buffer (bug#61308)
              (or (window-minibuffer-p) completion-in-region--data)
              (icomplete-simple-completing-p)) ;Shouldn't be necessary.
-    (let ((saved-point (point)))
+    (let ((saved-point (point))
+          (completion-lazy-hilit t))
       (save-excursion
         (goto-char (icomplete--field-end))
         ;; Insert the match-status information:
@@ -754,12 +755,13 @@ icomplete-exhibit
                            (overlay-end rfn-eshadow-overlay)))
           (let* ((field-string (icomplete--field-string))
                  (text (while-no-input
+                         (benchmark-progn
                          (icomplete-completions
                           field-string
                           (icomplete--completion-table)
                           (icomplete--completion-predicate)
                           (if (window-minibuffer-p)
-                              (eq minibuffer--require-match t)))))
+                              (eq minibuffer--require-match t))))))
                  (buffer-undo-list t)
                  deactivate-mark)
             ;; Do nothing if while-no-input was aborted.
@@ -901,7 +903,7 @@ icomplete--render-vertical
                                 'icomplete-selected-match 'append comp)
      collect (concat prefix
                      (make-string (- max-prefix-len (length prefix)) ? )
-                     comp
+                     (completion-lazy-hilit comp)
                      (make-string (- max-comp-len (length comp)) ? )
                      suffix)
      into lines-aux
@@ -1067,7 +1069,8 @@ icomplete-completions
                   (if (< prospects-len prospects-max)
                       (push comp prospects)
                     (setq limit t)))
-                (setq prospects (nreverse prospects))
+                (setq prospects
+                      (nreverse (mapcar #'completion-lazy-hilit prospects)))
                 ;; Decorate first of the prospects.
                 (when prospects
                   (let ((first (copy-sequence (pop prospects))))
diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index 2120e31775e..ecde00dd28d 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -1234,6 +1234,7 @@ completion-all-completions
 POINT is the position of point within STRING.
 The return value is a list of completions and may contain the base-size
 in the last `cdr'."
+  (setq completion-lazy-hilit-fn nil)
   ;; FIXME: We need to additionally return the info needed for the
   ;; second part of completion-base-position.
   (completion--nth-completion 2 string table pred point metadata))
@@ -3720,21 +3721,32 @@ completion-pcm--all-completions
 
     ;; Use all-completions to do an initial cull.  This is a big win,
     ;; since all-completions is written in C!
-    (let* (;; Convert search pattern to a standard regular expression.
-	   (regex (completion-pcm--pattern->regex pattern))
-           (case-fold-search completion-ignore-case)
-           (completion-regexp-list (cons regex completion-regexp-list))
-	   (compl (all-completions
-                   (concat prefix
-                           (if (stringp (car pattern)) (car pattern) ""))
-		   table pred)))
-      (if (not (functionp table))
-	  ;; The internal functions already obeyed completion-regexp-list.
-	  compl
-	(let ((poss ()))
-	  (dolist (c compl)
-	    (when (string-match-p regex c) (push c poss)))
-	  (nreverse poss))))))
+    (let* ((case-fold-search completion-ignore-case)
+           (completion-regexp-list (cons
+                                    ;; Convert search pattern to a
+                                    ;; standard regular expression.
+                                    (completion-pcm--pattern->regex pattern)
+                                    completion-regexp-list))
+	   (completions (all-completions
+                         (concat prefix
+                                 (if (stringp (car pattern)) (car pattern) ""))
+		         table pred)))
+      (cond ((or (not (functionp table))
+                 (cl-loop for e in pattern never (stringp e)))
+	     ;; The internal functions already obeyed completion-regexp-list.
+	     completions)
+            (t
+             ;; The pattern has something interesting to match, in
+             ;; which case we take the opportunity to add an early
+             ;; completion-score cookie to each completion.
+             (cl-loop with re = (completion-pcm--pattern->regex pattern 'group)
+                      for comp in completions
+                      for score = (completion--flex-score comp re t)
+                      when score
+                      do (put-text-property 0 1 'completion-score
+                                      score
+                                      comp)
+                      and collect comp))))))
 
 (defvar flex-score-match-tightness 3
   "Controls how the `flex' completion style scores its matches.
@@ -3749,108 +3761,195 @@ flex-score-match-tightness
 than the latter (which has two \"holes\" and three
 one-letter-long matches).")
 
+(defvar-local completion-lazy-hilit nil
+  "If non-nil, request completion lazy highlighting.
+
+Completion-presenting frontends may opt to bind this variable to
+a non-nil value in the context of completion-producing calls (such
+as `completion-all-sorted-completions').  This hints to the
+intervening completion styles that they do not need to
+fontify (i.e. propertize with the `face' property) completion
+strings with highlights of the matching parts.
+
+When doing so, it is the frontend -- not the style -- that becomes
+responsible for this fontification.  The frontend binds this
+variable to non-nil, and calls the function with the same name
+`completion-lazy-hilit' on each completion string that is to be
+displayed to the user.
+
+Note that only some completion styles take advantage of this
+variable for optimization purposes.  Other styles will ignore the
+hint and greedily fontify as usual.  It is still safe for a
+frontend to call `completion-lazy-hilit' in these situations.
+
+To author a completion style that takes advantage of this, see
+`completion-lazy-hilit-fn' and look in the source of
+`completion-pcm--hilit-commonality'.")
+
+(defvar completion-lazy-hilit-fn nil
+  "Used by completion styles to honour `completion-lazy-hilit'.
+When a given style wants to enable support for
+`completion-lazy-hilit' (which see), that style should set this
+variable to a function of one argument, a fresh string to be
+displayed to the user.  The function is responsible for
+destructively highlighting the string.")
+
+(defun completion-lazy-hilit (str)
+  "Return a copy of completion STR that is `face'-propertized.
+See documentation for variable `completion-lazy-hilit' for more
+details."
+  (if (and completion-lazy-hilit completion-lazy-hilit-fn)
+      (funcall completion-lazy-hilit-fn (copy-sequence str))
+    str))
+
+(defun completion--hilit-from-re (string regexp)
+  "Fontify STRING with `completions-common-part' using REGEXP."
+  (let* ((md (and regexp (string-match regexp string) (cddr (match-data t))))
+         (me (and md (match-end 0)))
+         (from 0))
+    (while md
+      (add-face-text-property from (pop md) 'completions-common-part nil string)
+      (setq from (pop md)))
+    (unless (or (not me) (= from me))
+      (add-face-text-property from me 'completions-common-part nil string))
+    string))
+
+(defun completion--flex-score-1 (md-groups match-end len)
+  "Compute matching score of completion.
+The score lies in the range between 0 and 1, where 1 corresponds to
+the full match.
+MD-GROUPS is the \"group\"  part of the match data.
+MATCH-END is the end of the match.
+LEN is the length of the completion string."
+  (let* ((from 0)
+         ;; To understand how this works, consider these simple
+         ;; ascii diagrams showing how the pattern "foo"
+         ;; flex-matches "fabrobazo", "fbarbazoo" and
+         ;; "barfoobaz":
+
+         ;;      f abr o baz o
+         ;;      + --- + --- +
+
+         ;;      f barbaz oo
+         ;;      + ------ ++
+
+         ;;      bar foo baz
+         ;;          +++
+
+         ;; "+" indicates parts where the pattern matched.  A
+         ;; "hole" in the middle of the string is indicated by
+         ;; "-".  Note that there are no "holes" near the edges
+         ;; of the string.  The completion score is a number
+         ;; bound by (0..1] (i.e., larger than (but not equal
+         ;; to) zero, and smaller or equal to one): the higher
+         ;; the better and only a perfect match (pattern equals
+         ;; string) will have score 1.  The formula takes the
+         ;; form of a quotient.  For the numerator, we use the
+         ;; number of +, i.e. the length of the pattern.  For
+         ;; the denominator, it first computes
+         ;;
+         ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
+         ;;
+         ;; , for each hole "i" of length "Li", where tightness
+         ;; is given by `flex-score-match-tightness'.  The
+         ;; final value for the denominator is then given by:
+         ;;
+         ;;    (SUM_across_i(hole_i_contrib) + 1) * len
+         ;;
+         ;; , where "len" is the string's length.
+         (score-numerator 0)
+         (score-denominator 0)
+         (last-b 0))
+    (while (and md-groups (car md-groups))
+      (let ((a from)
+            (b (pop md-groups)))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b))
+      (setq from (pop md-groups)))
+    ;; If `pattern' doesn't have an explicit trailing any, the
+    ;; regex `re' won't produce match data representing the
+    ;; region after the match.  We need to account for that
+    ;; extra bit of match (bug#42149).
+    (unless (= from match-end)
+      (let ((a from)
+            (b match-end))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b)))
+    (/ score-numerator (* len (1+ score-denominator)) 1.0)))
+
+(defvar completion--flex-score-last-md nil
+  "Helper variable for `completion--flex-score'.")
+
+(defun completion--flex-score (str re &optional dont-error)
+  "Compute flex score of completion STR based on RE.
+If DONT-ERROR, just return nil if RE doesn't match STR."
+  (cond ((string-match re str)
+         (let* ((match-end (match-end 0))
+                (md (cddr
+                     (setq
+                      completion--flex-score-last-md
+                      (match-data t completion--flex-score-last-md)))))
+           (completion--flex-score-1 md match-end (length str))))
+        ((not dont-error)
+         (error "Internal error: %s does not match %s" re str))))
+
 (defun completion-pcm--hilit-commonality (pattern completions)
   "Show where and how well PATTERN matches COMPLETIONS.
 PATTERN, a list of symbols and strings as seen
 `completion-pcm--merge-completions', is assumed to match every
-string in COMPLETIONS.  Return a deep copy of COMPLETIONS where
-each string is propertized with `completion-score', a number
-between 0 and 1, and with faces `completions-common-part',
-`completions-first-difference' in the relevant segments."
+string in COMPLETIONS.
+
+If `completion-lazy-hilit' is nil, return a deep copy of
+COMPLETIONS where each string is propertized with
+`completion-score', a number between 0 and 1, and with faces
+`completions-common-part', `completions-first-difference' in the
+relevant segments.
+
+Else, if `completion-lazy-hilit' is t, return COMPLETIONS where
+each string now has a `completion-score' property and no
+highlighting."
   (cond
    ((and completions (cl-loop for e in pattern thereis (stringp e)))
     (let* ((re (completion-pcm--pattern->regex pattern 'group))
-           (point-idx (completion-pcm--pattern-point-idx pattern))
-           (case-fold-search completion-ignore-case)
-           last-md)
-      (mapcar
-       (lambda (str)
-	 ;; Don't modify the string itself.
-         (setq str (copy-sequence str))
-         (unless (string-match re str)
-           (error "Internal error: %s does not match %s" re str))
-         (let* ((pos (if point-idx (match-beginning point-idx) (match-end 0)))
-                (match-end (match-end 0))
-                (md (cddr (setq last-md (match-data t last-md))))
-                (from 0)
-                (end (length str))
-                ;; To understand how this works, consider these simple
-                ;; ascii diagrams showing how the pattern "foo"
-                ;; flex-matches "fabrobazo", "fbarbazoo" and
-                ;; "barfoobaz":
-
-                ;;      f abr o baz o
-                ;;      + --- + --- +
-
-                ;;      f barbaz oo
-                ;;      + ------ ++
-
-                ;;      bar foo baz
-                ;;          +++
-
-                ;; "+" indicates parts where the pattern matched.  A
-                ;; "hole" in the middle of the string is indicated by
-                ;; "-".  Note that there are no "holes" near the edges
-                ;; of the string.  The completion score is a number
-                ;; bound by (0..1] (i.e., larger than (but not equal
-                ;; to) zero, and smaller or equal to one): the higher
-                ;; the better and only a perfect match (pattern equals
-                ;; string) will have score 1.  The formula takes the
-                ;; form of a quotient.  For the numerator, we use the
-                ;; number of +, i.e. the length of the pattern.  For
-                ;; the denominator, it first computes
-                ;;
-                ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
-                ;;
-                ;; , for each hole "i" of length "Li", where tightness
-                ;; is given by `flex-score-match-tightness'.  The
-                ;; final value for the denominator is then given by:
-                ;;
-                ;;    (SUM_across_i(hole_i_contrib) + 1) * len
-                ;;
-                ;; , where "len" is the string's length.
-                (score-numerator 0)
-                (score-denominator 0)
-                (last-b 0)
-                (update-score-and-face
-                 (lambda (a b)
-                   "Update score and face given match range (A B)."
-                   (add-face-text-property a b
-                                           'completions-common-part
-                                           nil str)
-                   (setq
-                    score-numerator   (+ score-numerator (- b a)))
-                   (unless (or (= a last-b)
-                               (zerop last-b)
-                               (= a (length str)))
-                     (setq
-                      score-denominator (+ score-denominator
-                                           1
-                                           (expt (- a last-b 1)
-                                                 (/ 1.0
-                                                    flex-score-match-tightness)))))
-                   (setq
-                    last-b              b))))
-           (while md
-             (funcall update-score-and-face from (pop md))
-             (setq from (pop md)))
-           ;; If `pattern' doesn't have an explicit trailing any, the
-           ;; regex `re' won't produce match data representing the
-           ;; region after the match.  We need to account to account
-           ;; for that extra bit of match (bug#42149).
-           (unless (= from match-end)
-             (funcall update-score-and-face from match-end))
-           (if (> (length str) pos)
-               (add-face-text-property
-                pos (1+ pos)
-                'completions-first-difference
-                nil str))
-           (unless (zerop (length str))
-             (put-text-property
-              0 1 'completion-score
-              (/ score-numerator (* end (1+ score-denominator)) 1.0) str)))
-         str)
-       completions)))
+           (score-maybe (lambda (str)
+                          (unless (get-text-property 0 'completion-score str)
+                            (put-text-property 0 1 'completion-score
+                                               (completion--flex-score str re)
+                                               str)))))
+      (cond (completion-lazy-hilit
+             (setq completion-lazy-hilit-fn
+                   (lambda (str) (completion--hilit-from-re str re)))
+             (mapc score-maybe completions))
+            (t
+             (mapcar
+              (lambda (str)
+                (setq str (copy-sequence str))
+                (funcall score-maybe str)
+                (completion--hilit-from-re str re)
+                str)
+              completions)))))
    (t completions)))
 
 (defun completion-pcm--find-all-completions (string table pred point
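
Following the docstring of the `completion-lazy-hilit' variable, a
frontend would use the patch roughly as sketched here.  This assumes an
Emacs with the patch applied; `my/visible-completions' is a hypothetical
helper, and on an Emacs without the patch it falls back to returning the
strings as the style produced them.

```elisp
(require 'seq)

;; Declare the variable special so the `let' below binds it dynamically.
(defvar completion-lazy-hilit)

(defun my/visible-completions (input table n)
  "Return the first N completions of INPUT in TABLE, highlighted lazily."
  (let* ((completion-lazy-hilit t)
         (all (completion-all-completions input table nil (length input)))
         (tail (last all)))
    ;; The last cdr may hold the base size; turn ALL into a proper list.
    (when (consp tail) (setcdr tail nil))
    ;; Only the N strings that will be shown get fontified.
    (mapcar (if (fboundp 'completion-lazy-hilit)
                #'completion-lazy-hilit
              #'identity)
            (seq-take all n))))

(my/visible-completions "fo" '("foo" "foobar" "baz") 2)
```

The call to the function `completion-lazy-hilit' has to happen while the
variable of the same name is still bound, which is why the `mapcar' sits
inside the `let*'.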

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 26 Oct 2023 23:27:02 GMT) Full text and rfc822 format available.

Message #281 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 02:25:25 +0300

On 25/10/2023 20:52, João Távora wrote:
> And make sure to put 300 000 symbols in the obarray.  The symbols are
> prefixed "yoyo" deliberately.
> 
>      (cl-loop repeat 300000 do (intern (symbol-name (gensym "yoyo"))))
> 
> First a micro-benchmark:
> 
>     ;; Daniel's patch worked by Dmitry (v3)
>     (benchmark-run 50
>      (let ((completion-styles '(flex)))
>        (completion-filter-completions "" obarray 'fboundp 0 nil)
>        (completion-filter-completions "yo" obarray 'fboundp 0 nil)
>        (completion-filter-completions "yoo" obarray 'fboundp 0 nil)
>        ));; => (12.192422429999999 3 0.107881004)
> 
> 
>    ;; lazy-hilit v4 patch attached in this email
>    (benchmark-run 50
>      (let ((completion-styles '(flex))
>            (completion-lazy-hilit (cl-gensym)))
>        (completion-all-completions "" obarray 'fboundp 0 nil)
>        (completion-all-completions "yo" obarray 'fboundp 0 nil)
>        (completion-all-completions "yoo" obarray 'fboundp 0 nil)
>        ));; => (12.267915333 4 0.14799709099999991)

Note on this particular test:

The loop on the first line only creates the symbols in the obarray, but 
not functions. As a result, all the completion-all-completions calls 
return nil because of the 'fboundp' predicate. When you change the 
predicate argument to nil, these timings change considerably (so it's 
wiser to reduce the number of repetitions to 1 or 5 at most), with 
completion-filter-completions being ~2.5x faster than the other.
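
To illustrate the point about the predicate, here is a minimal sketch of a variant setup in which the generated symbols do satisfy `fboundp'. The count of 1000 and the `demo-' names are assumptions chosen for illustration and are not part of the original benchmark.

```elisp
(require 'cl-lib)

;; Intern 1000 fresh "yoyo"-prefixed symbols (a hypothetical, smaller
;; count than in the thread) and give each one a function definition,
;; so that the `fboundp' predicate accepts them.
(defvar demo-yoyo-symbols
  (cl-loop repeat 1000
           for sym = (intern (symbol-name (gensym "yoyo")))
           do (fset sym #'ignore)
           collect sym))

(defun demo-yoyo-count (pred)
  "Return the number of flex completions of \"yoyo\" in `obarray' passing PRED."
  (let* ((completion-styles '(flex))
         (all (completion-all-completions "yoyo" obarray pred 4)))
    ;; The result may carry the base size in its last cdr, so count
    ;; conses with `safe-length' rather than `length'.
    (safe-length all)))
```

With this setup, (demo-yoyo-count #'fboundp) should no longer come back empty, so a benchmark built on it would exercise filtering and scoring rather than an empty result.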

It is slower in the sorting step, though: mostly due to the extra 
consing created with the alist to-be-sorted, I guess, but also because 
of the repeated string-match call (which, while fast and much faster 
than the match-data call, is still not free).

That's why, when compared in practice using fido-vertical-mode, the 
results were about the same.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 26 Oct 2023 23:37:01 GMT) Full text and rfc822 format available.

Message #284 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 02:35:33 +0300

On 27/10/2023 02:27, João Távora wrote:
> Dmitry Gutov<dmitry <at> gutov.dev>  writes:
> 
>> My understanding is it's due to the judicious call (copy-sequence
>> orig) that you added before 'put-text-property' is called. While it
>> seems like a good idea to preserve the original value, when almost all
>> of obarray matches the current input (which is the current scenario),
>> a lot of strings will be copied.
> You're right, I reproduced the regression.  I thought I had taken out
> the copy-sequence, but forgot it there.  In an earlier stage I suspected
> that I needed the copy, but I don't think I do.  Please try this new
> patch that removes it.  I've also pushed it to the
> feature/completion-lazy-hilit branch.

Yep, without copy-sequence the regression is gone. Now the input strings 
are routinely mutated, though. ;-(
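
A minimal sketch of what "mutated" means here, assuming the score is stored with `put-text-property' as in the patch. The function name `demo-tag-score' is hypothetical.

```elisp
(defun demo-tag-score (str score &optional copy)
  "Store SCORE on STR as its `completion-score' text property.
If COPY is non-nil, tag a fresh copy instead of STR itself.
Return the string that was tagged."
  (let ((target (if copy (copy-sequence str) str)))
    (put-text-property 0 1 'completion-score score target)
    target))
```

Without COPY the property lands on the very string object that the completion table handed out, so later users of that table see it too. With COPY the original stays untouched, at the cost of one allocation per candidate.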

You could do a copy a little later -- after the match succeeds and the 
score is computed. But for short widely-matching inputs like 'a' that 
would make little difference.

I also experimented with hash-tables for "external" score storage. But 
it still comes out a little slower than either of the proposed solutions.
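
For reference, "external" score storage could look roughly like the sketch below. This is an assumption about the general shape of such an experiment, not the code that was actually measured. `demo-score-externally' and its SCORE-FN argument are hypothetical names.

```elisp
(defun demo-score-externally (strings score-fn)
  "Return a hash table mapping each of STRINGS to its score.
SCORE-FN is called with one string and returns a number.
The strings themselves are not modified."
  (let ((scores (make-hash-table :test #'eq :size (length strings))))
    (dolist (s strings scores)
      (puthash s (funcall score-fn s) scores))))
```

The table is keyed by string identity (`eq'), so a sort predicate can look scores up with `gethash' without touching text properties.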

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 26 Oct 2023 23:43:01 GMT) Full text and rfc822 format available.

Message #287 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 00:44:30 +0100

On Fri, Oct 27, 2023 at 12:25 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 25/10/2023 20:52, João Távora wrote:
> > And make sure to put 300 000 symbols in the obarray.  The symbols are
> > prefixed "yoyo" deliberately.
> >
> >      (cl-loop repeat 300000 do (intern (symbol-name (gensym "yoyo"))))
> >
> > First a micro-benchmark:
> >
> >     ;; Daniel's patch worked by Dmitry (v3)
> >     (benchmark-run 50
> >      (let ((completion-styles '(flex)))
> >        (completion-filter-completions "" obarray 'fboundp 0 nil)
> >        (completion-filter-completions "yo" obarray 'fboundp 0 nil)
> >        (completion-filter-completions "yoo" obarray 'fboundp 0 nil)
> >        ));; => (12.192422429999999 3 0.107881004)
> >
> >
> >    ;; lazy-hilit v4 patch attached in this email
> >    (benchmark-run 50
> >      (let ((completion-styles '(flex))
> >            (completion-lazy-hilit (cl-gensym)))
> >        (completion-all-completions "" obarray 'fboundp 0 nil)
> >        (completion-all-completions "yo" obarray 'fboundp 0 nil)
> >        (completion-all-completions "yoo" obarray 'fboundp 0 nil)
> >        ));; => (12.267915333 4 0.14799709099999991)
>
> Note on this particular test:
>
> The loop on the first line only creates the symbols in the obarray, but
> not functions. As a result, all the completion-all-completions calls
> return nil because of the 'fboundp' predicate. When you change the
> predicate argument to nil, these timings change considerably (so it's
> wiser to reduce the number of repetitions to 1 or 5 at most), with
> completion-filter-completions being ~2.5x faster than the other.

Right, I missed this.  I can reproduce it, though only around 2.0x faster here.

> It is slower in the sorting step, though: mostly due to the extra
> consing created with the alist to-be-sorted, I guess, but also because
> of the repeated string-match call (which, while fast and much faster
> than the match-data call, is still not free).

And is the sorting step not included in the full call to
completion-filter-completions?  I don't fully understand how it works.

> That's why, when compared in practice using fido-vertical-mode, the
> results were about the same.

But that's what matters right?

Also in the last iteration of the "yoyo" fido-vertical-mode test,
results with my latest patch are quite a bit faster.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 26 Oct 2023 23:51:02 GMT) Full text and rfc822 format available.

Message #290 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 00:52:06 +0100

On Fri, Oct 27, 2023 at 12:35 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 27/10/2023 02:27, João Távora wrote:
> > Dmitry Gutov<dmitry <at> gutov.dev>  writes:
> >
> >> My understanding is it's due to the judicious call (copy-sequence
> >> orig) that you added before 'put-text-property' is called. While it
> >> seems like a good idea to preserve the original value, when almost all
> >> of obarray matches the current input (which is the current scenario),
> >> a lot of strings will be copied.
> > You're right, I reproduced the regression.  I thought I had taken out
> > the copy-sequence, but forgot it there.  In an earlier stage I suspected
> > that I needed the copy, but I don't think I do.  Please try this new
> > patch that removes it.  I've also pushed it to the
> > feature/completion-lazy-hilit branch.
>
> Yep, without copy-sequence the regression is gone. Now the input strings
> are routinely mutated, though. ;-(

Not sure what you mean by mutated, the strings look fine to me,
does this make any visible problem?  I couldn't detect it.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 00:13:01 GMT) Full text and rfc822 format available.

Message #293 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 03:11:46 +0300

On 27/10/2023 02:44, João Távora wrote:
>> It is slower in the sorting step, though: mostly due to the extra
>> consing created with the alist to-be-sorted, I guess, but also because
>> of the repeated string-match call (which, while fast and much faster
>> than the match-data call, is still not free).
> And is the sorting step not included in the full call to
> completion-filter-completions?  I don't fully understand how it works.

It recomputes all the scores inside the display-sort-function.

>> That's why, when compared in practice using fido-vertical-mode, the
>> results were about the same.
> But that's what matters right?

Pretty much, yes. Except for some potential exotic cases where the UI or 
the user would want to override the sorting logic. Corfu and Vertico 
have such capability, but I'm not sure if it's used often.

> Also in the last iteration of the "yoyo" fido-vertical-mode test,
> results with my latest patch are quite a bit faster.

Hmm, your latest (lazy-hilit-2023-v3.diff) improves the (completing-read 
"" obarray) scenario a lot (over the original two 2023 solutions), but 
not the 'C-h v' scenario. I don't have an explanation.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 00:13:02 GMT) Full text and rfc822 format available.

Message #296 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 01:14:45 +0100

On Fri, Oct 27, 2023 at 12:44 AM João Távora <joaotavora <at> gmail.com> wrote:

> > It is slower in the sorting step, though: mostly due to the extra
> > consing created with the alist to-be-sorted, I guess, but also because
> > of the repeated string-match call (which, while fast and much faster
> > than the match-data call, is still not free).
>
> And is the sorting step not included in the full call to
> completion-filter-completions?  I don't fully understand how it works.

I get it now.  Neither completion-sorted-completions
nor completion-all-completions does the sorting.  It's just that
if sorting is bound to be required, completion-all-completions
will do most of the work upfront, and thus sorting will be much
faster in that case, compensating completely.
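
A minimal sketch of why the later sort is cheap when scores were computed upfront, assuming each candidate already carries a `completion-score' text property at position 0 (as the patch arranges). `demo-sort-by-score' is a hypothetical name.

```elisp
(defun demo-sort-by-score (completions)
  "Return a copy of COMPLETIONS sorted by descending `completion-score'.
Candidates without a score are treated as having score 0."
  (sort (copy-sequence completions)
        (lambda (a b)
          (> (or (get-text-property 0 'completion-score a) 0)
             (or (get-text-property 0 'completion-score b) 0)))))
```

Here the comparison only reads a stored number. A sort function that has to recompute scores would pay for regexp matching at this point instead.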

So now I don't think that micro-benchmark is of much use.  A useful
such benchmark would have to be representative of the full cycle.
Maybe we don't even need this benchmark, given the instrumentation
in icomplete.el works well and seems precise and deterministic (just
not as convenient to run).

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 00:25:01 GMT) Full text and rfc822 format available.

Message #299 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 01:26:53 +0100

On Fri, Oct 27, 2023 at 1:11 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> > Also in the last iteration of the "yoyo" fido-vertical-mode test,
> > results with my latest patch are quite a bit faster.
>
> Hmm, your latest (lazy-hilit-2023-v3.diff) improves the (completing-read
> "" obarray) scenario a lot (over the original two 2023 solutions), but
> not the 'C-h v' scenario. I don't have an explanation.

The improvement was due to running string-match only once per completion,
if you look at the changes to completion-pcm--all-completions.

It could be this code doesn't kick in in the C-h v scenario.  Notice
that that function already has some optimizations: for example, when
the regexp match is performed by all-completions and its use of
completion-regexp-alist, there's no way to get the regex match data
to compute the score, so it'll have to be postponed to
completion-pcm--hilit-commonality as in the v2.diff case.

Then again, that doesn't explain why in the C-h v scenario, the regression
was fixed precisely by adjusting that code that I was conjecturing might
not kick in.

Anyway, I think it's safer to say that my latest patch is at least as fast,
sometimes faster, than the 2023 completion-filter-completions solution.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 13:30:02 GMT) Full text and rfc822 format available.

Message #302 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 16:29:03 +0300

On 27/10/2023 03:26, João Távora wrote:
> On Fri, Oct 27, 2023 at 1:11 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
>>> Also in the last iteration of the "yoyo" fido-vertical-mode test,
>>> results with my latest patch are quite a bit faster.
>>
>> Hmm, your latest (lazy-hilit-2023-v3.diff) improves the (completing-read
>> "" obarray) scenario a lot (over the original two 2023 solutions), but
>> not the 'C-h v' scenario. I don't have an explanation.
> 
> The improvement was due to running string-match only once per completion,
> if you look at the changes to completion-pcm--all-completions.

Right. I just don't (didn't?) have an explanation for the difference in 
the improvement in performance (or the lack thereof) between the two 
scenarios.

> It could be this code doesn't kick in in the C-h v scenario.  Notice
> that that function already has some optimizations: for example, when
> the regexp match is performed by all-completions and its use of
> completion-regexp-alist, there's no way to get the regex match data
> to compute the score, so it'll have to be postponed to
> completion-pcm--hilit-commonality as in the v2.diff case.
> 
> Then again, that doesn't explain why in the C-h v scenario, the regression
> was fixed precisely by adjusting that code that I was conjecturing might
> not kick in.

According to my print-debugging, the situation is the reverse: 
(completing-read "" obarray) goes the "The internal functions already 
obeyed" route (because obarray is not a function?), and the scenario 
that didn't get better (C-h v) goes down the clause "pattern has 
something interesting to match". Unless the input is empty, then it also 
goes down the first clause.

So it seems like we might misunderstand the reason why the former got 
faster. I see the check changed from

  (not (functionp table))

to

  (or (not (functionp table))
      (cl-loop for e in pattern never (stringp e)))

but that can't be that reason.
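
For readers unfamiliar with the `never' clause, the sketch below isolates the second half of the new check. It assumes a PCM pattern is a list mixing literal strings with symbols such as `any' and `point'. `demo-pattern-trivial-p' is a hypothetical name.

```elisp
(require 'cl-lib)

(defun demo-pattern-trivial-p (pattern)
  "Return non-nil if PATTERN contains no literal string element."
  (cl-loop for e in pattern never (stringp e)))
```

So (demo-pattern-trivial-p '(any point)) is non-nil, while a pattern such as '(any "y" any "o") yields nil. In other words, the extended check also takes the shortcut for function tables when there is nothing literal to match.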

BTW, all-completions' docstring also says that a COLLECTION that is a 
function should itself handle completion-regexp-list, so we could try to 
rely on that too and drop the additional check. That's risky, though, so 
something for a later follow-up.

> Anyway, I think it's safer to say that my latest patch is at least as fast,
> sometimes faster, than the 2023 completion-filter-completions solution.

All other things equal, I also prefer a smaller change, and thank you 
for producing it. Functionally, too, it's almost equivalent to 
completion-filter-completions. So one could easily write a wrapper like 
that with the same performance.

But there are differences. The first is that the highlighter function 
takes one string as an argument instead of a collection. I mentioned 
this before, this will be much handier to use in company-capf.

Second, in Daniel's patch the "adjust metadata" function got a 
different, arguably better, calling convention. That's not dependent on 
the rest of the patch, so it can be considered separately.

Third, it made a principled stance to avoid altering the original 
strings, even the non-visual text properties. This approach could be 
adopted piecewise from Daniel's patch, especially if the performance 
ends up the same or insignificantly different in practical usage.

As for whether we migrate to the completion-filter-completions API, I 
don't have a strong opinion. If we still think that the next revision of 
the completion API will be radically different, then there is not much 
point in making medium-sized steps like that. OTOH, if we end up with 
the same API for the next decade and more, completion-filter-completions 
does look more meaningful, and more easily extensible (e.g. next we 
could add a pair (completion-scorer . fn) to its return value; and so 
on). And again, the implementation could be a simple wrapper at first.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 13:48:02 GMT) Full text and rfc822 format available.

Message #305 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 09:46:47 -0400

> BTW, all-completions' docstring also says that a COLLECTION that is
> a function should itself handle completion-regexp-list,

The key here is "should": we know how well ELisp coders follow
such recommendations.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 15:43:01 GMT) Full text and rfc822 format available.

Message #308 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 18:41:34 +0300

On 27/10/2023 16:46, Stefan Monnier wrote:
>> BTW, all-completions' docstring also says that a COLLECTION that is
>> a function should itself handle completion-regexp-list,
> The key here is "should": we know how well ELisp coders follow
> such recommendations.

We could, for example, have a period when we warn about returned 
non-matches. string-match-p is not free, but it's not very expensive either.

I think we should fix the discrepancy between the doc and the behavior, 
one way or another. If the function does obey the current phrasing, it 
seems like it's doing extra work if we post-filter anyway, producing 
extra consing (though the result might still be beneficial if that 
filtering results in a much smaller list on the first try).

But third-party callers of all-completions might also see the doc and 
decide to use completion-regexp-list, with mixed results. That seems 
like a pure downside.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 16:20:02 GMT) Full text and rfc822 format available.

Message #311 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 12:19:04 -0400

> We could, for example, have a period when we warn about returned
> non-matches. string-match-p is not free, but it's not very expensive either.

The problem is that I dislike `completion-regexp-list` :-)

More seriously, since it's a dynbound variable it can have unwanted
effects in nested calls to `all/try-completions`, so it's safer to
ignore that variable because its binding is not always "meant for us" :-(
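
A minimal sketch of the nested-call effect, using a hypothetical function table `demo-table' that consults an inner list:

```elisp
(defun demo-table (string pred action)
  "Hypothetical completion table that delegates to an inner list.
Only the `all-completions' ACTION (t) is handled in this sketch."
  (when (eq action t)
    (all-completions string '("foobar" "foobaz") pred)))

(defun demo-outer-call ()
  "Call `all-completions' on `demo-table' with a regexp list bound."
  (let ((completion-regexp-list '("bar\\'")))
    ;; The binding is dynamic, so the inner `all-completions' call
    ;; inside `demo-table' sees it as well, whether or not the author
    ;; of the table intended that.
    (all-completions "foo" #'demo-table)))
```

Here the inner call filters out "foobaz" purely because of the outer binding. That may be what the caller wanted, but the table author never asked for it.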


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 17:07:02 GMT) Full text and rfc822 format available.

Message #314 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 20:06:11 +0300

On 27/10/2023 19:19, Stefan Monnier wrote:
>> We could, for example, have a period when we warn about returned
>> non-matches. string-match-p is not free, but it's not very expensive either.
> 
> The problem is that I dislike `completion-regexp-list` :-)

When we do use it, we can avoid copying all the strings to a new list. 
Skipping consing this way can really move the needle at the level of 
optimization we're discussing now.

> More seriously, since it's a dynbound variable it can have unwanted
> effects in nested calls to `all/try-completions`, so it's safer to
> ignore that variable because its binding is not always "meant for us" :-(

I guess it would be more precise if it was a function argument, e.g. the 
first argument to 'fancy-all-completions' or somesuch that all 
completion tables are supposed to use inside. OTOH, I suppose that might 
hinder those that use external programs.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 17:15:01 GMT) Full text and rfc822 format available.

Message #317 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 18:16:42 +0100

[Message part 1 (text/plain, inline)]

Dmitry Gutov <dmitry <at> gutov.dev> writes:

> but that can't be that reason.

Indeed it isn't.  But you're not far off, I think.  Anyway, in the
(completing-read "" obarray) scenario, the shortcut is taken.  That,
plus the fact that the optimization then completely skips scoring for
it -- which is blatantly wrong -- is why it is much faster.

Looking at this, it seems that string-match with a non-group-capturing
regexp over all the matches (the thing that the change was optimizing
away) isn't that expensive in the overall gist of things.

Thus in the C-h v case, the shortcut isn't taken and the optimization
does what it should, but nothing gargantuan, around 8-9%.  Also keep in
mind this yoyo case is pretty contrived and I suspect more realistic
cases would have even more modest gains.

My attempts to fix the optimization to work correctly in all cases
complicated the code and then slowed down the bare completing-read case.
Not worth it IMO, therefore I have reverted it in the branch and in the
latest patch I attach (lazy-hilit-2023-v4.diff).  Here are the latest
measurements for the "yoyo" test, now considering the C-h v scenario:

;; Daniel+Dmitry completing-read: 0.834s avg
;; Daniel+Dmitry C-h v:           0.988s avg

;; lazy-hilit completing-read:   0.825s avg
;; lazy-hilit C-h v:             0.946s avg

Again, this shows my patch to be about 2-4% faster, though not really
relevant.

Again, the optimization I removed, if it were done correctly (which it
wasn't) could maybe shave off another 8-9% off that.

> But there are differences. The first is that the highlighter function
> takes one string as an argument instead of a collection. I mentioned
> this before, this will be much handier to use in company-capf.

Don't fully follow this.  Can you perhaps show two versions (or two
snippets) of the company-capf code, the handy and the non-handy version?

> Second, in Daniel's patch the "adjust metadata" function got a
> different, arguably better, calling convention. That's not dependent
> on the rest of the patch, so it can be considered separately.

Maybe.  Changes to calling convention can be argued for but, when
possible, they shouldn't be mixed in with performance improvements.

> Third, it made a principled stance to avoid altering the original
> strings, even the non-visual text properties. This approach could be
> adopted piecewise from Daniel's patch, especially if the performance
> ends up the same or insignificantly different in practical usage.

If we really wanted to, we could also adopt the non-propertizing
approach in my lazy-hilit patch, by calculating the score "just in
time", much like Daniel's patch does it.  But it should be clear that
what we save in allocation in completion-pcm--hilit-commonality, we'll
pay for in the same amount in consing later.  So no performance benefit.

And if we do that, don't forget there will be the ugly "unquoted"
complication to deal with.  Then again, in my understanding that
complication is due to a similar problem of mixing business and display
logic.  That's assuming I understand this comment in minibuffer.el
correctly:

   ;; Here we choose to quote all elements returned, but a better option
   ;; would be to return unquoted elements together with a function to
   ;; requote them, so that *Completions* can show nicer unquoted values
   ;; which only get quoted when needed by choose-completion.

So I would look into solving that first, instead of allowing the
"unquoted" hacks to spread even further in minibuffer.el

> As for whether we migrate to the completion-filter-completions API, I
> don't have a strong opinion. If we still think that the next revision
> of the completion API will be radically different, then there is not
> much point in making medium-sized steps like that. OTOH, if we end up
> with the same API for the next decade and more,
> completion-filter-completions does look more meaningful, and more
> easily extensible (e.g. next we could add a pair (completion-scorer
> . fn) to its return value; and so on). And again, the implementation
> could be a simple wrapper at first.

I agree on some points, and looking at this from a API-convenience
standpoint I note two things:

* To get late highlighting in Daniel's patch, one has to use a whole new
  API entry function to gather completions, and another to fontify
  completions.  One also deprecates an existing member of the API.

* In my patch, one binds a new API variable and uses a function to fontify
  completions.  No deprecations.

Even if it wasn't a much simpler change to minibuffer.el without any
backward-compatibility gymnastics, I still think the latter is much
cleaner and easier to implement for completion frontends.  The fact that
there's no deprecation of one of the most important members of the API
to date makes it more comfortable to understand IMO.
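
A minimal sketch of the frontend side under the lazy-hilit approach, assuming the variable and the one-argument function `completion-lazy-hilit' introduced by the patch are available (to my knowledge they shipped with Emacs 30, but treat that as an assumption). `demo-visible-candidates' is a hypothetical name.

```elisp
(require 'seq)

(defun demo-visible-candidates (input table n)
  "Return the first N flex completions of INPUT in TABLE, highlighted.
Highlighting is requested lazily and applied only to the N strings
that would actually be displayed."
  (let* ((completion-lazy-hilit t)
         (completion-styles '(flex))
         (all (completion-all-completions input table nil (length input)))
         (tail (last all)))
    ;; Drop the base size that may sit in the last cdr.
    (when (consp tail) (setcdr tail nil))
    (mapcar #'completion-lazy-hilit (seq-take all n))))
```

The frontend keeps calling `completion-all-completions' as before. The only additions are the `let' binding and the per-string call on whatever ends up on screen.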

João

[lazy-hilit-2023-v4.diff (text/x-patch, inline)]

diff --git a/lisp/icomplete.el b/lisp/icomplete.el
index e6fdd1f1836..3e888c8b06a 100644
--- a/lisp/icomplete.el
+++ b/lisp/icomplete.el
@@ -722,7 +722,8 @@ icomplete-exhibit
              ;; Check if still in the right buffer (bug#61308)
              (or (window-minibuffer-p) completion-in-region--data)
              (icomplete-simple-completing-p)) ;Shouldn't be necessary.
-    (let ((saved-point (point)))
+    (let ((saved-point (point))
+          (completion-lazy-hilit t))
       (save-excursion
         (goto-char (icomplete--field-end))
         ;; Insert the match-status information:
@@ -754,12 +755,13 @@ icomplete-exhibit
                            (overlay-end rfn-eshadow-overlay)))
           (let* ((field-string (icomplete--field-string))
                  (text (while-no-input
+                         (benchmark-progn
                          (icomplete-completions
                           field-string
                           (icomplete--completion-table)
                           (icomplete--completion-predicate)
                           (if (window-minibuffer-p)
-                              (eq minibuffer--require-match t)))))
+                              (eq minibuffer--require-match t))))))
                  (buffer-undo-list t)
                  deactivate-mark)
             ;; Do nothing if while-no-input was aborted.
@@ -901,7 +903,7 @@ icomplete--render-vertical
                                 'icomplete-selected-match 'append comp)
      collect (concat prefix
                      (make-string (- max-prefix-len (length prefix)) ? )
-                     comp
+                     (completion-lazy-hilit comp)
                      (make-string (- max-comp-len (length comp)) ? )
                      suffix)
      into lines-aux
@@ -1067,7 +1069,8 @@ icomplete-completions
                   (if (< prospects-len prospects-max)
                       (push comp prospects)
                     (setq limit t)))
-                (setq prospects (nreverse prospects))
+                (setq prospects
+                      (nreverse (mapcar #'completion-lazy-hilit prospects)))
                 ;; Decorate first of the prospects.
                 (when prospects
                   (let ((first (copy-sequence (pop prospects))))
diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index 2120e31775e..cd8eeee2c78 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -1234,6 +1234,7 @@ completion-all-completions
 POINT is the position of point within STRING.
 The return value is a list of completions and may contain the base-size
 in the last `cdr'."
+  (setq completion-lazy-hilit-fn nil)
   ;; FIXME: We need to additionally return the info needed for the
   ;; second part of completion-base-position.
   (completion--nth-completion 2 string table pred point metadata))
@@ -3749,108 +3750,194 @@ flex-score-match-tightness
 than the latter (which has two \"holes\" and three
 one-letter-long matches).")
 
+(defvar-local completion-lazy-hilit nil
+  "If non-nil, request completion lazy highlighting.
+
+Completion-presenting frontends may opt to bind this variable to a
+non-nil value in the context of completion-producing calls (such
+as `completion-all-sorted-completions').  This hints the
+intervening completion styles that they do not need to
+fontify (i.e. propertize with the `face' property) completion
+strings with highlights of the matching parts.
+
+When doing so, it is the frontend -- not the style -- who becomes
+responsible for this fontification.  The frontend binds this variable
+to non-nil, and calls the function with the same name
+`completion-lazy-hilit' on each completion string that is to be
+displayed to the user.
+
+Note that only some completion styles take advantage of this
+variable for optimization purposes.  Other styles will ignore the
+hint and greedily fontify as usual.  It is still safe for a
+frontend to call `completion-lazy-hilit' in these situations.
+
+To author a completion style that takes advantage of this, see
+`completion-lazy-hilit-fn' and look in the source of
+`completion-pcm--hilit-commonality'.")
+
+(defvar completion-lazy-hilit-fn nil
+  "Used by completion styles to honor `completion-lazy-hilit'.
+When a given style wants to enable support for
+`completion-lazy-hilit' (which see), that style should set this
+variable to a function of one argument, a fresh string to be
+displayed to the user.  The function is responsible for
+destructively highlighting the string.")
+
+(defun completion-lazy-hilit (str)
+  "Return a copy of completion STR that is `face'-propertized.
+See documentation for variable `completion-lazy-hilit' for more
+details."
+  (if (and completion-lazy-hilit completion-lazy-hilit-fn)
+      (funcall completion-lazy-hilit-fn (copy-sequence str))
+    str))
+
+(defun completion--hilit-from-re (string regexp)
+  "Fontify STRING with `completions-common-part' using REGEXP."
+  (let* ((md (and regexp (string-match regexp string) (cddr (match-data t))))
+         (me (and md (match-end 0)))
+         (from 0))
+    (while md
+      (add-face-text-property from (pop md) 'completions-common-part nil string)
+      (setq from (pop md)))
+    (unless (or (not me) (= from me))
+      (add-face-text-property from me 'completions-common-part nil string))
+    string))
+
+(defun completion--flex-score-1 (md-groups match-end len)
+  "Compute matching score of completion.
+The score lies in the range between 0 and 1, where 1 corresponds to
+the full match.
+MD-GROUPS is the \"group\"  part of the match data.
+MATCH-END is the end of the match.
+LEN is the length of the completion string."
+  (let* ((from 0)
+         ;; To understand how this works, consider these simple
+         ;; ascii diagrams showing how the pattern "foo"
+         ;; flex-matches "fabrobazo", "fbarbazoo" and
+         ;; "barfoobaz":
+
+         ;;      f abr o baz o
+         ;;      + --- + --- +
+
+         ;;      f barbaz oo
+         ;;      + ------ ++
+
+         ;;      bar foo baz
+         ;;          +++
+
+         ;; "+" indicates parts where the pattern matched.  A
+         ;; "hole" in the middle of the string is indicated by
+         ;; "-".  Note that there are no "holes" near the edges
+         ;; of the string.  The completion score is a number
+         ;; bound by (0..1] (i.e., larger than (but not equal
+         ;; to) zero, and smaller or equal to one): the higher
+         ;; the better and only a perfect match (pattern equals
+         ;; string) will have score 1.  The formula takes the
+         ;; form of a quotient.  For the numerator, we use the
+         ;; number of +, i.e. the length of the pattern.  For
+         ;; the denominator, it first computes
+         ;;
+         ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
+         ;;
+         ;; , for each hole "i" of length "Li", where tightness
+         ;; is given by `flex-score-match-tightness'.  The
+         ;; final value for the denominator is then given by:
+         ;;
+         ;;    (SUM_across_i(hole_i_contrib) + 1) * len
+         ;;
+         ;; , where "len" is the string's length.
+         (score-numerator 0)
+         (score-denominator 0)
+         (last-b 0))
+    (while (and md-groups (car md-groups))
+      (let ((a from)
+            (b (pop md-groups)))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b))
+      (setq from (pop md-groups)))
+    ;; If `pattern' doesn't have an explicit trailing any, the
+    ;; regex `re' won't produce match data representing the
+    ;; region after the match.  We need to account for that
+    ;; extra bit of match (bug#42149).
+    (unless (= from match-end)
+      (let ((a from)
+            (b match-end))
+        (setq
+         score-numerator   (+ score-numerator (- b a)))
+        (unless (or (= a last-b)
+                    (zerop last-b)
+                    (= a len))
+          (setq
+           score-denominator (+ score-denominator
+                                1
+                                (expt (- a last-b 1)
+                                      (/ 1.0
+                                         flex-score-match-tightness)))))
+        (setq
+         last-b              b)))
+    (/ score-numerator (* len (1+ score-denominator)) 1.0)))
+
+(defvar completion--flex-score-last-md nil
+  "Helper variable for `completion--flex-score'.")
+
+(defun completion--flex-score (str re &optional dont-error)
+  "Compute flex score of completion STR based on RE.
+If DONT-ERROR, just return nil if RE doesn't match STR."
+  (cond ((string-match re str)
+         (let* ((match-end (match-end 0))
+                (md (cddr
+                     (setq
+                      completion--flex-score-last-md
+                      (match-data t completion--flex-score-last-md)))))
+           (completion--flex-score-1 md match-end (length str))))
+        ((not dont-error)
+         (error "Internal error: %s does not match %s" re str))))
+
 (defun completion-pcm--hilit-commonality (pattern completions)
   "Show where and how well PATTERN matches COMPLETIONS.
 PATTERN, a list of symbols and strings as seen
 `completion-pcm--merge-completions', is assumed to match every
-string in COMPLETIONS.  Return a deep copy of COMPLETIONS where
-each string is propertized with `completion-score', a number
-between 0 and 1, and with faces `completions-common-part',
-`completions-first-difference' in the relevant segments."
+string in COMPLETIONS.
+
+If `completion-lazy-hilit' is nil, return a deep copy of
+COMPLETIONS where each string is propertized with
+`completion-score', a number between 0 and 1, and with faces
+`completions-common-part', `completions-first-difference' in the
+relevant segments.
+
+Else, if `completion-lazy-hilit' is non-nil, return COMPLETIONS where
+each string now has a `completion-score' property and no
+highlighting."
   (cond
    ((and completions (cl-loop for e in pattern thereis (stringp e)))
     (let* ((re (completion-pcm--pattern->regex pattern 'group))
-           (point-idx (completion-pcm--pattern-point-idx pattern))
-           (case-fold-search completion-ignore-case)
-           last-md)
-      (mapcar
-       (lambda (str)
-	 ;; Don't modify the string itself.
-         (setq str (copy-sequence str))
-         (unless (string-match re str)
-           (error "Internal error: %s does not match %s" re str))
-         (let* ((pos (if point-idx (match-beginning point-idx) (match-end 0)))
-                (match-end (match-end 0))
-                (md (cddr (setq last-md (match-data t last-md))))
-                (from 0)
-                (end (length str))
-                ;; To understand how this works, consider these simple
-                ;; ascii diagrams showing how the pattern "foo"
-                ;; flex-matches "fabrobazo", "fbarbazoo" and
-                ;; "barfoobaz":
-
-                ;;      f abr o baz o
-                ;;      + --- + --- +
-
-                ;;      f barbaz oo
-                ;;      + ------ ++
-
-                ;;      bar foo baz
-                ;;          +++
-
-                ;; "+" indicates parts where the pattern matched.  A
-                ;; "hole" in the middle of the string is indicated by
-                ;; "-".  Note that there are no "holes" near the edges
-                ;; of the string.  The completion score is a number
-                ;; bound by (0..1] (i.e., larger than (but not equal
-                ;; to) zero, and smaller or equal to one): the higher
-                ;; the better and only a perfect match (pattern equals
-                ;; string) will have score 1.  The formula takes the
-                ;; form of a quotient.  For the numerator, we use the
-                ;; number of +, i.e. the length of the pattern.  For
-                ;; the denominator, it first computes
-                ;;
-                ;;     hole_i_contrib = 1 + (Li-1)^(1/tightness)
-                ;;
-                ;; , for each hole "i" of length "Li", where tightness
-                ;; is given by `flex-score-match-tightness'.  The
-                ;; final value for the denominator is then given by:
-                ;;
-                ;;    (SUM_across_i(hole_i_contrib) + 1) * len
-                ;;
-                ;; , where "len" is the string's length.
-                (score-numerator 0)
-                (score-denominator 0)
-                (last-b 0)
-                (update-score-and-face
-                 (lambda (a b)
-                   "Update score and face given match range (A B)."
-                   (add-face-text-property a b
-                                           'completions-common-part
-                                           nil str)
-                   (setq
-                    score-numerator   (+ score-numerator (- b a)))
-                   (unless (or (= a last-b)
-                               (zerop last-b)
-                               (= a (length str)))
-                     (setq
-                      score-denominator (+ score-denominator
-                                           1
-                                           (expt (- a last-b 1)
-                                                 (/ 1.0
-                                                    flex-score-match-tightness)))))
-                   (setq
-                    last-b              b))))
-           (while md
-             (funcall update-score-and-face from (pop md))
-             (setq from (pop md)))
-           ;; If `pattern' doesn't have an explicit trailing any, the
-           ;; regex `re' won't produce match data representing the
-           ;; region after the match.  We need to account to account
-           ;; for that extra bit of match (bug#42149).
-           (unless (= from match-end)
-             (funcall update-score-and-face from match-end))
-           (if (> (length str) pos)
-               (add-face-text-property
-                pos (1+ pos)
-                'completions-first-difference
-                nil str))
-           (unless (zerop (length str))
-             (put-text-property
-              0 1 'completion-score
-              (/ score-numerator (* end (1+ score-denominator)) 1.0) str)))
-         str)
-       completions)))
+           (score (lambda (str)
+                    (put-text-property 0 1 'completion-score
+                                       (completion--flex-score str re)
+                                       str))))
+      (cond (completion-lazy-hilit
+             (setq completion-lazy-hilit-fn
+                   (lambda (str) (completion--hilit-from-re str re)))
+             (mapc score completions))
+            (t
+             (mapcar
+              (lambda (str)
+                (setq str (copy-sequence str))
+                (funcall score str)
+                (completion--hilit-from-re str re)
+                str)
+              completions)))))
    (t completions)))
 
 (defun completion-pcm--find-all-completions (string table pred point
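
The division of labour that the patch sets up between a completion style
and a frontend can be sketched in isolation.  This is only a toy, not the
code from the patch: every `my-' name is hypothetical, and a plain prefix
match stands in for a real completion style.  The style records a
per-string highlighter when asked to be lazy, and the frontend applies it
only to the candidates it actually displays.

```elisp
(require 'seq)

(defvar my-lazy-hilit nil
  "Non-nil means the frontend asked for deferred highlighting (hypothetical).")

(defvar my-lazy-hilit-fn nil
  "Highlighter recorded by the toy style for later use (hypothetical).")

(defun my--hilit-prefix (len str)
  "Destructively highlight the first LEN characters of STR and return STR."
  (add-face-text-property 0 len 'completions-common-part nil str)
  str)

(defun my-style-all-completions (prefix candidates)
  "Toy prefix style: return the members of CANDIDATES starting with PREFIX.
When `my-lazy-hilit' is non-nil, skip highlighting and record the
highlighter in `my-lazy-hilit-fn' instead."
  (setq my-lazy-hilit-fn nil)           ; forget any stale highlighter
  (let ((matches (all-completions prefix candidates))
        (hilit (apply-partially #'my--hilit-prefix (length prefix))))
    (cond (my-lazy-hilit
           (setq my-lazy-hilit-fn hilit)
           matches)                     ; no copying, no faces
          (t
           ;; Eager path: copy and highlight every match up front.
           (mapcar (lambda (s) (funcall hilit (copy-sequence s)))
                   matches)))))

(defun my-lazy-hilit (str)
  "Return a highlighted copy of STR if lazy highlighting is in effect."
  (if (and my-lazy-hilit my-lazy-hilit-fn)
      (funcall my-lazy-hilit-fn (copy-sequence str))
    str))

(defun my-frontend-visible (prefix candidates n)
  "Return the first N matches for PREFIX, highlighting only those N."
  (let* ((my-lazy-hilit t)
         (all (my-style-all-completions prefix candidates)))
    (mapcar #'my-lazy-hilit (seq-take all n))))
```

With a large candidate list and a small N, only N strings are copied and
fontified, which is the saving this bug report is about.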

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 27 Oct 2023 18:14:01 GMT) Full text and rfc822 format available.

Message #320 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Fri, 27 Oct 2023 14:12:20 -0400

>>> We could, for example, have a period when we warn about returned
>>> non-matches. string-match-p is not free, but it's not very expensive either.
>> The problem is that I dislike `completion-regexp-list` :-)
> When we do use it, we can avoid copying all the strings to a new
> list. Skipping consing this way can really move the needle at the level of
> optimization we're discussing now.

Oh, don't get me wrong, I like the functionality it offers, I just
dislike the way it works.

>> More seriously, since it's a dynbound variable it can have unwanted
>> effects in nested calls to `all/try-completions`, so it's safer to
>> ignore that variable because its binding is not always "meant for us" :-(
> I guess it would be more precise if it was a function argument, e.g. the
> first argument to 'fancy-all-completions' or somesuch that all completion
> tables are supposed to use inside. OTOH, I suppose that might hinder those
> that use external programs.

In my "work in progress" (not touched since last December :-( ),
I replace `all-completions` with:

    (cl-defgeneric completion-table-fetch-matches ( table pattern
                                                    &optional pre session)
      "Return candidates matching PATTERN in the completion TABLE.
    For tables with subfields, PRE is the text found before PATTERN such that
       (let ((len (length PRE)))
         (equal (completion-table-boundaries TABLE PRE len) (cons len len)))
    
    Return a list of strings or a list of cons cells whose car is a string.
    SESSION if non-nil is a hash-table where we can stash arbitrary auxiliary
    info to avoid recomputing it between calls of the same \"session\".")

`pattern`s can take various shapes.  In my WiP code, I implement 4 kinds
of patterns: prefix, glob, regexp, and predicate.  Now, we don't want
completion tables to have to handle each and every one of those pattern
kinds (the set of which is extensible via CLOS methods), so there's
a middleman:

    (cl-defgeneric completion-pattern-convert (to pattern)
      "Convert PATTERN to be of type TO.
    Returns a pair (PRED . NEWPATTERN) where NEWPATTERN is of type TO
    and should match everything that PATTERN matches.  PRED is nil
    if NEWPATTERN matches exactly the same candidates as PATTERN
    and otherwise it is a function that takes a candidate and returns non-nil if the
    candidate also matches PATTERN.  PRED should not presume that the candidate
    has already been filtered by NEWPATTERN.")
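
A minimal sketch of that (PRED . NEWPATTERN) contract, for illustration
only.  It is not the WiP code: the `my-' functions are hypothetical, a
glob is represented as a plain string, and `*' is assumed to be the only
wildcard in it.  The glob is converted to a prefix, and PRED is nil only
when the prefix matches exactly what the glob matches.

```elisp
(require 'seq)

(defun my-glob->prefix (glob)
  "Convert GLOB to a prefix pattern; return (PRED . PREFIX).
PRED is nil when GLOB has the form \"PREFIX*\", so that the prefix
alone matches the same candidates.  Otherwise PRED is a function that
checks a candidate against the whole GLOB."
  (let* ((star (string-match-p "[*]" glob))
         (prefix (if star (substring glob 0 star) glob))
         (exact (and star (= star (1- (length glob))))))
    (cons (unless exact
            (apply-partially #'string-match-p (wildcard-to-regexp glob)))
          prefix)))

(defun my-fetch-matches (table glob)
  "Return the candidates in TABLE matching GLOB, using the prefix-based API."
  (let* ((conv (my-glob->prefix glob))
         (pred (car conv))
         (matches (all-completions (cdr conv) table)))
    (if pred
        ;; The prefix over-approximates GLOB, so filter the result.
        (let ((case-fold-search completion-ignore-case))
          (seq-filter pred matches))
      matches)))
```

For example, (my-fetch-matches '("foobar" "foobaz" "fox") "foo*r") asks
the table for everything starting with "foo" and then filters that list
with the predicate built from the full glob.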

So the fallback definition of `completion-table-fetch-matches`, which
relies on the old API looks like:

    (defun completion-table--fetch-legacy (table pattern &optional pre _session)
      (pcase-let ((`(,pred . ,regexp)
                   (completion-pattern-convert 'regexp pattern))
                  (`(,ppred . ,prefix)
                   (completion-pattern-convert 'prefix pattern)))
        (let ((matches
               (let ((completion-regexp-list (if ppred (list regexp)))
                     (case-fold-search completion-ignore-case))
                 (all-completions (concat pre prefix) table))))
          (if (null pred)
              matches
            (seq-filter pred matches)))))

This is of course incorrect because `all-completions` could ignore
`completion-regexp-list`, in which case we should use `ppred` instead of
`pred` on the last 3 lines :-)

It has the disadvantage that every completion-table basically needs to
start by calling `completion-pattern-convert` so as to convert the
pattern to the format that it supports.  But I think it's still better
than the current API where completion tables "have to" obey the prefix
string, the `completion-regexp-list`, and the predicate (and where the
latter two are often nil so tables tend to ignore them, and since
tables ignore them callers don't use them, etc...).


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 28 Oct 2023 22:26:02 GMT) Full text and rfc822 format available.

Message #323 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Sun, 29 Oct 2023 01:24:50 +0300

On 27/10/2023 20:16, João Távora wrote:

> My attempts to fix the optimization to work correctly in all cases
> complicated the code and then slowed down the bare completion-read case.
> Not worth it IMO, therefore I have reverted it in the branch and in the
> latest patch I attach (lazy-hilit-2023-v4.diff).  Here are the latest
> measurements for the "yoyo" test, now considering the C-h v scenario:
> 
> ;; Daniel+Dmitry completing-read: 0.834s avg
> ;; Daniel+Dmitry C-h v:           0.988s avg
>                                             
> ;; lazy-hilit completing-read:   0.825s avg
> ;; lazy-hilit C-h v:             0.946s avg
> 
> Again, this shows my patch to be about 2-4% faster, though not really
> relevant.
> 
> Again, the optimization I removed, if it were done correctly (which it
> wasn't) could maybe shave off another 8-9% off that.

Thanks. My measurements are similar, except the difference swings the 
other way a little bit. It might depend on the particulars of the 
individual machine anyway. I also tried to compare the perf for 150K 
symbols, in case either the filtering or the sorting (which should have 
different complexities) would come to the front, but no such luck:

lazy-hilit

300000
completing-read 0.6063497
C-h v 0.72435995

150000
completing-read 0.316961
C-h v 0.39933575

d+d

300000
completing-read 0.571481
C-h v 0.69817995

150000
completing-read 0.308940
C-h v 0.350437

All averages made using 'M-x calc-grab-region' followed by 'u M'.

>> But there are differences. The first is that the highlighter function
>> takes one string as an argument instead of a collection. I mentioned
>> this before, this will be much handier to use in company-capf.
> 
> Don't fully follow this.  Can you perhaps show two versions (or two
> snippets) of the company-capf code, the handy and the non-handy version?

I just meant that your version will be easier to use in 
company--match-from-capf-face (because it works on individual completions).

>> Third, it made a principled stance to avoid altering the original
>> strings, even the non-visual text properties. This approach could be
>> adopted piecewise from Daniel's patch, especially if the performance
>> ends up the same or insignificantly different in practical usage.
> 
> If we really wanted to, we could also adopt the non-propertizing
> approach in my lazy-hilit patch, by calculating the score "just in
> time", much like Daniel's patch does it.  But it should be clear that
> what we save in allocation in completion-pcm--hilit-commonality, we'll
> pay for in the same amount in consing later.  So no performance benefit.

Not for performance, no. Although the way it separates the sorting into 
its own phase makes it easier to reason about that particular cost. And 
for 300000 symbols, scoring and sorting really take the most time, e.g. 
about 2/3rds. Which might help with optimizing it further down in the 
future, somehow.

> And if we do that, don't forget there will be the ugly "unquoted"
> complication to deal with.  Then again, in my understanding that
> complication is due to similar problem of mixing business and display
> logic.  That's assuming I understand this comment in minibuffer.el
> correctly:
> 
>     ;; Here we choose to quote all elements returned, but a better option
>     ;; would be to return unquoted elements together with a function to
>     ;; requote them, so that *Completions* can show nicer unquoted values
>     ;; which only get quoted when needed by choose-completion.
> 
> So I would look into solving that first, instead of allowing the
> "unquoted" hacks to spread even further in minibuffer.el

I don't really understand this quoting-requoting business, never dug 
into the feature or the code. But perhaps keeping the original string 
might even help avoid the "requoting" step? Though that would depend on 
which version of the string the scoring and highlighter functions expect 
to work on.

Speaking of the comment, it sounds like the said "requote function" 
would need to be passed up the call stack and used according to some 
protocol. The idea itself reminds me of the proposal described in 
https://github.com/company-mode/company-mode/discussions/1422 (it was 
also previously brought up on the lsp-mode tracker). It seems like 
either the completion tables would need a new method, or the capf tuples 
- a new function property, which all UIs would need to start 
supporting in lock-step. At least that's the case if we use that to 
solve the requoting issue (the LSP clients' migration to use 
complete-function can be done more easily).

So far you have advocated avoiding breaking changes while implementing 
the present performance improvement. Daniel's argument that the quoting 
completion tables are already slow enough that an extra text property 
round-trip won't be noticeable sounds very reasonable to me. And the 
current version of the patch is functional and passes all tests, so it's 
not clear which other places the "hack" would need to spread into. Unless 
it helps with reducing code, etc, as per my vague guess above.

But of course it would be nice to avoid the wart, so if you have any 
better ideas, they are welcome.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sun, 29 Oct 2023 02:09:02 GMT) Full text and rfc822 format available.

Message #326 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Sun, 29 Oct 2023 04:07:35 +0200

On 27/10/2023 21:12, Stefan Monnier wrote:
>>> More seriously, since it's a dynbound variable it can have unwanted
>>> effects in nested calls to `all/try-completions`, so it's safer to
>>> ignore that variable because its binding is not always "meant for us" :-(
>> I guess it would be more precise if it was a function argument, e.g. the
>> first argument to 'fancy-all-completions' or somesuch that all completion
>> tables are supposed to use inside. OTOH, I suppose that might hinder those
>> that use external programs.
> In my "work in progress" (not touched since last December :-( ),
> I replace `all-completions` with:
> 
>      (cl-defgeneric completion-table-fetch-matches ( table pattern
>                                                      &optional pre session)
>        "Return candidates matching PATTERN in the completion TABLE.
>      For tables with subfields, PRE is the text found before PATTERN such that
>         (let ((len (length PRE)))
>           (equal (completion-table-boundaries TABLE PRE len) (cons len len)))
>      
>      Return a list of strings or a list of cons cells whose car is a string.
>      SESSION if non-nil is a hash-table where we can stash arbitrary auxiliary
>      info to avoid recomputing it between calls of the same \"session\".")
> 
> `pattern`s can take various shapes.  In my WiP code, I implement 4 kinds
> of patterns: prefix, glob, regexp, and predicate.  Now, we don't want
> completion tables to have to handle each and every one of those pattern
> kinds (the set of which is extensible via CLOS methods), so there's
> a middleman:
> 
>      (cl-defgeneric completion-pattern-convert (to pattern)
>        "Convert PATTERN to be of type TO.
>      Returns a pair (PRED . NEWPATTERN) where NEWPATTERN is of type TO
>      and should match everything that PATTERN matches.  PRED is nil
>      if NEWPATTERN matches exactly the same candidates as PATTERN
>      and otherwise it is a function that takes a candidate and returns non-nil if the
>      candidate also matches PATTERN.  PRED should not presume that the candidate
>      has already been filtered by NEWPATTERN.")

FWIW, this neat structure might not help too much: the most popular 
external completion backends (the LSP language servers, collectively) 
don't accept regexps or globs, they just send you the lists of 
completions available at point, with the name matching method sometimes 
configurable per-server.

As such, the most useful methods currently are: 1) Emacs Regexp, 2) 
asking server for whatever it thinks is suitable (the "backend" 
completion style).

I would also probably want to standardize on the recommended type of TO 
anyway: some of them are likely going to result in better performance 
than others.

BTW, this reminds me about urgrep in GNU ELPA, which I think includes 
converters between different flavors of regexp. Something to keep in 
mind for the occasional completion table that's based on a Grep-like tool.

> So the fallback definition of `completion-table-fetch-matches`, which
> relies on the old API looks like:
> 
>      (defun completion-table--fetch-legacy (table pattern &optional pre _session)
>        (pcase-let ((`(,pred . ,regexp)
>                     (completion-pattern-convert 'regexp pattern))
>                    (`(,ppred . ,prefix)
>                     (completion-pattern-convert 'prefix pattern)))
>          (let ((matches
>                 (let ((completion-regexp-list (if ppred (list regexp)))
>                       (case-fold-search completion-ignore-case))
>                   (all-completions (concat pre prefix) table))))
>            (if (null pred)
>                matches
>              (seq-filter pred matches)))))
> 
> This is of course incorrect because `all-completions` could ignore
> `completion-regexp-list`, in which case we should use `ppred` instead of
> `pred` on the last 3 lines :-)
> 
> It has the disadvantage that every completion-table basically needs to
> start by calling `completion-pattern-convert` so as to convert the
> pattern to the format that it supports.  But I think it's still better
> than the current API where completion tables "have to" obey the prefix
> string, the `completion-regexp-list`, and the predicate (and where the
> latter two are often nil so tables tend to ignore them, and since
> tables ignore them callers don't use them, etc...).

So I guess it's also a way to make every completion table aware of PRED?

That should work; though it might be hard to reach the same raw 
performance as the current all-completions + completion-regexp-list.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sun, 29 Oct 2023 04:43:01 GMT) Full text and rfc822 format available.

Message #329 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Sun, 29 Oct 2023 00:41:37 -0400

> FWIW, this neat structure might not help too much: the most popular external
> completion backends (the LSP language servers, collectively) don't accept
> regexps or globs, they just send you the lists of completions available at
> point, with the name matching method sometimes configurable per-server.

Largely agreed.  The main benefit tho is that you get just *one*
pattern, rather than three (one being the prefix argument, the second
being the `pred` arg which historically was so unused that it was abused
to hold the directory for file-name completion, so lots of tables don't
obey it, and the third being the `completion-regexp-list` that most coders
forget, and those who don't end up regretting not forgetting when it was
not meant for them), so it's much more clear.

> As such, the most useful methods currently are: 1) Emacs Regexp, 2) asking
> server for whatever it thinks is suitable (the "backend" completion style).

For the backends: agreed.
For the frontends (i.e. `completion-styles`), `glob` is the more useful
one, I'd say (except for the "external" style, of course).

We might also want support for things like `or` and `and` patterns, but
I haven't managed to fit them nicely in that structure :-(

> I would also probably want to standardize on the recommended type of TO
> anyway: some of them are likely going to result in better performance
> than others.

The TO is chosen by the specific completion table, based on what it can
handle best.  So it should always be "optimal".

> So I guess it's also a way to make every completion table aware of PRED?

Note also that these `pred` patterns are expected to be exclusively
looking at the string (they're used for `completion-styles` kind of
functionality), so nothing like `file-directory-p` or `fboundp` kind of
predicates here.

The `pred`s used in things like `completing-read` and `read-file-name`
would be handled elsewhere such as `completion-table-with-predicate`.
This part is still up in the air, tho.

> That should work; though it might be hard to reach the same raw performance
> as the current all-completions + completion-regexp-list.

I don't see why: currently my code actually uses `all-completions` and
`completion-regexp-list`, so as long as the pattern can be turned into
a regexp without requiring an additional PRED (that's usually the case),
it should be just as fast (at least for large tables; for small tables
we have the overhead of the various method calls and pattern
conversions, of course).


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sun, 29 Oct 2023 23:11:01 GMT) Full text and rfc822 format available.

Message #332 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Sun, 29 Oct 2023 23:12:36 +0000

Dmitry Gutov <dmitry <at> gutov.dev> writes:

> On 27/10/2023 20:16, João Távora wrote:

> Thanks. My measurements are similar, except the difference switches the
> other way a little bit. It might depend on the particulars of the
> individual machine anyway.

Yes, it could, but I've reproduced this in different hardware.

Check that you're taking enough samples; I take about 15-20.
Maybe the lazy-hilit patch pays an extra cost upfront for the very
first C-h v or completing-read, to create the property keys on the
strings, which are then reused.  This cost is amortized very quickly,
of course, but if you're taking the measurement immediately after Emacs
-Q and with few samples, it skews the numbers.

> All averages made using 'M-x calc-grab-region' followed by 'u M'.

Wow, thanks for this tip.  I wondered if there was an easier way than
M-x cua-rectangle-mark-mode + hand rolled avg function.

> I just meant that your version will be easier to use in
> company--match-from-capf-face (because it works on individual
> completions).

Ah, that was my intuition too.  I misunderstood you then, I thought you
meant the list version would be easier and I found that odd.

>>> Third, it made a principled stance to avoid altering the original
>>> strings, even the non-visual text properties. This approach could be
>>> adopted piecewise from Daniel's patch, especially if the performance
>>> ends up the same or insignificantly different in practical usage.
>> If we really wanted to, we could also adopt the non-propertizing
>> approach in my lazy-hilit patch, by calculating the score "just in
>> time", much like Daniel's patch does it.  But it should be clear that
>> what we save in allocation in completion-pcm--hilit-commonality, we'll
>> pay for in the same amount in consing later.  So no performance benefit.
>
> Not for performance, no. Although the way it separates the sorting
> into its own phase makes it easier to reason about that particular
> cost. And for 300000 symbols, scoring and sorting really take the most
> time, e.g. about 2/3rds. Which might help with optimizing it further
> down in the future, somehow,

I think further optimization would be localized to the scoring function
itself, not to the place where it is performed.

> But of course it would be nice to avoid the wart, so if you have any
> better ideas, they are welcome.

I'm not saying it would necessarily spread even further, but if you want
to do scoring "just in time" like I suggested -- presumably to
completely avoid propertizing strings -- that particular wart spreads a
little more and thus becomes something that is slightly harder to remove
later on.

> So far you advocated toward avoiding breaking changes while
> implementing the present performance improvement.

Both patches do that, so what I've been arguing for is simplicity and
coherence.  I don't think completion-sorted-completions and the
complexity it brings to minibuffer.el is a step in the right direction,
and the performance benefits it does undeniably bring can be achieved
with less drastic means.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 31 Oct 2023 03:21:02 GMT) Full text and rfc822 format available.

Message #335 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Tue, 31 Oct 2023 05:20:04 +0200

Hi Joao,

On 30/10/2023 01:12, João Távora wrote:

>> Thanks. My measurements are similar, except the difference switches the
>> other way a little bit. It might depend on the particulars of the
>> individual machine anyway.
> 
> Yes, it could, but I've reproduced this in different hardware.
> 
> Check that you're taking enough samples, I take about 15-20 samples.
> Maybe the lazy-hilit patch pays an extra cost upfront for the very
> first C-h v or completing-read, to create the property keys on the
> strings, which are then reused.

I actually do see that. At first I didn't pay much attention to such 
outliers. They usually look like the second measurement here:

Elapsed time: 0.541624s (0.164396s in 5 GCs)
Elapsed time: 0.861175s (0.415142s in 10 GCs)
Elapsed time: 0.486012s (0.057915s in 1 GCs)
Elapsed time: 0.505339s (0.055759s in 1 GCs)
Elapsed time: 0.481024s (0.057757s in 1 GCs)
Elapsed time: 0.471350s (0.056383s in 1 GCs)
Elapsed time: 0.495125s (0.056129s in 1 GCs)
Elapsed time: 0.513310s (0.058437s in 1 GCs)
Elapsed time: 0.491978s (0.057144s in 1 GCs)

Which keys are those? I only know about one key - 'completion-score'.

> This cost is amortized very quickly,
> of course, but if you're taking the measurement immediately after Emacs
> -Q and with few samples, it skews the numbers.

I always took around 16 samples, and now made sure to take exactly that 
number, starting with "y" and then "yo", "y", etc. Although the timing 
for the empty input is usually included (but it doesn't look too 
different from the rest).
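
A minimal sketch of how such samples could be collected and averaged in Lisp, assuming `benchmark-run` as the timing primitive. The workload (filtering `obarray` by a one-letter prefix) and the `my-demo-` names are made up and are not the measurement used in this thread.

```elisp
(require 'benchmark)

(defun my-demo-average (numbers)
  "Return the arithmetic mean of NUMBERS as a float."
  (/ (apply #'+ numbers) (float (length numbers))))

(defun my-demo-sample (fn n)
  "Call FN N times and return the list of elapsed times in seconds.
Each sample is the first element of the list returned by
`benchmark-run', which has the form (ELAPSED GC-COUNT GC-ELAPSED)."
  (let (samples)
    (dotimes (_ n)
      (push (car (benchmark-run 1 (funcall fn))) samples))
    (nreverse samples)))

;; Hypothetical workload: all symbols starting with "y".
(let ((samples (my-demo-sample
                (lambda () (all-completions "y" obarray))
                16)))
  ;; Dropping the first sample is one way to exclude a warm-up spike.
  (message "mean of %d samples: %.6fs"
           (length (cdr samples))
           (my-demo-average (cdr samples))))
```

Reporting the GC count next to each sample, as in the listing above, makes a spike easier to attribute to allocation than an average alone.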

Anyway, I retook the numbers a couple of times. One of the patches (the 
same one) still looks a little faster, but the fluctuations between runs 
are large enough to avoid making any big conclusions:

d+d
300000
completing-read 0.504
C-h v 0.630

lazy-hilit-2023-v4
300000
completing-read 0.517
C-h v 0.661

d+d again
300000
completing-read 0.486
C-h v 0.587

lazy-hilit-2023-v4 again
300000
completing-read 0.519
C-h v 0.651

And to double-check, these are comparisons between 
0001-Add-new-completion-filter-completions-API-and-deferr-v3.diff and 
lazy-hilit-2023-v4.diff.

16 samples every time. And I think I dropped the run with the "spike" in 
all or most of the above. The first of the patches doesn't seem to cause 
it, though.

>> All averages made using 'M-x calc-grab-region' followed by 'u M'.
> 
> Wow, thanks for this tip.  I wondered if there was an easier way than
> M-x cua-rectangle-mark-mode + hand rolled avg function.

No problem! I lost your snippet for avg, so had to google. :-/

rectangle-mark-mode (C-x SPC) is still part of the recipe, though.

>>>> Third, it made a principled stance to avoid altering the original
>>>> strings, even the non-visual text properties. This approach could be
>>>> adopted piecewise from Daniel's patch, especially if the performance
>>>> ends up the same or insignificantly different in practical usage.
>>> If we really wanted to, we could also adopt the non-propertizing
>>> approach in my lazy-hilit patch, by calculating the score "just in
>>> time", much like Daniel's patch does it.  But it should be clear that
>>> what we save in allocation in completion-pcm--hilit-commonality, we'll
>>> pay for in the same amount in consing later.  So no performance benefit.
>>
>> Not for performance, no. Although the way it separates the sorting
>> into its own phase makes it easier to reason about that particular
>> cost. And for 300000 symbols, scoring and sorting really take the most
>> time, e.g. about 2/3rds. Which might help with optimizing it further
>> down in the future, somehow,
> 
> I think further optimization would be localized to the scoring function
> itself, not to the place where it is performed.

Most likely, yes. It seems to be the most expensive part. But it still 
seems easier to measure/tweak when it happens during a separate step, 
rather than mixed in with the rest of completion-all-completions' business.

>> But of course it would be nice to avoid the wart, so if you have any
>> better ideas, they are welcome.
> 
> I'm not saying it would necessarily spread even further, but if you want
> to do scoring "just in time" like I suggested -- presumably to
> completely avoid propertizing strings -- that particular wart spreads a
> little more and thus becomes something that is slightly harder to remove
> later on.

Could you describe the other places you think it might spread to? Other 
completion styles like Orderless?

As long as there's only one place producing this property (as opposed to 
consuming it), it seems straightforward enough to remove anyway.

>> So far you advocated toward avoiding breaking changes while
>> implementing the present performance improvement.
> 
> Both patches do that, so what I've been arguing for is simplicity and
> coherence.  I don't think completion-sorted-completions and the
> complexity it brings to minibuffer.el is a step in the right direction,
> and the performance benefits it does undeniably bring can be achieved
> with less drastic means.

What I meant is, solving the quote-unquote conundrum might require a 
larger breaking change than the one that you wanted for this discussion.

Anyway, have you looked into what it would take to solve it? Such text 
propertization might actually work as a cheap replacement to returning 
"a function to ... requote them". If both highlighting and scoring 
functions work fine on the "unquoted" strings, then we would only need 
to make sure the "quoted" is used when the completion is inserted. Could 
we make a rule that every table-with-quoting would have to call a 
particular exit-function? Perhaps before the existing exit-function.

That might solve a bunch of things, though I don't see a robust way to 
enforce this practice, given that completion-table-with-quoting works on 
the level of completion tables, whereas :exit-function is only specified 
in the capf tuple.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 31 Oct 2023 10:57:02 GMT) Full text and rfc822 format available.

Message #338 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Tue, 31 Oct 2023 10:55:44 +0000

On Tue, Oct 31, 2023 at 3:20 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> Hi Joao,
>
> On 30/10/2023 01:12, João Távora wrote:
>
> >> Thanks. My measurements are similar, except the difference switches the
> >> other way a little bit. It might depend on the particulars of the
> >> individual machine anyway.
> >
> > Yes, it could, but I've reproduced this in different hardware.
> >
> > Check that you're taking enough samples, I take about 15-20 samples.
> > Maybe the lazy-hilit patch pays an extra cost upfront for the very
> > first C-h v or completing-read, to create the property keys on the
> > strings, which are then reused.
>
> I actually do see that. At first I didn't pay much attention to such
> outliers. They usually look like the second measurement here:
>
> Elapsed time: 0.541624s (0.164396s in 5 GCs)
> Elapsed time: 0.861175s (0.415142s in 10 GCs)
> Elapsed time: 0.486012s (0.057915s in 1 GCs)
> Elapsed time: 0.505339s (0.055759s in 1 GCs)
> Elapsed time: 0.481024s (0.057757s in 1 GCs)
> Elapsed time: 0.471350s (0.056383s in 1 GCs)
> Elapsed time: 0.495125s (0.056129s in 1 GCs)
> Elapsed time: 0.513310s (0.058437s in 1 GCs)
> Elapsed time: 0.491978s (0.057144s in 1 GCs)
>
> Which keys are those? I only know about one key - 'completion-score'.

Yes, that's the one.  I suppose adding this key to the property lists of
a large number of strings, which is only done once, is what's causing
the anomaly.

>
> > This cost is amortized very quickly,
> > of course, but if you're taking the measurement immediately after Emacs
> > -Q and with few samples, it skews the numbers.
>
> I always took around 16 samples, and now made sure to take exactly that
> number, starting with "y" and then "yo", "y", etc. Although the timing
> for the empty input is usually included (but it doesn't look too
> different from the rest).
>
> Anyway, I retook the numbers a couple of times. One of the patches (the
> same one) still looks a little faster, but the fluctuations between runs
> are large enough to avoid making any big conclusions:

As did I, and I get the same results I posted :-/

> > I think further optimization would be localized to the scoring function
> > itself, not to the place where it is performed.
>
> Most likely, yes. It seems to be the most expensive part. But it still
> seems easier to measure/tweak when it happens during a separate step,
> rather than mixed in with the rest of completion-all-completions' business.
>
> >> But of course it would be nice to avoid the wart, so if you have any
> >> better ideas, they are welcome.
> >
> > I'm not saying it would necessarily spread even further, but if you want
> > to do scoring "just in time" like I suggested -- presumably to
> > completely avoid propertizing strings -- that particular wart spreads a
> > little more and thus becomes something that is slightly harder to remove
> > later on.
>
> Could you describe the other places you think it might spread to? Other
> completion styles like Orderless?

Maybe, I don't know.  But here I just meant that implementing that idea
spreads it only one degree further.  I'm not saying it would necessarily
spread even further.

>
> As long as there's only one place producing this property (as opposed to
> consuming it), it seems straightforward enough to remove anyway.
>
> >> So far you advocated toward avoiding breaking changes while
> >> implementing the present performance improvement.
> >
> > Both patches do that, so what I've been arguing for is simplicity and
> > coherence.  I don't think completion-sorted-completions and the
> > complexity it brings to minibuffer.el is a step in the right direction,
> > and the performance benefits it does undeniably bring can be achieved
> > with less drastic means.
>
> What I meant is, solving the quote-unquote conundrum might require a
> larger breaking change than the one that you wanted for this discussion.

A larger change sure, not sure if "breaking" though.

> Anyway, have you looked into what it would take to solve it?

No, naively, I just think it's a similar situation of display and business
logic being mixed up.  Presumably the quoted stuff is just for insertion
(and display?), and the unquoted stuff is what patterns/scoring should
operate on.

But, IMO, there's no need to tackle it right now.

If the thing holding you back from the lazy-hilit-2023-v4.patch is the
completion-score propertization, I can move it to the sorting step
in a future v5 and spread the completion--unquoted thing a little
bit more.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 31 Oct 2023 20:53:01 GMT) Full text and rfc822 format available.

Message #341 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Tue, 31 Oct 2023 22:52:06 +0200

On 31/10/2023 12:55, João Távora wrote:
>> Which keys are those? I only know about one key - 'completion-score'.
> 
> Yes, that's the one.  I suppose adding this key to the property lists of
> a large number of strings, which is only done once, is what's causing
> the anomaly.

Ah, I didn't realize that the text property/value pairs were stored as 
plists internally as well. And that also triggers consing.
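
A small illustration of that point, using only the standard text property functions. The string and the score value are arbitrary.

```elisp
(let ((str (copy-sequence "yank")))
  ;; Attach a non-visual property to the whole string, as a scoring
  ;; step might do.  The score value here is made up.
  (put-text-property 0 (length str) 'completion-score 0.25 str)
  (list (text-properties-at 0 str)                    ; plist at index 0
        (get-text-property 0 'completion-score str))) ; single lookup
;; => ((completion-score 0.25) 0.25)
```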

>>>> But of course it would be nice to avoid the wart, so if you have any
>>>> better ideas, they are welcome.
>>>
>>> I'm not saying it would necessarily spread even further, but if you want
>>> to do scoring "just in time" like I suggested -- presumably to
>>> completely avoid propertizing strings -- that particular wart spreads a
>>> little more and thus becomes something that is slightly harder to remove
>>> later on.
>>
>> Could you describe the other places you think it might spread to? Other
>> completion styles like Orderless?
> 
> Maybe, I don't know.  But here I just meant that implementing that idea
> spreads it only one degree further.  I'm not saying it would necessarily
> spread even further.

It seems like the only code that would be concerned with it are 
completion styles that also do sorting, or completion tables that would 
do similar things to this "with quoting" business. But I'm not aware of 
any other examples of the latter aside from what is inside Emacs itself.

>> Anyway, have you looked into what it would take to solve it?
> 
> No, naively, I just think it's a similar situation of display and business
> logic being mixed up.  Presumably the quoted stuff is just for insertion
> (and display?), and the unquoted stuff is what patterns/scoring should
> operate on.

Apparently it's good for insertion, but according to that comment inside 
the function, the unquoted stuff might actually be better for display.

I'm not 100% clear which of the versions is better for 
scoring/highlighting, but apparently the unquoted one.

> But, IMO, there's no need to tackle it right now.
> 
> If the thing holding you back from the lazy-hilit-2023-v4.patch is the
> completion-score propertization, I can move it to the sorting step
> in a future v5 and spread the completion--unquoted thing a little
> bit more.

I think that's the main blocker, yes.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 01 Nov 2023 18:49:02 GMT) Full text and rfc822 format available.

Message #344 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Wed, 1 Nov 2023 18:47:25 +0000

On Tue, Oct 31, 2023 at 8:52 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> It seems like the only code that would be concerned with it are
> completion styles that also do sorting, or completion tables that would
> do similar things to this "with quoting" business. But I'm not aware of
> any other examples of the latter aside from what is inside Emacs itself.

If orderless (which I've never tried) does some kind of scoring of
completions, it probably also needs the same complications as flex.

> >> Anyway, have you looked into what it would take to solve it?
> >
> > No, naively, I just think it's a similar situation of display and business
> > logic being mixed up.  Presumably the quoted stuff is just for insertion
> > (and display?), and the unquoted stuff is what patterns/scoring should
> > operate on.
>
> Apparently it's good for insertion, but according to that comment inside
> the function, the unquoted stuff might actually be better for display.

No idea what the unquoted stuff is for, so I haven't really tested it.

> I'm not 100% clear which of the versions is better for
> scoring/highlighting, but apparently the unquoted one.
>
> > But, IMO, there's no need to tackle it right now.
> >
> > If the thing holding you back from the lazy-hilit-2023-v4.patch is the
> > completion-score propertization, I can move it to the sorting step
> > in a future v5 and spread the completion--unquoted thing a little
> > bit more.
>
> I think that's the main blocker, yes.

Alright, here goes v5 then, with this change.  Note I've implemented
this unquoted thing, which kicks in for C-x C-f, but I haven't actually
seen any strings that have different "quoted" and "non-quoted" versions.

The performance of the three main patches, as measured on yet
another machine:

;; C-h v
;;
;; Daniel+Dmitry: 0.696340454545
;; lazy hilit v4: 0.692849642852
;; lazy hilit v5: 0.683088541667
;;
;; completing-read
;;
;; Daniel+Dmitry: 0.590994909091
;; lazy hilit v4: 0.586523307692
;; lazy hilit v5: 0.586165466667

Nothing unexpected.

So if you're satisfied with the general design now, maybe
we should start looking at finer details, docstrings, style,
etc.

João

[lazy-hilit-2023-v5.diff (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 01 Nov 2023 22:47:02 GMT) Full text and rfc822 format available.

Message #347 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 00:45:18 +0200

On 01/11/2023 20:47, João Távora wrote:
> On Tue, Oct 31, 2023 at 8:52 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
>> It seems like the only code that would be concerned with it are
>> completion styles that also do sorting, or completion tables that would
>> do similar things to this "with quoting" business. But I'm not aware of
>> any other examples of the latter aside from what is inside Emacs itself.
> 
> If orderless (which I've never tried) does some kind of scoring of
> completions, it probably also needs the same complications as flex.

Turns out, Orderless doesn't do any scoring or sorting. Only filtering.

>>>> Anyway, have you looked into what it would take to solve it?
>>>
>>> No, naively, I just think it's a similar situation of display and business
>>> logic being mixed up.  Presumably the quoted stuff is just for insertion
>>> (and display?), and the unquoted stuff is what patterns/scoring should
>>> operate on.
>>
>> Apparently it's good for insertion, but according to that comment inside
>> the function, the unquoted stuff might actually be better for display.
> 
> No idea what the unquoted stuff is for, so I haven't really tested it.

A couple of scenarios:

First:
1. sudo mkdir /home/${USER}-foobar
2. C-x C-f /home/${USER} TAB ; it shows both directories inside home as

  ${USER}-foo/
  ${USER}/

Second:
1. mkdir ~/examples/test\ test\ test/
2. mkdir ~/examples/test\ test/
3. M-x shell
4. In the shell buffer, type 'ls ~/examples/test\ ' and TAB. See:

  test\ test/
  test\ test\ test/

In the current implementation, both the inputs and the text in the 
completions buffer that we see, are "quoted". The "unquoted" versions 
would be the directory name with the variable substitution performed, 
and the directory names without backslashes.
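
A minimal sketch of the two kinds of "unquoting" in these scenarios. The helper `my-demo-unquote-shell` is hypothetical and much simpler than what comint does, and USER is bound to a made-up value for the demonstration.

```elisp
(defun my-demo-unquote-shell (str)
  "Return STR with backslash escapes removed (simplified unquoting)."
  (replace-regexp-in-string "\\\\\\(.\\)" "\\1" str))

;; Second scenario: the quoted name as typed in the shell buffer.
(my-demo-unquote-shell "test\\ test\\ test/")
;; => "test test test/"

;; First scenario: `substitute-in-file-name' performs the variable
;; substitution that turns the quoted file name into the unquoted one.
(let ((process-environment (cons "USER=alice" process-environment)))
  (substitute-in-file-name "/home/${USER}-foobar"))
;; => "/home/alice-foobar"
```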

>> I'm not 100% clear which of the versions is better for
>> scoring/highlighting, but apparently the unquoted one.
>>
>>> But, IMO, there's no need to tackle it right now.
>>>
>>> If the thing holding you back from the lazy-hilit-2023-v4.patch is the
>>> completion-score propertization, I can move it to the sorting step
>>> in a future v5 and spread the completion--unquoted thing a little
>>> bit more.
>>
>> I think that's the main blocker, yes.
> 
> Alright, here goes v5 then, with this change.  Note I've implemented
> this unquoted thing, which kicks in for C-x C-f, but I haven't actually
> seen any strings that have different "quoted" and "non-quoted" versions.
> 
> The performance of the three main patches, as measured on yet
> another machine:
> 
> ;; C-h v
> ;;
> ;; Daniel+Dmitry: 0.696340454545
> ;; lazy hilit v4: 0.692849642852
> ;; lazy hilit v5: 0.683088541667
> ;;
> ;; completing-read
> ;;
> ;; Daniel+Dmitry: 0.590994909091
> ;; lazy hilit v4: 0.586523307692
> ;; lazy hilit v5: 0.586165466667
> 
> Nothing unexpected.

Confirm. The "property allocation" spikes are gone too.

> So if you're satisfied with the general design now, maybe
> we should start looking at finer details, docstrings, style,
> etc.

LGTM overall, and I see that you compressed the sorting code a little.

Both quoting/unquoting scenarios also seem to work as expected (for 
highlighting, that seems to be thanks to completion--twq-all applying 
the faces eagerly anyway).

Though given the examples (and I think others should be similar) it 
wouldn't be the end of the world if scoring didn't really work for them 
-- filtering should have already done most of the job. All of this is to 
say that any new 3rd party completion styles, even those that do 
sorting, would be okay without knowing about this text property.

Some minor nits for the patch:

> +Completion-presenting frontends may opt to bind this variable to
> +non-nil value in the context of completion-producing calls (such
> +as `completion-all-sorted-completions').  This hints the

I suggest mentioning `completion-all-completions' instead, as it is more 
often used directly by the frontends.

> +responsible this fontification.  The frontend binds this variable

responsible for

> +hint and greedily fontify as usual.  It is still safe for a

"fontify eagerly"? I think that's a more common term than "greedily".

> +  "Used by completions styles to honouring `completion-lazy-hilit'.

"to honour", or "styles honouring"

> +(defun completion--flex-score (str re &optional dont-error)

Looks like the third argument is unused in both callers. I think it was 
intended for compose-flex-sort-fn.

> +see) for later lazy highlighting"

Missing period.

> +                      ;; Lazy highlight not requested, so strings are
> +                      ;; assumed to already contain `completion-score'
> +                      ;; (and highlighting) and we can freely destroy
> +                      ;; list.

Perhaps drop the last two lines, since IIUC the list can be 
destructively sorted in both cases, lazy highlighting or not.

I guess we should wait a few days to see if anyone has more comments, 
and then install this?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 09:50:01 GMT) Full text and rfc822 format available.

Message #350 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 09:48:51 +0000

On Wed, Nov 1, 2023 at 10:45 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> > If orderless (which I've never tried) does some kind of scoring of
> > completions, it probably also needs the same complications as flex.
>
> Turns out, Orderless doesn't do any scoring or sorting. Only filtering.

Interesting, so if I M-x d i f f with orderless I don't get results in
any particular order?

> >>>> Anyway, have you looked into what it would take to solve it?
> >>>
> >>> No, naively, I just think it's a similar situation of display and business
> >>> logic being mixed up.  Presumably the quoted stuff is just for insertion
> >>> (and display?), and the unquoted stuff is what patterns/scoring should
> >>> operate on.
> >>
> >> Apparently it's good for insertion, but according to that comment inside
> >> the function, the unquoted stuff might actually be better for display.
> >
> > No idea what the unquoted stuff is for, so I haven't really tested it.
>
> A couple of scenarios:

Thanks.  Then I think it is working OK, but it would be safer if you were
to double-check yourself, as I really never use this functionality.

> LGTM overall, and I see that you compressed the sorting code a little.
>
> Both quoting/unquoting scenarios also seem to work as expected (for
> highlighting, that seems to be thanks to completion--twq-all applying
> the faces eagerly anyway).

That's good.

> Though given the examples (and I think others should be similar) it
> wouldn't be the end of the world if scoring didn't really work for them
> -- filtering should have already done most of the job. All of this is to
> say that any new 3rd party completion styles, even those that do
> sorting, would be okay without knowing about this text property.

Maybe.

> Some minor nits for the patch:

Thanks.

> I guess we should wait a few days to see if anyone has more comments,
> and then install this?

I addressed all your docstring suggestions, fixed a bug and significantly
simplified the code in the latest version of the patch.  I also
removed the instrumentation in icomplete.el.  Patch attached here
and pushed to feature/completion-lazy-hilit.

Stefan, Eli, would you like to chime in?

João

[lazy-hilit-2023-v6.diff (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 10:12:01 GMT) Full text and rfc822 format available.

Message #353 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: dmitry <at> gutov.dev, 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 mail <at> daniel-mendler.de
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 02 Nov 2023 12:10:50 +0200

> From: João Távora <joaotavora <at> gmail.com>
> Date: Thu, 2 Nov 2023 09:48:51 +0000
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Daniel Mendler <mail <at> daniel-mendler.de>, 
> 	Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
> 
> Stefan, Eli, would you like to chime in?

Chime in on what aspect(s) of this discussion (or the patch)?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 10:42:02 GMT) Full text and rfc822 format available.

Message #356 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: dmitry <at> gutov.dev, 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 mail <at> daniel-mendler.de
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 10:39:53 +0000

On Thu, Nov 2, 2023 at 10:11 AM Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > From: João Távora <joaotavora <at> gmail.com>
> > Date: Thu, 2 Nov 2023 09:48:51 +0000
> > Cc: Eli Zaretskii <eliz <at> gnu.org>, Daniel Mendler <mail <at> daniel-mendler.de>,
> >       Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
> >
> > Stefan, Eli, would you like to chime in?
>
> Chime in on what aspect(s) of this discussion (or the patch)?

The patch condenses the results of the discussion of last week,
as resuscitated by Dmitry after a 2-year long hiatus.

After some rounds of benchmarking and discussion, Dmitry and I
think the latest version of the patch should be installed.

If you've not been following closely, the variables the patch
introduces have docstrings explaining the functionality added, as
do the commit messages to feature/completion-lazy-hilit.

In a nutshell it solves the performance problem of overly eager
completion highlighting with minimal changes to the completion API.
The original patch proposed by Daniel Mendler was also partly about
solving this problem, but with much more extensive changes.

NEWS entries -- and possibly manual entries -- are also needed, not
done yet.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 11:00:02 GMT) Full text and rfc822 format available.

Message #359 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: dmitry <at> gutov.dev, 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 mail <at> daniel-mendler.de
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 02 Nov 2023 12:58:27 +0200

> From: João Távora <joaotavora <at> gmail.com>
> Date: Thu, 2 Nov 2023 10:39:53 +0000
> Cc: dmitry <at> gutov.dev, mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 
> 	47711 <at> debbugs.gnu.org
> 
> > > Stefan, Eli, would you like to chime in?
> >
> > Chime in on what aspect(s) of this discussion (or the patch)?
> 
> The patch condenses the results of the discussion of last week,
> as resuscitated by Dmitry after a 2-year long hiatus.
> 
> After some rounds of benchmarking and discussion, Dmitry and I
> think the latest version of the patch should be installed.

Are there any problematic aspects of the patch that need to be
discussed or considered before installing the patch?

IOW, why are you soliciting our opinions, instead of just going ahead
and installing?

> In a nutshell it solves the performance problem of overly eager
> completion highlighting with minimal changes to the completion API.

It looks to me like it adds a new feature, not just solves a
performance problem?

Some minor comments to the patch itself:

> +(defvar-local completion-lazy-hilit nil
> +  "If non-nil, request completion lazy highlighting.
> +
> +Completion-presenting frontends may opt to bind this variable to
> +non-nil value in the context of completion-producing calls (such
> +as `completion-all-completions').  This hints the intervening
> +completion styles that they do not need to
> +fontify (i.e. propertize with the `face' property) completion
> +strings with highlights of the matching parts.

If this is intended to be bound by frontends, why is it defvar-local?
I thought let-binding buffer-local variables is a tricky business that
could have unexpected results?

Also, I think this doc string should reference
completion-lazy-hilit-fn.

> +(defvar completion-lazy-hilit-fn nil
> +  "Used by completions styles honoring `completion-lazy-hilit'.

This should mention "function", since just "Used to..." doesn't convey
that, and "-fn" could also mean "file name", not just "function".

> +(defun completion-lazy-hilit (str)
> +  "Return a copy of completion STR that is `face'-propertized.
                                              ^^^^^^^^^^^^^^^^^^
Strange quoting.  "face" is not a symbol that we want to have a link
to, is it?

I see a few more places in the doc strings that will "need work", but
that can be done later.

What did you want to say in NEWS about this?  If it's just a
performance improvement, we don't normally mention them in NEWS.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 11:14:01 GMT) Full text and rfc822 format available.

Message #362 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: dmitry <at> gutov.dev, 47711 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 mail <at> daniel-mendler.de
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 11:12:15 +0000

On Thu, Nov 2, 2023 at 10:58 AM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > After some rounds of benchmarking and discussion, Dmitry and I
> > think the latest version of the patch should be installed.
>
> Are there any problematic aspects of the patch that need to be
> discussed or considered before installing the patch?
>
> IOW, why are you soliciting our opinions, instead of just going ahead
> and installing?

No particular reason.  Dmitry suggested that we do, and you
participated in this discussion a while back, and I know you
normally have a useful comment or two.

> > In a nutshell it solves the performance problem of overly eager
> > completion highlighting with minimal changes to the completion API.
>
> It looks to me like it adds a new feature, not just solves a
> performance problem?
>
> Some minor comments to the patch itself:
>
> > +(defvar-local completion-lazy-hilit nil
> > +  "If non-nil, request completion lazy highlighting.
> > +
> > +Completion-presenting frontends may opt to bind this variable to
> > +non-nil value in the context of completion-producing calls (such
> > +as `completion-all-completions').  This hints the intervening
> > +completion styles that they do not need to
> > +fontify (i.e. propertize with the `face' property) completion
> > +strings with highlights of the matching parts.
>
> If this is intended to be bound by frontends, why is it defvar-local?
> I thought let-binding buffer-local variables is a tricky business that
> could have unexpected results?

Good catch!  It shouldn't be defvar-local indeed, not with this latest
version.  See, glad I called you ;-)

> Also, I think this doc string should reference
> completion-lazy-hilit-fn.
>
> > +(defvar completion-lazy-hilit-fn nil
> > +  "Used by completions styles honoring `completion-lazy-hilit'.
>
> This should mention "function", since just "Used to..." doesn't convey
> that, and "-fn" could also mean "file name", not just "function".

Makes sense.

> > +(defun completion-lazy-hilit (str)
> > +  "Return a copy of completion STR that is `face'-propertized.
>                                               ^^^^^^^^^^^^^^^^^^
> Strange quoting.  "face" is not a symbol that we want to have a link
> to, is it?

It's not a symbol we'll be referencing, but it's a symbol.  I'll
rewrite it.

>
> I see a few more places in the doc strings that will "need work", but
> that can be done later.
>
> What did you want to say in NEWS about this?  If it's just a
> performance improvement, we don't normally mention them in NEWS.

But it needs frontends to opt in, and that requires a non-breaking
addition to completion API.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 14:42:01 GMT) Full text and rfc822 format available.

Message #365 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 16:40:29 +0200

On 02/11/2023 11:48, João Távora wrote:
> On Wed, Nov 1, 2023 at 10:45 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
>>> If orderless (which I've never tried), does some kind of scoring of
>>> completions, it probably also needs the same complications of flex.
>>
>> Turns out, Orderless doesn't do any scoring or sorting. Only filtering.
> 
> Interesting, so if I M-x d i f f with orderless I don't get results in
> any particular order?

Right.

And input "diff" actually results in prefix-only matching. But if the 
input contains space (or any other pre-configured delimiter character), 
then it's translated to a set of substring matches, performed in any 
order. So "diff mode" does not translate to ".*diff.*mode", but a more 
complex regexp with all permutations.

IOW, this whole approach results in stricter matching with fewer 
results, so a smarter sort isn't that necessary.

>>>>>> Anyway, have you looked into what it would take to solve it?
>>>>>
>>>>> No, naively, I just think it's a similar situation of display and business
>>>>> logic being mixed up.  Presumably the quoted stuff is just for insertion
>>>>> (and display?), and the unquoted stuff is what patterns/scoring should
>>>>> operate on.
>>>>
>>>> Apparently it's good for insertion, but according to that comment inside
>>>> the function, the unquoted stuff might actually be better for display.
>>>
>>> No idea what the unquoted stuff is for, so I haven't really tested it.
>>
>> A couple of scenarios:
> 
> Thanks.  Then I think it is working OK, but it would be safer if you were
> to double-check yourself, as I really never use this functionality.

They look okay, yes. I imagine Daniel also tested the 
'completion--unquoted' solution before offering his patches.

> I addressed all your docstring suggestions, fixed a bug and significantly
> simplified the code in the latest version of the patch.  I also
> removed the instrumentation in icomplete.el.  Patch attached here
> and pushed to feature/completion-lazy-hilit.

Thank you. I don't see what the bug was, but deferring the scoring even 
in the eager highlighting case looks sensible (why not, indeed, if the 
performance is no worse).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 15:25:02 GMT) Full text and rfc822 format available.

Message #368 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 15:24:05 +0000

On Thu, Nov 2, 2023 at 2:40 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> IOW, this whole approach results in stricter matching with fewer
> results, so a smarter sort isn't that necessary.

Just curious, so in orderless, what do I type to quickly select
M-x vc-diff or M-x vc-version-diff or M-x vc-ediff?

In flex I just type "vcdiff" and these results normally bubble to the top.

> > I addressed all your docstring suggestions, fixed a bug and significantly
> > simplified the code in the latest version of the patch.  I also
> > removed the instrumentation in icomplete.el.  Patch attached here
> > and pushed to feature/completion-lazy-hilit.
>
> Thank you. I don't see what the bug was, but deferring the scoring even
> in the eager highlighting case looks sensible (why not, indeed, if the
> performance is no worse).

The bug was in the eager highlighting case.  Wasn't sorting at all.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 15:37:02 GMT) Full text and rfc822 format available.

Message #371 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 17:36:06 +0200

On 02/11/2023 17:24, João Távora wrote:
> On Thu, Nov 2, 2023 at 2:40 PM Dmitry Gutov<dmitry <at> gutov.dev>  wrote:
> 
>> IOW, this whole approach results in stricter matching with fewer
>> results, so a smarter sort isn't that necessary.
> Just curious, so in orderless, what do I type to quickly select
> M-x vc-diff or M-x vc-version-diff or M-x vc-ediff?
> 
> In flex I just type "vcdiff" and these results normally bubble to the top.

I'm not really a user of it (yet?), but

"vc-dif" or "vc dif" matches the first one, and "vc-ver" or "vc vers" 
matches the second one. Not a lot of difference in the amount of typing, 
but a little more control on the part of the user.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 16:00:02 GMT) Full text and rfc822 format available.

Message #374 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 15:58:10 +0000

On Thu, Nov 2, 2023 at 3:36 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 02/11/2023 17:24, João Távora wrote:
> > On Thu, Nov 2, 2023 at 2:40 PM Dmitry Gutov<dmitry <at> gutov.dev>  wrote:
> >
> >> IOW, this whole approach results in stricter matching with fewer
> >> results, so a smarter sort isn't that necessary.
> > Just curious, so in orderless, what do I type to quickly select
> > M-x vc-diff or M-x vc-version-diff or M-x vc-ediff?
> >
> > In flex I just type "vcdiff" and these results normally bubble to the top.
>
> I'm not really a user of it (yet?), but
>
> "vc-dif" or "vc dif" matches the first one, and "vc-ver" or "vc vers"
> matches the second one.

So "vc dif" doesn't also match 'vc-ediff' and 'vc-version-diff'?

> Not a lot of difference in the amount of typing,
> but a little more control on the part of the user.

That's debatable, I like to be able to type 'vcdiff' and
see all the commands that vc.el offers for diffing things.

But to each their own, of course.
João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 16:05:01 GMT) Full text and rfc822 format available.

Message #377 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 18:03:49 +0200

On 02/11/2023 17:58, João Távora wrote:
> On Thu, Nov 2, 2023 at 3:36 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>>
>> On 02/11/2023 17:24, João Távora wrote:
>>> On Thu, Nov 2, 2023 at 2:40 PM Dmitry Gutov<dmitry <at> gutov.dev>  wrote:
>>>
>>>> IOW, this whole approach results in stricter matching with fewer
>>>> results, so a smarter sort isn't that necessary.
>>> Just curious, so in orderless, what do I type to quickly select
>>> M-x vc-diff or M-x vc-version-diff or M-x vc-ediff?
>>>
>>> In flex I just type "vcdiff" and these results normally bubble to the top.
>>
>> I'm not really a user of it (yet?), but
>>
>> "vc-dif" or "vc dif" matches the first one, and "vc-ver" or "vc vers"
>> matches the second one.
> 
> So "vc dif" doesn't also match 'vc-ediff' and 'vc-version-diff'?

It does, of course. I just described what I imagine is the more common 
scenario: continue typing until the command you want is at the top, so 
you don't have to reach for C-n/C-p or the arrow keys.

>> Not a lot of difference in the amount of typing,
>> but a little more control on the part of the user.
> 
> That's debatable, I like to be able to type 'vcdiff' and
> see all the commands that vc.el offers for diffing things.

That also works for Orderless and "vc dif".

And from that you can continue to "vc dif ver", which brings 
"vc-version-diff" to the top; that's something unique to Orderless, I 
suppose.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 16:11:02 GMT) Full text and rfc822 format available.

Message #380 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 16:09:10 +0000

On Thu, Nov 2, 2023 at 4:03 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> > That's debatable, I like to be able to type 'vcdiff' and
> > see all the commands that vc.el offers for diffing things.
>
> That also works for Orderless and "vc dif".

Ah, but the order in which all these commands appear is
arbitrary, if I understand correctly.  It must be, since
there's no sorting, right?

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Thu, 02 Nov 2023 16:17:02 GMT) Full text and rfc822 format available.

Message #383 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Thu, 2 Nov 2023 18:15:31 +0200

On 02/11/2023 18:09, João Távora wrote:
> On Thu, Nov 2, 2023 at 4:03 PM Dmitry Gutov<dmitry <at> gutov.dev>  wrote:
> 
>>> That's debatable, I like to be able to type 'vcdiff' and
>>> see all the commands that vc.el offers for diffing things.
>> That also works for Orderless and "vc dif".
> Ah, but the order in which all these commands appear is
> arbitrary, if I understand correctly.  It must be, since
> there's no sorting, right?

Seems so.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 03 Nov 2023 00:18:02 GMT) Full text and rfc822 format available.

Message #386 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Fri, 3 Nov 2023 02:16:23 +0200

Hi Stefan,

Sorry for the pause, let's continue this digression. Though perhaps it 
would be better moved to emacs-devel or somewhere else. ;-/

On 29/10/2023 06:41, Stefan Monnier wrote:
>> FWIW, this neat structure might not help too much: the most popular external
>> completion backend (the LSP language servers, collectively) don't accept
>> regexps or globs, they just send you the lists of completions available at
>> point. With the name matching method sometimes configurable per-server.
> 
> Largely agreed.  The main benefit tho is that you get just *one*
> pattern, rather than three (one being the prefix argument, the second
> being the `pred` arg which historically was so unused that it was abused
> to hold the directory for file-name completion, so lots of tables don't
> obey it, and the third being the `completion-regexp-list` that most coders
> forget, and those who don't end up regretting not forgetting when it was
> not meant for them), so it's much more clear.

The motivation all makes sense.

>> As such, the most useful methods currently are: 1) Emacs Regexp, 2) asking
>> server for whatever it thinks is suitable (the "backend" completion style).
> 
> For the backends: agreed.
> For the frontends (i.e. `completion-styles`), `glob` is the more useful
> one, I'd say (except for the "external" style, of course).
> 
> We might also want support for things like `or` and `and` patterns, but
> I haven't managed to fit them nicely in that structure :-(

Possibly, but which code would produce such patterns?

>> I would also probably want to standardize on the recommended type of TO
>> anyway: some of them are likely going to result in better performance
>> than others.
> 
> The TO is chosen by the specific completion table, based on what it can
> handle best.  So it should always be "optimal".

Sorry, I meant the recommended type of FROM. Because if the original 
caller passes an arbitrary regexp, it will often get turned into a pair 
with a predicate where the latter calls string-match-p.

And if the type of FROM is standardized, there likely would be no need 
for a four-way bidirectional conversion. Maybe just a helper that 
converts from the original "main" type into whichever of the available 
types is currently required.

>> So I guess it's also a way to make every completion table aware of PRED?
> 
> Note also that these `pred` patterns are expected to be exclusively
> looking at the string (they're used for `completion-styles` kind of
> functionality), so nothing like `file-directory-p` or `fboundp` kind of
> predicates here.

Would we consider those pred's "fast enough"?

I've done a little comparison of completion-regexp-list vs pred:

(setq sss (cl-loop repeat 300000 collect (symbol-name (gensym "yoyo"))))

(benchmark-run 10 (let* ((re "yo[^o]*o")
                         (completion-regexp-list (list re)))
                    (all-completions "" sss)))
;; => 0.60s


(benchmark-run 10 (let* ((re "yo[^o]*o"))
                    (all-completions "" sss
                             (lambda (s) (string-match-p re s)))))
;; => 1.14s

> The `pred`s used in things like `completing-read` and `read-file-name`
> would be handled elsewhere such as `completion-table-with-predicate`.
> This part is still up in the air, tho.
> 
>> That should work; though it might be hard to reach the same raw performance
>> as the current all-completions + completion-regexp-list.
> 
> I don't see why: currently my code actually uses `all-completions` and
> `completion-regexp-list`, so as long as the pattern can be turned into
> a regexp without requiring an additional PRED (that's usually the case), <...>

Right, I was talking about the possible exceptions.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Fri, 03 Nov 2023 03:07:02 GMT) Full text and rfc822 format available.

Message #389 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>,
 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2]
 Add new `completion-filter-completions` API and deferred highlighting
Date: Thu, 02 Nov 2023 23:05:13 -0400

> Sorry for the pause, let's continue this digression. Though perhaps it would
> be better moved to emacs-devel or somewhere else. ;-/

I see this API as an experiment.  I have no idea if I'll like
the result.  It's definitely far from being something ready to submit as
a proposal for a new design.

>>> As such, the most useful methods currently are: 1) Emacs Regexp, 2) asking
>>> server for whatever it thinks is suitable (the "backend" completion style).
>> For the backends: agreed.
>> For the frontends (i.e. `completion-styles`), `glob` is the more useful
>> one, I'd say (except for the "external" style, of course).
>> We might also want support for things like `or` and `and` patterns, but
>> I haven't managed to fit them nicely in that structure :-(
> Possibly, but which code would produce such patterns?

The `or` pattern?  No idea :-)
The `and` pattern?  Well, the `orderless` style, for one.

But indeed, I'm not sure it'd be useful to handle things like or/and
directly in there rather than by using union/intersection on
the resulting completions.  It's just an aspect of the design
I considered and I noticed that I had trouble extending it in
that direction.

Obviously, the caller which needs to collect a set of matching
candidates always has a choice between using a more refined pattern
or using a simpler pattern (including various calls with various
different patterns).

>>> I would also probably want to standardize on the recommended type of TO
>>> anyway: some of them are likely going to result in better performance
>>> than others.
>> The TO is chosen by the specific completion table, based on what it can
>> handle best.  So it should always be "optimal".
> Sorry, I meant the recommended type of FROM. Because if the original caller
> passes an arbitrary regexp, it will often get turned into a pair with
> a predicate where the latter calls string-match-p.

The caller should use the most primitive pattern they can.

> And if the type of FROM is standardized, there likely would be no need for
> a four-way bidirectional conversion. Maybe just a helper that converts from
> the original "main" type into whichever of the available types is
> currently required.

Indeed the default method does:

  (cond
   ((eq to (car pattern)) (cons nil (cdr pattern)))
   ((eq 'glob to) (cl-call-next-method))
   (t
    ;; Most conversions can be performed by going through `glob'.
    (pcase-let* ((`(,gpred . ,glob)
                  (completion-pattern-convert 'glob pattern))
                 (`(,tpred . ,newpattern)
                  (completion-pattern-convert to glob)))
      (cons (or gpred tpred) newpattern)))))

>>> So I guess it's also a way to make every completion table aware of PRED?
>> Note also that these `pred` patterns are expected to be exclusively
>> looking at the string (they're used for `completion-styles` kind of
>> functionality), so nothing like `file-directory-p` or `fboundp` kind of
>> predicates here.
> Would we consider those pred's "fast enough"?

I don't know.  I haven't had a use for a `pred` pattern yet, to
be honest.  As for the predicates returned by
`completion-pattern-convert`, they're currently just fancy booleans
indicating if the returned pattern is faithful or not :-)
So I'm not sure those predicates will survive this experiment.

> (benchmark-run 10 (let* ((re "yo[^o]*o")
>                          (completion-regexp-list (list re)))
>                     (all-completions "" sss)))
> ;; => 0.60s
>
>
> (benchmark-run 10 (let* ((re "yo[^o]*o"))
>                     (all-completions "" sss
>                              (lambda (s) (string-match-p re s)))))
> ;; => 1.14s

>>> That should work; though it might be hard to reach the same raw performance
>>> as the current all-completions + completion-regexp-list.
>> I don't see why: currently my code actually uses `all-completions` and
>> `completion-regexp-list`, so as long as the pattern can be turned into
>> a regexp without requiring an additional PRED (that's usually the case), <...>
> Right, I was talking about the possible exceptions.

I don't know what you're getting at.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Sat, 04 Nov 2023 18:48:01 GMT) Full text and rfc822 format available.

Message #392 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Howard Melman <hmelman <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#47711: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH
 VERSION 2] Add new `completion-filter-completions` API and deferred
 highlighting
Date: Sat, 04 Nov 2023 14:46:43 -0400

Dmitry Gutov <dmitry <at> gutov.dev> writes:

> On 02/11/2023 18:09, João Távora wrote:
>> On Thu, Nov 2, 2023 at 4:03 PM Dmitry Gutov<dmitry <at> gutov.dev>  wrote:
>> 
>>>> That's debatable, I like to be able to type 'vcdiff' and
>>>> see all the commands that vc.el offers for diffing things.
>>> That also works for Orderless and "vc dif".
>> Ah, but the order in which all these commands appear is
>> arbitrary, if I understand correctly.  It must be, since
>> there's no sorting, right?
>
> Seems so.

FYI, it's true that orderless only does filtering: it's just a
completion style and leaves sorting to the completion UI
(I use it with vertico, but others work too).

It's configurable so what people can input can vary a lot,
but the main feature is that each space-separated bit is used
in any order.  So in the above example "vc dif" and "dif vc"
would both work.  The second is unlikely in this example
but it's far more useful when searching for function or
variable names.

The other great feature is that each "word" can be evaluated
in different ways.  I typically use them as regexps, but it's
also easy to add syntax.  It's common to make a leading ! in
a word mean "without".  So an input of "file !--" would
match all things that include file anywhere and don't
have -- anywhere.  "^rx- !--" matches everything in the
public rx API (well, anything beginning with rx- without --).

-- 

Howard

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 06 Nov 2023 16:22:02 GMT) Full text and rfc822 format available.

Message #395 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Mon, 6 Nov 2023 16:20:52 +0000

On Wed, Nov 1, 2023 at 10:45 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> I guess we should wait a few days to see if anyone has more comments,
> and then install this?

Five days elapsed, and no more comments came in, so I addressed your
comments and Eli's and I pushed this to master as
dfffb91a70532ac0021648ba692336331cbe0499.

Thanks,
João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 06 Nov 2023 19:40:02 GMT) Full text and rfc822 format available.

Message #398 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Mon, 6 Nov 2023 21:38:55 +0200

On 06/11/2023 18:20, João Távora wrote:
> On Wed, Nov 1, 2023 at 10:45 PM Dmitry Gutov<dmitry <at> gutov.dev>  wrote:
> 
>> I guess we should wait a few days to see if anyone has more comments,
>> and then install this?
> Five days elapsed, and no more comments came in, so I addressed your
> comments and Eli's and I pushed this to master as
> dfffb91a70532ac0021648ba692336331cbe0499.

Thanks!

And thanks to Daniel for the original proposal and design work.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Tue, 07 Nov 2023 12:15:02 GMT) Full text and rfc822 format available.

Message #401 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Tue, 7 Nov 2023 12:13:05 +0000

On Mon, Nov 6, 2023 at 7:39 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 06/11/2023 18:20, João Távora wrote:
> > On Wed, Nov 1, 2023 at 10:45 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> >
> >> I guess we should wait a few days to see if anyone has more comments,
> >> and then install this?
> > Five days elapsed, and no more comments came in, so I addressed your
> > comments and Eli's and I pushed this to master as
> > dfffb91a70532ac0021648ba692336331cbe0499.

Here's another place where completion-lazy-hilit could be leveraged:

diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index ca2b25415f1..bb2670bccf6 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -2067,7 +2067,7 @@ completion--insert-strings
         ;; when the caller uses tabs inside prefix.
         (setq colwidth (- colwidth (mod colwidth completion-tab-width))))
       (funcall (intern (format "completion--insert-%s" completions-format))
-               strings group-fun length wwidth colwidth columns))))
+               (mapcar #'completion-lazy-hilit strings) group-fun length wwidth colwidth columns))))

 (defun completion--insert-horizontal (strings group-fun
                                               length wwidth
@@ -2378,6 +2378,7 @@ minibuffer-completion-help
          (end (or end (point-max)))
          (string (buffer-substring start end))
          (md (completion--field-metadata start))
+         (completion-lazy-hilit t)
          (completions (completion-all-completions
                        string
                        minibuffer-completion-table

Any objections?  Seems to speed it up when flex is the preferred
completion style outside icomplete.  These are average times
collected from the instrumentation of the above
completion-all-completions when doing a M-: (setq i TAB)

I just used my normal Emacs session for this.

with lazy hilit:      0.104536125
without lazy hilit:   0.172522571
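
For reference, the pattern the patch relies on can be sketched roughly as
follows. This is only an illustration, assuming Emacs 30 or later, where
both the variable `completion-lazy-hilit` and the function of the same
name exist in minibuffer.el. The helper name `my/visible-candidates` and
its N argument are hypothetical and do not come from the patch.

```elisp
;;; -*- lexical-binding: t -*-
(require 'minibuffer)
(require 'seq)

;; Declare the variable special so the `let*' below binds it dynamically.
(defvar completion-lazy-hilit)

(defun my/visible-candidates (input table n)
  "Return at most N completions of INPUT in TABLE, highlighted on demand.
Hypothetical helper: only the candidates that would actually be
displayed are passed through `completion-lazy-hilit'."
  (let* ((completion-lazy-hilit t)      ; ask styles to defer highlighting
         (all (completion-all-completions input table nil (length input)))
         (last (last all)))
    ;; The last cdr of the result may hold the base size; drop it so
    ;; that ALL is a proper list.
    (when last (setcdr last nil))
    ;; Highlight only the first N candidates.
    (mapcar #'completion-lazy-hilit (seq-take all n))))
```

A call such as (my/visible-candidates "fo" '("foo" "food" "bar") 10)
returns only the matching strings, and only those strings pay the cost
of face propertization.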

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 08 Nov 2023 01:08:02 GMT) Full text and rfc822 format available.

Message #404 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Wed, 8 Nov 2023 03:06:48 +0200

On 07/11/2023 14:13, João Távora wrote:
> Any objections?  Seems to speed it up when flex is the preferred
> completion style outside icomplete.  These are average times
> collected from the instrumentation of the above
> completion-all-completions when doing a M-: (setq i TAB)
> 
> I just used my normal Emacs session for this.
> 
> with lazy hilit:      0.104536125
> without lazy hilit:   0.172522571

IIUC the problem with the default completion-at-point UI here is that it
prints all completions anyway, in the buffer *Completions*. And so it 
applies syntax highlighting to them as well, and does that eagerly (as 
opposed to e.g. doing that via jit-lock).

If you instrumented only the 'completion-all-completions' call, then 
that might miss the subsequent time spent in sorting.

But speaking of the case when the *Completions* buffer isn't shown yet, 
the code calls something similar to completion-try-completion, which 
ultimately goes through completion-pcm--find-all-completions. Does it 
currently apply faces too (in the default, non-lazy scenario)?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 08 Nov 2023 01:26:02 GMT) Full text and rfc822 format available.

Message #407 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Wed, 8 Nov 2023 01:24:08 +0000

On Wed, Nov 8, 2023 at 1:06 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 07/11/2023 14:13, João Távora wrote:
> > Any objections?  Seems to speed it up when flex is the preferred
> > completion style outside icomplete.  These are average times
> > collected from the instrumentation of the above
> > completion-all-completions when doing a M-: (setq i TAB)
> >
> > I just used my normal Emacs session for this.
> >
> > with lazy hilit:      0.104536125
> > without lazy hilit:   0.172522571
>
> IIUC the problem with the default completion-at-point UI here is that it
> prints all completions anyway, in the buffer *Completions*. And so it
> applies syntax highlighting to them as well, and does that eagerly (as
> opposed to e.g. doing that via jit-lock).
>
> If you instrumented only the 'completion-all-completions' call, then
> that might miss the subsequent time spent in sorting.

You probably mean highlighting: that's the saving being made here,
not sorting.

Anyway, it _felt_ snappier, but maybe I was dreaming.  Got any
better suggestions for where to place `benchmark-progn`?

Also, I don't think *Completions* has _every_ matching completion,
does it?  Doesn't it display more completions as you keep TABing?
That's what I supposed was providing the speedup.

> But speaking of the case when the *Completions* buffer isn't shown yet,
> the code calls something similar to completion-try-completion, which
> ultimately goes through completion-pcm--find-all-completions. Does it
> currently apply faces too (in the default, non-lazy scenario)?

No idea.  Just looking to optimize the low-hanging fruit I can find.  If you
can find more, go ahead.

João

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Wed, 08 Nov 2023 01:49:02 GMT) Full text and rfc822 format available.

Message #410 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 47711 <at> debbugs.gnu.org
Subject: Re: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add
 new `completion-filter-completions` API and deferred highlighting
Date: Wed, 8 Nov 2023 03:47:34 +0200

On 08/11/2023 03:24, João Távora wrote:
> On Wed, Nov 8, 2023 at 1:06 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>>
>> On 07/11/2023 14:13, João Távora wrote:
>>> Any objections?  Seems to speed it up when flex is the preferred
>>> completion style outside icomplete.  These are average times
>>> collected from the instrumentation of the above
>>> completion-all-completions when doing a M-: (setq i TAB)
>>>
>>> I just used my normal Emacs session for this.
>>>
>>> with lazy hilit:      0.104536125
>>> without lazy hilit:   0.172522571
>>
>> IIUC the problem with the default completion-at-point UI here is that it
>> prints all completions anyway, in the buffer *Completions*. And so it
>> applies syntax highlighting to them as well, and does that eagerly (as
>> opposed to e.g. doing that via jit-lock).
>>
>> If you instrumented only the 'completion-all-completions' call, then
>> that might miss the subsequent time spent in sorting.
> 
> You probably mean highlighting: that's the saving being made here,
> not sorting.

I meant both but I forgot that we moved scoring to later in both cases.

> Anyway, it _felt_ snappier, but maybe I was dreaming.  Got any
> better suggestions for where to place `benchmark-progn`?

Perhaps around the whole minibuffer-completion-help.
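
A rough sketch of what that could look like, using an :around advice so
that minibuffer.el itself does not need editing. The function name
`my/time-completion-help` is hypothetical; `benchmark-progn` reports the
elapsed time in the echo area and returns the value of its body.

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)
(require 'nadvice)

(defun my/time-completion-help (orig-fun &rest args)
  "Call ORIG-FUN with ARGS and report the elapsed time via `message'."
  (benchmark-progn
    (apply orig-fun args)))

;; Time every call to `minibuffer-completion-help', including the
;; sorting and insertion into *Completions*.
(advice-add 'minibuffer-completion-help :around #'my/time-completion-help)

;; To stop measuring:
;; (advice-remove 'minibuffer-completion-help #'my/time-completion-help)
```

This measures a single invocation at a time, so averaging over several
runs would still have to be done by hand.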

> Also, I don't think *Completions* has _every_ matching completion,
> does it?  Doesn't it display more completions as you keep TABing?
> That's what I supposed was providing the speedup.

Well, I type (d, press C-M-i, then select the window showing 
*Completions* and scroll to the end of the buffer. The last completion 
displayed there starts with "dy", so that's probably all of them.

That's with the default completion-styles config, but with flex it's the 
same: all completions shown in the buffer. Sorted and highlighted.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47711; Package emacs. (Mon, 08 Apr 2024 17:20:03 GMT) Full text and rfc822 format available.

Message #413 received at 47711 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Daniel Mendler <mail <at> daniel-mendler.de>, 47711 <at> debbugs.gnu.org
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#47711: 27.1; Deferred highlighting support in
 `completion-all-completions', `vertico--all-completions`
Date: Mon, 8 Apr 2024 20:19:28 +0300

On 11/04/2021 23:51, Daniel Mendler wrote:
> Emacs is lacking a possibility to defer the completion highlighting when
> computing completions via `completion-all-completions'. This feature is
> important for the performance of completion UIs when the set of all
> completions is much larger than the set of completions which are
> displayed.

I think this bug can be closed, given commit dfffb91a70532, which we
installed last year, and the subsequent refinements.

Those might not cover all of the features that the patch proposed in 
this thread did, but the needs described in the first message (the 
original bug description) should be covered.

Reply sent to Daniel Mendler <mail <at> daniel-mendler.de>:
You have taken responsibility. (Fri, 29 Nov 2024 17:56:01 GMT) Full text and rfc822 format available.

Notification sent to Daniel Mendler <mail <at> daniel-mendler.de>:
bug acknowledged by developer. (Fri, 29 Nov 2024 17:56:02 GMT) Full text and rfc822 format available.

Message #418 received at 47711-done <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: 47711-done <at> debbugs.gnu.org
Subject: Re: bug#47711: 27.1; Deferred highlighting support in
 `completion-all-completions', `vertico--all-completions`
Date: Fri, 29 Nov 2024 18:52:45 +0100

Has been implemented via completion-lazy-hilit.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 28 Dec 2024 12:24:07 GMT) Full text and rfc822 format available.

This bug report was last modified 202 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
