GNU bug report logs - #12796
Optimize `ido-completing-read' for larger lists with flex matching enabled

Previous Next

Package: emacs;

Reported by: Dmitry Gutov <dgutov <at> yandex.ru>

Date: Sun, 4 Nov 2012 06:02:01 UTC

Severity: normal

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 12796 in the body.
You can then email your comments to 12796 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sun, 04 Nov 2012 06:02:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Dmitry Gutov <dgutov <at> yandex.ru>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 04 Nov 2012 06:02:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: bug-gnu-emacs <at> gnu.org
Subject: Optimize `ido-completing-read' for larger lists with flex matching
	enabled
Date: Sun, 04 Nov 2012 09:58:35 +0400
[Message part 1 (text/plain, inline)]
Tags: patch

Currently ido re-filters the full candidates list after every change in 
the minibuffer. With long candidates list and with flex matching enabled 
(like it's often the case with certain third-party packages, namely smex 
and ido-ubiquitous), as soon as ido switches to using flex matching, 
each update takes a noticeable fraction of a second. Even if there's no 
matches anymore for the current input.

If I decide to type quickly but make a typo in one of the first 
characters, I often need to wait a few seconds until I can fix the typo 
or start anew.

This patch adds a simple cache that keeps track of the current matching 
settings (prefix, regexp, or no), and checks the input against a 
previously entered string. If the latter is a prefix of the former (and 
regexp matching is disabled), then we can use the matches from the 
former input as the candidates list for the current one.

Any objections?
[ido-speed.diff (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sun, 04 Nov 2012 08:36:01 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Leo <sdl.web <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Sun, 04 Nov 2012 16:32:24 +0800
On 2012-11-04 13:58 +0800, Dmitry Gutov wrote:
> Any objections?

This is special-cased optimisation which doesn't address the root cause
of the slowness. We need a better solution.

Leo





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sun, 04 Nov 2012 13:58:02 GMT) Full text and rfc822 format available.

Message #11 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Sun, 04 Nov 2012 08:53:58 -0500
> If I decide to type quickly but make a typo in one of the first characters,
> I often need to wait a few seconds until I can fix the typo or start anew.

`while-no-input' (which AFAICT is used by ido) is supposed to interrupt
the computation as soon as you type the next input so you don't need
to wait.

Are you saying that while-no-input doesn't work?


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sun, 04 Nov 2012 17:09:02 GMT) Full text and rfc822 format available.

Message #14 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Cc: 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Sun, 04 Nov 2012 21:05:10 +0400
On 04.11.2012 17:53, Stefan Monnier wrote:
>> If I decide to type quickly but make a typo in one of the first characters,
>> I often need to wait a few seconds until I can fix the typo or start anew.
>
> `while-no-input' (which AFAICT is used by ido) is supposed to interrupt
> the computation as soon as you type the next input so you don't need
> to wait.
>
> Are you saying that while-no-input doesn't work?

I only see `while-no-input' used in one place there: in 
`ido-make-merged-file-list', and that function is only used in 
`find-file' mode.

So yeah, using it around matching loops in `ido-set-matches-1' might be 
another way to optimize, provided the overhead is not too much.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Mon, 05 Nov 2012 05:41:02 GMT) Full text and rfc822 format available.

Message #17 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Cc: 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Mon, 05 Nov 2012 09:37:52 +0400
On 04.11.2012 21:05, Dmitry Gutov wrote:
> On 04.11.2012 17:53, Stefan Monnier wrote:
>>> If I decide to type quickly but make a typo in one of the first
>>> characters,
>>> I often need to wait a few seconds until I can fix the typo or start
>>> anew.
>>
>> `while-no-input' (which AFAICT is used by ido) is supposed to interrupt
>> the computation as soon as you type the next input so you don't need
>> to wait.
>>
>> Are you saying that while-no-input doesn't work?
>
> I only see `while-no-input' used in one place there: in
> `ido-make-merged-file-list', and that function is only used in
> `find-file' mode.
>
> So yeah, using it around matching loops in `ido-set-matches-1' might be
> another way to optimize, provided the overhead is not too much.

Disregard the "overhead" remark, I misread what the macro does: it's not 
actually a looping construct.

To try it, I just wrapped in it the "busy" part of `ido-set-matches-1' - 
and indeed, no more waiting a several seconds after button mashing. It's 
a bit buggy so far, but that's to be expected.

The caching approach still feels faster in most cases, and is 
instantaneous in cases when we're editing input and have few or no 
matches for the current input (if we're backspacing, then only when no 
matches). It has room for usability improvements, too.

I won't insist, though. I kind of decided to disable flex anyway and 
just use regexp matching sometimes.

--Dmitry




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Mon, 05 Nov 2012 21:01:02 GMT) Full text and rfc822 format available.

Message #20 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: sdl.web <at> gmail.com
Cc: 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Tue, 06 Nov 2012 00:57:25 +0400
Leo <sdl.web <at> gmail.com> writes:

> On 2012-11-04 13:58 +0800, Dmitry Gutov wrote:
>> Any objections?
>
> This is special-cased optimisation which doesn't address the root cause
> of the slowness. We need a better solution.

And the root cause is..?

Doing some sort of preprocessing on the candidates list comes to mind
(search tree?), but I don't immediately see a way of doing that for flex
matching.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Tue, 06 Nov 2012 01:49:01 GMT) Full text and rfc822 format available.

Message #23 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Kim F. Storm <storm <at> cua.dk>, Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Mon, 05 Nov 2012 20:45:13 -0500
[ Hi Kim, can you give me your opinion on this?  ]

> To try it, I just wrapped in it the "busy" part of `ido-set-matches-1' -
> and indeed, no more waiting a several seconds after button
> mashing. It's a bit buggy so far, but that's to be expected.

To eliminate the buggy behavior, it should probably be put elsewhere,
maybe directly in ido-exhibit (the post-command-hook).

> The caching approach still feels faster in most cases, and is instantaneous
> in cases when we're editing input and have few or no matches for the current
> input (if we're backspacing, then only when no matches). It has room for
> usability improvements, too.

I'm not opposed to caching, actually.  I think the two are independent:
it's important that a slow computation of the completion-data doesn't
force the user to edit slowly.  But it's also good to optimize the
computation itself such that interrupting the computation
happens rarely.

I'm not familiar with the ido.el code, so I find it difficult to judge
if your approach to caching is right.  Kim could you take a look (the
patch can be seen at http://debbugs.gnu.org/12796)?


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Tue, 06 Nov 2012 11:08:01 GMT) Full text and rfc822 format available.

Message #26 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Kim Storm <storm <at> cua.dk>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 12796 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Tue, 06 Nov 2012 12:03:45 +0100
On 2012-11-06 02:45, Stefan Monnier wrote:
> [ Hi Kim, can you give me your opinion on this?  ]
>
>> To try it, I just wrapped in it the "busy" part of `ido-set-matches-1' -
>> and indeed, no more waiting a several seconds after button
>> mashing. It's a bit buggy so far, but that's to be expected.
> To eliminate the buggy behavior, it should probably be put elsewhere,
> maybe directly in ido-exhibit (the post-command-hook).
Sounds right, but I a bit worried that some of the state information
may get corrupted if arbitrarily interrupted by user input.

If someone proposes a patch, I'll look at it.
>
>> The caching approach still feels faster in most cases, and is instantaneous
>> in cases when we're editing input and have few or no matches for the current
>> input (if we're backspacing, then only when no matches). It has room for
>> usability improvements, too.
> I'm not opposed to caching, actually.  I think the two are independent:
> it's important that a slow computation of the completion-data doesn't
> force the user to edit slowly.  But it's also good to optimize the
> computation itself such that interrupting the computation
> happens rarely.
>
> I'm not familiar with the ido.el code, so I find it difficult to judge
> if your approach to caching is right.  Kim could you take a look (the
> patch can be seen at http://debbugs.gnu.org/12796)?

I looked at the caching patch, and it looks alright (in the sense that I 
don't think
it will break ido behaviour.)

I'm not sure how efficient the caching is though. As far as I can see, 
it only
caches the most recent (non-empty) list of matches, i.e. the shortest list
corresponding to the longest "successful" user input in the minibuffer.

So if the user has to backtrack beyond that point, I don't really see 
how the
caching will help, as the cache is then invalidated.

Also, I don't quite understand why this condition is needed:

(<= (* 10 (length matches)) (length ido-cur-list)))

It seems to me to only cache a list of matches that has reduced
the set of matches by a factor 10 - if the aim is to reduce processing
time for long lists, even reducing by a factor of 2 may be noticable ?

But maybe the intention of this line was to stop caching once the list
has become short than 1/10th of the original list?  In that case, the
operator should be <= I think ?

I any case, I'm not opposed to adding some form of caching here,
and the proposed patch seems on the right track, but I'm not convinced
that it is the optimal approach.

Kim





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Tue, 06 Nov 2012 15:43:01 GMT) Full text and rfc822 format available.

Message #29 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Kim Storm <storm <at> cua.dk>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Tue, 06 Nov 2012 19:38:45 +0400
On 06.11.2012 15:03, Kim Storm wrote:
>> I'm not familiar with the ido.el code, so I find it difficult to judge
>> if your approach to caching is right.  Kim could you take a look (the
>> patch can be seen at http://debbugs.gnu.org/12796)?
>
> I looked at the caching patch, and it looks alright (in the sense that I
> don't think
> it will break ido behaviour.)
>
> I'm not sure how efficient the caching is though. As far as I can see,
> it only
> caches the most recent (non-empty) list of matches, i.e. the shortest list
> corresponding to the longest "successful" user input in the minibuffer.
>
> So if the user has to backtrack beyond that point, I don't really see
> how the
> caching will help, as the cache is then invalidated.

That's true, backtracking was not a priority. But see below.

> Also, I don't quite understand why this condition is needed:
>
> (<= (* 10 (length matches)) (length ido-cur-list)))
>
> It seems to me to only cache a list of matches that has reduced
> the set of matches by a factor 10 - if the aim is to reduce processing
> time for long lists, even reducing by a factor of 2 may be noticable ?
>
> But maybe the intention of this line was to stop caching once the list
> has become short than 1/10th of the original list?  In that case, the
> operator should be <= I think ?

No, the idea is to limit memory consumption (which may be a bit 
premature) and make sure that the filtered matches list is smaller 
enough than the original to justify saving it. I probably should change 
10 to a smaller constant, like 3 or 2.

On the "stop caching" front, we can add a lower bound check, comparing 
the matches length to an absolute or relative value. This way, doing a 
few backspaces from a sufficiently narrowed state won't trigger a full 
recomputation.

To go further than that, it shouldn't be hard to keep a stack of caches 
for the current input string as it's typed, but I suspect memory 
consumption may be a bigger concern in this case.

WDYT?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Tue, 06 Nov 2012 16:50:01 GMT) Full text and rfc822 format available.

Message #32 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Kim Storm <storm <at> cua.dk>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Tue, 06 Nov 2012 17:45:48 +0100
On 2012-11-06 16:38, Dmitry Gutov wrote:
>> So if the user has to backtrack beyond that point, I don't really see
>> how the
>> caching will help, as the cache is then invalidated.
>
> That's true, backtracking was not a priority. But see below.
That is what I thought was the primary purpose of your patch.

But I now understand that it was aimed at supplying a shorter list of 
matches to start from "down the list".
That's definitely worth doing!

>> It seems to me to only cache a list of matches that has reduced
>> the set of matches by a factor 10 - if the aim is to reduce processing
>> time for long lists, even reducing by a factor of 2 may be noticable ?
>>
>> But maybe the intention of this line was to stop caching once the list
>> has become short than 1/10th of the original list?  In that case, the
>> operator should be <= I think ?
>
>
> No, the idea is to limit memory consumption (which may be a bit 
> premature) and make sure that the filtered matches list is smaller 
> enough than the original to justify saving it. I probably should 
> change 10 to a smaller constant, like 3 or 2.
>
I wouldn't worry that much about memory consumption these days
- if you are really worried, create a ido-cache-max-matches custom variable
with the max length of matches that you want to cache. default nil 
meaning no limit.

So as long as the matches list is shorter than the original list and 
shorter than the max limit,
just cache the list.

> On the "stop caching" front, we can add a lower bound check, comparing 
> the matches length to an absolute or relative value. This way, doing a 
> few backspaces from a sufficiently narrowed state won't trigger a full 
> recomputation.
Right -- if the cache is less than say 25 elements long, I wouldn't 
expect the speedup to be noticable.

> To go further than that, it shouldn't be hard to keep a stack of 
> caches for the current input string as it's typed, but I suspect 
> memory consumption may be a bigger concern in this case.
Yes, for the problem at hand, I think your approach is just fine, so 
with a few tweaks as discussed above,
I support your patch.

Kim




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Wed, 07 Nov 2012 02:28:02 GMT) Full text and rfc822 format available.

Message #35 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Leo <sdl.web <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Wed, 07 Nov 2012 10:27:48 +0800
On 2012-11-06 04:57 +0800, Dmitry Gutov wrote:
> And the root cause is..?

I haven't looked. So maybe you could use a profiler to find where it is.
It seems string-match in flex matching is slow. I think we should make
sure the computation is optimised. There are third party libs such as
ido-hacks.el which might have some ideas.

> Doing some sort of preprocessing on the candidates list comes to mind
> (search tree?), but I don't immediately see a way of doing that for flex
> matching.

Leo




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Wed, 07 Nov 2012 04:07:02 GMT) Full text and rfc822 format available.

Message #38 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Leo <sdl.web <at> gmail.com>
Cc: Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 12796 <at> debbugs.gnu.org,
	Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Wed, 07 Nov 2012 08:06:13 +0400
On 07.11.2012 6:27, Leo wrote:
> On 2012-11-06 04:57 +0800, Dmitry Gutov wrote:
>> And the root cause is..?
>
> I haven't looked. So maybe you could use a profiler to find where it is.
> It seems string-match in flex matching is slow. I think we should make
> sure the computation is optimised. There are third party libs such as
> ido-hacks.el which might have some ideas.

That was actually a good advice. As far as I can tell, most of the speed 
improvement comes from the following change:

=== modified file 'lisp/ido.el'
--- lisp/ido.el	2012-10-05 07:38:05 +0000
+++ lisp/ido.el	2012-11-07 03:40:57 +0000
@@ -3764,7 +3764,7 @@
 	       ido-enable-flex-matching
 	       (> (length ido-text) 1)
 	       (not ido-enable-regexp))
-      (setq re (mapconcat #'regexp-quote (split-string ido-text "") ".*"))
+      (setq re (mapconcat #'regexp-quote (split-string ido-text "" t) 
".*"))
       (if ido-enable-prefix
 	  (setq re (concat "\\`" re)))
       (mapc

:)

Then there's this (from ido-set-matches-1's defadvice's docstring):

"This advice makes it a good deal faster, by caching narrowed
choices lists."

Which looks like it's doing something with hashtables similar to what I 
proposed to do with a stack. With approximately the same performance 
improvement (which is only visible with the change above reverted).

Anyway, with my data sets (all Emacs functions or all Emacs vars, 12000 
and 4000 items respectively), the one-line change makes flex matching 
about as fast as I can type, so I guess we'll leave implementing cache 
until someone else complains. Feel free to ping me then.

I'm still going to see if I can make while-no-input work here, though.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Wed, 07 Nov 2012 05:41:02 GMT) Full text and rfc822 format available.

Message #41 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Kim Storm <storm <at> cua.dk>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 12796 <at> debbugs.gnu.org
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Wed, 07 Nov 2012 09:41:00 +0400
On 06.11.2012 15:03, Kim Storm wrote:
> On 2012-11-06 02:45, Stefan Monnier wrote:
>> [ Hi Kim, can you give me your opinion on this?  ]
>>
>>> To try it, I just wrapped in it the "busy" part of `ido-set-matches-1' -
>>> and indeed, no more waiting a several seconds after button
>>> mashing. It's a bit buggy so far, but that's to be expected.
>> To eliminate the buggy behavior, it should probably be put elsewhere,
>> maybe directly in ido-exhibit (the post-command-hook).
> Sounds right, but I a bit worried that some of the state information
> may get corrupted if arbitrarily interrupted by user input.
>
> If someone proposes a patch, I'll look at it.

How does this look to you?

I added a new var because ido-rescan is unconditionally set to nil in 
many places.

By the way, is there a place where ido-rotate is set to anything but 
nil? Does this variable actually do anything?

=== modified file 'lisp/ido.el'
--- lisp/ido.el	2012-10-05 07:38:05 +0000
+++ lisp/ido.el	2012-11-07 05:34:53 +0000
@@ -1020,6 +1020,9 @@
 (defvar ido-rotate nil
   "Non-nil means we are rotating list of matches.")

+(defvar ido-interrupted nil
+  "Non-nil means calculation of matches was interrupted by keyboard 
input.")
+
 (defvar ido-text nil
   "Stores the users string as it is typed in.")

@@ -3778,9 +3781,14 @@

 (defun ido-set-matches ()
   ;; Set `ido-matches' to the list of items matching prompt
-  (when ido-rescan
-    (setq ido-matches (ido-set-matches-1 (reverse ido-cur-list) (not 
ido-rotate))
-	  ido-rotate nil)))
+  (when (or ido-rescan ido-interrupted)
+    (setq ido-interrupted t)
+    (while-no-input
+      (setq ido-matches (ido-set-matches-1 (reverse ido-cur-list)
+                                           (not ido-rotate))
+            ido-interrupted nil
+            ido-rotate nil))
+    (when ido-interrupted (setq ido-matches nil))))

 (defun ido-ignore-item-p (name re-list &optional ignore-ext)
   ;; Return t if the buffer or file NAME should be ignored.
@@ -4513,11 +4521,12 @@
 	      (exit-minibuffer)))

 	;; Insert the match-status information:
-	(ido-set-common-completion)
-	(let ((inf (ido-completions contents)))
-	  (setq ido-show-confirm-message nil)
-	  (ido-trace "inf" inf)
-	  (insert inf))
+        (unless ido-interrupted
+          (ido-set-common-completion)
+          (let ((inf (ido-completions contents)))
+            (setq ido-show-confirm-message nil)
+            (ido-trace "inf" inf)
+            (insert inf)))
 	))))

 (defun ido-completions (name)







Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Wed, 07 Nov 2012 10:39:02 GMT) Full text and rfc822 format available.

Message #44 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Leo <sdl.web <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 12796 <at> debbugs.gnu.org,
	Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Wed, 07 Nov 2012 18:38:31 +0800
On 2012-11-07 12:06 +0800, Dmitry Gutov wrote:
> That was actually a good advice. As far as I can tell, most of the
> speed improvement comes from the following change

I seem to have some speedup on the flex matching part with the following
patch.

(tested on a ~9000 list with each item containing ~35 chars)

diff --git a/ido.el b/ido.el
index 31d5279d..dc623110 100644
--- a/ido.el
+++ b/ido.el
@@ -3710,6 +3710,25 @@ (defun ido-get-bufname (win)
 		(cons buf ido-bufs-in-frame)))))
 
 ;;; FIND MATCHING ITEMS
+(defun ido-chars-in-string (chars str &optional prefix)
+  (let ((p 0)
+	(len (length chars))
+	c)
+    (catch 'exit
+      (when (zerop len)
+	(throw 'exit t))
+      (when (zerop (length str))
+	(throw 'exit nil))
+      (setq c (aref chars 0))
+      (when (and prefix (/= c (aref str 0)))
+	(throw 'exit nil))
+      (dotimes (i (length str))
+	(when (eq (aref str i) c)
+	  (setq p (1+ p))
+	  (when (>= p len)
+	    (throw 'exit t))
+	  (setq c (aref chars p))))
+      (>= p len))))
 
 (defun ido-set-matches-1 (items &optional do-full)
   ;; Return list of matches in items
@@ -3783,13 +3802,10 @@ (defun ido-set-matches-1 (items &optional do-full)
 	       ido-enable-flex-matching
 	       (> (length ido-text) 1)
 	       (not ido-enable-regexp))
-      (setq re (mapconcat #'regexp-quote (split-string ido-text "") ".*"))
-      (if ido-enable-prefix
-	  (setq re (concat "\\`" re)))
       (mapc
        (lambda (item)
 	 (let ((name (ido-name item)))
-	   (if (string-match re name)
+	   (if (ido-chars-in-string ido-text name ido-enable-prefix)
 	       (setq matches (cons item matches)))))
        items))
     matches))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Wed, 07 Nov 2012 21:55:02 GMT) Full text and rfc822 format available.

Message #47 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Leo <sdl.web <at> gmail.com>
Cc: Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 12796 <at> debbugs.gnu.org,
	Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Thu, 08 Nov 2012 01:54:51 +0400
On 07.11.2012 14:38, Leo wrote:
> On 2012-11-07 12:06 +0800, Dmitry Gutov wrote:
>> That was actually a good advice. As far as I can tell, most of the
>> speed improvement comes from the following change
>
> I seem to have some speedup on the flex matching part with the following
> patch.
>
> (tested on a ~9000 list with each item containing ~35 chars)
>
> ...

I've done some testing, see the setup and numbers at the end.

It looks like, with either patch, flex matching is not the bottleneck 
anymore. ido-hacks has some other functions changed or overridden, so if 
you're not satisfied with performance, you might want to look there.

(defun random-string (len)
  (let ((chars "1234567890abcdefghijklmnopqrstyvwxyz"))
    (apply 'string
           (loop for i from 1 to len
                 collect (aref chars (random (length chars)))))))

(defun random-dataset (size len)
  (loop for i from 1 to size
        collect (random-string len)))

(defmacro js2-time (form)
  "Evaluate FORM, discard result, and return elapsed time in sec"
  (declare (debug t))
  (let ((beg (make-symbol "--js2-time-beg--"))
        (delta (make-symbol "--js2-time-end--")))
    `(let ((,beg (current-time))
           ,delta)
       ,form
       (/ (truncate (* (- (float-time (current-time))
                          (float-time ,beg))
                       10000))
          10000.0))))

(defun ido-match-test (size len ido-text)
  (let ((items (random-dataset size len))
        (ido-cur-item 'list))
    (js2-time
     (ido-set-matches-1 items))))

;; *    size len        input  time
;; cis  9000 35 aaaaaaaaaa    0.06
;; cis 18000 15 aaaaaaaaaa    0.055
;; cis 18000 15 abcdefghzzzzz 0.057
;; omt  9000 35 aaaaaaaaaa    0.055 \
;; omt 18000 15 aaaaaaaaaa    0.054 + <- but the variance is bigger
;; omt 18000 15 abcdefghzzzzz 0.056 /
;; unp  9000 35 abcdefghzzzzz 0.82
;; unp 18000 15 abcdefghzzzzz 0.31

;; legend:
;; cis = ido-chars-in-string
;; ont = (split-string ido-text "" t)
;; unp = (split-string ido-text "")

;; bonus:
;; cis 18000 45 abcdefghzzz123 0.51
;; omt 18000 45 abcdefghzzz123 0.15
;; cis 18000 45 abcdefghzzz123 3.02




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Thu, 08 Nov 2012 02:03:01 GMT) Full text and rfc822 format available.

Message #50 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Leo <sdl.web <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 12796 <at> debbugs.gnu.org,
	Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Thu, 08 Nov 2012 10:00:58 +0800
On 2012-11-08 05:54 +0800, Dmitry Gutov wrote:
> It looks like, with either patch, flex matching is not the bottleneck
> anymore.

Excellent. The other patch is definitely simpler. So I prefer it.

Stefan, for concluding this bug, I think we should make this change.

From this bit in ido-set-matches-1:

(if ido-enable-prefix 
    (setq re (concat "\\`" re)))

It seems not including the leading and trailing .* is intended. So do
you mind installing the following small change for 24.3 that greatly
improves ido performance:

diff --git a/lisp/ido.el b/lisp/ido.el
index 31d5279d..c8bc0bb7 100644
--- a/lisp/ido.el
+++ b/lisp/ido.el
@@ -3783,7 +3783,7 @@ (defun ido-set-matches-1 (items &optional do-full)
 	       ido-enable-flex-matching
 	       (> (length ido-text) 1)
 	       (not ido-enable-regexp))
-      (setq re (mapconcat #'regexp-quote (split-string ido-text "") ".*"))
+      (setq re (mapconcat #'regexp-quote (split-string ido-text "" t) ".*"))
       (if ido-enable-prefix
 	  (setq re (concat "\\`" re)))
       (mapc

Leo




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Thu, 08 Nov 2012 02:06:02 GMT) Full text and rfc822 format available.

Message #53 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Leo <sdl.web <at> gmail.com>, 12796 <at> debbugs.gnu.org, Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Wed, 07 Nov 2012 21:05:54 -0500
> -      (setq re (mapconcat #'regexp-quote (split-string ido-text "") ".*"))
> +      (setq re (mapconcat #'regexp-quote (split-string ido-text "" t)
> ".*"))

Sounds like a good change.  Tho:

   (mapconcat (lambda (c) (regexp-quote (string c))) ido-text ".*")

would work as well.
You could try to speed up the regexp matching some more by eliminating
backtracking (which should mostly eliminate a few pathological cases):

   (let ((first t))
     (mapconcat (lambda (c)
                  (if first
                      (progn (setq first nil) (regexp-quote (string c)))
                    (concat "[^" (string c) "]*"
                            (regexp-quote (string c)))))
                ido-text ""))

> I'm still going to see if I can make while-no-input work here, though.

Yes, that'd be very welcome.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Thu, 08 Nov 2012 04:15:01 GMT) Full text and rfc822 format available.

Message #56 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Leo <sdl.web <at> gmail.com>
Cc: Kim Storm <storm <at> cua.dk>, 12796 <at> debbugs.gnu.org,
	Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Wed, 07 Nov 2012 23:14:44 -0500
>> From this bit in ido-set-matches-1:

> (if ido-enable-prefix 
>     (setq re (concat "\\`" re)))

> It seems not including the leading and trailing .* is intended.

Indeed.  This undesired behavior was introduced by the change to
split-string introduced in Emacs-22, so the patch fixes a regression
w.r.t Emacs-21.

> So do you mind installing the following small change for 24.3 that
> greatly improves ido performance:

I guess it's OK, yes.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Thu, 08 Nov 2012 04:30:02 GMT) Full text and rfc822 format available.

Message #59 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Leo <sdl.web <at> gmail.com>, 12796 <at> debbugs.gnu.org, Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Thu, 08 Nov 2012 08:29:19 +0400
On 08.11.2012 6:05, Stefan Monnier wrote:
>> -      (setq re (mapconcat #'regexp-quote (split-string ido-text "") ".*"))
>> +      (setq re (mapconcat #'regexp-quote (split-string ido-text "" t)
>> ".*"))
>
> Sounds like a good change.  Tho:
>
>     (mapconcat (lambda (c) (regexp-quote (string c))) ido-text ".*")
>
> would work as well.

Indeed. A two-character change offering massive speedup looks cuter, 
though. And easier to understand for casual readers.

> You could try to speed up the regexp matching some more by eliminating
> backtracking (which should mostly eliminate a few pathological cases):
>
>     (let ((first t))
>       (mapconcat (lambda (c)
>                    (if first
>                        (progn (setq first nil) (regexp-quote (string c)))
>                      (concat "[^" (string c) "]*"
>                              (regexp-quote (string c)))))
>                  ido-text ""))

Yep, this adds some further speedup especially with longer string.
To use the existing testing setup (numbers are a bit different in this 
session):

;; omt 18000 15 abcdefghzzzzz 0.042
;; nbt 18000 15 abcdefghzzzzz 0.040

;; omt 18000 45 abcdefghzzz123 0.127
;; nbt 18000 45 abcdefghzzz123 0.087

>> I'm still going to see if I can make while-no-input work here, though.
>
> Yes, that'd be very welcome.

I sent a patch that doesn't seem to break anything for me:

http://debbugs.gnu.org/cgi/bugreport.cgi?bug=12796#41

But in the light of the above numbers, it seems that (while-no-input) 
would almost always guard a section of code that takes 1/20th of a 
second to run, or less. Only useful when a user has floored "backspace", 
I think.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Thu, 08 Nov 2012 07:37:02 GMT) Full text and rfc822 format available.

Message #62 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Leo <sdl.web <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Kim Storm <storm <at> cua.dk>, 12796 <at> debbugs.gnu.org,
	Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Thu, 08 Nov 2012 15:36:23 +0800
On 2012-11-08 12:14 +0800, Stefan Monnier wrote:
> Indeed.  This undesired behavior was introduced by the change to
> split-string introduced in Emacs-22, so the patch fixes a regression
> w.r.t Emacs-21.

Thanks for that information.

>
>> So do you mind installing the following small change for 24.3 that
>> greatly improves ido performance:
>
> I guess it's OK, yes.

Can I incorporate your suggestion on removing the backtracking issue? I
have found cases where flex matching perform badly but with your
suggestion, for example, cut the time from 4.8s to 0.3s.

The patch could look like this:

diff --git a/lisp/ido.el b/lisp/ido.el
index 31d5279d..0a740b2a 100644
--- a/lisp/ido.el
+++ b/lisp/ido.el
@@ -3783,7 +3783,11 @@ (defun ido-set-matches-1 (items &optional do-full)
 	       ido-enable-flex-matching
 	       (> (length ido-text) 1)
 	       (not ido-enable-regexp))
-      (setq re (mapconcat #'regexp-quote (split-string ido-text "") ".*"))
+      (setq re (concat (regexp-quote (string (aref ido-text 0)))
+		       (mapconcat (lambda (c)
+				    (concat "[^" (string c) "]*"
+					    (regexp-quote (string c))))
+				  (substring ido-text 1) "")))



Leo




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Thu, 08 Nov 2012 14:06:02 GMT) Full text and rfc822 format available.

Message #65 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Leo <sdl.web <at> gmail.com>
Cc: Kim Storm <storm <at> cua.dk>, 12796 <at> debbugs.gnu.org,
	Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Thu, 08 Nov 2012 09:05:14 -0500
>>> So do you mind installing the following small change for 24.3 that
>>> greatly improves ido performance:
>> I guess it's OK, yes.
> Can I incorporate your suggestion on removing the backtracking issue?

Not for 24.3, no, but on the trunk, of course, yes.

> I have found cases where flex matching perform badly but with your
> suggestion, for example, cut the time from 4.8s to 0.3s.

Cool,


        Stefan




Reply sent to Dmitry Gutov <dgutov <at> yandex.ru>:
You have taken responsibility. (Sat, 10 Nov 2012 17:53:02 GMT) Full text and rfc822 format available.

Notification sent to Dmitry Gutov <dgutov <at> yandex.ru>:
bug acknowledged by developer. (Sat, 10 Nov 2012 17:53:02 GMT) Full text and rfc822 format available.

Message #70 received at 12796-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: 12796-done <at> debbugs.gnu.org
Cc: Kim Storm <storm <at> cua.dk>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
	Leo <sdl.web <at> gmail.com>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Sat, 10 Nov 2012 21:52:31 +0400
With speed-up patches installed on both branches, I consider this fixed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sat, 10 Nov 2012 22:52:01 GMT) Full text and rfc822 format available.

Message #73 received at 12796-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Leo <sdl.web <at> gmail.com>, 12796-done <at> debbugs.gnu.org,
	Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Sat, 10 Nov 2012 17:51:07 -0500
> With speed-up patches installed on both branches, I consider this fixed.

No, the use of with-local-quit is the main issue to solve.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sat, 10 Nov 2012 23:02:01 GMT) Full text and rfc822 format available.

Message #76 received at 12796-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Leo <sdl.web <at> gmail.com>, 12796-done <at> debbugs.gnu.org,
	Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Sun, 11 Nov 2012 03:01:13 +0400
On 11.11.2012 2:51, Stefan Monnier wrote:
>> With speed-up patches installed on both branches, I consider this fixed.
>
> No, the use of with-local-quit is the main issue to solve.

Do you mean `when-no-input' and the lack of its use?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sat, 10 Nov 2012 23:32:02 GMT) Full text and rfc822 format available.

Message #79 received at 12796-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Leo <sdl.web <at> gmail.com>, 12796-done <at> debbugs.gnu.org,
	Kim Storm <storm <at> cua.dk>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
	flex matching enabled
Date: Sat, 10 Nov 2012 18:31:11 -0500
>>> With speed-up patches installed on both branches, I consider this fixed.
>> No, the use of with-local-quit is the main issue to solve.
> Do you mean `when-no-input' and the lack of its use?

Oh, yes, that's what I meant,


        Stefan




Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 11 Nov 2012 18:34:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sun, 13 Sep 2020 16:15:02 GMT) Full text and rfc822 format available.

Message #84 received at 12796 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 12796-done <at> debbugs.gnu.org, Kim Storm <storm <at> cua.dk>,
 Leo <sdl.web <at> gmail.com>, 12796 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#12796: Optimize `ido-completing-read' for larger lists with
 flex matching enabled
Date: Sun, 13 Sep 2020 18:14:07 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>> With speed-up patches installed on both branches, I consider this fixed.
>
> No, the use of with-local-quit is the main issue to solve.

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> Do you mean `when-no-input' and the lack of its use?

Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> Oh, yes, that's what I meant,

Or perhaps `while-no-input'?  :-)

Anyway, reading this bug report, the reported slowness itself was
fixed.  Stefan wondered why the `while-no-input' thing didn't do the
right thing to break out of it all?  I think?  But as there is no way to
reproduce the bug any more, it seems unlikely that we'll make any
further progress in this bug report, and I'm re-closing it again.

If somebody finds a way to reproduce this (seven years later), please
open a new bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug closed, send any further explanations to 12796 <at> debbugs.gnu.org and Dmitry Gutov <dgutov <at> yandex.ru> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 13 Sep 2020 16:15:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12796; Package emacs. (Sun, 13 Sep 2020 16:15:03 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 12 Oct 2020 11:24:12 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 38 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.