GNU bug report logs - #50906
xref-find-references blocks Emacs: asynchronous operation?

Package: emacs;

Reported by: Stefan Kangas <stefan <at> marxist.se>

Date: Wed, 29 Sep 2021 22:50:01 UTC

Severity: wishlist

To reply to this bug, email your comments to 50906 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Wed, 29 Sep 2021 22:50:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefan Kangas <stefan <at> marxist.se>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 29 Sep 2021 22:50:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefan <at> marxist.se>
To: bug-gnu-emacs <at> gnu.org
Subject: xref-find-references blocks Emacs: asynchronous operation?
Date: Wed, 29 Sep 2021 15:49:20 -0700

Severity: wishlist

`xref-find-references' blocks Emacs while searching for matches.
This can take a long time to complete in large repositories.

It would be nice if it could work asynchronously, like e.g. `M-x rgrep'.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Thu, 30 Sep 2021 07:45:02 GMT) Full text and rfc822 format available.

Message #8 received at 50906 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Martín <mardani29 <at> yahoo.es>
To: Stefan Kangas <stefan <at> marxist.se>
Cc: 50906 <at> debbugs.gnu.org
Subject: Re: bug#50906: xref-find-references blocks Emacs: asynchronous
 operation?
Date: Thu, 30 Sep 2021 09:44:12 +0200

Stefan Kangas <stefan <at> marxist.se> writes:

> Severity: wishlist
>
> `xref-find-references' blocks Emacs while searching for matches.
> This can take a long time to complete in large repositories.
>
> It would be nice if it could work asynchronously, like e.g. `M-x rgrep'.

Here's a related bug report:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=50733

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Tue, 05 Oct 2021 02:00:02 GMT) Full text and rfc822 format available.

Message #11 received at 50906 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Kangas <stefan <at> marxist.se>, 50906 <at> debbugs.gnu.org
Subject: Re: bug#50906: xref-find-references blocks Emacs: asynchronous
 operation?
Date: Tue, 5 Oct 2021 04:59:29 +0300

On 30.09.2021 01:49, Stefan Kangas wrote:
> Severity: wishlist
> 
> `xref-find-references' blocks Emacs while searching for matches.
> This can take a long time to complete in large repositories.
> 
> It would be nice if it could work asynchronously, like e.g. `M-x rgrep'.

Wishlist indeed!

Daniel's bug report shows a good case for this kind of feature: huge 
projects where the search, even using fast tools (e.g. ripgrep), takes 
multiple seconds. So if results of such searches could be displayed 
incrementally, it would improve the perceive speed and usability.

What can be done here:

- Design an "asynchronous" format for xref-show-xrefs-function to 
consume. FETCHER of a different shape. Not sure how it's going to work 
in the end -- maybe a simple-ish iterator (call a function again for 
more results), but ideally it would look synchronous somehow, and the 
concurrency would be achieved through the use of threads. Not sure if 
that's realistic.

- The new kind of fetcher would need to provide a way to abort the 
search, since 'C-g' would not be available anymore.

- Implement it for the common searches of course.

Downsides:

- No way to quickly 'C-g' out of a search, supposedly one would have to 
switch to the results buffer (maybe it will be selected right away) and 
type 'C-c C-c'. And then kill the buffer, I guess?

- The size threshold of a project where the improvement will be 
significant is pretty big -- for instance, searching across the Emacs 
checkout takes about 100-200ms (just the time the external process 
uses). If the search results in many matches (1000s or 10000s) the 
results will take a while to display, but most of the time is taken up 
by processing of results which is implemented in Lisp. We might have 
Emacs which shows the first results soon, but then remains sluggish 
until all search results are processed. This problem could be worked 
around, however, by limiting the displayed number of results and having 
buttons like the ones at the bottom of vc-print-root-log output buffer.

- Search results come in unsorted, and, in the case of ripgrep, sorted 
randomly every time the search is performed (the files, at least). We 
sort them now at the bottom of xref-matches-in-files, but asynchronous 
search results would make that infeasible.

Given all of the above, I've been putting off this work, but thoughts 
and opinions welcome, and POC patches -- doubly so.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Tue, 05 Oct 2021 05:19:01 GMT) Full text and rfc822 format available.

Message #14 received at 50906 <at> debbugs.gnu.org (full text, mbox):

From: Arthur Miller <arthur.miller <at> live.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Stefan Kangas <stefan <at> marxist.se>, 50906 <at> debbugs.gnu.org
Subject: Re: bug#50906: xref-find-references blocks Emacs: asynchronous
 operation?
Date: Tue, 05 Oct 2021 07:18:18 +0200

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> On 30.09.2021 01:49, Stefan Kangas wrote:
>> Severity: wishlist
>> `xref-find-references' blocks Emacs while searching for matches.
>> This can take a long time to complete in large repositories.
>> It would be nice if it could work asynchronously, like e.g. `M-x rgrep'.
>
> Wishlist indeed!
>
> Daniel's bug report shows a good case for this kind of feature: huge projects
> where the search, even using fast tools (e.g. ripgrep), takes multiple
> seconds. So if results of such searches could be displayed incrementally, it
> would improve the perceive speed and usability.
>
> What can be done here:
>
> - Design an "asynchronous" format for xref-show-xrefs-function to
>   consume. FETCHER of a different shape. Not sure how it's going to work in the
>   end -- maybe a simple-ish iterator (call a function again for more results),
>   but ideally it would look synchronous somehow, and the concurrency would be
>   achieved through the use of threads. Not sure if that's realistic.

Built-in threads are not realistic, what you probably want is async processes. I
was myself thinking of something for finding all references for implementing
this asynchronosly for help, in style of , but I have not yet come to implement
that. However I have looked at native comp, 'comp-run-async-workers' and how it
processes it's qeue. I have no idea if it can be somehow adapted/reused, but
something like that at least as an idea.

> - The new kind of fetcher would need to provide a way to abort the search, since
>  'C-g' would not be available anymore.
It depends on how you would use it. If you would scan for references in the
background than you would be working with something else and wouldn't need
C-g. But reading your writing, something tells me that you would like to use it
interactively, which means you would start a *synchronous* operation, which
would use async workers, a lá Java's or MFC's thread workers to get responsive
and visible updates in real-time, while workers are still searching. In that
case you would still have C-g avaialable. On C-g you could signal worker
processes to quit.

Perhaps ...? :)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Tue, 05 Oct 2021 06:30:02 GMT) Full text and rfc822 format available.

Message #17 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Helmut Eller <eller.helmut <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#50906: xref-find-references blocks Emacs: asynchronous
 operation?
Date: Tue, 05 Oct 2021 08:29:12 +0200

On Tue, Oct 05 2021, Dmitry Gutov wrote:

> What can be done here:
>
> - Design an "asynchronous" format for xref-show-xrefs-function to
>   consume. FETCHER of a different shape. Not sure how it's going to
>   work in the end -- maybe a simple-ish iterator (call a function
>   again for more results), but ideally it would look synchronous
>   somehow, and the concurrency would be achieved through the use of
>   threads. Not sure if that's realistic.
>
> - The new kind of fetcher would need to provide a way to abort the
>   search, since 'C-g' would not be available anymore.
>
> - Implement it for the common searches of course.

I think promises, as used in the Javascript world, would be a good fit
for this kind of problem.  Something like this:
https://github.com/chuntaro/emacs-promise.
        
> Downsides:
>
> - No way to quickly 'C-g' out of a search, supposedly one would have
>   to switch to the results buffer (maybe it will be selected right
>   away) and type 'C-c C-c'. And then kill the buffer, I guess?

Maybe we could have some "promise framework" that solves this problem
more generally, e.g., a list-promises command that works like
list-processes and offers a command to cancel promises.

> - The size threshold of a project where the improvement will be
>   significant is pretty big -- for instance, searching across the
>   Emacs checkout takes about 100-200ms (just the time the external
>   process uses). If the search results in many matches (1000s or
>   10000s) the results will take a while to display, but most of the
>   time is taken up by processing of results which is implemented in
>   Lisp. We might have Emacs which shows the first results soon, but
>   then remains sluggish until all search results are processed. This
>   problem could be worked around, however, by limiting the displayed
>   number of results and having buttons like the ones at the bottom of
>  vc-print-root-log output buffer.
>
> - Search results come in unsorted, and, in the case of ripgrep, sorted
>   randomly every time the search is performed (the files, at
>   least). We sort them now at the bottom of xref-matches-in-files, but
>   asynchronous search results would make that infeasible.

This is a good point and probably quite difficult to solve.  I'm
wondering if it would be possible to build some kind of index, like
search engines do.  So instead of grepping, we'd use the index and maybe
invest more effort in ranking the results?

Helmut

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Tue, 05 Oct 2021 15:13:02 GMT) Full text and rfc822 format available.

Message #20 received at 50906 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Arthur Miller <arthur.miller <at> live.com>
Cc: Stefan Kangas <stefan <at> marxist.se>, 50906 <at> debbugs.gnu.org
Subject: Re: bug#50906: xref-find-references blocks Emacs: asynchronous
 operation?
Date: Tue, 5 Oct 2021 18:11:40 +0300

On 05.10.2021 08:18, Arthur Miller wrote:

>> What can be done here:
>>
>> - Design an "asynchronous" format for xref-show-xrefs-function to
>>    consume. FETCHER of a different shape. Not sure how it's going to work in the
>>    end -- maybe a simple-ish iterator (call a function again for more results),
>>    but ideally it would look synchronous somehow, and the concurrency would be
>>    achieved through the use of threads. Not sure if that's realistic.
> 
> Built-in threads are not realistic, what you probably want is async processes.

Why not? It should be possible with cooperative multithreading (which we 
have), at least in theory. Under the hood it would be based on async 
processes, of course.

> I
> was myself thinking of something for finding all references for implementing
> this asynchronosly for help, in style of , but I have not yet come to implement
> that. However I have looked at native comp, 'comp-run-async-workers' and how it
> processes it's qeue. I have no idea if it can be somehow adapted/reused, but
> something like that at least as an idea.

Doesn't seem like it really can be reused directly: it launches a queue 
of processes. What we would need is a "queue" of result batches coming 
from one process. And we'd need some abstraction for it, not just 
concrete implementation.

>> - The new kind of fetcher would need to provide a way to abort the search, since
>>   'C-g' would not be available anymore.
> It depends on how you would use it. If you would scan for references in the
> background than you would be working with something else and wouldn't need
> C-g.

Why not? Sometimes the regexp I have typed is wrong (too short, for 
example), and I need to stop the search to correct it. Or even if the 
regexp was right, I might discover it brings too many matches to be useful.

> But reading your writing, something tells me that you would like to use it
> interactively, which means you would start a *synchronous* operation, which
> would use async workers, a lá Java's or MFC's thread workers to get responsive
> and visible updates in real-time, while workers are still searching. In that
> case you would still have C-g avaialable. On C-g you could signal worker
> processes to quit.

It's... an option too. And having lives with the current UI, I would 
probably be fine with it.

But I suppose a lot of users might want to be able to interact with the 
first results (that have been already rendered) before the search 
completes. Otherwise we're not really taking full advantage of 
asynchronous searching.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Tue, 05 Oct 2021 16:39:02 GMT) Full text and rfc822 format available.

Message #23 received at 50906 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Helmut Eller <eller.helmut <at> gmail.com>, 50906 <at> debbugs.gnu.org
Subject: Re: bug#50906: xref-find-references blocks Emacs: asynchronous
 operation?
Date: Tue, 5 Oct 2021 19:38:36 +0300

On 05.10.2021 09:29, Helmut Eller wrote:
> On Tue, Oct 05 2021, Dmitry Gutov wrote:
> 
>> What can be done here:
>>
>> - Design an "asynchronous" format for xref-show-xrefs-function to
>>    consume. FETCHER of a different shape. Not sure how it's going to
>>    work in the end -- maybe a simple-ish iterator (call a function
>>    again for more results), but ideally it would look synchronous
>>    somehow, and the concurrency would be achieved through the use of
>>    threads. Not sure if that's realistic.
>>
>> - The new kind of fetcher would need to provide a way to abort the
>>    search, since 'C-g' would not be available anymore.
>>
>> - Implement it for the common searches of course.
> 
> I think promises, as used in the Javascript world, would be a good fit
> for this kind of problem.  Something like this:
> https://github.com/chuntaro/emacs-promise.

A promise is something that resolves once. We could build on top of this 
concept, but what's really needed is some sort of a lazy sequence 
(Clojure-style), or a sequence of chunks.

>> Downsides:
>>
>> - No way to quickly 'C-g' out of a search, supposedly one would have
>>    to switch to the results buffer (maybe it will be selected right
>>    away) and type 'C-c C-c'. And then kill the buffer, I guess?
> 
> Maybe we could have some "promise framework" that solves this problem
> more generally, e.g., a list-promises command that works like
> list-processes and offers a command to cancel promises.

It would need be accessible by the code handling the "abort" command, 
not just by some special UI accessible to the user separately.

But some Promise/Future implementations include the "abort" 
functionality, so it can work together.

>> - The size threshold of a project where the improvement will be
>>    significant is pretty big -- for instance, searching across the
>>    Emacs checkout takes about 100-200ms (just the time the external
>>    process uses). If the search results in many matches (1000s or
>>    10000s) the results will take a while to display, but most of the
>>    time is taken up by processing of results which is implemented in
>>    Lisp. We might have Emacs which shows the first results soon, but
>>    then remains sluggish until all search results are processed. This
>>    problem could be worked around, however, by limiting the displayed
>>    number of results and having buttons like the ones at the bottom of
>>   vc-print-root-log output buffer.
>>
>> - Search results come in unsorted, and, in the case of ripgrep, sorted
>>    randomly every time the search is performed (the files, at
>>    least). We sort them now at the bottom of xref-matches-in-files, but
>>    asynchronous search results would make that infeasible.
> 
> This is a good point and probably quite difficult to solve.  I'm
> wondering if it would be possible to build some kind of index, like
> search engines do.  So instead of grepping, we'd use the index and maybe
> invest more effort in ranking the results?

For xref-find-references in particular, you can build an index using 'ID 
Utils' already, and the search will be fast. The downside is you will 
need to update this index manually when the project changes. E.g. when 
you switch to a different repository branch.

And the ripgrep devs are working on something similar: 
https://github.com/BurntSushi/ripgrep/issues/1497

Not sure how far off in the future that is, though.

A really fast searcher solves the biggest part of the problem, but we'd 
still be left with very imprecise searches (many matches) locking up 
Emacs for seconds, since the Lisp overhead processing a match is 
unavoidably larger than the time it takes for a search program to print 
it. Using lazy sequences could allow us some leeway as well -- namely, 
processing only the first N hits initially, and then processing the rest 
only if the user requests that.

If we only target this kind of improvement, the "abort" functionality 
could wait. We'd still need to choose between sorting the results and 
saving on parsing the output buffer eagerly, though.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Tue, 05 Oct 2021 18:10:01 GMT) Full text and rfc822 format available.

Message #26 received at 50906 <at> debbugs.gnu.org (full text, mbox):

From: Helmut Eller <eller.helmut <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 50906 <at> debbugs.gnu.org
Subject: Re: bug#50906: xref-find-references blocks Emacs: asynchronous
 operation?
Date: Tue, 05 Oct 2021 20:09:43 +0200

On Tue, Oct 05 2021, Dmitry Gutov wrote:

> A really fast searcher solves the biggest part of the problem, but
> we'd still be left with very imprecise searches (many matches) locking
> up Emacs for seconds, since the Lisp overhead processing a match is
> unavoidably larger than the time it takes for a search program to
> print it. Using lazy sequences could allow us some leeway as well --
> namely, processing only the first N hits initially, and then
> processing the rest only if the user requests that.
>
> If we only target this kind of improvement, the "abort" functionality
> could wait.

Yes, limiting the time that Emacs is locked up, by limiting the number of
hits that Emacs accepts in one chunk, seems like the way to go.

> We'd still need to choose between sorting the results and
> saving on parsing the output buffer eagerly, though.

Theoretically it should be possible to sort the first chunk and display
it.  Then, when the next chunk arrives, merge it in, à la heap-sort, and
update the display accordingly.  Probably not worth the effort, though.

Also, I think that the only "sorting" that we actually do, is grouping
by filename.  And that doesn't seem all that important to me.

Helmut

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50906; Package emacs. (Tue, 05 Oct 2021 19:25:01 GMT) Full text and rfc822 format available.

Message #29 received at 50906 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Helmut Eller <eller.helmut <at> gmail.com>
Cc: 50906 <at> debbugs.gnu.org
Subject: Re: bug#50906: xref-find-references blocks Emacs: asynchronous
 operation?
Date: Tue, 5 Oct 2021 22:24:26 +0300

On 05.10.2021 21:09, Helmut Eller wrote:

>> We'd still need to choose between sorting the results and
>> saving on parsing the output buffer eagerly, though.
> 
> Theoretically it should be possible to sort the first chunk and display
> it.  Then, when the next chunk arrives, merge it in, à la heap-sort, and
> update the display accordingly.  Probably not worth the effort, though.

This will lead to "jumping" of groups up and down. Not a pleasant UX.

> Also, I think that the only "sorting" that we actually do, is grouping
> by filename.  And that doesn't seem all that important to me.

xref-matches-in-files sorts results by filename alphabetically, because 
ripgrep returns them in random order every time. And the sorting step is 
pretty fast, as long as all results are available.

This bug report was last modified 4 years and 84 days ago.

Previous Next

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #50906 xref-find-references blocks Emacs: asynchronous operation?

GNU bug report logs - #50906
xref-find-references blocks Emacs: asynchronous operation?