GNU bug report logs -
#62368
29.0.60; Evaluating predicates before creating captured nodes in treesit-query-capture
Previous Next
Reported by: Yuan Fu <casouri <at> gmail.com>
Date: Wed, 22 Mar 2023 04:50:01 UTC
Severity: normal
Found in version 29.0.60
Done: Yuan Fu <casouri <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 62368 in the body.
You can then email your comments to 62368 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
dgutov <at> yandex.ru, bug-gnu-emacs <at> gnu.org
:
bug#62368
; Package
emacs
.
(Wed, 22 Mar 2023 04:50:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Yuan Fu <casouri <at> gmail.com>
:
New bug report received and forwarded. Copy sent to
dgutov <at> yandex.ru, bug-gnu-emacs <at> gnu.org
.
(Wed, 22 Mar 2023 04:50:01 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
X-Debbugs-CC: dgutov <at> yandex.ru
Dmitry, when you have time, could you try your benchmark in bug#60953
with this patch? I made predicates evaluate before we create any nodes,
so #equal and #match should be more efficient now, when there are a lot
of rejections. In the same time #pred is made slightly worst since they
now create a lisp node and discard it. (But this can be fixed with a
little more complexity.)
Yuan
[pred-first.patch (application/octet-stream, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62368
; Package
emacs
.
(Thu, 23 Mar 2023 00:43:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 62368 <at> debbugs.gnu.org (full text, mbox):
Hi Yuan!
On 22/03/2023 06:49, Yuan Fu wrote:
> X-Debbugs-CC:dgutov <at> yandex.ru
>
> Dmitry, when you have time, could you try your benchmark in bug#60953
> with this patch? I made predicates evaluate before we create any nodes,
> so #equal and #match should be more efficient now, when there are a lot
> of rejections. In the same time #pred is made slightly worst since they
> now create a lisp node and discard it. (But this can be fixed with a
> little more complexity.)
Thank you, I was curious what would the improvement be if we could delay
allocation of node structures until :match is checked.
But for my benchmark the difference is on the order of 4-5%. It seems we
are scraping the barrel in terms of improving allocations/reducing GC
because according to 'benchmark-run', where the whole run of a 100
iterations of the scenario takes ~1.1s, the time spent in GC is 0.150s.
And the improved version takes like 1.04s, with 0.1s in GC.
So if you ask me, I think I'd prefer to hold off on applying this patch
until we either find scenarios where the improvement is more
significant, or we find and eliminate some other bigger bottleneck
first, after which these 5% grow to become 10-20% or more, of remaining
runtime. The current approach is pretty Lisp-y, so I kind of like it.
And there's the issue of #pred, of course, which which could swing the
difference in the other direction (I didn't test any code which uses it).
We could also try a smaller change: where the initial list of conses for
result is build with capture_id's in car's, and then substituted with
capture_name if the predicates all match. Then tthe treesit_node
pseudovectors would still be created eagerly, though.
Here's the current perf report for my benchmark, most of the time is
spent in libtree-sitter:
17.02% emacs libtree-sitter.so.0.0 [.]
ts_tree_cursor_current_status ◆
10.94% emacs libtree-sitter.so.0.0 [.]
ts_tree_cursor_goto_next_sibling ▒
9.93% emacs libtree-sitter.so.0.0 [.]
ts_tree_cursor_goto_first_child ▒
9.55% emacs emacs [.]
process_mark_stack ▒
4.56% emacs libtree-sitter.so.0.0 [.]
ts_node_start_point ▒
3.90% emacs libtree-sitter.so.0.0 [.]
ts_tree_cursor_parent_node ▒
3.69% emacs emacs [.]
re_match_2_internal ▒
3.08% emacs libtree-sitter.so.0.0 [.]
ts_language_symbol_metadata ▒
1.61% emacs emacs [.] exec_byte_code
▒
1.47% emacs libtree-sitter.so.0.0 [.]
ts_node_end_point ▒
1.44% emacs libtree-sitter.so.0.0 [.]
ts_tree_cursor_current_node ▒
1.13% emacs emacs [.]
allocate_vectorlike ▒
1.11% emacs emacs [.] sweep_strings
▒
1.04% emacs libtree-sitter.so.0.0 [.]
ts_node_end_byte ▒
0.94% emacs emacs [.] next_interval
▒
0.91% emacs libtree-sitter.so.0.0 [.]
ts_tree_cursor_goto_parent ▒
0.88% emacs emacs [.]
lookup_char_property ▒
0.81% emacs emacs [.] find_interval
▒
0.68% emacs emacs [.]
pdumper_marked_p_impl ▒
0.67% emacs emacs [.] assq_no_quit
▒
0.56% emacs libtree-sitter.so.0.0 [.] ts_node_symbol
▒
0.56% emacs emacs [.] mark_char_table
▒
0.55% emacs emacs [.] execute_charset
▒
0.49% emacs libtree-sitter.so.0.0 [.]
0x000000000001ae3e ▒
0.49% emacs emacs [.] re_search_2
▒
0.48% emacs emacs [.] funcall_subr
▒
0.46% emacs libc.so.6 [.] __strncmp_sse42
▒
0.42% emacs libtree-sitter.so.0.0 [.]
ts_language_public_symbol ▒
0.41% emacs libtree-sitter.so.0.0 [.]
ts_node_is_named ▒
0.40% emacs libtree-sitter.so.0.0 [.] ts_node_new
▒
0.34% emacs emacs [.] Fassq
▒
0.34% emacs emacs [.] sweep_vectors
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62368
; Package
emacs
.
(Thu, 23 Mar 2023 03:17:01 GMT)
Full text and
rfc822 format available.
Message #11 received at 62368 <at> debbugs.gnu.org (full text, mbox):
> On Mar 22, 2023, at 5:42 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>
> Hi Yuan!
>
> On 22/03/2023 06:49, Yuan Fu wrote:
>> X-Debbugs-CC:dgutov <at> yandex.ru
>> Dmitry, when you have time, could you try your benchmark in bug#60953
>> with this patch? I made predicates evaluate before we create any nodes,
>> so #equal and #match should be more efficient now, when there are a lot
>> of rejections. In the same time #pred is made slightly worst since they
>> now create a lisp node and discard it. (But this can be fixed with a
>> little more complexity.)
>
> Thank you, I was curious what would the improvement be if we could delay allocation of node structures until :match is checked.
>
> But for my benchmark the difference is on the order of 4-5%. It seems we are scraping the barrel in terms of improving allocations/reducing GC because according to 'benchmark-run', where the whole run of a 100 iterations of the scenario takes ~1.1s, the time spent in GC is 0.150s. And the improved version takes like 1.04s, with 0.1s in GC.
>
> So if you ask me, I think I'd prefer to hold off on applying this patch until we either find scenarios where the improvement is more significant, or we find and eliminate some other bigger bottleneck first, after which these 5% grow to become 10-20% or more, of remaining runtime. The current approach is pretty Lisp-y, so I kind of like it.
>
> And there's the issue of #pred, of course, which which could swing the difference in the other direction (I didn't test any code which uses it).
>
> We could also try a smaller change: where the initial list of conses for result is build with capture_id's in car's, and then substituted with capture_name if the predicates all match. Then tthe treesit_node pseudovectors would still be created eagerly, though.
Thank you very much! Yeah, it doesn’t seem to worth it. I guess we can keep this in our back sleeve for now ;-)
I think using symbols is fine for now, since I don’t think it would make much difference.
Yuan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62368
; Package
emacs
.
(Tue, 12 Sep 2023 00:07:01 GMT)
Full text and
rfc822 format available.
Message #14 received at 62368 <at> debbugs.gnu.org (full text, mbox):
Yuan Fu <casouri <at> gmail.com> writes:
>> On Mar 22, 2023, at 5:42 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>>
>> Hi Yuan!
>>
>> On 22/03/2023 06:49, Yuan Fu wrote:
>>> X-Debbugs-CC:dgutov <at> yandex.ru Dmitry, when you have time, could you
>>> try your benchmark in bug#60953 with this patch? I made predicates
>>> evaluate before we create any nodes, so #equal and #match should be
>>> more efficient now, when there are a lot of rejections. In the same
>>> time #pred is made slightly worst since they now create a lisp node
>>> and discard it. (But this can be fixed with a little more
>>> complexity.)
>>
>> Thank you, I was curious what would the improvement be if we could
>> delay allocation of node structures until :match is checked.
>>
>> But for my benchmark the difference is on the order of 4-5%. It seems
>> we are scraping the barrel in terms of improving allocations/reducing
>> GC because according to 'benchmark-run', where the whole run of a 100
>> iterations of the scenario takes ~1.1s, the time spent in GC is
>> 0.150s. And the improved version takes like 1.04s, with 0.1s in GC.
>>
>> So if you ask me, I think I'd prefer to hold off on applying this
>> patch until we either find scenarios where the improvement is more
>> significant, or we find and eliminate some other bigger bottleneck
>> first, after which these 5% grow to become 10-20% or more, of
>> remaining runtime. The current approach is pretty Lisp-y, so I kind
>> of like it.
>>
>> And there's the issue of #pred, of course, which which could swing
>> the difference in the other direction (I didn't test any code which
>> uses it).
>>
>> We could also try a smaller change: where the initial list of conses
>> for result is build with capture_id's in car's, and then substituted
>> with capture_name if the predicates all match. Then tthe treesit_node
>> pseudovectors would still be created eagerly, though.
>
> Thank you very much! Yeah, it doesn’t seem to worth it. I guess we can
> keep this in our back sleeve for now ;-)
>
> I think using symbols is fine for now, since I don’t think it would
> make much difference.
So should this bug be closed, then?
Reply sent
to
Yuan Fu <casouri <at> gmail.com>
:
You have taken responsibility.
(Tue, 12 Sep 2023 00:39:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Yuan Fu <casouri <at> gmail.com>
:
bug acknowledged by developer.
(Tue, 12 Sep 2023 00:39:02 GMT)
Full text and
rfc822 format available.
Message #19 received at 62368-done <at> debbugs.gnu.org (full text, mbox):
> On Sep 11, 2023, at 5:05 PM, Stefan Kangas <stefankangas <at> gmail.com> wrote:
>
> Yuan Fu <casouri <at> gmail.com> writes:
>
>>> On Mar 22, 2023, at 5:42 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>>>
>>> Hi Yuan!
>>>
>>> On 22/03/2023 06:49, Yuan Fu wrote:
>>>> X-Debbugs-CC:dgutov <at> yandex.ru Dmitry, when you have time, could you
>>>> try your benchmark in bug#60953 with this patch? I made predicates
>>>> evaluate before we create any nodes, so #equal and #match should be
>>>> more efficient now, when there are a lot of rejections. In the same
>>>> time #pred is made slightly worst since they now create a lisp node
>>>> and discard it. (But this can be fixed with a little more
>>>> complexity.)
>>>
>>> Thank you, I was curious what would the improvement be if we could
>>> delay allocation of node structures until :match is checked.
>>>
>>> But for my benchmark the difference is on the order of 4-5%. It seems
>>> we are scraping the barrel in terms of improving allocations/reducing
>>> GC because according to 'benchmark-run', where the whole run of a 100
>>> iterations of the scenario takes ~1.1s, the time spent in GC is
>>> 0.150s. And the improved version takes like 1.04s, with 0.1s in GC.
>>>
>>> So if you ask me, I think I'd prefer to hold off on applying this
>>> patch until we either find scenarios where the improvement is more
>>> significant, or we find and eliminate some other bigger bottleneck
>>> first, after which these 5% grow to become 10-20% or more, of
>>> remaining runtime. The current approach is pretty Lisp-y, so I kind
>>> of like it.
>>>
>>> And there's the issue of #pred, of course, which which could swing
>>> the difference in the other direction (I didn't test any code which
>>> uses it).
>>>
>>> We could also try a smaller change: where the initial list of conses
>>> for result is build with capture_id's in car's, and then substituted
>>> with capture_name if the predicates all match. Then tthe treesit_node
>>> pseudovectors would still be created eagerly, though.
>>
>> Thank you very much! Yeah, it doesn’t seem to worth it. I guess we can
>> keep this in our back sleeve for now ;-)
>>
>> I think using symbols is fine for now, since I don’t think it would
>> make much difference.
>
> So should this bug be closed, then?
We can close this for now. Thanks Stefan.
Yuan
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Tue, 10 Oct 2023 11:24:14 GMT)
Full text and
rfc822 format available.
This bug report was last modified 1 year and 214 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.