GNU bug report logs - #59693
29.0.50; treesitter in base buffer doesn't respond to modifications in indirect buffer correctly

Previous Next

Package: emacs;

Reported by: miha <at> kamnitnik.top

Date: Tue, 29 Nov 2022 20:21:01 UTC

Severity: normal

Found in version 29.0.50

Done: Yuan Fu <casouri <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 59693 in the body.
You can then email your comments to 59693 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Tue, 29 Nov 2022 20:21:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to miha <at> kamnitnik.top:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 29 Nov 2022 20:21:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: miha <at> kamnitnik.top
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.50; treesitter in base buffer doesn't respond to modifications
 in indirect buffer correctly
Date: Tue, 29 Nov 2022 21:21:29 +0100
[Message part 1 (text/plain, inline)]
    *** Welcome to IELM ***  Type (describe-mode) or press C-h m for help.
    ELISP> (set-buffer (get-buffer-create "a"))
    ELISP> (insert "int main();")
    ELISP> (require 'treesit)
    ELISP> (treesit-node-children (treesit-node-child (treesit-buffer-root-node 'c) 0))
    (#<treesit-node
      (primitive_type)
      in 1-4> #<treesit-node
      (function_declarator)
      in 5-11> #<treesit-node ";" in 11-12>)

This is expected

    ELISP> (set-buffer (make-indirect-buffer "a" "b"))
    ELISP> (goto-char (point-min))
    ELISP> (insert " ")
    ELISP> (set-buffer "a")
    ELISP> (buffer-string)
    " int main();"

    ELISP> (treesit-node-children (treesit-node-child (treesit-buffer-root-node 'c) 0))
    (#<treesit-node
      (primitive_type)
      in 1-4> #<treesit-node
      (function_declarator)
      in 5-11> #<treesit-node
      (ERROR)
      in 11-12> #<treesit-node ";" in 12-13>)

This is unexpected. If we had called '(insert " ")' in the base buffer
"a", we would have got

    (#<treesit-node
      (primitive_type)
      in 2-5> #<treesit-node
      (function_declarator)
      in 6-12> #<treesit-node ";" in 12-13>)

Thanks for your hard work.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Wed, 30 Nov 2022 10:18:02 GMT) Full text and rfc822 format available.

Message #8 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: miha <at> kamnitnik.top
Cc: 59693 <at> debbugs.gnu.org
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond  to
 modifications in indirect buffer correctly
Date: Wed, 30 Nov 2022 02:17:14 -0800
miha <at> kamnitnik.top writes:

>     *** Welcome to IELM ***  Type (describe-mode) or press C-h m for help.
>     ELISP> (set-buffer (get-buffer-create "a"))
>     ELISP> (insert "int main();")
>     ELISP> (require 'treesit)
>     ELISP> (treesit-node-children (treesit-node-child (treesit-buffer-root-node 'c) 0))
>     (#<treesit-node
>       (primitive_type)
>       in 1-4> #<treesit-node
>       (function_declarator)
>       in 5-11> #<treesit-node ";" in 11-12>)
>
> This is expected
>
>     ELISP> (set-buffer (make-indirect-buffer "a" "b"))
>     ELISP> (goto-char (point-min))
>     ELISP> (insert " ")
>     ELISP> (set-buffer "a")
>     ELISP> (buffer-string)
>     " int main();"
>
>     ELISP> (treesit-node-children (treesit-node-child (treesit-buffer-root-node 'c) 0))
>     (#<treesit-node
>       (primitive_type)
>       in 1-4> #<treesit-node
>       (function_declarator)
>       in 5-11> #<treesit-node
>       (ERROR)
>       in 11-12> #<treesit-node ";" in 12-13>)
>
> This is unexpected. If we had called '(insert " ")' in the base buffer
> "a", we would have got
>
>     (#<treesit-node
>       (primitive_type)
>       in 2-5> #<treesit-node
>       (function_declarator)
>       in 6-12> #<treesit-node ";" in 12-13>)
>
> Thanks for your hard work.

Thanks! I forgot about indirect buffers... We’ll need to make sure to
use the parsers in the original buffer when user edits the indirect
buffer, or something like that. I need to look into how does indirect
buffer works.

Yuan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Wed, 30 Nov 2022 14:07:03 GMT) Full text and rfc822 format available.

Message #11 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50;
 treesitter in base buffer doesn't respond to modifications in
 indirect buffer correctly
Date: Wed, 30 Nov 2022 16:05:50 +0200
> Cc: 59693 <at> debbugs.gnu.org
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Wed, 30 Nov 2022 02:17:14 -0800
> 
> Thanks! I forgot about indirect buffers... We’ll need to make sure to
> use the parsers in the original buffer when user edits the indirect
> buffer, or something like that. I need to look into how does indirect
> buffer works.

In the insdel.c hooks where you record changes to buffer text, you should
see if the buffer has a base_buffer, and if so, update any parsers of the
base buffer as well.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Fri, 02 Dec 2022 05:06:01 GMT) Full text and rfc822 format available.

Message #14 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Thu, 1 Dec 2022 21:05:43 -0800

> On Nov 30, 2022, at 6:05 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> Cc: 59693 <at> debbugs.gnu.org
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Wed, 30 Nov 2022 02:17:14 -0800
>> 
>> Thanks! I forgot about indirect buffers... We’ll need to make sure to
>> use the parsers in the original buffer when user edits the indirect
>> buffer, or something like that. I need to look into how does indirect
>> buffer works.
> 
> In the insdel.c hooks where you record changes to buffer text, you should
> see if the buffer has a base_buffer, and if so, update any parsers of the
> base buffer as well.

Actually there’s a little bit of problem. When we edit the base buffer, we would want to update the parsers in all of its indirect buffers as well, and AFAICT there is no pointer from base buffer to the indirect buffer, only the other way around. 

We don’t want indirect buffer and base buffers to share parsers, since they can have different narrowing, and semantically indirect buffers should share anything but the text with the base buffer.

How about this: we change current_buffer->parser_list from a plain list of parsers to a cons (PARSER-LIST . INDIRECT-PARSER-LIST), where PARSER-LIST is as before. But for base buffers, INDIRECT-PARSER-LIST includes all the parsers of its indirect buffers; and for indirect buffers, INDIRECT-PARSER-LIST is nil.

Then base buffer can update all indirect buffers’ parsers, and indirect buffer can find its base buffer and update all the parsers, including the base buffer’s parsers and other indirect buffers’ parsers.

Of course, treesit-parser-create and treesit-parser-delete needs to do some extra work, but nothing complicated.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Fri, 02 Dec 2022 08:35:01 GMT) Full text and rfc822 format available.

Message #17 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Fri, 02 Dec 2022 10:33:31 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Thu, 1 Dec 2022 21:05:43 -0800
> Cc: miha <at> kamnitnik.top,
>  59693 <at> debbugs.gnu.org
> 
> > In the insdel.c hooks where you record changes to buffer text, you should
> > see if the buffer has a base_buffer, and if so, update any parsers of the
> > base buffer as well.
> 
> Actually there’s a little bit of problem. When we edit the base buffer, we would want to update the parsers in all of its indirect buffers as well, and AFAICT there is no pointer from base buffer to the indirect buffer, only the other way around. 

That's not the problem presented by the OP, though.

> We don’t want indirect buffer and base buffers to share parsers, since they can have different narrowing, and semantically indirect buffers should share anything but the text with the base buffer.

Yes, the parsers should not be shared.

> How about this: we change current_buffer->parser_list from a plain list of parsers to a cons (PARSER-LIST . INDIRECT-PARSER-LIST), where PARSER-LIST is as before. But for base buffers, INDIRECT-PARSER-LIST includes all the parsers of its indirect buffers; and for indirect buffers, INDIRECT-PARSER-LIST is nil.

You can maybe have the indirect buffers in the list, not their parsers.
That could make it easier to access other treesit-related information of the
indirect buffers, if needed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Sat, 03 Dec 2022 01:02:02 GMT) Full text and rfc822 format available.

Message #20 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Fri, 2 Dec 2022 17:01:36 -0800

> On Dec 2, 2022, at 12:33 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Thu, 1 Dec 2022 21:05:43 -0800
>> Cc: miha <at> kamnitnik.top,
>> 59693 <at> debbugs.gnu.org
>> 
>>> In the insdel.c hooks where you record changes to buffer text, you should
>>> see if the buffer has a base_buffer, and if so, update any parsers of the
>>> base buffer as well.
>> 
>> Actually there’s a little bit of problem. When we edit the base buffer, we would want to update the parsers in all of its indirect buffers as well, and AFAICT there is no pointer from base buffer to the indirect buffer, only the other way around. 
> 
> That's not the problem presented by the OP, though.

Yeah, but they are the same problem in spirit, ie, parser not updated when base/indirect buffer receive changes.

> 
>> We don’t want indirect buffer and base buffers to share parsers, since they can have different narrowing, and semantically indirect buffers should share anything but the text with the base buffer.
> 
> Yes, the parsers should not be shared.
> 
>> How about this: we change current_buffer->parser_list from a plain list of parsers to a cons (PARSER-LIST . INDIRECT-PARSER-LIST), where PARSER-LIST is as before. But for base buffers, INDIRECT-PARSER-LIST includes all the parsers of its indirect buffers; and for indirect buffers, INDIRECT-PARSER-LIST is nil.
> 
> You can maybe have the indirect buffers in the list, not their parsers.
> That could make it easier to access other treesit-related information of the
> indirect buffers, if needed.

Good idea, it’s easier to know when to remove the reference with buffers, aka when buffer is killed.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Sun, 04 Dec 2022 07:22:01 GMT) Full text and rfc822 format available.

Message #23 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Sat, 3 Dec 2022 23:20:59 -0800
[Message part 1 (text/plain, inline)]

> On Dec 2, 2022, at 5:01 PM, Yuan Fu <casouri <at> gmail.com> wrote:
> 
> 
> 
>> On Dec 2, 2022, at 12:33 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> 
>>> From: Yuan Fu <casouri <at> gmail.com>
>>> Date: Thu, 1 Dec 2022 21:05:43 -0800
>>> Cc: miha <at> kamnitnik.top,
>>> 59693 <at> debbugs.gnu.org
>>> 
>>>> In the insdel.c hooks where you record changes to buffer text, you should
>>>> see if the buffer has a base_buffer, and if so, update any parsers of the
>>>> base buffer as well.
>>> 
>>> Actually there’s a little bit of problem. When we edit the base buffer, we would want to update the parsers in all of its indirect buffers as well, and AFAICT there is no pointer from base buffer to the indirect buffer, only the other way around. 
>> 
>> That's not the problem presented by the OP, though.
> 
> Yeah, but they are the same problem in spirit, ie, parser not updated when base/indirect buffer receive changes.
> 
>> 
>>> We don’t want indirect buffer and base buffers to share parsers, since they can have different narrowing, and semantically indirect buffers should share anything but the text with the base buffer.
>> 
>> Yes, the parsers should not be shared.
>> 
>>> How about this: we change current_buffer->parser_list from a plain list of parsers to a cons (PARSER-LIST . INDIRECT-PARSER-LIST), where PARSER-LIST is as before. But for base buffers, INDIRECT-PARSER-LIST includes all the parsers of its indirect buffers; and for indirect buffers, INDIRECT-PARSER-LIST is nil.
>> 
>> You can maybe have the indirect buffers in the list, not their parsers.
>> That could make it easier to access other treesit-related information of the
>> indirect buffers, if needed.
> 
> Good idea, it’s easier to know when to remove the reference with buffers, aka when buffer is killed.

I now have a patch that fixes this problem. WDYT? I added a new buffer field since it’s cleaner than turning ts_parser_list into a cons, hopefully that’s not frowned upon. 


Yuan

[indirect.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Sun, 04 Dec 2022 07:47:01 GMT) Full text and rfc822 format available.

Message #26 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Sun, 04 Dec 2022 09:46:13 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 3 Dec 2022 23:20:59 -0800
> Cc: miha <at> kamnitnik.top,
>  59693 <at> debbugs.gnu.org
> 
> >>> We don’t want indirect buffer and base buffers to share parsers, since they can have different narrowing, and semantically indirect buffers should share anything but the text with the base buffer.
> >> 
> >> Yes, the parsers should not be shared.
> >> 
> >>> How about this: we change current_buffer->parser_list from a plain list of parsers to a cons (PARSER-LIST . INDIRECT-PARSER-LIST), where PARSER-LIST is as before. But for base buffers, INDIRECT-PARSER-LIST includes all the parsers of its indirect buffers; and for indirect buffers, INDIRECT-PARSER-LIST is nil.
> >> 
> >> You can maybe have the indirect buffers in the list, not their parsers.
> >> That could make it easier to access other treesit-related information of the
> >> indirect buffers, if needed.
> > 
> > Good idea, it’s easier to know when to remove the reference with buffers, aka when buffer is killed.
> 
> I now have a patch that fixes this problem. WDYT? I added a new buffer field since it’s cleaner than turning ts_parser_list into a cons, hopefully that’s not frowned upon. 

Thanks.

If we are adding to the buffer object a field that holds the list of
indirect buffers, then:

  . the name of the field should not include "treesit" in it, and it
    shouldn't be conditioned on HAVE_TREE_SITTER
  . I wonder if a flat list is a good idea, i.e. scalable enough? also,
    treesit_reap_indirect_buffers conses a lot as result
  . I vaguely remember that adding built-in fields to the buffer object had
    some caveats, but I don't recall the details (did you bootstrap?)

Stefan, any comments on this?  Are there better ideas of how to track buffer
text changes in indirect buffers?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Sun, 04 Dec 2022 23:22:02 GMT) Full text and rfc822 format available.

Message #29 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 59693 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Sun, 4 Dec 2022 15:21:10 -0800

> On Dec 3, 2022, at 11:46 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Sat, 3 Dec 2022 23:20:59 -0800
>> Cc: miha <at> kamnitnik.top,
>> 59693 <at> debbugs.gnu.org
>> 
>>>>> We don’t want indirect buffer and base buffers to share parsers, since they can have different narrowing, and semantically indirect buffers should share anything but the text with the base buffer.
>>>> 
>>>> Yes, the parsers should not be shared.
>>>> 
>>>>> How about this: we change current_buffer->parser_list from a plain list of parsers to a cons (PARSER-LIST . INDIRECT-PARSER-LIST), where PARSER-LIST is as before. But for base buffers, INDIRECT-PARSER-LIST includes all the parsers of its indirect buffers; and for indirect buffers, INDIRECT-PARSER-LIST is nil.
>>>> 
>>>> You can maybe have the indirect buffers in the list, not their parsers.
>>>> That could make it easier to access other treesit-related information of the
>>>> indirect buffers, if needed.
>>> 
>>> Good idea, it’s easier to know when to remove the reference with buffers, aka when buffer is killed.
>> 
>> I now have a patch that fixes this problem. WDYT? I added a new buffer field since it’s cleaner than turning ts_parser_list into a cons, hopefully that’s not frowned upon. 
> 
> Thanks.
> 
> If we are adding to the buffer object a field that holds the list of
> indirect buffers, then:
> 
>  . the name of the field should not include "treesit" in it, and it
>    shouldn't be conditioned on HAVE_TREE_SITTER

I made it tree-sitter specific and stated that it doesn’t necessarily contain all indirect buffers to simply the implementation. Right now new indirect buffer are added to the list only when a parser is created in an indirect buffer. And dead indirect buffers are discarded only when treesit_record_change runs. Do we really need a proper indirect buffer list? After all, no one complained about its absence ever since indirect buffer were added.

>  . I wonder if a flat list is a good idea, i.e. scalable enough? also,
>    treesit_reap_indirect_buffers conses a lot as result

AFAIK in most cases only a handful of indirect buffer are created, so I didn’t worry about that. For tree-sitter’s case, we need to iterate through every indirect buffer anyway so that’s always O(n). Adding and removing a buffer from the list is O(n) but they are done infrequently. The add operation is called every time a parser is created, and the remove operation is called once when an indirect buffer is killed. 

We could use a hash table, but I doubt if that’s necessary, at least while the indirect buffer list is only used for tree-sitter. For tree-sitter, the common path is when we update each parser of buffer changes, which is always O(n) regardless of the data structure. 

>  . I vaguely remember that adding built-in fields to the buffer object had
>    some caveats, but I don't recall the details (did you bootstrap?)

I built from git clean, so yeah.

> 
> Stefan, any comments on this?  Are there better ideas of how to track buffer
> text changes in indirect buffers?

Specifically, when any one of the base and indirect buffers change, we want all parsers in all base and indirect buffers to be updated.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Mon, 05 Dec 2022 03:50:02 GMT) Full text and rfc822 format available.

Message #32 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond
 to modifications in indirect buffer correctly
Date: Sun, 04 Dec 2022 22:49:14 -0500
> Actually there’s a little bit of problem. When we edit the base buffer, we
> would want to update the parsers in all of its indirect buffers as well, and
> AFAICT there is no pointer from base buffer to the indirect buffer, only the
> other way around. 

Background note: I consider indirect buffers an attractive nuisance
and for this reason I think we should spend as little time as possible
improving them.  Instead we should encourage people to develop
alternative approaches.
I know we have a large number of bugs lurking in that area.

E.g.:

    Emacs -Q
    M-x clone-indirect-buffer RET
    M-x fundamental-mode RET

we see that `*scratch*` has lost its font-lock highlighting even though we
changed major mode in the "other" buffer.
Now go to `*scratch*` type some text: you see that font-lock properly
updates the new code's highlighting.
Then go to `*scratch*<2>` and type some more text: you should see that
font-lock does *not* properly update the new code's highlighting in the
base buffer.

AFAICT this is the exact same bug as what you're seeing, just with
font-lock instead of tree-sitter.  [ Of course, for font-lock it's worse
because font-lock uses text-properties (which are shared between the
base and the its indirect buffers) so it simply can't do its job
properly in indirect buffers.  ]

This bug has been with us since indirect buffers were invented, so
it has very low priority, AFAIC.

> Then base buffer can update all indirect buffers’ parsers, and indirect
> buffer can find its base buffer and update all the parsers, including the
> base buffer’s parsers and other indirect buffers’ parsers.

Don't bother, please.
Instead, I recommend you disallow the use of tree sitter in indirect
buffers.  And we should probably try and change `insdel.c` so it always
runs the `after/before-change-functions` in the base buffer (this should
fix the worst part of the problems).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Mon, 05 Dec 2022 08:20:01 GMT) Full text and rfc822 format available.

Message #35 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>, Yuan Fu <casouri <at> gmail.com>
Cc: 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to modifications in indirect buffer correctly
Date: Mon, 05 Dec 2022 10:19:15 +0200
On December 5, 2022 5:49:14 AM GMT+02:00, Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> 
> Don't bother, please.
> Instead, I recommend you disallow the use of tree sitter in indirect
> buffers.  And we should probably try and change `insdel.c` so it always
> runs the `after/before-change-functions` in the base buffer (this should
> fix the worst part of the problems).


Will this "hold water" vis-a-vis the expectations of various features that would like to use tree sitter related capabilities?  Regardless of your opinion on indirect buffers, many 3rd party packages and features use indirect buffers as the backbone of their implementation.  We don't currently have any alternatives to that, AFAIK.  So I'd expect a lot of disappointment if we declare that indirect buffers will not be supported by tree sitter as a matter of design decision.

Changing insdel.c to run stuff in base buffer could be a solution, but I don't feel we can make such changes on the release branch.  But maybe we can do that now only for treesit.c functions.  Yuan, would tgat solve this particular problem?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Mon, 05 Dec 2022 15:30:02 GMT) Full text and rfc822 format available.

Message #38 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Yuan Fu <casouri <at> gmail.com>, 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond
 to modifications in indirect buffer correctly
Date: Mon, 05 Dec 2022 10:29:06 -0500
> Will this "hold water" vis-a-vis the expectations of various features that
> would like to use tree sitter related capabilities?  Regardless of your
> opinion on indirect buffers, many 3rd party packages and features use
> indirect buffers as the backbone of their implementation.  We don't
> currently have any alternatives to that, AFAIK.  So I'd expect a lot of
> disappointment if we declare that indirect buffers will not be supported by
> tree sitter as a matter of design decision.

I can't see why: a package that uses indirect buffers can just as well
run its tree-sitter parsers in the base buffer.

> Changing insdel.c to run stuff in base buffer could be a solution, but
> I don't feel we can make such changes on the release branch.

Agreed.

> But maybe we can do that now only for treesit.c functions.

Sounds OK to me, yes.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Mon, 05 Dec 2022 15:46:01 GMT) Full text and rfc822 format available.

Message #41 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: casouri <at> gmail.com, 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond
 to modifications in indirect buffer correctly
Date: Mon, 05 Dec 2022 17:44:49 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Yuan Fu <casouri <at> gmail.com>,  59693 <at> debbugs.gnu.org,  miha <at> kamnitnik.top
> Date: Mon, 05 Dec 2022 10:29:06 -0500
> 
> > Will this "hold water" vis-a-vis the expectations of various features that
> > would like to use tree sitter related capabilities?  Regardless of your
> > opinion on indirect buffers, many 3rd party packages and features use
> > indirect buffers as the backbone of their implementation.  We don't
> > currently have any alternatives to that, AFAIK.  So I'd expect a lot of
> > disappointment if we declare that indirect buffers will not be supported by
> > tree sitter as a matter of design decision.
> 
> I can't see why: a package that uses indirect buffers can just as well
> run its tree-sitter parsers in the base buffer.

I thought about those multi-mode modes, or features that show the same
buffer more than once with different modes.

> > Changing insdel.c to run stuff in base buffer could be a solution, but
> > I don't feel we can make such changes on the release branch.
> 
> Agreed.
> 
> > But maybe we can do that now only for treesit.c functions.
> 
> Sounds OK to me, yes.

OK, thanks.  Yuan, will this work for you? can you show a patch along these
lines?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Mon, 05 Dec 2022 20:15:02 GMT) Full text and rfc822 format available.

Message #44 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: casouri <at> gmail.com, 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond
 to modifications in indirect buffer correctly
Date: Mon, 05 Dec 2022 15:14:20 -0500
>> I can't see why: a package that uses indirect buffers can just as well
>> run its tree-sitter parsers in the base buffer.
> I thought about those multi-mode modes, or features that show the same
> buffer more than once with different modes.

Yes, that's exactly the case I'm thinking about: these packages should
have no trouble running all their parsers in the base buffer (even if
the rest of their work is done in indirect buffers).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Tue, 06 Dec 2022 02:16:01 GMT) Full text and rfc822 format available.

Message #47 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 59693 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Mon, 5 Dec 2022 18:15:07 -0800

> On Dec 5, 2022, at 7:44 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
>> Cc: Yuan Fu <casouri <at> gmail.com>,  59693 <at> debbugs.gnu.org,  miha <at> kamnitnik.top
>> Date: Mon, 05 Dec 2022 10:29:06 -0500
>> 
>>> Will this "hold water" vis-a-vis the expectations of various features that
>>> would like to use tree sitter related capabilities?  Regardless of your
>>> opinion on indirect buffers, many 3rd party packages and features use
>>> indirect buffers as the backbone of their implementation.  We don't
>>> currently have any alternatives to that, AFAIK.  So I'd expect a lot of
>>> disappointment if we declare that indirect buffers will not be supported by
>>> tree sitter as a matter of design decision.
>> 
>> I can't see why: a package that uses indirect buffers can just as well
>> run its tree-sitter parsers in the base buffer.
> 
> I thought about those multi-mode modes, or features that show the same
> buffer more than once with different modes.
> 
>>> Changing insdel.c to run stuff in base buffer could be a solution, but
>>> I don't feel we can make such changes on the release branch.
>> 
>> Agreed.
>> 
>>> But maybe we can do that now only for treesit.c functions.
>> 
>> Sounds OK to me, yes.
> 
> OK, thanks.  Yuan, will this work for you? can you show a patch along these
> lines?

That wouldn’t work, unfortunately. We need to update all the parsers in every indirect and base buffer, enforce all changes to happen in the base buffer doesn’t help with that: indirect buffers’ parsers would be out-of-sync.

I can think of two ways:

1. Only allow base buffer to have parsers, no change is needed for insdel.c, treesit_record_change can find the base buffer and update its parsers. We can ask indirect buffers to use their base buffer’s parser. Unless the base buffer is narrowed, I think it will work fine. 

I remember that there were a discussion along the lines of user-narrow vs low-level narrow, what was the outcome of that discussion?

2. The patch I sent earlier, which tracks indirect buffers so all the relevant buffer’s parser are updated at a buffer content change.

Yuan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Tue, 06 Dec 2022 03:58:01 GMT) Full text and rfc822 format available.

Message #50 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 59693 <at> debbugs.gnu.org, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond
 to modifications in indirect buffer correctly
Date: Mon, 05 Dec 2022 22:57:23 -0500
> 1. Only allow base buffer to have parsers, no change is needed for insdel.c,

That's the option I recommend, and I that's what I understood Eli was
agreeing we do.

> treesit_record_change can find the base buffer and update its parsers. We
> can ask indirect buffers to use their base buffer’s parser. Unless the base
> buffer is narrowed, I think it will work fine. 

Exactly.  Code that uses this can `widen` as needed as well.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Tue, 06 Dec 2022 12:19:02 GMT) Full text and rfc822 format available.

Message #53 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 59693 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Tue, 06 Dec 2022 14:17:44 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Mon, 5 Dec 2022 18:15:07 -0800
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>,
>  59693 <at> debbugs.gnu.org,
>  miha <at> kamnitnik.top
> 
> 1. Only allow base buffer to have parsers, no change is needed for insdel.c, treesit_record_change can find the base buffer and update its parsers. We can ask indirect buffers to use their base buffer’s parser. Unless the base buffer is narrowed, I think it will work fine. 

I think this is fine, but we need to document it.

> I remember that there were a discussion along the lines of user-narrow vs low-level narrow, what was the outcome of that discussion?

Nothing in particular, and I don't think it's relevant.  If some mode needs
to widen, it can.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Wed, 07 Dec 2022 23:14:01 GMT) Full text and rfc822 format available.

Message #56 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 59693 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Wed, 7 Dec 2022 15:13:19 -0800
[Message part 1 (text/plain, inline)]

> On Dec 6, 2022, at 4:17 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Mon, 5 Dec 2022 18:15:07 -0800
>> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>,
>> 59693 <at> debbugs.gnu.org,
>> miha <at> kamnitnik.top
>> 
>> 1. Only allow base buffer to have parsers, no change is needed for insdel.c, treesit_record_change can find the base buffer and update its parsers. We can ask indirect buffers to use their base buffer’s parser. Unless the base buffer is narrowed, I think it will work fine. 
> 
> I think this is fine, but we need to document it.
> 
>> I remember that there were a discussion along the lines of user-narrow vs low-level narrow, what was the outcome of that discussion?
> 
> Nothing in particular, and I don't think it's relevant.  If some mode needs
> to widen, it can.
> 
> Thanks.

Here is a patch that does #1.

Yuan

[indirect.patch (application/octet-stream, attachment)]
[Message part 3 (text/plain, inline)]


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#59693; Package emacs. (Thu, 08 Dec 2022 06:48:01 GMT) Full text and rfc822 format available.

Message #59 received at 59693 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 59693 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, miha <at> kamnitnik.top
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond to
 modifications in indirect buffer correctly
Date: Thu, 08 Dec 2022 08:47:13 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Wed, 7 Dec 2022 15:13:19 -0800
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>,
>  59693 <at> debbugs.gnu.org,
>  miha <at> kamnitnik.top
> 
> >> 1. Only allow base buffer to have parsers, no change is needed for insdel.c, treesit_record_change can find the base buffer and update its parsers. We can ask indirect buffers to use their base buffer’s parser. Unless the base buffer is narrowed, I think it will work fine. 
> > 
> > I think this is fine, but we need to document it.
> > 
> >> I remember that there were a discussion along the lines of user-narrow vs low-level narrow, what was the outcome of that discussion?
> > 
> > Nothing in particular, and I don't think it's relevant.  If some mode needs
> > to widen, it can.
> > 
> > Thanks.
> 
> Here is a patch that does #1.

Thanks, a few minor comments for documentation below.

> +If @var{buffer} (or the current buffer) is an indirect buffer, its
> +base buffer is used instead.  That is, indirect buffers uses their
                                          ^^^^^^^^^^^^^^^^^^^^^
"use", in plural.

> @@ -447,7 +455,9 @@ Using Parser
>  @defun treesit-parser-list &optional buffer
>  This function returns the parser list of @var{buffer}.  If
>  @var{buffer} is @code{nil} or omitted, it defaults to the current
> -buffer.
> +buffer.  If @var{buffer} (or the current buffer) is an indirect
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I'd say more concisely

  If that buffer is an indirect buffer, ...

And please add a cross-reference to the node where indirect buffers
are described.

> +buffer, its base buffer is used instead.  That is, indirect buffers
> +uses their base buffer's parsers.
   ^^^^
"use".

> +   Parsers in indirect buffers: We make indirect buffers to share the
> +   parser of its base buffer.  See bug#59693 for reasoning.  */

I'd rather have a short summary of the reasoning here than ask the
readers to go to the bug tracker and read a long discussion.  Just
explain why indirect buffers present a problem for a parser, and then
say that we decided to do this as the easiest, simplest solution.

> +If BUFFER (or the current buffer) is an indirect buffer, its base
> +buffer is used instead.  That is, indirect buffers uses their base
                                                      ^^^^
"use"

> +buffer's parsers.  If the base buffer is narrowed, an indirect buffer
> +might not be able to retrieve information of the portion of the buffer
> +text that are invisible in the base buffer.  Lisp programs should
> +widen as necessary should they want to use a parser in an indirect
> +buffer.  */)

Here I would remove the second sentence: it is appropriate for the
manual, but is redundant in the doc string, since the next sentence
says it all.

> @@ -1329,7 +1345,10 @@ DEFUN ("treesit-parser-list",
>         Ftreesit_parser_list, Streesit_parser_list,
>         0, 1, 0,
>         doc: /* Return BUFFER's parser list.
> -BUFFER defaults to the current buffer.  */)
> +
> +BUFFER defaults to the current buffer.  If BUFFER (or the current
> +buffer) is an indirect buffer, its base buffer is used instead.  That
> +is, indirect buffers uses their base buffer's parsers.  */)
                        ^^^^
"use"

Otherwise, LGTM.




Reply sent to Yuan Fu <casouri <at> gmail.com>:
You have taken responsibility. (Sat, 10 Dec 2022 01:42:02 GMT) Full text and rfc822 format available.

Notification sent to miha <at> kamnitnik.top:
bug acknowledged by developer. (Sat, 10 Dec 2022 01:42:02 GMT) Full text and rfc822 format available.

Message #64 received at 59693-done <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 59693-done <at> debbugs.gnu.org, miha <at> kamnitnik.top, monnier <at> iro.umontreal.ca
Subject: Re: bug#59693: 29.0.50; treesitter in base buffer doesn't respond  to
 modifications in indirect buffer correctly
Date: Fri, 9 Dec 2022 17:41:04 -0800
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Wed, 7 Dec 2022 15:13:19 -0800
>> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>,
>>  59693 <at> debbugs.gnu.org,
>>  miha <at> kamnitnik.top
>> 
>> >> 1. Only allow base buffer to have parsers, no change is needed
>> >> for insdel.c, treesit_record_change can find the base buffer and
>> >> update its parsers. We can ask indirect buffers to use their base
>> >> buffer’s parser. Unless the base buffer is narrowed, I think it
>> >> will work fine.
>> > 
>> > I think this is fine, but we need to document it.
>> > 
>> >> I remember that there were a discussion along the lines of user-narrow vs low-level narrow, what was the outcome of that discussion?
>> > 
>> > Nothing in particular, and I don't think it's relevant.  If some mode needs
>> > to widen, it can.
>> > 
>> > Thanks.
>> 
>> Here is a patch that does #1.
>
> Thanks, a few minor comments for documentation below.
>
>> +If @var{buffer} (or the current buffer) is an indirect buffer, its
>> +base buffer is used instead.  That is, indirect buffers uses their
>                                           ^^^^^^^^^^^^^^^^^^^^^
> "use", in plural.
>
>> @@ -447,7 +455,9 @@ Using Parser
>>  @defun treesit-parser-list &optional buffer
>>  This function returns the parser list of @var{buffer}.  If
>>  @var{buffer} is @code{nil} or omitted, it defaults to the current
>> -buffer.
>> +buffer.  If @var{buffer} (or the current buffer) is an indirect
>             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> I'd say more concisely
>
>   If that buffer is an indirect buffer, ...
>
> And please add a cross-reference to the node where indirect buffers
> are described.
>
>> +buffer, its base buffer is used instead.  That is, indirect buffers
>> +uses their base buffer's parsers.
>    ^^^^
> "use".
>
>> +   Parsers in indirect buffers: We make indirect buffers to share the
>> +   parser of its base buffer.  See bug#59693 for reasoning.  */
>
> I'd rather have a short summary of the reasoning here than ask the
> readers to go to the bug tracker and read a long discussion.  Just
> explain why indirect buffers present a problem for a parser, and then
> say that we decided to do this as the easiest, simplest solution.
>
>> +If BUFFER (or the current buffer) is an indirect buffer, its base
>> +buffer is used instead.  That is, indirect buffers uses their base
>                                                       ^^^^
> "use"
>
>> +buffer's parsers.  If the base buffer is narrowed, an indirect buffer
>> +might not be able to retrieve information of the portion of the buffer
>> +text that are invisible in the base buffer.  Lisp programs should
>> +widen as necessary should they want to use a parser in an indirect
>> +buffer.  */)
>
> Here I would remove the second sentence: it is appropriate for the
> manual, but is redundant in the doc string, since the next sentence
> says it all.
>
>> @@ -1329,7 +1345,10 @@ DEFUN ("treesit-parser-list",
>>         Ftreesit_parser_list, Streesit_parser_list,
>>         0, 1, 0,
>>         doc: /* Return BUFFER's parser list.
>> -BUFFER defaults to the current buffer.  */)
>> +
>> +BUFFER defaults to the current buffer.  If BUFFER (or the current
>> +buffer) is an indirect buffer, its base buffer is used instead.  That
>> +is, indirect buffers uses their base buffer's parsers.  */)
>                         ^^^^
> "use"
>
> Otherwise, LGTM.

Cool, I fixed those and pushed.

Yuan




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 07 Jan 2023 12:24:11 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 109 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.