GNU bug report logs - #62086
29.0.60; ruby-ts-mode regressions

Package: emacs;

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Thu, 9 Mar 2023 17:28:02 UTC

Severity: normal

Fixed in version 29.0.60

Done: Juri Linkov <juri <at> linkov.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 62086 in the body.
You can then email your comments to 62086 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Thu, 09 Mar 2023 17:28:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Juri Linkov <juri <at> linkov.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 09 Mar 2023 17:28:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.60; ruby-ts-mode regressions
Date: Thu, 09 Mar 2023 19:24:40 +0200

'C-M-f' ('forward-sexp') commands currently are unusable in master
because they skip too much.  So I relied on word motion commands like
'M-f' ('forward-word') to move in ruby-ts-mode.  But unfortunately
some recent change broke even word motion in emacs-29, so no motion commands
can be used in ruby-ts-mode, only motion by characters can be used with
'C-f' ('forward-char').  Here is a recipe for recent regression in emacs-29:

0. emacs -Q
1. C-x C-f test/lisp/progmodes/ruby-mode-resources/ruby-parenless-call-arguments-indent.rb RET
2. M-x ruby-ts-mode RET
3. move point to after the first letter 'c'
4. type 'M-f' ('forward-word')

It skips two words in symbols.

I don't know if the second bug is related to this, but while
in the same file, also type 'C-M-l' ('reposition-window').
It raises the error:

  Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
    treesit-end-of-defun()
    end-of-defun(-1)
    reposition-window(nil nil)
    reposition-window(nil 89)
    funcall-interactively(reposition-window nil 89)
    command-execute(reposition-window)

This regression is also recent.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Thu, 09 Mar 2023 18:09:02 GMT) Full text and rfc822 format available.

Message #8 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Thu, 09 Mar 2023 20:08:15 +0200

> From: Juri Linkov <juri <at> linkov.net>
> Date: Thu, 09 Mar 2023 19:24:40 +0200
> 
> 'C-M-f' ('forward-sexp') commands currently are unusable in master
> because they skip too much.  So I relied on word motion commands like
> 'M-f' ('forward-word') to move in ruby-ts-mode.  But unfortunately
> some recent change broke even word motion in emacs-29, so no motion commands
> can be used in ruby-ts-mode, only motion by characters can be used with
> 'C-f' ('forward-char').  Here is a recipe for recent regression in emacs-29:
> 
> 0. emacs -Q
> 1. C-x C-f test/lisp/progmodes/ruby-mode-resources/ruby-parenless-call-arguments-indent.rb RET
> 2. M-x ruby-ts-mode RET
> 3. move point to after the first letter 'c'
> 4. type 'M-f' ('forward-word')
> 
> It skips two words in symbols.

I guess this is because of the syntax-table properties that
ruby-ts-mode puts on the buffer text?

> I don't know if the second bug is related to this, but while
> in the same file, also type 'C-M-l' ('reposition-window').
> It raises the error:
> 
>   Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
>     treesit-end-of-defun()
>     end-of-defun(-1)
>     reposition-window(nil nil)
>     reposition-window(nil 89)
>     funcall-interactively(reposition-window nil 89)
>     command-execute(reposition-window)
> 
> This regression is also recent.

I seem to be unable to reproduce this.  Maybe it happens only in some
particular place in the file?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Thu, 09 Mar 2023 22:03:02 GMT) Full text and rfc822 format available.

Message #11 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>, 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Fri, 10 Mar 2023 00:02:14 +0200

Hi! Thanks for the report.

On 09/03/2023 19:24, Juri Linkov wrote:
> 'C-M-f' ('forward-sexp') commands currently are unusable in master
> because they skip too much.

I'm happy to discuss this sometime later, in a different report, 
preferably after Emacs 29's pre-release drops. We'd probably just need 
to tweak the relevant regexp.

But from what I see, most of the possible confusion stems from it 
jumping over implicit parens, just like over explicit ones. The addition 
of binary operators and assignments might also have something to do with it.

> So I relied on word motion commands like
> 'M-f' ('forward-word') to move in ruby-ts-mode.  But unfortunately
> some recent change broke even word motion in emacs-29, so no motion commands
> can be used in ruby-ts-mode, only motion by characters can be used with
> 'C-f' ('forward-char').  Here is a recipe for recent regression in emacs-29:
> 
> 0. emacs -Q
> 1. C-x C-f test/lisp/progmodes/ruby-mode-resources/ruby-parenless-call-arguments-indent.rb RET
> 2. M-x ruby-ts-mode RET
> 3. move point to after the first letter 'c'
> 4. type 'M-f' ('forward-word')
> 
> It skips two words in symbols.

I might have been too eager in propertizing symbol contents with the 
"symbol" syntax. Now fixed in emacs-29, commit ecdfd584a52.
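
A minimal sketch of how a `syntax-table' text property can change word motion. It is not the ruby-ts-mode code: the buffer text and the `demo-' function name are made up for illustration, and it only assumes stock Emacs primitives (`put-text-property', `string-to-syntax', `forward-word', `parse-sexp-lookup-properties').

```elisp
;; Illustration only: made-up text and names, not the ruby-ts-mode code.
(defun demo-forward-word-end (propertize)
  "Return point after one `forward-word' from the start of a temp buffer.
When PROPERTIZE is non-nil, first give the word \"alpha\" symbol syntax
through a `syntax-table' text property."
  (with-temp-buffer
    (insert "alpha beta gamma")
    ;; `syntax-table' text properties are honored only when this is non-nil.
    (setq-local parse-sexp-lookup-properties t)
    (when propertize
      ;; Positions 1..5 hold "alpha"; mark them as symbol constituents.
      (put-text-property 1 6 'syntax-table (string-to-syntax "_")))
    (goto-char (point-min))
    (forward-word 1)
    (point)))
```

Without the property, word motion should stop right after "alpha"; with it, "alpha" no longer counts as a word, so the same command should run on to the end of "beta", which resembles the "skips two words" symptom described above.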

> I don't know if the second bug is related to this, but while
> in the same file, also type 'C-M-l' ('reposition-window').
> It raises the error:
> 
>    Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
>      treesit-end-of-defun()
>      end-of-defun(-1)
>      reposition-window(nil nil)
>      reposition-window(nil 89)
>      funcall-interactively(reposition-window nil 89)
>      command-execute(reposition-window)
> 
> This regression is also recent.

I've managed to reproduce this, but only once. Do you see this every time?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Fri, 10 Mar 2023 07:43:01 GMT) Full text and rfc822 format available.

Message #14 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Fri, 10 Mar 2023 09:29:01 +0200

>> I don't know if the second bug is related to this, but while
>> in the same file, also type 'C-M-l' ('reposition-window').
>> It raises the error:
>>
>>   Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
>>     treesit-end-of-defun()
>>     end-of-defun(-1)
>>     reposition-window(nil nil)
>>     reposition-window(nil 89)
>>     funcall-interactively(reposition-window nil 89)
>>     command-execute(reposition-window)
>>
>> This regression is also recent.
>
> I seem to be unable to reproduce this.  Maybe it happens only in some
> particular place in the file?

It happens everywhere in that file with ruby-ts-mode in 'emacs-29 -Q'.
To get the backtrace, I set debug-on-error to t.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Fri, 10 Mar 2023 07:43:02 GMT) Full text and rfc822 format available.

Message #17 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Fri, 10 Mar 2023 09:35:46 +0200

>> 'C-M-f' ('forward-sexp') commands currently are unusable in master
>> because they skip too much.
>
> I'm happy to discuss this sometime later, in a different report, preferably
> after Emacs 29's pre-release drops. We'd probably just need to tweak the
> relevant regexp.
>
> But from what I see, most of the possible confusion stems from it jumping
> over implicit parens, just like over explicit ones. The addition of binary
> operators and assignments might also have something to do with it.

That's the problem: some implicit parens are unexpected.
But let's adjust this later in another report.

>> So I relied on word motion commands like
>> 'M-f' ('forward-word') to move in ruby-ts-mode.  But unfortunately
>> some recent change broke even word motion in emacs-29, so no motion commands
>> can be used in ruby-ts-mode, only motion by characters can be used with
>> 'C-f' ('forward-char').  Here is a recipe for recent regression in emacs-29:
>> 0. emacs -Q
>> 1. C-x C-f test/lisp/progmodes/ruby-mode-resources/ruby-parenless-call-arguments-indent.rb RET
>> 2. M-x ruby-ts-mode RET
>> 3. move point to after the first letter 'c'
>> 4. type 'M-f' ('forward-word')
>> It skips two words in symbols.
>
> I might have been too eager in propertizing symbol contents with the
> "symbol" syntax. Now fixed in emacs-29, commit ecdfd584a52.

Thanks, I confirm this is fixed.

>> I don't know if the second bug is related to this, but while
>> in the same file, also type 'C-M-l' ('reposition-window').
>> It raises the error:
>>    Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
>>      treesit-end-of-defun()
>>      end-of-defun(-1)
>>      reposition-window(nil nil)
>>      reposition-window(nil 89)
>>      funcall-interactively(reposition-window nil 89)
>>      command-execute(reposition-window)
>> This regression is also recent.
>
> I've managed to reproduce this, but only once. Do you see this every time?

I see it only in some files in test/lisp/progmodes/ruby-mode-resources/
e.g. ruby-parenless-call-arguments-indent.rb, ruby-method-call-indent.rb,
ruby-block-indent.rb.  But not in e.g. ruby-after-operator-indent.rb.
Also everywhere in test/lisp/progmodes/js-resources/js-indent-init-dynamic.js,
js-indent-init-t.js.  But not in e.g. js-chain.js.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Fri, 10 Mar 2023 16:38:02 GMT) Full text and rfc822 format available.

Message #20 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>, Yuan Fu <casouri <at> gmail.com>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Fri, 10 Mar 2023 18:37:08 +0200

On 10/03/2023 09:35, Juri Linkov wrote:
>>> I don't know if the second bug is related to this, but while
>>> in the same file, also type 'C-M-l' ('reposition-window').
>>> It raises the error:
>>>     Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
>>>       treesit-end-of-defun()
>>>       end-of-defun(-1)
>>>       reposition-window(nil nil)
>>>       reposition-window(nil 89)
>>>       funcall-interactively(reposition-window nil 89)
>>>       command-execute(reposition-window)
>>> This regression is also recent.
>> I've managed to reproduce this, but only once. Do you see this every time?
> I see it only in some files in test/lisp/progmodes/ruby-mode-resources/
> e.g. ruby-parenless-call-arguments-indent.rb, ruby-method-call-indent.rb,
> ruby-block-indent.rb.  But not in e.g. ruby-after-operator-indent.rb.
> Also everywhere in test/lisp/progmodes/js-resources/js-indent-init-dynamic.js,
> js-indent-init-t.js.  But not in e.g. js-chain.js.

Thanks, I can repro. I might have been trying the wrong binding at the 
end last night (C-l instead of C-M-l).

The fix seems to be easy:

diff --git a/lisp/treesit.el b/lisp/treesit.el
index c118f5d52a4..b271a1f0c4b 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1882,6 +1882,7 @@ treesit-end-of-defun
 `treesit-defun-skipper'."
   (interactive "^p\nd")
   (let ((orig-point (point)))
+    (if (or (null arg) (= arg 0)) (setq arg 1))
     (catch 'done
       (dotimes (_ 2) ; Not making progress is better than infloop.

But I'm not quite sure if that is what we want to do.

More natural, I think, would be to remove the argument from 
treesit-end-of-defun altogether (and adjust the code accordingly), 
because end-of-defun-function is documented to take no arguments.

The only other place where treesit-end-of-defun seems to be used is the 
<remap> <end-of-defun> binding set up by treesit-major-mode-setup.

Why not keep the default bindings for these? When 
beginning-of-defun-function and end-of-defun-function are set 
appropriately, they should work fine. Don't they?

Cc'ing Yuan on that subject.
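
A small sketch of why a nil argument leads to the error in the backtrace, and of the normalization the one-line patch above performs. Both helpers are hypothetical (`demo-' prefix) and are not part of treesit.el; which expression inside `treesit-end-of-defun' actually signals is an assumption here, since any numeric comparison or arithmetic on nil behaves the same way.

```elisp
;; Hypothetical helpers for illustration; not part of treesit.el.
(defun demo-normalize-count (arg)
  "Return ARG as a usable repeat count, treating nil and 0 as 1."
  (if (or (null arg) (= arg 0)) 1 arg))

(defun demo-compare-error (arg)
  "Return the error data if comparing ARG with 0 signals, else nil."
  (condition-case err
      (progn (< arg 0) nil)
    ;; Numeric functions reject nil with `wrong-type-argument'.
    (wrong-type-argument err)))
```

Calling `(demo-compare-error nil)' returns the same error data as in the backtrace, while `(demo-compare-error (demo-normalize-count nil))' returns nil because the count has been defaulted to 1 first.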

bug marked as fixed in version 29.0.60, send any further explanations to 62086 <at> debbugs.gnu.org and Juri Linkov <juri <at> linkov.net> Request was from Juri Linkov <juri <at> linkov.net> to control <at> debbugs.gnu.org. (Mon, 13 Mar 2023 07:36:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Mon, 03 Apr 2023 16:30:06 GMT) Full text and rfc822 format available.

Message #25 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Mon, 03 Apr 2023 19:29:27 +0300

>> 1. C-x C-f test/lisp/progmodes/ruby-mode-resources/ruby-parenless-call-arguments-indent.rb RET
>> 2. M-x ruby-ts-mode RET
>> 3. move point to after the first letter 'c'
>> 4. type 'M-f' ('forward-word')
>> It skips two words in symbols.
>
> I might have been too eager in propertizing symbol contents with the
> "symbol" syntax. Now fixed in emacs-29, commit ecdfd584a52.

Thanks.  Here is a new problem:

  @foo, @bar = baz.(
    some_arg
  )

'C-M-f' and 'C-M-b' skip @foo and @bar.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Mon, 03 Apr 2023 20:43:01 GMT) Full text and rfc822 format available.

Message #28 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Mon, 3 Apr 2023 23:42:04 +0300

On 03/04/2023 19:29, Juri Linkov wrote:
>>> 1. C-x C-f test/lisp/progmodes/ruby-mode-resources/ruby-parenless-call-arguments-indent.rb RET
>>> 2. M-x ruby-ts-mode RET
>>> 3. move point to after the first letter 'c'
>>> 4. type 'M-f' ('forward-word')
>>> It skips two words in symbols.
>> I might have been too eager in propertizing symbol contents with the
>> "symbol" syntax. Now fixed in emacs-29, commit ecdfd584a52.
> Thanks.  Here is a new problem:
> 
>    @foo, @bar = baz.(
>      some_arg
>    )
> 
> 'C-M-f' and 'C-M-b' skip @foo and @bar.

Also fixed in commit bd5c1d1cbbd.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Tue, 04 Apr 2023 07:40:02 GMT) Full text and rfc822 format available.

Message #31 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Tue, 04 Apr 2023 10:16:47 +0300

>> Here is a new problem:
>>    @foo, @bar = baz.(
>>      some_arg
>>    )
>> 'C-M-f' and 'C-M-b' skip @foo and @bar.
>
> Also fixed in commit bd5c1d1cbbd.

Thanks, I confirm these are fixed.  I wonder whether it is possible to fix more.
Many parens/brackets are still not matched in e.g.
test/lisp/progmodes/ruby-mode-resources/ruby.rb
such as parens in def argument list:

  def test1(arg)

and in

  method (a + b),

and brackets in

  case translation
  in ['th', orig_text, 'en', trans_text]
    puts "English translation: #{orig_text} => #{trans_text}"
  in {th: orig_text, ja: trans_text} => whole

Also square brackets are not matched by 'C-M-f' in

  h[:key]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 05 Apr 2023 00:08:01 GMT) Full text and rfc822 format available.

Message #34 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 5 Apr 2023 03:06:52 +0300

On 04/04/2023 10:16, Juri Linkov wrote:
> I wonder whether it is possible to fix more.
> Many parens/brackets are still not matched in e.g.
> test/lisp/progmodes/ruby-mode-resources/ruby.rb
> such as parens in def argument list:
> 
>    def test1(arg)

This one was a regression from the addition of strict bos/eos anchors, 
now fixed.

> and in
> 
>    method (a + b),

When you say that this is broken, do you mean that these parens get 
jumped over unexpectedly (with forward-sexp movement ending at the end 
of the arguments list)? This is an artefact of the implementation of 
treesit-forward-sexp. It might be possible to improve, but from a brief 
dig, it has some internal logic. So some care would need to be taken to 
decide which contract needs changing.

> and brackets in
> 
>    case translation
>    in ['th', orig_text, 'en', trans_text]
>      puts "English translation: #{orig_text} => #{trans_text}"
>    in {th: orig_text, ja: trans_text} => whole

Now fixed. Also, "case" matches "end" with this syntax too now.

> Also square brackets are not matched by 'C-M-f' in
> 
>    h[:key]

And this, surprisingly, seems impossible to handle just using 
treesit-sexp-type-regexp. The brackets are present in the tree, but they 
are not at the ends of any node. So that will require some custom Lisp, 
I guess.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 05 Apr 2023 06:29:02 GMT) Full text and rfc822 format available.

Message #37 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 05 Apr 2023 09:24:24 +0300

>> I wonder whether it is possible to fix more.
>> Many parens/brackets are still not matched in e.g.
>> test/lisp/progmodes/ruby-mode-resources/ruby.rb
>> such as parens in def argument list:
>>    def test1(arg)
>
> This one was a regression from the addition of strict bos/eos anchors, now
> fixed.

Maybe there are more types that now are not found, but probably easier
to add them one by one after testing than to try finding all of them in
https://github.com/tree-sitter/tree-sitter-ruby/blob/master/src/node-types.json
or in
https://github.com/tree-sitter/tree-sitter-ruby/blob/master/src/grammar.json

>> and in
>>    method (a + b),
>
> When you say that this is broken, do you mean that these parens get jumped
> over unexpectedly (with forward-sexp movement ending at the end of the
> arguments list)?

It seems natural to expect that when point is on an opening paren/bracket
then 'C-M-f' should jump to its closing pair.  At least, this is more WYSIWYG.

> This is an artefact of the implementation of treesit-forward-sexp.
> It might be possible to improve, but from a brief dig, it has some
> internal logic. So some care would need to be taken to decide which
> contract needs changing.

This is an example where explicit parens conflict with implicit parens.
Visible parens have the type "parenthesized_statements", but invisible
parens have the type "argument_list".  Both start at the same position.
So maybe treesit-forward-sexp should prefer the former over the latter?
And in a similar case

  method [],
         arg2

maybe "array" should take precedence over "argument_list".

>> Also square brackets are not matched by 'C-M-f' in
>>    h[:key]
>
> And this, surprisingly, seems impossible to handle just using
> treesit-sexp-type-regexp. The brackets are present in the tree, but they
> are not at the ends of any node. So that will require some custom Lisp,
> I guess.

This is the same problem that occurs in other places such as in "#{ddf}"
where only '#' but not '{' matches '}'.  So adding "element_reference"
will allow jumping only from the beginning of an identifier.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 05 Apr 2023 14:59:02 GMT) Full text and rfc822 format available.

Message #40 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 5 Apr 2023 17:58:38 +0300

On 05/04/2023 09:24, Juri Linkov wrote:
>>> I wonder whether it is possible to fix more.
>>> Many parens/brackets are still not matched in e.g.
>>> test/lisp/progmodes/ruby-mode-resources/ruby.rb
>>> such as parens in def argument list:
>>>     def test1(arg)
>>
>> This one was a regression from the addition of strict bos/eos anchors, now
>> fixed.
> 
> Maybe there are more types that now are not found, but probably easier
> to add them one by one after testing than to try finding all of them in
> https://github.com/tree-sitter/tree-sitter-ruby/blob/master/src/node-types.json
> or in
> https://github.com/tree-sitter/tree-sitter-ruby/blob/master/src/grammar.json

Yep. And we've hopefully more-or-less covered the existing grammar at 
this point.

>>> and in
>>>     method (a + b),
>>
>> When you say that this is broken, do you mean that these parens get jumped
>> over unexpectedly (with forward-sexp movement ending at the end of the
>> arguments list)?
> 
> It seems natural to expect that when point is on an opening paren/bracket
> then 'C-M-f' should jump to its closing pair.  At least, this is more WYSIWYG.
> 
>> This is an artefact of the implementation of treesit-forward-sexp.
>> It might be possible to improve, but from a brief dig, it has some
>> internal logic. So some care would need to be taken to decide which
>> contract needs changing.
> 
> This is an example where explicit parens conflict with implicit parens.
> Visible parens have the type "parenthesized_statements", but invisible
> parens have the type "argument_list".  Both start at the same position.
> So maybe treesit-forward-sexp should prefer the former over the latter?
> And in a similar case
> 
>    method [],
>           arg2
> 
> maybe "array" should take precedence over "argument_list".

There is no mechanism for precedence in the current implementation. We 
can try ignoring the implicit parens in the parenless method calls, 
though. Like this:

diff --git a/lisp/progmodes/ruby-ts-mode.el b/lisp/progmodes/ruby-ts-mode.el
index ddf2ee98c3b..cf8f1b0d315 100644
--- a/lisp/progmodes/ruby-ts-mode.el
+++ b/lisp/progmodes/ruby-ts-mode.el
@@ -1086,6 +1086,15 @@ ruby-ts--syntax-propertize
            (put-text-property pos (1+ pos) 'syntax-table
                               (string-to-syntax "!"))))))))

+(defun ruby-ts--sexp-p (node)
+  ;; Skip parenless calls (implicit parens are both non-obvious to the
+  ;; user, and might take over when we want to just jump over some physical
+  ;; parens/braces).
+  (or (not (equal (treesit-node-type node)
+                  "argument_list"))
+      (equal (treesit-node-type (treesit-node-child node 0))
+             "(")))
+
 (defvar-keymap ruby-ts-mode-map
   :doc "Keymap used in Ruby mode"
   :parent prog-mode-map
@@ -1114,6 +1123,7 @@ ruby-ts-mode
   (setq-local treesit-defun-type-regexp ruby-ts--method-regex)

   (setq-local treesit-sexp-type-regexp
+              (cons
               (rx bol
                   (or "class"
                       "module"
@@ -1147,7 +1157,8 @@ ruby-ts-mode
                       "instance_variable"
                       "global_variable"
                       )
-                  eol))
+                  eol)
+                  #'ruby-ts--sexp-p))

   ;; AFAIK, Ruby can not nest methods
   (setq-local treesit-defun-prefer-top-level nil)
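
The decision made by the predicate in the patch above can be tried without a tree-sitter grammar using a toy model. In this sketch a node is just a list (TYPE CHILD...), and the `demo-' accessors are hypothetical stand-ins for `treesit-node-type' and `(treesit-node-child node 0)'; only the boolean logic mirrors the patch.

```elisp
;; Toy model: a node is a list (TYPE CHILD...), not a real treesit node.
(defun demo-node-type (node)
  "Return the type string of toy NODE, or nil if NODE is nil."
  (car node))

(defun demo-node-first-child (node)
  "Return the first child of toy NODE, or nil if it has none."
  (cadr node))

(defun demo-sexp-p (node)
  "Return non-nil unless NODE is an \"argument_list\" without a literal \"(\"."
  (or (not (equal (demo-node-type node) "argument_list"))
      (equal (demo-node-type (demo-node-first-child node)) "(")))
```

Under this model an argument list whose first child is "(" is still treated as a sexp, an argument list that starts with anything else (the parenless call) is rejected, and every other node type passes through unchanged.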



>>> Also square brackets are not matched by 'C-M-f' in
>>>     h[:key]
>>
>> And this, surprisingly, seems impossible to handle just using
>> treesit-sexp-type-regexp. The brackets are present in the tree, but they
>> are not at the ends of any node. So that will require some custom Lisp,
>> I guess.
> 
> This is the same problem that occurs in other places such as in "#{ddf}"
> where only '#' but not '{' matches '}'.  So adding "element_reference"
> will allow jumping only from the beginning of an identifier.

Right, except it's worse because the identifier is usually much longer 
than one character.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 05 Apr 2023 16:30:02 GMT) Full text and rfc822 format available.

Message #43 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 05 Apr 2023 19:25:46 +0300

> There is no mechanism for precedence in the current implementation. We can
> try ignoring the implicit parens in the parenless method calls,
> though. Like this:

I don't know how many users might still want to skip implicit parens.
Maybe this could be customizable with another list that by default
includes "argument_list".  It's nice that it's doable with the
current treesit features.

> +(defun ruby-ts--sexp-p (node)
> +  ;; Skip parenless calls (implicit parens are both non-obvious to the
> +  ;; user, and might take over when we want to just jump over some physical
> +  ;; parens/braces).
> +  (or (not (equal (treesit-node-type node)
> +                  "argument_list"))
> +      (equal (treesit-node-type (treesit-node-child node 0))
> +             "(")))

Maybe something similar could be used to detect '[' in 'h[:key]'
to match the corresponding ']'.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 05 Apr 2023 16:37:02 GMT) Full text and rfc822 format available.

Message #46 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 5 Apr 2023 19:36:37 +0300

On 05/04/2023 19:25, Juri Linkov wrote:
>> There is no mechanism for precedence in the current implementation. We can
>> try ignoring the implicit parens in the parenless method calls,
>> though. Like this:
> I don't know how many users might still want to skip implicit parens.
> Maybe this could be customizable with another list that by default
> includes "argument_list".  It's nice that it's doable with the
> current treesit features.

Calls with both physical and implicit parens have this type.

I'd rather not add user options in advance; let's try to work out what 
looks like the most reasonable behavior, and then add them after 
specific requests.

>> +(defun ruby-ts--sexp-p (node)
>> +  ;; Skip parenless calls (implicit parens are both non-obvious to the
>> +  ;; user, and might take over when we want to just jump over some physical
>> +  ;; parens/braces).
>> +  (or (not (equal (treesit-node-type node)
>> +                  "argument_list"))
>> +      (equal (treesit-node-type (treesit-node-child node 0))
>> +             "(")))
> Maybe something similar could be used to detect '[' in 'h[:key]'
> to match the corresponding ']'.

It doesn't look like that, no.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Tue, 11 Apr 2023 16:59:02 GMT) Full text and rfc822 format available.

Message #49 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Tue, 11 Apr 2023 19:53:53 +0300

[Message part 1 (text/plain, inline)]

I don't know if opening a new bug report is needed.
Actually I'm doing the same thing for more ts-modes -
trying to find a set of node names that match parens/brackets.
So maybe this patch makes sense too:

[treesit-sexp-type-regexp.patch (text/x-diff, inline)]

diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index d773b4a41f4..e55d26177af 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -927,7 +927,9 @@ c-ts-base-mode
                             "qualifier"
                             "type"
                             "parameter"
-                            "expression"
+                            ;; "expression"
+                            "argument_list"
+                            "identifier"
                             "literal"
                             "string")))
 
diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
index f68ecb6fa6c..3876a5b54f1 100644
--- a/lisp/progmodes/js.el
+++ b/lisp/progmodes/js.el
@@ -3827,7 +3827,9 @@ js--treesit-sentence-nodes
 "See `treesit-sentence-type-regexp' for more information.")
 
 (defvar js--treesit-sexp-nodes
-  '("expression"
+  '("expression" ;; SHOULD NOT MATCH "expression_statement", BUT SHOULD MATCH "parenthesized_expression"
+    "parenthesized_expression"
+    "formal_parameters"
     "pattern"
     "array"
     "function"
@@ -3845,7 +3847,13 @@ js--treesit-sexp-nodes
     "undefined"
     "arguments"
     "pair"
-    "jsx")
+    "jsx"
+    "statement_block"
+    "object"
+    "object_pattern"
+    "named_imports"
+    "class_body"
+    )
   "Nodes that designate sexps in JavaScript.
 See `treesit-sexp-type-regexp' for more information.")
 
@@ -3893,7 +3901,7 @@ js-ts-mode
                 (regexp-opt js--treesit-sentence-nodes))
 
     (setq-local treesit-sexp-type-regexp
-                (regexp-opt js--treesit-sexp-nodes))
+                (rx-to-string `(seq bol (or ,@js--treesit-sexp-nodes) eol)))
 
     ;; Fontification.
     (setq-local treesit-font-lock-settings js--treesit-font-lock-settings)
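
The last hunk changes how node type names are matched, which can be checked in isolation. The sketch below uses a short made-up subset of node names and hypothetical `demo-' helpers; `regexp-opt', `rx-to-string' and `string-match-p' are the stock functions, used the same way as in the patch.

```elisp
(defvar demo-sexp-nodes '("expression" "array" "pair")
  "Short made-up subset of node type names, for illustration only.")

(defun demo-unanchored-match-p (type)
  "Return t if TYPE contains any of `demo-sexp-nodes' as a substring."
  (and (string-match-p (regexp-opt demo-sexp-nodes) type) t))

(defun demo-anchored-match-p (type)
  "Return t if TYPE is exactly one of `demo-sexp-nodes'."
  (and (string-match-p
        ;; Same construction as in the patch: bol/eol around the alternatives.
        (rx-to-string `(seq bol (or ,@demo-sexp-nodes) eol))
        type)
       t))
```

The unanchored form accepts "expression_statement" and "parenthesized_expression" because both contain "expression"; the anchored form rejects both, which would explain why the patch lists "parenthesized_expression" explicitly.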

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Tue, 11 Apr 2023 23:31:01 GMT) Full text and rfc822 format available.

Message #52 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 62086 <at> debbugs.gnu.org, Yuan Fu <casouri <at> gmail.com>,
 Theodor Thornhill <theo <at> thornhill.no>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 02:30:19 +0300

On 11/04/2023 19:53, Juri Linkov wrote:
> I don't know if opening a new bug report is needed.
> Actually I'm doing the same thing for more ts-modes -
> trying to find a set of node names that match parens/brackets.
> So maybe this patch makes sense too:

These look sensible to me.

I think we should give a chance to the authors to chime in, though.

> treesit-sexp-type-regexp.patch
> 
> diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
> index d773b4a41f4..e55d26177af 100644
> --- a/lisp/progmodes/c-ts-mode.el
> +++ b/lisp/progmodes/c-ts-mode.el
> @@ -927,7 +927,9 @@ c-ts-base-mode
>                               "qualifier"
>                               "type"
>                               "parameter"
> -                            "expression"
> +                            ;; "expression"
> +                            "argument_list"
> +                            "identifier"
>                               "literal"
>                               "string")))
>   
> diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
> index f68ecb6fa6c..3876a5b54f1 100644
> --- a/lisp/progmodes/js.el
> +++ b/lisp/progmodes/js.el
> @@ -3827,7 +3827,9 @@ js--treesit-sentence-nodes
>   "See `treesit-sentence-type-regexp' for more information.")
>   
>   (defvar js--treesit-sexp-nodes
> -  '("expression"
> +  '("expression" ;; SHOULD NOT MATCH "expression_statement", BUT SHOULD MATCH "parenthesized_expression"
> +    "parenthesized_expression"
> +    "formal_parameters"
>       "pattern"
>       "array"
>       "function"
> @@ -3845,7 +3847,13 @@ js--treesit-sexp-nodes
>       "undefined"
>       "arguments"
>       "pair"
> -    "jsx")
> +    "jsx"
> +    "statement_block"
> +    "object"
> +    "object_pattern"
> +    "named_imports"
> +    "class_body"
> +    )
>     "Nodes that designate sexps in JavaScript.
>   See `treesit-sexp-type-regexp' for more information.")
>   
> @@ -3893,7 +3901,7 @@ js-ts-mode
>                   (regexp-opt js--treesit-sentence-nodes))
>   
>       (setq-local treesit-sexp-type-regexp
> -                (regexp-opt js--treesit-sexp-nodes))
> +                (rx-to-string `(seq bol (or ,@js--treesit-sexp-nodes) eol)))
>   
>       ;; Fontification.
>       (setq-local treesit-font-lock-settings js--treesit-font-lock-settings)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 07:07:02 GMT) Full text and rfc822 format available.

Message #55 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org, Theodor Thornhill <theo <at> thornhill.no>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 00:05:41 -0700


> On Apr 11, 2023, at 4:30 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> 
> On 11/04/2023 19:53, Juri Linkov wrote:
>> I don't know if opening a new bug report is needed.
>> Actually I'm doing the same thing for more ts-modes -
>> trying to find a set of node names that match parens/brackets.
>> So maybe this patch makes sense too:
> 
> These look sensible to me.
> 
> I think we should give a chance to the authors to chime in, though.
> 
>> treesit-sexp-type-regexp.patch
>> diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
>> index d773b4a41f4..e55d26177af 100644
>> --- a/lisp/progmodes/c-ts-mode.el
>> +++ b/lisp/progmodes/c-ts-mode.el
>> @@ -927,7 +927,9 @@ c-ts-base-mode
>>                              "qualifier"
>>                              "type"
>>                              "parameter"
>> -                            "expression"
>> +                            ;; "expression"
>> +                            "argument_list"
>> +                            "identifier"
>>                              "literal"
>>                              "string")))
>>  diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
>> index f68ecb6fa6c..3876a5b54f1 100644
>> --- a/lisp/progmodes/js.el
>> +++ b/lisp/progmodes/js.el
>> @@ -3827,7 +3827,9 @@ js--treesit-sentence-nodes
>>  "See `treesit-sentence-type-regexp' for more information.")
>>    (defvar js--treesit-sexp-nodes
>> -  '("expression"
>> +  '("expression" ;; SHOULD NOT MATCH "expression_statement", BUT SHOULD MATCH "parenthesized_expression"
>> +    "parenthesized_expression"
>> +    "formal_parameters"
>>      "pattern"
>>      "array"
>>      "function"
>> @@ -3845,7 +3847,13 @@ js--treesit-sexp-nodes
>>      "undefined"
>>      "arguments"
>>      "pair"
>> -    "jsx")
>> +    "jsx"
>> +    "statement_block"
>> +    "object"
>> +    "object_pattern"
>> +    "named_imports"
>> +    "class_body"
>> +    )
>>    "Nodes that designate sexps in JavaScript.
>>  See `treesit-sexp-type-regexp' for more information.")
>>  @@ -3893,7 +3901,7 @@ js-ts-mode
>>                  (regexp-opt js--treesit-sentence-nodes))
>>        (setq-local treesit-sexp-type-regexp
>> -                (regexp-opt js--treesit-sexp-nodes))
>> +                (rx-to-string `(seq bol (or ,@js--treesit-sexp-nodes) eol)))
>>        ;; Fontification.
>>      (setq-local treesit-font-lock-settings js--treesit-font-lock-settings)
> 


Actually, would it make sense to define sexp as “anything but some very small punctuation and delimiters”? I changed the definition of c-ts-mode-sexp-type-regexp to that (see bug#62302). It seems to work just fine. Of course, if there are problems we can revert.

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 07:31:01 GMT) Full text and rfc822 format available.

Message #58 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org, casouri <at> gmail.com, theo <at> thornhill.no,
 juri <at> linkov.net
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 10:30:51 +0300

> Cc: 62086 <at> debbugs.gnu.org, Yuan Fu <casouri <at> gmail.com>,
>  Theodor Thornhill <theo <at> thornhill.no>
> Date: Wed, 12 Apr 2023 02:30:19 +0300
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> 
> On 11/04/2023 19:53, Juri Linkov wrote:
> > I don't know if opening a new bug report is needed.
> > Actually I'm doing the same thing for more ts-modes -
> > trying to find a set of node names that match parens/brackets.
> > So maybe this patch makes sense too:
> 
> These look sensible to me.
> 
> I think we should give a chance to the authors to chime in, though.
> 
> > treesit-sexp-type-regexp.patch
> > 
> > diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
> > index d773b4a41f4..e55d26177af 100644
> > --- a/lisp/progmodes/c-ts-mode.el
> > +++ b/lisp/progmodes/c-ts-mode.el
> > @@ -927,7 +927,9 @@ c-ts-base-mode
> >                               "qualifier"
> >                               "type"
> >                               "parameter"
> > -                            "expression"
> > +                            ;; "expression"
> > +                            "argument_list"
> > +                            "identifier"
> >                               "literal"
> >                               "string")))

Can someone please tell which problem(s) this is supposed to fix, and
on what branch?  This bug report has "29.0.60" in the title, but it
starts with describing what happens on master.  On the emacs-29 branch
C-M-f doesn't use treesit capabilities, at least not in c-ts-mode.  So
I'm confused regarding the scope and the purpose of this proposal.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 15:32:01 GMT) Full text and rfc822 format available.

Message #61 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 62086 <at> debbugs.gnu.org, Theodor Thornhill <theo <at> thornhill.no>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 18:31:25 +0300

On 12/04/2023 10:05, Yuan Fu wrote:
> Actually, would it make sense to define sexp as “anything but some very small punctuation and delimiters”?

Pretty much. If I understood you correctly.

E.g. in ruby-ts-mode identifiers and numbers are also sexps.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 15:32:01 GMT) Full text and rfc822 format available.

Message #64 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 62086 <at> debbugs.gnu.org, casouri <at> gmail.com, theo <at> thornhill.no,
 juri <at> linkov.net
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 18:31:46 +0300

On 12/04/2023 10:30, Eli Zaretskii wrote:
> Can someone please tell which problem(s) this is supposed to fix, and
> on what branch?  This bug report has "29.0.60" in the title, but it
> starts with describing what happens on master.  On the emacs-29 branch
> C-M-f doesn't use treesit capabilities, at least not in c-ts-mode.  So
> I'm confused regarding the scope and the purpose of this proposal.

Indeed, it's only for master.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 15:40:02 GMT) Full text and rfc822 format available.

Message #67 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org, casouri <at> gmail.com, theo <at> thornhill.no,
 juri <at> linkov.net
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 18:40:05 +0300

> Date: Wed, 12 Apr 2023 18:31:46 +0300
> Cc: 62086 <at> debbugs.gnu.org, casouri <at> gmail.com, theo <at> thornhill.no,
>  juri <at> linkov.net
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> 
> On 12/04/2023 10:30, Eli Zaretskii wrote:
> > Can someone please tell which problem(s) this is supposed to fix, and
> > on what branch?  This bug report has "29.0.60" in the title, but it
> > starts with describing what happens on master.  On the emacs-29 branch
> > C-M-f doesn't use treesit capabilities, at least not in c-ts-mode.  So
> > I'm confused regarding the scope and the purpose of this proposal.
> 
> Indeed, it's only for master.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 20:14:02 GMT) Full text and rfc822 format available.

Message #70 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 62086 <at> debbugs.gnu.org, Theodor Thornhill <theo <at> thornhill.no>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 23:13:02 +0300

On 12/04/2023 18:31, Dmitry Gutov wrote:
> On 12/04/2023 10:05, Yuan Fu wrote:
>> Actually, would it make sense to define sexp as “anything but some 
>> very small punctuation and delimiters”?
> 
> Pretty much. If I understood you correctly.
> 
> E.g. in ruby-ts-mode identifiers and numbers are also sexps.

Allow me to update that.

From the previous threads, for ruby-ts-mode at least, we seem to have
concluded that it's best to treat as sexps those nodes whose boundaries
are visible and don't coincide exactly with the boundaries of the
nodes they contain.

For example, we now exclude statement nodes and binary expression nodes
because both make forward/backward-sexp less obvious and predictable:
you move point to the beginning of 'a + b', press C-M-f, and if the jump
happens over the whole expression, this is just as likely to mismatch
the user's intention (they might have wanted to jump only over 'a'). So
these are the nodes we rule out.

The easiest choice would be to go back to treating only
braces/brackets/parens as sexp delimiters, but in Ruby, at least, we
have lots of constructs that are delimited with keywords (such as 'if',
'def', 'end'), so that doesn't work. Maybe it'll work better in C/C++,
where you mostly need to be able to differentiate between different
types of angle brackets.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 21:51:03 GMT) Full text and rfc822 format available.

Message #73 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org, Theodor Thornhill <theo <at> thornhill.no>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 14:50:04 -0700


> On Apr 12, 2023, at 1:13 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> 
> On 12/04/2023 18:31, Dmitry Gutov wrote:
>> On 12/04/2023 10:05, Yuan Fu wrote:
>>> Actually, would it make sense to define sexp as “anything but some very small punctuation and delimiters”?
>> Pretty much. If I understood you correctly.
>> E.g. in ruby-ts-mode identifiers and numbers are also sexps.
> 
> Allow me to update that.
> 
> From the previous threads, for ruby-ts-mode at least, we seem to have concluded that it's best to treat those nodes as sexps which have visible boundaries that are visible and don't overlay exactly the boundaries of the contained nodes.
> 
> For example, we now exclude statement nodes and binary expression nodes because both make forward/backward-sexp less obvious and predictable: you move point to the beginning of 'a + b', press C-M-f, and if the jump happens over the whole expression, this is just as likely to mismatch the user's intention (which might have wanted to only jump over 'a'). So these are the node we rule out.

The user might just as well want to move over the whole expression, since they can use forward-word if they want to move over smaller elements. But I guess that’s just personal preference.

> The easiest choice would be to go back to treating only braces/brackets/parens are sexp delimiters, but in Ruby, at least, we have lots of constructs that are delimited with keywords (such as 'if', 'def', 'end'), so that doesn't work. Maybe it'll work better in C/C++, where you mostly need to be able to differentiate between different types of angle brackets.

To clarify, my point is to define sexp by exclusion rather than inclusion, ie, defining a set of nodes that are not sexp, rather than defining a set of nodes that are sexp. I mentioned delimiters because they are excluded from sexp, not because they delimit sexp.

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 21:57:02 GMT) Full text and rfc822 format available.

Message #76 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 62086 <at> debbugs.gnu.org, Theodor Thornhill <theo <at> thornhill.no>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Thu, 13 Apr 2023 00:56:33 +0300

On 13/04/2023 00:50, Yuan Fu wrote:
> 
> 
>> On Apr 12, 2023, at 1:13 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>>
>> On 12/04/2023 18:31, Dmitry Gutov wrote:
>>> On 12/04/2023 10:05, Yuan Fu wrote:
>>>> Actually, would it make sense to define sexp as “anything but some very small punctuation and delimiters”?
>>> Pretty much. If I understood you correctly.
>>> E.g. in ruby-ts-mode identifiers and numbers are also sexps.
>>
>> Allow me to update that.
>>
>>  From the previous threads, for ruby-ts-mode at least, we seem to have concluded that it's best to treat those nodes as sexps which have visible boundaries that are visible and don't overlay exactly the boundaries of the contained nodes.
>>
>> For example, we now exclude statement nodes and binary expression nodes because both make forward/backward-sexp less obvious and predictable: you move point to the beginning of 'a + b', press C-M-f, and if the jump happens over the whole expression, this is just as likely to mismatch the user's intention (which might have wanted to only jump over 'a'). So these are the node we rule out.
> 
> User might as well want to move over the whole expression, since they can use forward-word if they want to move over smaller elements. But I guess that’s just personal preferences.

forward-word works for minor elements, but the sub-expression can be, 
for example, a parenthesized expression (with "real" parens).

It's definitely something that can be discussed, but the above guideline
seems to me like something that puts the user more in control. Because
as handy as jumping over statements can be, it's usually not what one is
trying to do.

>> The easiest choice would be to go back to treating only braces/brackets/parens are sexp delimiters, but in Ruby, at least, we have lots of constructs that are delimited with keywords (such as 'if', 'def', 'end'), so that doesn't work. Maybe it'll work better in C/C++, where you mostly need to be able to differentiate between different types of angle brackets.
> 
> To clarify, my point is to define sexp by exclusion rather than inclusion, ie, defining a set of nodes that are not sexp, rather than defining a set of nodes that are sexp. I mentioned delimiters because they are excluded from sexp, not because they delimit sexp.

Yes, that can work. Only when the excluded type names are one character
long, though, because Elisp regexps have no lookahead. In ruby-ts-mode
there are longer excluded types.
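
To make the limitation concrete, here is a minimal sketch. All `my/` names and the excluded types are hypothetical, and plain strings stand in for node types. A regexp can say "any type except these single characters", but a longer excluded name such as "end" slips through, while a predicate function handles names of any length.

```elisp
(require 'rx)

;; Exclusion by regexp: a type matches unless it is one of these
;; single characters.  Any type of two or more characters matches.
(defvar my/not-punct-re
  (rx bos (or (>= 2 nonl) (not (any "()[]{},;"))) eos))

(string-match-p my/not-punct-re "(")          ; nil, excluded
(string-match-p my/not-punct-re "identifier") ; non-nil
(string-match-p my/not-punct-re "end")        ; non-nil, slips through

;; Exclusion by predicate: works for names of any length.
(defvar my/excluded-types '("(" ")" "[" "]" "{" "}" "," ";" "end"))

(defun my/sexp-type-p (type)
  "Return non-nil if node TYPE (a string) is not in `my/excluded-types'."
  (not (member type my/excluded-types)))
```

In a real mode the predicate would receive a tree-sitter node rather than a string, so it would first obtain the type with `treesit-node-type`.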

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Wed, 12 Apr 2023 22:13:01 GMT) Full text and rfc822 format available.

Message #79 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org, Theodor Thornhill <theo <at> thornhill.no>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Wed, 12 Apr 2023 15:11:52 -0700


> On Apr 12, 2023, at 2:56 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> 
> On 13/04/2023 00:50, Yuan Fu wrote:
>>> On Apr 12, 2023, at 1:13 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>>> 
>>> On 12/04/2023 18:31, Dmitry Gutov wrote:
>>>> On 12/04/2023 10:05, Yuan Fu wrote:
>>>>> Actually, would it make sense to define sexp as “anything but some very small punctuation and delimiters”?
>>>> Pretty much. If I understood you correctly.
>>>> E.g. in ruby-ts-mode identifiers and numbers are also sexps.
>>> 
>>> Allow me to update that.
>>> 
>>> From the previous threads, for ruby-ts-mode at least, we seem to have concluded that it's best to treat those nodes as sexps which have visible boundaries that are visible and don't overlay exactly the boundaries of the contained nodes.
>>> 
>>> For example, we now exclude statement nodes and binary expression nodes because both make forward/backward-sexp less obvious and predictable: you move point to the beginning of 'a + b', press C-M-f, and if the jump happens over the whole expression, this is just as likely to mismatch the user's intention (which might have wanted to only jump over 'a'). So these are the node we rule out.
>> User might as well want to move over the whole expression, since they can use forward-word if they want to move over smaller elements. But I guess that’s just personal preferences.
> 
> forward-word works for minor elements, but the sub-expression can be, for example, a parenthesized expression (with "real" parens).
> 
> It's definitely something that can be discussed, but the above guideline seems to me like something that puts the user more in control. Because as handy jumping over statements can be, it's usually not what one is trying to do.
> 
>>> The easiest choice would be to go back to treating only braces/brackets/parens are sexp delimiters, but in Ruby, at least, we have lots of constructs that are delimited with keywords (such as 'if', 'def', 'end'), so that doesn't work. Maybe it'll work better in C/C++, where you mostly need to be able to differentiate between different types of angle brackets.
>> To clarify, my point is to define sexp by exclusion rather than inclusion, ie, defining a set of nodes that are not sexp, rather than defining a set of nodes that are sexp. I mentioned delimiters because they are excluded from sexp, not because they delimit sexp.
> 
> Yes, that can work. Only when the excluded type names a one-char long, though, because Elisp has no lookahead. In ruby-ts-mode there are longer excluded types.

Actually, I’m working on extending the “pattern” that treesit-search-forward and friends can accept. Right now it has to be a regexp or a predicate function. I plan to extend it to: regexp | function | (regexp . function) | (or <pattern>…) | (not <pattern>…) | (verbatim string)

I’m not yet sure about the performance implications of the recursive patterns (or and not). And I’m not sure if verbatim is necessary, but I guess having it wouldn’t hurt.

Yuan
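
As a rough illustration of how such recursive patterns could be interpreted, here is a sketch of a matcher. It is not the treesit implementation: `my/pattern-match-p` is a hypothetical name, it works on node type strings instead of nodes, and it leaves out the `verbatim` form.

```elisp
(require 'cl-lib)

(defun my/pattern-match-p (pattern type)
  "Return t if node TYPE (a string) satisfies PATTERN.
Hypothetical sketch; a real predicate would receive a node, not a string."
  (pcase pattern
    ;; A plain regexp.
    ((pred stringp) (and (string-match-p pattern type) t))
    ;; (or PATTERN...): any sub-pattern may match.
    (`(or . ,pats)
     (and (cl-some (lambda (p) (my/pattern-match-p p type)) pats) t))
    ;; (not PATTERN): negate the sub-pattern.
    (`(not ,pat) (not (my/pattern-match-p pat type)))
    ;; (REGEXP . FUNCTION): both must agree.
    (`(,(and re (pred stringp)) . ,(and fn (pred functionp)))
     (and (string-match-p re type) (funcall fn type) t))
    ;; A bare predicate function.
    ((pred functionp) (and (funcall pattern type) t))
    (_ (error "Unknown pattern: %S" pattern))))

;; "Everything except an opening paren and `end'":
(my/pattern-match-p '(not (or "\\`(\\'" "\\`end\\'")) "identifier")
(my/pattern-match-p '(not (or "\\`(\\'" "\\`end\\'")) "end")
```

Each `or` or `not` adds one recursive call per node tested, which is where the performance question above would come from.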

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Thu, 13 Apr 2023 17:49:02 GMT) Full text and rfc822 format available.

Message #82 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org, Yuan Fu <casouri <at> gmail.com>,
 Theodor Thornhill <theo <at> thornhill.no>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Thu, 13 Apr 2023 20:42:27 +0300

> The easiest choice would be to go back to treating only
> braces/brackets/parens are sexp delimiters, but in Ruby, at least, we have
> lots of constructs that are delimited with keywords (such as 'if', 'def',
> 'end'), so that doesn't work. Maybe it'll work better in C/C++, where you
> mostly need to be able to differentiate between different types of angle
> brackets.

Ideally, all previously supported pairs of braces/brackets/parens and
symbols should be navigated with 'C-M-f' plus a very small number
of constructs with "implicit parens" such as do...end, def...end, etc.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Fri, 14 Apr 2023 17:08:02 GMT) Full text and rfc822 format available.

Message #85 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org, Yuan Fu <casouri <at> gmail.com>,
 Theodor Thornhill <theo <at> thornhill.no>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Fri, 14 Apr 2023 20:03:43 +0300

>> The easiest choice would be to go back to treating only
>> braces/brackets/parens are sexp delimiters, but in Ruby, at least, we have
>> lots of constructs that are delimited with keywords (such as 'if', 'def',
>> 'end'), so that doesn't work. Maybe it'll work better in C/C++, where you
>> mostly need to be able to differentiate between different types of angle
>> brackets.
>
> Ideally, all previously supported pairs of braces/brackets/parens and
> symbols should be navigated with 'C-M-f' plus a very small number
> of constructs with "implicit parens" such as do...end, def...end, etc.

Plus braces/brackets/parens inside comments and strings.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#62086; Package emacs. (Sat, 15 Apr 2023 00:10:01 GMT) Full text and rfc822 format available.

Message #88 received at 62086 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 62086 <at> debbugs.gnu.org, Theodor Thornhill <theo <at> thornhill.no>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#62086: 29.0.60; ruby-ts-mode regressions
Date: Fri, 14 Apr 2023 17:08:58 -0700


> On Apr 12, 2023, at 3:11 PM, Yuan Fu <casouri <at> gmail.com> wrote:
> 
> 
> 
>> On Apr 12, 2023, at 2:56 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>> 
>> On 13/04/2023 00:50, Yuan Fu wrote:
>>>> On Apr 12, 2023, at 1:13 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
>>>> 
>>>> On 12/04/2023 18:31, Dmitry Gutov wrote:
>>>>> On 12/04/2023 10:05, Yuan Fu wrote:
>>>>>> Actually, would it make sense to define sexp as “anything but some very small punctuation and delimiters”?
>>>>> Pretty much. If I understood you correctly.
>>>>> E.g. in ruby-ts-mode identifiers and numbers are also sexps.
>>>> 
>>>> Allow me to update that.
>>>> 
>>>> From the previous threads, for ruby-ts-mode at least, we seem to have concluded that it's best to treat those nodes as sexps which have visible boundaries that are visible and don't overlay exactly the boundaries of the contained nodes.
>>>> 
>>>> For example, we now exclude statement nodes and binary expression nodes because both make forward/backward-sexp less obvious and predictable: you move point to the beginning of 'a + b', press C-M-f, and if the jump happens over the whole expression, this is just as likely to mismatch the user's intention (which might have wanted to only jump over 'a'). So these are the node we rule out.
>>> User might as well want to move over the whole expression, since they can use forward-word if they want to move over smaller elements. But I guess that’s just personal preferences.
>> 
>> forward-word works for minor elements, but the sub-expression can be, for example, a parenthesized expression (with "real" parens).
>> 
>> It's definitely something that can be discussed, but the above guideline seems to me like something that puts the user more in control. Because as handy jumping over statements can be, it's usually not what one is trying to do.
>> 
>>>> The easiest choice would be to go back to treating only braces/brackets/parens are sexp delimiters, but in Ruby, at least, we have lots of constructs that are delimited with keywords (such as 'if', 'def', 'end'), so that doesn't work. Maybe it'll work better in C/C++, where you mostly need to be able to differentiate between different types of angle brackets.
>>> To clarify, my point is to define sexp by exclusion rather than inclusion, ie, defining a set of nodes that are not sexp, rather than defining a set of nodes that are sexp. I mentioned delimiters because they are excluded from sexp, not because they delimit sexp.
>> 
>> Yes, that can work. Only when the excluded type names a one-char long, though, because Elisp has no lookahead. In ruby-ts-mode there are longer excluded types.
> 
> Actually, I’m working on extending the “pattern” treesit-search-forward and friends can accept. Right now it has to be a regexp or a pred function. I plan to extend it to regexp | function | (regexp . function) | (or <pattern>…) | (not <pattern>…) | (verbatim string)
> 
> I’m not yet sure about the performance implication of the recursive patterns (or and not). And I’m not sure if verbatim is necessary, but I guess having it wouldn’t hurt.
> 
> Yuan

Ok, I added experimental support for those patterns (except for verbatim) and a central place to define things: treesit-thing-settings. If you define a 'block in treesit-thing-settings, you can use 'block for treesit-thing-at-point, treesit-beginning-of-thing, etc.

Yuan
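
A sketch of what such a definition might look like in a major mode. It assumes the alist shape (LANGUAGE (THING PATTERN)...) that treesit-thing-settings has in later Emacs versions; the support was still experimental when this message was written, and the excluded node types below are examples only, not the settings of any real mode.

```elisp
;; Hypothetical settings: define `sexp' for Ruby by exclusion, using
;; the (not PATTERN) form with an anchored regexp of excluded types.
(setq-local treesit-thing-settings
            `((ruby
               (sexp (not ,(rx bos
                               (or "(" ")" "[" "]" "{" "}" "," "end")
                               eos))))))
```

Because the pattern is negated as a whole, multi-character names such as "end" can be excluded without any regexp lookahead.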

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 13 May 2023 11:24:08 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 52 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
