GNU bug report logs - #73978
31.0.50; Text syntax applied on too many things in tsx-ts-mode

Package: emacs;

Reported by: Yuan Fu <casouri <at> gmail.com>

Date: Thu, 24 Oct 2024 04:08:02 UTC

Severity: normal

Found in version 31.0.50

Done: Yuan Fu <casouri <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 73978 in the body.
You can then email your comments to 73978 AT debbugs.gnu.org in the normal way.

Report forwarded to theo <at> thornhill.no, bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Thu, 24 Oct 2024 04:08:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Yuan Fu <casouri <at> gmail.com>:
New bug report received and forwarded. Copy sent to theo <at> thornhill.no, bug-gnu-emacs <at> gnu.org. (Thu, 24 Oct 2024 04:08:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Bug Report Emacs <bug-gnu-emacs <at> gnu.org>
Subject: 31.0.50; Text syntax applied on too many things in tsx-ts-mode
Date: Wed, 23 Oct 2024 21:06:40 -0700

X-Debbugs-CC: theo <at> thornhill.no

In tsx-ts-mode we use this query to apply syntax properties:


(defvar tsx-ts--s-p-query
  (when (treesit-available-p)
    (treesit-query-compile 'tsx
                           '(((regex pattern: (regex_pattern) @regexp))
                             ((variable_declarator value: (jsx_element) @jsx))
                             ((assignment_expression right: (jsx_element) @jsx))
                             ((arguments (jsx_element) @jsx))
                             ((parenthesized_expression (jsx_element) @jsx))
                             ((return_statement (jsx_element) @jsx))))))


And then in tsx-ts--syntax-propertize-captures we mark everything
enclosed by the captured jsx_element nodes in text fences.

Then for the following code

<button onClick={() => {
  func();
  return true;
}}>
  Text
  {func();}
</button>

All the func() and other code will be considered text because the whole
jsx tag (<button>...</button>) is wrapped in string fences. Theo,
what’s the original intention for marking jsx_elements as text? Can we
only mark jsx_text as string?

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sat, 09 Nov 2024 09:12:01 GMT) Full text and rfc822 format available.

Message #8 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>, theo <at> thornhill.no
Cc: 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50;
 Text syntax applied on too many things in tsx-ts-mode
Date: Sat, 09 Nov 2024 11:11:14 +0200

Ping! Theo, can you answer Yuan's questions?

> Cc: theo <at> thornhill.no
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Wed, 23 Oct 2024 21:06:40 -0700
> 
> X-Debbugs-CC: theo <at> thornhill.no
> 
> In tsx-ts-mode we use this query to apply syntax properties:
> 
> 
> (defvar tsx-ts--s-p-query
>   (when (treesit-available-p)
>     (treesit-query-compile 'tsx
>                            '(((regex pattern: (regex_pattern) @regexp))
>                              ((variable_declarator value: (jsx_element) @jsx))
>                              ((assignment_expression right: (jsx_element) @jsx))
>                              ((arguments (jsx_element) @jsx))
>                              ((parenthesized_expression (jsx_element) @jsx))
>                              ((return_statement (jsx_element) @jsx))))))
> 
> 
> And then in tsx-ts--syntax-propertize-captures we mark everything
> enclosed by the captured jsx_element nodes in text fences.
> 
> Then for the following code
> 
> <button onClick={() => {
>   func();
>   return true;
> }}>
>   Text
>   {func();}
> </button>
> 
> All the func() and other code will be considered text because the whole
> jsx tag (<button>...</button>) is wrapped in string fences. Theo,
> what’s the original intention for marking jsx_elements as text? Can we
> only mark jsx_text as string?
> 
> Yuan
> 
> 
> 
> 
>

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sat, 09 Nov 2024 16:52:01 GMT) Full text and rfc822 format available.

Message #11 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Theodor Thornhill <theo <at> thornhill.no>, 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sat, 9 Nov 2024 08:49:55 -0800


> On Nov 9, 2024, at 1:11 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> Ping! Theo, can you answer Yuan's questions?

I’ve been using my local fix at work for a while now and it seems to work fine. I’ll make a patch and apply in a few days.

Yuan

> 
>> Cc: theo <at> thornhill.no
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Wed, 23 Oct 2024 21:06:40 -0700
>> 
>> X-Debbugs-CC: theo <at> thornhill.no
>> 
>> In tsx-ts-mode we use this query to apply syntax properties:
>> 
>> 
>> (defvar tsx-ts--s-p-query
>>  (when (treesit-available-p)
>>    (treesit-query-compile 'tsx
>>                           '(((regex pattern: (regex_pattern) @regexp))
>>                             ((variable_declarator value: (jsx_element) @jsx))
>>                             ((assignment_expression right: (jsx_element) @jsx))
>>                             ((arguments (jsx_element) @jsx))
>>                             ((parenthesized_expression (jsx_element) @jsx))
>>                             ((return_statement (jsx_element) @jsx))))))
>> 
>> 
>> And then in tsx-ts--syntax-propertize-captures we mark everything
>> enclosed by the captured jsx_element nodes in text fences.
>> 
>> Then for the following code
>> 
>> <button onClick={() => {
>>  func();
>>  return true;
>> }}>
>>  Text
>>  {func();}
>> </button>
>> 
>> All the func() and other code will be considered text because the whole
>> jsx tag (<button>...</button>) is wrapped in string fences. Theo,
>> what’s the original intention for marking jsx_elements as text? Can we
>> only mark jsx_text as string?
>> 
>> Yuan
>> 
>> 
>> 
>> 
>>

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sat, 23 Nov 2024 12:16:02 GMT) Full text and rfc822 format available.

Message #14 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: theo <at> thornhill.no, 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sat, 23 Nov 2024 14:15:07 +0200

> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 9 Nov 2024 08:49:55 -0800
> Cc: Theodor Thornhill <theo <at> thornhill.no>,
>  73978 <at> debbugs.gnu.org
> 
> 
> 
> > On Nov 9, 2024, at 1:11 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > 
> > Ping! Theo, can you answer Yuan's questions?
> 
> I’ve been using my local fix at work for a while now and it seems to work fine. I’ll make a patch and apply in a few days.

Did you have an opportunity to install such a patch?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sun, 24 Nov 2024 05:28:01 GMT) Full text and rfc822 format available.

Message #17 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Theodor Thornhill <theo <at> thornhill.no>, 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sat, 23 Nov 2024 21:25:50 -0800

> On Nov 23, 2024, at 4:15 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Sat, 9 Nov 2024 08:49:55 -0800
>> Cc: Theodor Thornhill <theo <at> thornhill.no>,
>> 73978 <at> debbugs.gnu.org
>> 
>> 
>> 
>>> On Nov 9, 2024, at 1:11 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>>> 
>>> Ping! Theo, can you answer Yuan's questions?
>> 
>> I’ve been using my local fix at work for a while now and it seems to work fine. I’ll make a patch and apply in a few days.
> 
> Did you have an opportunity to install such a patch?

Hey sorry, I haven’t applied the patch. Actually, I want to ask you a question before I do: is there a way to mark a single character in buffer in string syntax? The only way I’m aware of is to mark string delimiter syntax to the start and end of the string, but that doesn’t work for a single character.

Take the following snippet as an example:

<button>a</button>

I want to apply string syntax to “a”.

If there’s no such way, I guess just not applying the string syntax in such case is also an option.

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sun, 24 Nov 2024 07:48:01 GMT) Full text and rfc822 format available.

Message #20 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: theo <at> thornhill.no, 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sun, 24 Nov 2024 09:47:37 +0200

> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 23 Nov 2024 21:25:50 -0800
> Cc: Theodor Thornhill <theo <at> thornhill.no>,
>  73978 <at> debbugs.gnu.org
> 
> >> I’ve been using my local fix at work for a while now and it seems to work fine. I’ll make a patch and apply in a few days.
> > 
> > Did you have an opportunity to install such a patch?
> 
> Hey sorry, I haven’t applied the patch. Actually, I want to ask you a question before I do: is there a way to mark a single character in buffer in string syntax? The only way I’m aware of is to mark string delimiter syntax to the start and end of the string, but that doesn’t work for a single character.
> 
> Take the following snippet as an example:
> 
> <button>a</button>
> 
> I want to apply string syntax to “a”.
> 
> If there’s no such way, I guess just not applying the string syntax in such case is also an option.

There's a syntax-table text property, see the node "Syntax Properties"
in the ELisp manual.  Would that do the job?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sun, 24 Nov 2024 13:46:01 GMT) Full text and rfc822 format available.

Message #23 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Yuan Fu <casouri <at> gmail.com>
Cc: theo <at> thornhill.no, 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sun, 24 Nov 2024 15:45:10 +0200

On 24/11/2024 09:47, Eli Zaretskii wrote:
>> Hey sorry, I haven’t applied the patch. Actually, I want to ask you a question before I do: is there a way to mark a single character in buffer in string syntax? The only way I’m aware of is to mark string delimiter syntax to the start and end of the string, but that doesn’t work for a single character.
>>
>> Take the following snippet as an example:
>>
>> <button>a</button>
>>
>> I want to apply string syntax to “a”.
>>
>> If there’s no such way, I guess just not applying the string syntax in such case is also an option.
> There's a syntax-table text property, see the node "Syntax Properties"
> in the ELisp manual.  Would that do the job?

In particular, the "generic string" syntax property, this one

  (string-to-syntax "|")

You put it on the first and the last chars of a "generic string".
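
A minimal sketch of the generic string fence approach described above,
assuming a plain temporary buffer rather than tsx-ts-mode.  The helper
name my/fence-demo-string-state is hypothetical; string-to-syntax,
put-text-property, parse-sexp-lookup-properties and parse-partial-sexp
are standard Emacs Lisp.

```elisp
;; Hypothetical helper for illustration; not part of typescript-ts-mode.
(defun my/fence-demo-string-state (pos)
  "Return the string state at POS in a demo buffer with fenced text.
The buffer holds \"<b>(text</b>\", where \"(text\" spans positions 4 to 8."
  (with-temp-buffer
    (insert "<b>(text</b>")
    ;; Make the parser honor `syntax-table' text properties.
    (setq-local parse-sexp-lookup-properties t)
    ;; Generic string fences on the first and last characters of the text.
    (put-text-property 4 5 'syntax-table (string-to-syntax "|"))
    (put-text-property 8 9 'syntax-table (string-to-syntax "|"))
    ;; Element 3 of the parse state is non-nil inside a string.
    (nth 3 (parse-partial-sexp (point-min) pos))))

(my/fence-demo-string-state 6)  ; non-nil: inside the fenced text
(my/fence-demo-string-state 9)  ; nil: just after the closing fence
```

Because the "(" carries the fence property, it opens a string instead of
a parenthesis, so it should not affect paren balancing after the text.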

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Mon, 25 Nov 2024 01:29:01 GMT) Full text and rfc822 format available.

Message #26 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sun, 24 Nov 2024 17:27:43 -0800


> On Nov 24, 2024, at 5:45 AM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
> On 24/11/2024 09:47, Eli Zaretskii wrote:
>>> Hey sorry, I haven’t applied the patch. Actually, I want to ask you a question before I do: is there a way to mark a single character in buffer in string syntax? The only way I’m aware of is to mark string delimiter syntax to the start and end of the string, but that doesn’t work for a single character.
>>> 
>>> Take the following snippet as an example:
>>> 
>>> <button>a</button>
>>> 
>>> I want to apply string syntax to “a”.
>>> 
>>> If there’s no such way, I guess just not applying the string syntax in such case is also an option.
>> There's a syntax-table text property, see the node "Syntax Properties"
>> in the ELisp manual.  Would that do the job?
> 
> In particular, the "generic string" syntax property, this one
> 
>  (string-to-syntax "|")
> 
> You put it on the first and the last chars of a "generic string”.

The problem is, that doesn’t work when there’s only one character. Take the snippet as an example:

<button>a</button>

You can’t put the string fence syntax on the “a”, because there isn’t a closing fence to close it.

Yuan
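
The single-character problem can be reproduced with a small sketch,
again assuming a temporary buffer and a hypothetical helper name
(my/single-char-fence-state).  With only one fence character, the parser
opens a string and never finds a second fence to close it:

```elisp
;; Hypothetical helper for illustration only.
(defun my/single-char-fence-state ()
  "Return the string state at the end of \"<b>a</b>\" with a fence on \"a\"."
  (with-temp-buffer
    (insert "<b>a</b>")
    (setq-local parse-sexp-lookup-properties t)
    ;; The lone "a" at position 4 is the only character we can fence.
    (put-text-property 4 5 'syntax-table (string-to-syntax "|"))
    ;; Non-nil here means the string opened at "a" is still open at
    ;; the end of the buffer.
    (nth 3 (parse-partial-sexp (point-min) (point-max)))))

(my/single-char-fence-state)  ; non-nil: the string never closes
```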

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Wed, 11 Dec 2024 04:54:02 GMT) Full text and rfc822 format available.

Message #29 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Tue, 10 Dec 2024 20:52:20 -0800


> On Nov 24, 2024, at 5:27 PM, Yuan Fu <casouri <at> gmail.com> wrote:
> 
> 
> 
>> On Nov 24, 2024, at 5:45 AM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>> 
>> On 24/11/2024 09:47, Eli Zaretskii wrote:
>>>> Hey sorry, I haven’t applied the patch. Actually, I want to ask you a question before I do: is there a way to mark a single character in buffer in string syntax? The only way I’m aware of is to mark string delimiter syntax to the start and end of the string, but that doesn’t work for a single character.
>>>> 
>>>> Take the following snippet as an example:
>>>> 
>>>> <button>a</button>
>>>> 
>>>> I want to apply string syntax to “a”.
>>>> 
>>>> If there’s no such way, I guess just not applying the string syntax in such case is also an option.
>>> There's a syntax-table text property, see the node "Syntax Properties"
>>> in the ELisp manual.  Would that do the job?
>> 
>> In particular, the "generic string" syntax property, this one
>> 
>> (string-to-syntax "|")
>> 
>> You put it on the first and the last chars of a "generic string”.
> 
> The problem is, that doesn’t work when there’s only one character. Take the snippet as an example:
> 
> <button>a</button>
> 
> You can’t put the string fence syntax on the “a”, because there isn’t a closing fence to close it.
> 
> Yuan

Circling back on this. I don’t think there’s a way to apply string syntax to a single character.

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Thu, 12 Dec 2024 02:53:02 GMT) Full text and rfc822 format available.

Message #32 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Thu, 12 Dec 2024 04:52:27 +0200

On 11/12/2024 06:52, Yuan Fu wrote:
>> The problem is, that doesn’t work when there’s only one character. Take the snippet as an example:
>>
>> <button>a</button>
>>
>> You can’t put the string fence syntax on the “a”, because there isn’t a closing fence to close it.
>>
>> Yuan
> Circling back on this. I don’t think there’s a way to apply string syntax to a single character.

Indeed, sorry.

There needs to be a separate char as a "closing fence" like you say 
because it's treated as a part of the string. So it's 2 chars minimum.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Thu, 12 Dec 2024 04:59:02 GMT) Full text and rfc822 format available.

Message #35 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Wed, 11 Dec 2024 20:56:55 -0800


> On Dec 11, 2024, at 6:52 PM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
> On 11/12/2024 06:52, Yuan Fu wrote:
>>> The problem is, that doesn’t work when there’s only one character. Take the snippet as an example:
>>> 
>>> <button>a</button>
>>> 
>>> You can’t put the string fence syntax on the “a”, because there isn’t a closing fence to close it.
>>> 
>>> Yuan
>> Circling back on this. I don’t think there’s a way to apply string syntax to a single character.
> 
> Indeed, sorry.
> 
> There needs to be a separate char as a "closing fence" like you say because it's treated as a part of the string. So it's 2 chars minimum.

How hard is it to add a new syntax for this case? Or is there some way to work around this? We can’t just not apply the string syntax, because if the “a” is a parenthesis, etc, it would mess up the parenthesis balancing after it.

Maybe just give it a whitespace syntax? 

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Thu, 12 Dec 2024 17:20:02 GMT) Full text and rfc822 format available.

Message #38 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Thu, 12 Dec 2024 19:19:19 +0200

On 12/12/2024 06:56, Yuan Fu wrote:
> How hard is it to add a new syntax for this case? Or is there some way to work around this? We can’t just not apply the string syntax, because if the “a” is a parenthesis, etc, it would mess up the parenthesis balancing after it.

Probably not very hard, but that seems like it'd affect the total set of 
syntax classes - which means adding it to the manual, etc.

> Maybe just give it a whitespace syntax?

Right, in such cases I applied the "whitespace" or "punctuation" syntax 
to the whole character span, like in 
https://github.com/dgutov/mmm-mode/blob/master/mmm-erb.el#L97

Reply sent to Yuan Fu <casouri <at> gmail.com>:
You have taken responsibility. (Fri, 13 Dec 2024 05:49:02 GMT) Full text and rfc822 format available.

Notification sent to Yuan Fu <casouri <at> gmail.com>:
bug acknowledged by developer. (Fri, 13 Dec 2024 05:49:02 GMT) Full text and rfc822 format available.

Message #43 received at 73978-done <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Theodor Thornhill <theo <at> thornhill.no>,
 73978-done <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Thu, 12 Dec 2024 21:47:41 -0800


> On Dec 12, 2024, at 9:19 AM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
> On 12/12/2024 06:56, Yuan Fu wrote:
>> How hard is it to add a new syntax for this case? Or is there some way to work around this? We can’t just not apply the string syntax, because if the “a” is a parenthesis, etc, it would mess up the parenthesis balancing after it.
> 
> Probably not very hard, but that seems like it'd affect the total set of syntax classes - which means adding it to the manual, etc.
> 
>> Maybe just give it a whitespace syntax?
> 
> Right, in such cases I applied the "whitespace" or "punctuation" syntax to the whole character span, like in https://github.com/dgutov/mmm-mode/blob/master/mmm-erb.el#L97

Thanks. I went with the whitespace trick and pushed my patch to master.

Yuan
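
A rough sketch of the whitespace fallback discussed above.  This is not
the patch that was pushed; my/mark-span-as-text and
my/paren-depth-after-marking are hypothetical names, and the choice to
fence spans of two or more characters is an assumption based on the
thread.

```elisp
;; Hypothetical helpers for illustration; not the code pushed to master.
(defun my/mark-span-as-text (beg end)
  "Give the text between BEG and END syntax that hides it from paren scans.
A one-character span gets whitespace syntax; longer spans get generic
string fences on their first and last characters."
  (cond
   ((<= end beg) nil)                   ; empty span: nothing to mark
   ((= (- end beg) 1)
    (put-text-property beg end 'syntax-table (string-to-syntax "-")))
   (t
    (put-text-property beg (1+ beg) 'syntax-table (string-to-syntax "|"))
    (put-text-property (1- end) end 'syntax-table (string-to-syntax "|")))))

(defun my/paren-depth-after-marking (text)
  "Insert TEXT between <b> and </b>, mark it, and return the final paren depth."
  (with-temp-buffer
    (insert "<b>" text "</b>")
    (setq-local parse-sexp-lookup-properties t)
    (my/mark-span-as-text 4 (+ 4 (length text)))
    ;; Element 0 of the parse state is the paren depth.
    (car (parse-partial-sexp (point-min) (point-max)))))

(my/paren-depth-after-marking "(")   ; 0: the lone paren is whitespace
(my/paren-depth-after-marking "(x")  ; 0: the paren is a string fence
```

Without any marking, the lone "(" would leave the depth at 1 and throw
off paren balancing for everything after it.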

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Tue, 24 Dec 2024 08:23:02 GMT) Full text and rfc822 format available.

Message #46 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: dmitry <at> gutov.dev, theo <at> thornhill.no, 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Tue, 24 Dec 2024 09:59:10 +0200

> I went with the whitespace trick and pushed my patch to master.

While testing forward-sexp in tsx-ts-mode I noticed that
this line in 'tsx-ts--s-p-query':

  ((jsx_text) @jsx)

disrupts syntax-based navigation for forward-sentence-default-function.

In an example like this:

import * as React from "react";
import * as ReactDOM from "react-dom";
ReactDOM.render(
<div>
<h1>Hello, Welcome to React and TypeScript</h1>
</div>,
  document.getElementById("root")
);

'C-M-b' on text inside <h1> stops after the first "H" in "Hello", and
'C-M-f' before the last "t" in "TypeScript".  It seems the first
and the last characters are interpreted as the opening/closing fence?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Tue, 24 Dec 2024 08:33:01 GMT) Full text and rfc822 format available.

Message #49 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: dmitry <at> gutov.dev, theo <at> thornhill.no, 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Tue, 24 Dec 2024 10:31:33 +0200

> While testing forward-sexp in tsx-ts-mode I noticed that
> this line in 'tsx-ts--s-p-query':
>
>   ((jsx_text) @jsx)
>
> disrupts syntax-based navigation for forward-sentence-default-function.

With this patch everything works perfectly:

diff --git a/lisp/progmodes/typescript-ts-mode.el b/lisp/progmodes/typescript-ts-mode.el
index 5c3c9a24ff4..01dd8297996 100644
--- a/lisp/progmodes/typescript-ts-mode.el
+++ b/lisp/progmodes/typescript-ts-mode.el
@@ -630,7 +640,8 @@ tsx-ts--s-p-query
   (when (treesit-available-p)
     (treesit-query-compile 'tsx
                            '(((regex pattern: (regex_pattern) @regexp))
-                             ((jsx_text) @jsx)))))
+                             ((jsx_opening_element) @jsx)
+                             ((jsx_closing_element) @jsx)))))
 
 (defun typescript-ts--syntax-propertize (beg end)
   (let ((captures (treesit-query-capture 'typescript typescript-ts--s-p-query beg end)))

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Tue, 24 Dec 2024 08:54:01 GMT) Full text and rfc822 format available.

Message #52 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Tue, 24 Dec 2024 00:52:24 -0800


> On Dec 24, 2024, at 12:31 AM, Juri Linkov <juri <at> linkov.net> wrote:
> 
>> While testing forward-sexp in tsx-ts-mode I noticed that
>> this line in 'tsx-ts--s-p-query':
>> 
>>  ((jsx_text) @jsx)
>> 
>> disrupts syntax-based navigation for forward-sentence-default-function.
> 
> With this patch everything works perfectly:
> 
> diff --git a/lisp/progmodes/typescript-ts-mode.el b/lisp/progmodes/typescript-ts-mode.el
> index 5c3c9a24ff4..01dd8297996 100644
> --- a/lisp/progmodes/typescript-ts-mode.el
> +++ b/lisp/progmodes/typescript-ts-mode.el
> @@ -630,7 +640,8 @@ tsx-ts--s-p-query
>   (when (treesit-available-p)
>     (treesit-query-compile 'tsx
>                            '(((regex pattern: (regex_pattern) @regexp))
> -                             ((jsx_text) @jsx)))))
> +                             ((jsx_opening_element) @jsx)
> +                             ((jsx_closing_element) @jsx)))))
> 
> (defun typescript-ts--syntax-propertize (beg end)
>   (let ((captures (treesit-query-capture 'typescript typescript-ts--s-p-query beg end)))

Thanks for looking into this! But what’s the intention of this change? In a snippet like this:

<button onClick={() => {
  func();
  return true;
}}>
  Text
  {func();}
</button>

Only the “Text” part should be marked as string. With the change you proposed, the <button …> and </button> part would be marked as string.

We must mark text as strings because they could include </>/(/) etc and mess with syntax-ppss. 

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Tue, 24 Dec 2024 17:29:02 GMT) Full text and rfc822 format available.

Message #55 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Tue, 24 Dec 2024 19:25:22 +0200

>> @@ -630,7 +640,8 @@ tsx-ts--s-p-query
>>   (when (treesit-available-p)
>>     (treesit-query-compile 'tsx
>>                            '(((regex pattern: (regex_pattern) @regexp))
>> -                             ((jsx_text) @jsx)))))
>> +                             ((jsx_opening_element) @jsx)
>> +                             ((jsx_closing_element) @jsx)))))
>> 
>> (defun typescript-ts--syntax-propertize (beg end)
>>   (let ((captures (treesit-query-capture 'typescript typescript-ts--s-p-query beg end)))
>
> Thanks for looking into this! But what’s the intention of this change?
> In a snippet like this:
>
> <button onClick={() => {
>   func();
>   return true;
> }}>
>   Text
>   {func();}
> </button>
>
> Only the “Text” part should be marked as string.  With the change you
> proposed, the <button …> and </button> part would be marked as string.

How could I see that text is marked as string?
I see no different fontification.

> We must mark text as strings because they could include </>/(/) etc
> and mess with syntax-ppss. 

With the updates in 'tsx-ts-mode' that I just pushed to master
please try in the following example:

ReactDOM.render(
<div>
<h1>Hello, Welcome to React and TypeScript</h1>
</div>,
  document.getElementById("root")
);

1. move point to the beginning of <h1>
2. type C-M-b
3. point incorrectly moves to inside <div>

However, with the above patch, point doesn't move,
which is correct.

Another test case:

1. move point to the beginning of the word "Welcome"
2. type C-M-b
3. point incorrectly moves to the letter "e" instead of correct "H"

4. type C-M-f a few times until the end of text
5. point stops at the letter "t" instead of moving after the last letter

All these cases work correctly with the patch above.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Tue, 24 Dec 2024 20:59:01 GMT) Full text and rfc822 format available.

Message #58 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Tue, 24 Dec 2024 12:57:15 -0800


> On Dec 24, 2024, at 9:25 AM, Juri Linkov <juri <at> linkov.net> wrote:
> 
>>> @@ -630,7 +640,8 @@ tsx-ts--s-p-query
>>>  (when (treesit-available-p)
>>>    (treesit-query-compile 'tsx
>>>                           '(((regex pattern: (regex_pattern) @regexp))
>>> -                             ((jsx_text) @jsx)))))
>>> +                             ((jsx_opening_element) @jsx)
>>> +                             ((jsx_closing_element) @jsx)))))
>>> 
>>> (defun typescript-ts--syntax-propertize (beg end)
>>>  (let ((captures (treesit-query-capture 'typescript typescript-ts--s-p-query beg end)))
>> 
>> Thanks for looking into this! But what’s the intention of this change?
>> In a snippet like this:
>> 
>> <button onClick={() => {
>>  func();
>>  return true;
>> }}>
>>  Text
>>  {func();}
>> </button>
>> 
>> Only the “Text” part should be marked as string.  With the change you
>> proposed, the <button …> and </button> part would be marked as string.
> 
> How could I see that text is marked as string?
> I see no different fontification.

It’s marked as string for syntax-ppss purpose, so that syntax-ppss skips it when scanning for balanced pairs. It’s not related to fontification.

> 
>> We must mark text as strings because they could include </>/(/) etc
>> and mess with syntax-ppss.
> 
> With the updates in 'tsx-ts-mode' that I just pushed to master
> please try in the following example:
> 
> ReactDOM.render(
> <div>
> <h1>Hello, Welcome to React and TypeScript</h1>
> </div>,
>  document.getElementById("root")
> );
> 
> 1. move point to the beginning of <h1>
> 2. type C-M-b
> 3. point incorrectly moves to inside <div>
> 
> However, with the above patch, point doesn't move,
> which is correct.
> 
> Another test case:
> 
> 1. move point to the beginning of the word "Welcome"
> 2. type C-M-b
> 3. point incorrectly moves to the letter "e" instead of correct "H"
> 
> 4. type C-M-f a few times until the end of text
> 5. point stops at the letter "t" instead of moving after the last letter
> 
> All these cases work correctly with the patch above.

I understand the problem you want to solve, but the patch above will bring back the bug I was trying to fix in the first place. 

And I still don’t understand the intention of your patch. Maybe I missed something. Am I correct that you want to apply string syntax on the tags, eg, <div>, <button>, </button>, </div>? 

Yuan
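
One way to check whether text is marked as a string, independent of
fontification, is to inspect the syntax-table property and the parse
state directly.  The sketch below assumes a temporary buffer; the helper
name my/syntax-info-at is hypothetical.

```elisp
;; Hypothetical helper for illustration only.
(defun my/syntax-info-at (pos)
  "Return a plist describing syntax-related state at POS.
:property is the raw `syntax-table' text property at POS, :face is the
`face' text property, and :in-string is non-nil when `syntax-ppss'
considers POS to be inside a string."
  (save-excursion
    (list :property (get-text-property pos 'syntax-table)
          :face (get-text-property pos 'face)
          :in-string (nth 3 (syntax-ppss pos)))))

(with-temp-buffer
  (insert "<b>(text</b>")
  (setq-local parse-sexp-lookup-properties t)
  (put-text-property 4 5 'syntax-table (string-to-syntax "|"))
  (put-text-property 8 9 'syntax-table (string-to-syntax "|"))
  ;; Position 6 has no face, yet the parser reports it as inside a string.
  (my/syntax-info-at 6))
```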

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Wed, 25 Dec 2024 08:11:02 GMT) Full text and rfc822 format available.

Message #61 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Wed, 25 Dec 2024 09:40:48 +0200

>> ReactDOM.render(
>> <div>
>> <h1>Hello, Welcome to React and TypeScript</h1>
>> </div>,
>>  document.getElementById("root")
>> );
>> 
>> 1. move point to the beginning of <h1>
>> 2. type C-M-b
>> 3. point incorrectly moves to inside <div>
>> 
>> However, with the above patch, point doesn't move,
>> which is correct.
>> 
>> Another test case:
>> 
>> 1. move point to the beginning of the word "Welcome"
>> 2. type C-M-b
>> 3. point incorrectly moves to the letter "e" instead of correct "H"
>> 
>> 4. type C-M-f a few times until the end of text
>> 5. point stops at the letter "t" instead of moving after the last letter
>> 
>> All these cases work correctly with the patch above.
>
> I understand the problem you want to solve, but the patch above will
> bring back the bug I was trying to fix in the first place.

The patch just demonstrated one of the possible ways to solve the problem.

> And I still don’t understand the intention of your patch.  Maybe
> I missed something.  Am I correct that you want to apply string syntax
> on the tags, e.g., <div>, <button>, </button>, </div>?

I don't need to apply string syntax on the tags.  I just found
that currently C-M-f navigation was broken.  Maybe there are
other ways to fix it?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Wed, 25 Dec 2024 08:35:01 GMT) Full text and rfc822 format available.

Message #64 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Wed, 25 Dec 2024 00:33:10 -0800


> On Dec 24, 2024, at 11:40 PM, Juri Linkov <juri <at> linkov.net> wrote:
> 
>>> ReactDOM.render(
>>> <div>
>>> <h1>Hello, Welcome to React and TypeScript</h1>
>>> </div>,
>>> document.getElementById("root")
>>> );
>>> 
>>> 1. move point to the beginning of <h1>
>>> 2. type C-M-b
>>> 3. point incorrectly moves to inside <div>
>>> 
>>> However, with the above patch, point doesn't move,
>>> which is correct.
>>> 
>>> Another test case:
>>> 
>>> 1. move point to the beginning of the word "Welcome"
>>> 2. type C-M-b
>>> 3. point incorrectly moves to the letter "e" instead of correct "H"
>>> 
>>> 4. type C-M-f a few times until the end of text
>>> 5. point stops at the letter "t" instead of moving after the last letter
>>> 
>>> All these cases work correctly with the patch above.
>> 
>> I understand the problem you want to solve, but the patch above will
>> bring back the bug I was trying to fix in the first place.
> 
> The patch just demonstrated one of the possible ways to solve the problem.
> 
>> And I still don’t understand the intention of your patch.  Maybe
>> I missed something.  Am I correct that you want to apply string syntax
>> on the tags, e.g., <div>, <button>, </button>, </div>?
> 
> I don't need to apply string syntax on the tags.  I just found
> that currently C-M-f navigation was broken.  Maybe there are
> other ways to fix it?

Then let’s look for other ways to solve the problem you demonstrated. Dmitry, would there be any negative effects if we apply the whitespace syntax on all the text (rather than string syntax)? Ah, I guess skip-syntax wouldn’t work right. Is there another way to tell syntax-ppss to skip a chunk of text when scanning?

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Thu, 26 Dec 2024 05:38:02 GMT) Full text and rfc822 format available.

Message #67 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Yuan Fu <casouri <at> gmail.com>, Juri Linkov <juri <at> linkov.net>
Cc: Theodor Thornhill <theo <at> thornhill.no>, 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Thu, 26 Dec 2024 07:37:17 +0200

On 25/12/2024 10:33, Yuan Fu wrote:
>> The patch just demonstrated one of the possible ways to solve the problem.
>>
>>> And I still don’t understand the intention of your patch.  Maybe
>>> I missed something.  Am I correct that you want to apply string syntax
>>> on the tags, e.g., <div>, <button>, </button>, </div>?
>> I don't need to apply string syntax on the tags.  I just found
>> that currently C-M-f navigation was broken.  Maybe there are
>> other ways to fix it?
> Then let’s look for other ways to solve the problem you demonstrated. Dmitry, would there be any negative effects if we apply the whitespace syntax on all the text (rather than string syntax)? Ah, I guess skip-syntax wouldn’t work right. Is there another way to tell syntax-ppss to skip a chunk of text when scanning?

Maybe not.

But I guess tsx-ts--syntax-propertize-captures could only apply syntax 
to specific characters inside the text - it would search for parens, 
brackets, (something else?), and put the "punctuation" syntax on them - 
that should play nicer with sexp/word/symbol navigation.

A bit more code, but OTOH we would drop the (eq ne (1+ ns)) distinction.
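
A minimal sketch of that idea, assuming a plain buffer region instead of tree-sitter captures. The helper name `my-jsx-punctuate-delimiters` is hypothetical and is not the function in typescript-ts-mode.el. It searches a span for bracket-like characters and gives each one punctuation syntax, so they cannot unbalance paren scanning, while the words in the text keep their normal syntax.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper for illustration only; the real mode would call
;; something like this with the start/end of each captured text node.
(defun my-jsx-punctuate-delimiters (start end)
  "Put punctuation syntax on bracket-like characters between START and END."
  (save-excursion
    (goto-char start)
    (while (re-search-forward (rx (any "(){}[]<>")) end t)
      (put-text-property (match-beginning 0) (match-end 0)
                         'syntax-table (string-to-syntax ".")))))

;; Usage: the stray ")" in the smiley no longer closes the outer paren.
(defun my-jsx-demo-forward-sexp ()
  "Return point after `forward-sexp' over a paren containing a stray \")\"."
  (with-temp-buffer
    (insert "(Hello :) world)")
    ;; Text properties are only honored when this variable is non-nil.
    (setq-local parse-sexp-lookup-properties t)
    ;; Propertize only the inner text, leaving the outer parens alone.
    (my-jsx-punctuate-delimiters 2 (1- (point-max)))
    (goto-char (point-min))
    (forward-sexp)
    (point)))
```

Because only individual characters are touched, there is no special case for a one-character span, which is what the (eq ne (1+ ns)) check handled before.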

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sat, 04 Jan 2025 19:59:02 GMT) Full text and rfc822 format available.

Message #70 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Theodor Thornhill <theo <at> thornhill.no>, 73978 <at> debbugs.gnu.org,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sat, 4 Jan 2025 11:58:02 -0800


> On Dec 25, 2024, at 9:37 PM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
> On 25/12/2024 10:33, Yuan Fu wrote:
>>> The patch just demonstrated one of the possible ways to solve the problem.
>>> 
>>>> And I still don’t understand the intention of your patch.  Maybe
>>>> I missed something.  Am I correct that you want to apply string syntax
>>>> on the tags, e.g., <div>, <button>, </button>, </div>?
>>> I don't need to apply string syntax on the tags.  I just found
>>> that currently C-M-f navigation was broken.  Maybe there are
>>> other ways to fix it?
>> Then let’s look for other ways to solve the problem you demonstrated. Dmitry, would there be any negative effects if we apply the whitespace syntax on all the text (rather than string syntax)? Ah, I guess skip-syntax wouldn’t work right. Is there another way to tell syntax-ppss to skip a chunk of text when scanning?
> 
> Maybe not.
> 
> But I guess tsx-ts--syntax-propertize-captures could only apply syntax to specific characters inside the text - it would search for parens, brackets, (something else?), and put the "punctuation" syntax on them - that should play nicer with sexp/word/symbol navigation.
> 
> A bit more code, but OTOH we would drop the (eq ne (1+ ns)) distinction.

That’s a good idea! I implemented this. Now forward-sexp should work as normal.

Yuan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sun, 05 Jan 2025 08:10:01 GMT) Full text and rfc822 format available.

Message #73 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sun, 05 Jan 2025 09:57:59 +0200

>> But I guess tsx-ts--syntax-propertize-captures could only apply syntax to
>> specific characters inside the text - it would search for parens,
>> brackets, (something else?), and put the "punctuation" syntax on them -
>> that should play nicer with sexp/word/symbol navigation.
>> 
>> A bit more code, but OTOH we would drop the (eq ne (1+ ns)) distinction.
>
> That’s a good idea! I implemented this. Now forward-sexp should work as normal.

Thanks, this is better.  Now forward-sexp correctly moves to the end of the string.

However, it still moves inside the tag, e.g. with point in

ReactDOM.render(
<div>
<h1>Hello, Welcome to React and TypeScript-!-</h1>
</div>,
  document.getElementById("root")
);

C-M-f moves point to

ReactDOM.render(
<div>
<h1>Hello, Welcome to React and TypeScript</h1-!->
</div>,
  document.getElementById("root")
);

But maybe this is a different problem.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sun, 05 Jan 2025 08:43:01 GMT) Full text and rfc822 format available.

Message #76 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sun, 05 Jan 2025 10:21:25 +0200

> ReactDOM.render(
> <div>
> <h1>Hello, Welcome to React and TypeScript-!-</h1>
> </div>,
>   document.getElementById("root")
> );
>
> C-M-f moves point to
>
> ReactDOM.render(
> <div>
> <h1>Hello, Welcome to React and TypeScript</h1-!->
> </div>,
>   document.getElementById("root")
> );
>
> But maybe this is a different problem.

This can be fixed by the following patch that copied the syntax of < and >
from sgml-make-syntax-table:

diff --git a/lisp/progmodes/typescript-ts-mode.el b/lisp/progmodes/typescript-ts-mode.el
index 09f29a4ac65..21672d2d9c1 100644
--- a/lisp/progmodes/typescript-ts-mode.el
+++ b/lisp/progmodes/typescript-ts-mode.el
@@ -692,7 +694,11 @@ tsx-ts--syntax-propertize-captures
                                      ne t)
              (put-text-property
               (match-beginning 0) (match-end 0)
-              'syntax-table (string-to-syntax ".")))))))))
+              'syntax-table (string-to-syntax
+                             (cond
+                              ((equal (match-string 0) "<") "(<")
+                              ((equal (match-string 0) ">") "(>")
+                              (t ".")))))))))))
 
 (if (treesit-ready-p 'tsx)
     (add-to-list 'auto-mode-alist '("\\.tsx\\'" . tsx-ts-mode)))
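
For reference, a small sketch of how paired-delimiter syntax on < and > lets forward-sexp treat a whole tag as one balanced unit. It assumes the entries used by sgml-make-syntax-table are "(>" for < and ")<" for >; the function name `my-tag-end-position` and the sample text are hypothetical, and a buffer-wide syntax table stands in for the text properties the mode applies.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper for illustration only.
(defun my-tag-end-position (text)
  "Insert TEXT in a temp buffer and return point after one `forward-sexp'."
  (let ((table (make-syntax-table)))
    ;; Assumed to mirror sgml-make-syntax-table:
    ;; < is an open delimiter whose partner is >,
    ;; > is a close delimiter whose partner is <.
    (modify-syntax-entry ?< "(>" table)
    (modify-syntax-entry ?> ")<" table)
    (with-temp-buffer
      (set-syntax-table table)
      (insert text)
      (goto-char (point-min))
      (forward-sexp)
      (point))))

;; Starting before "</h1>", one forward-sexp should stop right after ">"
;; rather than somewhere inside the tag.
(my-tag-end-position "</h1> tail")
```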

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sun, 05 Jan 2025 11:56:02 GMT) Full text and rfc822 format available.

Message #79 received at 73978-done <at> debbugs.gnu.org (full text, mbox):

From: Theodor Thornhill <theo <at> thornhill.no>
To: Yuan Fu <casouri <at> gmail.com>, Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 73978-done <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sun, 05 Jan 2025 12:55:11 +0100

Yuan Fu <casouri <at> gmail.com> writes:

>> On Dec 12, 2024, at 9:19 AM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>> 
>> On 12/12/2024 06:56, Yuan Fu wrote:
>>> How hard is it to add a new syntax for this case? Or is there some way to work around this? We can’t just not apply the string syntax, because if the “a” is a parenthesis, etc, it would mess up the parenthesis balancing after it.
>> 
>> Probably not very hard, but that seems like it'd affect the total set of syntax classes - which means adding it to the manual, etc.
>> 
>>> Maybe just give it a whitespace syntax?
>> 
>> Right, in such cases I applied the "whitespace" or "punctuation" syntax to the whole character span, like in https://github.com/dgutov/mmm-mode/blob/master/mmm-erb.el#L97
>
> Thanks. I went with the whitespace trick and pushed my patch to master.
>
> Yuan

FWIW, I believe this is a regression caused by later versions of the
treesit grammar. What is talked about here was explicitly a goal for me
to handle at least better than what it appears it has been for some
time. IIRC they changed what nodes were applied as the jsx nodes quite
dramatically some time ago. I'm not surprised there are issues after
that, as most of it wasn't backward compatible.

Theo

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sun, 05 Jan 2025 14:14:01 GMT) Full text and rfc822 format available.

Message #82 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Daniel Colascione <dancol <at> dancol.org>
To: Theodor Thornhill <theo <at> thornhill.no>,
 "Theodor Thornhill via Bug reports for GNU Emacs,
 the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>, 
 Yuan Fu <casouri <at> gmail.com>, Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 73978-done <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in tsx-ts-mode
Date: Sun, 05 Jan 2025 09:13:41 -0500


On January 5, 2025 6:55:11 AM EST, "Theodor Thornhill via Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> wrote:
>Yuan Fu <casouri <at> gmail.com> writes:
>
>>> On Dec 12, 2024, at 9:19 AM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>>> 
>>> On 12/12/2024 06:56, Yuan Fu wrote:
>>>> How hard is it to add a new syntax for this case? Or is there some way to work around this? We can’t just not apply the string syntax, because if the “a” is a parenthesis, etc, it would mess up the parenthesis balancing after it.
>>> 
>>> Probably not very hard, but that seems like it'd affect the total set of syntax classes - which means adding it to the manual, etc.
>>> 
>>>> Maybe just give it a whitespace syntax?
>>> 
>>> Right, in such cases I applied the "whitespace" or "punctuation" syntax to the whole character span, like in https://github.com/dgutov/mmm-mode/blob/master/mmm-erb.el#L97
>>
>> Thanks. I went with the whitespace trick and pushed my patch to master.
>>
>> Yuan
>
>FWIW, I believe this is a regression caused by later versions of the
>treesit grammar. What is talked about here was explicitly a goal for me
>to handle at least better than what it appears it has been for some
>time. IIRC they changed what nodes were applied as the jsx nodes quite
>dramatically some time ago. I'm not surprised there are issues after
>that, as most of it wasn't backward compatible.

Hrm.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sun, 05 Jan 2025 14:14:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Fri, 10 Jan 2025 07:58:02 GMT) Full text and rfc822 format available.

Message #88 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Fri, 10 Jan 2025 09:56:16 +0200

>> ReactDOM.render(
>> <div>
>> <h1>Hello, Welcome to React and TypeScript-!-</h1>
>> </div>,
>>   document.getElementById("root")
>> );
>>
>> C-M-f moves point to
>>
>> ReactDOM.render(
>> <div>
>> <h1>Hello, Welcome to React and TypeScript</h1-!->
>> </div>,
>>   document.getElementById("root")
>> );
>>
>> But maybe this is a different problem.
>
> This can be fixed by the following patch that copied the syntax of < and >
> from sgml-make-syntax-table:

I pushed this as well.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73978; Package emacs. (Sat, 11 Jan 2025 17:42:01 GMT) Full text and rfc822 format available.

Message #91 received at 73978 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Theodor Thornhill <theo <at> thornhill.no>,
 73978 <at> debbugs.gnu.org
Subject: Re: bug#73978: 31.0.50; Text syntax applied on too many things in
 tsx-ts-mode
Date: Sat, 11 Jan 2025 19:40:12 +0200

> This can be fixed by the following patch that copied the syntax of < and >
> from sgml-make-syntax-table:
>
> @@ -692,7 +694,11 @@ tsx-ts--syntax-propertize-captures
>               (put-text-property
>                (match-beginning 0) (match-end 0)
> -              'syntax-table (string-to-syntax ".")))))))))
> +              'syntax-table (string-to-syntax
> +                             (cond
> +                              ((equal (match-string 0) "<") "(<")
> +                              ((equal (match-string 0) ">") "(>")
> +                              (t ".")))))))))))

I missed the need to apply this < and > syntax on jsx elements,
so also pushed this patch.  Hope this is correct.

diff --git a/lisp/progmodes/typescript-ts-mode.el b/lisp/progmodes/typescript-ts-mode.el
index 3c1b27696bc..937146ddf23 100644
--- a/lisp/progmodes/typescript-ts-mode.el
+++ b/lisp/progmodes/typescript-ts-mode.el
@@ -660,7 +660,9 @@ tsx-ts--s-p-query
   (when (treesit-available-p)
     (treesit-query-compile 'tsx
                            '(((regex pattern: (regex_pattern) @regexp))
-                             ((jsx_text) @jsx)))))
+                             ((jsx_text) @jsx)
+                             ((jsx_opening_element) @jsx)
+                             ((jsx_closing_element) @jsx)))))

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 09 Feb 2025 12:24:14 GMT) Full text and rfc822 format available.

This bug report was last modified 126 days ago.
