GNU bug report logs - #77746
[PATCH] sh-mode: Fix incorrect word syntax for punctuation in sh-mode
Reported by: James Cherti <contact <at> jamescherti.com>
Date: Fri, 11 Apr 2025 14:56:02 UTC
Severity: normal
Tags: patch
Done: Stefan Monnier <monnier <at> iro.umontreal.ca>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 77746 in the body.
You can then email your comments to 77746 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Fri, 11 Apr 2025 14:56:02 GMT)
Acknowledgement sent to James Cherti <contact <at> jamescherti.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 11 Apr 2025 14:56:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hello,
In sh and Bash, the characters !%^~:.,= and are not valid in variable or
function names.
Assigning them the "_" syntax causes Emacs to treat them as word
constituents, disrupting navigation and completion
(e.g. dabbrev-expand, forward-word, etc.).
The attached patch updates the syntax table in sh-mode to mark
these characters as punctuation, correcting the issue.
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
[fix-sh-mode-syntax-table.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Sun, 13 Apr 2025 09:46:02 GMT)
Message #8 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 11 Apr 2025 10:55:02 -0400
> From: James Cherti <contact <at> jamescherti.com>
>
> In sh and Bash, the characters !%^~:.,= and are not valid in variable or
> function names.
>
> Assigning them the "_" syntax causes Emacs to treat them as word
> constituents, disrupting navigation and completion
> (e.g. dabbrev-expand, forward-word, etc.).
>
> The attached patch updates the syntax table in sh-mode to mark
> these characters as punctuation, correcting the issue.
Thanks.
TBH, such a change sounds scary, as it could cause all kinds of
unintended changes in behavior.
I've added a couple of people who might know this mode better than I
do, in the hope that they will have comments or opinions.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Sun, 13 Apr 2025 17:35:01 GMT)
Message #11 received at 77746 <at> debbugs.gnu.org (full text, mbox):
>> In sh and Bash, the characters !%^~:.,= and are not valid in variable or
>> function names.
I'm not positive about all of them, but at least some of those can
appear in the names of commands.
>> Assigning them the "_" syntax causes Emacs to treat them as word
>> constituents,
Not quite: it makes them appear as "symbol constituents".
>> disrupting navigation and completion (e.g. dabbrev-expand,
>> forward-word, etc.).
`forward-word` for example shouldn't be affected (unless you enable
`superword-mode`). `dabbrev-expand` OTOH is affected, indeed.
>> The attached patch updates the syntax table in sh-mode to mark
>> these characters as punctuation, correcting the issue.
> TBH, such a change sounds scary, as it could cause all kinds of
> unintended changes in behavior.
It's indeed risky/delicate. The syntax-tables are a fairly crude tool,
so we often need to use different tables at different places.
Rather than go straight to changing the syntax-table, I suggest you
start by providing some concrete examples of behaviors you consider
incorrect with the current code. Maybe changing the main syntax-table
of that mode will be the better option, but if so, it'll probably
require changing other code to keep using the current
syntax-table there.
[ I haven't tested it, but I'd expect trouble with your patch either in
font-lock or indentation if you have commands with names like
`if-config`. ]
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 05:14:02 GMT)
Message #14 received at 77746 <at> debbugs.gnu.org (full text, mbox):
Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs <at> gnu.org> writes:
>>> In sh and Bash, the characters !%^~:.,= and are not valid in variable or
>>> function names.
>
> I'm not positive about all of them, but at least some of those can
> appear in the names of commands.
>
>>> Assigning them the "_" syntax causes Emacs to treat them as word
>>> constituents,
>
> Not quite: it makes them appear as "symbol constituents".
>
>>> disrupting navigation and completion (e.g. dabbrev-expand,
>>> forward-word, etc.).
>
> `forward-word` for example shouldn't be affected (unless you enable
> `superword-mode`). `dabbrev-expand` OTOH is affected, indeed.
>
>>> The attached patch updates the syntax table in sh-mode to mark
>>> these characters as punctuation, correcting the issue.
>> TBH, such a change sounds scary, as it could cause all kinds of
>> unintended changes in behavior.
>
> It's indeed risky/delicate.
Yes, this one does not look straightforward.
IMO, risky changes like this one should really come with a reasonably
comprehensive set of unit tests too, to give us better confidence that
we have considered a reasonable amount of use cases.
Sadly, our unit tests in this area do not have very good coverage as it
stands, so this would take some work.
> Rather than go straight to changing the syntax-table, I suggest you
> start by providing some concrete examples of behaviors you consider
> incorrect with the current code. Maybe changing the main syntax-table
> of that mode will be the better option, but if so, it'll probably
> require changing other code to keep using the current
> syntax-table there.
This is probably the best way forward here.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 09:21:02 GMT)
Message #17 received at 77746 <at> debbugs.gnu.org (full text, mbox):
[Sunday, April 13, 2025] Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" wrote:
>>> In sh and Bash, the characters !%^~:.,= and are not valid in variable or
>>> function names.
>
> I'm not positive about all of them, but at least some of those can
> appear in the names of commands.
Not sure about sh (and its million implementations), but bash definitely
allows %. I have [-A-Z0-9]+% as a function in ~/.emacs.d/init_bash.sh so
that I don't have to worry about editing out the prompt when submitting an
(edited) old prompt line in M-x shell.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 09:45:01 GMT)
Message #20 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On Apr 14 2025, Visuwesh wrote:
> [Sunday, April 13, 2025] Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" wrote:
>
>>>> In sh and Bash, the characters !%^~:.,= and are not valid in variable or
>>>> function names.
>>
>> I'm not positive about all of them, but at least some of those can
>> appear in the names of commands.
>
> Not sure sh (and its million implementations) but bash definitely allows
> %.
Bash allows any WORD as a function name.
--
Andreas Schwab, SUSE Labs, schwab <at> suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 12:57:02 GMT)
Message #23 received at submit <at> debbugs.gnu.org (full text, mbox):
On 2025-04-14 05:44, Andreas Schwab wrote:
> On Apr 14 2025, Visuwesh wrote:
>
>> [Sunday, April 13, 2025] Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" wrote:
>>
>>>>> In sh and Bash, the characters !%^~:.,= and are not valid in variable or
>>>>> function names.
>>>
>>> I'm not positive about all of them, but at least some of those can
>>> appear in the names of commands.
>>
>> Not sure sh (and its million implementations) but bash definitely allows
>> %.
>
> Bash allows any WORD as a function name.
Yes, Bash permits the characters !%^~:.,= in function names.
However, it does not permit them in variable names. Try:
var!name=1
var%name=1
var^name=1
var:name=1
var.name=1
var,name=1
var=name=1
Sh, on the other hand, disallows the characters !%^~:.,= in function
and variable names in many implementations.
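The variable-name rule behind these examples can be sketched in portable shell. This is only an illustration, assuming the POSIX definition of a name (letters, digits and underscores, not starting with a digit); `classify_name` is a hypothetical helper, not part of any shell or of Emacs. Note that, unlike the other lines in the list, `var=name=1` is accepted by the shell: it is parsed as an assignment of the string `name=1` to `var`, so it shows that `=` ends the name rather than producing an error.

```shell
#!/bin/sh
# classify_name NAME: print "valid" if NAME follows the POSIX rule for
# variable names, "invalid" otherwise. (Hypothetical helper.)
classify_name() {
    case $1 in
        # Empty, bad first character, or any character outside [A-Za-z0-9_].
        '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) printf '%s\n' invalid ;;
        *) printf '%s\n' valid ;;
    esac
}

# Check a plain name and a few names that contain the disputed characters.
for name in varname var.name var:name 'var!name' var,name; do
    printf '%s: %s\n' "$name" "$(classify_name "$name")"
done
```

The check says nothing about function or command names, which is where the shells differ, as discussed above.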
The primary reason I submitted this patch is to address the
inconvenience caused by Emacs including certain characters when
completing variable names.
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 13:12:02 GMT)
Message #26 received at submit <at> debbugs.gnu.org (full text, mbox):
> The primary reason I submitted this patch is to address the
> inconvenience caused by Emacs including certain characters when
> completing variable names.
I guess this can count as an answer to my request for a concrete case,
but it's not concrete/detailed enough. E.g. I don't really know what
"completing variable names" means.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 13:26:02 GMT)
Message #29 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-13 13:33, Stefan Monnier wrote:
>>> In sh and Bash, the characters !%^~:.,= and are not valid in variable or
>>> function names.
>
> I'm not positive about all of them, but at least some of those can
> appear in the names of commands.
>
>>> Assigning them the "_" syntax causes Emacs to treat them as word
>>> constituents,
>
> Not quite: it makes them appear as "symbol constituents".
>
>>> disrupting navigation and completion (e.g. dabbrev-expand,
>>> forward-word, etc.).
>
> `forward-word` for example shouldn't be affected (unless you enable
> `superword-mode`). `dabbrev-expand` OTOH is affected, indeed.
>
>>> The attached patch updates the syntax table in sh-mode to mark
>>> these characters as punctuation, correcting the issue.
>> TBH, such a change sounds scary, as it could cause all kinds of
>> unintended changes in behavior.
>
> It's indeed risky/delicate. The syntax-tables are a fairly crude tool,
> so we often need to use different tables at different places.
>
> Rather than go straight to changing the syntax-table, I suggest you
> start by providing some concrete examples of behaviors you consider
> incorrect with the current code. Maybe changing the main syntax-table
> of that mode will be the better option, but if so, it'll probably
> require changing other code to keep using the current
> syntax-table there.
> [ I haven't tested it, but I'd expect trouble with your patch either in
> font-lock or indentation if you have commands with names like
> `if-config`. ]
>
>
> Stefan
>
I primarily wrote this patch to prevent Emacs from including
the characters !%^~:., in symbols when highlighting a
symbol at point or completing a variable name.
Actually, the ones I specifically need in my case are: .:,
I often include variable names in comments followed by these
characters.
This patch addresses two issues I encountered related to
symbols:
1. Completion
--------------
Given the variable name `varname` in a comment,
followed by ".":
#!/usr/bin/env bash
# The name of this variable is varname. Code:
var
Completing "var" includes "varname." in the list of
completions.
(e.g. dabbrev completion. I am using Corfu/Cape,
which displays all suggestions)
2. Highlight symbol at point
----------------------------
When the cursor is on the comment `varname.`,
Emacs highlights `varname.` instead of `varname` when
using `(hi-lock-face-symbol-at-point)`:
#!/usr/bin/env bash
# The name of this variable is varname. Code:
var
Please feel free to share suggestions that could help resolve such issues.
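The effect described in both cases can be approximated outside Emacs. The sketch below is not how dabbrev or hi-lock work internally; it only assumes that a candidate is a maximal run of "token" characters, and `candidates` is a hypothetical helper built on `grep -o` (a widely available extension, not strictly POSIX). Running it once with `.` in the character class and once without shows how the class alone changes what is offered.

```shell
#!/bin/sh
# candidates PREFIX CLASS: read text on stdin, split it into tokens made of
# the characters in the bracket-expression body CLASS, and print the unique
# tokens that start with PREFIX. (Hypothetical helper, not dabbrev itself.)
candidates() {
    LC_ALL=C grep -oE "[$2]+" | LC_ALL=C grep "^$1" | LC_ALL=C sort -u
}

script='#!/usr/bin/env bash
# The name of this variable is varname. Code:
var'

# "." counted as part of a token, roughly like symbol-constituent syntax:
printf '%s\n' "$script" | candidates var 'A-Za-z0-9_.'

# "." treated as a separator, roughly like punctuation syntax:
printf '%s\n' "$script" | candidates var 'A-Za-z0-9_'
```

With the first class the list contains `varname.`; with the second it contains `varname`. Both lists also pick up `variable` from the comment text, since a purely textual scan has no notion of what kind of name is being completed.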
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 13:27:01 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 17:25:01 GMT)
Message #35 received at submit <at> debbugs.gnu.org (full text, mbox):
> 1. Completion
> --------------
> Given the variable name `varname` in a comment,
> followed by ".":
> #!/usr/bin/env bash
> # The name of this variable is varname. Code:
> var
>
> Completing "var" includes "varname." in the list of
> completions.
`varname.` is a valid command name and at the spot where you have `var`
above, you could very well be completing a command rather than a variable.
> (e.g. dabbrev completion. I am using Corfu/Cape,
> which displays all suggestions)
dabbrev doesn't even try to distinguish whether you're completing a var
or a command or a type or anything else for that matter, so it's mostly
unavoidable that it includes "useless" candidates (and that it misses
valid candidates, as well). IOW you can't argue that it's correct or
not: you need to argue whether something will be usually useless or not.
> 2. Highlight symbol at point
> ----------------------------
> When the cursor is on the comment `varname.`,
> Emacs highlights `varname.` instead of `varname` when
> using `(hi-lock-face-symbol-at-point)`:
> #!/usr/bin/env bash
> # The name of this variable is varname. Code:
> var
Same here. `varname.` is a valid command name so it can make perfect
sense to highlight it. Admittedly, I'd never seen a command with a `.`
at the end, but removing `.` from the symbol constituents would rule out
not just `varname.` but also all commands with a `.` in the middle of
their names:
% ls /usr/bin/??*.* | wc
180 180 4274
%
- Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 17:25:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 18:59:01 GMT)
Message #41 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-14 13:23, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
>> 1. Completion
>> --------------
>> Given the variable name `varname` in a comment,
>> followed by ".":
>> #!/usr/bin/env bash
>> # The name of this variable is varname. Code:
>> var
>>
>> Completing "var" includes "varname." in the list of
>> completions.
>
> `varname.` is a valid command name and at the spot where you have `var`
> above, you could very well be completing a command rather than a variable.
>
>> (e.g. dabbrev completion. I am using Corfu/Cape,
>> which displays all suggestions)
>
> dabbrev doesn't even try to distinguish whether you're completing a var
> or a command or a type or anything else for that matter, so it's mostly
> unavoidable that it includes "useless" candidates (and that it misses
> valid candidates, as well). IOW you can't argue that it's correct or
> not: you need to argue whether something will be usually useless or not.
>
>> 2. Highlight symbol at point
>> ----------------------------
>> When the cursor is on the comment `varname.`,
>> Emacs highlights `varname.` instead of `varname` when
>> using `(hi-lock-face-symbol-at-point)`:
>> #!/usr/bin/env bash
>> # The name of this variable is varname. Code:
>> var
>
> Same here. `varname.` is a valid command name so it can make perfect
> sense to highlight it. Admittedly, I'd never seen a command with a `.`
> at the end, but removing `.` from the symbol constituents would rule out
> not just `varname.` but also all commands with a `.` in the middle of
> their names:
>
> % ls /usr/bin/??*.* | wc
> 180 180 4274
> %
Yes, I agree that 'varname.' is:
- A valid command in both Sh and Bash,
- A valid function name in Bash (not in Sh).
However, it is an *invalid* variable name in both
Sh and Bash.
This is what complicates addressing this issue.
Applying this patch will impose the same limitations on
function names as those applied to variables, specifically
in terms of what is considered a valid symbol when
developing Sh or Bash scripts.
Here, in my opinion, is the difference between merging and
not merging this patch:
- Merging this patch: The symbol representation of variable
names will be more accurate. However, functions/commands
containing characters like !%^~:.,
will be treated as two separate symbols.
- Not merging this patch: The symbol representation of
functions/commands containing !%^~:. will be more accurate.
However, for variables, characters like !%^~:., will be
included in symbols, leading to extraneous characters when
completing or highlighting symbols.
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 19:00:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 19:30:02 GMT)
Message #47 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-14 13:23, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
>> 1. Completion
>> --------------
>> Given the variable name `varname` in a comment,
>> followed by ".":
>> #!/usr/bin/env bash
>> # The name of this variable is varname. Code:
>> var
>>
>> Completing "var" includes "varname." in the list of
>> completions.
>
> `varname.` is a valid command name and at the spot where you have `var`
> above, you could very well be completing a command rather than a variable.
>
>> (e.g. dabbrev completion. I am using Corfu/Cape,
>> which displays all suggestions)
>
> dabbrev doesn't even try to distinguish whether you're completing a var
> or a command or a type or anything else for that matter, so it's mostly
> unavoidable that it includes "useless" candidates (and that it misses
> valid candidates, as well). IOW you can't argue that it's correct or
> not: you need to argue whether something will be usually useless or not.
>
>> 2. Highlight symbol at point
>> ----------------------------
>> When the cursor is on the comment `varname.`,
>> Emacs highlights `varname.` instead of `varname` when
>> using `(hi-lock-face-symbol-at-point)`:
>> #!/usr/bin/env bash
>> # The name of this variable is varname. Code:
>> var
>
> Same here. `varname.` is a valid command name so it can make perfect
> sense to highlight it. Admittedly, I'd never seen a command with a `.`
> at the end, but removing `.` from the symbol constituents would rule out
> not just `varname.` but also all commands with a `.` in the middle of
> their names:
>
> % ls /usr/bin/??*.* | wc
> 180 180 4274
> %
Yes, commands like mkfs.ext4 contain '.' and are valid.
However, while :,!%^~ are valid characters for commands,
it's rare to see them used in command names. For instance, I
have no commands on my system that contain :,!%^~ despite
having 3,843 files in /usr/bin/.
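Counts like this one depend on the system, so they are easy to re-check locally. A minimal sketch, assuming a POSIX shell; `count_names_with_chars` is a hypothetical helper, and the numbers it prints will differ from machine to machine.

```shell
#!/bin/sh
# count_names_with_chars DIR CHARS: print how many entries in DIR have a
# name containing at least one character from CHARS. (Hypothetical helper.)
# CHARS is used as the body of a glob bracket expression, so it must not
# start with "!" or "^" (negation) and should not contain "]".
count_names_with_chars() {
    count=0
    for path in "$1"/*; do
        # Skip the unexpanded pattern when DIR is empty or missing.
        [ -e "$path" ] || [ -L "$path" ] || continue
        case ${path##*/} in
            *[$2]*) count=$((count + 1)) ;;
        esac
    done
    printf '%s\n' "$count"
}

# Names containing any of , ! % ~ : ^
count_names_with_chars /usr/bin ',!%~:^'
# Names containing a dot
count_names_with_chars /usr/bin '.'
```

Only the named directory is scanned, so commands installed elsewhere on PATH, shell functions and aliases are not counted.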
Perhaps this patch could be adjusted by removing '.' (given
its common use in command names), and keeping :,!%^~
as punctuation for variable names.
(Removing '.' would not only cover the majority of
command/function names but also prevent using `!%^~:,` as
symbol constituents, resulting in more accurate variable
symbols.)
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 19:40:02 GMT)
Message #50 received at 77746 <at> debbugs.gnu.org (full text, mbox):
Attached: v2 of this patch
This patch marks only the following characters as
punctuation: !%^~:,
I removed . because it is commonly used in command names
(e.g., mkfs.ext4).
(Excluding '.' not only covers the majority of command and
function names but also prevents using ,!%^~: as
symbol constituents, resulting in more accurate
variable symbols.)
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
On 2025-04-13 05:45, Eli Zaretskii wrote:
>> Date: Fri, 11 Apr 2025 10:55:02 -0400
>> From: James Cherti <contact <at> jamescherti.com>
>>
>> In sh and Bash, the characters !%^~:.,= and are not valid in variable or
>> function names.
>>
>> Assigning them the "_" syntax causes Emacs to treat them as word
>> constituents, disrupting navigation and completion
>> (e.g. dabbrev-expand, forward-word, etc.).
>>
>> The attached patch updates the syntax table in sh-mode to mark
>> these characters as punctuation, correcting the issue.
>
> Thanks.
>
> TBH, such a change sounds scary, as it could cause all kinds of
> unintended changes in behavior.
>
> I've added a couple of people who might know this mode better than I
> do, in the hope that they will have comments or opinions.
[fix-sh-mode-syntax-table-v2.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 21:35:02 GMT)
Message #53 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> This patch marks only the following characters as
> punctuation: !%^~:,
Same problem as before: by just changing the syntax-table willy-nilly
you're affecting a lot more code than the one you care about, and we
don't know what the impact will be.
You may want to set `dabbrev-abbrev-char-regexp` instead, for example.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 14 Apr 2025 21:54:01 GMT)
Message #56 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-14 17:34, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
>> This patch marks only the following characters as
>> punctuation: !%^~:,
>
> Same problem as before: by just changing the syntax-table willy-nilly
> you're affecting a lot more code than the one you care about, and we
> don't know what the impact will be.
>
> You may want to set `dabbrev-abbrev-char-regexp` instead, for example.
>
>
> Stefan
Thank you for suggesting a solution for dabbrev. However,
this won't address the issue with symbol highlighting
using the built-in (highlight-symbol-at-point) function.
For context: I have been using the syntax table introduced
by this patch for over a year in extensive Bash scripting (I
write hundreds of Bash scripts annually), and I have not
encountered any issues with `sh-mode` or `bash-ts-mode` when
this patch is applied.
That said, I understand there may be use cases outside my
workflow where this change could impact users who use
sh-mode differently.
I'll leave the decision to your judgment, given your
long-standing experience maintaining Emacs.
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 00:24:01 GMT)
Message #59 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> Thank you for suggesting a solution for dabbrev. However,
> this won't address the issue with symbol highlighting
> using the built-in (highlight-symbol-at-point) function.
No, indeed. There's probably another setting available that can fix
that case, tho.
> For context: I have been using the syntax table introduced
> by this patch for over a year in extensive Bash scripting (I
> write hundreds of Bash scripts annually), and I have not
> encountered any issues with `sh-mode` or `bash-ts-mode` when
> this patch is applied.
It's possible that your patch is actually safe.
Someone™ just needs to take a close look at the syntax-propertize,
font-lock, and indentation code to assess the potential effects (I think
these are the only cases that deserve such close scrutiny, because other
cases are usually more like dabbrev in the sense that there isn't as
clear a "right-vs-wrong" distinction).
Another option might be to combine your patch with another patch which
sets the syntax-table used for font-lock, indentation, and syntax-ppss.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 06:16:02 GMT)
Message #62 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 14 Apr 2025 15:29:44 -0400
> Cc: eliz <at> gnu.org, 77746 <at> debbugs.gnu.org, juri <at> linkov.net
> From: James Cherti <contact <at> jamescherti.com>
>
>
> On 2025-04-14 13:23, Stefan Monnier via Bug reports for GNU Emacs, the
> Swiss army knife of text editors wrote:
> >> 1. Completion
> >> --------------
> >> Given the variable name `varname` in a comment,
> >> followed by ".":
> >> #!/usr/bin/env bash
> >> # The name of this variable is varname. Code:
> >> var
> >>
> >> Completing "var" includes "varname." in the list of
> >> completions.
> >
> > `varname.` is a valid command name and at the spot where you have `var`
> > above, you could very well be completing a command rather than a variable.
> >
> >> (e.g. dabbrev completion. I am using Corfu/Cape,
> >> which displays all suggestions)
> >
> > dabbrev doesn't even try to distinguish whether you're completing a var
> > or a command or a type or anything else for that matter, so it's mostly
> > unavoidable that it includes "useless" candidates (and that it misses
> > valid candidates, as well). IOW you can't argue that it's correct or
> > not: you need to argue whether something will be usually useless or not.
> >
> >> 2. Highlight symbol at point
> >> ----------------------------
> >> When the cursor is on the comment `varname.`,
> >> Emacs highlights `varname.` instead of `varname` when
> >> using `(hi-lock-face-symbol-at-point)`:
> >> #!/usr/bin/env bash
> >> # The name of this variable is varname. Code:
> >> var
> >
> > Same here. `varname.` is a valid command name so it can make perfect
> > sense to highlight it. Admittedly, I'd never seen a command with a `.`
> > at the end, but removing `.` from the symbol constituents would rule out
> > not just `varname.` but also all commands with a `.` in the middle of
> > their names:
> >
> > % ls /usr/bin/??*.* | wc
> > 180 180 4274
> > %
> Yes, commands like mkfs.ext4 contain '.' and are valid.
> However, while :,!%^~ are valid characters for commands,
> it's rare to see them used in command names. For instance, I
> have no commands on my system that contain :,!%^~ despite
> having 3,843 files in /usr/bin/.
>
> Perhaps this patch could be adjusted by removing '.' (given
> its common use in command names), and keeping :,!%^~
> as punctuation for variable names.
>
> (Removing '.' would not only cover the majority of
> command/function names but also prevent using `!%^~:,` as
> symbol constituents, resulting in more accurate variable
> symbols.)
Did you try the tree-sitter based bash-ts-mode? If you did, does it
solve this problem better than sh-mode?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 06:18:02 GMT)
Message #65 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 14 Apr 2025 15:39:24 -0400
> Cc: 77746 <at> debbugs.gnu.org
> From: James Cherti <contact <at> jamescherti.com>
>
> Attached: v2 of this patch
>
> This patch marks only the following characters as
> punctuation: !%^~:,
>
> I removed . because it is commonly used in command names
> (e.g., mkfs.ext4).
>
> (Excluding '.' not only covers the majority of command and
> function names but also prevents using ,!%^~: as
> symbol constituents, resulting in more accurate
> variable symbols.)
Should this backward-incompatible change be controlled by a user
variable? I can easily imagine some user who'd come complaining about
this compromise. Without a knob to get back previous behavior, we
will have no way of satisfying such users without reverting the
change.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 11:22:02 GMT)
Message #68 received at 77746 <at> debbugs.gnu.org (full text, mbox):
Hello Eli,
The issue occurs in both sh-mode and bash-ts-mode.
This patch resolves it for me in both modes.
(I am using the first patch that marks !%^~:., as
punctuation. It includes '.'. In my workflow, I find little
value in completing commands within Bash scripts. I
prioritize completing variables and functions, as they are
more relevant to the script’s logic and structure.)
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
On 2025-04-15 02:15, Eli Zaretskii wrote:
>> Date: Mon, 14 Apr 2025 15:29:44 -0400
>> Cc: eliz <at> gnu.org, 77746 <at> debbugs.gnu.org, juri <at> linkov.net
>> From: James Cherti <contact <at> jamescherti.com>
>>
>>
>> On 2025-04-14 13:23, Stefan Monnier via Bug reports for GNU Emacs, the
>> Swiss army knife of text editors wrote:
>>>> 1. Completion
>>>> --------------
>>>> Given the variable name `varname` in a comment,
>>>> followed by ".":
>>>> #!/usr/bin/env bash
>>>> # The name of this variable is varname. Code:
>>>> var
>>>>
>>>> Completing "var" includes "varname." in the list of
>>>> completions.
>>>
>>> `varname.` is a valid command name and at the spot where you have `var`
>>> above, you could very well be completing a command rather than a variable.
>>>
>>>> (e.g. dabbrev completion. I am using Corfu/Cape,
>>>> which displays all suggestions)
>>>
>>> dabbrev doesn't even try to distinguish whether you're completing a var
>>> or a command or a type or anything else for that matter, so it's mostly
>>> unavoidable that it includes "useless" candidates (and that it misses
>>> valid candidates, as well). IOW you can't argue that it's correct or
>>> not: you need to argue whether something will be usually useless or not.
>>>
>>>> 2. Highlight symbol at point
>>>> ----------------------------
>>>> When the cursor is on the comment `varname.`,
>>>> Emacs highlights `varname.` instead of `varname` when
>>>> using `(hi-lock-face-symbol-at-point)`:
>>>> #!/usr/bin/env bash
>>>> # The name of this variable is varname. Code:
>>>> var
>>>
>>> Same here. `varname.` is a valid command name so it can make perfect
>>> sense to highlight it. Admittedly, I'd never seen a command with a `.`
>>> at the end, but removing `.` from the symbol constituents would rule out
>>> not just `varname.` but also all commands with a `.` in the middle of
>>> their names:
>>>
>>> % ls /usr/bin/??*.* | wc
>>> 180 180 4274
>>> %
>> Yes, commands like mkfs.ext4 contain '.' and are valid.
>> However, while :,!%^~ are valid characters for commands,
>> it's rare to see them used in command names. For instance, I
>> have no commands on my system that contain :,!%^~ despite
>> having 3,843 files in /usr/bin/.
>>
>> Perhaps this patch could be adjusted by removing '.' (given
>> its common use in command names), and keeping :,!%^~
>> as punctuation for variable names.
>>
>> (Removing '.' would not only cover the majority of
>> command/function names but also prevent using `!%^~:,` as
>> symbol constituents, resulting in more accurate variable
>> symbols.)
>
> Did you try the tree-sitter based bash-ts-mode? If you did, does it
> solve this problem better than sh-mode?
>
>
>
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 11:49:01 GMT)
Message #71 received at 77746 <at> debbugs.gnu.org (full text, mbox):
Hello Eli,
I just want to note that the second patch is not ideal, as
it omits the '.' character. If you decide to merge this
patch (or a variation of it), I recommend including '.' as
well, since '.' is not allowed in variable names.
In the context of Bash development, it is uncommon to use
characters like !%^~:. in function names. In my view, it is
more practical to treat them as punctuation rather than
symbol constituents by default, as what matters the most
in Bash development is variable names and function names.
Maintaining some degree of backward compatibility might be
beneficial in case other users prefer completing commands
that contain characters like !%^~:,. (though in practice,
it is rare for users to complete commands containing
!%^~:, with the exception of '.'. Outside of command names,
the '.' character is also rarely used in function names
and never used in variable names).
The variable for backward compatibility could be defined
as one of the following, or a variation of them:
(defvar sh-mode-syntax-table-respect-variable-name-limits t)
(defvar sh-mode-symbol-respect-variable-name-limits t)
(defvar sh-mode-respect-variable-name-limits t)
(It could also default to nil, mitigating the risk of any
user encountering issues with this new behavior. As I
mentioned earlier, I haven't faced any problems with this
patch applied for several months.)
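
As a rough illustration of the variable-name rule referred to above,
here is a minimal sketch in POSIX sh. The helper name
is_valid_var_name is hypothetical and not part of the patch; it only
encodes the usual rule that a name starts with a letter or underscore
and continues with letters, digits, or underscores.

```shell
# Hypothetical helper (not part of the patch): succeed if $1 is a valid
# shell variable name, i.e. [A-Za-z_][A-Za-z0-9_]*.
# Assumption: the ranges behave as in the C locale.
is_valid_var_name() {
    case $1 in
        # Empty, bad first character, or any non-name character anywhere.
        '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

Under this rule, "varname" is accepted, while "varname." and
"var%name" are rejected, because '.' and '%' are not name characters.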
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
On 2025-04-15 02:17, Eli Zaretskii wrote:
>> Date: Mon, 14 Apr 2025 15:39:24 -0400
>> Cc: 77746 <at> debbugs.gnu.org
>> From: James Cherti <contact <at> jamescherti.com>
>>
>> Attached: v2 of this patch
>>
>> This patch marks only the following characters as
>> punctuation: !%^~:,
>>
>> I removed . because it is commonly used in command names
>> (e.g., mkfs.ext4).
>>
>> (Excluding '.' not only covers the majority of command and
>> function names but also prevents using ,!%^~: as
>> symbol constituents, resulting in more accurate
>> variable symbols.)
>
> Should this backward-incompatible change be controlled by a user
> variable? I can easily imagine some user who'd come complaining about
> this compromise. Without a knob to get back previous behavior, we
> will have no way of satisfying such users without reverting the
> change.
>
>
>
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 12:49:02 GMT)
Message #74 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 15 Apr 2025 07:21:08 -0400
> Cc: 77746 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, juri <at> linkov.net
> From: James Cherti <contact <at> jamescherti.com>
>
> The issue occurs in both sh-mode and bash-ts-mode.
I'm surprised that bash-ts-mode should use syntax tables, instead of
relying on the results of parsing by tree-sitter.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 14:07:01 GMT)
Message #77 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-15 08:48, Eli Zaretskii wrote:
>> Date: Tue, 15 Apr 2025 07:21:08 -0400
>> Cc: 77746 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, juri <at> linkov.net
>> From: James Cherti <contact <at> jamescherti.com>
>>
>> The issue occurs in both sh-mode and bash-ts-mode.
>
> I'm surprised that bash-ts-mode should use syntax tables, instead of
> relying on the results of parsing by tree-sitter.
Here are a few screenshots that show exactly the issue. The
mode used when I took the screenshot was bash-ts-mode.
- without-patch1-highlight.png: Only varname should have
been highlighted
- without-patch2-highlight.png: varname in the comment
should have been highlighted
- without-patch3-completion.png: The completion included
invalid variable names.
- without-patch4-major-mode.png: The major mode is
bash-ts-mode. The same issue happens with sh-mode
without this patch.
Here are screenshots with this patch applied:
- with-patch1.png: varname highlighted correctly
- with-patch2.png: only complete valid variable names
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
[without-patch1-highlight.png (image/png, attachment)]
[without-patch2-highlight.png (image/png, attachment)]
[without-patch3-completion.png (image/png, attachment)]
[without-patch4-major-mode.png (image/png, attachment)]
[with-patch1.png (image/png, attachment)]
[with-patch2.png (image/png, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 15:18:02 GMT)
Message #80 received at 77746 <at> debbugs.gnu.org (full text, mbox):
>> The issue occurs in both sh-mode and bash-ts-mode.
> I'm surprised that bash-ts-mode should use syntax tables, instead of
> relying on the results of parsing by tree-sitter.
It's not front-and-center in his bug report, but what he means by
"completion" seems to be mostly dabbrev completion, which doesn't make
use of tree-sitter.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 15:25:01 GMT)
Message #83 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: James Cherti <contact <at> jamescherti.com>, 77746 <at> debbugs.gnu.org,
> juri <at> linkov.net
> Date: Tue, 15 Apr 2025 11:17:02 -0400
>
> >> The issue occurs in both sh-mode and bash-ts-mode.
> > I'm surprised that bash-ts-mode should use syntax tables, instead of
> > relying on the results of parsing by tree-sitter.
>
> It's not front-and-center in his bug report, but what he means by
> "completion" seems to be mostly dabbrev completion, which doesn't make
> use of tree-sitter.
He also mentioned font-lock.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 16:23:01 GMT)
Message #86 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-15 11:24, Eli Zaretskii wrote:
>> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
>> Cc: James Cherti <contact <at> jamescherti.com>, 77746 <at> debbugs.gnu.org,
>> juri <at> linkov.net
>> Date: Tue, 15 Apr 2025 11:17:02 -0400
>>
>>>> The issue occurs in both sh-mode and bash-ts-mode.
>>> I'm surprised that bash-ts-mode should use syntax tables, instead of
>>> relying on the results of parsing by tree-sitter.
>>
>> It's not front-and-center in his bug report, but what he means by
>> "completion" seems to be mostly dabbrev completion, which doesn't make
>> use of tree-sitter.
>
> He also mentioned font-lock.
Yes, this patch fixes dabbrev, functions such as
(highlight-symbol-at-point), and any other function that
relies on thing-at-point for symbol detection.
(The core issue is inaccurate symbol detection. This patch
improves it by excluding characters that should not be
considered part of a valid variable name.)
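
To make the symbol-detection point concrete, here is a small sketch of
ordinary shell code in which a variable name is immediately followed
by one of the characters in question. The variable names and the file
path are made up for illustration; in each assignment the name ends
where the '%', ':', or '.' begins.

```shell
# Illustrative values only; none of these names come from the patch.
file=/tmp/archive.tar.gz
base=${file##*/}                # strip the directory part
stem=${base%.gz}                # '%' starts a suffix-removal operator
label=$stem:backup              # ':' ends the name "stem"
copy=$base.bak                  # '.' ends the name "base"
fallback=${unset_name:-$stem}   # ':' starts the ":-" default operator
```

With these characters classified as symbol constituents, a
symbol-at-point lookup on "base" in "$base.bak" would also pick up
".bak", which is the behavior the patch is meant to avoid.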
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 16:47:02 GMT)
Message #89 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> (The core issue is inaccurate symbol detection. This patch
> improves it by excluding characters that should not be
> considered part of a valid variable name.)
Notice how you jumped from "symbol" to "variable".
I don't know of any reason why we should presume that "variables" are
the only symbols of interest.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 18:26:03 GMT)
Message #92 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 15 Apr 2025 12:22:08 -0400
> Cc: juri <at> linkov.net, 77746 <at> debbugs.gnu.org
> From: James Cherti <contact <at> jamescherti.com>
>
> On 2025-04-15 11:24, Eli Zaretskii wrote:
> >> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> >> Cc: James Cherti <contact <at> jamescherti.com>, 77746 <at> debbugs.gnu.org,
> >> juri <at> linkov.net
> >> Date: Tue, 15 Apr 2025 11:17:02 -0400
> >>
> >>>> The issue occurs in both sh-mode and bash-ts-mode.
> >>> I'm surprised that bash-ts-mode should use syntax tables, instead of
> >>> relying on the results of parsing by tree-sitter.
> >>
> >> It's not front-and-center in his bug report, but what he means by
> >> "completion" seems to be mostly dabbrev completion, which doesn't make
> >> use of tree-sitter.
> >
> > He also mentioned font-lock.
>
> Yes, this patch fixes dabbrev, functions such as
> (highlight-symbol-at-point), and any other function that
> relies on thing-at-point for symbol detection.
>
> (The core issue is inaccurate symbol detection. This patch
> improves it by excluding characters that should not be
> considered part of a valid variable name.)
I'd expect a parser to detect variable names correctly, and without
the help of our syntax tables and related code.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 19:42:04 GMT)
Message #95 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-15 12:46, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
>> (The core issue is inaccurate symbol detection. This patch
>> improves it by excluding characters that should not be
>> considered part of a valid variable name.)
>
> Notice how you jumped from "symbol" to "variable".
> I don't know of any reason why we should presume that "variables" are
> the only symbols of interest.
Let me rephrase it:
In my view, variables and functions are the two most
significant types of symbols when developing shell scripts.
It is inconvenient to treat !%^~:. as symbol constituents
when they are rarely used in function names and never used
in variable names. Even among commands, characters like
!%^~: are rarely used. The only one commonly seen in command
names is ".".
(Although these characters are *allowed* in command names,
developers generally avoid using these characters because
they have special meanings in the shell: ! is used for
history expansion, % for job control, ~ for home
directories, and ^ for quick substitutions in previous
commands.)
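
One way to check how rare such command names are on a given system is
sketched below in POSIX sh. The helper name count_odd_command_names is
hypothetical; it simply counts the entries of a directory whose names
contain at least one of the characters ! % ^ ~ : , and it ignores
dotfiles.

```shell
# Hypothetical helper: print the number of entries in directory $1
# whose names contain at least one of the characters ! % ^ ~ : ,
count_odd_command_names() {
    odd_count=0
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue      # directory empty or missing
        case ${entry##*/} in
            # '!' and '^' are kept away from the first position so that
            # they are not read as negation in the bracket expression.
            *[%^~:,!]*) odd_count=$((odd_count + 1)) ;;
        esac
    done
    printf '%s\n' "$odd_count"
}
```

It could be run as "count_odd_command_names /usr/bin". Names such as
mkfs.ext4 are not counted, since '.' is deliberately left out of the
character set.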
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 19:46:02 GMT)
Message #98 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-15 14:25, Eli Zaretskii wrote:
>> Date: Tue, 15 Apr 2025 12:22:08 -0400
>> Cc: juri <at> linkov.net, 77746 <at> debbugs.gnu.org
>> From: James Cherti <contact <at> jamescherti.com>
>>
>> On 2025-04-15 11:24, Eli Zaretskii wrote:
>> >> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
>> >> Cc: James Cherti <contact <at> jamescherti.com>, 77746 <at> debbugs.gnu.org,
>> >> juri <at> linkov.net
>> >> Date: Tue, 15 Apr 2025 11:17:02 -0400
>> >>
>> >>>> The issue occurs in both sh-mode and bash-ts-mode.
>> >>> I'm surprised that bash-ts-mode should use syntax tables, instead of
>> >>> relying on the results of parsing by tree-sitter.
>> >>
>> >> It's not front-and-center in his bug report, but what he means by
>> >> "completion" seems to be mostly dabbrev completion, which doesn't make
>> >> use of tree-sitter.
>> >
>> > He also mentioned font-lock.
>>
>> Yes, this patch fixes dabbrev, functions such as
>> (highlight-symbol-at-point), and any other function that
>> relies on thing-at-point for symbol detection.
>>
>> (The core issue is inaccurate symbol detection. This patch
>> improves it by excluding characters that should not be
>> considered part of a valid variable name.)
>
> I'd expect a parser to detect variable names correctly, and without
> the help of our syntax tables and related code.
Hello Eli,
It would be highly convenient if the Tree-sitter parser
could accurately detect variable names (e.g., the symbol at point).
While this would not fix sh-mode, it would at least enable an
improved editing experience when using bash-ts-mode.
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 20:47:04 GMT)
Message #101 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> In my view, variables and functions are the two most
> significant types of symbols when developing shell scripts.
I know. But the fact is, we don't really have any maintainer for
that code. I'm probably one of those most familiar with it (from back
when I implemented the SMIE-based indentation code), and I'm not sure
what kind of impact your change could have, so I'm uneasy accepting
your patch.
I added Dmitry to the Cc because AFAICT he's the last one to have
changed that syntax-table and he seems to have changed it for the same
kind of reasons as you:
commit f6277911eb2c520aec8f0efd80c91999226e3322
Author: Dmitry Gutov <dgutov <at> yandex.ru>
Date: Fri Oct 2 07:11:56 2020 +0200
Make xref work better on variables in shell-script-mode
* lisp/progmodes/sh-script.el (sh-mode-syntax-table): Classify "/"
as punctuation so that `M-.' on $foo/bar works on the $foo part
(bug#25585).
diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
--- a/lisp/progmodes/sh-script.el
+++ b/lisp/progmodes/sh-script.el
@@ -370,26 +370,27 @@
(defvar sh-mode-syntax-table
[...]
?= "."
+ ?/ "."
?\; "."
?| "."
Apparently this change hasn't brought any trouble over the last 5 years,
so that's encouraging.
AFAICT the chars you want to change have had the symbol-constituent
syntax "forever", i.e. since the first commit:
commit ac59aed83fbdfd298f58a1a7e638264b0c3b0caa
Author: Richard M. Stallman <rms <at> gnu.org>
Date: Tue Mar 22 05:43:25 1994 +0000
entered into RCS
diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
--- /dev/null
+++ b/lisp/progmodes/sh-script.el
[...]
+(defvar sh-mode-syntax-table
+ (let ((table (copy-syntax-table)))
+ (modify-syntax-entry ?\# "<" table)
+ (modify-syntax-entry ?\^l ">#" table)
+ (modify-syntax-entry ?\n ">#" table)
+ (modify-syntax-entry ?\" "\"\"" table)
+ (modify-syntax-entry ?\' "\"'" table)
+ (modify-syntax-entry ?\` "$`" table)
+ (modify-syntax-entry ?$ "_" table)
+ (modify-syntax-entry ?! "_" table)
+ (modify-syntax-entry ?% "_" table)
+ (modify-syntax-entry ?: "_" table)
+ (modify-syntax-entry ?. "_" table)
+ (modify-syntax-entry ?^ "_" table)
+ (modify-syntax-entry ?~ "_" table)
+ table)
+ "Syntax table in use in Shell-Script mode.")
That can be taken to mean that we have 30 years of experience with that
setting as being "the right one". Of course it can also be taken to
mean that it was just an accident that simply bites rarely enough that
nobody bothered to fix it yet.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 15 Apr 2025 21:30:02 GMT)
Message #104 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-15 16:46, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
>> In my view, variables and functions are the two most
>> significant types of symbols when developing shell scripts.
>
> I know. But the fact is, we don't really have any maintainer for
> that code. I'm probably one of those most familiar with it (from back
> when I implemented the SMIE-based indentation code), and I'm not sure
> what kind of impact your change could have, so I'm uneasy accepting
> your patch.
Hello Stefan,
I fully understand your desire to be certain before accepting it.
> I added Dmitry to the Cc because AFAICT he's the last one to have
> changed that syntax-table and he seems to have changed it for the same
> kind of reasons as you:
Excellent. Thanks. This will help move the discussion about this
patch forward.
I hope we can make progress toward getting this (or a variation
of it) merged into Emacs.
> commit f6277911eb2c520aec8f0efd80c91999226e3322
> Author: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Fri Oct 2 07:11:56 2020 +0200
>
> Make xref work better on variables in shell-script-mode
>
> * lisp/progmodes/sh-script.el (sh-mode-syntax-table): Classify "/"
> as punctuation so that `M-.' on $foo/bar works on the $foo part
> (bug#25585).
>
> diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
> --- a/lisp/progmodes/sh-script.el
> +++ b/lisp/progmodes/sh-script.el
> @@ -370,26 +370,27 @@
> (defvar sh-mode-syntax-table
> [...]
> ?= "."
> + ?/ "."
> ?\; "."
> ?| "."
>
> Apparently this change hasn't brought any trouble over the last 5 years,
> so that's encouraging.
Yes, that's encouraging.
(For those who haven’t read the previous email: I’ve been
using Emacs with this patch applied for over a year without
encountering any issues in my sh-mode/bash-ts-mode
workflow. I generally search for the symbol at point,
highlight the symbol at point, complete using dabbrev, and
a few other methods...)
> AFAICT the chars you want to change have had the symbol-constituent
> syntax "forever", i.e. since the first commit:
>
> commit ac59aed83fbdfd298f58a1a7e638264b0c3b0caa
> Author: Richard M. Stallman <rms <at> gnu.org>
> Date: Tue Mar 22 05:43:25 1994 +0000
>
> entered into RCS
>
> diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
> --- /dev/null
> +++ b/lisp/progmodes/sh-script.el
> [...]
> +(defvar sh-mode-syntax-table
> + (let ((table (copy-syntax-table)))
> + (modify-syntax-entry ?\# "<" table)
> + (modify-syntax-entry ?\^l ">#" table)
> + (modify-syntax-entry ?\n ">#" table)
> + (modify-syntax-entry ?\" "\"\"" table)
> + (modify-syntax-entry ?\' "\"'" table)
> + (modify-syntax-entry ?\` "$`" table)
> + (modify-syntax-entry ?$ "_" table)
> + (modify-syntax-entry ?! "_" table)
> + (modify-syntax-entry ?% "_" table)
> + (modify-syntax-entry ?: "_" table)
> + (modify-syntax-entry ?. "_" table)
> + (modify-syntax-entry ?^ "_" table)
> + (modify-syntax-entry ?~ "_" table)
> + table)
> + "Syntax table in use in Shell-Script mode.")
>
> That can be taken to mean that we have 30 years of experience with that
> setting as being "the right one". Of course it can also be taken to
It's interesting to see how many years this syntax table has
remained unchanged.
> mean that it was just an accident that simply bites rarely enough that
> nobody bothered to fix it yet.
Or, as in my case before submitting this patch, developers
may have simply patched Emacs locally for months, either by
modifying the source or adjusting the syntax table via Elisp.
Many users prefer to address annoyances like the one this
patch fixes by "patching" locally through their init files,
rather than going through the process of submitting an actual
patch.
>
>
> Stefan
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Wed, 16 Apr 2025 23:31:02 GMT)
Message #107 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 15/04/2025 03:23, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
>> For context: I have been using the syntax table introduced
>> by this patch for over a year in extensive Bash scripting (I
>> write hundreds of Bash scripts annually), and I have not
>> encountered any issues with `sh-mode` or `bash-ts-mode` when
>> this patch is applied.
> It's possible that your patch is actually safe.
>
> Someone™ just needs to take a close look at the syntax-propertize,
> font-lock, and indentation code to assess the potential effects (I think
> it's the only two cases that deserve such close scrutiny, because other
> cases are usually more like dabbrev in the sense that there isn't as
> clear a "right-vs-wrong" distinction).
>
> Another option might be to combine your patch with another patch which
> sets the syntax-table used for font-lock, indentation, and syntax-ppss.
To point out a more advanced solution: if the character constituents for
variables and commands are different, perhaps Somebody(TM) could write a
syntax-propertize-function for sh-mode which would distinguish some of
these constructs from others, and add the 'syntax-table' text property.
This is a lot more work, but I figured I should mention it.
And for bash-ts-mode, the syntax-propertize-function could be easier to
write, at least if the parser properly distinguishes these cases.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Wed, 16 Apr 2025 23:44:05 GMT)
Message #110 received at 77746 <at> debbugs.gnu.org (full text, mbox):
Hi! Regarding this:
On 15/04/2025 23:46, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
> I added Dmitry to the Cc because AFAICT he's the last one to have
> changed that syntax-table and he seems to have changed it for the same
> kind of reasons as you:
>
> commit f6277911eb2c520aec8f0efd80c91999226e3322
> Author: Dmitry Gutov<dgutov <at> yandex.ru>
> Date: Fri Oct 2 07:11:56 2020 +0200
>
> Make xref work better on variables in shell-script-mode
>
> * lisp/progmodes/sh-script.el (sh-mode-syntax-table): Classify "/"
> as punctuation so that `M-.' on $foo/bar works on the $foo part
> (bug#25585).
>
> diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
> --- a/lisp/progmodes/sh-script.el
> +++ b/lisp/progmodes/sh-script.el
> @@ -370,26 +370,27 @@
> (defvar sh-mode-syntax-table
> [...]
> ?= "."
> + ?/ "."
> ?\; "."
> ?| "."
>
> Apparently this change hasn't brought any trouble over the last 5 years,
> so that's encouraging.
Sorry to say, I don't have much more experience with these files. The
'/' character seemed safe enough, since it sounds silly to define
symbols including it, given how prevalent its other use is in scripts.
A lot of the questions that have been brought up in this thread seem
very good. In particular, is there a more urgent subset of the
characters that we would want to change, and what are the code examples
that would be affected? Even if we just list the positive cases. But
preferably realistic.
E.g. this one from the screenshot:
var%name=1
What is such code supposed to do? Is that an assignment inside the
modulo operation?
More generally, a good argument for a wholesale change could be made
with an analysis of some larger body of shell scripts. For example,
the Git codebase has a lot of .sh files (even if a lot of them are in
the 't' directory). Better examples welcome.
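
A rough sketch of such an analysis, in POSIX sh, is given below. The
helper name list_odd_function_names is hypothetical, and the sed
pattern is a simplifying assumption: it only recognizes definitions
written as "name() {" or "name ()" on a line of their own, not the
"function name" form or one-line bodies.

```shell
# Hypothetical helper: print the names of functions defined in the
# given shell files that contain any of the characters ! % ^ ~ : , .
# Assumption: definitions look like "name() {" or "name ()".
list_odd_function_names() {
    sed -n 's/^[[:space:]]*\([^[:space:]()=][^[:space:]()=]*\)[[:space:]]*()[[:space:]]*{*[[:space:]]*$/\1/p' "$@" |
        grep '[!%^~:,.]'
}
```

Run over a tree of scripts (for example with the files found by
"find . -name '*.sh'"), an empty result would suggest that no
function defined in that style uses those characters.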
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Thu, 17 Apr 2025 12:30:05 GMT)
Message #113 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-16 19:43, Dmitry Gutov wrote:
> Hi! Regarding this:
>
> On 15/04/2025 23:46, Stefan Monnier via Bug reports for GNU Emacs, the
> Swiss army knife of text editors wrote:
>> I added Dmitry to the Cc because AFAICT he's the last one to have
>> changed that syntax-table and he seems to have changed it for the same
>> kind of reasons as you:
>>
>> commit f6277911eb2c520aec8f0efd80c91999226e3322
>> Author: Dmitry Gutov<dgutov <at> yandex.ru>
>> Date: Fri Oct 2 07:11:56 2020 +0200
>> Make xref work better on variables in shell-script-mode
>> * lisp/progmodes/sh-script.el (sh-mode-syntax-table):
>> Classify "/"
>> as punctuation so that `M-.' on $foo/bar works on the $foo part
>> (bug#25585).
>> diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-
>> script.el
>> --- a/lisp/progmodes/sh-script.el
>> +++ b/lisp/progmodes/sh-script.el
>> @@ -370,26 +370,27 @@
>> (defvar sh-mode-syntax-table
>> [...]
>> ?= "."
>> + ?/ "."
>> ?\; "."
>> ?| "."
>>
>> Apparently this change hasn't brought any trouble over the last 5 years,
>> so that's encouraging.
>
> Sorry to say, I don't have much more experience with these files. The
> '/' character seemed safe enough, since it sounds silly to define
> symbols including it, given how prevalent its other use is in scripts.
The '/' character is allowed in function names in Bash, but
you chose to reclassify it as punctuation because you
recognized that it is rarely used that way in practice.
The same reasoning applies to this patch. Characters like
`!`, `%`, `^`, `~`, `.`, and `:` are technically allowed in
function names, but this patch reclassifies them as
punctuation, just as you did with `/`, because they are
rarely used in function names and never used in variable
names.
Treating them as punctuation is more convenient, as it
improves the accuracy of symbol detection.
> A lot of the questions that have been brought up in this thread seem
> very good. In particular, is there a more urgent subset of the
> characters that we would want to change, and what are the code examples
> that would be affected? Even if we just list the positive cases. But
> preferably realistic.
>
> E.g. this one from the screenshot:
>
> var%name=1
I intentionally used symbols like % in my example to
illustrate that those symbols should be treated as
punctuation rather than symbol constituents.
(I recommend reading my other message in this thread for more
details.)
I believe merging this patch is just as beneficial
as your modification to `/`. (I have been using this patch
for over a year without encountering any issues in my
workflow.)
> What is such code supposed to do? Is that an assignment inside the
> modulo operation?
>
> More generally, a good argument for a wholesale change could be made
> with an analysis of some larger body of shell scripts. For example,
> the Git codebase has a lot of .sh files (even if a lot of them are in
> the 't' directory). Better examples welcome.
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Thu, 17 Apr 2025 13:02:02 GMT)
Message #116 received at 77746 <at> debbugs.gnu.org (full text, mbox):
Hello Eli,
Attached is v3 of this patch.
(I have updated the commit message to include more details
and reintroduced the '.' character.)
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
On 2025-04-15 07:47, James Cherti wrote:
> Hello Eli,
>
> I just want to note that the second patch is not ideal, as
> it omits the '.' character. If you decide to merge this
> patch (or a variation of it), I recommend including '.' as
> well, since '.' is not allowed in variable names.
>
> In the context of Bash development, it is uncommon to use
> characters like !%^~:. in function names. In my view, it is
> more practical to treat them as punctuation rather than
> symbol constituents by default, as what matters the most
> in Bash development is variable names and function names.
>
> Maintaining some degree of backward compatibility might be
> beneficial in case other users prefer completing commands
> that contain characters like !%^~:,. (though in practice,
> it is rare for users to complete commands containing
> !%^~:, with the exception of '.'. Outside of command names,
> the '.' character is also rarely used in function names
> and never used in variable names).
>
> The variable for backward compatibility could be defined
> as one of the following, or a variation of them:
> (defvar sh-mode-syntax-table-respect-variable-name-limits t)
> (defvar sh-mode-symbol-respect-variable-name-limits t)
> (defvar sh-mode-respect-variable-name-limits t)
>
> (It could also default to nil, mitigating the risk of any
> user encountering issues with this new behavior. As I
> mentioned earlier, I haven't faced any problems with this
> patch applied for several months.)
>
> --
> James Cherti
> GitHub: https://github.com/jamescherti
> Website: https://www.jamescherti.com/
>
> On 2025-04-15 02:17, Eli Zaretskii wrote:
>>> Date: Mon, 14 Apr 2025 15:39:24 -0400
>>> Cc: 77746 <at> debbugs.gnu.org
>>> From: James Cherti <contact <at> jamescherti.com>
>>>
>>> Attached: v2 of this patch
>>>
>>> This patch marks only the following characters as
>>> punctuation: !%^~:,
>>>
>>> I removed . because it is commonly used in command names
>>> (e.g., mkfs.ext4).
>>>
>>> (Excluding '.' not only covers the majority of command and
>>> function names but also prevents using ,!%^~: as
>>> symbol constituents, resulting in more accurate
>>> variable symbols.)
>>
>> Should this backward-incompatible change be controlled by a user
>> variable? I can easily imagine some user who'd come complaining about
>> this compromise. Without a knob to get back previous behavior, we
>> will have no way of satisfying such users without reverting the
>> change.
>>
>>
>>
>
>
>
>
[fix-sh-mode-syntax-table-v3.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Sat, 26 Apr 2025 12:25:01 GMT)
Message #119 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 17 Apr 2025 09:01:31 -0400
> From: James Cherti <contact <at> jamescherti.com>
> Cc: 77746 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, juri <at> linkov.net
>
> Hello Eli,
>
> Attached is v3 of this patch.
>
> (I have updated the commit message to include more details
> and reintroduced the '.' character.)
Thanks. I'm not sure whether the discussion was concluded and we are
all okay with this change. Stefan and Juri, please tell what you
think we should do next.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 28 Apr 2025 16:15:02 GMT)
Message #122 received at 77746 <at> debbugs.gnu.org (full text, mbox):
Hello Eli,
I am the only person who has expressed a definitive opinion
that merging this would benefit Sh and Bash scripts. I have
been using this patch for over a year without any issues.
This change is similar to the one previously merged to treat
"/" as punctuation:
commit f6277911eb2c520aec8f0efd80c91999226e3322
Author: Dmitry Gutov <dgutov <at> yandex.ru>
Date: 2020-10-02 07:11:56 +0200
Make xref work better on variables in shell-script-mode
* lisp/progmodes/sh-script.el (sh-mode-syntax-table): Classify "/"
as punctuation so that `M-.' on $foo/bar works on the $foo part
(bug#25585).
The above patch was accepted.
@Stefan @Juri: Have you been able to perform any checks that
would allow this patch to be merged?
I hope that Stefan, Juri, or you, Eli, will review the
discussion and make a decision. I have sent numerous emails
with explanations.
Thanks,
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 28 Apr 2025 17:42:02 GMT)
Message #125 received at 77746 <at> debbugs.gnu.org (full text, mbox):
> commit f6277911eb2c520aec8f0efd80c91999226e3322
> Author: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: 2020-10-02 07:11:56 +0200
> Make xref work better on variables in shell-script-mode
> * lisp/progmodes/sh-script.el (sh-mode-syntax-table): Classify "/"
> as punctuation so that `M-.' on $foo/bar works on the $foo part
> (bug#25585).
>
> The above patch was accepted.
>
> @Stefan @Juri: Have you been able to perform any checks that
> would allow this patch to be merged?
Yeah, based on my experimentation with `/`, I'm now thinking it might be
safe enough to give punctuation syntax to those other characters.
Any remaining objection to pushing this to `master`?
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Mon, 28 Apr 2025 19:29:01 GMT)
Message #128 received at 77746 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Thank you for your prompt confirmation that the patch can be
merged, Stefan. This change will enhance the convenience of
editing bash and sh scripts.
While awaiting any objections, I am attaching v3 of the
patch, which includes all the characters we discussed. I am
sending it again in case the previous email was overlooked,
given the high volume of emails exchanged.
Attached: fix-sh-mode-syntax-table-v3.patch
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
[fix-sh-mode-syntax-table-v3.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Tue, 06 May 2025 00:51:01 GMT)
Message #131 received at 77746 <at> debbugs.gnu.org (full text, mbox):
Hello Eli,
Stefan confirmed that, based on experimentation with `/`, he
believes the patch should be safe enough.
I wanted to kindly follow up on the status of the patch
merge.
Thank you.
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
Reply sent to Stefan Monnier <monnier <at> iro.umontreal.ca>: You have taken responsibility. (Wed, 07 May 2025 19:06:02 GMT)
Notification sent to James Cherti <contact <at> jamescherti.com>: bug acknowledged by developer. (Wed, 07 May 2025 19:06:02 GMT)
Message #136 received at 77746-done <at> debbugs.gnu.org (full text, mbox):
I didn't see any other objections, so I pushed it to `master`.
Thanks James, especially for your patience,
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77746; Package emacs. (Wed, 07 May 2025 19:52:03 GMT)
Message #139 received at 77746 <at> debbugs.gnu.org (full text, mbox):
On 2025-05-07 15:05, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
> I didn't see any other objections, so I pushed it to `master`.
> Thanks James, especially for your patience,
My pleasure, Stefan.
I also want to thank you for ensuring the patch is worth
merging by asking insightful questions, which allowed me to
add more information to the commit message and clarify the
purpose of this patch.
--
James Cherti
GitHub: https://github.com/jamescherti
Website: https://www.jamescherti.com/
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 05 Jun 2025 11:24:10 GMT)