GNU bug report logs - #39277
26.3; Tcl font lock does not understand quoting


Package: emacs;

Reported by: Hadrien Lacour <hadrien.lacour <at> posteo.net>

Date: Sat, 25 Jan 2020 10:01:02 UTC

Severity: normal

Tags: fixed

Found in version 26.3

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 39277 in the body.
You can then email your comments to 39277 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Sat, 25 Jan 2020 10:01:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Hadrien Lacour <hadrien.lacour <at> posteo.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 25 Jan 2020 10:01:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Hadrien Lacour <hadrien.lacour <at> posteo.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.3; Tcl font lock does not understand quoting
Date: Sat, 25 Jan 2020 11:00:09 +0100
Hello, tcl-mode's font lock (highlighting) chokes on this simple case:
    puts {"hello}
where it considers the double quote inside the curly braces as a
"quoting" character.
I have confirmed it works with `emacs -Q`.


In GNU Emacs 26.3 (build 1, x86_64-pc-linux-gnu, X toolkit)
 of 2019-09-23 built on gentoo-zen2700x
Windowing system distributor 'The X.Org Foundation', version 11.0.12006000
System Description:	Gentoo Base System release 2.6

Recent messages:
Applying X config
For information about GNU Emacs and the GNU system, type C-h C-a.
Making completion list...

Configured using:
 'configure --prefix=/usr --build=x86_64-pc-linux-gnu
 --host=x86_64-pc-linux-gnu --mandir=/usr/share/man
 --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc
 --localstatedir=/var/lib --disable-silent-rules
 --docdir=/usr/share/doc/emacs-26.3-r1
 --htmldir=/usr/share/doc/emacs-26.3-r1/html --libdir=/usr/lib64
 --program-suffix=-emacs-26 --includedir=/usr/include/emacs-26
 --infodir=/usr/share/info/emacs-26 --localstatedir=/var
 --enable-locallisppath=/etc/emacs:/usr/share/emacs/site-lisp
 --without-compress-install --without-hesiod --without-pop
 --with-file-notification=inotify --enable-acl --without-dbus
 --without-modules --without-gameuser --without-gpm --without-kerberos
 --without-kerberos5 --without-lcms2 --without-xml2 --without-mailutils
 --without-selinux --with-gnutls --without-libsystemd --with-threads
 --without-wide-int --with-zlib --with-sound=no --with-x --without-ns
 --without-gconf --without-gsettings --without-toolkit-scroll-bars
 --without-gif --without-jpeg --without-png --without-rsvg
 --without-tiff --with-xpm --without-imagemagick --with-xft
 --without-cairo --without-libotf --without-m17n-flt
 --with-x-toolkit=lucid --without-xaw3d 'CFLAGS=-march=native -pipe -O2'
 CPPFLAGS= 'LDFLAGS=-Wl,-O1 -Wl,--as-needed''

Configured features:
XPM NOTIFY ACL GNUTLS FREETYPE XFT ZLIB LUCID X11 XDBE XIM THREADS

Important settings:
  value of $LANG: en_US.utf8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  diff-auto-refine-mode: t
  linum-mode: t
  electric-pair-mode: t
  savehist-mode: t
  global-whitespace-mode: t
  show-paren-mode: t
  company-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t
  hs-minor-mode: t

Load-path shadows:
/home/hadrien/.emacs.d/elpa/flymake-1.0.8/flymake hides /usr/share/emacs/26.3/lisp/progmodes/flymake

Features:
(shadow sort vc-mtn vc-hg vc-git diff-mode easy-mmode vc-bzr vc-src
vc-sccs vc-svn vc-cvs vc-rcs vc vc-dispatcher mail-extr emacsbug message
rmc puny dired dired-loaddefs format-spec rfc822 mml mml-sec epa derived
epg gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils company-oddmuse
company-keywords company-etags etags company-gtags company-dabbrev-code
company-dabbrev company-files company-capf company-cmake company-xcode
company-clang company-semantic company-eclim company-template
company-bbdb linum hideshow elec-pair savehist whitespace cc-styles
cc-align cc-engine cc-vars cc-defs paren eglot array filenotify jsonrpc
ert pp find-func ewoc debug xref flymake-proc flymake thingatpt warnings
compile comint ansi-color ring url-util project json map company edmacro
kmacro pcase deeper-blue-theme finder-inf info tex-site package easymenu
epg-config url-handlers url-parse auth-source cl-seq eieio eieio-core
cl-macs eieio-loaddefs password-cache url-vars seq byte-opt gv bytecomp
byte-compile cconv cl-loaddefs cl-lib site-gentoo time-date mule-util
tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type
mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode
lisp-mode prog-mode register page menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook jka-cmpr-hook
help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads inotify dynamic-setting font-render-setting x-toolkit
x multi-tty make-network-process emacs)

Memory information:
((conses 16 341610 9384)
 (symbols 48 32614 9)
 (miscs 40 93 178)
 (strings 32 113853 2521)
 (string-bytes 1 2844540)
 (vectors 16 33288)
 (vector-slots 8 769217 11356)
 (floats 8 92 262)
 (intervals 56 333 0)
 (buffers 992 12))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Sat, 25 Jan 2020 10:13:01 GMT) Full text and rfc822 format available.

Message #8 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Hadrien Lacour <hadrien.lacour <at> posteo.net>
To: 39277 <at> debbugs.gnu.org
Subject: tcl-mode does not understand quoting
Date: Sat, 25 Jan 2020 11:12:35 +0100
Sorry, the problem is even bigger than I thought, since it affects more
than the font lock. The cause is still double quotes not being ignored inside
curlies, but it messes up indentation and paren matching too.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Sat, 25 Jan 2020 10:54:01 GMT) Full text and rfc822 format available.

Message #11 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Hadrien Lacour <hadrien.lacour <at> posteo.net>
To: 39277 <at> debbugs.gnu.org
Subject: tcl-mode does not understand quoting
Date: Sat, 25 Jan 2020 11:53:22 +0100
Oh, by "it works" in the first message, I meant that it doesn't work.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Mon, 26 Oct 2020 21:05:01 GMT) Full text and rfc822 format available.

Message #14 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: mvar <mvar.40k <at> gmail.com>
To: Hadrien Lacour <hadrien.lacour <at> posteo.net>
Cc: 39277 <at> debbugs.gnu.org
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Mon, 26 Oct 2020 22:44:03 +0200
Hadrien Lacour <hadrien.lacour <at> posteo.net> writes:

> Hello, tcl-mode's font lock (highlighting) chokes on this simple case:
>     puts {"hello}
> where it considers the double quote inside the curly braces as a
> "quoting" character.
> I have confirmed it works with `emacs -Q`.

hi Hadrien,

there's some generic(?) syntactic font lock getting triggered once the
doublequote character is found, that expects a closing doublequote - until then
everything is locked as a string. Is this what this bug is about (it was not
100% clear to me from your initial report) ? i'm attaching a patch that works
around this behavior but i don't know if it is the proper way to deal with the
problem (it certainly doesn't look pretty). The idea is to insert an additional
rule in tcl-syntax-propertize-function that will match the tcl-builtin-list
keywords ('puts' is in there among others) plus the braces that follow, so
that if a doublequote is found in-between the braces then there won't be any
automatic string locking that would mess up the closing brace and everything
else (until another doublequote is found). Then in tcl-set-font-lock-keywords a
new rule will match the braces and any characters inside will be locked as a
string (including the lone doublequote).

[tcl.patch (text/x-patch, inline)]
diff --git a/lisp/progmodes/tcl.el b/lisp/progmodes/tcl.el
index 33aad2d39f..5dd02c1367 100644
--- a/lisp/progmodes/tcl.el
+++ b/lisp/progmodes/tcl.el
@@ -410,7 +410,8 @@ tcl-font-lock-keywords
 (defconst tcl-syntax-propertize-function
   (syntax-propertize-rules
    ;; Mark the few `#' that are not comment-markers.
-   ("[^;[{ \t\n][ \t]*\\(#\\)" (1 ".")))
+   ("[^;[{ \t\n][ \t]*\\(#\\)" (1 "."))
+   ((concat "\\_<" (regexp-opt tcl-builtin-list t) "\\_>" "\s*{\\([^}].*\\)}") (2 "_")))
   "Syntactic keywords for `tcl-mode'.")
 
 ;; FIXME need some way to recognize variables because array refs look
@@ -506,6 +507,7 @@ tcl-set-font-lock-keywords
          ;; number of "namespace::" qualifiers.  A leading "::" refers
          ;; to the global namespace.
          '("\\${\\([^}]+\\)}" 1 font-lock-variable-name-face)
+         '("{\\([^}]+\\)}" 1 font-lock-string-face)
          '("\\$\\(\\(?:::\\)?\\(?:[[:alnum:]_]+::\\)*[[:alnum:]_]+\\)"
            1 font-lock-variable-name-face)
          '("\\(?:\\s-\\|^\\|\\[\\)set\\s-+{\\([^}]+\\)}"

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 08:33:01 GMT) Full text and rfc822 format available.

Message #17 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: mvar <mvar.40k <at> gmail.com>
Cc: 39277 <at> debbugs.gnu.org, Hadrien Lacour <hadrien.lacour <at> posteo.net>
Date: Tue, 27 Oct 2020 09:31:57 +0100
mvar <mvar.40k <at> gmail.com> writes:

> there's some generic(?) syntactic font lock getting triggered once the
> doublequote character is found, that expects a closing doublequote -
> until then everything is locked as a string. Is this what this bug is
> about (it was not 100% clear to me from your initial report) ? i'm
> attaching a patch that works around this behavior but i don't know if
> it is the proper way to deal with the problem (it certainly doesn't
> look pretty).

I'm not sure, either -- it's been a while since I did Tcl programming,
but this change doesn't seem very invasive at least, and fixes this
specific problem, so I've applied it to Emacs 28.

This change was small enough to apply without assigning copyright to the
FSF, but for future patches you want to submit, it might make sense to
get the paperwork started now, so that subsequent patches can be applied
speedily.  Would you be willing to sign such paperwork?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 27 Oct 2020 08:33:02 GMT) Full text and rfc822 format available.

bug marked as fixed in version 28.1, send any further explanations to 39277 <at> debbugs.gnu.org and Hadrien Lacour <hadrien.lacour <at> posteo.net> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 27 Oct 2020 08:33:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 08:52:01 GMT) Full text and rfc822 format available.

Message #24 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: mvar <mvar.40k <at> gmail.com>
Cc: 39277 <at> debbugs.gnu.org, Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 27 Oct 2020 09:51:17 +0100
On Oct 26 2020, mvar wrote:

> diff --git a/lisp/progmodes/tcl.el b/lisp/progmodes/tcl.el
> index 33aad2d39f..5dd02c1367 100644
> --- a/lisp/progmodes/tcl.el
> +++ b/lisp/progmodes/tcl.el
> @@ -410,7 +410,8 @@ tcl-font-lock-keywords
>  (defconst tcl-syntax-propertize-function
>    (syntax-propertize-rules
>     ;; Mark the few `#' that are not comment-markers.
> -   ("[^;[{ \t\n][ \t]*\\(#\\)" (1 ".")))
> +   ("[^;[{ \t\n][ \t]*\\(#\\)" (1 "."))
> +   ((concat "\\_<" (regexp-opt tcl-builtin-list t) "\\_>" "\s*{\\([^}].*\\)}") (2 "_")))
>    "Syntactic keywords for `tcl-mode'.")
>  
>  ;; FIXME need some way to recognize variables because array refs look
> @@ -506,6 +507,7 @@ tcl-set-font-lock-keywords
>           ;; number of "namespace::" qualifiers.  A leading "::" refers
>           ;; to the global namespace.
>           '("\\${\\([^}]+\\)}" 1 font-lock-variable-name-face)
> +         '("{\\([^}]+\\)}" 1 font-lock-string-face)

That mishandles nested or quoted braces.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

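The remark about nested or quoted braces can be illustrated with a minimal
sketch. It is not taken from tcl.el: the patch's font-lock regexp
"{\\([^}]+\\)}" is transliterated here into Python's `re` syntax purely for
demonstration (Emacs regexps use a different notation), and the helper name
`first_brace_group` is hypothetical.

```python
import re

# Python transliteration of the Emacs regexp "{\\([^}]+\\)}" from the patch.
BRACE_GROUP = re.compile(r"\{([^}]+)\}")


def first_brace_group(source: str):
    """Return the text the regexp captures first, or None if nothing matches."""
    match = BRACE_GROUP.search(source)
    return match.group(1) if match else None


# Nested braces: the capture stops at the first "}", not at the matching one.
print(repr(first_brace_group("puts {a {b} c}")))
# A backslash-quoted brace is also taken to be the closing brace.
print(repr(first_brace_group(r"puts {a \} b}")))
```

In both calls the captured text ends before the word really does, so only a
prefix of the braced word would receive the string face.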



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 08:58:02 GMT) Full text and rfc822 format available.

Message #27 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: mvar <mvar.40k <at> gmail.com>, 39277 <at> debbugs.gnu.org,
 Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 27 Oct 2020 09:56:57 +0100
Andreas Schwab <schwab <at> linux-m68k.org> writes:

> That mishandles nested or quoted braces.

Yes...  but is it any worse than it was?  Not in the cases I was looking
at, but perhaps I wasn't looking at the right examples.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 13:28:02 GMT) Full text and rfc822 format available.

Message #30 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: mvar <mvar.40k <at> gmail.com>
Cc: 39277 <at> debbugs.gnu.org, Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 27 Oct 2020 09:27:45 -0400
>> Hello, tcl-mode's font lock (highlighting) chokes on this simple case:
>>     puts {"hello}
>> where it considers the double quote inside the curly braces as a
>> "quoting" character.
>> I have confirmed it works with `emacs -Q`.
> there's some generic(?) syntactic font lock getting triggered once the
> doublequote character is found, that expects a closing doublequote - until then
> everything is locked as a string. Is this what this bug is about (it was not
> 100% clear to me from your initial report) ? i'm attaching a patch that works
> around this behavior but i don't know if it is the proper way to deal with the
> problem (it certainly doesn't look pretty).

It's been too long since I last had to deal with Tcl so I can't remember
the rules.  The patch you submitted is most likely not "correct" in the
sense that it still leaves many cases that are mishandled.

Could someone remind me how " and {..} interact in Tcl?

E.g.

    proc foo1 () {
       puts "hello"
    }

prints "hello" (without the quotes)?
And

    proc foo2 () {
       puts {"hello}
    }

prints "hello (with the quote)?
And what about

    proc foo3 () {
       puts "hello}"
    }

    proc foo4 () {
       puts "hello\}"
    }

    proc foo5 () {
       puts "hello
    }

> The idea is to insert an additional
> rule in tcl-syntax-propertize-function that will match the tcl-builtin-list

I'm thinking that maybe a better option is to catch all " in
tcl-syntax-propertize-function and for every one of them see if they're
"closing" a string and if not, check whether they're closed by a } before
a matching " and if so mark them as "not opening a string".

> +         '("{\\([^}]+\\)}" 1 font-lock-string-face)

Won't this catch cases not usually considered as strings, like

    proc foo5 () {
        return 6
    }

?

        Stefan
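
The suggestion of checking whether a } comes before the matching " can be
sketched roughly as follows. This is a minimal Python illustration, not code
from tcl.el: the function name `quote_opens_string` is hypothetical, the quote
is assumed to be already known to sit inside a {...} word, and the separate
question of whether the quote is itself closing an earlier string is left out.

```python
def quote_opens_string(text: str, pos: int) -> bool:
    """Guess whether the double quote at text[pos] opens a string.

    Assumes text[pos] is a double quote located inside a {...} word.
    Scans forward and reports which comes first: a closing double quote
    (True) or the close brace of the enclosing word (False).
    """
    depth = 0  # braces opened after the quote and not yet closed
    i = pos + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash-quoted character
            continue
        if ch == '"':
            return True  # a closing quote comes first
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                return False  # the enclosing brace closes first
            depth -= 1
        i += 1
    return True  # nothing decides it; keep the quote as a string delimiter


body = '{\n   puts {"hello}\n}'
print(quote_opens_string(body, body.index('"')))
```

For the `puts {"hello}` case the scan reaches the close brace before any second
double quote, so the quote would be marked as not opening a string.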





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 15:25:02 GMT) Full text and rfc822 format available.

Message #33 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Hadrien Lacour <hadrien.lacour <at> posteo.net>
To: 39277 <at> debbugs.gnu.org
Subject: bug#39277
Date: Tue, 27 Oct 2020 16:24:48 +0100
Sorry for not really contributing, I must offer the excuse that I don't have
enough time or energy right now.

About Tcl syntax rules, the Tcl(3tcl) man page (Tcl(n) on Gentoo) explains them simply:
...
[4] Double quotes.
	If the first character of a word is double-quote (“"”) then the word is
	terminated by the next double-quote character.  If semi-colons, close
	brackets, or white space characters (including newlines) appear between the
	quotes then they are treated as ordinary characters and included in the
	word.  Command substitution, variable substitution, and backslash
	substitution are performed on the characters between the quotes as
	described below. The double-quotes are not retained as part of the word.
...
[6] Braces.
	If the first character of a word is an open brace (“{”) and rule [5] does
	not apply, then the word is terminated by the matching close brace (“}”).
	Braces nest within the word: for each additional open brace there must be
	an additional close brace (however, if an open brace or close brace within
	the word is quoted with a backslash then it is not counted in locating the
	matching close brace).  No substitutions are performed on the characters
	between the braces except for backslash-newline substitutions described
	below, nor do semi-colons, newlines, close brackets, or white space receive
	any special interpretation.  The word will consist of exactly the
	characters between the outer braces, not including the braces themselves.
...

To put it simply, braces act like sh's single quotes, and double quotes act
basically like sh's double quotes ($, [] and backslash substitution still apply).
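
Rule [6] can be sketched as a small scanner. This is a simplified Python
illustration with a hypothetical helper, `matching_close_brace`; it models only
brace nesting and backslash quoting and nothing else from the Tcl parser.

```python
def matching_close_brace(text: str, start: int) -> int:
    """Return the index of the brace closing the one at text[start], or -1."""
    if text[start] != "{":
        raise ValueError("start must point at an open brace")
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # a backslash-quoted character is never counted
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1  # unbalanced: no matching close brace


# The body of the foo3 example from earlier in the thread.
foo3_body = '{\n   puts "hello}"\n}'
close = matching_close_brace(foo3_body, 0)
print(close, repr(foo3_body[close + 1]))
```

Because rule [6] gives double quotes no special meaning inside braces, the
brace inside "hello}" is the one that ends the body word here, and the next
character is a stray double quote.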




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 17:47:01 GMT) Full text and rfc822 format available.

Message #36 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: mvar <mvar.40k <at> gmail.com>, 39277 <at> debbugs.gnu.org,
 Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 27 Oct 2020 18:45:48 +0100
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> And what about
>
>     proc foo3 () {
>        puts "hello}"
>     }
>
>     proc foo4 () {
>        puts "hello\}"
>     }
>
>     proc foo5 () {
>        puts "hello
>     }

It's fortunately been a couple of decades since I wrote Tcl, and...  I
don't remember.  :-/

> Won't this catch cases not usually considered as strings, like
>
>     proc foo5 () {
>         return 6
>     }

Yup.  I'll revert the patch and reopen this bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug No longer marked as fixed in versions 28.1 and reopened. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 27 Oct 2020 17:47:02 GMT) Full text and rfc822 format available.

Removed tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 27 Oct 2020 17:47:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 20:43:02 GMT) Full text and rfc822 format available.

Message #43 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: mvar <mvar.40k <at> gmail.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: mvar <mvar.40k <at> gmail.com>, 39277 <at> debbugs.gnu.org,
 Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 27 Oct 2020 22:42:44 +0200
Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>
>> And what about
>>
>>     proc foo3 () {
>>        puts "hello}"
>>     }
>>
>>     proc foo4 () {
>>        puts "hello\}"
>>     }
>>
>>     proc foo5 () {
>>        puts "hello
>>     }
>
> It's fortunately been a couple of decades since I wrote Tcl, and...  I
> don't remember.  :-/
>
>> Won't this catch cases not usually considered as strings, like
>>
>>     proc foo5 () {
>>         return 6
>>     }
>
> Yup.  I'll revert the patch and reopen this bug report.

thank you Lars for reverting, this didn't feel right anyway. I'll try to come up
with some more elegant solution or at least find some way to skip
breaking the other locks - for example moving the tcl-font-lock-keywords
regexp to the end of that list solves the problem Stefan mentioned but it
still doesn't address what Andreas pointed out, i.e. proc test (args) will
have args locked as a string.

btw to answer your previous email, i'd like to sign the copyright paperwork

thanks,
Michalis




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 20:48:01 GMT) Full text and rfc822 format available.

Message #46 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: mvar <mvar.40k <at> gmail.com>
Cc: 39277 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 27 Oct 2020 21:47:02 +0100
mvar <mvar.40k <at> gmail.com> writes:

> btw to answer your previous email, i'd like to sign the copyright paperwork

(Sent off-list.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 27 Oct 2020 22:50:02 GMT) Full text and rfc822 format available.

Message #49 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: mvar <mvar.40k <at> gmail.com>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 39277 <at> debbugs.gnu.org,
 Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 27 Oct 2020 18:48:56 -0400
> thank you Lars for reverting, this didn't feel right anyway. I'll try to come up
> with some more elegant solution or at least find some way to skip
> breaking the other locks - for example moving the tcl-font-lock-keywords
> regexp to the end of that list solves the problem Stefan mentioned but it
> still doesn't address what Andreas pointed out, i.e. proc test (args) will
> have args locked as a string.

How 'bout the patch below?


        Stefan


diff --git a/lisp/progmodes/tcl.el b/lisp/progmodes/tcl.el
index 717008a0a2..d4d51e8b50 100644
--- a/lisp/progmodes/tcl.el
+++ b/lisp/progmodes/tcl.el
@@ -407,10 +407,64 @@ tcl-font-lock-keywords
 `tcl-typeword-list', and `tcl-keyword-list' by the function
 `tcl-set-font-lock-keywords'.")
 
+(eval-and-compile
+  (defconst tcl--word-delimiters "[;{ \t\n"))
+
+(defun tcl--syntax-of-quote (pos)
+  "Decide whether a double quote opens a string or not."
+  ;; This is pretty tricky, because strings can be written as "..."
+  ;; or as {...} or without any quoting at all for some simple and not so
+  ;; simple cases (e.g. `abc' but also `a"b').  To make things more
+  ;; interesting, code is represented as strings, so the content of
+  ;; strings can be later re-lexed to find nested strings.
+  (save-excursion
+    (let ((ppss (syntax-ppss pos)))
+      (cond
+       ((nth 8 ppss) nil) ;; Within a string or a comment.
+       ((not (memq (char-before pos)
+                   (cons nil
+                         (eval-when-compile
+                           (mapcar #'identity tcl--word-delimiters)))))
+        ;; The double quote appears within some other lexical entity.
+        ;; FIXME: Similar treatment should be used for `{' which can appear
+        ;; within non-delimited strings (but only at top-level, so
+        ;; maybe it's not worth worrying about).
+        (string-to-syntax "."))
+       ((zerop (nth 0 ppss))
+        ;; Not within a { ... }, so can't be truncated by a }.
+        ;; FIXME: The syntax-table also considers () and [] as paren
+        ;; delimiters just like {}, even though Tcl treats them differently.
+        ;; Tho I'm not sure it's worth worrying about, either.
+        nil)
+       (t
+        ;; A double quote within a {...}: leave it as a normal string
+        ;; delimiter only if we don't find a closing } before we
+        ;; find a closing ".
+        (let ((type nil)
+              (depth 0))
+          (forward-char 1)
+          (while (and (not type)
+                      (re-search-forward "[\"{}\\]" nil t))
+            (pcase (char-after (match-beginning 0))
+              (?\\ (forward-char 1))
+              (?\" (setq type 'matched))
+              (?\{ (cl-incf depth))
+              (?\} (if (zerop depth) (setq type 'unmatched)
+                     (cl-decf depth)))))
+          (when (> (line-beginning-position) pos)
+            ;; The quote is not on the same line as the deciding
+            ;; factor, so make sure we revisit this choice later.
+            (put-text-property pos (point) 'syntax-multiline t))
+          (when (eq type 'unmatched)
+            ;; The quote has no matching close because a } closes the
+            ;; surrounding string before, so it doesn't really "open a string".
+            (string-to-syntax "."))))))))
+
 (defconst tcl-syntax-propertize-function
   (syntax-propertize-rules
    ;; Mark the few `#' that are not comment-markers.
-   ("[^;[{ \t\n][ \t]*\\(#\\)" (1 ".")))
+   ((concat "[^" tcl--word-delimiters "][ \t]*\\(#\\)") (1 "."))
+   ("\"" (0 (tcl--syntax-of-quote (match-beginning 0)))))
   "Syntactic keywords for `tcl-mode'.")
 
 ;; FIXME need some way to recognize variables because array refs look
@@ -593,6 +647,8 @@ tcl-mode
        '(tcl-font-lock-keywords nil nil nil beginning-of-defun))
   (set (make-local-variable 'syntax-propertize-function)
        tcl-syntax-propertize-function)
+  (add-hook 'syntax-propertize-extend-region-functions
+            #'syntax-propertize-multiline 'append 'local)
 
   (set (make-local-variable 'imenu-generic-expression)
        tcl-imenu-generic-expression)
diff --git a/test/manual/indent/tcl.tcl b/test/manual/indent/tcl.tcl
new file mode 100644
index 0000000000..447b64cf1c
--- /dev/null
+++ b/test/manual/indent/tcl.tcl
@@ -0,0 +1,19 @@
+
+puts "hello}"; # Top-level strings can contain unescaped closing braces!
+
+puts a"b;                  # Non-delimited strings can contain quotes!
+puts a""b;                 # Even several of them!
+
+proc foo1 {} {
+    puts "hello";   # Normal case!
+    puts "hello\};  # This will signal an error when `foo1` is called!
+}
+
+proc foo1 {} {
+    puts "hello; # This will also signal an error when `foo1` is called!
+}
+
+proc foo1 {} {
+    puts a"b;                   # This will not signal an error!
+    puts a""b";                 # And that won't either!
+}





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Thu, 29 Oct 2020 17:40:02 GMT) Full text and rfc822 format available.

Message #52 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: mvar <mvar.40k <at> gmail.com>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 39277 <at> debbugs.gnu.org,
 Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Thu, 29 Oct 2020 13:39:20 -0400
> How 'bout the patch below?

Pushed to `master`,


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Fri, 30 Oct 2020 12:03:02 GMT) Full text and rfc822 format available.

Message #55 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: mvar <mvar.40k <at> gmail.com>, 39277 <at> debbugs.gnu.org,
 Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Fri, 30 Oct 2020 13:02:03 +0100
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>> How 'bout the patch below?
>
> Pushed to `master`,

Great; seems to work fine here with the test cases in this bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 30 Oct 2020 12:03:02 GMT) Full text and rfc822 format available.

bug marked as fixed in version 28.1, send any further explanations to 39277 <at> debbugs.gnu.org and Hadrien Lacour <hadrien.lacour <at> posteo.net> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 30 Oct 2020 12:03:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Sat, 31 Oct 2020 11:02:02 GMT) Full text and rfc822 format available.

Message #62 received at 39277 <at> debbugs.gnu.org (full text, mbox):

From: mvar <mvar.40k <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: mvar <mvar.40k <at> gmail.com>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 39277 <at> debbugs.gnu.org, Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Sat, 31 Oct 2020 13:01:08 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>> How 'bout the patch below?
>
> Pushed to `master`,
>
>
>         Stefan


hi Stefan,

apologies for late reply, i needed a couple of days to work with the
patched tcl.el in my (disgustingly large) tcl codebase to be sure
nothing breaks & can confirm now. The original case is solved
(although the enclosed {"string} is not font-locked as string but i
wouldn't consider it an error) plus it fixes the following:

    proc foo4 () {
       puts "hello}"
    }

this was marked as valid before your changes but tclsh won't accept it
as such - the bracket } inside the string needs to be escaped when inside a proc
context (but as a plain statement there's no such requirement).

many thanks,
Michalis





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Sat, 31 Oct 2020 13:21:02 GMT) Full text and rfc822 format available.

Message #65 received at 39277-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: mvar <mvar.40k <at> gmail.com>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 39277-done <at> debbugs.gnu.org,
 Hadrien Lacour <hadrien.lacour <at> posteo.net>
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Sat, 31 Oct 2020 09:20:12 -0400
> apologies for late reply,

No need to apologize, and it's been less than 4 days, so it's definitely
not late.

> I needed a couple of days to work with the patched tcl.el in my
> (disgustingly large) tcl codebase to be sure nothing breaks & can
> confirm now.

Great, thanks!

> The original case is solved
> (although the enclosed {"string} is not font-locked as string but I
> wouldn't consider it an error)

Yes, this is a separate problem and I can't see how to fix it: since
"everything's a string" in Tcl, it's really not clear what
`font-lock-string-face` should apply to and what it shouldn't apply to.

The current design is to use it only where "..." is used.  When the code
is fully under your control it lets you choose (to some extent at least)
what is highlighted and what is not (by choosing "..." vs {...}), but
clearly it won't be "right" in all cases.

> plus it fixes the following:
>
>     proc foo4 () {
>        puts "hello}"
>     }
>
> this was marked as valid before your changes but tclsh won't accept it
> as such - the bracket } inside the string needs to be escaped when
> inside a proc context (but as a plain statement there's no such
> requirement).

Indeed.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 03 Nov 2020 19:49:02 GMT)

Message #68 received at 39277-done <at> debbugs.gnu.org (full text, mbox):

From: mvar <mvar.40k <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: mvar <mvar.40k <at> gmail.com>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 39277-done <at> debbugs.gnu.org
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 03 Nov 2020 21:47:53 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>> The original case is solved
>> (although the enclosed {"string} is not font-locked as string but I
>> wouldn't consider it an error)
>
> Yes, this is a separate problem and I can't see how to fix it: since
> "everything's a string" in Tcl, it's really not clear what
> `font-lock-string-face` should apply to and what it shouldn't apply to.
>
> The current design is to use it only where "..." is used.  When the code
> is fully under your control it lets you choose (to some extent at least)
> what is highlighted and what is not (by choosing "..." vs {...}), but
> clearly it won't be "right" in all cases.
>
>> plus it fixes the following:
>>
>>     proc foo4 () {
>>        puts "hello}"
>>     }
>>
>> this was marked as valid before your changes but tclsh won't accept it
>> as such - the bracket } inside the string needs to be escaped when
>> inside a proc context (but as a plain statement there's no such
>> requirement).
>
> Indeed.
>

hi again,

i stumbled upon a not-so-rare case where this fix breaks a previously
valid syntax locking. Example:

set a "Testing: [split "192.168.1.1/24" "/"] address"

the closing ] is marked as unmatched (no matching parenthesis found)

notice how the above statement is evaluated by tclsh/wish:

% set a "Testing: [split "192.168.1.1/24" "/"] address"
Testing: 192.168.1.1 24 address
% 

the problem with the unmatched ] can only(?) be solved by escaping both
inner "strings"

set a "Testing: [split \"192.168.1.1/24\" \"/\"] address"

but then split also treats the literal quote characters as separators,
so the resulting list gains empty elements:

% set a "Testing: [split \"192.168.1.1/24\" \"/\"] address"
Testing: {} 192.168.1.1 24 {} address
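
Editor's sketch: the extra `{}` elements appear because, once the quotes are escaped, they become part of both arguments to split: the string is `"192.168.1.1/24"` with its quotes, and the separator set is the three characters `"/"`. Every character of the separator argument is a delimiter, and a delimiter at the start or end of the string yields an empty element. The Python below mimics that behaviour for illustration only; `tcl_like_split` is a hypothetical name, and the sketch does not cover split's special case for an empty separator argument.

```python
def tcl_like_split(text, split_chars):
    """Split text at every character that occurs in split_chars.

    Adjacent, leading, or trailing delimiters produce empty fields,
    which Tcl prints as {}.
    """
    if not text:
        return []
    fields = []
    current = []
    for ch in text:
        if ch in split_chars:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


# Unescaped form: the quotes only delimit the arguments.
print(tcl_like_split('192.168.1.1/24', '/'))

# Escaped form: the quotes are data and also separators.
print(tcl_like_split('"192.168.1.1/24"', '"/"'))
```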

i still consider the previously applied fix as an overall improvement,
so perhaps i should open a new bug report for this problem? btw i tried
to come up with some fix but i'm still a looong way from grasping those
syntax & parsing mechanisms (syntax-ppss and friends)


thanks,
Michalis




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Tue, 03 Nov 2020 21:46:02 GMT)

Message #71 received at 39277-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: mvar <mvar.40k <at> gmail.com>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 39277-done <at> debbugs.gnu.org
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Tue, 03 Nov 2020 16:45:44 -0500
> i stumbled upon a not-so-rare case where this fix breaks a previously
> valid syntax locking. Example:
>
> set a "Testing: [split "192.168.1.1/24" "/"] address"
>
> the closing ] is marked as unmatched (no matching parenthesis found)

Oh, right, that's like the `...` inside "..." in sh.
I had completely forgotten about it.

> i still consider the previously applied fix as an overall improvement,
> so perhaps i should open a new bug report for this problem?

I think so, yes (but if so, please send me the bugnb).

The old code handled it differently but not correctly either (in this
case the breakage was less annoying, and maybe it's even the case in
general, but it's largely by accident), so it's really a separate issue.


        Stefan
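
Editor's sketch: the comparison with backquotes inside double quotes in sh can be made concrete. Inside a Tcl double-quoted word, `[` opens a nested script that has its own quoting, so a `"` between the brackets does not end the outer string. The Python below is a simplified illustration of that nesting and is not the tcl.el implementation. The names `quoted_word_end` and `command_subst_end` are hypothetical, and the sketch assumes every `"` inside brackets opens a quoted word and ignores braces.

```python
def quoted_word_end(src, start):
    """Return the index of the `"` closing the quoted word at src[start]."""
    i = start + 1
    while i < len(src):
        ch = src[i]
        if ch == "\\":
            i += 2                              # escaped character
        elif ch == "[":
            i = command_subst_end(src, i) + 1   # nested script
        elif ch == '"':
            return i
        else:
            i += 1
    raise ValueError("missing closing quote")


def command_subst_end(src, start):
    """Return the index of the `]` closing the `[` at src[start]."""
    i = start + 1
    while i < len(src):
        ch = src[i]
        if ch == "\\":
            i += 2
        elif ch == "[":
            i = command_subst_end(src, i) + 1
        elif ch == '"':
            i = quoted_word_end(src, i) + 1     # inner string, own quoting
        elif ch == "]":
            return i
        else:
            i += 1
    raise ValueError("missing close-bracket")


line = 'set a "Testing: [split "192.168.1.1/24" "/"] address"'
opening = line.index('"')
print(quoted_word_end(line, opening) == len(line) - 1)
```

A scanner that simply pairs each `"` with the next one would end the outer string just before `192.168.1.1/24`, which leaves the `]` outside any matching `[`.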





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39277; Package emacs. (Thu, 05 Nov 2020 12:39:01 GMT)

Message #74 received at 39277-done <at> debbugs.gnu.org (full text, mbox):

From: mvar <mvar.40k <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: mvar <mvar.40k <at> gmail.com>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 39277-done <at> debbugs.gnu.org
Subject: Re: bug#39277: 26.3; Tcl font lock does not understand quoting
Date: Thu, 05 Nov 2020 14:38:19 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>> i still consider the previously applied fix as an overall improvement,
>> so perhaps i should open a new bug report for this problem?
>
> I think so, yes (but if so, please send me the bugnb).
>
> The old code handled it differently but not correctly either (in this
> case the breakage was less annoying, and maybe it's even the case in
> general, but it's largely by accident), so it's really a separate issue.
>
>
>         Stefan

thank you Stefan, i've opened #44465

br,
Michalis




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 04 Dec 2020 12:24:08 GMT)

This bug report was last modified 3 years and 142 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.