GNU bug report logs - #30393
cperl-mode: open-paren-in-column-0 of string literal affects later statement indentation

Package: emacs

Reported by: paulusm <paulusm <at> bigpond.com>

Date: Thu, 8 Feb 2018 16:20:03 UTC

Severity: minor

Tags: fixed

Found in version 24.4

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 30393 in the body.
You can then email your comments to 30393 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Thu, 08 Feb 2018 16:20:03 GMT)

Acknowledgement sent to paulusm <paulusm <at> bigpond.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 08 Feb 2018 16:20:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: paulusm <paulusm <at> bigpond.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.4; cperl-mode: indentation failure
Date: Fri, 9 Feb 2018 02:25:52 +1100

### How to foul up indentation in cperl-mode (cperl-indent-command)

# Here is some badly-indented Perl:

          my $sql = "insert into jobs (id, priority) values (1, 2);";
               my $sth = $dbh->prepare($sql) or die "bother";

# try indenting each line above (individually) by pressing TAB
# (cperl-indent-command).
#
# No problem - everything behaving normally.


# Now try this:

          my $sql = "insert into jobs
(id, priority)
values (1, 2);";
               my $sth = $dbh->prepare($sql) or die "bother";

# Note how "my $sth..." doesn't indent properly?  On my system, it
# stays where it is, where I expect it to re-indent to column 0.


# Now anything following refuses to indent:

          print "Help!";


# Comment out the second code block, and the "Help!" line indents
# properly again.

# It's worth noting at this point that the /contents/ of the string
# seem to trigger the issue.  If the "$sql = ..." lines were changed
# to:
#
# first case:
#   $sql = "select * from jobs;";
#
# second case:
#   $sql = "select *
#   from jobs;";
#
# The issue does not appear.



In GNU Emacs 24.4.1 (x86_64-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2017-09-12 on hullmann, modified by Debian
Windowing system distributor `The X.Org Foundation', version 11.0.11604000
System Description:	Debian GNU/Linux 8.10 (jessie)

--
Paul.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Fri, 09 Feb 2018 01:45:01 GMT)

Message #8 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> users.sourceforge.net>
To: paulusm <paulusm <at> bigpond.com>
Cc: 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Thu, 08 Feb 2018 20:44:07 -0500

retitle 30393 cperl-mode: open-paren-in-column-0 of string literal affects later statement indentation
quit

paulusm <paulusm <at> bigpond.com> writes:

> # It's worth noting at this point that the /contents/ of the string
> # seem to trigger the issue.

Specifically, it's the open paren in the column 0 that triggers it.  You
can set `open-paren-in-column-0-is-defun-start' to nil to fix it.  Same
idea as Bug#25480 (that one is cc-mode).
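
A minimal sketch of that workaround, assuming it is applied only to
cperl-mode buffers through `cperl-mode-hook' (the hook placement is an
assumption for illustration; setting the variable globally would also
work):

```elisp
;; Sketch only: make the option buffer-local in Perl buffers so that an
;; open paren in column 0 inside a string is not taken as a defun start.
;; `setq-local' is available from Emacs 24.3 onwards.
(add-hook 'cperl-mode-hook
          (lambda ()
            (setq-local open-paren-in-column-0-is-defun-start nil)))
```

On Emacs versions where the option still affects low-level parsing, this
may slow things down in large buffers.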

Changed bug title to 'cperl-mode: open-paren-in-column-0 of string literal affects later statement indentation' from '24.4; cperl-mode: indentation failure'. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Fri, 09 Feb 2018 01:45:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Fri, 09 Feb 2018 17:51:01 GMT)

Message #13 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Noam Postavsky <npostavs <at> users.sourceforge.net>
Cc: 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: 9 Feb 2018 17:50:40 -0000

In article <mailman.8766.1518140709.27995.bug-gnu-emacs <at> gnu.org> you wrote:
> retitle 30393 cperl-mode: open-paren-in-column-0 of string literal affects later statement indentation
> quit

> paulusm <paulusm <at> bigpond.com> writes:

>> # It's worth noting at this point that the /contents/ of the string
>> # seem to trigger the issue.

> Specifically, it's the open paren in the column 0 that triggers it.  You
> can set `open-paren-in-column-0-is-defun-start' to nil to fix it.  Same
> idea as Bug#25480 (that one is cc-mode).

Just to remind people, I fixed all this nonsense about parens in column
0 and `open-paren-in-column-0-is-defun-start' over a year ago.  Key
search term: "comment-cache".

My fix was rejected without any deep, soul-searching consideration, for
reasons which appeared obscure then and haven't become any clearer
since.

This bug, the failure to deal reasonably with open parens in column
zero, is a malignancy on the face of Emacs, breeding bug after bug after
bug, as we see here, as we have seen many times over the years.

If my fix isn't going to be accepted, I think it's high time that
somebody else stepped up to the plate and fixed this monstrosity once
and for all.

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sat, 10 Feb 2018 03:56:02 GMT)

Message #16 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> users.sourceforge.net>
To: Alan Mackenzie <acm <at> muc.de>
Cc: 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Fri, 09 Feb 2018 22:55:31 -0500

Alan Mackenzie <acm <at> muc.de> writes:

>> Specifically, it's the open paren in the column 0 that triggers it.  You
>> can set `open-paren-in-column-0-is-defun-start' to nil to fix it.  Same
>> idea as Bug#25480 (that one is cc-mode).
>
> Just to remind people, I fixed all this nonsense about parens in column
> 0 and `open-paren-in-column-0-is-defun-start' over a year ago.  Key
> search term: "comment-cache".

I fixed it for emacs-lisp-mode using syntax-ppss (Bug#27920, Bug#25122),
but based on previous discussions, you wouldn't be especially happy with
such a solution...

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sat, 10 Feb 2018 08:55:02 GMT)

Message #19 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Alan Mackenzie <acm <at> muc.de>,
 Noam Postavsky <npostavs <at> users.sourceforge.net>
Cc: 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sat, 10 Feb 2018 11:53:55 +0300

On 2/9/18 8:50 PM, Alan Mackenzie wrote:

>> Specifically, it's the open paren in the column 0 that triggers it.  You
>> can set `open-paren-in-column-0-is-defun-start' to nil to fix it.  Same
>> idea as Bug#25480 (that one is cc-mode).
> 
> Just to remind people, I fixed all this nonsense about parens in column
> 0 and `open-paren-in-column-0-is-defun-start' over a year ago.  Key
> search term: "comment-cache".

Note that this particular bug has been filed against Emacs 24.4, and I 
can't reproduce it on master.

> If my fix isn't going to be accepted, I think it's high time that
> somebody else stepped up to the plate and fixed this monstrosity once
> and for all.

Have you seen 14b95587520959c5b54356547a0a69932a9bb480?

AFAICT, open-paren-in-column-0-is-defun-start doesn't have much effect now.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sat, 10 Feb 2018 11:38:02 GMT)

Message #22 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 30393 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> IRO.UMontreal.CA>,
 Noam Postavsky <npostavs <at> users.sourceforge.net>
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sat, 10 Feb 2018 11:26:54 +0000

Hello, Dmitry.

On Sat, Feb 10, 2018 at 11:53:55 +0300, Dmitry Gutov wrote:
> On 2/9/18 8:50 PM, Alan Mackenzie wrote:

> >> Specifically, it's the open paren in the column 0 that triggers it.  You
> >> can set `open-paren-in-column-0-is-defun-start' to nil to fix it.  Same
> >> idea as Bug#25480 (that one is cc-mode).

[ .... ]

> > If my fix isn't going to be accepted, I think it's high time that
> > somebody else stepped up to the plate and fixed this monstrosity once
> > and for all.

> Have you seen 14b95587520959c5b54356547a0a69932a9bb480?

No, I hadn't.  Thanks, Stefan!

I'm not sure, but I think there's a danger of a recursive loop, should a
major mode use a hook in syntax-ppss to calculate syntax-table
properties, and that hook call forward-comment.

> AFAICT, open-paren-in-column-0-is-defun-start doesn't have much effect now.

That patch is incomplete, though.  I propose the following, to make it
more complete:



diff --git a/doc/emacs/programs.texi b/doc/emacs/programs.texi
index 4289124545..0fa37530ef 100644
--- a/doc/emacs/programs.texi
+++ b/doc/emacs/programs.texi
@@ -153,54 +153,35 @@ Left Margin Paren
 @cindex ( in leftmost column
   Many programming-language modes assume by default that any opening
 delimiter found at the left margin is the start of a top-level
-definition, or defun.  Therefore, @strong{don't put an opening
-delimiter at the left margin unless it should have that significance}.
-For instance, never put an open-parenthesis at the left margin in a
-Lisp file unless it is the start of a top-level list.
-
-  The convention speeds up many Emacs operations, which would
-otherwise have to scan back to the beginning of the buffer to analyze
-the syntax of the code.
-
-  If you don't follow this convention, not only will you have trouble
-when you explicitly use the commands for motion by defuns; other
-features that use them will also give you trouble.  This includes the
-indentation commands (@pxref{Program Indent}) and Font Lock mode
-(@pxref{Font Lock}).
-
-  The most likely problem case is when you want an opening delimiter
-at the start of a line inside a string.  To avoid trouble, put an
-escape character (@samp{\}, in C and Emacs Lisp, @samp{/} in some
-other Lisp dialects) before the opening delimiter.  This will not
-affect the contents of the string, but will prevent that opening
-delimiter from starting a defun.  Here's an example:
-
-@example
-  (insert "Foo:
-\(bar)
-")
-@end example
-
-  To help you catch violations of this convention, Font Lock mode
-highlights confusing opening delimiters (those that ought to be
-quoted) in bold red.
+definition, or defun.  Therefore, in these modes, don't put an opening
+delimiter at the left margin, except in a comment or string, unless it
+should have that significance.  For instance, never put an
+open-parenthesis at the left margin in a Lisp file unless it is the
+start of a top-level list.
+
+  In earlier versions of Emacs (through version 26.n), Emacs exploited
+this convention to speed up many low-level operations, which would
+otherwise have to scan back to the beginning of the buffer.
+Unfortunately, this caused confusion when an opening delimiter
+occurred at column zero inside a comment.  The resulting faulty
+analysis often caused wrong indentation or fontification.  The
+convention could be overridden by setting the user option
+@code{open-paren-in-column-0-is-defun-start} to @code{nil}, but this
+slowed Emacs down, particularly when editing large buffers.
+
+  To eliminate these problems, the low level functionality which used
+to test for opening delimiters at column 0 no longer does so.  Open
+delimiters may now be freely written at the left margin inside
+comments and strings without triggering these problems.
 
 @vindex open-paren-in-column-0-is-defun-start
-  If you need to override this convention, you can do so by setting
-the variable @code{open-paren-in-column-0-is-defun-start}.
-If this user option is set to @code{t} (the default), opening
-parentheses or braces at column zero always start defuns.  When it is
-@code{nil}, defuns are found by searching for parens or braces at the
-outermost level.
-
-  Usually, you should leave this option at its default value of
-@code{t}.  If your buffer contains parentheses or braces in column
-zero which don't start defuns, and it is somehow impractical to remove
-these parentheses or braces, it might be helpful to set the option to
-@code{nil}.  Be aware that this might make scrolling and display in
-large buffers quite sluggish.  Furthermore, the parentheses and braces
-must be correctly matched throughout the buffer for it to work
-properly.
+  If you want to override the convention, which is still used by some
+higher level commands, you can do so by setting the variable
+@code{open-paren-in-column-0-is-defun-start} to @code{nil}.  If this
+user option is set to @code{t} (the default), these commands will stop
+at opening parentheses or braces at column zero when seeking the start
+of defuns.  When it is @code{nil}, defuns are found by searching for
+parens or braces at the outermost level.
 
 @node Moving by Defuns
 @subsection Moving by Defuns


-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sat, 10 Feb 2018 12:10:02 GMT)

Message #25 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Mackenzie <acm <at> muc.de>
Cc: npostavs <at> users.sourceforge.net, 30393 <at> debbugs.gnu.org,
 monnier <at> IRO.UMontreal.CA, dgutov <at> yandex.ru
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sat, 10 Feb 2018 14:08:43 +0200

> Date: Sat, 10 Feb 2018 11:26:54 +0000
> From: Alan Mackenzie <acm <at> muc.de>
> Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>,
> 	Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 30393 <at> debbugs.gnu.org
> 
> +definition, or defun.  Therefore, in these modes, don't put an opening

Which "these modes" does this refer to?  How will the reader know when
to use this convention and when not?

> +  In earlier versions of Emacs (through version 26.n), Emacs exploited
> +this convention to speed up many low-level operations, which would
> +otherwise have to scan back to the beginning of the buffer.
> +Unfortunately, this caused confusion when an opening delimiter
> +occurred at column zero inside a comment.  The resulting faulty
> +analysis often caused wrong indentation or fontification.  The
> +convention could be overridden by setting the user option
> +@code{open-paren-in-column-0-is-defun-start} to @code{nil}, but this
> +slowed Emacs down, particularly when editing large buffers.
> +
> +  To eliminate these problems, the low level functionality which used
> +to test for opening delimiters at column 0 no longer does so.  Open
> +delimiters may now be freely written at the left margin inside
> +comments and strings without triggering these problems.

This text is not needed.  The original text, which you deleted,
described how to avoid a real problem; if that problem no longer
exists, we should just delete that text.  If that problem does exist
in some modes, we should leave that text as it was, with a better
description of what modes are still subject to these problems.

But describing something that is no longer done by Emacs is just waste
of paper.

Overall, I must say I'm confused regarding the purpose of this patch.
What does it try to accomplish?

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sat, 10 Feb 2018 15:00:02 GMT)

Message #28 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>, 30393 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sat, 10 Feb 2018 09:58:55 -0500

> I'm not sure, but I think there's a danger of a recursive loop,

Definitely.

> should a major mode use a hook in syntax-ppss to calculate syntax-table
> properties,

You mean when a major mode sets syntax-propertize-function, right?

> and that hook call forward-comment.

The principle I tried to follow to avoid inf-loop is that each
recursive-invocation of syntax-ppss should be on a strictly smaller
buffer position.

Another principle is that syntax-propertize moves
syntax-propertize--done before calling syntax-propertize-function, so
similarly each recursive invocation of syntax-propertize would have to
be a strictly greater buffer position.

So in a large buffer, this can recurse fairly deep, but it shouldn't be
able to recurse infinitely.

This said, in practice I haven't bumped into this problem yet, so
if/when it shows up, we'll see how it should be fixed.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sun, 11 Feb 2018 10:48:01 GMT)

Message #31 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>, 30393 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sun, 11 Feb 2018 10:36:10 +0000

Hello, Stefan.

On Sat, Feb 10, 2018 at 09:58:55 -0500, Stefan Monnier wrote:
> > I'm not sure, but I think there's a danger of a recursive loop,

> Definitely.

> > should a major mode use a hook in syntax-ppss to calculate syntax-table
> > properties,

> You mean when a major mode sets syntax-propertize-function, right?

> > and that hook call forward-comment.

> The principle I tried to follow to avoid inf-loop is that each
> recursive-invocation of syntax-ppss should be on a strictly smaller
> buffer position.

> Another principle is that syntax-propertize moves
> syntax-propertize--done before calling syntax-propertize-function, so
> similarly each recursive invocation of syntax-propertize would have to
> be a strictly greater buffer position.

> So in a large buffer, this can recurse fairly deep, but it shouldn't be
> able to recurse infinitely.

> This said, in practice I haven't bumped into this problem yet, so
> if/when it shows up, we'll see how it should be fixed.

OK, but I suspect in practice, this would be impossible to debug for
lack of reproducibility.

Another definite bug is that the syntax-ppss cache is not flushed when
the syntax-table is changed, whether with set-syntax-table or
modify-syntax-entry.

This is critical, now that primitives depend on this cache.

Would you please fix this, Stefan.

Thanks!

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sun, 11 Feb 2018 13:01:02 GMT)

Message #34 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: npostavs <at> users.sourceforge.net, 30393 <at> debbugs.gnu.org,
 monnier <at> IRO.UMontreal.CA, dgutov <at> yandex.ru
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sun, 11 Feb 2018 12:49:30 +0000

Hello, Eli.

On Sat, Feb 10, 2018 at 14:08:43 +0200, Eli Zaretskii wrote:
> > Date: Sat, 10 Feb 2018 11:26:54 +0000
> > From: Alan Mackenzie <acm <at> muc.de>
> > Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>,
> > 	Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 30393 <at> debbugs.gnu.org

> > +definition, or defun.  Therefore, in these modes, don't put an opening

> Which "these modes" does this refer to?  How will the reader know when
> to use this convention and when not?

Good point.  I suppose the answer is that there now aren't any such
modes.  Maybe this part of the section should be removed.

> > +  In earlier versions of Emacs (through version 26.n), Emacs exploited
> > +this convention to speed up many low-level operations, which would
> > +otherwise have to scan back to the beginning of the buffer.
> > +Unfortunately, this caused confusion when an opening delimiter
> > +occurred at column zero inside a comment.  The resulting faulty
> > +analysis often caused wrong indentation or fontification.  The
> > +convention could be overridden by setting the user option
> > +@code{open-paren-in-column-0-is-defun-start} to @code{nil}, but this
> > +slowed Emacs down, particularly when editing large buffers.
> > +
> > +  To eliminate these problems, the low level functionality which used
> > +to test for opening delimiters at column 0 no longer does so.  Open
> > +delimiters may now be freely written at the left margin inside
> > +comments and strings without triggering these problems.

> This text is not needed.  The original text, which you deleted,
> described how to avoid a real problem; if that problem no longer
> exists, we should just delete that text.  If that problem does exist
> in some modes, we should leave that text as it was, with a better
> description of what modes are still subject to these problems.

> But describing something that is no longer done by Emacs is just waste
> of paper.

Perhaps the proposed fix was somewhat prolix ("long winded").  But, in a
sense, we're providing a new feature, the ability to write syntactically
correct parens.  If we don't mention this, people won't notice.
Occasionally somebody will remember the previous restriction, try to
look it up in the manual, and end up puzzled.

How about a compromise, and replacing those two long paragraphs with a
simple sentence such as:

    From Emacs 27.1, you can write opening parens at column zero without
    problems.

> Overall, I must say I'm confused regarding the purpose of this patch.
> What does it try to accomplish?

To note that the documented previous restrictions on parens in column 0
no longer hold.

I suppose we really want to mark this part of the manual as obsolete,
but we've got no mechanism for doing this.  Besides,
open-paren-in-column-0-is-defun-start still has _some_ functionality.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sun, 11 Feb 2018 16:17:02 GMT)

Message #37 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Mackenzie <acm <at> muc.de>
Cc: npostavs <at> users.sourceforge.net, 30393 <at> debbugs.gnu.org,
 monnier <at> IRO.UMontreal.CA, dgutov <at> yandex.ru
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sun, 11 Feb 2018 18:16:00 +0200

> Date: Sun, 11 Feb 2018 12:49:30 +0000
> Cc: dgutov <at> yandex.ru, npostavs <at> users.sourceforge.net,
>   monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org
> From: Alan Mackenzie <acm <at> muc.de>
> 
> > This text is not needed.  The original text, which you deleted,
> > described how to avoid a real problem; if that problem no longer
> > exists, we should just delete that text.  If that problem does exist
> > in some modes, we should leave that text as it was, with a better
> > description of what modes are still subject to these problems.
> 
> > But describing something that is no longer done by Emacs is just waste
> > of paper.
> 
> Perhaps the proposed fix was somewhat prolix ("long winded").  But, in a
> sense, we're providing a new feature, the ability to write syntactically
> correct parens.  If we don't mention this, people won't notice.
> Occasionally somebody will remember the previous restriction, try to
> look it up in the manual, and end up puzzled.
> 
> How about a compromise, and replacing those two long paragraphs with a
> simple sentence such as:
> 
>     From Emacs 27.1, you can write opening parens at column zero without
>     problems.
> 
> > Overall, I must say I'm confused regarding the purpose of this patch.
> > What does it try to accomplish?
> 
> To note that the documented previous restrictions on parens in column 0
> no longer hold.

The right place for such stuff is in NEWS.

> I suppose we really want to mark this part of the manual as obsolete,
> but we've got no mechanism for doing this.  Besides,
> open-paren-in-column-0-is-defun-start still has _some_ functionality.

The variable should have some minimal description with a note that
using it nowadays is seldom needed.  That should be enough to drive
your point home, I think.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sun, 11 Feb 2018 22:54:02 GMT)

Message #40 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>, 30393 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sun, 11 Feb 2018 17:53:38 -0500

> OK, but I suspect in practice, this would be impossible to debug for
> lack of reproducibility.

Those problems can be hard to debug, indeed.  But an inf-loop should at
least be diagnosed as such fairly easily (even if its origin can be
difficult to track down), so I don't think "in practice I haven't bumped
into this problem yet" just because those problems stay undetected.

> Another definite bug is that the syntax-ppss cache is not flushed when
> the syntax-table is changed, whether with set-syntax-table or
> modify-syntax-entry.

That's right.  I haven't bumped into such issues yet, but here (contrary
to the above problem) it might very well be because the error
stays undetected.

> This is critical, now that primitives depend on this cache.

I can see two approaches to solve this problem:
- hook into set-syntax-table and modify-syntax-entry or something
  like that.  This will make it work right everywhere automatically, but
  I'm afraid it could turn out to be difficult to make it efficient
  (because of the cost of the tests needed to detect changes and more
  importantly because of excessive flushing of the syntax-ppss cache).
- provide new functions to let packages tell syntax-ppss about
  such things.  E.g. a macro `with-new-syntax-context` (which would
  be treated a bit like narrowing, maybe).  This would require changes
  to packages that suffer from this problem but should give
  better performance.

I'd prefer the second option, but at the same time, I'm not completely sure
what are the "typical" problem cases (which makes it hard to come up
with good new functions/macros) other than the case where we use
with-syntax-table (which is sometimes combined with a local narrowing)
but some of those only tweak the "word-vs-symbol-vs-punctuation"
settings so should ideally not flush the syntax-ppss cache.

Also I don't actually know whether the "fully automatic" approach would
actually turn out to be too expensive, it's just a gut feeling.

> Would you please fix this, Stefan.

It's fairly high up on my todo list, but I'm kinda swamped right now.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Mon, 12 Feb 2018 18:50:01 GMT)

Message #43 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>, 30393 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Mon, 12 Feb 2018 18:38:00 +0000

Hello, Stefan

On Sun, Feb 11, 2018 at 17:53:38 -0500, Stefan Monnier wrote:
> > OK, but I suspect in practice, this would be impossible to debug for
> > lack of reproducibility.

> Those problems can be hard to debug, indeed.  But an inf-loop should at
> least be diagnosed as such fairly easily (even if its origin can be
> difficult to track down), so I don't think "in practice I haven't bumped
> into this problem yet" just because those problems stay undetected.

> > Another definite bug is that the syntax-ppss cache is not flushed when
> > the syntax-table is changed, whether with set-syntax-table or
> > modify-syntax-entry.

> That's right.  I haven't bumped into such issues yet, but here (contrary
> to the above problem) it might very well be because the error
> stays undetected.

.... and of course, the need to flush the cache when a syntax-table text
property is applied or removed.

> > This is critical, now that primitives depend on this cache.

> I can see two approaches to solve this problem:
> - hook into set-syntax-table and modify-syntax-entry or something
>   like that.  This will make it work right everywhere automatically, but
>   I'm afraid it could turn out to be difficult to make it efficient
>   (because of the cost of the tests needed to detect changes and more
>   importantly because of excessive flushing of the syntax-ppss cache).
> - provide new functions to let packages tell syntax-ppss about
>   such things.  E.g. a macro `with-new-syntax-context` (which would
>   be treated a bit like narrowing, maybe).  This would require changes
>   to packages that suffer from this problem but should give
>   better performance.

> I'd prefer the second option, but at the same time, I'm not completely sure
> what are the "typical" problem cases (which makes it hard to come up
> with good new functions/macros) other than the case where we use
> with-syntax-table (which is sometimes combined with a local narrowing)
> but some of those only tweak the "word-vs-symbol-vs-punctuation"
> settings so should ideally not flush the syntax-ppss cache.

> Also I don't actually know whether the "fully automatic" approach would
> actually turn out to be too expensive, it's just a gut feeling.

> > Would you please fix this, Stefan.

> It's fairly high up on my todo list, but I'm kinda swamped right now.

It has occurred to me over the last day or two that I have already
solved these problems (basically, with your first approach, hooking into
set-syntax-table and friends) in the comment-cache branch, and that the
approach taken could be used more or less unchanged in the current
master.

For set-syntax-table, it compares the old and the new syntax tables to
see if they are "literally the same" (i.e. process strings and comments
identically) or "literally different", and only in the latter case does
it flush the cache.  These comparisons, which are expensive, are cached
inside the syntax-tables (in "extra slots").  Similarly, on
modify-syntax-entry, the cache is flushed iff the change affects
literals.

Similarly, on setting or deleting a syntax-table text property, the
cache is flushed from that point if the change affects literals.  This
happens regardless of the setting of inhibit-modification-hooks, etc.

It is a fact that the vast bulk of libraries which use syntax-ppss use
only elements 3, 4, and 8, i.e. the ones relevant to literals, and
ignore everything else.  For these the scheme outlined above is
rigorous.  I have timed it in the comment-cache branch, scanning through
.../src/xdisp.c displaying each screen, and found no difference to the
approach without comment-cache.
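
For illustration, a library that looks only at those three elements
might be shaped roughly like the sketch below.  The function names are
hypothetical and not taken from any existing package; only
`syntax-ppss' and the meaning of elements 3, 4 and 8 of the parse state
are real.

```elisp
;; Sketch only: the `my-' names are made up for this example.
(defun my-point-in-literal-p (&optional pos)
  "Return non-nil if POS (default point) is inside a string or comment."
  ;; `syntax-ppss' moves point when POS is given, hence `save-excursion'.
  (let ((ppss (save-excursion (syntax-ppss (or pos (point))))))
    (or (nth 3 ppss)      ; non-nil inside a string
        (nth 4 ppss))))   ; non-nil inside a comment

(defun my-literal-start (&optional pos)
  "Return the position where the literal containing POS starts, or nil."
  ;; Element 8 is the start of the enclosing string or comment, if any.
  (nth 8 (save-excursion (syntax-ppss (or pos (point))))))
```

Code of this shape never looks at paren depth or the other elements, so
a cache that is flushed only on changes affecting literals would still
give it correct answers.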

For those few libraries which do use the full capabilities of the
parsing state, we would need to flush the cache on all
set-syntax-table's and so on.  Maybe.  Maybe this would be too expensive
in run time.

So the interface I propose would be two abnormal hooks, one for
"literally important" changes to the syntax, and the other for other
changes to the syntax.  The hook functions would take an optional
argument which would be nil for changes to the syntax table, or a buffer
position where a syntax-table property is being changed.

Mostly, only the first of these hooks need be used, the standard
function on them being syntax-ppss's flush function.  Major modes could
add syntax-ppss's flush function to the second hook (possibly through
some nice interface), should they use the non-literal parts of the parse
state.
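
To make the proposed interface a little more concrete, the following C sketch models the two hooks as small arrays of function pointers.  Everything in it is hypothetical: the toy_* names, the use of -1 to stand in for a nil argument, and the single global "cache valid up to" position are assumptions for illustration, not an existing Emacs API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_POSITION (-1L)   /* Stands in for a nil argument.  */
#define TOY_MAX_HOOK 8

typedef void (*toy_hook_fn) (long pos);

struct toy_hook
{
  toy_hook_fn fns[TOY_MAX_HOOK];
  size_t n;
};

/* Run for syntax changes which affect strings or comments.  */
struct toy_hook toy_literal_change_hook;
/* Run for all other syntax changes.  */
struct toy_hook toy_other_change_hook;

/* Toy parse cache: positions below this value are considered valid.  */
long toy_cache_valid_to;

void
toy_add_hook (struct toy_hook *h, toy_hook_fn fn)
{
  assert (h->n < TOY_MAX_HOOK);
  h->fns[h->n++] = fn;
}

void
toy_run_hook (const struct toy_hook *h, long pos)
{
  for (size_t i = 0; i < h->n; i++)
    h->fns[i] (pos);
}

/* Flush function: a syntax-table change (no position) invalidates the
   whole cache; a property change invalidates it from POS onwards.  */
void
toy_flush_cache (long pos)
{
  long limit = (pos == TOY_NO_POSITION) ? 0 : pos;
  if (limit < toy_cache_valid_to)
    toy_cache_valid_to = limit;
}
```

In this model the flush function sits on the first hook by default, and a major mode which relies on the non-literal parts of the parse state would also add it to the second.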

One or two incidental changes would be needed, for example to fix the
infinite recursion in printing syntax-tables, caused by the mutual
presence of "literally the same/different" syntax tables in the extra
slots.  This would not be difficult.

Then, finally, if we can be bothered, we could put in a mechanism to
deal with changes in parse-sexp-lookup-properties and
parse-sexp-ignore-comments.

What do you think?

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Mon, 12 Feb 2018 20:46:01 GMT) Full text and rfc822 format available.

Message #46 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>, 30393 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Mon, 12 Feb 2018 15:45:40 -0500

> It has occurred to me over the last day or two that I have already
> solved these problems (basically, with your first approach, hooking into
> set-syntax-table and friends) in the comment-cache branch, and that the
> approach taken could be used more or less unchanged in the current
> master.

If you can reuse existing code, even better.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Wed, 14 Feb 2018 21:12:02 GMT) Full text and rfc822 format available.

Message #49 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: npostavs <at> users.sourceforge.net, 30393 <at> debbugs.gnu.org,
 monnier <at> IRO.UMontreal.CA, dgutov <at> yandex.ru
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Wed, 14 Feb 2018 21:00:22 +0000

Hello, Eli.

On Sun, Feb 11, 2018 at 18:16:00 +0200, Eli Zaretskii wrote:
> > Date: Sun, 11 Feb 2018 12:49:30 +0000
> > Cc: dgutov <at> yandex.ru, npostavs <at> users.sourceforge.net,
> >   monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org
> > From: Alan Mackenzie <acm <at> muc.de>

> > > This text is not needed.  The original text, which you deleted,
> > > described how to avoid a real problem; if that problem no longer
> > > exists, we should just delete that text.  If that problem does exist
> > > in some modes, we should leave that text as it was, with a better
> > > description of what modes are still subject to these problems.

> > > But describing something that is no longer done by Emacs is just waste
> > > of paper.

> > Perhaps the proposed fix was somewhat prolix ("long winded").  But, in a
> > sense, we're providing a new feature, the ability to write syntactically
> > correct parens.  If we don't mention this, people won't notice.
> > Occasionally somebody will remember the previous restriction, try to
> > look it up in the manual, and end up puzzled.

> > How about a compromise, and replacing those two long paragraphs with a
> > simple sentence such as:

> >     From Emacs 27.1, you can write opening parens at column zero without
> >     problems.

> > > Overall, I must say I'm confused regarding the purpose of this patch.
> > > What does it try to accomplish?

> > To note that the documented previous restrictions on parens in column 0
> > no longer hold.

> The right place for such stuff is in NEWS.

> > I suppose we really want to mark this part of the manual as obsolete,
> > but we've got no mechanism for doing this.  Besides,
> > open-paren-in-column-0-is-defun-start still has _some_ functionality.

> The variable should have some minimal description with a note that
> using it nowadays is seldom needed.  That should be enough to drive
> your point home, I think.

In accordance with that, then, I propose the following as the complete
emacs manual page "Left Margin Convention":

    26.2.1 Left Margin Convention
    -----------------------------

    Many programming-language modes have traditionally assumed that any
    opening delimiter found at the left margin is the start of a top-level
    definition, or defun.  So, by default, commands which seek the beginning
    of a defun accept such a delimiter as signifying that position.

       If you want to override this convention, you can do so by setting the
    user option `open-paren-in-column-0-is-defun-start' to `nil'.  If this
    option is set to `t' (the default), commands seeking the start of a
    defun will stop at opening parentheses or braces at column zero.  When
    it is `nil', defuns are found by searching for parens or braces at the
    outermost level.  Since low-level Emacs routines no longer depend on
    this convention, you usually won't need to change
    `open-paren-in-column-0-is-defun-start' from its default.

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Thu, 15 Feb 2018 17:40:02 GMT) Full text and rfc822 format available.

Message #52 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Mackenzie <acm <at> muc.de>
Cc: npostavs <at> users.sourceforge.net, 30393 <at> debbugs.gnu.org,
 monnier <at> IRO.UMontreal.CA, dgutov <at> yandex.ru
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Thu, 15 Feb 2018 19:39:06 +0200

> Date: Wed, 14 Feb 2018 21:00:22 +0000
> Cc: dgutov <at> yandex.ru, npostavs <at> users.sourceforge.net,
>   monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org
> From: Alan Mackenzie <acm <at> muc.de>
> 
>     26.2.1 Left Margin Convention
>     -----------------------------
> 
>     Many programming-language modes have traditionally assumed that any
>     opening delimiter found at the left margin is the start of a top-level
>     definition, or defun.  So, by default, commands which seek the beginning
>     of a defun accept such a delimiter as signifying that position.
> 
>        If you want to override this convention, you can do so by setting the
>     user option `open-paren-in-column-0-is-defun-start' to `nil'.  If this
>     option is set to `t' (the default), commands seeking the start of a
>     defun will stop at opening parentheses or braces at column zero.  When
>     it is `nil', defuns are found by searching for parens or braces at the
>     outermost level.  Since low-level Emacs routines no longer depend on
>     this convention, you usually won't need to change
>     `open-paren-in-column-0-is-defun-start' from its default.

This is fine by me, but please replace "delimiters" in the beginning
of the text with "opening parenthesis or brace", for clarity.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Fri, 16 Feb 2018 11:53:02 GMT) Full text and rfc822 format available.

Message #55 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Alan Mackenzie <acm <at> muc.de>, Eli Zaretskii <eliz <at> gnu.org>
Cc: npostavs <at> users.sourceforge.net, monnier <at> IRO.UMontreal.CA,
 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Fri, 16 Feb 2018 13:52:03 +0200

On 2/14/18 11:00 PM, Alan Mackenzie wrote:

>      ...If this
>      option is set to `t' (the default), commands seeking the start of a
>      defun will stop at opening parentheses or braces at column zero.

Is this still true actually? On master, in emacs-lisp-mode, with point after

(defun asdasd ()
  "
(")

C-M-a moves to before the defun, and not inside the docstring.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Fri, 16 Feb 2018 17:56:01 GMT) Full text and rfc822 format available.

Message #58 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, npostavs <at> users.sourceforge.net,
 monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Fri, 16 Feb 2018 17:43:31 +0000

Hello, Dmitry.

On Fri, Feb 16, 2018 at 13:52:03 +0200, Dmitry Gutov wrote:
> On 2/14/18 11:00 PM, Alan Mackenzie wrote:

> >      ...If this
> >      option is set to `t' (the default), commands seeking the start of a
> >      defun will stop at opening parentheses or braces at column zero.

> Is this still true actually? On master, in emacs-lisp-mode, with point after

> (defun asdasd ()
>    "
> (")


> C-M-a moves to before the defun, and not inside the docstring.

Yes, you're right.  How about .....

    ...., commands seeking the start of a defun will stop at opening
    parentheses or braces at column zero which aren't in a comment or
    string.

?  It's more accurate, if less elegant, than what has been committed.
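
The behaviour that wording describes can be sketched in a few lines of C.  This is only a toy model, assuming Lisp-like syntax where `"` delimits strings, `\` escapes inside them, and `;` starts a comment running to end of line; toy_beginning_of_defun is a made-up name and is not how Emacs implements C-M-a.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the last '(' at column zero before POINT which
   is neither in a string nor in a comment, or -1 if there is none.  */
long
toy_beginning_of_defun (const char *text, size_t point)
{
  assert (text != NULL);
  bool in_string = false, in_comment = false;
  long found = -1;
  size_t len = strlen (text);

  if (point > len)
    point = len;
  for (size_t i = 0; i < point; i++)
    {
      char c = text[i];
      bool at_col0 = (i == 0 || text[i - 1] == '\n');

      if (in_comment)
        {
          if (c == '\n')
            in_comment = false;
        }
      else if (in_string)
        {
          if (c == '\\' && i + 1 < point)
            i++;                /* Skip the escaped character.  */
          else if (c == '"')
            in_string = false;
        }
      else if (c == '"')
        in_string = true;
      else if (c == ';')
        in_comment = true;
      else if (c == '(' && at_col0)
        found = (long) i;
    }
  return found;
}
```

Applied to Dmitry's example with point at the end of the text, the paren at column zero inside the docstring is skipped and the result is offset 0, the start of the defun.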

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sat, 17 Feb 2018 02:17:02 GMT) Full text and rfc822 format available.

Message #61 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Eli Zaretskii <eliz <at> gnu.org>, npostavs <at> users.sourceforge.net,
 monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sat, 17 Feb 2018 04:16:31 +0200

On 2/16/18 7:43 PM, Alan Mackenzie wrote:

> Yes, you're right.  How about .....
> 
>      ...., commands seeking the start of a defun will stop at opening
>      parentheses or braces at column zero which aren't in a comment or
>      string.
> 
> ?  It's more accurate, if less elegant, than what has been committed.

I think it's better, thank you. Though more verbose, it addresses the 
issue that has been a problem for years.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sat, 17 Feb 2018 11:07:02 GMT) Full text and rfc822 format available.

Message #64 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, npostavs <at> users.sourceforge.net,
 monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sat, 17 Feb 2018 10:54:31 +0000

Hello, Dmitry.

On Sat, Feb 17, 2018 at 04:16:31 +0200, Dmitry Gutov wrote:
> On 2/16/18 7:43 PM, Alan Mackenzie wrote:

> > Yes, you're right.  How about .....

> >      ...., commands seeking the start of a defun will stop at opening
> >      parentheses or braces at column zero which aren't in a comment or
> >      string.

> > ?  It's more accurate, if less elegant, than what has been committed.

> I think it's better, thank you. Though more verbose, it addresses the 
> issue that has been a problem for years.

Not years.  Decades.  :-)

I've committed that fix.  Thanks for pointing it out.

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Mon, 05 Mar 2018 08:59:02 GMT) Full text and rfc822 format available.

Message #67 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>, 30393 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Mon, 5 Mar 2018 08:42:55 +0000

Hello, Stefan.

Sorry this has taken so long.  I've been preoccupied with things outside
of Emacs.

On Mon, Feb 12, 2018 at 15:45:40 -0500, Stefan Monnier wrote:
> > It has occurred to me over the last day or two that I have already
> > solved these problems (basically, with your first approach, hooking into
> > set-syntax-table and friends) in the comment-cache branch, and that the
> > approach taken could be used more or less unchanged in the current
> > master.

> If you can reuse existing code, even better.

The following patch ensures that (almost) any time the syntax table or a
syntax-table text property is changed in a way which affects literals,
the function syntax-ppss-flush-cache is called.

The known exceptions are as follows.  Setting any of the variables
parse-sexp-lookup-properties, syntax-propertize-function,
syntax-propertize-extend-region-functions, multibyte-syntax-as-symbol,
parse-sexp-ignore-comments, or comment-end-can-be-escaped after mode
initialisation is not detected; the major mode code would need to flush
the caches explicitly on such a change.  Also, changing the syntax-table
property of a symbol which is the value of a category text-property is
not detected.

In the following descriptions "literally the same" and "literally
different", when applied to syntax things, are shorthand for "will parse
literals (comments and strings) the same/differently".
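
For a single syntax entry, the test can be sketched as below, mirroring LITERAL_MASK and literally_different in the patch.  The class numbers are those listed in the Elisp manual's "Syntax Table Internals" page; the toy_* names are made up, and the sketch works on the bare integer code rather than on a Lisp cons.

```c
#include <assert.h>
#include <stdbool.h>

/* Syntax class codes relevant to strings and comments.  */
enum
{
  TOY_SSTRING = 7,
  TOY_SESCAPE = 9,
  TOY_SCHARQUOTE = 10,
  TOY_SCOMMENT = 11,
  TOY_SENDCOMMENT = 12,
  TOY_SCOMMENT_FENCE = 14,
  TOY_SSTRING_FENCE = 15
};

#define TOY_LITERAL_MASK ((1 << TOY_SSTRING)            \
                          | (1 << TOY_SESCAPE)          \
                          | (1 << TOY_SCHARQUOTE)       \
                          | (1 << TOY_SCOMMENT)         \
                          | (1 << TOY_SENDCOMMENT)      \
                          | (1 << TOY_SCOMMENT_FENCE)   \
                          | (1 << TOY_SSTRING_FENCE))

/* The flags `1', `2', `3' and `4' occupy bits 16 to 19.  */
#define TOY_COMMENT_FLAGS 0xF0000

/* True if the raw syntax code CODE can influence the parsing of
   strings or comments.  */
bool
toy_code_is_literal (int code)
{
  assert (code >= 0);
  int cls = code & 0xFFFF;
  if (cls > 15)
    return false;               /* Not a known syntax class.  */
  return (code & TOY_COMMENT_FLAGS) != 0
    || ((1 << cls) & TOY_LITERAL_MASK) != 0;
}

/* True if replacing OLD_CODE by NEW_CODE could change how strings or
   comments are parsed.  */
bool
toy_entries_literally_different (int old_code, int new_code)
{
  bool old_lit = toy_code_is_literal (old_code);
  bool new_lit = toy_code_is_literal (new_code);
  return old_lit != new_lit || (old_lit && old_code != new_code);
}
```

Under this test, changing a character from word to symbol syntax is "literally the same", whereas adding comment-starter flags to a punctuation character is "literally different".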

Existing C functions have been modified as follows:
  o - signal_after_change.
    * - Regardless of the settings of the change hooks,
      syntax-ppss-flush-cache will be called for an actual textual
      change.
  o - Fset_syntax_table.
    * - If the new table is literally different from the old,
      syntax-ppss-flush-cache will be called with an argument of -1.
  o - Fmodify_syntax_entry.
    * - If the new entry is literally different from the old one,
      syntax-ppss-flush-cache will be called with an argument of -1.
  o - init_syntax_once and syms_of_syntax.
    * - Administrative amendments.
  o - set_properties, add_properties, remove_properties.
    * - If a syntax-table property being set or removed, whether
      directly or via a category property, potentially alters the
      parsing of literals, syntax-ppss-flush-cache will be called.



diff --git a/src/chartab.c b/src/chartab.c
index 065ae4f9f2..e2b9e682cc 100644
--- a/src/chartab.c
+++ b/src/chartab.c
@@ -314,7 +314,6 @@ sub_char_table_ref_and_range (Lisp_Object table, int c, int *from, int *to,
   return val;
 }
 
-
 /* Return the value for C in char-table TABLE.  Shrink the range *FROM
    and *TO to cover characters (containing C) that have the same value
    as C.  It is not assured that the values of (*FROM - 1) and (*TO +
@@ -386,6 +385,60 @@ char_table_ref_and_range (Lisp_Object table, int c, int *from, int *to)
   return val;
 }
 
+/* Return the value for C in char-table TABLE.  Shrink the range
+   *FROM and *TO to cover characters (containing C) that have the same
+   value as C.  Should the value for C in TABLE be nil, consult the
+   parent table of TABLE, recursively if necessary.  It is not
+   guaranteed that the values of (*FROM - 1) and (*TO + 1) are
+   different from that of C.  */
+Lisp_Object
+char_table_ref_and_range_with_parents (Lisp_Object table, int c,
+                                       int *from, int *to)
+{
+  Lisp_Object val;
+  Lisp_Object parent, defalt;
+  struct Lisp_Char_Table *tbl;
+
+  if (*to < 0)
+    *to = MAX_CHAR;
+  if (ASCII_CHAR_P (c)
+      && *from <= c
+      && *to >= c)
+    {
+      tbl = XCHAR_TABLE (table);
+      parent = tbl->parent;     /* Added in to try to fix segfault.  2018-02-18. */
+      defalt = tbl->defalt;
+      val = NILP (tbl->ascii)
+        ? defalt /*Qnil*/
+        : sub_char_table_ref_and_range (tbl->ascii, c, from, to, defalt, false);
+      while (NILP (val) && !NILP (parent))
+        {
+          tbl = XCHAR_TABLE (parent);
+          parent = tbl->parent;
+          defalt = tbl->defalt;
+          val = NILP (tbl->ascii)
+            ? defalt /*Qnil*/
+            : sub_char_table_ref_and_range (tbl->ascii, c, from, to, defalt, false);
+        }
+      return val;
+    }
+  else if (!ASCII_CHAR_P (c))
+    {
+      val = char_table_ref_and_range (table, c, from, to);
+      tbl = XCHAR_TABLE (table);
+      while (NILP (val))
+        {
+          parent = tbl->parent;
+          if (NILP (parent))
+            break;
+          val = char_table_ref_and_range (parent, c, from, to);
+          tbl = XCHAR_TABLE (parent);
+        }
+      return val;
+    }
+  else
+    return Qnil;
+}
 
 static void
 sub_char_table_set (Lisp_Object table, int c, Lisp_Object val, bool is_uniprop)
diff --git a/src/insdel.c b/src/insdel.c
index 02e3f41bc9..4016ceb845 100644
--- a/src/insdel.c
+++ b/src/insdel.c
@@ -2170,6 +2170,12 @@ signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins)
   ptrdiff_t count = SPECPDL_INDEX ();
   struct rvoe_arg rvoe_arg;
 
+  /* Ensure we invalidate the syntax cache on an actual text change
+     regardless of the settings of inhibit-modification-hooks and
+     after-change-functions. */
+  if ((lenins != lendel)
+      && Ffboundp (Qsyntax_ppss_flush_cache)) /* For bootstrapping. */
+    call1 (Qsyntax_ppss_flush_cache, make_number (charpos));
   if (inhibit_modification_hooks)
     return;
 
diff --git a/src/lisp.h b/src/lisp.h
index a7f0a1d78f..e4f76f4561 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -3851,6 +3851,8 @@ extern void r_alloc_inhibit_buffer_relocation (int);
 extern Lisp_Object copy_char_table (Lisp_Object);
 extern Lisp_Object char_table_ref_and_range (Lisp_Object, int,
                                              int *, int *);
+extern Lisp_Object char_table_ref_and_range_with_parents (Lisp_Object, int,
+                                                          int*, int*);
 extern void char_table_set_range (Lisp_Object, int, int, Lisp_Object);
 extern void map_char_table (void (*) (Lisp_Object, Lisp_Object,
                             Lisp_Object),
diff --git a/src/syntax.c b/src/syntax.c
index 52cec23cd7..d81a57cd22 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -178,8 +178,12 @@ static ptrdiff_t find_start_begv;
 static EMACS_INT find_start_modiff;
 
 
+static void check_syntax_table (Lisp_Object);
 static Lisp_Object skip_chars (bool, Lisp_Object, Lisp_Object, bool);
 static Lisp_Object skip_syntaxes (bool, Lisp_Object, Lisp_Object);
+static bool syntax_table_value_range_is_interesting_for_literals (Lisp_Object,
+                                                                  int, int);
+static void break_off_syntax_tables_literal_relations (Lisp_Object);
 static Lisp_Object scan_lists (EMACS_INT, EMACS_INT, EMACS_INT, bool);
 static void scan_sexps_forward (struct lisp_parse_state *,
                                 ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT,
@@ -687,6 +691,112 @@ prev_char_comend_first (ptrdiff_t pos, ptrdiff_t pos_byte)
   return val;
 }
 
+/* Empty the syntax-ppss cache of every buffer whose syntax table is
+   currently set to TABLE. */
+static void
+empty_syntax_tables_buffers_syntax_caches (Lisp_Object table)
+{
+  Lisp_Object buf, buf_list;
+  struct buffer *current = current_buffer;
+  struct buffer *b;
+
+  buf_list = Fbuffer_list (Qnil);
+  while (!NILP (buf_list))
+    {
+      buf = XCAR (buf_list);
+      b = XBUFFER (buf);
+      if (EQ (BVAR (b, syntax_table), table))
+        {
+          set_buffer_internal_1 (b);
+          call1 (Qsyntax_ppss_flush_cache, make_number (-1));
+        }
+      buf_list = XCDR (buf_list);
+    }
+  set_buffer_internal_1 (current);
+}
+
+#define LITERAL_MASK ((1 << Sstring)            \
+                      | (1 << Sescape)          \
+                      | (1 << Scharquote)       \
+                      | (1 << Scomment)         \
+                      | (1 << Sendcomment)      \
+                      | (1 << Scomment_fence)   \
+                      | (1 << Sstring_fence))
+
+/* The following returns true if ELT (which will be a raw syntax
+   descriptor (see page "Syntax Table Internals" in the Elisp manual)
+   or nil) represents a syntax which is (potentially) relevant to
+   strings or comments.  */
+static bool
+SYNTAB_LITERAL (Lisp_Object elt)
+{
+  int ielt;
+  if (!CONSP (elt))
+    return false;
+  ielt = XINT (XCAR (elt));
+  return (ielt & 0xF0000)       /* a comment flag is set */
+    || ((1 << (ielt & 0xFF)) & LITERAL_MASK); /* One of Sstring, .... */
+}
+
+static
+bool syntax_table_value_is_interesting_for_literals (Lisp_Object val)
+{
+  if (!CONSP (val)
+      || !INTEGERP (XCAR (val)))
+    return false;
+  return SYNTAB_LITERAL (val);
+}
+
+/* The text property PROP is having its value VAL at position POS in
+   buffer BUFFER either set or cleared.  If this value is relevant to
+   the syntax of literals, reduce BUFFER's "syntax cache position" to POS.  */
+void
+check_syntax_cache_for_prop (ptrdiff_t pos, Lisp_Object prop,
+                             Lisp_Object val, Lisp_Object buffer)
+{
+  struct buffer *b;
+  struct buffer *current = current_buffer;
+  Lisp_Object plist;
+
+  if (!BUFFERP (buffer))
+    return;
+  b = XBUFFER (buffer);
+  set_buffer_internal_1 (b);
+  if (pos >= syntax_propertize__done)
+    {
+      set_buffer_internal_1 (current);
+      return;
+    }
+
+  if (EQ (prop, Qcategory)
+      && SYMBOLP (val))
+    {
+      plist = Fsymbol_plist (val);
+      while (CONSP (plist))
+        {
+          prop = XCAR (plist);
+          plist = XCDR (plist);
+          if (!CONSP (plist))
+            {
+              set_buffer_internal_1 (current);
+              return;
+            }
+          val = XCAR (plist);
+          if (EQ (prop, Qsyntax_table))
+            break;
+          plist = XCDR (plist);
+        }
+    }
+  if (EQ (prop, Qsyntax_table)
+      && syntax_table_value_is_interesting_for_literals (val))
+    call1 (Qsyntax_ppss_flush_cache, make_number (pos));
+  set_buffer_internal_1 (current);
+}
+
+/*****************************************************************************
+ *****************************************************************************/
+
+
 /* Check whether charpos FROM is at the end of a comment.
    FROM_BYTE is the bytepos corresponding to FROM.
    Do not move back before STOP.
@@ -989,6 +1099,222 @@ back_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
   return from != comment_end;
 }
 
+/* If the two syntax entries OLD_SYN and NEW_SYN would parse strings
+   or comments differently, return true, otherwise return false.  */
+static bool
+literally_different (Lisp_Object old_syn, Lisp_Object new_syn)
+{
+  bool old_literality = SYNTAB_LITERAL (old_syn),
+    new_literality = SYNTAB_LITERAL (new_syn);
+  return (old_literality != new_literality)
+    || (old_literality
+        && (!EQ (Fcar (old_syn), Fcar (new_syn))));
+}
+
+/* If there is a character in the range [START, END) for whose
+   syntaxes in syntax tables OLD and NEW strings or comments might be
+   parsed differently, return the lowest character for which this
+   holds.  Otherwise, return -1.  */
+static int
+syntax_table_ranges_differ_literally_p (Lisp_Object old, Lisp_Object new,
+                                              int start, int end)
+{
+  int old_from, new_from, old_to, new_to;
+  Lisp_Object old_syn = Qnil, new_syn = Qnil; /* Initialise to avoid compiler warnings. */
+
+  new_from = old_from = start;
+  new_to = old_to = -1;
+
+  while ((old_from < end) && (new_from < end))
+    {
+      if (old_from == new_from)
+        {
+          old_syn = char_table_ref_and_range_with_parents (old, old_from,
+                                                           &old_from, &old_to);
+          new_syn = char_table_ref_and_range_with_parents (new, new_from,
+                                                           &new_from, &new_to);
+          if (literally_different (old_syn, new_syn))
+            return old_from;
+          old_from = old_to + 1;
+          new_from = new_to + 1;
+          old_to = -1;
+          new_to = -1;
+        }
+      else if (old_from < new_from)
+        {
+          old_syn = char_table_ref_and_range_with_parents (old, old_from,
+                                                           &old_from, &old_to);
+          if (literally_different (old_syn, new_syn))
+            return old_from;
+          old_from = old_to + 1;
+          old_to = -1;
+        }
+      else
+        {
+          new_syn = char_table_ref_and_range_with_parents (new, new_from,
+                                                           &new_from, &new_to);
+          if (literally_different (old_syn, new_syn))
+            return new_from;
+          new_from = new_to + 1;
+          new_to = -1;
+        }
+    }
+  return -1;
+}
+
+DEFUN ("least-literal-difference-between-syntax-tables",
+       Fleast_literal_difference_between_syntax_tables,
+       Sleast_literal_difference_between_syntax_tables,
+       2, 2, 0,
+       doc: /* Lowest char whose different syntaxes in OLD and NEW parse literals differently.
+               OLD and NEW are syntax tables.  */)
+       (Lisp_Object old, Lisp_Object new)
+{
+  int c;
+
+  check_syntax_table (old);
+  check_syntax_table (new);
+  c = syntax_table_ranges_differ_literally_p (old, new, 0, MAX_CHAR + 1);
+  if (c >= 0)
+    return make_number (c);
+  return Qnil;
+}
+
+/* The next two variables are alists, the key being a syntax table,
+   and the value being a non-empty list of syntax tables.  When a
+   syntax table B is known to parse strings and comments the same as
+   syntax table A, it will be a member of the value whose key is A in
+   `literally_the_same_sts' (and vice versa).  Similarly, when two
+   syntax tables are known to parse strings and comments differently,
+   there will be entries in `literally_different_sts'. */
+static Lisp_Object literally_the_same_sts, literally_different_sts;
+
+DEFUN ("syntax-tables-literally-different-p",
+       Fsyntax_tables_literally_different_p,
+       Ssyntax_tables_literally_different_p,
+       2, 2, 0,
+       doc: /* Will syntax tables OLD and NEW parse literals differently?
+Return t when OLD and NEW might parse comments and strings differently,
+otherwise nil.  (Use `least-literal-difference-between-syntax-tables'
+to locate a character position where the tables differ.)  */)
+     (Lisp_Object old, Lisp_Object new)
+{
+  Lisp_Object elt;
+
+  check_syntax_table (old);
+  check_syntax_table (new);
+  /* Check to see if there is a cached relationship between the tables. */
+  if (!NILP (Fmemq (new, /*XCHAR_TABLE (old)->extras[0])*/
+                    Fassq (old, literally_the_same_sts))))
+    return Qnil;
+  if (!NILP (Fmemq (new, /*XCHAR_TABLE (old)->extras[1])*/
+                    Fassq (old, literally_different_sts))))
+    return Qt;
+  /* The two tables have no known relationship, so we'll have
+     laboriously to compare them. */
+  if (syntax_table_ranges_differ_literally_p (old, new, 0, MAX_CHAR + 1) >= 0)
+    {
+      /* Mark the "literally different" relationship between the OLD and
+         NEW syntax tables. */
+      elt = Fassq (new, literally_different_sts);
+      if (NILP (elt))
+        literally_different_sts = Fcons (Fcons (new, Fcons (old, Qnil)),
+                                         literally_different_sts);
+      else
+        Fsetcdr (elt, Fcons (old, XCDR (elt)));
+      elt = Fassq (old, literally_different_sts);
+      if (NILP (elt))
+        literally_different_sts = Fcons (Fcons (old, Fcons (new, Qnil)),
+                                         literally_different_sts);
+      else
+        Fsetcdr (elt, Fcons (new, XCDR (elt)));
+      return Qt;
+    }
+  else
+    {
+      /* Mark the "not literally different" relationship between the OLD
+         and NEW syntax tables. */
+      elt = Fassq (new, literally_the_same_sts);
+      if (NILP (elt))
+        literally_the_same_sts = Fcons (Fcons (new, Fcons (old, Qnil)),
+                                        literally_the_same_sts);
+      else
+        Fsetcdr (elt, Fcons (old, XCDR (elt)));
+      elt = Fassq (old, literally_the_same_sts);
+      if (NILP (elt))
+        literally_the_same_sts = Fcons (Fcons (old, Fcons (new, Qnil)),
+                                        literally_the_same_sts);
+      else
+        Fsetcdr (elt, Fcons (new, XCDR (elt)));
+      return Qnil;
+    }
+}
+
+/* If any character in the range [START, END) has an entry in syntax
+   table TABLE which is relevant to literal parsing, return true,
+   else return false. */
+static bool
+syntax_table_value_range_is_interesting_for_literals (Lisp_Object table,
+                                                      int start, int end)
+{
+  int from, to;
+  Lisp_Object syn;
+
+  from = start;
+  to = end;
+  while (from < to)
+    {
+      syn = char_table_ref_and_range_with_parents (table, from, &from, &to);
+      if (SYNTAB_LITERAL (syn))
+        return true;
+      from = to + 1;
+      to = end;
+    }
+  return false;
+}
+
+static void
+break_off_syntax_tables_literal_relations (Lisp_Object table)
+{
+  Lisp_Object remotes_elt;
+  Lisp_Object remotes, keep_remotes;
+  Lisp_Object rem, elt;
+
+  remotes_elt = Fassq (table, literally_the_same_sts);
+  remotes = Fcdr (remotes_elt);
+  keep_remotes = remotes;
+  while (!NILP (remotes))
+    {
+      rem = Fcar (remotes);
+      elt = Fassq (rem, literally_the_same_sts);
+      Fsetcdr (elt, Fdelq (table, Fcdr (elt)));
+      if (NILP (Fcdr (elt)))
+        literally_the_same_sts = Fdelq (elt, literally_the_same_sts);
+      remotes = Fcdr (remotes);
+    }
+  if (!NILP (keep_remotes))
+    literally_the_same_sts = Fdelq (remotes_elt, literally_the_same_sts);
+
+  remotes_elt = Fassq (table, literally_different_sts);
+  remotes = Fcdr (remotes_elt);
+  keep_remotes = remotes;
+  while (!NILP (remotes))
+    {
+      rem = Fcar (remotes);
+      elt = Fassq (rem, literally_different_sts);
+      Fsetcdr (elt, Fdelq (table, Fcdr (elt)));
+      if (NILP (Fcdr (elt)))
+        literally_different_sts = Fdelq (elt, literally_different_sts);
+      remotes = Fcdr (remotes);
+    }
+  if (!NILP (keep_remotes))
+    literally_different_sts = Fdelq (remotes_elt, literally_different_sts);
+}
+
+
+/*****************************************************************************
+ ****************************************************************************/
+
 DEFUN ("syntax-table-p", Fsyntax_table_p, Ssyntax_table_p, 1, 1, 0,
        doc: /* Return t if OBJECT is a syntax table.
 Currently, any char-table counts as a syntax table.  */)
@@ -1057,6 +1383,14 @@ One argument, a syntax table.  */)
 {
   int idx;
   check_syntax_table (table);
+  /* Optimise away the case when we're not changing the table. */
+  if (EQ (BVAR (current_buffer, syntax_table), table))
+    return table;
+  if (!NILP (Fsyntax_table_p (BVAR (current_buffer, syntax_table)))
+      && !NILP (Fsyntax_tables_literally_different_p
+                (BVAR (current_buffer, syntax_table), table))
+      && Ffboundp (Qsyntax_ppss_flush_cache)) /* for bootstrapping. */
+    call1 (Qsyntax_ppss_flush_cache, make_number (-1));
   bset_syntax_table (current_buffer, table);
   /* Indicate that this buffer now has a specified syntax table.  */
   idx = PER_BUFFER_VAR_IDX (syntax_table);
@@ -1274,6 +1608,16 @@ usage: (modify-syntax-entry CHAR NEWENTRY &optional SYNTAX-TABLE)  */)
     check_syntax_table (syntax_table);
 
   newentry = Fstring_to_syntax (newentry);
+  if (SYNTAB_LITERAL (newentry)
+      || (CONSP (c)
+          ? syntax_table_value_range_is_interesting_for_literals
+          (syntax_table, XINT (XCAR (c)), XINT (XCDR (c)))
+          : (SYNTAB_LITERAL (Faref (syntax_table, c)))))
+    {
+      empty_syntax_tables_buffers_syntax_caches (syntax_table);
+      break_off_syntax_tables_literal_relations (syntax_table);
+    }
+
   if (CONSP (c))
     SET_RAW_SYNTAX_ENTRY_RANGE (syntax_table, c, newentry);
   else
@@ -3637,6 +3981,10 @@ init_syntax_once (void)
   /* This has to be done here, before we call Fmake_char_table.  */
   DEFSYM (Qsyntax_table, "syntax-table");
 
+  /* We do not yet have any knowledge of how syntax tables parse literals. */
+  literally_the_same_sts = Qnil;
+  literally_different_sts = Qnil;
+
   /* Create objects which can be shared among syntax tables.  */
   Vsyntax_code_object = make_uninit_vector (Smax);
   for (i = 0; i < Smax; i++)
@@ -3728,6 +4076,9 @@ syms_of_syntax (void)
   staticpro (&gl_state.current_syntax_table);
   staticpro (&gl_state.old_prop);
 
+  staticpro (&literally_the_same_sts);
+  staticpro (&literally_different_sts);
+
   /* Defined in regex.c.  */
   staticpro (&re_match_object);
 
@@ -3752,6 +4103,8 @@ See the info node `(elisp)Syntax Properties' for a description of the
   DEFSYM (Qinternal__syntax_propertize, "internal--syntax-propertize");
   Fmake_variable_buffer_local (intern ("syntax-propertize--done"));
 
+  DEFSYM (Qsyntax_ppss_flush_cache, "syntax-ppss-flush-cache");
+
   words_include_escapes = 0;
   DEFVAR_BOOL ("words-include-escapes", words_include_escapes,
 	       doc: /* Non-nil means `forward-word', etc., should treat escape chars part of words.  */);
@@ -3790,6 +4143,8 @@ In both cases, LIMIT bounds the search. */);
   DEFSYM (Qcomment_end_can_be_escaped, "comment-end-can-be-escaped");
   Fmake_variable_buffer_local (Qcomment_end_can_be_escaped);
 
+  defsubr (&Sleast_literal_difference_between_syntax_tables);
+  defsubr (&Ssyntax_tables_literally_different_p);
   defsubr (&Ssyntax_table_p);
   defsubr (&Ssyntax_table);
   defsubr (&Sstandard_syntax_table);
diff --git a/src/syntax.h b/src/syntax.h
index 2171cbbba4..1771ae4728 100644
--- a/src/syntax.h
+++ b/src/syntax.h
@@ -28,6 +28,10 @@ INLINE_HEADER_BEGIN
 
 extern void update_syntax_table (ptrdiff_t, EMACS_INT, bool, Lisp_Object);
 extern void update_syntax_table_forward (ptrdiff_t, bool, Lisp_Object);
+extern void check_syntax_cache_for_prop (ptrdiff_t, Lisp_Object,
+                                         Lisp_Object, Lisp_Object);
+
+
 
 /* The standard syntax table is stored where it will automatically
    be used in all new buffers.  */
diff --git a/src/textprop.c b/src/textprop.c
index 984f2e6640..cd56d8b12c 100644
--- a/src/textprop.c
+++ b/src/textprop.c
@@ -21,6 +21,7 @@ along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.  */
 
 #include "lisp.h"
 #include "intervals.h"
+#include "syntax.h"
 #include "buffer.h"
 #include "window.h"
 
@@ -340,6 +341,12 @@ set_properties (Lisp_Object properties, INTERVAL interval, Lisp_Object object)
 	    record_property_change (interval->position, LENGTH (interval),
 				    XCAR (sym), XCAR (value),
 				    object);
+            check_syntax_cache_for_prop
+              (interval->position, XCAR (sym), XCAR (value), object);
+            if (!EQ (property_value (properties, XCAR (sym)), Qunbound))
+              check_syntax_cache_for_prop
+                (interval->position, XCAR (sym),
+                 property_value (properties, XCAR (sym)), object);
 	  }
 
       /* For each new property that has no value at all in the old plist,
@@ -352,6 +359,8 @@ set_properties (Lisp_Object properties, INTERVAL interval, Lisp_Object object)
 	    record_property_change (interval->position, LENGTH (interval),
 				    XCAR (sym), Qnil,
 				    object);
+            check_syntax_cache_for_prop
+                (interval->position, XCAR (sym), XCAR (value), object);
 	  }
     }
 
@@ -406,6 +415,10 @@ add_properties (Lisp_Object plist, INTERVAL i, Lisp_Object object,
 	      {
 		record_property_change (i->position, LENGTH (i),
 					sym1, Fcar (this_cdr), object);
+                check_syntax_cache_for_prop
+                    (i->position, sym1, Fcar (this_cdr), object);
+                check_syntax_cache_for_prop
+                    (i->position, sym1, val1, object);
 	      }
 
 	    /* I's property has a different value -- change it */
@@ -442,6 +455,8 @@ add_properties (Lisp_Object plist, INTERVAL i, Lisp_Object object,
 	    {
 	      record_property_change (i->position, LENGTH (i),
 				      sym1, Qnil, object);
+              check_syntax_cache_for_prop
+                (i->position, sym1, val1, object);
 	    }
 	  set_interval_plist (i, Fcons (sym1, Fcons (val1, i->plist)));
 	  changed = true;
@@ -475,11 +490,14 @@ remove_properties (Lisp_Object plist, Lisp_Object list, INTERVAL i, Lisp_Object
       /* First, remove the symbol if it's at the head of the list */
       while (CONSP (current_plist) && EQ (sym, XCAR (current_plist)))
 	{
-	  if (BUFFERP (object))
-	    record_property_change (i->position, LENGTH (i),
-				    sym, XCAR (XCDR (current_plist)),
-				    object);
-
+          if (BUFFERP (object))
+            {
+              record_property_change (i->position, LENGTH (i),
+                                      sym, XCAR (XCDR (current_plist)),
+                                      object);
+              check_syntax_cache_for_prop
+                (i->position, sym, XCAR (XCDR (current_plist)), object);
+            }
 	  current_plist = XCDR (XCDR (current_plist));
 	  changed = true;
 	}
@@ -492,8 +510,12 @@ remove_properties (Lisp_Object plist, Lisp_Object list, INTERVAL i, Lisp_Object
 	  if (CONSP (this) && EQ (sym, XCAR (this)))
 	    {
 	      if (BUFFERP (object))
-		record_property_change (i->position, LENGTH (i),
-					sym, XCAR (XCDR (this)), object);
+                {
+                  record_property_change (i->position, LENGTH (i),
+                                          sym, XCAR (XCDR (this)), object);
+                  check_syntax_cache_for_prop
+                    (i->position, sym, XCAR (XCDR (this)), object);
+                }
 
 	      Fsetcdr (XCDR (tail2), XCDR (XCDR (this)));
 	      changed = true;


>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Mon, 05 Mar 2018 16:16:02 GMT) Full text and rfc822 format available.

Message #70 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Mackenzie <acm <at> muc.de>
Cc: npostavs <at> users.sourceforge.net, dgutov <at> yandex.ru, monnier <at> IRO.UMontreal.CA,
 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Mon, 05 Mar 2018 18:14:51 +0200

> Date: Mon, 5 Mar 2018 08:42:55 +0000
> From: Alan Mackenzie <acm <at> muc.de>
> Cc: 30393 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
> 	Noam Postavsky <npostavs <at> users.sourceforge.net>
> 
> Existing C functions have been modified as follows:
>   o - signal_after_change.
>     * - Regardless of the settings of the change hooks,
>       syntax-ppss-flush-cache will be called for an actual textual
>       change.
>   o - Fset_syntax_table.
>     * - If the new table is literally different from the old,
>       syntax-ppss-flush-cache will be called with an argument of -1.
>   o - Fmodify_syntax_entry.
>     * - If the new entry is literally different from the old one,
>       syntax-ppss-flush-cache will be called with an argument of -1.
>   o - init_syntax_once and syms_of_syntax.
>     * - Administrative amendments.
>   o - set_properties, add_properties, remove_properties.  If a
>     syntax-table property set or removed, whether directly or via a
>     category property, potentially alters the parsing of literals,
>     syntax-ppss-flush-cache will be called.

Any reason why you introduce 2 new primitives that no Lisp code uses?

In any case, this needs documentation changes if and when it's agreed
upon.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Tue, 06 Mar 2018 18:25:02 GMT) Full text and rfc822 format available.

Message #73 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: npostavs <at> users.sourceforge.net, dgutov <at> yandex.ru, monnier <at> IRO.UMontreal.CA,
 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Tue, 6 Mar 2018 18:09:17 +0000

Hello, Eli.

On Mon, Mar 05, 2018 at 18:14:51 +0200, Eli Zaretskii wrote:
> > Date: Mon, 5 Mar 2018 08:42:55 +0000
> > From: Alan Mackenzie <acm <at> muc.de>
> > Cc: 30393 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
> > 	Noam Postavsky <npostavs <at> users.sourceforge.net>

> > Existing C functions have been modified as follows:
> >   o - signal_after_change.
> >     * - Regardless of the settings of the change hooks,
> >       syntax-ppss-flush-cache will be called for an actual textual
> >       change.
> >   o - Fset_syntax_table.
> >     * - If the new table is literally different from the old,
> >       syntax-ppss-flush-cache will be called with an argument of -1.
> >   o - Fmodify_syntax_entry.
> >     * - If the new entry is literally different from the old one,
> >       syntax-ppss-flush-cache will be called with an argument of -1.
> >   o - init_syntax_once and syms_of_syntax.
> >     * - Administrative amendments.
> >   o - set_properties, add_properties, remove_properties.  If a
> >     syntax-table property set or removed, whether directly or via a
> >     category property, potentially alters the parsing of literals,
> >     syntax-ppss-flush-cache will be called.

> Any reason why you introduce 2 new primitives that no Lisp code uses?

least-literal-difference-between-syntax-tables and
syntax-tables-literally-different-p?  They're for helping with
debugging.  Syntax tables, like char tables in general, are awkward
unwieldy beasts.  Sooner or later, somebody debugging is going to want
to compare two syntax tables which aren't behaving as she expects they
should.  Those primitives (which, yes, will need documenting) were cheap
and easy to write, but would be awkward and unwieldy to write as
one-offs in Lisp.
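
To illustrate the idea of two syntax tables being "literally different", here is a minimal sketch in plain C. It is a toy model, not the code from the patch: the `toy_` names, the eight syntax classes, and the 128-entry table are assumptions made for the example, and the real primitives work on Emacs char-tables. It assumes that the "least" difference means the lowest character code whose class differs in a way that could affect strings or comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy syntax classes; illustrative only, not Emacs's enum syntaxcode.  */
enum toy_class { TOY_WORD, TOY_PUNCT, TOY_OPEN, TOY_CLOSE,
                 TOY_STRING, TOY_COMMENT_START, TOY_COMMENT_END, TOY_ESCAPE };

#define TOY_TABLE_SIZE 128

/* True if class C can influence where strings or comments begin or end.  */
static bool
toy_class_is_literal (enum toy_class c)
{
  return c == TOY_STRING || c == TOY_COMMENT_START
         || c == TOY_COMMENT_END || c == TOY_ESCAPE;
}

/* Return the lowest character code at which tables A and B disagree in a
   way that matters for literals, or -1 if they parse literals alike.  */
static int
toy_least_literal_difference (const enum toy_class *a, const enum toy_class *b)
{
  assert (a != NULL && b != NULL);
  for (int ch = 0; ch < TOY_TABLE_SIZE; ch++)
    if (a[ch] != b[ch]
        && (toy_class_is_literal (a[ch]) || toy_class_is_literal (b[ch])))
      return ch;
  return -1;
}
```

In this model, two tables that differ only in, say, which characters are parentheses compare as the same, so swapping between them would not need a cache flush.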

> In any case, this needs documentation changes if and when it's agreed
> upon.

Yes, but less documentation than would be needed without it.
Introducing the syntax-ppss mechanism into syntax primitives broke them,
since its cache invalidation is imperfect.  With my patch, aside from
any bugs in it, those primitives are less broken, hence less
documentation of the breakage is needed.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sun, 08 Apr 2018 10:55:01 GMT) Full text and rfc822 format available.

Message #76 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: npostavs <at> users.sourceforge.net, dgutov <at> yandex.ru, monnier <at> IRO.UMontreal.CA,
 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Sun, 8 Apr 2018 10:52:57 +0000

Hello, Eli.

On Mon, Mar 05, 2018 at 18:14:51 +0200, Eli Zaretskii wrote:
> > Date: Mon, 5 Mar 2018 08:42:55 +0000
> > From: Alan Mackenzie <acm <at> muc.de>
> > Cc: 30393 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
> > 	Noam Postavsky <npostavs <at> users.sourceforge.net>

> > Existing C functions have been modified as follows:
> >   o - signal_after_change.
> >     * - Regardless of the settings of the change hooks,
> >       syntax-ppss-flush-cache will be called for an actual textual
> >       change.
> >   o - Fset_syntax_table.
> >     * - If the new table is literally different from the old,
> >       syntax-ppss-flush-cache will be called with an argument of -1.
> >   o - Fmodify_syntax_entry.
> >     * - If the new entry is literally different from the old one,
> >       syntax-ppss-flush-cache will be called with an argument of -1.
> >   o - init_syntax_once and syms_of_syntax.
> >     * - Administrative amendments.
> >   o - set_properties, add_properties, remove_properties.  If a
> >     syntax-table property set or removed, whether directly or via a
> >     category property, potentially alters the parsing of literals,
> >     syntax-ppss-flush-cache will be called.

> Any reason why you introduce 2 new primitives that no Lisp code uses?

[ Already answered ].

> In any case, this needs documentation changes if and when it's agreed
> upon.

Can we start moving forward with this change, please?

Just a quick summary of what it's about:  syntax-ppss is now being used
in back_comment.  syntax-ppss's cache, at the moment, isn't being
invalidated when the syntax table is swapped, or an entry in it is
changed, or when a syntax-table text property is applied to a buffer
element.  The patch fixes this, as far as back_comment is concerned.
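
The invalidation rule can be pictured with a small model in plain C. This is only a sketch under assumptions: the `toy_` names and the fixed-size array are invented for the example, and the real cache lives in Lisp inside syntax-ppss. It assumes a flush from position BEG discards every cached state at or after BEG, so that a flush from -1 (as used for a syntax table change) empties the cache, while a text property change at some position keeps the entries before it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_MAX 16

/* One cached parse result: at buffer position POS the parser was
   (or was not) inside a string.  */
struct toy_entry { ptrdiff_t pos; int in_string; };

/* Entries are kept sorted by ascending POS.  */
struct toy_cache { struct toy_entry e[TOY_CACHE_MAX]; size_t n; };

/* Append an entry; callers must add positions in ascending order.  */
static void
toy_cache_add (struct toy_cache *c, ptrdiff_t pos, int in_string)
{
  assert (c->n < TOY_CACHE_MAX);
  assert (c->n == 0 || c->e[c->n - 1].pos < pos);
  c->e[c->n].pos = pos;
  c->e[c->n].in_string = in_string;
  c->n++;
}

/* Drop every entry at or after BEG; entries before BEG stay valid
   because nothing earlier in the buffer has changed meaning.  */
static void
toy_cache_flush (struct toy_cache *c, ptrdiff_t beg)
{
  size_t keep = 0;
  while (keep < c->n && c->e[keep].pos < beg)
    keep++;
  c->n = keep;
}
```

A lookup after a flush has to re-parse from the last surviving entry, which is why a missed flush leaves stale states behind and a correct one only costs time.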

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Mon, 09 Apr 2018 18:42:02 GMT) Full text and rfc822 format available.

Message #79 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Mackenzie <acm <at> muc.de>
Cc: npostavs <at> users.sourceforge.net, dgutov <at> yandex.ru, monnier <at> IRO.UMontreal.CA,
 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Mon, 09 Apr 2018 21:41:21 +0300

> Date: Sun, 8 Apr 2018 10:52:57 +0000
> Cc: monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
>   npostavs <at> users.sourceforge.net
> From: Alan Mackenzie <acm <at> muc.de>
> 
> Can we start moving forward with this change, please?

What prevents you from moving forward with it?  You already know what
needs to be done to move forward: documentation of the new primitives
and some tests.  Am I missing something?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Tue, 10 Apr 2018 17:33:02 GMT) Full text and rfc822 format available.

Message #82 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: npostavs <at> users.sourceforge.net, dgutov <at> yandex.ru, monnier <at> IRO.UMontreal.CA,
 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure
Date: Tue, 10 Apr 2018 17:31:06 +0000

Hello, Eli.

On Mon, Apr 09, 2018 at 21:41:21 +0300, Eli Zaretskii wrote:
> > Date: Sun, 8 Apr 2018 10:52:57 +0000
> > Cc: monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
> >   npostavs <at> users.sourceforge.net
> > From: Alan Mackenzie <acm <at> muc.de>

> > Can we start moving forward with this change, please?

> What prevents you from moving forward with it?

The lack of an expressed intention, in principle, to accept the patch.
I'll take your post as this.

> You already know what needs to be done to move forward: documentation
> of the new primitives and some tests.

Yes.  I will move forward with this now.  Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Mon, 16 Apr 2018 19:25:02 GMT) Full text and rfc822 format available.

Message #85 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: npostavs <at> users.sourceforge.net, dgutov <at> yandex.ru, monnier <at> IRO.UMontreal.CA,
 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure - Documentation
 enhancements
Date: Mon, 16 Apr 2018 19:21:37 +0000

Hello, Eli.

On Mon, Apr 09, 2018 at 21:41:21 +0300, Eli Zaretskii wrote:
> > Date: Sun, 8 Apr 2018 10:52:57 +0000
> > Cc: monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
> >   npostavs <at> users.sourceforge.net
> > From: Alan Mackenzie <acm <at> muc.de>

> > Can we start moving forward with this change, please?

> What prevents you from moving forward with it?  You already know what
> needs to be done to move forward: documentation of the new primitives
> and some tests.

After mulling it over for several weeks, I now think it would be better
just to leave these new primitives out.  Although they might be useful,
their main use was for me whilst hacking the syntax routines.

However, I propose the following documentation changes to go with my
patch to the code, these changes clarifying some of the limitations
inherent to syntax-ppss, and indicating how my enhancements will work.



diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
index 3327d7855c..b58e2a9a98 100644
--- a/doc/lispref/syntax.texi
+++ b/doc/lispref/syntax.texi
@@ -430,6 +430,11 @@ Syntax Table Functions
 This function always returns @code{nil}.  The old syntax information in
 the table for this character is discarded.
 
+Note that if the modification of @var{table} changes the way it parses
+comments or strings, the cache used by @code{syntax-ppss} will be
+emptied in each buffer whose current syntax table (or its parent,
+etc.) is @var{table}.  @xref{Position Parse}.
+
 @example
 @group
 @exdent @r{Examples:}
@@ -502,6 +507,12 @@ Syntax Table Functions
 @defun set-syntax-table table
 This function makes @var{table} the syntax table for the current buffer.
 It returns @var{table}.
+
+Note that if @var{table} parses comments or strings differently from
+the buffer's previous syntax table, the cache used by
+@code{syntax-ppss} in this buffer will be emptied.  @xref{Position
+Parse}.
+
 @end defun
 
 @defun syntax-table
@@ -523,6 +534,9 @@ Syntax Table Functions
 more precise: @code{with-syntax-table} temporarily alters the current
 syntax table of whichever buffer is current at the time the macro
 execution starts.  Other buffers are not affected.
+
+This macro might empty the cache used by @code{syntax-ppss}, as noted
+above under @code{set-syntax-table}.
 @end defmac
 
 @node Syntax Properties
@@ -534,6 +548,15 @@ Syntax Properties
 occurrences in the buffer, by applying a @code{syntax-table} text
 property.  @xref{Text Properties}, for how to apply text properties.
 
+As an alternative to setting @code{syntax-table} text properties
+directly, you can use @code{syntax-propertize-function} (see below).
+Most major modes supplied with Emacs which use these text properties
+use this method for applying them.  We strongly recommend that you use
+just one of these methods in any Emacs Lisp program, and not to mix
+them in the same buffer.@footnote{@code{syntax-propertize-function}
+can operate at unpredictable times, and may erase explicitly applied
+@code{syntax-table} properties.}
+
   The valid values of @code{syntax-table} text property are:
 
 @table @asis
@@ -556,6 +579,10 @@ Syntax Properties
 If this is non-@code{nil}, the syntax scanning functions, like
 @code{forward-sexp}, pay attention to syntax text properties.
 Otherwise they use only the current syntax table.
+
+If you change this variable after buffer initialization time, even if
+your Emacs Lisp program doesn't use @code{syntax-ppss}, you should
+call @code{syntax-ppss-flush-cache}.  @xref{Position Parse}.
 @end defvar
 
 @defvar syntax-propertize-function
@@ -695,8 +722,9 @@ Motion via Parsing
 negative @var{depth} has the effect of moving deeper by @var{-depth}
 levels of parenthesis.
 
-Scanning ignores comments if @code{parse-sexp-ignore-comments} is
-non-@code{nil}.
+Scanning skips over comments if @code{parse-sexp-ignore-comments} is
+non-@code{nil} (which it usually is for programming language major
+modes).
 
 If the scan reaches the beginning or end of the accessible part of the
 buffer before it has scanned over @var{count} parenthetical groupings,
@@ -709,7 +737,7 @@ Motion via Parsing
 It returns the position where the scan stops.  If @var{count} is
 negative, the scan moves backwards.
 
-Scanning ignores comments if @code{parse-sexp-ignore-comments} is
+Scanning skips over comments if @code{parse-sexp-ignore-comments} is
 non-@code{nil}.
 
 If the scan reaches the beginning or end of (the accessible part of) the
@@ -747,7 +775,7 @@ Position Parse
 
   For syntactic analysis, such as in indentation, often the useful
 thing is to compute the syntactic state corresponding to a given buffer
-position.  This function does that conveniently.
+position.  @code{syntax-ppss} does that conveniently.
 
 @defun syntax-ppss &optional pos
 This function returns the parser state that the parser would reach at
@@ -769,6 +797,33 @@ Position Parse
 complete subexpression) and sixth value (minimum parenthesis depth) in
 the returned parser state are not meaningful.
 
+Note that these caches become invalid when you set a new syntax table
+for the buffer, or change an entry for a character in the current
+syntax table.  If you set or clear a @code{syntax-table} text property
+at some buffer position, the caches become invalid for the buffer
+portion at and after that position.  It is your responsibility to deal
+with these situations, either by calling
+@code{syntax-ppss-flush-cache} (see below), or by refraining from
+using @code{syntax-ppss} while the caches are invalid.  An exception
+to this is when such a change alters the way comments or strings are
+parsed; then, Emacs calls @code{syntax-ppss-flush-cache}
+automatically.@footnote{The reason for this is that from Emacs 27,
+Emacs uses @code{syntax-ppss} internally in low level primitives such
+as @code{forward-comment} and @code{scan-lists}.  This automatic
+flushing of the cache helps these primitives continue to work
+reliably.}
+
+Changing any of the variables @code{multibyte-syntax-as-symbol},
+@code{parse-sexp-ignore-comments}, @code{comment-end-can-be-escaped}
+(@pxref{Control Parsing}), or @code{parse-sexp-lookup-properties}
+(@pxref{Syntax Properties}) after buffer initialization will always
+leave @code{syntax-ppss}'s caches invalid in the affected buffers.  So
+will changing a symbol's @code{syntax-table} property when that symbol
+is the value of a @code{category} text property somewhere in the
+buffer (@pxref{Special Properties}), a practice we don't recommend.
+In such cases you must always take one of the actions detailed in the
+previous paragraph.
+
 This function has a side effect: it adds a buffer-local entry to
 @code{before-change-functions} (@pxref{Change Hooks}) for
 @code{syntax-ppss-flush-cache} (see below).  This entry keeps the
@@ -952,6 +1007,11 @@ Control Parsing
 You can use @code{forward-comment} to move forward or backward over
 one comment or several comments.
 
+If you change any of the above three variables after buffer
+initialization time, even if your Emacs Lisp program doesn't use
+@code{syntax-ppss}, you should call @code{syntax-ppss-flush-cache}.
+@xref{Position Parse}.
+
 @node Syntax Table Internals
 @section Syntax Table Internals
 @cindex syntax table internals
diff --git a/doc/lispref/text.texi b/doc/lispref/text.texi
index ebfa8b9b0f..d8df9fe352 100644
--- a/doc/lispref/text.texi
+++ b/doc/lispref/text.texi
@@ -3203,6 +3203,14 @@ Special Properties
 properties of this symbol serve as defaults for the properties of the
 character.
 
+You might be tempted to put a @code{syntax-table} property onto the
+symbol, and change this property's value repeatedly in your Lisp
+program, thus changing the syntax of many characters in a buffer at
+the same time.  We advise against doing this, since it renders the
+caches used by @code{syntax-ppss} invalid in a way that Emacs cannot
+detect and correct for.  If you are going to be doing this, please
+consult @ref{Position Parse} and follow the advice there.
+
 @item face
 @cindex face codes of text
 @kindex face @r{(text property)}


-- 
Alan Mackenzie (Nuremberg, Germany).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Thu, 19 Apr 2018 07:53:01 GMT) Full text and rfc822 format available.

Message #88 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Mackenzie <acm <at> muc.de>
Cc: npostavs <at> users.sourceforge.net, dgutov <at> yandex.ru, monnier <at> IRO.UMontreal.CA,
 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure - Documentation
 enhancements
Date: Thu, 19 Apr 2018 10:52:41 +0300

> Date: Mon, 16 Apr 2018 19:21:37 +0000
> Cc: monnier <at> IRO.UMontreal.CA, 30393 <at> debbugs.gnu.org, dgutov <at> yandex.ru,
>   npostavs <at> users.sourceforge.net
> From: Alan Mackenzie <acm <at> muc.de>
> 
> However, I propose the following documentation changes to go with my
> patch to the code, these changes clarifying some of the limitations
> inherent to syntax-ppss, and indicating how my enhancements will work.

OK.  I have a minor stylistic comment about the documentation changes:

> +As an alternative to setting @code{syntax-table} text properties
> +directly, you can use @code{syntax-propertize-function} (see below).
> +Most major modes supplied with Emacs which use these text properties
> +use this method for applying them.  We strongly recommend that you use
> +just one of these methods in any Emacs Lisp program, and not to mix
> +them in the same buffer.@footnote{@code{syntax-propertize-function}
> +can operate at unpredictable times, and may erase explicitly applied
> +@code{syntax-table} properties.}

@footnote should begin before the period that ends the sentence to which
the footnote applies.  I believe the usual typographic convention is
to show footnotes like this:

   bla bla bla¹.

rather than like this:

   yak yak yak.¹

If you agree, this should be fixed in more than one place in your
patch.

Another minor comment is to please consider whether the description
you add warrants some new index entries, so that it would be easier to
find this stuff when one is specifically looking for it.  I tend to
think at least some index entries would be useful.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Sat, 22 Aug 2020 16:08:01 GMT) Full text and rfc822 format available.

Message #91 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Alan Mackenzie <acm <at> muc.de>, dgutov <at> yandex.ru, 30393 <at> debbugs.gnu.org,
 monnier <at> IRO.UMontreal.CA, npostavs <at> users.sourceforge.net
Subject: Re: bug#30393: 24.4; cperl-mode: indentation failure -
 Documentation enhancements
Date: Sat, 22 Aug 2020 18:07:23 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

>> However, I propose the following documentation changes to go with my
>> patch to the code, these changes clarifying some of the limitations
>> inherent to syntax-ppss, and indicating how my enhancements will work.
>
> OK.  I have a minor stylistic comment about the documentation changes:

This was more than two years ago, but the patch was apparently never
applied.  Alan?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Tue, 03 Nov 2020 13:46:02 GMT) Full text and rfc822 format available.

Message #94 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: haj <at> posteo.de (Harald Jörg)
To: 30393 <at> debbugs.gnu.org
Subject: [PATCH] Add a test to verify that the bug is gone (and a fix for
 Emacs 26)
Date: Tue, 03 Nov 2020 14:45:20 +0100

This bug had a rather long discussion, but apparently no conclusion so
far.  I found that the indentation is correct in Emacs 27 and 28, so
apparently it has been fixed elsewhere.  The patch contains a test
(suitable for cperl-mode and perl-mode) to verify correct indentation
for the example source code from the bug report.

In Emacs 26 the bug still exists.  I find it serious enough to add the
workaround given by Noam Postavsky in the discussion of Bug#25480.  When
the opening paren in column 0 is within a string variable (as it is in
the code from the bug report), indenting changed the value of that
variable.  This should not be allowed to happen.
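
For readers who have not met the heuristic behind this bug, here is a rough sketch in plain C of how an open paren in column 0 can be taken for the start of a top-level form. It is a toy, not the Emacs implementation: the function name and the flat string buffer are assumptions for the example, and it ignores everything the real scanner knows about strings and comments, which is exactly the weakness at issue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the nearest '(' that starts a line at or before
   POS in TEXT, or -1 if there is none.  The scan looks only at raw
   characters, so a paren inside a string literal is found as well.  */
static ptrdiff_t
toy_column0_paren_before (const char *text, ptrdiff_t pos)
{
  assert (text != NULL && pos >= 0);
  assert ((size_t) pos <= strlen (text));
  for (ptrdiff_t i = pos; i >= 0; i--)
    if (text[i] == '(' && (i == 0 || text[i - 1] == '\n'))
      return i;
  return -1;
}
```

Run on the multi-line SQL string from the original report, this scan stops at the "(id, priority)" line inside the string, so any indentation computed from that point starts from the wrong place.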
-- 
Cheers,
haj

[0001-cperl-mode-Fix-indentation-for-Emacs-26-Bug-30393.patch (text/x-diff, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30393; Package emacs. (Tue, 03 Nov 2020 14:30:02 GMT) Full text and rfc822 format available.

Message #97 received at 30393 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: haj <at> posteo.de (Harald Jörg)
Cc: 30393 <at> debbugs.gnu.org
Subject: Re: bug#30393: [PATCH] Add a test to verify that the bug is gone
 (and a fix for Emacs 26)
Date: Tue, 03 Nov 2020 15:29:07 +0100

haj <at> posteo.de (Harald Jörg) writes:

> In Emacs 26 the bug still exists.  I find it serious enough to add the
> workaround given by Noam Postavsky in the discussion of Bug#25480.  When
> the opening paren in column 0 is within a string variable (as it is in
> the code from the bug report), indenting changed the value of that
> variable.  This should not be allowed to happen.

Thanks; applied to the trunk.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 03 Nov 2020 14:30:03 GMT) Full text and rfc822 format available.

bug marked as fixed in version 28.1, send any further explanations to 30393 <at> debbugs.gnu.org and paulusm <paulusm <at> bigpond.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 03 Nov 2020 14:30:04 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 02 Dec 2020 12:24:09 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 145 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.