GNU bug report logs - #36474
Algorithm in electric-pair--unbalanced-strings-p unsuitable for CC Mode


Package: emacs;

Reported by: Alan Mackenzie <acm <at> muc.de>

Date: Tue, 2 Jul 2019 13:17:01 UTC

Severity: normal

Done: Alan Mackenzie <acm <at> muc.de>

Bug is archived. No further changes may be made.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#36474; Package emacs. (Tue, 02 Jul 2019 13:17:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Alan Mackenzie <acm <at> muc.de>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 02 Jul 2019 13:17:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: João Távora <joaotavora <at> gmail.com>,
 bug-gnu-emacs <at> gnu.org
Subject: Algorithm in electric-pair--unbalanced-strings-p unsuitable for CC
 Mode
Date: Tue, 2 Jul 2019 13:16:32 +0000
Hello João and Emacs.

This is a follow up bug to bug #36423: 27.0.50; electric-pair-mode not
working properly depending of file content.

Start the Emacs master (up to date state as of 2019-07-02T14:30 +0000)
with emacs -Q, put the following in a C++ Mode buffer and enable
electric-pair-mode:

"foo\n

.  Type a " at the end of foo.  electric-pair-mode wrongly inserts two
"s.

Diagnosis: electric-pair--unbalanced-strings-p works after the (single)
newly typed " has been stripped from the buffer.  It attempts to
determine whether there are any open strings after the point of
insertion.  It does this by using parse-partial-sexp, and checks (nth 3
<result>) as evidence of an open string.

This does not work in CC Mode, since although there is an open string
marker (with a string fence syntax-table property on it) this is
"closed" (from parse-partial-sexp's point of view) by the string fence
property on the newline at the end of the line.
electric-pair--unbalanced-strings-p thus returns the wrong result.
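
The effect described above can be reproduced outside CC Mode.  What
follows is a minimal sketch, not code from elec-pair.el or CC Mode: the
function name demo-open-string-state is hypothetical, and the string
fence properties (syntax class 15) are applied by hand in a temporary
buffer as a stand-in for what CC Mode is described as doing.

```elisp
;; Illustrative only.  A fundamental-mode temp buffer with hand-applied
;; text properties stands in for a CC Mode buffer.
(defun demo-open-string-state (with-fences)
  "Return (nth 3 STATE) after parsing a buffer holding an unclosed string.
The buffer contains a double quote, the text foo, and a newline.
If WITH-FENCES is non-nil, first put string fence syntax-table
properties on the quote and on the newline."
  (with-temp-buffer
    (insert "\"foo\n")
    (when with-fences
      (put-text-property 1 2 'syntax-table '(15))   ; the opening quote
      (put-text-property 5 6 'syntax-table '(15)))  ; the newline
    ;; Make the parser honour syntax-table text properties.
    (let ((parse-sexp-lookup-properties t))
      (nth 3 (parse-partial-sexp (point-min) (point-max))))))
```

Under these assumptions, (demo-open-string-state nil) reports an open
string (the terminator character, ?\"), while (demo-open-string-state t)
returns nil, because the fence on the newline closes the string as far
as parse-partial-sexp is concerned.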

A more suitable algorithm might look something like this: check whether
the newly inserted " has a string fence syntax-table text property.
(Its insertion will have already triggered the before- and
after-change-functions which set this property.)  If so, there is an
open string.  Of course, this only applies to CC Mode modes.
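
The property test at the heart of this suggestion could look roughly
like the sketch below.  It is an assumption-laden illustration rather
than the fix that was eventually made: the function name
demo-string-fence-at-p is hypothetical, and the sketch only tests the
text property at a given position; it does not hook into
electric-pair-mode.

```elisp
;; Hypothetical helper, not part of elec-pair.el or CC Mode.
(defun demo-string-fence-at-p (pos)
  "Return non-nil if the character at POS has a string fence property.
Only the `syntax-table' text property is examined, not the buffer's
syntax table."
  (equal (get-text-property pos 'syntax-table)
         (string-to-syntax "|")))   ; "|" is the string fence class
```

A caller would pass the position of the newly inserted quote, for
example (demo-string-fence-at-p (1- (point))) immediately after the
insertion.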

-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#36474; Package emacs. (Tue, 02 Jul 2019 16:05:01 GMT) Full text and rfc822 format available.

Message #8 received at 36474 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: 36474 <at> debbugs.gnu.org
Subject: Re: Algorithm in electric-pair--unbalanced-strings-p unsuitable for
 CC Mode
Date: Tue, 2 Jul 2019 16:04:10 +0000
On Tue, Jul 02, 2019 at 15:13:35 +0100, João Távora wrote:
>  > Hello João and Emacs.

> Hello Alan and Emacs,

> On Tue, Jul 2, 2019 at 2:16 PM Alan Mackenzie <acm <at> muc.de> wrote:

> > This is a follow up bug to bug #36423: 27.0.50; electric-pair-mode not
> > working properly depending of file content.

> [did you mean to copy bug-gnu-emacs <at> gnu.org, or emacs-devel <at> gnu.org?
> I'm assuming the latter and correcting]

No, it's a bug, therefore I submitted a bug report.

> > Start the Emacs master (up to date state as of 2019-07-02T14:30 +0000)
> > with emacs -Q, put the following in a C++ Mode buffer and enable
> > electric-pair-mode:

> > "foo\n

> > .  Type a " at the end of foo.  electric-pair-mode wrongly inserts two
> > "s.

> > Diagnosis: electric-pair--unbalanced-strings-p works after the (single)
> > newly typed " has been stripped from the buffer.  It attempts to
> > determine whether there are any open strings after the point of
> > insertion.  It does this by using parse-partial-sexp, and checks (nth 3
> > <result>) as evidence of an open string.

> I'm afraid this is a (another?) direct consequence of the NL-terminated
> strings feature that you introduced more than one year ago.  If you
> remember, this had various consequences vis-a-vis balancing,
> broke a test (one that I disabled in the expectation that a fix would be
> made available, which I don't think happened). Here are some points of
> that thread:

> https://lists.gnu.org/archive/html/emacs-devel/2018-06/msg00551.html
> https://lists.gnu.org/archive/html/emacs-devel/2018-06/msg00580.html

> I think I made my views clear back then: NL-terminated strings are a
> misfeature. The only argument _for_ them, that they mimic what some
> compilers do, is very weak because (1) the code is invalid in both
> situations (not in any way "slightly less" invalid in any of them) and
> (2) compilers don't edit code and so have different requirements.

> The arguments _against_ NL-terminated strings are that they (1) break
> longstanding features such as sexp-based navigation (e.g. `up-list`
> and friends) for modes such as, say, `js-mode` and (2) break features
> that expect this to work, most notably electric-pair-mode.

This isn't true.  If those other features no longer work with an up to
date Emacs, they should be fixed.

The fontification that CC Mode does is natural and helpful, and users
haven't complained about it (except when there've been bugs).  There have
certainly been no complaints about using font-lock-warning-face for the
invalid string delimiters, and font-lock-string-face for valid ones.

> Moving forward:

> 1. We can consider that electric-pair-mode is doing the right thing.
> Indeed if NL is indeed terminating a string, then quote balance has been
> maintained after the double quote insertion, i.e. it has not worsened.
> That is the general contract of  `electric-pair-preserve-balance`.

There is a bug: on typing a " to close a string, two "s are inserted into
the buffer, the second one being invalid.  This makes absolutely no sense
from a user point of view.

> 2. The NL-terminated string feature is removed (or, if you prefer, is
> made disableable). This would restore the behaviour that most users
> would expect coming over, from say python-mode or js-mode. Perhaps
> it can already be disabled with a couple of lines of emacs-lisp tweaking
> the syntax-table.

The invalid string feature is here to stay.  It is a positive user
feature.  CC Mode has often been a pioneer in inventing Emacs features,
and this is just such a feature.

> 3. Someone comes up with a suitable indirection that doesn't involve
> hardcoding `cc-mode` in elec-pair.el.  That indirection would
> presumably do what you want for modes derived from
> `cc-mode`.

There is already a great deal of such indirection in electric-pair-mode.
(Look for "(funcall electric-pair-....)" in elec-pair.el.)  Maybe there
is enough there already to accommodate CC Mode, maybe an extra function
variable would need introducing.

For this, I think we would both rather that you amend elec-pair.el rather
than me.

> 4. Someone reinvents electric-pair-mode in cc-mode.el.
> Let's not do this.

No, let's not do that!  :-)

> I prefer 2.

That isn't an option.  Unless you can come up with another workable
strategy that achieves the same effect.

> Thanks,
> João



> > This does not work in CC Mode, since although there is an open string
> > marker (with a string fence syntax-table property on it) this is
> > "closed" (from parse-partial-sexp's point of view) by the string fence
> > property on the newline at the end of the line.
> > electric-pair--unbalanced-strings-p thus returns the wrong result.

> > A more suitable algorithm might look something like this: check whether
> > the newly inserted " has a string fence syntax-table text property.
> > (Its insertion will have already triggered the before- and
> > after-change-functions which set this property.)  If so, there is an
> > open string.  Of course, this only applies to CC Mode modes.

> --
> João Távora

-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#36474; Package emacs. (Tue, 02 Jul 2019 17:23:02 GMT) Full text and rfc822 format available.

Message #11 received at 36474 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Alan Mackenzie <acm <at> muc.de>, emacs-devel <emacs-devel <at> gnu.org>,
 spacibba <at> aol.com
Cc: 36474 <at> debbugs.gnu.org
Subject: Re: Algorithm in electric-pair--unbalanced-strings-p unsuitable for
 CC Mode
Date: Tue, 2 Jul 2019 18:22:34 +0100
On Tue, Jul 2, 2019 at 5:04 PM Alan Mackenzie <acm <at> muc.de> wrote:

> > [did you mean to copy bug-gnu-emacs <at> gnu.org, or emacs-devel <at> gnu.org?
> > I'm assuming the latter and correcting]
> No, it's a bug, therefore I submitted a bug report.

You should use the X-Debbugs-CC feature then (and why not continue
in the existing bug 36423?)

Anyway, I insist this matter be brought to emacs-devel because it's a
followup to a discussion that started there but never reached a
suitable conclusion. For that reason, and because I provide a
workaround for the bug at the end of this message, I'm cross-posting
this one mail to both the bug list and emacs-devel.

> > The arguments _against_ NL-terminated strings are that they (1) break
> > longstanding features such as sexp-based navigation (e.g. `up-list`
> > and friends) for modes such as, say, `js-mode` and (2) break features
> > that expect this to work, most notably electric-pair-mode.
>
> This isn't true.

What "isn't true"? Have those features broken or not? They worked
before the fe06f643b commit and don't work after the commit. It
sounds quite irrefutable to me.

> If those other features no longer work with an up to
> date Emacs, they should be fixed.

I've stated this repeatedly in the life of this discussion: it's not just
about electric-pair-mode. If you try to M-x up-list from a multi-line
string in emacs 26.1 it works just as well in js-mode and c++-mode.  In
emacs master it does not in c++-mode. Same with forward-sexp on the
starting delimiter, etc.

> The fontification that CC Mode does is natural and helpful, and users
> haven't complained about it (except when there've been bugs).

Yes, users haven't complained except when users have complained.

> There have
> certainly been no complaints about using font-lock-warning-face for the
> invalid string delimiters, and font-lock-string-face for valid ones.

That's because providing this annotation is perfectly fine.  The problem
is providing it _at the expense of other features_. And _that's_ what
they've complained about: an average user has no obvious way of
telling that the particular implementation of the red annotation thingy
is guilty of breaking his C-M-u or his electric-pair-mode.

He/she might even judge the latter more vital than said red thingy,
an annotation which he/she will get by other means if using
very popular packages such as flycheck, or flymake, or eglot, or
lsp-mode, etc. These normally call the compiler directly on the
source code and highlight those and many other errors.

On the other hand, if what you want is the red annotation, are you
absolutely sure there isn't a better way to get it? And if you are,
are you also absolutely sure you need to put it in the code and
not provide an easy way to turn it off?

> For this, I think we would both rather that you amend elec-pair.el rather
> than me.

I'll be "mulling this over". There are potentially many other points of
breakage that would need such an indirection, and doing that to serve
just a particular cc-mode quirk doesn't sound like a priority to me.

In the meantime, let others chime in.

Also, in the meantime, for a user that is bothered by this bug,
I'd recommend putting something like this in his/her .emacs file:

  (defun c-unescaped-nls-in-string-p (&optional quote-pos) t)

I had something more elaborate in my setup but just this
seems to fix it in my testing.

There is also a very promising variable, c-multiline-string-start-char,
that I think would be a good candidate for customizing this, but I
haven't messed with it enough. Just setting it from .emacs doesn't
do the trick. Perhaps in a mode hook?

--
João Távora

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#36474; Package emacs. (Tue, 02 Jul 2019 18:29:01 GMT) Full text and rfc822 format available.

Message #14 received at 36474 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: João Távora <joaotavora <at> gmail.com>
Cc: spacibba <at> aol.com, 36474 <at> debbugs.gnu.org, emacs-devel <emacs-devel <at> gnu.org>
Subject: Re: Algorithm in electric-pair--unbalanced-strings-p unsuitable for
 CC Mode
Date: Tue, 2 Jul 2019 18:28:11 +0000
Hello, João.

On Tue, Jul 02, 2019 at 18:22:34 +0100, João Távora wrote:
> On Tue, Jul 2, 2019 at 5:04 PM Alan Mackenzie <acm <at> muc.de> wrote:

> > > [did you mean to copy bug-gnu-emacs <at> gnu.org, or emacs-devel <at> gnu.org?
> > > I'm assuming the latter and correcting]
> > No, it's a bug, therefore I submitted a bug report.

> You should use the X-Debbugs-CC feature then (and why not continue
> in the existing bug 36423?)

I didn't know about the former, and as for the bug, it is a different
bug with different causes from #36423.

> Anyway, I insist this matter be brought to emacs-devel because it's a
> followup to a discussion that started there but never reached a
> suitable conclusion. For that reason, and because I provide a
> workaround for the bug at the end of this message, I'm cross-posting
> this one mail to both the bug list and emacs-devel.

> > > The arguments _against_ NL-terminated strings are that they (1) break
> > > longstanding features such as sexp-based navigation (e.g. `up-list`
> > > and friends) for modes such as, say, `js-mode` and (2) break features
> > > that expect this to work, most notably electric-pair-mode.

> > This isn't true.

> What "isn't true"? Have those features broken or not?

They may well be broken.  CC Mode hasn't broken them.  They made invalid
assumptions, which turned out to be unjustified.

> They worked before the fe06f643b commit and don't work after the
> commit. It sounds quite irrefutable to me.

I don't know what you're talking about.

> > If those other features no longer work with an up to date Emacs, they
> > should be fixed.

> I've stated this repeatedly in the life of this discussion: it's not
> just about electric-pair-mode. If you try to M-x up-list from a
> multi-line string in emacs 26.1 it works just as well in js-mode and
> c++-mode.  In emacs master it does not in c++-mode. Same with
> forward-sexp on the starting delimiter, etc.

I've just tried these in a multiline string in C++ Mode.  Both up-list
and forward-sexp work just fine.  I don't know what you're doing.

> > The fontification that CC Mode does is natural and helpful, and users
> > haven't complained about it (except when there've been bugs).

> Yes, users haven't complained except when users have complained.

> > There have certainly been no complaints about using
> > font-lock-warning-face for the invalid string delimiters, and
> > font-lock-string-face for valid ones.

> That's because providing this annotation is perfectly fine.  The problem
> is providing it _at the expense of other features_.

If other features are broken (and your list of other broken features, so
far, is empty), they should be fixed.

> And _that's_ what they've complained about: an average user has no
> obvious way of telling that the particular implementation of the red
> annotation thingy is guilty of breaking his C-M-u or his
> electric-pair-mode.

That's groundless disparagement.  C-M-u works, and electric-pair-mode is
broken because it's broken.  In one place it's using scan-sexps to move
forward over whitespace, totally oblivious to the possibility of
syntax-table text properties (which have been in use since Emacs-21).
That's broken code.

> He/she might even judge the latter more vital than said red thingy,
> an annotation which he/she will get by other means if using
> very popular packages such as flycheck, or flymake, or eglot, or
> lsp-mode, etc. These normally call the compiler directly on the
> source code and highlight those and many other errors.

Irrelevant.

> On the other hand, if what you want is the red annotation, are you
> absolutely sure there isn't a better way to get it?

No, I'm not.  That's why I invited you to come up with a better way, if
you can.

> And if you are, are you also absolutely sure you need to put it in the
> code and not provide an easy way to turn it off?

It's a core feature of the mode, not an option.

> > For this, I think we would both rather that you amend elec-pair.el rather
> > than me.

> I'll be "mulling this over". There are potentially many other points
> of breakage

"Potentially" many?  So far, there is precisely one, in
electric-pair--unbalanced-strings-p.  I thought I was doing you a favour
by diagnosing the trouble.  If I'd known I'd get the reaction from you
I've just got, I wouldn't have bothered.

> that would need such an indirection, and doing that to serve just a
> particular cc-mode quirk doesn't sound like a priority to me.

No, you'd be cleaning up your code, to conform with the reality that in
2019 major modes use syntax-table text properties.  Features from CC
Mode have a habit of migrating to the Emacs core.

> In the meantime, let others chime in.

> Also, in the meantime, for a user that is bothered by this bug,
> I'd recommend putting something like this in his/her .emacs file:

>   (defun c-unescaped-nls-in-string-p (&optional quote-pos) t)

It's free software, but that's a stupid thing to do.

> I had something more elaborate in my setup but just this
> seems to fix it in my testing.

> There is also a very promising variable, c-multiline-string-start-char,
> that I think would be a good candidate for customizing this, ....

It is not a customisation variable.  It is a language definition
variable.

> .... but I haven't messed with it enough. Just setting it from .emacs
> doesn't do the trick. Perhaps in a mode hook?

Or, alternatively, actually fix the problems which have been festering
for years or decades, and are just now revealing themselves.  Thus far,
there's exactly one such problem in electric-pair--unbalanced-strings-p.

> --
> João Távora

-- 
Alan Mackenzie (Nuremberg, Germany).




Reply sent to Alan Mackenzie <acm <at> muc.de>:
You have taken responsibility. (Mon, 08 Jul 2019 09:37:01 GMT) Full text and rfc822 format available.

Notification sent to Alan Mackenzie <acm <at> muc.de>:
bug acknowledged by developer. (Mon, 08 Jul 2019 09:37:02 GMT) Full text and rfc822 format available.

Message #19 received at 36474-done <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: 36474-done <at> debbugs.gnu.org
Subject: Re: bug#36474: Acknowledgement (Algorithm in
 electric-pair--unbalanced-strings-p unsuitable for CC Mode)
Date: Mon, 8 Jul 2019 09:36:27 +0000
The bug has been fixed in CC Mode.

On Tue, Jul 02, 2019 at 13:17:01 +0000, GNU bug Tracking System wrote:
> Thank you for filing a new bug report with debbugs.gnu.org.

> This is an automatically generated reply to let you know your message
> has been received.

> Your message is being forwarded to the package maintainers and other
> interested parties for their attention; they will reply in due course.

> Your message has been sent to the package maintainer(s):
>  bug-gnu-emacs <at> gnu.org

> If you wish to submit further information on this problem, please
> send it to 36474 <at> debbugs.gnu.org.

> Please do not send mail to help-debbugs <at> gnu.org unless you wish
> to report a problem with the Bug-tracking system.

> -- 
> 36474: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=36474
> GNU Bug Tracking System
> Contact help-debbugs <at> gnu.org with problems

-- 
Alan Mackenzie (Nuremberg, Germany).




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 05 Aug 2019 11:24:07 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 255 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.