GNU bug report logs - #43631
28.0.50; CC Mode multiline strings grinds performance to a halt

Package: emacs;

Reported by: Theodor Thornhill <theo <at> thornhill.no>

Date: Sat, 26 Sep 2020 11:18:01 UTC

Severity: normal

Found in version 28.0.50

Done: Alan Mackenzie <acm <at> muc.de>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 43631 in the body.
You can then email your comments to 43631 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sat, 26 Sep 2020 11:18:01 GMT)

Acknowledgement sent to Theodor Thornhill <theo <at> thornhill.no>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 26 Sep 2020 11:18:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Theodor Thornhill <theo <at> thornhill.no>
To: bug-gnu-emacs <at> gnu.org
Subject: 28.0.50; CC Mode multiline strings grinds performance to a halt
Date: Sat, 26 Sep 2020 13:17:29 +0200
Hello there!

While creating a new mode derived from CC Mode, we noticed performance
is affected heavily when setting a character for
'c-multiline-string-start-char'. There is a discussion around this that
can be found at https://github.com/josteink/csharp-mode/issues/164,
and we were given a repo that reproduces it easily. It is verified
to slow typing down in 'csharp-mode', in 'pike-mode', and in this test
case:
https://github.com/unhammer/csharp-mode/tree/164-repro

I suspect, without much confidence, that some of the problematic code
is around line 2047 in 'cc-mode.el', but this is only a guess based on
some light profiling.

The issue is described well on GitHub, and I think restating it here
would only cause confusion.

One thing of note is that you don't even have to have any multiline
strings for this performance hit to occur, meaning all 'csharp-mode'
files do suffer from this.

Let me know if something is still unclear, and I'll try to bring up some
more information.

All the best,
Theodor Thornhill




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sat, 26 Sep 2020 11:42:01 GMT)

Message #8 received at 43631 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Theodor Thornhill <theo <at> thornhill.no>
Cc: 43631 <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50;
 CC Mode multiline strings grinds performance to a halt
Date: Sat, 26 Sep 2020 14:41:19 +0300
> Date: Sat, 26 Sep 2020 13:17:29 +0200
> From: Theodor Thornhill via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> While creating a new mode derived from CC Mode, we noticed performance
> is affected heavily when setting a character for
> 'c-multiline-string-start-char'. There is a discussion around this that
> can be found at https://github.com/josteink/csharp-mode/issues/164,
> and we were given a repo that reproduces it easily. It is verified
> to slow typing down in 'csharp-mode', in 'pike-mode', and in this test
> case:
> https://github.com/unhammer/csharp-mode/tree/164-repro
> 
> I suspect, without much confidence, that some of the problematic code
> is around line 2047 in 'cc-mode.el', but this is only a guess based on
> some light profiling.

All the profiles posted there end prematurely, thus making it
impossible to make independent conclusions regarding the possible
culprits.  Would it be possible to post here a full profile,
completely expanded, obtained after loading all the relevant *.el
files as *.el (NOT *.elc!), so that the profile is detailed enough to
show the relevant parts?  It would make the discussion much more
focused.

Bonus points for posting another profile, where the feature you think
is the main culprit is disabled.

TIA




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sat, 26 Sep 2020 12:41:02 GMT)

Message #11 received at 43631 <at> debbugs.gnu.org:

From: Theodor Thornhill <theo <at> thornhill.no>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 43631 <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds
 performance to a halt
Date: Sat, 26 Sep 2020 14:40:36 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> All the profiles posted there end prematurely, thus making it
> impossible to make independent conclusions regarding the possible
> culprits.  Would it be possible to post here a full profile,
> completely expanded, obtained after loading all the relevant *.el
> files as *.el (NOT *.elc!), so that the profile is detailed enough to
> show the relevant parts?  It would make the discussion much more
> focused.
>
> Bonus points for posting another profile, where the feature you think
> is the main culprit is disabled.
>
> TIA


Attached are two reports, one which is super slow, and one that is fast.

Recipe:
 - git clone https://github.com/unhammer/csharp-mode/
 - git checkout 164-repro
 - eval csharp-mode.el
 - open superslow.cs and write some text
 - rinse, repeat, but with
   
(c-lang-defconst c-multiline-string-start-char
  csharp ?@)

commented out.

One is unbearably slow, the other is super fast.
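
For anyone repeating this recipe, the profile can also be captured from Lisp. The following is only a sketch: `demo-profile-typing' is a hypothetical helper name, not part of the repro repository or of Emacs. It assumes the built-in profiler library and wraps its `profiler-start', `profiler-report' and `profiler-stop' commands around whatever editing you want to measure.

```elisp
;; Sketch only: `demo-profile-typing' is a hypothetical helper name.
;; Before profiling, load the sources rather than the byte-compiled
;; files (for example M-x load-library RET cc-mode.el RET), so that
;; the report shows individual Lisp functions.
(require 'profiler)

(defun demo-profile-typing (thunk)
  "Run THUNK under the CPU profiler, then display the report.
THUNK is a function of no arguments that performs the edits
whose cost is being measured."
  (profiler-start 'cpu)
  (unwind-protect
      (funcall thunk)
    ;; Show the report before stopping, even if THUNK signals an error.
    (profiler-report)
    (profiler-stop)))
```

Calling it once with the `c-lang-defconst' form in place and once with it commented out should give the two profiles to compare.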

Hope this helps a little!

All the best,
Theodor Thornhill


[not-slow-without-multiline.txt (text/plain, attachment)]
[report-slow-with-multiline.txt (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sat, 26 Sep 2020 13:44:02 GMT)

Message #14 received at 43631 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Theodor Thornhill <theo <at> thornhill.no>
Cc: 43631 <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds
 performance to a halt
Date: Sat, 26 Sep 2020 16:43:05 +0300
> From: Theodor Thornhill <theo <at> thornhill.no>
> Cc: 43631 <at> debbugs.gnu.org
> Date: Sat, 26 Sep 2020 14:40:36 +0200
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > All the profiles posted there end prematurely, thus making it
> > impossible to make independent conclusions regarding the possible
> > culprits.  Would it be possible to post here a full profile,
> > completely expanded, obtained after loading all the relevant *.el
> > files as *.el (NOT *.elc!), so that the profile is detailed enough to
> > show the relevant parts?  It would make the discussion much more
> > focused.
> >
> > Bonus points for posting another profile, where the feature you think
> > is the main culprit is disabled.
> >
> > TIA
> 
> 
> Attached are two reports, one which is super slow, and one that is fast.

Thanks, but please post the text of Profiler-Report buffer, not the
internal structure it produces.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sat, 26 Sep 2020 16:04:02 GMT)

Message #17 received at 43631 <at> debbugs.gnu.org:

From: Theodor Thornhill <theo <at> thornhill.no>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 43631 <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds
 performance to a halt
Date: Sat, 26 Sep 2020 18:03:10 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

[...]


>> 
>> Attached are two reports, one which is super slow, and one that is fast.
>
> Thanks, but please post the text of Profiler-Report buffer, not the
> internal structure it produces.

Ok, third time's the charm:

The recipe is the same as before.

Theodor Thornhill

[fast-without-multiline.txt (text/plain, attachment)]
[slow-with-multiline.txt (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sat, 26 Sep 2020 16:18:01 GMT)

Message #20 received at 43631 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Theodor Thornhill <theo <at> thornhill.no>
Cc: 43631 <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds
 performance to a halt
Date: Sat, 26 Sep 2020 19:17:06 +0300
> From: Theodor Thornhill <theo <at> thornhill.no>
> Cc: 43631 <at> debbugs.gnu.org
> Date: Sat, 26 Sep 2020 18:03:10 +0200
> 
> > Thanks, but please post the text of Profiler-Report buffer, not the
> > internal structure it produces.
> 
> Ok, third time's the charm:

Thanks.  This seems to indicate that this loop in
c-pps-to-string-delim is the culprit:

    (while (progn
	     (parse-partial-sexp (point) end nil nil st-s 'syntax-table)
	     (unless (bobp)
	       (c-clear-syn-tab (1- (point))))
	     (setq st-pos (point))
	     (and (< (point) end)
		  (not (eq (char-before) ?\")))))

But I'm confused why the "fast" profile starts with
font-lock-fontify-region, whereas the "slow" profile doesn't have
font-lock-fontify-region anywhere...

Hopefully, Alan can take it from here.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sat, 26 Sep 2020 19:44:01 GMT)

Message #23 received at 43631 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, Theodor Thornhill <theo <at> thornhill.no>
Cc: 43631 <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds performance
 to a halt
Date: Sat, 26 Sep 2020 22:43:05 +0300
On 26.09.2020 19:17, Eli Zaretskii wrote:
> But I'm confused why the "fast" profile starts with
> font-lock-fontify-region, whereas the "slow" profile doesn't have
> font-lock-fontify-region anywhere...

Because CC Mode doesn't use syntax-propertize-function (most major modes 
do, so we're used to seeing "slow" syntax analysis being done inside 
font-lock-fontify-region because it calls s-p-f).

CC Mode applies syntax properties inside before/after-change-functions, 
and the "slow" profile reflects that: c-before-change is featured 
prominently.
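
The effect described above, work done on every buffer change rather than inside font-lock, can be illustrated with a small sketch. All `demo-' names below are hypothetical and this is not CC Mode code: a buffer-local function on `after-change-functions' stands in for the per-change syntax analysis and simply counts how often it runs.

```elisp
;; Sketch only: all `demo-' names are hypothetical, not CC Mode code.
(defvar-local demo-change-count 0
  "Number of buffer changes seen by `demo-after-change'.")

(defun demo-after-change (_beg _end _old-len)
  "Stand-in for per-change syntax analysis; it only counts calls."
  (setq demo-change-count (1+ demo-change-count)))

(defun demo-count-changes (strings)
  "Insert each of STRINGS into a temporary buffer.
Return how many times the after-change hook ran."
  (with-temp-buffer
    ;; The final t makes the hook buffer-local.
    (add-hook 'after-change-functions #'demo-after-change nil t)
    (dolist (s strings)
      (insert s))
    demo-change-count))
```

Each insertion triggers the hook once, so any expensive work placed there is paid per keystroke and shows up in a profile under the change hooks, not under font-lock-fontify-region.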




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sun, 27 Sep 2020 09:56:01 GMT)

Message #26 received at 43631 <at> debbugs.gnu.org:

From: Theodor Thornhill <theo <at> thornhill.no>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 43631 <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds
 performance to a halt
Date: Sun, 27 Sep 2020 11:54:23 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:


[...]

>
> Thanks.  This seems to indicate that this loop in
> c-pps-to-string-delim is the culprit:
>
>     (while (progn
> 	     (parse-partial-sexp (point) end nil nil st-s 'syntax-table)
> 	     (unless (bobp)
> 	       (c-clear-syn-tab (1- (point))))
> 	     (setq st-pos (point))
> 	     (and (< (point) end)
> 		  (not (eq (char-before) ?\")))))
>
> But I'm confused why the "fast" profile starts with
> font-lock-fontify-region, whereas the "slow" profile doesn't have
> font-lock-fontify-region anywhere...
>

Thanks for digging into this. I can add one more thing that I see. When
this variable is set to some char, typing that character followed by a
quote mark inserts only one quote and fontifies the rest of the buffer
as a string.

Example:

 - type @"   (or #" in pike-mode)
 - see whole buffer get fontified
 - no extra quote mark is inserted to make a proper pair.

What I would expect:

 - type @"
 - see @""
 - type normally inside quote marks.

I am not sure how this is related, if at all, but found it noticeable
enough to add to this discussion.

Also, if the 'multiline-string' variable is not set, typing @" behaves
as expected: the pair is closed and nothing other than the string is
fontified.

> Hopefully, Alan can take it from here.

Theo




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Sun, 27 Sep 2020 11:36:02 GMT)

Message #29 received at 43631 <at> debbugs.gnu.org:

From: Theodor Thornhill <theo <at> thornhill.no>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 43631 <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds
 performance to a halt
Date: Sun, 27 Sep 2020 13:34:32 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

[...]


> But I'm confused why the "fast" profile starts with
> font-lock-fontify-region, whereas the "slow" profile doesn't have
> font-lock-fontify-region anywhere...

I see that when I remove 'c-before-change-check-unbalanced-strings from
'c-get-state-before-change-functions' the performance degradation
ceases.  I'm not sure what else is affected by that change, so not sure
if that can be counted as a fix as far as 'csharp-mode' is concerned.

Just wanted to let you know.

Theo




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Mon, 28 Sep 2020 19:34:02 GMT)

Message #32 received at 43631 <at> debbugs.gnu.org:

From: Alan Mackenzie <acm <at> muc.de>
To: Theodor Thornhill <theo <at> thornhill.no>
Cc: 43631 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds performance
 to a halt
Date: Mon, 28 Sep 2020 19:33:25 +0000
Hello, Theo.

On Sun, Sep 27, 2020 at 13:34:32 +0200, Theodor Thornhill wrote:
> Eli Zaretskii <eliz <at> gnu.org> writes:

> [...]


> > But I'm confused why the "fast" profile starts with
> > font-lock-fontify-region, whereas the "slow" profile doesn't have
> > font-lock-fontify-region anywhere...

> I see that when I remove 'c-before-change-check-unbalanced-strings from
> 'c-get-state-before-change-functions' the performance degradation
> ceases.  I'm not sure what else is affected by that change, so not sure
> if that can be counted as a fix as far as 'csharp-mode' is concerned.

I would strongly recommend you not to make such a change, at least not
without a good deal of matching changes elsewhere.  ;-)

It seems the bit in c-b-c-check-unbalanced-strings dealing with
multiline strings was written on the assumption that buffers containing
such would be small.

With multiline strings, _any_ change involving quote characters can flip
the string/non-string characterisation from point all the way to the end
of the buffer.  In the worst case scenario, this potentially big region
needs to be analysed and have syntax-table text properties throughout
the entire region changed.

The current problem is that c-b-c-check-u-strings is doing this analysis
for every buffer change.  This was easier to code, but has led to
performance problems on buffers which aren't small.  The solution to
this will have to involve restricting this analysis to when quote marks
or the c-multiline-string-start-char get inserted or removed.  That way,
there should only be an occasional and tolerable delay when one of these
characters is inserted/removed.
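
The restriction described above could look roughly like the following. This is only a sketch under assumptions, not the fix that eventually went into CC Mode: the `demo-' names are hypothetical, and the change is modelled as two plain strings (the text removed and the text inserted) instead of buffer positions.

```elisp
;; Sketch only: hypothetical names, not the actual CC Mode fix.
(defun demo-string-delim-p (char start-char)
  "Return non-nil if CHAR is a double quote or START-CHAR."
  (or (eq char ?\") (eq char start-char)))

(defun demo-change-needs-reanalysis-p (removed inserted start-char)
  "Return non-nil if a change could flip string/non-string state.
REMOVED and INSERTED are the deleted and the added text, as
strings.  START-CHAR is the multiline string opener, for
example ?@ for C#."
  (let ((found nil))
    (dolist (text (list removed inserted))
      (dolist (char (string-to-list text))
        (when (demo-string-delim-p char start-char)
          (setq found t))))
    found))
```

With a gate like this, ordinary typing skips the scan to the end of the buffer, and only changes that touch a quote mark or the start character pay for it.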

I'll be looking at this in the coming days.

> Just wanted to let you know.

> Theo

-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#43631; Package emacs. (Mon, 28 Sep 2020 19:43:01 GMT)

Message #35 received at 43631 <at> debbugs.gnu.org:

From: Theodor Thornhill <theo <at> thornhill.no>
To: Alan Mackenzie <acm <at> muc.de>
Cc: 43631 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds
 performance to a halt
Date: Mon, 28 Sep 2020 21:41:12 +0200
Alan Mackenzie <acm <at> muc.de> writes:

> Hello, Theo.
>

[...]
>> I see that when I remove 'c-before-change-check-unbalanced-strings from
>> 'c-get-state-before-change-functions' the performance degradation
>> ceases.  I'm not sure what else is affected by that change, so not sure
>> if that can be counted as a fix as far as 'csharp-mode' is concerned.
>
> I would strongly recommend you not to make such a change, at least not
> without a good deal of matching changes elsewhere.  ;-)

Yeah, I put it back in some time ago ;)

>
> It seems the bit in c-b-c-check-unbalanced-strings dealing with
> multiline strings was written on the assumption that buffers containing
> such would be small.
>
> With multiline strings, _any_ change involving quote characters can flip
> the string/non-string characterisation from point all the way to the end
> of the buffer.  In the worst case scenario, this potentially big region
> needs to be analysed and have syntax-table text properties throughout
> the entire region changed.
>
> The current problem is that c-b-c-check-u-strings is doing this analysis
> for every buffer change.  This was easier to code, but has led to
> performance problems on buffers which aren't small.  The solution to
> this will have to involve restricting this analysis to when quote marks
> or the c-multiline-string-start-char get inserted or removed.  That way,
> there should only be an occasional and tolerable delay when one of these
> characters is inserted/removed.
>
> I'll be looking at this in the coming days.
>

That's very interesting, and thanks!

--
Theo




Reply sent to Alan Mackenzie <acm <at> muc.de>:
You have taken responsibility. (Fri, 13 Oct 2023 15:28:02 GMT)

Notification sent to Theodor Thornhill <theo <at> thornhill.no>:
bug acknowledged by developer. (Fri, 13 Oct 2023 15:28:02 GMT)

Message #40 received at 43631-done <at> debbugs.gnu.org:

From: Alan Mackenzie <acm <at> muc.de>
To: 43631-done <at> debbugs.gnu.org
Subject: Re: bug#43631: 28.0.50; CC Mode multiline strings grinds performance
 to a halt
Date: Fri, 13 Oct 2023 15:26:36 +0000
Hello, Emacs.

This bug was fixed with my commits between 2021-08-12 and 2021-08-14.

Closing.

-- 
Alan Mackenzie (Nuremberg, Germany).




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 11 Nov 2023 12:24:09 GMT)


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.