GNU bug report logs - #49127
Performance degradation in encode_coding_object

Previous Next

Package: emacs;

Reported by: Victor Nawothnig <victor.nawothnig <at> icloud.com>

Date: Sun, 20 Jun 2021 08:19:03 UTC

Severity: normal

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 49127 in the body.
You can then email your comments to 49127 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Sun, 20 Jun 2021 08:19:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Victor Nawothnig <victor.nawothnig <at> icloud.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 20 Jun 2021 08:19:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Victor Nawothnig <victor.nawothnig <at> icloud.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Performance degradation in encode_coding_object
Date: Sun, 20 Jun 2021 08:30:24 +0200
Hi,

All of the following applies to Emacs 27.1 and 28.0.50.

Im currently debugging a performance degradation in haskell-mode. When editing code in a terminal frame, over time as I make modifications to source code, redrawing lines during scrolling becomes continuously slower to the point that it sometimes takes up 200ms for a single line to draw. This problem disappears once the GC runs.

With gprof/prof_events I have nailed the problem to be encode_coding_object looping over all markers. In degenerate cases this list can contain millions of markers. Traversing this list is particularly slow because of the indirection being a singly linked list. Based on the fact that a GC remedies this, I’m assuming this list contains mostly  unreachable markers. When stepping through encode_coding_object with GDB after a GC this list of markers shrinks to small double digit numbers from millions.

The source of these markers appears to be looking-at in the font locking code of haskell-mode, this assumption is based on the fact that commenting out the uses of looking-at in haskell-mode prevents the accumulation of markers and thus the slowdown.

One contributing factor to all of this, is that for lsp-mode to perform adequately, one needs a relatively high gc-cons-threshold, which means GCs that would clean up the markers run more rarely, leading to higher accumulation of markers over time.

This problem only triggers in terminal frames, but not in GUI frames. Setting GDB breakpoints suggests that the GUI frame never even calls into encode_coding_object.

So far I’m torn on whether this is a bug in the haskell-mode font locking code or in Emacs. What do you think?

Kind regards,
Victor



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Sun, 20 Jun 2021 09:05:02 GMT) Full text and rfc822 format available.

Message #8 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Victor Nawothnig <victor.nawothnig <at> icloud.com>
Cc: 49127 <at> debbugs.gnu.org
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Sun, 20 Jun 2021 12:04:59 +0300
> Date: Sun, 20 Jun 2021 08:30:24 +0200
> From:  Victor Nawothnig via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> With gprof/prof_events I have nailed the problem to be encode_coding_object looping over all markers. In degenerate cases this list can contain millions of markers. Traversing this list is particularly slow because of the indirection being a singly linked list. Based on the fact that a GC remedies this, I’m assuming this list contains mostly  unreachable markers. When stepping through encode_coding_object with GDB after a GC this list of markers shrinks to small double digit numbers from millions.
> 
> The source of these markers appears to be looking-at in the font locking code of haskell-mode, this assumption is based on the fact that commenting out the uses of looking-at in haskell-mode prevents the accumulation of markers and thus the slowdown.

Do you understand why using looking-at causes creation of markers?  If
so, can you show the details of why this happens?

> One contributing factor to all of this, is that for lsp-mode to perform adequately, one needs a relatively high gc-cons-threshold, which means GCs that would clean up the markers run more rarely, leading to higher accumulation of markers over time.

Yes, playing with GC threshold is usually a bad idea, but it is hard
to explain to people why, and they keep doing that, to their cost.

> This problem only triggers in terminal frames, but not in GUI frames. Setting GDB breakpoints suggests that the GUI frame never even calls into encode_coding_object.

Can you should a backtrace from the call to encode_coding_object,
including the Lisp backtrace (via the "xbacktrace" command)?

> So far I’m torn on whether this is a bug in the haskell-mode font locking code or in Emacs. What do you think?

Let's revisit this question after we have all the data I requested
above, okay?

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Thu, 24 Jun 2021 16:50:02 GMT) Full text and rfc822 format available.

Message #11 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: victor.nawothnig <at> icloud.com
Cc: 49127 <at> debbugs.gnu.org
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Thu, 24 Jun 2021 19:49:41 +0300
Ping!  Could you please respond to my requests below?  I'd like to
make some progress with this bug report.

> Date: Sun, 20 Jun 2021 12:04:59 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 49127 <at> debbugs.gnu.org
> 
> > Date: Sun, 20 Jun 2021 08:30:24 +0200
> > From:  Victor Nawothnig via "Bug reports for GNU Emacs,
> >  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> > 
> > With gprof/prof_events I have nailed the problem to be encode_coding_object looping over all markers. In degenerate cases this list can contain millions of markers. Traversing this list is particularly slow because of the indirection being a singly linked list. Based on the fact that a GC remedies this, I’m assuming this list contains mostly  unreachable markers. When stepping through encode_coding_object with GDB after a GC this list of markers shrinks to small double digit numbers from millions.
> > 
> > The source of these markers appears to be looking-at in the font locking code of haskell-mode, this assumption is based on the fact that commenting out the uses of looking-at in haskell-mode prevents the accumulation of markers and thus the slowdown.
> 
> Do you understand why using looking-at causes creation of markers?  If
> so, can you show the details of why this happens?
> 
> > One contributing factor to all of this, is that for lsp-mode to perform adequately, one needs a relatively high gc-cons-threshold, which means GCs that would clean up the markers run more rarely, leading to higher accumulation of markers over time.
> 
> Yes, playing with GC threshold is usually a bad idea, but it is hard
> to explain to people why, and they keep doing that, to their cost.
> 
> > This problem only triggers in terminal frames, but not in GUI frames. Setting GDB breakpoints suggests that the GUI frame never even calls into encode_coding_object.
> 
> Can you should a backtrace from the call to encode_coding_object,
> including the Lisp backtrace (via the "xbacktrace" command)?
> 
> > So far I’m torn on whether this is a bug in the haskell-mode font locking code or in Emacs. What do you think?
> 
> Let's revisit this question after we have all the data I requested
> above, okay?
> 
> Thanks.
> 
> 
> 
> 





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Sun, 25 Jul 2021 07:11:01 GMT) Full text and rfc822 format available.

Message #14 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: victor.nawothnig <at> icloud.com
Cc: 49127 <at> debbugs.gnu.org
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Sun, 25 Jul 2021 10:10:40 +0300
Ping! Ping!  Please respond, so we could take care of this issue.

To recap: I would like to have a backtrace from the call to
encode_coding_object, including the Lisp backtrace (via the
"xbacktrace" command), so that we could see how this performance issue
happens.

TIA

> Date: Thu, 24 Jun 2021 19:49:41 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 49127 <at> debbugs.gnu.org
> 
> Ping!  Could you please respond to my requests below?  I'd like to
> make some progress with this bug report.
> 
> > Date: Sun, 20 Jun 2021 12:04:59 +0300
> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Cc: 49127 <at> debbugs.gnu.org
> > 
> > > Date: Sun, 20 Jun 2021 08:30:24 +0200
> > > From:  Victor Nawothnig via "Bug reports for GNU Emacs,
> > >  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> > > 
> > > With gprof/prof_events I have nailed the problem to be encode_coding_object looping over all markers. In degenerate cases this list can contain millions of markers. Traversing this list is particularly slow because of the indirection being a singly linked list. Based on the fact that a GC remedies this, I’m assuming this list contains mostly  unreachable markers. When stepping through encode_coding_object with GDB after a GC this list of markers shrinks to small double digit numbers from millions.
> > > 
> > > The source of these markers appears to be looking-at in the font locking code of haskell-mode, this assumption is based on the fact that commenting out the uses of looking-at in haskell-mode prevents the accumulation of markers and thus the slowdown.
> > 
> > Do you understand why using looking-at causes creation of markers?  If
> > so, can you show the details of why this happens?
> > 
> > > One contributing factor to all of this, is that for lsp-mode to perform adequately, one needs a relatively high gc-cons-threshold, which means GCs that would clean up the markers run more rarely, leading to higher accumulation of markers over time.
> > 
> > Yes, playing with GC threshold is usually a bad idea, but it is hard
> > to explain to people why, and they keep doing that, to their cost.
> > 
> > > This problem only triggers in terminal frames, but not in GUI frames. Setting GDB breakpoints suggests that the GUI frame never even calls into encode_coding_object.
> > 
> > Can you should a backtrace from the call to encode_coding_object,
> > including the Lisp backtrace (via the "xbacktrace" command)?
> > 
> > > So far I’m torn on whether this is a bug in the haskell-mode font locking code or in Emacs. What do you think?
> > 
> > Let's revisit this question after we have all the data I requested
> > above, okay?
> > 
> > Thanks.
> > 
> > 
> > 
> > 
> 
> 
> 
> 
> 




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Sun, 15 Aug 2021 15:08:01 GMT) Full text and rfc822 format available.

Message #17 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: victor.nawothnig <at> icloud.com
Cc: 49127 <at> debbugs.gnu.org
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Sun, 15 Aug 2021 18:07:23 +0300
Ping! Ping! Ping! Please respond!

> Date: Sun, 25 Jul 2021 10:10:40 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 49127 <at> debbugs.gnu.org
> 
> Ping! Ping!  Please respond, so we could take care of this issue.
> 
> To recap: I would like to have a backtrace from the call to
> encode_coding_object, including the Lisp backtrace (via the
> "xbacktrace" command), so that we could see how this performance issue
> happens.
> 
> TIA
> 
> > Date: Thu, 24 Jun 2021 19:49:41 +0300
> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Cc: 49127 <at> debbugs.gnu.org
> > 
> > Ping!  Could you please respond to my requests below?  I'd like to
> > make some progress with this bug report.
> > 
> > > Date: Sun, 20 Jun 2021 12:04:59 +0300
> > > From: Eli Zaretskii <eliz <at> gnu.org>
> > > Cc: 49127 <at> debbugs.gnu.org
> > > 
> > > > Date: Sun, 20 Jun 2021 08:30:24 +0200
> > > > From:  Victor Nawothnig via "Bug reports for GNU Emacs,
> > > >  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> > > > 
> > > > With gprof/prof_events I have nailed the problem to be encode_coding_object looping over all markers. In degenerate cases this list can contain millions of markers. Traversing this list is particularly slow because of the indirection being a singly linked list. Based on the fact that a GC remedies this, I’m assuming this list contains mostly  unreachable markers. When stepping through encode_coding_object with GDB after a GC this list of markers shrinks to small double digit numbers from millions.
> > > > 
> > > > The source of these markers appears to be looking-at in the font locking code of haskell-mode, this assumption is based on the fact that commenting out the uses of looking-at in haskell-mode prevents the accumulation of markers and thus the slowdown.
> > > 
> > > Do you understand why using looking-at causes creation of markers?  If
> > > so, can you show the details of why this happens?
> > > 
> > > > One contributing factor to all of this, is that for lsp-mode to perform adequately, one needs a relatively high gc-cons-threshold, which means GCs that would clean up the markers run more rarely, leading to higher accumulation of markers over time.
> > > 
> > > Yes, playing with GC threshold is usually a bad idea, but it is hard
> > > to explain to people why, and they keep doing that, to their cost.
> > > 
> > > > This problem only triggers in terminal frames, but not in GUI frames. Setting GDB breakpoints suggests that the GUI frame never even calls into encode_coding_object.
> > > 
> > > Can you should a backtrace from the call to encode_coding_object,
> > > including the Lisp backtrace (via the "xbacktrace" command)?
> > > 
> > > > So far I’m torn on whether this is a bug in the haskell-mode font locking code or in Emacs. What do you think?
> > > 
> > > Let's revisit this question after we have all the data I requested
> > > above, okay?
> > > 
> > > Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Mon, 16 Aug 2021 17:44:01 GMT) Full text and rfc822 format available.

Message #20 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Mon, 16 Aug 2021 20:43:20 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Mon, 16 Aug 2021 19:32:46 +0200
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 49127 <at> debbugs.gnu.org
> 
> It looks very much like haskell-mode is producing way too many markers, and that should be easily fixed (I wrote a tentative pull request). However, the code in encode_coding_object appears broken. Look at the first loop in that function:
> 
>   if (EQ (src_object, dst_object))
>     {
>       struct Lisp_Marker *tail;
> 
>       for (tail = BUF_MARKERS (current_buffer); tail; tail = tail->next)
> 	{
> 	  tail->need_adjustment
> 	    = tail->charpos == (tail->insertion_type ? from : to);
> 	  need_marker_adjustment |= tail->need_adjustment;
> 	}
>     }
> 
> I don't know how this could ever work. We loop through the markers in the current buffer?

Yes.  Why do you think this loop is broken?

> It is called from term.c with src_object and dst_object both being Qnil; no buffer is involved at all. At the very least the outer condition should have an "&& BUFFERP (dst_object)" added, and the loop should run through the markers of XBUFFER (dst_object) instead of current_buffer, no?
> 
> `decode_coding_object` seem to get these things right but I only took a superficial look. It's probably copy-paste errors.

It isn't a copy/paste, I think it's a simple omission.

But if you look up-thread, you will see that I asked for xbacktrace
results from the OP, to make sure I completely understand the reasons
for the performance problem.  I never got any responses for that.  So
if you can produce the Lisp backtraces in that case, and later test a
fix, I'm quite sure I know how to fix this problem.  TIA.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Mon, 16 Aug 2021 18:08:02 GMT) Full text and rfc822 format available.

Message #23 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Mon, 16 Aug 2021 20:06:32 +0200
[Message part 1 (text/plain, inline)]
16 aug. 2021 kl. 19.43 skrev Eli Zaretskii <eliz <at> gnu.org>:

>> I don't know how this could ever work. We loop through the markers in the current buffer?
> 
> Yes.  Why do you think this loop is broken?

Because unless I misunderstood the code entirely, the current buffer has nothing to do with the operation at hand.

It's easy to reproduce the original problem: run Emacs in a terminal and make a buffer with many markers. See how the text displays slower with more markers. I've attached a short example; try (make-test-buffer 1000).

The attached patch fixes this problem.

[lus.el (application/octet-stream, attachment)]
[0001-Fix-marker-traversion-in-encode_coding_object.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Mon, 16 Aug 2021 18:51:02 GMT) Full text and rfc822 format available.

Message #26 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Mon, 16 Aug 2021 21:50:03 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Mon, 16 Aug 2021 20:06:32 +0200
> Cc: victor.nawothnig <at> icloud.com, 49127 <at> debbugs.gnu.org
> 
> > Yes.  Why do you think this loop is broken?
> 
> Because unless I misunderstood the code entirely, the current buffer has nothing to do with the operation at hand.

It does, if both src_object and dst_object are the same (current)
buffer.

> It's easy to reproduce the original problem: run Emacs in a terminal and make a buffer with many markers. See how the text displays slower with more markers. I've attached a short example; try (make-test-buffer 1000).

How can we be sure this reproduces the original issue?  The original
issue was reported from a real-life use case, not from a toy program.

> The attached patch fixes this problem.

Thanks, but this is not the whole story with that problem in
encode_coding_object.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Mon, 16 Aug 2021 20:05:01 GMT) Full text and rfc822 format available.

Message #29 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Mon, 16 Aug 2021 22:04:36 +0200
16 aug. 2021 kl. 20.50 skrev Eli Zaretskii <eliz <at> gnu.org>:

> It does, if both src_object and dst_object are the same (current)
> buffer.

There is nothing in that function or in the comment describing it which says that the current buffer must be either. It costs us nothing to use the correct value so that's what we should do.

> How can we be sure this reproduces the original issue?  The original
> issue was reported from a real-life use case, not from a toy program.

Because I have read and tested the original code as well, which is how that toy program that reproduces the problem in a simpler but qualitatively equivalent way came to be. The patch fixes the symptoms of both the toy program and the real-life use case.

> Thanks, but this is not the whole story with that problem in
> encode_coding_object.

What is the whole story then?





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Tue, 17 Aug 2021 12:35:02 GMT) Full text and rfc822 format available.

Message #32 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Tue, 17 Aug 2021 15:34:19 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Mon, 16 Aug 2021 22:04:36 +0200
> Cc: victor.nawothnig <at> icloud.com, 49127 <at> debbugs.gnu.org
> 
> > Thanks, but this is not the whole story with that problem in
> > encode_coding_object.
> 
> What is the whole story then?

See the change I installed a few minutes ago.




Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Tue, 17 Aug 2021 12:36:01 GMT) Full text and rfc822 format available.

Notification sent to Victor Nawothnig <victor.nawothnig <at> icloud.com>:
bug acknowledged by developer. (Tue, 17 Aug 2021 12:36:01 GMT) Full text and rfc822 format available.

Message #37 received at 49127-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: victor.nawothnig <at> icloud.com
Cc: 49127-done <at> debbugs.gnu.org
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Tue, 17 Aug 2021 15:35:08 +0300
> Date: Sun, 15 Aug 2021 18:07:23 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 49127 <at> debbugs.gnu.org
> 
> Ping! Ping! Ping! Please respond!

This should now be fixed on the master branch.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Tue, 17 Aug 2021 13:08:02 GMT) Full text and rfc822 format available.

Message #40 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Tue, 17 Aug 2021 15:06:58 +0200
17 aug. 2021 kl. 14.34 skrev Eli Zaretskii <eliz <at> gnu.org>

> See the change I installed a few minutes ago.

Let's see, you took my patch and fixed one more condition that I missed (line 8301) -- thank you!
Anyway, I'm happy that it got done at last, and it's good that you took the time to write the documenting comment.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Tue, 17 Aug 2021 14:07:01 GMT) Full text and rfc822 format available.

Message #43 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Tue, 17 Aug 2021 17:05:49 +0300
> Feedback-ID:mattiase <at> acm.or
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Tue, 17 Aug 2021 15:06:58 +0200
> Cc: victor.nawothnig <at> icloud.com, 49127 <at> debbugs.gnu.org
> 
> 17 aug. 2021 kl. 14.34 skrev Eli Zaretskii <eliz <at> gnu.org>
> 
> > See the change I installed a few minutes ago.
> 
> Let's see, you took my patch and fixed one more condition that I missed (line 8301) -- thank you!

No, it's you who took _my_ patch, removed one of its parts, and deleted
the commentary.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Tue, 17 Aug 2021 16:08:01 GMT) Full text and rfc822 format available.

Message #46 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Tue, 17 Aug 2021 18:07:10 +0200
17 aug. 2021 kl. 16.05 skrev Eli Zaretskii <eliz <at> gnu.org>:

> No, it's you who took _my_ patch, removed one of its parts, and deleted
> the commentary.

Yes, I wanted to have all the credit to my name! All mine!

Seriously, I didn't mean to imply that you did anything improper, and I'm genuinely pleased that we ended up with a good solution.

This business makes me wonder if there are more cases where the linear marker list causes bad performance. Probably not very often but it's not unreasonable for a mode to have many data structures with many (live) markers into the text, and that would lead to text changes going from O(1) to O(text size). In any case, not anything to worry about right now.






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Tue, 17 Aug 2021 17:17:01 GMT) Full text and rfc822 format available.

Message #49 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Tue, 17 Aug 2021 20:16:29 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Tue, 17 Aug 2021 18:07:10 +0200
> Cc: victor.nawothnig <at> icloud.com, 49127 <at> debbugs.gnu.org
> 
> This business makes me wonder if there are more cases where the linear marker list causes bad performance. Probably not very often but it's not unreasonable for a mode to have many data structures with many (live) markers into the text, and that would lead to text changes going from O(1) to O(text size). In any case, not anything to worry about right now.

It's a problem to have many markers in a buffer.  Not only for code
that searches them linearly, but also for stuff like converting
between character and byte positions, something that we do a lot in
the most inner loops of our code.  Lisp programs that produce gobs of
markers should be ideally redesigned not to do so.  Unless we change
the way we store markers to make that a non-issue.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 11:05:01 GMT) Full text and rfc822 format available.

Message #52 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 13:04:12 +0200
17 aug. 2021 kl. 19.16 skrev Eli Zaretskii <eliz <at> gnu.org>:

> It's a problem to have many markers in a buffer.  Not only for code
> that searches them linearly, but also for stuff like converting
> between character and byte positions, something that we do a lot in
> the most inner loops of our code.  Lisp programs that produce gobs of
> markers should be ideally redesigned not to do so.

I agree; the haskell-mode usage was misguided in this respect. Marker-intensive modes could still be justified occasionally: consider an index of all definitions and uses of each variable, function and type in a program, produced by an expensive compilation that cannot be re-run for each little change. This could easily amount to thousands of markers in a big file.

>  Unless we change
> the way we store markers to make that a non-issue.

We probably should but I agree that marker reduction should be the first remedy to try.
Improving the GC so that it's cheaper to run frequent collections would also help in the common case where most markers are dead (as in this case).





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 11:44:02 GMT) Full text and rfc822 format available.

Message #55 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 14:43:38 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Wed, 18 Aug 2021 13:04:12 +0200
> Cc: victor.nawothnig <at> icloud.com, 49127 <at> debbugs.gnu.org
> 
> consider an index of all definitions and uses of each variable, function and type in a program, produced by an expensive compilation that cannot be re-run for each little change. This could easily amount to thousands of markers in a big file.

They could have used text properties instead, no?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 12:22:02 GMT) Full text and rfc822 format available.

Message #58 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 14:21:22 +0200
18 aug. 2021 kl. 13.43 skrev Eli Zaretskii <eliz <at> gnu.org>:

>> consider an index of all definitions and uses of each variable, function and type in a program, produced by an expensive compilation that cannot be re-run for each little change. This could easily amount to thousands of markers in a big file.
> 
> They could have used text properties instead, no?

Yes and no. That works for the direction from a point in the buffer but not the other way: go to a specific place in the buffer, such as a definition or all references of something. It could be something selected from a tree in the sidebar etc or be implicit (as in "go to the superclass of the class at point").

The alternative is to use plain offsets instead, and update them inside buffer-change callback functions -- but then we are essentially doing exactly what the Emacs marker implementation would be doing, only in Elisp instead of C so probably slower, unless we use more clever data structures.

Emacs isn't badly designed -- the marker implementation is fine if we assume there are only a few of them per buffer, which is mostly the case. But it isn't scalable, and also adversely affected by large amounts of garbage markers.






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 13:24:02 GMT) Full text and rfc822 format available.

Message #61 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 16:23:11 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Wed, 18 Aug 2021 14:21:22 +0200
> Cc: victor.nawothnig <at> icloud.com, 49127 <at> debbugs.gnu.org
> 
> > They could have used text properties instead, no?
> 
> Yes and no. That works for the direction from a point in the buffer but not the other way: go to a specific place in the buffer, such as a definition or all references of something. It could be something selected from a tree in the sidebar etc or be implicit (as in "go to the superclass of the class at point").

Text property search doesn't fit the bill?

> Emacs isn't badly designed -- the marker implementation is fine if we assume there are only a few of them per buffer, which is mostly the case. But it isn't scalable, and also adversely affected by large amounts of garbage markers.

These situations usually mean we lack some infrastructure, and the
Lisp program uses what's available, with bad results.  A better
solution is to design and implement the missing infrastructure
instead.

The problem with Emacs is not the design, it's that in many cases,
instead of extending the design where something is missing, Lisp
programmers tend to use the existing features outside of their
intended purpose.  IOW, our design is not evolving enough according to
the needs, not in a small measure because those needs are not always
communicated to us.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 13:33:01 GMT) Full text and rfc822 format available.

Message #64 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 15:32:11 +0200
18 aug. 2021 kl. 15.23 skrev Eli Zaretskii <eliz <at> gnu.org>:

> Text property search doesn't fit the bill?

Oh it does, except that it's linear in the size of the buffer, and that doesn't really scale either. For a single interactive lookup this might not be too bad but there might be programmatic uses that iterate.

> These situations usually mean we lack some infrastructure, and the
> Lisp program uses what's available, with bad results.  A better
> solution is to design and implement the missing infrastructure
> instead.

Could be, but markers is one type of infrastructure, and implementing something else for the same purpose is a bit of a waste compared to just making markers faster.

> The problem with Emacs is not the design, it's that in many cases,
> instead of extending the design where something is missing, Lisp
> programmers tend to use the existing features outside of their
> intended purpose.

Very true. We have probably done our job for the time being, but let's keep our eyes open for uses (legitimate or not) that stress the marker system to the point of disappointment, and consider what to do then.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 13:40:01 GMT) Full text and rfc822 format available.

Message #67 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 16:39:19 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Wed, 18 Aug 2021 15:32:11 +0200
> Cc: victor.nawothnig <at> icloud.com, 49127 <at> debbugs.gnu.org
> 
> 18 aug. 2021 kl. 15.23 skrev Eli Zaretskii <eliz <at> gnu.org>:
> 
> > Text property search doesn't fit the bill?
> 
> Oh it does, except that it's linear in the size of the buffer

It is?  Don't we use the fact that properties are stored in an
interval tree?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 13:55:01 GMT) Full text and rfc822 format available.

Message #70 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 15:54:21 +0200
18 aug. 2021 kl. 15.39 skrev Eli Zaretskii <eliz <at> gnu.org>:

> It is?  Don't we use the fact that properties are stored in an
> interval tree?

I could very well be wrong about this, but I believe that the interval tree is indexed by location, so that we can quickly find a property given an offset. Searching for a particular property or property value requires going through all properties.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 14:00:01 GMT) Full text and rfc822 format available.

Message #73 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 16:59:13 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Wed, 18 Aug 2021 15:54:21 +0200
> Cc: victor.nawothnig <at> icloud.com, 49127 <at> debbugs.gnu.org
> 
> 18 aug. 2021 kl. 15.39 skrev Eli Zaretskii <eliz <at> gnu.org>:
> 
> > It is?  Don't we use the fact that properties are stored in an
> > interval tree?
> 
> I could very well be wrong about this, but I believe that the interval tree is indexed by location, so that we can quickly find a property given an offset. Searching for a particular property or property value requires going through all properties.

next_interval doesn't look like linear search to me.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 14:35:01 GMT) Full text and rfc822 format available.

Message #76 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 49127 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 16:34:19 +0200
Mattias Engdegård <mattiase <at> acm.org> writes:

> Could be, but markers is one type of infrastructure, and implementing
> something else for the same purpose is a bit of a waste compared to
> just making markers faster.

I'm all for making markers faster -- users do stumble onto this problem,
is my impression.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Wed, 18 Aug 2021 15:25:01 GMT) Full text and rfc822 format available.

Message #79 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49127 <at> debbugs.gnu.org, victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Wed, 18 Aug 2021 17:24:32 +0200
18 aug. 2021 kl. 15.59 skrev Eli Zaretskii <eliz <at> gnu.org>:

> next_interval doesn't look like linear search to me.

That's right, but it doesn't search for a particular property or property value -- it is used by next-property-change which only wants the change of some property. To find the position of a particular property, we would at least need to call something like next-single-property-change, which calls next_interval in a loop.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Fri, 20 Aug 2021 23:25:01 GMT) Full text and rfc822 format available.

Message #82 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Michael Welsh Duggan <mwd <at> md5i.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 49127 <at> debbugs.gnu.org,
 Mattias Engdegård <mattiase <at> acm.org>,
 victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Fri, 20 Aug 2021 19:24:05 -0400
Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> Mattias Engdegård <mattiase <at> acm.org> writes:
>
>> Could be, but markers is one type of infrastructure, and implementing
>> something else for the same purpose is a bit of a waste compared to
>> just making markers faster.
>
> I'm all for making markers faster -- users do stumble onto this problem,
> is my impression.

Could this be added to etc/TODO?  I have some ideas regarding this, but
I don't know if I will be able to get to it (or get release permission
from my place of work).

-- 
Michael Welsh Duggan
(md5i <at> md5i.com)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#49127; Package emacs. (Sat, 21 Aug 2021 06:36:01 GMT) Full text and rfc822 format available.

Message #85 received at 49127 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Michael Welsh Duggan <mwd <at> md5i.com>
Cc: 49127 <at> debbugs.gnu.org, mattiase <at> acm.org, larsi <at> gnus.org,
 victor.nawothnig <at> icloud.com
Subject: Re: bug#49127: Performance degradation in encode_coding_object
Date: Sat, 21 Aug 2021 09:34:56 +0300
> From: Michael Welsh Duggan <mwd <at> md5i.com>
> Date: Fri, 20 Aug 2021 19:24:05 -0400
> Cc: 49127 <at> debbugs.gnu.org,
>  Mattias Engdegård <mattiase <at> acm.org>,
>  victor.nawothnig <at> icloud.com
> 
> > I'm all for making markers faster -- users do stumble onto this problem,
> > is my impression.
> 
> Could this be added to etc/TODO?

Thanks, done.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 18 Sep 2021 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 217 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.