GNU bug report logs - #56623
memory leak introduced by new nonrecursive lisp reader

Previous Next

Package: emacs;

Reported by: Tom Gillespie <tgbugs <at> gmail.com>

Date: Sun, 17 Jul 2022 22:04:01 UTC

Severity: normal

Tags: moreinfo

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 56623 in the body.
You can then email your comments to 56623 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Sun, 17 Jul 2022 22:04:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Tom Gillespie <tgbugs <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 17 Jul 2022 22:04:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Tom Gillespie <tgbugs <at> gmail.com>
To: Emacs Bug Report <bug-gnu-emacs <at> gnu.org>
Cc: Greg Hendershott <greghendershott <at> gmail.com>, mattiase <at> acm.org
Subject: memory leak introduced by new nonrecursive lisp reader
Date: Sun, 17 Jul 2022 15:03:16 -0700
[Message part 1 (text/plain, inline)]
While tracking down a memory leak in racket-mode git bisect
landed on b903507b36c438653a02d7b6291e9744d5221e28
Nonrecursive Lisp reader (bug#55676).

I do not have a minimal reproduction at the moment, but a
non-minimal one is documented at
see https://github.com/greghendershott/racket-mode/issues/632.
There are before and after graphs showing the change in behavior in
https://github.com/greghendershott/racket-mode/issues/632#issuecomment-1186609580

I have also attached the bisect script, and elisp needed to
reproduce the issue. Setting up the racket environment
to cause the issue is more involved (see the github issue).

Based on the behavior, it seems like strings and conses may be
marked by mark_lread but then somehow never unmarked, so
they never get collected, but that is a blind guess and I have no
idea if that is related at all.
[ow.el (text/x-emacs-lisp, attachment)]
[emacs-memory-leak-test.el (text/x-emacs-lisp, attachment)]
[bisect-memory.sh (application/x-shellscript, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Mon, 18 Jul 2022 11:22:02 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Tom Gillespie <tgbugs <at> gmail.com>
Cc: Greg Hendershott <greghendershott <at> gmail.com>,
 Emacs Bug Report <bug-gnu-emacs <at> gnu.org>
Subject: Re: memory leak introduced by new nonrecursive lisp reader
Date: Mon, 18 Jul 2022 13:21:34 +0200
18 juli 2022 kl. 00.03 skrev Tom Gillespie <tgbugs <at> gmail.com>:

> While tracking down a memory leak in racket-mode git bisect
> landed on b903507b36c438653a02d7b6291e9744d5221e28
> Nonrecursive Lisp reader (bug#55676).

Thank you. A self-contained reproduction would indeed help a lot.





Added tag(s) moreinfo. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sat, 23 Jul 2022 08:25:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Wed, 27 Jul 2022 09:21:01 GMT) Full text and rfc822 format available.

Message #13 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Tom Gillespie <tgbugs <at> gmail.com>
Cc: Greg Hendershott <greghendershott <at> gmail.com>,
 Emacs Bug Report <bug-gnu-emacs <at> gnu.org>
Subject: Re: memory leak introduced by new nonrecursive lisp reader
Date: Wed, 27 Jul 2022 11:20:11 +0200
It just struck me that you may just be unlucky.

The GC scans the C stack conservatively. The reader rewrite reduced the number of activation records on the stack; instead there is a much more compact marking stack which is scanned precisely, which is both much faster and, well, more precise.

However, in the absence of recursion the size of an on-stack data buffer in the reader was increased substantially (from 64 to 1024) because the danger from stack overflow was gone, and it made for a measurable increase in reading performance.

This means that when the GC now scans the C stack, it has to scan 1024 bytes of typically uninitialised data left over from previous stack activations which could very well include pointers to otherwise dead objects that will now be retained.

A counter-argument is that these undead objects will only be kept artificially alive as long as GC only takes place in the reader. Anyway, the hunch is easily tested: try either or both of these changes:

1. reduce stackbufsize:

--- a/src/lread.c
+++ b/src/lread.c
@@ -2919,7 +2919,7 @@ digit_to_number (int character, int base)
 /* Size of the fixed-size buffer used during reading.
    It should be at least big enough for `invalid_radix_integer' but
    can usefully be much bigger than that.  */
-enum { stackbufsize = 1024 };
+enum { stackbufsize = 64 };
 
 static void
 invalid_radix_integer (EMACS_INT radix, char stackbuf[VLA_ELEMS (stackbufsize)],


2. zero the buffer on entry:

--- a/src/lread.c
+++ b/src/lread.c
@@ -3678,6 +3678,7 @@ read_stack_push (struct read_stack_entry e)
 read0 (Lisp_Object readcharfun, bool locate_syms)
 {
   char stackbuf[stackbufsize];
+  memset (stackbuf, 0, stackbufsize);
   char *read_buffer = stackbuf;
   ptrdiff_t read_buffer_size = sizeof stackbuf;
   char *heapbuf = NULL;


If either makes things better, maybe we can come up with a solution that doesn't hurt performance.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Sat, 27 Aug 2022 15:36:01 GMT) Full text and rfc822 format available.

Message #16 received at 56623 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Tom Gillespie <tgbugs <at> gmail.com>, 56623 <at> debbugs.gnu.org,
 greghendershott <at> gmail.com
Subject: Re: bug#56623: memory leak introduced by new nonrecursive lisp reader
Date: Sat, 27 Aug 2022 17:35:37 +0200
Mattias Engdegård <mattiase <at> acm.org> writes:

> A counter-argument is that these undead objects will only be kept artificially alive as long as GC only takes place in the reader. Anyway, the hunch is easily tested: try either or both of these changes:
>
> 1. reduce stackbufsize:
>
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -2919,7 +2919,7 @@ digit_to_number (int character, int base)
>  /* Size of the fixed-size buffer used during reading.
>     It should be at least big enough for `invalid_radix_integer' but
>     can usefully be much bigger than that.  */
> -enum { stackbufsize = 1024 };
> +enum { stackbufsize = 64 };
>
>  static void
>  invalid_radix_integer (EMACS_INT radix, char stackbuf[VLA_ELEMS (stackbufsize)],
>
> 2. zero the buffer on entry:
>
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -3678,6 +3678,7 @@ read_stack_push (struct read_stack_entry e)
>  read0 (Lisp_Object readcharfun, bool locate_syms)
>  {
>    char stackbuf[stackbufsize];
> +  memset (stackbuf, 0, stackbufsize);
>    char *read_buffer = stackbuf;
>    ptrdiff_t read_buffer_size = sizeof stackbuf;
>    char *heapbuf = NULL;
>
> If either makes things better, maybe we can come up with a solution that doesn't hurt performance.

This was a month ago -- Tom, did you try these changes?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Sun, 28 Aug 2022 21:07:02 GMT) Full text and rfc822 format available.

Message #19 received at 56623 <at> debbugs.gnu.org (full text, mbox):

From: Tom Gillespie <tgbugs <at> gmail.com>
To: Greg Hendershott <mail <at> greghendershott.com>
Cc: Mattias Engdegård <mattiase <at> acm.org>,
 Lars Ingebrigtsen <larsi <at> gnus.org>, 56623 <at> debbugs.gnu.org
Subject: Re: bug#56623: memory leak introduced by new nonrecursive lisp reader
Date: Sun, 28 Aug 2022 17:06:02 -0400
> maybe same happened for Tom?

Argh, yep, first time seeing this. I have limited internet access right now,
but will take a look when I get back next week.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Sun, 28 Aug 2022 21:34:02 GMT) Full text and rfc822 format available.

Message #22 received at 56623 <at> debbugs.gnu.org (full text, mbox):

From: Greg Hendershott <mail <at> greghendershott.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Mattias Engdegård <mattiase <at> acm.org>,
 56623 <at> debbugs.gnu.org, Tom Gillespie <tgbugs <at> gmail.com>
Subject: Re: bug#56623: memory leak introduced by new nonrecursive lisp reader
Date: Sun, 28 Aug 2022 13:29:35 -0400
Hi, Lars.

Your previous email from a month ago doesn't seem to have been delivered
to me; maybe same happened for Tom?

Today I took some time to try making both changes to lread.c. That did
not seem to help -- in my testing using Tom's artifacts, memory use
still grew unbounded (whereas it levels off with e.g. Emacs 25.2.2).

Bu as a sanity check, it would be great if Tom could try this, too, to
make sure?

I've gone to some non-trivial effort to make a much more minimal recipe
-- but so far I can't. On the one hand, Tom's bisect seems to point
clearly at this reader change. On the other hand, I don't have any
mental model for how/why, and I can't elicit similar symptoms so far by
working up from simpler examples exercising the reader. :(

Greg

On Sat 27 Aug 2022 at 15:35, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:
> Mattias Engdegård <mattiase <at> acm.org> writes:
>
>> A counter-argument is that these undead objects will only be kept artificially alive as long as GC only takes place in the reader. Anyway, the hunch is easily tested: try either or both of these changes:
>>
>> 1. reduce stackbufsize:
>>
>> --- a/src/lread.c
>> +++ b/src/lread.c
>> @@ -2919,7 +2919,7 @@ digit_to_number (int character, int base)
>>  /* Size of the fixed-size buffer used during reading.
>>     It should be at least big enough for `invalid_radix_integer' but
>>     can usefully be much bigger than that.  */
>> -enum { stackbufsize = 1024 };
>> +enum { stackbufsize = 64 };
>>
>>  static void
>>  invalid_radix_integer (EMACS_INT radix, char stackbuf[VLA_ELEMS (stackbufsize)],
>>
>> 2. zero the buffer on entry:
>>
>> --- a/src/lread.c
>> +++ b/src/lread.c
>> @@ -3678,6 +3678,7 @@ read_stack_push (struct read_stack_entry e)
>>  read0 (Lisp_Object readcharfun, bool locate_syms)
>>  {
>>    char stackbuf[stackbufsize];
>> +  memset (stackbuf, 0, stackbufsize);
>>    char *read_buffer = stackbuf;
>>    ptrdiff_t read_buffer_size = sizeof stackbuf;
>>    char *heapbuf = NULL;
>>
>> If either makes things better, maybe we can come up with a solution that doesn't hurt performance.
>
> This was a month ago -- Tom, did you try these changes?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Mon, 29 Aug 2022 08:45:01 GMT) Full text and rfc822 format available.

Message #25 received at 56623 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Tom Gillespie <tgbugs <at> gmail.com>, 56623 <at> debbugs.gnu.org,
 greghendershott <at> gmail.com
Subject: Re: bug#56623: memory leak introduced by new nonrecursive lisp reader
Date: Mon, 29 Aug 2022 10:44:46 +0200
Mattias Engdegård <mattiase <at> acm.org> writes:

> If either makes things better, maybe we can come up with a solution
> that doesn't hurt performance.

This might be completely off, but anyway:

lread.c uses free_cons.  Is it certain that the cons cell is never
marked at that point?  Not that I see it happening, or could explain how
exactly GC would run while that code executes.  Maybe it's worth adding
a check in free_cons?






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Mon, 29 Aug 2022 09:39:02 GMT) Full text and rfc822 format available.

Message #28 received at 56623 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Tom Gillespie <tgbugs <at> gmail.com>, 56623 <at> debbugs.gnu.org,
 greghendershott <at> gmail.com
Subject: Re: bug#56623: memory leak introduced by new nonrecursive lisp reader
Date: Mon, 29 Aug 2022 11:38:06 +0200
Gerd Möllmann <gerd.moellmann <at> gmail.com> writes:

> Mattias Engdegård <mattiase <at> acm.org> writes:
>
>> If either makes things better, maybe we can come up with a solution
>> that doesn't hurt performance.
>
> This might be completely off, but anyway:
>
> lread.c uses free_cons.  Is it certain that the cons cell is never
> marked at that point?  Not that I see it happening, or could explain how
> exactly GC would run while that code executes.  Maybe it's worth adding
> a check in free_cons?

One candidate: A hash table with user-defined :test.  Invoking the
user-defined functions can GC.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Mon, 29 Aug 2022 15:21:02 GMT) Full text and rfc822 format available.

Message #31 received at 56623 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Greg Hendershott <mail <at> greghendershott.com>
Cc: Tom Gillespie <tgbugs <at> gmail.com>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 56623 <at> debbugs.gnu.org
Subject: Re: bug#56623: memory leak introduced by new nonrecursive lisp reader
Date: Mon, 29 Aug 2022 17:20:41 +0200
29 aug. 2022 kl. 14.59 skrev Greg Hendershott <mail <at> greghendershott.com>:

> The new `read' leaks when it errors.

Looks like a very very silly mistake. Will fix.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56623; Package emacs. (Mon, 29 Aug 2022 15:27:02 GMT) Full text and rfc822 format available.

Message #34 received at 56623 <at> debbugs.gnu.org (full text, mbox):

From: Greg Hendershott <mail <at> greghendershott.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>, Tom Gillespie <tgbugs <at> gmail.com>
Cc: Mattias Engdegård <mattiase <at> acm.org>,
 56623 <at> debbugs.gnu.org
Subject: Re: bug#56623: memory leak introduced by new nonrecursive lisp reader
Date: Mon, 29 Aug 2022 08:59:21 -0400
I think I've discovered the problem.

The new `read' leaks when it errors.

Here's a little driver to elicit this:

--------------------------------------------------
 
;; Run this with e.g. emacs --batch -Q -l example.el
(message "Run htop and observe the RES value")
(dotimes (n 512)
  (message "%s" n)
  (with-temp-buffer
    ;; Insert an unclosed sexp
    (insert "(")
    ;; Make it large, with quite a few sub-expressions to read
    (dotimes (_ 8192)
      (insert (format "%S" '(foo bar baz "a somewhat long string blah blah blah blah blah\n"))))
    ;; Go to start and attempt a read. This will fail due to the
    ;; unclosed paren, so we use `ignore-errors'.
    (goto-char (point-min))
    (ignore-errors (read (current-buffer)))))

;; With the old recursive lread.c, the RES value will oscillate around
;; a stable low value (presumably as GC happens).

;; With the new non-recursive lread.c, the RES value will climb
;; unbounded into hundreds of megabytes.

--------------------------------------------------

I hope this helps!




Reply sent to Mattias Engdegård <mattiase <at> acm.org>:
You have taken responsibility. (Mon, 29 Aug 2022 15:54:01 GMT) Full text and rfc822 format available.

Notification sent to Tom Gillespie <tgbugs <at> gmail.com>:
bug acknowledged by developer. (Mon, 29 Aug 2022 15:54:02 GMT) Full text and rfc822 format available.

Message #39 received at 56623-done <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Tom Gillespie <tgbugs <at> gmail.com>, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Greg Hendershott <mail <at> greghendershott.com>, 56623-done <at> debbugs.gnu.org
Subject: Re: bug#56623: memory leak introduced by new nonrecursive lisp reader
Date: Mon, 29 Aug 2022 17:52:39 +0200
>> The new `read' leaks when it errors.
> 
> Looks like a very very silly mistake.

Indeed it was (sorry!), and it should be fixed by c0bb1aac10.
Many thanks for helping out!






bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 27 Sep 2022 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 211 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.