GNU bug report logs -
#56623
memory leak introduced by new nonrecursive lisp reader
Previous Next
Reported by: Tom Gillespie <tgbugs <at> gmail.com>
Date: Sun, 17 Jul 2022 22:04:01 UTC
Severity: normal
Tags: moreinfo
Done: Mattias Engdegård <mattiase <at> acm.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 56623 in the body.
You can then email your comments to 56623 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Sun, 17 Jul 2022 22:04:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Tom Gillespie <tgbugs <at> gmail.com>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Sun, 17 Jul 2022 22:04:01 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
While tracking down a memory leak in racket-mode git bisect
landed on b903507b36c438653a02d7b6291e9744d5221e28
Nonrecursive Lisp reader (bug#55676).
I do not have a minimal reproduction at the moment, but a
non-minimal one is documented at
see https://github.com/greghendershott/racket-mode/issues/632.
There are before and after graphs showing the change in behavior in
https://github.com/greghendershott/racket-mode/issues/632#issuecomment-1186609580
I have also attached the bisect script, and elisp needed to
reproduce the issue. Setting up the racket environment
to cause the issue is more involved (see the github issue).
Based on the behavior, it seems like strings and conses may be
marked by mark_lread but then somehow never unmarked, so
they never get collected, but that is a blind guess and I have no
idea if that is related at all.
[ow.el (text/x-emacs-lisp, attachment)]
[emacs-memory-leak-test.el (text/x-emacs-lisp, attachment)]
[bisect-memory.sh (application/x-shellscript, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Mon, 18 Jul 2022 11:22:02 GMT)
Full text and
rfc822 format available.
Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):
18 juli 2022 kl. 00.03 skrev Tom Gillespie <tgbugs <at> gmail.com>:
> While tracking down a memory leak in racket-mode git bisect
> landed on b903507b36c438653a02d7b6291e9744d5221e28
> Nonrecursive Lisp reader (bug#55676).
Thank you. A self-contained reproduction would indeed help a lot.
Added tag(s) moreinfo.
Request was from
Lars Ingebrigtsen <larsi <at> gnus.org>
to
control <at> debbugs.gnu.org
.
(Sat, 23 Jul 2022 08:25:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Wed, 27 Jul 2022 09:21:01 GMT)
Full text and
rfc822 format available.
Message #13 received at submit <at> debbugs.gnu.org (full text, mbox):
It just struck me that you may just be unlucky.
The GC scans the C stack conservatively. The reader rewrite reduced the number of activation records on the stack; instead there is a much more compact marking stack which is scanned precisely, which is both much faster and, well, more precise.
However, in the absence of recursion the size of an on-stack data buffer in the reader was increased substantially (from 64 to 1024) because the danger from stack overflow was gone, and it made for a measurable increase in reading performance.
This means that when the GC now scans the C stack, it has to scan 1024 bytes of typically uninitialised data left over from previous stack activations which could very well include pointers to otherwise dead objects that will now be retained.
A counter-argument is that these undead objects will only be kept artificially alive as long as GC only takes place in the reader. Anyway, the hunch is easily tested: try either or both of these changes:
1. reduce stackbufsize:
--- a/src/lread.c
+++ b/src/lread.c
@@ -2919,7 +2919,7 @@ digit_to_number (int character, int base)
/* Size of the fixed-size buffer used during reading.
It should be at least big enough for `invalid_radix_integer' but
can usefully be much bigger than that. */
-enum { stackbufsize = 1024 };
+enum { stackbufsize = 64 };
static void
invalid_radix_integer (EMACS_INT radix, char stackbuf[VLA_ELEMS (stackbufsize)],
2. zero the buffer on entry:
--- a/src/lread.c
+++ b/src/lread.c
@@ -3678,6 +3678,7 @@ read_stack_push (struct read_stack_entry e)
read0 (Lisp_Object readcharfun, bool locate_syms)
{
char stackbuf[stackbufsize];
+ memset (stackbuf, 0, stackbufsize);
char *read_buffer = stackbuf;
ptrdiff_t read_buffer_size = sizeof stackbuf;
char *heapbuf = NULL;
If either makes things better, maybe we can come up with a solution that doesn't hurt performance.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Sat, 27 Aug 2022 15:36:01 GMT)
Full text and
rfc822 format available.
Message #16 received at 56623 <at> debbugs.gnu.org (full text, mbox):
Mattias Engdegård <mattiase <at> acm.org> writes:
> A counter-argument is that these undead objects will only be kept artificially alive as long as GC only takes place in the reader. Anyway, the hunch is easily tested: try either or both of these changes:
>
> 1. reduce stackbufsize:
>
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -2919,7 +2919,7 @@ digit_to_number (int character, int base)
> /* Size of the fixed-size buffer used during reading.
> It should be at least big enough for `invalid_radix_integer' but
> can usefully be much bigger than that. */
> -enum { stackbufsize = 1024 };
> +enum { stackbufsize = 64 };
>
> static void
> invalid_radix_integer (EMACS_INT radix, char stackbuf[VLA_ELEMS (stackbufsize)],
>
> 2. zero the buffer on entry:
>
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -3678,6 +3678,7 @@ read_stack_push (struct read_stack_entry e)
> read0 (Lisp_Object readcharfun, bool locate_syms)
> {
> char stackbuf[stackbufsize];
> + memset (stackbuf, 0, stackbufsize);
> char *read_buffer = stackbuf;
> ptrdiff_t read_buffer_size = sizeof stackbuf;
> char *heapbuf = NULL;
>
> If either makes things better, maybe we can come up with a solution that doesn't hurt performance.
This was a month ago -- Tom, did you try these changes?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Sun, 28 Aug 2022 21:07:02 GMT)
Full text and
rfc822 format available.
Message #19 received at 56623 <at> debbugs.gnu.org (full text, mbox):
> maybe same happened for Tom?
Argh, yep, first time seeing this. I have limited internet access right now,
but will take a look when I get back next week.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Sun, 28 Aug 2022 21:34:02 GMT)
Full text and
rfc822 format available.
Message #22 received at 56623 <at> debbugs.gnu.org (full text, mbox):
Hi, Lars.
Your previous email from a month ago doesn't seem to have been delivered
to me; maybe same happened for Tom?
Today I took some time to try making both changes to lread.c. That did
not seem to help -- in my testing using Tom's artifacts, memory use
still grew unbounded (whereas it levels off with e.g. Emacs 25.2.2).
Bu as a sanity check, it would be great if Tom could try this, too, to
make sure?
I've gone to some non-trivial effort to make a much more minimal recipe
-- but so far I can't. On the one hand, Tom's bisect seems to point
clearly at this reader change. On the other hand, I don't have any
mental model for how/why, and I can't elicit similar symptoms so far by
working up from simpler examples exercising the reader. :(
Greg
On Sat 27 Aug 2022 at 15:35, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:
> Mattias Engdegård <mattiase <at> acm.org> writes:
>
>> A counter-argument is that these undead objects will only be kept artificially alive as long as GC only takes place in the reader. Anyway, the hunch is easily tested: try either or both of these changes:
>>
>> 1. reduce stackbufsize:
>>
>> --- a/src/lread.c
>> +++ b/src/lread.c
>> @@ -2919,7 +2919,7 @@ digit_to_number (int character, int base)
>> /* Size of the fixed-size buffer used during reading.
>> It should be at least big enough for `invalid_radix_integer' but
>> can usefully be much bigger than that. */
>> -enum { stackbufsize = 1024 };
>> +enum { stackbufsize = 64 };
>>
>> static void
>> invalid_radix_integer (EMACS_INT radix, char stackbuf[VLA_ELEMS (stackbufsize)],
>>
>> 2. zero the buffer on entry:
>>
>> --- a/src/lread.c
>> +++ b/src/lread.c
>> @@ -3678,6 +3678,7 @@ read_stack_push (struct read_stack_entry e)
>> read0 (Lisp_Object readcharfun, bool locate_syms)
>> {
>> char stackbuf[stackbufsize];
>> + memset (stackbuf, 0, stackbufsize);
>> char *read_buffer = stackbuf;
>> ptrdiff_t read_buffer_size = sizeof stackbuf;
>> char *heapbuf = NULL;
>>
>> If either makes things better, maybe we can come up with a solution that doesn't hurt performance.
>
> This was a month ago -- Tom, did you try these changes?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Mon, 29 Aug 2022 08:45:01 GMT)
Full text and
rfc822 format available.
Message #25 received at 56623 <at> debbugs.gnu.org (full text, mbox):
Mattias Engdegård <mattiase <at> acm.org> writes:
> If either makes things better, maybe we can come up with a solution
> that doesn't hurt performance.
This might be completely off, but anyway:
lread.c uses free_cons. Is it certain that the cons cell is never
marked at that point? Not that I see it happening, or could explain how
exactly GC would run while that code executes. Maybe it's worth adding
a check in free_cons?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Mon, 29 Aug 2022 09:39:02 GMT)
Full text and
rfc822 format available.
Message #28 received at 56623 <at> debbugs.gnu.org (full text, mbox):
Gerd Möllmann <gerd.moellmann <at> gmail.com> writes:
> Mattias Engdegård <mattiase <at> acm.org> writes:
>
>> If either makes things better, maybe we can come up with a solution
>> that doesn't hurt performance.
>
> This might be completely off, but anyway:
>
> lread.c uses free_cons. Is it certain that the cons cell is never
> marked at that point? Not that I see it happening, or could explain how
> exactly GC would run while that code executes. Maybe it's worth adding
> a check in free_cons?
One candidate: A hash table with user-defined :test. Invoking the
user-defined functions can GC.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Mon, 29 Aug 2022 15:21:02 GMT)
Full text and
rfc822 format available.
Message #31 received at 56623 <at> debbugs.gnu.org (full text, mbox):
29 aug. 2022 kl. 14.59 skrev Greg Hendershott <mail <at> greghendershott.com>:
> The new `read' leaks when it errors.
Looks like a very very silly mistake. Will fix.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#56623
; Package
emacs
.
(Mon, 29 Aug 2022 15:27:02 GMT)
Full text and
rfc822 format available.
Message #34 received at 56623 <at> debbugs.gnu.org (full text, mbox):
I think I've discovered the problem.
The new `read' leaks when it errors.
Here's a little driver to elicit this:
--------------------------------------------------
;; Run this with e.g. emacs --batch -Q -l example.el
(message "Run htop and observe the RES value")
(dotimes (n 512)
(message "%s" n)
(with-temp-buffer
;; Insert an unclosed sexp
(insert "(")
;; Make it large, with quite a few sub-expressions to read
(dotimes (_ 8192)
(insert (format "%S" '(foo bar baz "a somewhat long string blah blah blah blah blah\n"))))
;; Go to start and attempt a read. This will fail due to the
;; unclosed paren, so we use `ignore-errors'.
(goto-char (point-min))
(ignore-errors (read (current-buffer)))))
;; With the old recursive lread.c, the RES value will oscillate around
;; a stable low value (presumably as GC happens).
;; With the new non-recursive lread.c, the RES value will climb
;; unbounded into hundreds of megabytes.
--------------------------------------------------
I hope this helps!
Reply sent
to
Mattias Engdegård <mattiase <at> acm.org>
:
You have taken responsibility.
(Mon, 29 Aug 2022 15:54:01 GMT)
Full text and
rfc822 format available.
Notification sent
to
Tom Gillespie <tgbugs <at> gmail.com>
:
bug acknowledged by developer.
(Mon, 29 Aug 2022 15:54:02 GMT)
Full text and
rfc822 format available.
Message #39 received at 56623-done <at> debbugs.gnu.org (full text, mbox):
>> The new `read' leaks when it errors.
>
> Looks like a very very silly mistake.
Indeed it was (sorry!), and it should be fixed by c0bb1aac10.
Many thanks for helping out!
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Tue, 27 Sep 2022 11:24:05 GMT)
Full text and
rfc822 format available.
This bug report was last modified 1 year and 211 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.