GNU bug report logs - #71012
30.0.50; tree-sitter crash
Reported by: Helmut Eller <eller.helmut <at> gmail.com>
Date: Fri, 17 May 2024 13:40:01 UTC
Severity: normal
Found in version 30.0.50
Done: Yuan Fu <casouri <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 71012 in the body.
You can then email your comments to 71012 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Fri, 17 May 2024 13:40:01 GMT)
Acknowledgement sent to Helmut Eller <eller.helmut <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 17 May 2024 13:40:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
The code in the attached file tries to parse src/lisp.h but crashes
while printing the result: emacs --batch -l ts-bug.el
[ts-bug.el (application/emacs-lisp, attachment)]
[Message part 3 (text/plain, inline)]
Program received signal SIGSEGV, Segmentation fault.
0x000055555575c33a in buf_bytepos_to_charpos (b=0x555556074c60, bytepos=1)
at marker.c:343
343 eassert (bytepos >= BUF_Z_BYTE (b)
(gdb) ba 10
#0 0x000055555575c33a in buf_bytepos_to_charpos (b=0x555556074c60, bytepos=1)
at marker.c:343
#1 0x0000555555853509 in Ftreesit_node_start
(node=node <at> entry=XIL(0x55555605b225)) at treesit.c:1927
#2 0x00005555557f3f8a in print_vectorlike_unreadable
(obj=XIL(0x55555605b225), printcharfun=XIL(0), escapeflag=<optimized out>, buf=0x7fffffff7ef0 "dd\aVUU") at print.c:2051
#3 0x00005555557f1b85 in print_object
(obj=<optimized out>, printcharfun=<optimized out>, escapeflag=false)
at print.c:2642
#4 0x00005555557f2cf0 in Fprin1_to_string
(object=object <at> entry=XIL(0x55555605b225), noescape=XIL(0x30), overrides=overrides <at> entry=XIL(0)) at print.c:814
#5 0x00005555557b7c30 in styled_format
(nargs=2, args=args <at> entry=0x7fffffffda30, message=message <at> entry=true)
at editfns.c:3635
#6 0x00005555557b933f in Fformat_message
(args=0x7fffffffda30, nargs=<optimized out>) at editfns.c:3388
#7 Fmessage (args=0x7fffffffda30, nargs=<optimized out>) at editfns.c:3185
#8 Fmessage (nargs=<optimized out>, args=0x7fffffffda30) at editfns.c:3154
#9 0x00005555557c6b75 in eval_sub (form=<optimized out>)
at /scratch/emacs/emacs-git/src/lisp.h:2243
(More stack frames follow...)
In GNU Emacs 30.0.50 (build 6, x86_64-pc-linux-gnu, GTK+ Version
3.24.38, cairo version 1.16.0) of 2024-05-17 built on caladan
Repository revision: 6ca3a60db3427bc6aef08144c1524920ff3d9c4d
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12101007
System Description: Debian GNU/Linux 12 (bookworm)
Configured using:
'configure --enable-checking --without-native-compiler
--with-xpm=ifavailable --with-gif=ifavailable
--with-native-compilation=no --with-tree-sitter'
Configured features:
CAIRO DBUS FREETYPE GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG LIBSELINUX
LIBSYSTEMD LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG SECCOMP SOUND
SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP X11 XDBE XIM
XINPUT2 GTK3 ZLIB
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Fri, 17 May 2024 15:30:02 GMT)
Message #8 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> From: Helmut Eller <eller.helmut <at> gmail.com>
> Date: Fri, 17 May 2024 15:39:27 +0200
>
> The code in the attached file tries to parse src/lisp.h but crashes
> while printing the result: emacs --batch -l ts-bug.el
Yuan, can you help, please?
Btw, why do you use treesit-parse-string? The Emacs integration with
tree-sitter can parse a buffer without making a string from its
contents.
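To illustrate the buffer-based approach mentioned above, here is a minimal sketch that parses a file through a parser attached to a buffer instead of going through treesit-parse-string. It assumes Emacs 29 or later built with tree-sitter and an installed C grammar; `my/parse-c-file` is a hypothetical helper name, not part of Emacs.

```elisp
;; Sketch only: parse FILE via a buffer-backed parser.
(require 'treesit nil t) ; do not signal an error if treesit.el is missing

(defun my/parse-c-file (file)
  "Return the printed root node of FILE parsed as C, or nil if unavailable.
Hypothetical helper for illustration; it is not part of Emacs."
  (when (and (fboundp 'treesit-available-p)
             (treesit-available-p)
             (treesit-language-available-p 'c))
    (with-temp-buffer
      (insert-file-contents file)
      (let* ((parser (treesit-parser-create 'c))
             (root (treesit-parser-root-node parser)))
        ;; Print the node while its buffer is still live; only the
        ;; resulting string escapes the temporary buffer.
        (format "%S" root)))))
```

The node is turned into a string before `with-temp-buffer` returns, so nothing that refers to the temporary buffer outlives it.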
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Fri, 17 May 2024 15:36:01 GMT)
Message #11 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> Btw, why do you use treesit-parse-string? The Emacs integration with
> tree-sitter can parse a buffer without making a string from its
> contents.
It's the first time that I use treesit. I just tried a few things.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Fri, 17 May 2024 16:01:01 GMT)
Message #14 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> From: Helmut Eller <eller.helmut <at> gmail.com>
> Cc: Yuan Fu <casouri <at> gmail.com>, 71012 <at> debbugs.gnu.org
> Date: Fri, 17 May 2024 17:34:04 +0200
>
> > Btw, why do you use treesit-parse-string? The Emacs integration with
> > tree-sitter can parse a buffer without making a string from its
> > contents.
>
> It's the first time that I use treesit. I just tried a few things.
There's a chapter about it in the ELisp manual (the node "Parsing
Program Source"), in case you haven't read it.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Sat, 18 May 2024 06:09:01 GMT)
Message #17 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On May 17, 2024, at 8:29 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> From: Helmut Eller <eller.helmut <at> gmail.com>
>> Date: Fri, 17 May 2024 15:39:27 +0200
>>
>> The code in the attached file tries to parse src/lisp.h but crashes
>> while printing the result: emacs --batch -l ts-bug.el
>
> Yuan, can you help, please?
>
> Btw, why do you use treesit-parse-string? The Emacs integration with
> tree-sitter can parse a buffer without making a string from its
> contents.
Yep, I’ll look into it.
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Mon, 27 May 2024 22:13:02 GMT)
Message #20 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On May 17, 2024, at 11:07 PM, Yuan Fu <casouri <at> gmail.com> wrote:
>
>
>
>> On May 17, 2024, at 8:29 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>>
>>> From: Helmut Eller <eller.helmut <at> gmail.com>
>>> Date: Fri, 17 May 2024 15:39:27 +0200
>>>
>>> The code in the attached file tries to parse src/lisp.h but crashes
>>> while printing the result: emacs --batch -l ts-bug.el
>>
>> Yuan, can you help, please?
>>
>> Btw, why do you use treesit-parse-string? The Emacs integration with
>> tree-sitter can parse a buffer without making a string from its
>> contents.
>
> Yep, I’ll look into it.
>
> Yuan
Just an update: I haven’t forgotten about this. If I don’t reply today, I will in a few days :-)
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Wed, 29 May 2024 05:17:02 GMT)
Message #23 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On May 27, 2024, at 3:10 PM, Yuan Fu <casouri <at> gmail.com> wrote:
>
>
>
>> On May 17, 2024, at 11:07 PM, Yuan Fu <casouri <at> gmail.com> wrote:
>>
>>
>>
>>> On May 17, 2024, at 8:29 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>>>
>>>> From: Helmut Eller <eller.helmut <at> gmail.com>
>>>> Date: Fri, 17 May 2024 15:39:27 +0200
>>>>
>>>> The code in the attached file tries to parse src/lisp.h but crashes
>>>> while printing the result: emacs --batch -l ts-bug.el
>>>
>>> Yuan, can you help, please?
>>>
>>> Btw, why do you use treesit-parse-string? The Emacs integration with
>>> tree-sitter can parse a buffer without making a string from its
>>> contents.
>>
>> Yep, I’ll look into it.
>>
>> Yuan
>
> Just an update, I didn’t forget about this. If I didn’t reply back today, I will in a few days :-)
>
> Yuan
From what I can gather, the crash seems to happen because the temp buffer is garbage collected: the inserted lisp.h is a large file, so the temp buffer is probably collected immediately, before Emacs tries to print the node on the next line. I replaced the file passed to insert-file-contents with a smaller one and it didn’t crash.
But that theory has critical flaws: a) Emacs certainly doesn't collect the temp buffer before the with-temp-buffer form returns; b) I can’t crash Emacs in my non-debug build by inserting (garbage-collect) in front of the message line in the example; c) a debug build of Emacs still crashes even if I enlarge gc-cons-threshold.
Eli, is there anything different regarding temp buffers in debug builds?
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Wed, 29 May 2024 12:29:02 GMT)
Message #26 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Tue, 28 May 2024 22:15:05 -0700
> Cc: Helmut Eller <eller.helmut <at> gmail.com>,
> 71012 <at> debbugs.gnu.org
>
> From what I can gather, the crash seems to be because the temp buffer is garbage collected—the inserted lisp.h is a large file, so the temp buffer is probably immediately collected, before Emacs tries to print the node in the next line. I replaced the insert-file-content with some smaller file and it didn’t crash.
It is unthinkable that a buffer is GC'ed while it is being used.
> But that theory has critical flaws: a) Emacs certainly doesn't collect the temp buffer before the with-temp-buffer form returns; b) I can’t crash Emacs in my non-debug build by inserting (garbage-collect) in front of the message line in the example; c) debug build Emacs still crashes even if I enlarge gc-cons-threshold.
>
> Eli, is there anything different regarding temp buffers in debug builds?
No.
But note that there are _two_ temporary buffers involved here: one is
created in ts-bug.el, and it remains intact and valid; the other is
the temporary buffer created by treesit-parse-string. That one is
killed by the time treesit-parse-string returns, so treesit-node-start
attempts to access positions of a killed buffer!
So I think this is a bug in treesit-parse-string: it cannot use
with-temp-buffer; instead, it should make the buffer into which it
inserts the string part of the parser, so that the buffer is killed
and GC'ed only when the parser is no longer referenced. Otherwise the
syntax tree returned by treesit-parse-string is unsafe to use.
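The lifetime problem described above can be reproduced without tree-sitter. The following is a minimal sketch in which a plain buffer object stands in for the parser's buffer; `my/temp-buffer-after-exit` is a hypothetical name used only for illustration.

```elisp
;; Sketch: `with-temp-buffer' kills its buffer on exit, so any object
;; that still refers to that buffer afterwards points at a dead buffer.
(defun my/temp-buffer-after-exit ()
  "Return the buffer object that `with-temp-buffer' used (hypothetical helper)."
  (with-temp-buffer
    (insert "int c = 0;")
    ;; Leak the buffer object out of the form, much as a node leaks a
    ;; reference to the parser's buffer.
    (current-buffer)))

;; The returned object is a killed buffer, so this evaluates to nil.
(buffer-live-p (my/temp-buffer-after-exit))
```

A node returned from a function that parsed inside `with-temp-buffer` is in the same position: it refers to a buffer that no longer has any text to read positions from.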
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Sat, 01 Jun 2024 17:17:02 GMT)
Message #29 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On May 29, 2024, at 5:28 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Tue, 28 May 2024 22:15:05 -0700
>> Cc: Helmut Eller <eller.helmut <at> gmail.com>,
>> 71012 <at> debbugs.gnu.org
>>
>> From what I can gather, the crash seems to be because the temp buffer is garbage collected—the inserted lisp.h is a large file, so the temp buffer is probably immediately collected, before Emacs tries to print the node in the next line. I replaced the insert-file-content with some smaller file and it didn’t crash.
>
> It is unthinkable that a buffer is GC'ed while it is being used.
>
>> But that theory has critical flaws: a) Emacs certainly doesn't collect the temp buffer before the with-temp-buffer form returns; b) I can’t crash Emacs in my non-debug build by inserting (garbage-collect) in front of the message line in the example; c) debug build Emacs still crashes even if I enlarge gc-cons-threshold.
>>
>> Eli, is there anything different regarding temp buffers in debug builds?
>
> No.
>
> But note that there are _two_ temporary buffers involved here: one is
> created in ts-bug.el, and it remains intact and valid; the other is
> the temporary buffer created by treesit-parse-string. That one is
> killed by the time treesit-parse-string returns, so treesit-node-start
> attempts to access positions of a killed buffer!
>
> So I think this is a bug in treesit-parse-string: it cannot use
> with-temp-buffer; instead, it should make the buffer into which it
> inserts the string part of the parser, so that the buffer is killed
> and GC'ed only when the parser is no longer referenced. Otherwise the
> syntax tree returned by treesit-parse-string is unsafe to use.
I see, you’re absolutely right, thanks for the analysis! On top of that I need to make sure all the treesit functions check for buffer liveness before accessing the buffer. I was under the impression that a killed buffer would keep its content around until it’s collected. Turns out that isn’t the case.
Yuan
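A liveness check of the kind described above could look roughly like this at the Lisp level. This is only a sketch: the fix pushed to emacs-29 is not shown in this report and may well differ, and `my/treesit-node-start-if-live` is a hypothetical name.

```elisp
;; Sketch only: guard a node accessor with a buffer liveness check.
(require 'treesit nil t) ; do not signal an error if treesit.el is missing

(defun my/treesit-node-start-if-live (node)
  "Return NODE's start position, or nil if NODE's buffer has been killed.
Hypothetical Lisp-level guard, not the change made in Emacs itself."
  (let ((buf (treesit-node-buffer node)))
    (when (buffer-live-p buf)
      (treesit-node-start node))))
```

Returning nil is just one choice; signaling an error for a node whose buffer is dead would be an equally reasonable design.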
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Sat, 01 Jun 2024 17:46:02 GMT)
Message #32 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On Jun 1, 2024, at 10:15 AM, Yuan Fu <casouri <at> gmail.com> wrote:
>
>
>
>> On May 29, 2024, at 5:28 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>>
>>> From: Yuan Fu <casouri <at> gmail.com>
>>> Date: Tue, 28 May 2024 22:15:05 -0700
>>> Cc: Helmut Eller <eller.helmut <at> gmail.com>,
>>> 71012 <at> debbugs.gnu.org
>>>
>>> From what I can gather, the crash seems to be because the temp buffer is garbage collected—the inserted lisp.h is a large file, so the temp buffer is probably immediately collected, before Emacs tries to print the node in the next line. I replaced the insert-file-content with some smaller file and it didn’t crash.
>>
>> It is unthinkable that a buffer is GC'ed while it is being used.
>>
>>> But that theory has critical flaws: a) Emacs certainly doesn't collect the temp buffer before the with-temp-buffer form returns; b) I can’t crash Emacs in my non-debug build by inserting (garbage-collect) in front of the message line in the example; c) debug build Emacs still crashes even if I enlarge gc-cons-threshold.
>>>
>>> Eli, is there anything different regarding temp buffers in debug builds?
>>
>> No.
>>
>> But note that there are _two_ temporary buffers involved here: one is
>> created in ts-bug.el, and it remains intact and valid; the other is
>> the temporary buffer created by treesit-parse-string. That one is
>> killed by the time treesit-parse-string returns, so treesit-node-start
>> attempts to access positions of a killed buffer!
>>
>> So I think this is a bug in treesit-parse-string: it cannot use
>> with-temp-buffer; instead, it should make the buffer into which it
>> inserts the string part of the parser, so that the buffer is killed
>> and GC'ed only when the parser is no longer referenced. Otherwise the
>> syntax tree returned by treesit-parse-string is unsafe to use.
>
> I see, you’re absolutely right, thanks for the analysis! On top of that I need to make sure all the treesit function checks for buffer liveness before accessing the buffer. I was under the impression that a killed buffer would keep its content around until it’s collected. Turns out that wasn’t the case.
>
> Yuan
Pushed the fix to emacs-29.
Yuan
Reply sent to Yuan Fu <casouri <at> gmail.com>: You have taken responsibility. (Thu, 06 Jun 2024 05:33:02 GMT)
Notification sent to Helmut Eller <eller.helmut <at> gmail.com>: bug acknowledged by developer. (Thu, 06 Jun 2024 05:33:02 GMT)
Message #37 received at 71012-done <at> debbugs.gnu.org (full text, mbox):
> On Jun 1, 2024, at 10:43 AM, Yuan Fu <casouri <at> gmail.com> wrote:
>
>
>
>> On Jun 1, 2024, at 10:15 AM, Yuan Fu <casouri <at> gmail.com> wrote:
>>
>>
>>
>>> On May 29, 2024, at 5:28 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>>>
>>>> From: Yuan Fu <casouri <at> gmail.com>
>>>> Date: Tue, 28 May 2024 22:15:05 -0700
>>>> Cc: Helmut Eller <eller.helmut <at> gmail.com>,
>>>> 71012 <at> debbugs.gnu.org
>>>>
>>>> From what I can gather, the crash seems to be because the temp buffer is garbage collected—the inserted lisp.h is a large file, so the temp buffer is probably immediately collected, before Emacs tries to print the node in the next line. I replaced the insert-file-content with some smaller file and it didn’t crash.
>>>
>>> It is unthinkable that a buffer is GC'ed while it is being used.
>>>
>>>> But that theory has critical flaws: a) Emacs certainly doesn't collect the temp buffer before the with-temp-buffer form returns; b) I can’t crash Emacs in my non-debug build by inserting (garbage-collect) in front of the message line in the example; c) debug build Emacs still crashes even if I enlarge gc-cons-threshold.
>>>>
>>>> Eli, is there anything different regarding temp buffers in debug builds?
>>>
>>> No.
>>>
>>> But note that there are _two_ temporary buffers involved here: one is
>>> created in ts-bug.el, and it remains intact and valid; the other is
>>> the temporary buffer created by treesit-parse-string. That one is
>>> killed by the time treesit-parse-string returns, so treesit-node-start
>>> attempts to access positions of a killed buffer!
>>>
>>> So I think this is a bug in treesit-parse-string: it cannot use
>>> with-temp-buffer; instead, it should make the buffer into which it
>>> inserts the string part of the parser, so that the buffer is killed
>>> and GC'ed only when the parser is no longer referenced. Otherwise the
>>> syntax tree returned by treesit-parse-string is unsafe to use.
>>
>> I see, you’re absolutely right, thanks for the analysis! On top of that I need to make sure all the treesit function checks for buffer liveness before accessing the buffer. I was under the impression that a killed buffer would keep its content around until it’s collected. Turns out that wasn’t the case.
>>
>> Yuan
>
> Pushed the fix to emacs-29.
>
> Yuan
>
The fix works for me so I’m closing this report. Feel free to follow up if new problems occur :-)
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Fri, 07 Jun 2024 08:40:02 GMT)
Message #40 received at 71012 <at> debbugs.gnu.org (full text, mbox):
Just curious: since generate-new-buffer creates a new buffer each time
it is called, is it guaranteed that this buffer will eventually be GCed,
once the caller of treesit-parse-string is done with it?
Thanks,
--
Basil
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Mon, 10 Jun 2024 08:43:02 GMT)
Message #43 received at 71012 <at> debbugs.gnu.org (full text, mbox):
BTW, not sure if this is the right bug report, but currently on master I
see the following test failure:
make TEST_LOAD_EL=no test/treesit-tests
make -C test treesit-tests
make[1]: Entering directory '/home/blc/.local/src/emacs/test'
make[2]: Entering directory '/home/blc/.local/src/emacs/test'
GEN src/treesit-tests.log
Running 28 tests (2024-06-10 10:11:18+0200, selector `(not (or (tag :unstable) (tag :nativecomp)))')
passed 1/28 treesit-basic-parsing (0.000398 sec)
passed 2/28 treesit-cross-boundary (0.000307 sec)
passed 3/28 treesit-cursor-helper-with-missing-node (0.000217 sec)
Can't guess python-indent-offset, using defaults: 4
passed 4/28 treesit-defun-navigation-nested-1 (0.038371 sec)
passed 5/28 treesit-defun-navigation-nested-2 (0.058591 sec)
passed 6/28 treesit-defun-navigation-nested-3 (0.002775 sec)
passed 7/28 treesit-defun-navigation-nested-4 (0.003478 sec)
Can't guess python-indent-offset, using defaults: 4
passed 8/28 treesit-defun-navigation-top-level (0.003415 sec)
passed 9/28 treesit-indirect-buffer (0.000249 sec)
passed 10/28 treesit-multi-lang (0.000739 sec)
passed 11/28 treesit-narrow (0.000216 sec)
Test treesit-node-api backtrace:
make[2]: *** [Makefile:185: src/treesit-tests.log] Aborted (core dumped)
make[2]: Leaving directory '/home/blc/.local/src/emacs/test'
make[1]: *** [Makefile:251: src/treesit-tests] Error 2
make[1]: Leaving directory '/home/blc/.local/src/emacs/test'
make: *** [Makefile:1133: test/treesit-tests] Error 2
And no treesit-tests.log file is generated.
--
Basil
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Mon, 10 Jun 2024 18:27:02 GMT)
Message #46 received at 71012-done <at> debbugs.gnu.org (full text, mbox):
> Cc: 71012 <at> debbugs.gnu.org, eller.helmut <at> gmail.com
> From: "Basil L. Contovounesios" <basil <at> contovou.net>
> Date: Mon, 10 Jun 2024 10:12:51 +0200
>
> BTW, not sure if this is the right bug report, but currently on master I
> see the following test failure:
>
> make TEST_LOAD_EL=no test/treesit-tests
> make -C test treesit-tests
> make[1]: Entering directory '/home/blc/.local/src/emacs/test'
> make[2]: Entering directory '/home/blc/.local/src/emacs/test'
> GEN src/treesit-tests.log
> Running 28 tests (2024-06-10 10:11:18+0200, selector `(not (or (tag :unstable) (tag :nativecomp)))')
> passed 1/28 treesit-basic-parsing (0.000398 sec)
> passed 2/28 treesit-cross-boundary (0.000307 sec)
> passed 3/28 treesit-cursor-helper-with-missing-node (0.000217 sec)
> Can't guess python-indent-offset, using defaults: 4
> passed 4/28 treesit-defun-navigation-nested-1 (0.038371 sec)
> passed 5/28 treesit-defun-navigation-nested-2 (0.058591 sec)
> passed 6/28 treesit-defun-navigation-nested-3 (0.002775 sec)
> passed 7/28 treesit-defun-navigation-nested-4 (0.003478 sec)
> Can't guess python-indent-offset, using defaults: 4
> passed 8/28 treesit-defun-navigation-top-level (0.003415 sec)
> passed 9/28 treesit-indirect-buffer (0.000249 sec)
> passed 10/28 treesit-multi-lang (0.000739 sec)
> passed 11/28 treesit-narrow (0.000216 sec)
> Test treesit-node-api backtrace:
> make[2]: *** [Makefile:185: src/treesit-tests.log] Aborted (core dumped)
> make[2]: Leaving directory '/home/blc/.local/src/emacs/test'
> make[1]: *** [Makefile:251: src/treesit-tests] Error 2
> make[1]: Leaving directory '/home/blc/.local/src/emacs/test'
> make: *** [Makefile:1133: test/treesit-tests] Error 2
>
> And no treesit-tests.log file is generated.
Mattias fixed the crash, and I then fixed the test not to fail.
So I'm now closing this bug.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Wed, 12 Jun 2024 05:40:02 GMT)
Message #49 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On Jun 7, 2024, at 1:39 AM, Basil L. Contovounesios <basil <at> contovou.net> wrote:
>
> Just curious: since generate-new-buffer creates a new buffer each time
> it is called, is it guaranteed that this buffer will eventually be GCed,
> once the caller of treesit-parse-string is done with it?
Yeah, from my testing that seems to be the case.
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Wed, 12 Jun 2024 05:40:04 GMT)
Message #52 received at 71012-done <at> debbugs.gnu.org (full text, mbox):
> On Jun 10, 2024, at 11:25 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> Cc: 71012 <at> debbugs.gnu.org, eller.helmut <at> gmail.com
>> From: "Basil L. Contovounesios" <basil <at> contovou.net>
>> Date: Mon, 10 Jun 2024 10:12:51 +0200
>>
>> BTW, not sure if this is the right bug report, but currently on master I
>> see the following test failure:
>>
>> make TEST_LOAD_EL=no test/treesit-tests
>> make -C test treesit-tests
>> make[1]: Entering directory '/home/blc/.local/src/emacs/test'
>> make[2]: Entering directory '/home/blc/.local/src/emacs/test'
>> GEN src/treesit-tests.log
>> Running 28 tests (2024-06-10 10:11:18+0200, selector `(not (or (tag :unstable) (tag :nativecomp)))')
>> passed 1/28 treesit-basic-parsing (0.000398 sec)
>> passed 2/28 treesit-cross-boundary (0.000307 sec)
>> passed 3/28 treesit-cursor-helper-with-missing-node (0.000217 sec)
>> Can't guess python-indent-offset, using defaults: 4
>> passed 4/28 treesit-defun-navigation-nested-1 (0.038371 sec)
>> passed 5/28 treesit-defun-navigation-nested-2 (0.058591 sec)
>> passed 6/28 treesit-defun-navigation-nested-3 (0.002775 sec)
>> passed 7/28 treesit-defun-navigation-nested-4 (0.003478 sec)
>> Can't guess python-indent-offset, using defaults: 4
>> passed 8/28 treesit-defun-navigation-top-level (0.003415 sec)
>> passed 9/28 treesit-indirect-buffer (0.000249 sec)
>> passed 10/28 treesit-multi-lang (0.000739 sec)
>> passed 11/28 treesit-narrow (0.000216 sec)
>> Test treesit-node-api backtrace:
>> make[2]: *** [Makefile:185: src/treesit-tests.log] Aborted (core dumped)
>> make[2]: Leaving directory '/home/blc/.local/src/emacs/test'
>> make[1]: *** [Makefile:251: src/treesit-tests] Error 2
>> make[1]: Leaving directory '/home/blc/.local/src/emacs/test'
>> make: *** [Makefile:1133: test/treesit-tests] Error 2
>>
>> And no treesit-tests.log file is generated.
>
> Mattias fixed the crash, and I then fixed the test not to fail.
>
> So I'm now closing this bug.
Thank you!
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Thu, 13 Jun 2024 11:44:01 GMT)
Message #55 received at 71012 <at> debbugs.gnu.org (full text, mbox):
Yuan Fu [2024-06-11 22:38 -0700] wrote:
>> On Jun 7, 2024, at 1:39 AM, Basil L. Contovounesios <basil <at> contovou.net> wrote:
>>
>> Just curious: since generate-new-buffer creates a new buffer each time
>> it is called, is it guaranteed that this buffer will eventually be GCed,
>> once the caller of treesit-parse-string is done with it?
>
> Yeah, from my testing that seems to be the case.
What did you try?
I'm putting the following in an emacs -Q *scratch* buffer:
(require 'treesit)
(message "# of buffers before : %d" (length (buffer-list)))
(dotimes-with-progress-reporter (i 10000) "Parsing"
  (treesit-parse-string "int c = 0;" 'c))
(garbage-collect)
(message "# of buffers after : %d" (length (buffer-list)))
Each time I M-x eval-buffer:
- the list of buffers grows
- the memory usage grows
- loop iterations slow down noticeably
Am I missing something?
--
Basil
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Thu, 13 Jun 2024 11:54:02 GMT)
Message #58 received at 71012 <at> debbugs.gnu.org (full text, mbox):
By the way, this shouldn't make a big difference by default, but did you
consider calling generate-new-buffer with a non-nil optional argument (à
la with-temp-buffer) in treesit-parse-string?
--
Basil
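For reference, the optional argument in question is INHIBIT-BUFFER-HOOKS (available since Emacs 28). The sketch below shows its effect on `kill-buffer-hook`; `my/kill-hook-ran-p` and the buffer name are hypothetical and exist only for illustration.

```elisp
;; Sketch: compare a buffer created with and without INHIBIT-BUFFER-HOOKS.
(defun my/kill-hook-ran-p (inhibit-hooks)
  "Create and kill a scratch buffer; return non-nil if `kill-buffer-hook' ran.
INHIBIT-HOOKS is passed to `generate-new-buffer' as INHIBIT-BUFFER-HOOKS.
Hypothetical helper for illustration."
  (let* ((ran nil)
         ;; Temporarily install a hook function that records being called.
         (kill-buffer-hook (list (lambda () (setq ran t))))
         (buf (generate-new-buffer " *my-work*" inhibit-hooks)))
    (kill-buffer buf)
    ran))

;; With a nil argument the hook runs; with a non-nil argument it is skipped.
(list (my/kill-hook-ran-p nil) (my/kill-hook-ran-p t))
```

Skipping the hooks mainly saves work for internal buffers that no user code is expected to observe.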
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 12 Jul 2024 11:24:06 GMT)
bug unarchived. Request was from "Basil L. Contovounesios" <basil <at> contovou.net> to control <at> debbugs.gnu.org. (Wed, 24 Jul 2024 14:57:01 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Wed, 24 Jul 2024 14:59:01 GMT)
Message #65 received at 71012 <at> debbugs.gnu.org (full text, mbox):
Ping: thoughts on whether this is an issue?
Basil L. Contovounesios [2024-06-13 13:43 +0200] wrote:
> Yuan Fu [2024-06-11 22:38 -0700] wrote:
>>> On Jun 7, 2024, at 1:39 AM, Basil L. Contovounesios <basil <at> contovou.net> wrote:
>>>
>>> Just curious: since generate-new-buffer creates a new buffer each time
>>> it is called, is it guaranteed that this buffer will eventually be GCed,
>>> once the caller of treesit-parse-string is done with it?
>>
>> Yeah, from my testing that seems to be the case.
>
> What did you try?
> I'm putting the following in an emacs -Q *scratch* buffer:
>
> (require 'treesit)
> (message "# of buffers before : %d" (length (buffer-list)))
> (dotimes-with-progress-reporter (i 10000) "Parsing"
> (treesit-parse-string "int c = 0;" 'c))
> (garbage-collect)
> (message "# of buffers after : %d" (length (buffer-list)))
>
> Each time I M-x eval-buffer:
> - the list of buffers grows
> - the memory usage grows
> - loop iterations slow down noticeably
>
> Am I missing something?
Thanks,
--
Basil
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Wed, 24 Jul 2024 16:33:02 GMT)
Message #68 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> Cc: 71012 <at> debbugs.gnu.org, eller.helmut <at> gmail.com
> From: "Basil L. Contovounesios" <basil <at> contovou.net>
> Date: Wed, 24 Jul 2024 16:57:53 +0200
>
> Ping: thoughts on whether this is an issue?
Thoughts about what? What is deemed to be a problem in this case?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Wed, 24 Jul 2024 23:33:02 GMT)
Message #71 received at 71012 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii [2024-07-24 19:31 +0300] wrote:
>> Cc: 71012 <at> debbugs.gnu.org, eller.helmut <at> gmail.com
>> From: "Basil L. Contovounesios" <basil <at> contovou.net>
>> Date: Wed, 24 Jul 2024 16:57:53 +0200
>>
>> Ping: thoughts on whether this is an issue?
>
> Thoughts about what? what is deemed to be a problem in this case?
That each call to treesit-parse-string now allocates a new internal
buffer which is not automatically GCed.
If this cannot be avoided, I think the docs should at least warn that
it's the caller's responsibility to kill the return node's buffer when
finished with it.
--
Basil
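If callers do have to clean up after themselves, one possible pattern is sketched below. It assumes a build in which treesit-parse-string returns a node backed by an internal buffer, as described above, plus an installed grammar for the language; `my/call-with-parsed-string` is a hypothetical helper, not part of Emacs.

```elisp
;; Sketch only: parse a string, use the tree, then kill the backing buffer.
(require 'treesit nil t) ; do not signal an error if treesit.el is missing

(defun my/call-with-parsed-string (string language fn)
  "Parse STRING as LANGUAGE, call FN with the root node, then clean up.
Return FN's value.  FN must not return or store the node, because the
node's buffer is killed before this function returns.
Hypothetical helper for illustration."
  (let* ((root (treesit-parse-string string language))
         (buf (treesit-node-buffer root)))
    (unwind-protect
        (funcall fn root)
      ;; Kill the internal buffer even if FN signals an error.
      (when (buffer-live-p buf)
        (kill-buffer buf)))))
```

A call such as `(my/call-with-parsed-string "int c = 0;" 'c #'treesit-node-type)` would extract plain data from the tree and leave no buffer behind.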
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Thu, 25 Jul 2024 05:28:02 GMT)
Message #74 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> From: "Basil L. Contovounesios" <basil <at> contovou.net>
> Cc: casouri <at> gmail.com, 71012 <at> debbugs.gnu.org, eller.helmut <at> gmail.com
> Date: Thu, 25 Jul 2024 01:32:01 +0200
>
> Eli Zaretskii [2024-07-24 19:31 +0300] wrote:
>
> >> Cc: 71012 <at> debbugs.gnu.org, eller.helmut <at> gmail.com
> >> From: "Basil L. Contovounesios" <basil <at> contovou.net>
> >> Date: Wed, 24 Jul 2024 16:57:53 +0200
> >>
> >> Ping: thoughts on whether this is an issue?
> >
> > Thoughts about what? what is deemed to be a problem in this case?
>
> That each call to treesit-parse-string now allocates a new internal
> buffer which is not automatically GCed.
>
> If this cannot be avoided, I think the docs should at least warn that
> it's the caller's responsibility to kill the return node's buffer when
> finished with it.
If the buffers aren't killed by the code, it's indeed an issue that
needs to be solved. At the time I suggested that the node's buffer be
killed when the node is GC'ed or deleted -- has this not been done?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Thu, 25 Jul 2024 07:28:02 GMT)
Message #77 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On Jul 24, 2024, at 10:27 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> From: "Basil L. Contovounesios" <basil <at> contovou.net>
>> Cc: casouri <at> gmail.com, 71012 <at> debbugs.gnu.org, eller.helmut <at> gmail.com
>> Date: Thu, 25 Jul 2024 01:32:01 +0200
>>
>> Eli Zaretskii [2024-07-24 19:31 +0300] wrote:
>>
>>>> Cc: 71012 <at> debbugs.gnu.org, eller.helmut <at> gmail.com
>>>> From: "Basil L. Contovounesios" <basil <at> contovou.net>
>>>> Date: Wed, 24 Jul 2024 16:57:53 +0200
>>>>
>>>> Ping: thoughts on whether this is an issue?
>>>
>>> Thoughts about what? what is deemed to be a problem in this case?
>>
>> That each call to treesit-parse-string now allocates a new internal
>> buffer which is not automatically GCed.
>>
>> If this cannot be avoided, I think the docs should at least warn that
>> it's the caller's responsibility to kill the return node's buffer when
>> finished with it.
>
> If the buffers aren't killed by the code, it's indeed an issue that
> needs to be solved. At the time I suggested that the node's buffer is
> killed when the node is GC'ed or deleted -- have this not been done?
No. I think it’s best to implement treesit-parse-string in C with ts_parser_parse_string now. That way we don’t need to worry about the temp buffer.
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Thu, 25 Jul 2024 10:42:01 GMT)
Message #80 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Thu, 25 Jul 2024 00:26:30 -0700
> Cc: "Basil L. Contovounesios" <basil <at> contovou.net>,
> 71012 <at> debbugs.gnu.org,
> Helmut Eller <eller.helmut <at> gmail.com>
>
> No. I think it’s best to implement treesit-parse-string in C with ts_parser_parse_string now. That way we don’t need to worry about the temp buffer.
That'd be fine as well, yes.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Sun, 04 Aug 2024 03:04:02 GMT)
Message #83 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On Jul 25, 2024, at 3:40 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> That'd be fine as well, yes.
Just an update: this is taking longer than I expected. I need to change many things so the parsed tree ends up properly gc’ed. Now I remember why I used with-temp-buffer in the first place :-)
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Sat, 24 Aug 2024 22:33:02 GMT)
Message #86 received at 71012 <at> debbugs.gnu.org (full text, mbox):
> On Aug 3, 2024, at 8:01 PM, Yuan Fu <casouri <at> gmail.com> wrote:
>
> Just an update: this is taking longer than I expected. I need to change many things so the parsed tree ends up properly gc’ed. Now I remember why I used with-temp-buffer in the first place :-)
>
> Yuan
>
Ok, after much struggle I settled on the easier option, which is to still use a temp buffer, but make sure Emacs garbage collects the temp buffer. I detailed the reasoning in the comment above Ftreesit_parse_string. I verified that the buffer gets collected when the nodes are collected, and pushed it to master.
Yuan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Tue, 27 Aug 2024 11:01:01 GMT)
Message #89 received at 71012 <at> debbugs.gnu.org (full text, mbox):
Yuan Fu [2024-08-24 15:30 -0700] wrote:
> Ok, after much struggle I settled on the easier option, which is to still use
> a temp buffer, but make sure Emacs garbage collects the temp buffer. I detailed
> the reasoning in the comment above Ftreesit_parse_string. I verified that the
> buffer gets collected when the nodes are collected, and pushed it to master.
Thanks, the snippet in https://bugs.gnu.org/71012#55 now runs quickly
and in linear time.
--
Basil
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#71012; Package emacs. (Wed, 28 Aug 2024 04:31:02 GMT)
Message #92 received at 71012-done <at> debbugs.gnu.org (full text, mbox):
> On Aug 27, 2024, at 3:59 AM, Basil L. Contovounesios <basil <at> contovou.net> wrote:
>
> Yuan Fu [2024-08-24 15:30 -0700] wrote:
>
>> Ok, after much struggle I settled on the easier option, which is to still use
>> a temp buffer, but make sure Emacs garbage collects the temp buffer. I detailed
>> the reasoning in the comment above Ftreesit_parse_string. I verified that the
>> buffer gets collected when the nodes are collected, and pushed it to master.
>
> Thanks, the snippet in https://bugs.gnu.org/71012#55 now runs quickly
> and in linear time.
Great! Closing this report.
Yuan
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 25 Sep 2024 11:24:06 GMT)