GNU bug report logs - #60237
30.0.50; tree sitter core dumps when I edebug view a node


Package: emacs;

Reported by: Mickey Petersen <mickey <at> masteringemacs.org>

Date: Wed, 21 Dec 2022 12:30:02 UTC

Severity: normal

Found in version 30.0.50

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Wed, 21 Dec 2022 12:30:02 GMT)

Acknowledgement sent to Mickey Petersen <mickey <at> masteringemacs.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 21 Dec 2022 12:30:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.50; tree sitter core dumps when I edebug view a node
Date: Wed, 21 Dec 2022 12:24:34 +0000
This happens in emacs -Q (after loading some simple elisp code that uses treesit.el), consistently and repeatedly.


Here's the elisp. When I edebug it, I can step through and view all the variables and expressions I like. The `combobulate-' functions are widely used in the library, pose no issues anywhere else, and do nothing more than fetch nodes via tree-sitter. Only this bit of code blows up, and then only when invoked inside a Python string.



    (when-let ((navigable-node (combobulate--get-nearest-navigable-node)) ;; <-- edebugging these works fine
               (nearest-node (combobulate-node-at-point))
               (targets (seq-filter
                         (lambda (elem) (and elem (< elem (point))))
                         (list (save-excursion (ignore-errors (backward-up-list 1 t t) (point)))
                               (combobulate-node-point (combobulate--nav-get-parent navigable-node)) ;; <- call into this inner form blows up when I read the argument value of `navigable-node' on the inside.
                               (combobulate-node-point (combobulate--nav-get-parent nearest-node))))))
      (when-let (target (apply #'max targets))
        (goto-char target)
        (combobulate--flash-node (combobulate--get-nearest-navigable-node))))

Here is the "fix"

    (when-let* ((navigable-node (combobulate--get-nearest-navigable-node))
                (nearest-node (combobulate-node-at-point))
                (navigable-node-parent (combobulate--nav-get-parent navigable-node)) ;; <- refactor out
                (nearest-node-parent (combobulate--nav-get-parent nearest-node)) ;; <- refactor out
                (targets (seq-filter
                          (lambda (elem) (and elem (< elem (point))))
                          (list (save-excursion (ignore-errors (backward-up-list 1 t t) (point))) ; <- smoking gun
                                (combobulate-node-point navigable-node-parent)
                                (combobulate-node-point nearest-node-parent)))))
      (when-let (target (apply #'max targets))
        (goto-char target)
        (combobulate--flash-node (combobulate--get-nearest-navigable-node))))

Clearly, the combination of `ignore-errors' and `backward-up-list' (which signals errors left and right if it doesn't like what it's seeing) is causing this.
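
For anyone trying to narrow this down, here is a minimal sketch (not taken from combobulate) of how the `backward-up-list' call could be wrapped in `condition-case' instead of `ignore-errors', so that a signaled error is reported rather than silently dropped. The helper name `my/backward-up-list-pos' is hypothetical, and handling only `scan-error' is an assumption about which error is signaled here.

```elisp
;; Hypothetical helper, for illustration only; not part of combobulate.
(defun my/backward-up-list-pos ()
  "Return the position reached by `backward-up-list', or nil on `scan-error'.
Unlike `ignore-errors', this logs the error instead of swallowing it."
  (save-excursion
    (condition-case err
        (progn
          ;; Same arguments as in the report: ESCAPE-STRINGS and
          ;; NO-SYNTAX-CROSSING are both non-nil.
          (backward-up-list 1 t t)
          (point))
      (scan-error
       ;; Assumption: `scan-error' is the condition of interest; any
       ;; other error still propagates to the caller.
       (message "backward-up-list signaled: %S" err)
       nil))))
```

Swapping this in for the `(save-excursion (ignore-errors ...))' form should at least show in *Messages* whether an error is signaled on the path that crashes or hangs.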

If I just run the code instead of edebugging it, Emacs hangs and I have to kill -9 it.


Core dump's half a gig; not going to attach it.


--- Backtrace from the dump here ---

#0  raise (sig=<optimised out>) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000055e6f87a8e21 in terminate_due_to_signal (sig=sig@entry=11, backtrace_limit=-117776184, backtrace_limit@entry=40) at emacs.c:464
#2  0x000055e6f87a933d in handle_fatal_signal (sig=sig@entry=11) at sysdep.c:1783
#3  0x000055e6f8901f2d in deliver_thread_signal (sig=sig@entry=11, handler=0x55e6f87a932c <handle_fatal_signal>) at sysdep.c:1775
#4  0x000055e6f8901fad in deliver_fatal_thread_signal (sig=11) at sysdep.c:1888
#5  handle_sigsegv (sig=11, siginfo=<optimised out>, arg=<optimised out>) at sysdep.c:1888
#6  0x00007fb676b683c0 in <signal handler called> () at /lib/x86_64-linux-gnu/libpthread.so.0
#7  0x00007fb674ea6574 in ts_language_symbol_count () at /usr/local/lib/libtree-sitter.so.0
#8  0x00007fb674ea6773 in ts_language_symbol_name () at /usr/local/lib/libtree-sitter.so.0
#9  0x000055e6f8a01ca5 in Ftreesit_node_type (node=node@entry=XIL(0x55e6fdb98f2d)) at treesit.c:1705
#10 0x000055e6f899ee4d in print_vectorlike
    (obj=XIL(0x55e6fdb98f2d), printcharfun=XIL(0), escapeflag=<optimised out>, buf=0x7ffe8098a210 "\335M\351\371\346U") at print.c:2040
#11 0x000055e6f899cb51 in print_object (obj=XIL(0x55e6fdb98f2d), printcharfun=XIL(0), escapeflag=true) at print.c:2612
#12 0x000055e6f899d42c in Fprin1 (object=XIL(0x55e6fdb98f2d), printcharfun=XIL(0x55e6fd9fcdd5), overrides=<optimised out>) at print.c:777
#13 0x000055e6f89bc627 in exec_byte_code (fun=<optimised out>, args_template=<optimised out>, nargs=<optimised out>, args=<optimised out>)
    at lisp.h:2204
#14 0x000055e6f89735f7 in Ffuncall (nargs=3, args=0x7ffe8098a530) at eval.c:2995
#15 0x000055e6f8973880 in Fapply (nargs=2, args=0x7fb66f8327a8) at eval.c:2666
#16 0x000055e6f89bc627 in exec_byte_code (fun=<optimised out>, args_template=<optimised out>, nargs=<optimised out>, args=<optimised out>)
    at lisp.h:2204
#17 0x000055e6f89735f7 in Ffuncall (nargs=4, args=0x7ffe8098a680) at eval.c:2995
#18 0x000055e6f8973880 in Fapply (nargs=3, args=0x7fb66f832700) at eval.c:2666
#19 0x000055e6f89bc627 in exec_byte_code (fun=<optimised out>, args_template=<optimised out>, nargs=<optimised out>, args=<optimised out>)
    at lisp.h:2204
#20 0x000055e6f89735f7 in Ffuncall (nargs=3, args=0x7fb66f832660) at eval.c:2995
#21 0x000055e6f8973b0a in Fapply (nargs=3, args=0x7fb66f832660) at eval.c:2623
#22 0x000055e6f89bc627 in exec_byte_code (fun=<optimised out>, args_template=<optimised out>, nargs=<optimised out>, args=<optimised out>)
    at lisp.h:2204
#23 0x000055e6f89bc366 in exec_byte_code (fun=<optimised out>, args_template=<optimised out>, nargs=<optimised out>, args=<optimised out>)
    at bytecode.c:811
#24 0x000055e6f89735f7 in Ffuncall (nargs=nargs@entry=3, args=args@entry=0x7ffe8098a9f8) at eval.c:2995
#25 0x000055e6f896f293 in Ffuncall_interactively (nargs=3, args=0x7ffe8098a9f8) at callint.c:248
#26 0x000055e6f89735f7 in Ffuncall (nargs=4, args=0x7ffe8098a9f0) at eval.c:2995
#27 0x000055e6f8973880 in Fapply (nargs=nargs@entry=3, args=args@entry=0x7ffe8098ab60) at eval.c:2666
#28 0x000055e6f8970c57 in Fcall_interactively (function=XIL(0x3a90730), record_flag=XIL(0), keys=XIL(0x55e6fd9f8a6d)) at lisp.h:1171
#29 0x00007fb6706bdc95 in F636f6d6d616e642d65786563757465_command_execute_0 ()
    at /home/mickey/Downloads/emacs/src/../native-lisp/30.0.50-7cb43add/preloaded/simple-fab5b0cf-b9ebea66.eln
#30 0x000055e6f89735f7 in Ffuncall (nargs=nargs@entry=2, args=args@entry=0x7ffe8098ad10) at eval.c:2995
#31 0x000055e6f88f5ea0 in call1 (arg1=<optimised out>, fn=XIL(0x4c20)) at lisp.h:3247
#32 command_loop_1 () at keyboard.c:1495
#33 0x000055e6f8971bf7 in internal_condition_case
    (bfun=bfun@entry=0x55e6f88f5a80 <command_loop_1>, handlers=handlers@entry=XIL(0x90), hfun=hfun@entry=0x55e6f88e8b60 <cmd_error>)
    at eval.c:1474
#34 0x000055e6f88e11ea in command_loop_2 (handlers=handlers@entry=XIL(0x90)) at keyboard.c:1125
#35 0x000055e6f8971b39 in internal_catch
    (tag=tag@entry=XIL(0x6b10), func=func@entry=0x55e6f88e11c0 <command_loop_2>, arg=arg@entry=XIL(0x90)) at eval.c:1197
#36 0x000055e6f88e113c in command_loop () at lisp.h:1171
#37 0x000055e6f88e86b8 in recursive_edit_1 () at keyboard.c:712
#38 0x000055e6f88e8a60 in Frecursive_edit () at keyboard.c:795
#39 0x000055e6f89bc627 in exec_byte_code (fun=<optimised out>, args_template=<optimised out>, nargs=<optimised out>, args=<optimised out>)
    at lisp.h:2204
#40 0x000055e6f89735f7 in Ffuncall (nargs=3, args=0x7fb66f8323d8) at eval.c:2995
#41 0x000055e6f8973b0a in Fapply (nargs=3, args=0x7fb66f8323d8) at eval.c:2623
#42 0x000055e6f89bc627 in exec_byte_code (fun=<optimised out>, args_template=<optimised out>, nargs=<optimised out>, args=<optimised out>)
    at lisp.h:2204
#43 0x000055e6f8978c0f in apply_lambda (fun=<optimised out>, args=<optimised out>, count=...) at eval.c:3103
#44 0x000055e6f8976d4b in eval_sub (form=<optimised out>) at eval.c:2588
#45 0x000055e6f8978bce in apply_lambda (fun=<optimised out>, args=<optimised out>, count=...) at eval.c:3098
#46 0x000055e6f8976d4b in eval_sub (form=<optimised out>) at eval.c:2588
#47 0x000055e6f897862d in Fprogn (body=XIL(0)) at eval.c:436
#48 funcall_lambda (fun=XIL(0x55e6fc3bf1f3), nargs=0, arg_vector=0x7fb66f832168) at eval.c:3233
#49 0x000055e6f89bc366 in exec_byte_code (fun=<optimised out>, args_template=<optimised out>, nargs=<optimised out>, args=<optimised out>)
    at bytecode.c:811
#50 0x000055e6f8978c0f in apply_lambda (fun=<optimised out>, args=<optimised out>, count=...) at eval.c:3103
#51 0x000055e6f8976d4b in eval_sub (form=<optimised out>) at eval.c:2588
#52 0x000055e6f897862d in Fprogn (body=XIL(0)) at eval.c:436
#53 funcall_lambda (fun=XIL(0x55e6f9db7153), nargs=1, arg_vector=0x7ffe8098b670) at eval.c:3233
#54 0x000055e6f8978c0f in apply_lambda (fun=<optimised out>, args=<optimised out>, count=...) at eval.c:3103
#55 0x000055e6f8976d4b in eval_sub (form=<optimised out>) at eval.c:2588
#56 0x000055e6f8978bce in apply_lambda (fun=<optimised out>, args=<optimised out>, count=...) at eval.c:3098
#57 0x000055e6f8976d4b in eval_sub (form=<optimised out>) at eval.c:2588
#58 0x000055e6f897723d in eval_sub (form=<optimised out>) at eval.c:2465
#59 0x000055e6f8978bce in apply_lambda (fun=<optimised out>, args=<optimised out>, count=...) at eval.c:3098
#60 0x000055e6f8976d4b in eval_sub (form=<optimised out>) at eval.c:2588
#61 0x000055e6f897774d in Fand (args=XIL(0)) at eval.c:370
#62 0x000055e6f8977428 in eval_sub (form=<optimised out>) at lisp.h:2204
#63 0x000055e6f8979496 in FletX (args=XIL(0x55e6fd7f83c3)) at lisp.h:1522
#64 0x000055e6f8977428 in eval_sub (form=<optimised out>) at lisp.h:2204
#65 0x000055e6f8977f97 in Fprog1 (args=XIL(0x55e6fd7f7c13)) at lisp.h:1516
#66 0x000055e6f8977428 in eval_sub (form=<optimised out>) at lisp.h:2204
#67 0x000055e6f89797eb in Funwind_protect (args=XIL(0x55e6fd7f7c73)) at lisp.h:1516
#68 0x000055e6f8977428 in eval_sub (form=<optimised out>) at lisp.h:2204
#69 0x000055e6f8979235 in Fprogn (body=XIL(0)) at eval.c:436
#70 Flet (args=<optimised out>) at eval.c:1026
#71 0x000055e6f8977428 in eval_sub (form=<optimised out>) at lisp.h:2204
#72 0x000055e6f8979235 in Fprogn (body=XIL(0)) at eval.c:436
#73 Flet (args=<optimised out>) at eval.c:1026
#74 0x000055e6f8977428 in eval_sub (form=<optimised out>) at lisp.h:2204
#75 0x000055e6f8977ef5 in Fprogn (body=XIL(0x55e6fd8a41b3)) at eval.c:436
#76 prog_ignore (body=XIL(0x55e6fd7f7e03)) at eval.c:447
#77 Fwhile (args=<optimised out>) at eval.c:1047
#78 0x000055e6f8977428 in eval_sub (form=<optimised out>) at lisp.h:2204
#79 0x000055e6f897964d in Fprogn (body=XIL(0)) at eval.c:436
#80 FletX (args=XIL(0x55e6fd7f7eb3)) at eval.c:958
#81 0x000055e6f8977428 in eval_sub (form=<optimised out>) at lisp.h:2204
#82 0x000055e6f897862d in Fprogn (body=XIL(0)) at eval.c:436
#83 funcall_lambda (fun=XIL(0x55e6fd7f7f93), nargs=1, arg_vector=0x7ffe8098c4e0) at eval.c:3233
#84 0x000055e6f89735f7 in Ffuncall (nargs=nargs@entry=2, args=args@entry=0x7ffe8098c4d8) at eval.c:2995
#85 0x000055e6f896f293 in Ffuncall_interactively (nargs=2, args=0x7ffe8098c4d8) at callint.c:248
#86 0x000055e6f89735f7 in Ffuncall (nargs=nargs@entry=3, args=args@entry=0x7ffe8098c4d0) at eval.c:2995
#87 0x000055e6f89708d3 in Fcall_interactively (function=<optimised out>, record_flag=<optimised out>, keys=<optimised out>)
    at callint.c:785
#88 0x00007fb6706bdc95 in F636f6d6d616e642d65786563757465_command_execute_0 ()
    at /home/mickey/Downloads/emacs/src/../native-lisp/30.0.50-7cb43add/preloaded/simple-fab5b0cf-b9ebea66.eln
#89 0x000055e6f89735f7 in Ffuncall (nargs=nargs@entry=2, args=args@entry=0x7ffe8098c7b0) at eval.c:2995
#90 0x000055e6f88f5ea0 in call1 (arg1=<optimised out>, fn=XIL(0x4c20)) at lisp.h:3247
#91 command_loop_1 () at keyboard.c:1495
#92 0x000055e6f8971bf7 in internal_condition_case
    (bfun=bfun@entry=0x55e6f88f5a80 <command_loop_1>, handlers=handlers@entry=XIL(0x90), hfun=hfun@entry=0x55e6f88e8b60 <cmd_error>)
    at eval.c:1474
#93 0x000055e6f88e11ea in command_loop_2 (handlers=handlers@entry=XIL(0x90)) at keyboard.c:1125
#94 0x000055e6f8971b39 in internal_catch
    (tag=tag@entry=XIL(0xffc0), func=func@entry=0x55e6f88e11c0 <command_loop_2>, arg=arg@entry=XIL(0x90)) at eval.c:1197
#95 0x000055e6f88e1186 in command_loop () at lisp.h:1171
#96 0x000055e6f88e86b8 in recursive_edit_1 () at keyboard.c:712
#97 0x000055e6f88e8a60 in Frecursive_edit () at keyboard.c:795
#98 0x000055e6f87b23b8 in main (argc=<optimised out>, argv=<optimised out>) at emacs.c:2529

--- END ---



In GNU Emacs 30.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version
 3.24.20, cairo version 1.16.0) of 2022-11-29 built on mickey-work
Repository revision: 7939184f8e0370e7a3397d492812c6d202c2a193
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: Ubuntu 20.04.3 LTS

Configured using:
 'configure --with-native-compilation --with-json --with-mailutils
 --without-compress-install --with-imagemagick CC=gcc-10'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ
IMAGEMAGICK JPEG JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2
M17N_FLT MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP
SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER X11 XDBE
XIM XINPUT2 XPM GTK3 ZLIB

Important settings:
  value of $LC_MONETARY: en_GB.UTF-8
  value of $LC_NUMERIC: en_GB.UTF-8
  value of $LC_TIME: en_GB.UTF-8
  value of $LANG: en_GB.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 24 Dec 2022 07:24:01 GMT)

Message #8 received at 60237 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mickey Petersen <mickey <at> masteringemacs.org>, Yuan Fu <casouri <at> gmail.com>
Cc: 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50;
 tree sitter core dumps when I edebug view a node
Date: Sat, 24 Dec 2022 09:23:32 +0200
> From: Mickey Petersen <mickey <at> masteringemacs.org>
> Date: Wed, 21 Dec 2022 12:24:34 +0000

Yuan, can you look into this?  The crash is in tree-sitter, so maybe
it isn't our bug, but I'd like to be sure.  And even if it is a
tree-sitter bug, maybe we can work around it to prevent Emacs from
crashing?





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 24 Dec 2022 09:21:01 GMT)

Message #11 received at 60237 <at> debbugs.gnu.org:

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Mickey Petersen <mickey <at> masteringemacs.org>, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a 
 node
Date: Sat, 24 Dec 2022 01:20:19 -0800
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Mickey Petersen <mickey <at> masteringemacs.org>
>> Date: Wed, 21 Dec 2022 12:24:34 +0000
>
> Yuan, can you look into this?  The crash is in tree-sitter, so maybe
> it isn't our bug, but I'd like to be sure.  And even if it is a
> tree-sitter bug, maybe we can work around it to prevent Emacs from
> crashing?

Absolutely.

>> Happens in emacs -Q (after loading some simple elisp code that uses treesit.el) and consistently and repeatedly.
>> 
>> 
>> Here's the elisp. When I edebug it I can step and view all the
>> variables and expressions I like. The `combobulate-' functions are
>> widely used in the library and pose no issues anywhere else and do
>> nothing more than fetch nodes via tree sitter. It is only this bit of
>> code that blows up, and then only when invoked inside a python
>> string.

It would be nice if you could put together a recipe to reproduce this.
Judging from the backtrace, you can probably trigger it by printing the
node with print or princ.  Does it trigger on all Python strings, or only
on some specific string in some specific Python source?

Yuan
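
A possible starting point for such a recipe, sketched under stated assumptions: Emacs 29 or later built with tree-sitter, and an installed Python grammar. The function name `my/print-python-string-node' and the sample Python line are hypothetical, and nothing in this thread confirms that this snippet alone triggers the crash.

```elisp
;; Sketch only: assumes Emacs 29+ with tree-sitter support and a
;; Python grammar installed.  The function name is hypothetical.
(require 'treesit)

(defun my/print-python-string-node ()
  "Parse a tiny Python snippet and print the node at a position inside its string.
Return the printed representation, or nil if no Python grammar is available."
  (when (and (treesit-available-p)
             (treesit-language-available-p 'python))
    (with-temp-buffer
      (insert "x = \"hello\"\n")
      (let* ((parser (treesit-parser-create 'python))
             ;; Position 7 falls inside the string literal.
             (node (treesit-node-at 7 parser)))
        ;; Printing a node is expected to go through the same printer
        ;; path seen in the backtrace (print_vectorlike, then
        ;; Ftreesit_node_type).
        (prin1-to-string node)))))
```

Evaluating `(my/print-python-string-node)' in emacs -Q, with and without edebug instrumentation, would show whether printing alone is enough or whether the signaled errors mentioned in the report are also needed.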




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Thu, 29 Dec 2022 14:24:01 GMT)

Message #14 received at 60237 <at> debbugs.gnu.org:

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Thu, 29 Dec 2022 14:21:35 +0000
Yuan Fu <casouri <at> gmail.com> writes:

> Eli Zaretskii <eliz <at> gnu.org> writes:
>
>>> From: Mickey Petersen <mickey <at> masteringemacs.org>
>>> Date: Wed, 21 Dec 2022 12:24:34 +0000
>>
>> Yuan, can you look into this?  The crash is in tree-sitter, so maybe
>> it isn't our bug, but I'd like to be sure.  And even if it is a
>> tree-sitter bug, maybe we can work around it to prevent Emacs from
>> crashing?
>
> Absolutely.
>
>>> Happens in emacs -Q (after loading some simple elisp code that uses treesit.el) and consistently and repeatedly.
>>>
>>>
>>> Here's the elisp. When I edebug it I can step and view all the
>>> variables and expressions I like. The `combobulate-' functions are
>>> widely used in the library and pose no issues anywhere else and do
>>> nothing more than fetch nodes via tree sitter. It is only this bit of
>>> code that blows up, and then only when invoked inside a python
>>> string.
>
> It would be nice if you could put together a recipe to reproduce this.
> Judging from the backtrace, you can probably trigger it by printing the
> node with print or princ.  Does it trigger on all Python strings, or
> only on some specific string in some specific Python source?
>

This issue seems entirely related to `M-x treesit-explore-mode` (and possibly the inspect variant as well), though it is hard to reproduce reliably. I get either crashes or hangs, depending on whether I have edebug on or not.

Thrown errors seem to be the common denominator?


> Yuan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Fri, 24 Feb 2023 23:23:02 GMT)

Message #17 received at 60237 <at> debbugs.gnu.org:

From: Yuan Fu <casouri <at> gmail.com>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: eliz <at> gnu.org, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a 
 node
Date: Fri, 24 Feb 2023 15:22:00 -0800
Mickey Petersen <mickey <at> masteringemacs.org> writes:

> Yuan Fu <casouri <at> gmail.com> writes:
>
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>
>>>> From: Mickey Petersen <mickey <at> masteringemacs.org>
>>>> Date: Wed, 21 Dec 2022 12:24:34 +0000
>>>
>>> Yuan, can you look into this?  The crash is in tree-sitter, so maybe
>>> it isn't our bug, but I'd like to be sure.  And even if it is a
>>> tree-sitter bug, maybe we can work around it to prevent Emacs from
>>> crashing?
>>
>> Absolutely.
>>
>>>> Happens in emacs -Q (after loading some simple elisp code that uses treesit.el) and consistently and repeatedly.
>>>>
>>>>
>>>> Here's the elisp. When I edebug it I can step and view all the
>>>> variables and expressions I like. The `combobulate-' functions are
>>>> widely used in the library and pose no issues anywhere else and do
>>>> nothing more than fetch nodes via tree sitter. It is only this bit of
>>>> code that blows up, and then only when invoked inside a python
>>>> string.
>>
>> It would be nice if you could put together a recipe to reproduce this.
>> Judging from the backtrace, you can probably trigger it by printing the
>> node with print or princ.  Does it trigger on all Python strings, or
>> only on some specific string in some specific Python source?
>>
>
> This issue seems entirely related to `M-x treesit-explore-mode` (and
> possibly the inspect variant also) though it is hard to reproduce
> reliably. I get either crashes or hangs, depending on whether I have
> edebug on or not.
>
> Thrown errors seem to be the common denominator?

I’ve stumbled on a reliable way to trigger a crash, possibly with the
same cause as this one: enable the profiler and run M-x garbage-collect
in a tree-sitter mode on Mac. I tried to reproduce this on Linux, but
without success.

I was also able to trigger an infinite loop with the same recipe one
time, but I didn’t run that session under lldb. Anyway, we can focus on
the crash first.

The backtrace is below. Eli, can you see anything in it? I still have
the lldb session live, so let me know if you want me to look at
anything. Unfortunately I can’t get gdb to work on Mac.

I suspect I made some stupid mistake concerning the GC of tree-sitter
objects. Do you see anything suspicious in the following description?

A Lisp_TS_Parser contains a TSParser and a TSTree, which are freed when
the Lisp_TS_Parser is collected. A Lisp_TS_Node references the parser
from which it is created, so that a Lisp_TS_Parser is only collected
when no live node references it, because the Lisp_TS_Node references the
TSTree stored in the Lisp_TS_Parser.
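
As a toy model of that ownership rule, here is a sketch in C. The names
are hypothetical stand-ins, not the real structs in treesit.c, and a
plain mark bit plays the role of the actual GC:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Lisp_TS_Parser.  */
struct sketch_parser
{
  bool marked;      /* set during the mark phase */
  bool tree_freed;  /* stands in for ts_tree_delete + ts_parser_delete */
};

/* Hypothetical stand-in for Lisp_TS_Node.  */
struct sketch_ts_node
{
  struct sketch_parser *parser;  /* keeps the parser, hence its tree, alive */
};

/* Mark phase: marking a live node also marks the parser it came from.  */
void
sketch_mark_node (struct sketch_ts_node *node)
{
  assert (node && node->parser);
  node->parser->marked = true;
}

/* Sweep phase: release the tree of every unmarked parser that still
   has one, then clear all marks for the next cycle.  Return the
   number of parsers collected in this cycle.  */
size_t
sketch_sweep (struct sketch_parser *parsers, size_t count)
{
  size_t collected = 0;
  for (size_t i = 0; i < count; i++)
    {
      if (!parsers[i].marked && !parsers[i].tree_freed)
        {
          parsers[i].tree_freed = true;
          collected++;
        }
      parsers[i].marked = false;
    }
  return collected;
}
```

In this model a parser's tree survives a sweep only if some node that
points at the parser was marked first.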


* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x0000000100250f3d emacs`ASIZE(array=0x00000001a1889245) at lisp.h:1768:3
   1765	ASIZE (Lisp_Object array)
   1766	{
   1767	  ptrdiff_t size = XVECTOR (array)->header.size;
-> 1768	  eassume (0 <= size);
   1769	  return size;
   1770	}
   1771
Target 0: (emacs) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x0000000100250f3d emacs`ASIZE(array=0x00000001a1889245) at lisp.h:1768:3
    frame #1: 0x0000000100250e5e emacs`get_backtrace(array=0x00000001a1889245) at eval.c:4193:28
    frame #2: 0x00000001003001ce emacs`record_backtrace(log=0x00000001a1887d68, count=64) at profiler.c:162:3
    frame #3: 0x000000010030016d emacs`malloc_probe(size=64) at profiler.c:509:3
    frame #4: 0x0000000100204e6d emacs`xmalloc(size=64) at alloc.c:760:3
    frame #5: 0x0000000100e6c0c9 libtree-sitter.0.dylib`ts_subtree_release + 158
    frame #6: 0x0000000100e6f004 libtree-sitter.0.dylib`ts_tree_delete + 44
    frame #7: 0x0000000100307379 emacs`treesit_delete_parser(lisp_parser=0x00000001a2c0f0e0) at treesit.c:1182:3
    frame #8: 0x0000000100212c1b emacs`cleanup_vector(vector=0x00000001a2c0f0e0) at alloc.c:3179:5
    frame #9: 0x00000001002124c9 emacs`sweep_vectors at alloc.c:3254:5
    frame #10: 0x000000010020c777 emacs`gc_sweep at alloc.c:7430:3
    frame #11: 0x000000010020bb67 emacs`garbage_collect at alloc.c:6262:3
    frame #12: 0x000000010020b706 emacs`maybe_garbage_collect at alloc.c:6107:5
    frame #13: 0x00000001002b4bea emacs`maybe_gc at lisp.h:5591:5
    frame #14: 0x00000001002afcaa emacs`exec_byte_code(fun=0x0000000107557b85, args_template=256, nargs=1, args=0x000000010809e338) at bytecode.c:782:6
    frame #15: 0x0000000100251e77 emacs`fetch_and_exec_byte_code(fun=0x00000001a1396be5, args_template=514, nargs=2, args=0x00007ff7bfefc628) at eval.c:3081:10
    frame #16: 0x000000010024e7c1 emacs`funcall_lambda(fun=0x00000001a1396be5, nargs=2, arg_vector=0x00007ff7bfefc628) at eval.c:3153:9
    frame #17: 0x000000010024e0a7 emacs`funcall_general(fun=0x00000001a1396be5, numargs=2, args=0x00007ff7bfefc628) at eval.c:2945:12
    frame #18: 0x00000001002494b4 emacs`Ffuncall(nargs=3, args=0x00007ff7bfefc620) at eval.c:2995:21
    frame #19: 0x000000010024d61c emacs`Fapply(nargs=2, args=0x00007ff7bfefc738) at eval.c:2666:24
    frame #20: 0x0000000100245752 emacs`apply1(fn=0x00000000a0a4d740, arg=0x00000001a38c59b3) at eval.c:2882:43
    frame #21: 0x00000001002cb671 emacs`read_process_output_call(fun_and_args=0x00000001a38c59c3) at process.c:6070:10
    frame #22: 0x000000010024a493 emacs`internal_condition_case_1(bfun=(emacs`read_process_output_call at process.c:6069), arg=0x00000001a38c59c3, handlers=0x0000000000000090, hfun=(emacs`read_process_output_error_handler at process.c:6075)) at eval.c:1498:25
    frame #23: 0x00000001002cb5c3 emacs`read_and_dispose_of_process_output(p=0x00000001a1448440, chars="Content-Length: 1184\r\n\r\n{\"jsonrpc\":\"2.0\",\"method\":\"textDocument/publishDiagnostics\",\"params\":{\"uri\":\"file:///Users/yuan/t/js/test.ts\",\"diagnostics\":[{\"range\":{\"start\":{\"line\":4,\"character\":11},\"end\":{\"line\":4,\"character\":14}},\"message\":\"Property 'get' does not exist on type 'Document'.\",\"severity\":1,\"code\":2339,\"source\":\"typescript\",\"tags\":[]},{\"range\":{\"start\":{\"line\":0,\"character\":14},\"end\":{\"line\":0,\"character\":15}},\"message\":\"Parameter 'a' implicitly has an 'any' type, but a better type may be inferred from usage.\",\"severity\":4,\"code\":7044,\"source\":\"typescript\",\"tags\":[]},{\"range\":{\"start\":{\"line\":0,\"character\":14},\"end\":{\"line\":0,\"character\":15}},\"message\":\"'a' is declared but its value is never read.\",\"severity\":4,\"code\":6133,\"source\":\"typescript\",\"tags\":[1]},{\"range\":{\"start\":{\"line\":0,\"character\":17},\"end\":{\"line\":0,\"character\":18}},\"message\":\"Parameter 'b' implicitly has an 'any' type, but a better type may be inferred from usage.\",\"severity\":4,\"code\":7044,\"source\":\"typescript\",\"tags\":[]},{\"range\":{\""..., nbytes=1208, coding=0x00000001577aed10) at process.c:6294:5
    frame #24: 0x00000001002c46df emacs`read_process_output(proc=0x00000001a1448445, channel=26) at process.c:6204:3
    frame #25: 0x00000001002c3097 emacs`wait_reading_process_output(time_limit=5, nsecs=0, read_kbd=-1, do_display=true, wait_for_cell=0x0000000000000000, wait_proc=0x0000000000000000, just_wait_proc=0) at process.c:5888:16
    frame #26: 0x000000010000c881 emacs`sit_for(timeout=0x0000000000000016, reading=true, display_option=1) at dispnew.c:6256:7
    frame #27: 0x0000000100166ffd emacs`read_char(commandflag=1, map=0x0000000193762fb3, prev_event=0x0000000000000000, used_mouse_menu=0x00007ff7bfefeb1f, end_time=0x0000000000000000) at keyboard.c:2872:11
    frame #28: 0x0000000100162e18 emacs`read_key_sequence(keybuf=0x00007ff7bfefee40, prompt=0x0000000000000000, dont_downcase_last=false, can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:10074:12
    frame #29: 0x00000001001611c0 emacs`command_loop_1 at keyboard.c:1375:15
    frame #30: 0x000000010024a3c8 emacs`internal_condition_case(bfun=(emacs`command_loop_1 at keyboard.c:1269), handlers=0x0000000000000090, hfun=(emacs`cmd_error at keyboard.c:927)) at eval.c:1474:25
    frame #31: 0x0000000100160c63 emacs`command_loop_2(handlers=0x0000000000000090) at keyboard.c:1124:11
    frame #32: 0x0000000100249b53 emacs`internal_catch(tag=0x000000000000f3f0, func=(emacs`command_loop_2 at keyboard.c:1120), arg=0x0000000000000090) at eval.c:1197:25
    frame #33: 0x000000010015ffda emacs`command_loop at keyboard.c:1102:2
    frame #34: 0x000000010015fddf emacs`recursive_edit_1 at keyboard.c:711:9
    frame #35: 0x0000000100160352 emacs`Frecursive_edit at keyboard.c:794:3
    frame #36: 0x000000010015d177 emacs`main(argc=1, argv=0x00007ff7bfeff6a8) at emacs.c:2529:3
    frame #37: 0x00007ff81a314310 dyld`start + 2432
(lldb) up 4
frame #4: 0x0000000100204e6d emacs`xmalloc(size=64) at alloc.c:760:3
   757
   758 	  if (!val)
   759 	    memory_full (size);
-> 760 	  MALLOC_PROBE (size);
   761 	  return val;
   762 	}
   763
(lldb) list
   764 	/* Like the above, but zeroes out the memory just allocated.  */
   765
   766 	void *
   767 	xzalloc (size_t size)
   768 	{
   769 	  void *val;
   770
(lldb) up
frame #5: 0x0000000100e6c0c9 libtree-sitter.0.dylib`ts_subtree_release + 158
libtree-sitter.0.dylib`ts_subtree_release:
->  0x100e6c0c9 <+158>: movq   -0x30(%rbp), %rdi
    0x100e6c0cd <+162>: movq   %rax, 0x10(%rdi)
    0x100e6c0d1 <+166>: movl   %r15d, 0x1c(%rdi)
    0x100e6c0d5 <+170>: movl   0x18(%rdi), %eax
(lldb) down
frame #4: 0x0000000100204e6d emacs`xmalloc(size=64) at alloc.c:760:3
   757
   758 	  if (!val)
   759 	    memory_full (size);
-> 760 	  MALLOC_PROBE (size);
   761 	  return val;
   762 	}
   763
(lldb) down
frame #3: 0x000000010030016d emacs`malloc_probe(size=64) at profiler.c:509:3
   506 	malloc_probe (size_t size)
   507 	{
   508 	  eassert (HASH_TABLE_P (memory_log));
-> 509 	  record_backtrace (XHASH_TABLE (memory_log), min (size, MOST_POSITIVE_FIXNUM));
   510 	}
   511
   512 	DEFUN ("function-equal", Ffunction_equal, Sfunction_equal, 2, 2, 0,
(lldb) down
frame #2: 0x00000001003001ce emacs`record_backtrace(log=0x00000001a1887d68, count=64) at profiler.c:162:3
   159 	  /* Get a "working memory" vector.  */
   160 	  Lisp_Object backtrace = HASH_VALUE (log, index);
   161 	  eassert (BASE_EQ (Qunbound, HASH_KEY (log, index)));
-> 162 	  get_backtrace (backtrace);
   163
   164 	  { /* We basically do a `gethash+puthash' here, except that we have to be
   165 	       careful to avoid memory allocation since we're in a signal
(lldb) down
frame #1: 0x0000000100250e5e emacs`get_backtrace(array=0x00000001a1889245) at eval.c:4193:28
   4190	get_backtrace (Lisp_Object array)
   4191	{
   4192	  union specbinding *pdl = backtrace_next (backtrace_top ());
-> 4193	  ptrdiff_t i = 0, asize = ASIZE (array);
   4194
   4195	  /* Copy the backtrace contents into working memory.  */
   4196	  for (; i < asize; i++)
(lldb) down
frame #0: 0x0000000100250f3d emacs`ASIZE(array=0x00000001a1889245) at lisp.h:1768:3
   1765	ASIZE (Lisp_Object array)
   1766	{
   1767	  ptrdiff_t size = XVECTOR (array)->header.size;
-> 1768	  eassume (0 <= size);
   1769	  return size;
   1770	}
   1771
(lldb) p size
(ptrdiff_t) $0 = -9223372036854775792
(lldb) p XVECTOR(array)
(Lisp_Vector *) $1 = 0x00000001a1889240
(lldb) p XVECTOR(array)->header
(vectorlike_header) $2 = (size = -9223372036854775792)
(lldb) p *array
error: expression failed to parse:
error: <user expression 3>:1:1: incomplete type 'Lisp_X' where a complete type is required
*array
^
(lldb)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Fri, 24 Feb 2023 23:30:02 GMT)

Message #20 received at 60237 <at> debbugs.gnu.org:

From: Yuan Fu <casouri <at> gmail.com>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: eliz <at> gnu.org, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a 
 node
Date: Fri, 24 Feb 2023 15:29:15 -0800
Yuan Fu <casouri <at> gmail.com> writes:

> Mickey Petersen <mickey <at> masteringemacs.org> writes:
>
>> Yuan Fu <casouri <at> gmail.com> writes:
>>
>>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>>
>>>>> From: Mickey Petersen <mickey <at> masteringemacs.org>
>>>>> Date: Wed, 21 Dec 2022 12:24:34 +0000
>>>>
>>>> Yuan, can you look into this?  The crash is in tree-sitter, so maybe
>>>> it isn't our bug, but I'd like to be sure.  And even if it is a
>>>> tree-sitter bug, maybe we can work around it to prevent Emacs from
>>>> crashing?
>>>
>>> Absolutely.
>>>
>>>>> Happens in emacs -Q (after loading some simple elisp code that uses treesit.el) and consistently and repeatedly.
>>>>>
>>>>>
>>>>> Here's the elisp. When I edebug it I can step and view all the
>>>>> variables and expressions I like. The `combobulate-' functions are
>>>>> widely used in the library and pose no issues anywhere else and do
>>>>> nothing more than fetch nodes via tree sitter. It is only this bit of
>>>>> code that blows up, and then only when invoked inside a python
>>>>> string.
>>>
>>> It would be nice if you could put together a recipe to reproduce this.
>>> Judging from the backtrace, you can probably trigger it by printing the
>>> node with print or princ.  Does it trigger on all Python strings, or
>>> only on some specific string in some specific Python source?
>>>
>>
>> This issue seems entirely related to `M-x treesit-explore-mode` (and
>> possibly the inspect variant also) though it is hard to reproduce
>> reliably. I get either crashes or hangs, depending on whether I have
>> edebug on or not.
>>
>> Thrown errors seem to be the common denominator?
>
> I’ve stumbled on a reliable way to trigger a crash, possibly with the
> same cause as this one: enable the profiler and run M-x garbage-collect
> in a tree-sitter mode on Mac. I tried to reproduce this on Linux, but
> without success.
>
> I was also able to trigger an infinite loop with the same recipe one
> time, but I didn’t run that session under lldb. Anyway, we can focus on
> the crash first.

In case it helps us understand the problem better, here is the
backtrace for the infinite loop. I’m not sure why treesit_delete_parser
would trigger GC, as it just calls two tree-sitter functions:

void
treesit_delete_parser (struct Lisp_TS_Parser *lisp_parser)
{
  ts_tree_delete (lisp_parser->tree);
  ts_parser_delete (lisp_parser->parser);
}


Date/Time:        2023-02-24 15:08:13.620 -0800
End time:         2023-02-24 15:09:53.381 -0800
OS Version:       macOS 13.2.1 (Build 22D68)
Architecture:     x86_64h
Report Version:   40
Incident Identifier: B9F9C8A6-5293-4B70-935A-FAB1EF623EB2

Data Source:      Stackshots
Shared Cache:     57815A20-AF2C-3B56-9006-23ABDE7962B0 slid base address 0x7ff81a27a000, slide 0x1a27a000 (System Primary)
Shared Cache:     E1E267C5-FE0B-3ED9-86BE-E4F329F01460 slid base address 0x7ff818076000, slide 0x18076000 (DriverKit)

Command:          emacs
Path:             /Users/USER/*/emacs
Architecture:     x86_64
Parent:           fish [11593] [unique pid 110818]
Responsible:      iTerm2 [1312]
PID:              11610
Time Since Fork:  186s

Event:            hang
Duration:         99.76s
Duration Sampled: 4.20s (process was unresponsive for 96 seconds before sampling)
Steps:            42 (100ms sampling interval)

Hardware model:   MacBookPro16,3
Active cpus:      8
HW page size:     4096
VM page size:     4096

Time Since Boot:  259463s
Time Awake Since Boot: 136664s
Time Since Wake:  1506s

Fan speed:        4411 rpm -> 4591 (+180)
Total CPU Time:   8.453s (31.1G cycles, 29.7G instructions, 1.05c/i)
Advisory levels:  Battery -> 3, User -> 2, ThermalPressure -> 1, Combined -> 2
Free disk space:  65.61 GB/465.63 GB, low space threshold 3072 MB
Vnodes Available: 73.05% (192242/263168)

Preferred User Language: en-US, zh-Hans-US
Country Code:     US
Keyboards:        ABC
OS Cryptex File Extents: 2417

--------------------------------------------------
Timeline format: stacks are sorted chronologically
Use -i and -heavy to re-report with count sorting
--------------------------------------------------


Heaviest stack for the main thread of the target process:
  42  start + 2432 (dyld + 25360) [0x7ff81a314310]
  42  main + 7399 (emacs.c:2529,3 in emacs + 1429879) [0x10fc30177]
  42  Frecursive_edit + 306 (keyboard.c:794,3 in emacs + 1442642) [0x10fc33352]
  42  recursive_edit_1 + 255 (keyboard.c:711,9 in emacs + 1441247) [0x10fc32ddf]
  42  command_loop + 282 (keyboard.c:1102,2 in emacs + 1441754) [0x10fc32fda]
  42  internal_catch + 67 (eval.c:1197,25 in emacs + 2399059) [0x10fd1cb53]
  42  command_loop_2 + 35 (keyboard.c:1124,11 in emacs + 1444963) [0x10fc33c63]
  42  internal_condition_case + 136 (eval.c:1474,25 in emacs + 2401224) [0x10fd1d3c8]
  42  command_loop_1 + 2627 (keyboard.c:1494,13 in emacs + 1447651) [0x10fc346e3]
  42  call1 + 60 (lisp.h:3247,10 in emacs + 1464844) [0x10fc38a0c]
  42  Ffuncall + 324 (eval.c:2995,21 in emacs + 2397364) [0x10fd1c4b4]
  42  funcall_general + 279 (eval.c:2945,12 in emacs + 2416807) [0x10fd210a7]
  42  funcall_lambda + 385 (eval.c:3153,9 in emacs + 2418625) [0x10fd217c1]
  42  fetch_and_exec_byte_code + 87 (eval.c:3081,10 in emacs + 2432631) [0x10fd24e77]
  42  exec_byte_code + 3739 (bytecode.c:809,14 in emacs + 2817595) [0x10fd82e3b]
  42  funcall_subr + 401 (eval.c:3038,15 in emacs + 2417601) [0x10fd213c1]
  42  Fcall_interactively + 1057 (callint.c:342,36 in emacs + 2363633) [0x10fd140f1]
  42  Fapply + 2348 (eval.c:2666,24 in emacs + 2414108) [0x10fd2061c]
  42  Ffuncall + 324 (eval.c:2995,21 in emacs + 2397364) [0x10fd1c4b4]
  42  funcall_general + 197 (eval.c:2941,12 in emacs + 2416725) [0x10fd21055]
  42  funcall_subr + 810 (eval.c:3059,9 in emacs + 2418010) [0x10fd2155a]
  42  Ffuncall_interactively + 47 (callint.c:250,32 in emacs + 2362479) [0x10fd13c6f]
  42  Ffuncall + 324 (eval.c:2995,21 in emacs + 2397364) [0x10fd1c4b4]
  42  funcall_general + 279 (eval.c:2945,12 in emacs + 2416807) [0x10fd210a7]
  42  funcall_lambda + 385 (eval.c:3153,9 in emacs + 2418625) [0x10fd217c1]
  42  fetch_and_exec_byte_code + 87 (eval.c:3081,10 in emacs + 2432631) [0x10fd24e77]
  42  exec_byte_code + 3338 (bytecode.c:782,6 in emacs + 2817194) [0x10fd82caa]
  42  maybe_gc + 26 (lisp.h:5591,5 in emacs + 2837482) [0x10fd87bea]
  42  maybe_garbage_collect + 38 (alloc.c:6107,5 in emacs + 2144006) [0x10fcde706]
  42  garbage_collect + 999 (alloc.c:6262,3 in emacs + 2145127) [0x10fcdeb67]
  42  gc_sweep + 39 (alloc.c:7430,3 in emacs + 2148215) [0x10fcdf777]
  42  sweep_vectors + 297 (alloc.c:3254,5 in emacs + 2172105) [0x10fce54c9]
  42  cleanup_vector + 523 (alloc.c:3179,5 in emacs + 2173979) [0x10fce5c1b]
  42  treesit_delete_parser + 25 (treesit.c:1182,3 in emacs + 3175289) [0x10fdda379]
  42  ts_tree_delete + 44 (libtree-sitter.0.0.dylib + 114692) [0x110942004]
  42  ts_subtree_release + 158 (libtree-sitter.0.0.dylib + 102601) [0x11093f0c9]
  42  xmalloc + 77 (alloc.c:760,3 in emacs + 2117229) [0x10fcd7e6d]
  42  malloc_probe + 93 (profiler.c:509,3 in emacs + 3146093) [0x10fdd316d]
  42  record_backtrace + 95 (profiler.c:169,19 in emacs + 3146207) [0x10fdd31df]
  42  hash_lookup + 90 (fns.c:4693,44 in emacs + 2505546) [0x10fd36b4a]
  42  ??? [0x7fa1909ed180]
  42  _sigtramp + 29 (libsystem_platform.dylib + 15389) [0x7ff81a671c1d]
  42  deliver_fatal_thread_signal + 26 (sysdep.c:1795,3 in emacs + 1650762) [0x10fc6604a]
  42  deliver_thread_signal + 137 (sysdep.c:1775,3 in emacs + 1662777) [0x10fc68f39]
  42  handle_fatal_signal + 24 (sysdep.c:1783,3 in emacs + 1662632) [0x10fc68ea8]
  42  terminate_due_to_signal + 192 (emacs.c:447,11 in emacs + 3863584) [0x10fe82420]
  42  shut_down_emacs + 489 (emacs.c:2991,3 in emacs + 1422313) [0x10fc2e3e9]
  42  Fdo_auto_save + 309 (fileio.c:6042,18 in emacs + 1894389) [0x10fca17f5]
  42  Fexpand_file_name + 110 (fileio.c:956,13 in emacs + 1841918) [0x10fc94afe]
  42  Ffind_file_name_handler + 331 (fileio.c:324,24 in emacs + 1830395) [0x10fc91dfb]
  42  fast_string_match + 55 (lisp.h:4768,10 in emacs + 1831287) [0x10fc92177]
  42  fast_string_match_internal + 94 (search.c:487,7 in emacs + 1988174) [0x10fcb864e]
  42  compile_pattern + 599 (search.c:235,4 in emacs + 1988967) [0x10fcb8967]
  42  compile_pattern_1 + 331 (search.c:121,18 in emacs + 2021755) [0x10fcc097b]
  42  rpl_re_compile_pattern + 73 (regex-emacs.c:5170,9 in emacs + 2062489) [0x10fcca899]
  42  regex_compile + 133 (regex-emacs.c:1768,25 in emacs + 2062693) [0x10fcca965]
  42  xmalloc + 77 (alloc.c:760,3 in emacs + 2117229) [0x10fcd7e6d]
  42  malloc_probe + 93 (profiler.c:509,3 in emacs + 3146093) [0x10fdd316d]
  42  record_backtrace + 95 (profiler.c:169,19 in emacs + 3146207) [0x10fdd31df]
  42  hash_lookup + 90 (fns.c:4693,44 in emacs + 2505546) [0x10fd36b4a]
  42  ASIZE + 45 (lisp.h:1768,3 in emacs + 2442877) [0x10fd2767d]
*37  hndl_alltraps + 95 (kernel + 694399) [0xffffff800038587f]
*22  user_trap + 1218 (kernel + 2542418) [0xffffff8000548b52]
*21  exception_triage_thread + 490 (kernel + 1119322) [0xffffff80003ed45a]
*15  exception_deliver + 2172 (kernel + 1117868) [0xffffff80003eceac]
*15  mach_exception_raise + 265 (kernel + 1631513) [0xffffff800046a519]
*5   kernel_mach_msg_rpc + 689 (kernel + 1139009) [0xffffff80003f2141]
*4   ipc_port_adjust_special_reply_port_locked + 1170 (kernel + 989010) [0xffffff80003cd752]
*2   ipc_port_send_turnstile_complete + 213 (kernel + 989509) [0xffffff80003cd945]
*2   mpsc_daemon_enqueue + 177 (kernel + 1240465) [0xffffff800040ad91]
*2   ??? (kernel + 1548050) [0xffffff8000455f12]




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 25 Feb 2023 07:17:02 GMT)

Message #23 received at 60237 <at> debbugs.gnu.org:

From: Po Lu <luangruo <at> yahoo.com>
To: Yuan Fu <casouri <at> gmail.com>
Cc: eliz <at> gnu.org, Mickey Petersen <mickey <at> masteringemacs.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sat, 25 Feb 2023 15:13:44 +0800
Yuan Fu <casouri <at> gmail.com> writes:

> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
>     frame #0: 0x0000000100250f3d emacs`ASIZE(array=0x00000001a1889245) at lisp.h:1768:3
>    1765	ASIZE (Lisp_Object array)
>    1766	{
>    1767	  ptrdiff_t size = XVECTOR (array)->header.size;
> -> 1768	  eassume (0 <= size);
>    1769	  return size;
>    1770	}
>    1771

This is a bug inside the profiler: if it is trying to hook into xmalloc,
it should not call anything that can call ASIZE, because GC modifies the
mark bits inside the vector header, which happen to be stored in the
`size' field, and GC has been able to call xmalloc ever since the mark
stack stuff was installed.

Since you assume 0 <= size, LLVM is generating one of its favorite
instructions, ud2, in response to a situation you told the compiler
would never happen.  Make sure that situation doesn't happen!!

> Target 0: (emacs) stopped.
> (lldb) bt
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
>   * frame #0: 0x0000000100250f3d emacs`ASIZE(array=0x00000001a1889245) at lisp.h:1768:3
>     frame #1: 0x0000000100250e5e emacs`get_backtrace(array=0x00000001a1889245) at eval.c:4193:28
>     frame #2: 0x00000001003001ce emacs`record_backtrace(log=0x00000001a1887d68, count=64) at profiler.c:162:3
>     frame #3: 0x000000010030016d emacs`malloc_probe(size=64) at profiler.c:509:3
>     frame #4: 0x0000000100204e6d emacs`xmalloc(size=64) at alloc.c:760:3
>     frame #5: 0x0000000100e6c0c9 libtree-sitter.0.dylib`ts_subtree_release + 158
>     frame #6: 0x0000000100e6f004 libtree-sitter.0.dylib`ts_tree_delete + 44
>     frame #7: 0x0000000100307379 emacs`treesit_delete_parser(lisp_parser=0x00000001a2c0f0e0) at treesit.c:1182:3
>     frame #8: 0x0000000100212c1b emacs`cleanup_vector(vector=0x00000001a2c0f0e0) at alloc.c:3179:5
>     frame #9: 0x00000001002124c9 emacs`sweep_vectors at alloc.c:3254:5
>     frame #10: 0x000000010020c777 emacs`gc_sweep at alloc.c:7430:3
>     frame #11: 0x000000010020bb67 emacs`garbage_collect at alloc.c:6262:3
>     frame #12: 0x000000010020b706 emacs`maybe_garbage_collect at alloc.c:6107:5
>     frame #13: 0x00000001002b4bea emacs`maybe_gc at lisp.h:5591:5

BTW, where do you see GC being called from treesit_delete_parser?  What
I see is a bug in the profiler; it should use some other data structure
to store its backtraces, when its xmalloc hook is called.

GC has historically never called xmalloc, so the profiler will likely
crash upon growing the mark stack as well.  I guess another important
question is why treesit_delete_parser ends up calling xmalloc.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 25 Feb 2023 07:52:02 GMT)

Message #26 received at 60237 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: mickey <at> masteringemacs.org, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a 
 node
Date: Sat, 25 Feb 2023 09:51:32 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Fri, 24 Feb 2023 15:22:00 -0800
> Cc: eliz <at> gnu.org,
>  60237 <at> debbugs.gnu.org
> 
> I’ve stumbled on a reliable way to trigger a crash, possibly with the
> same cause as this one: enable the profiler and run M-x garbage-collect
> in a tree-sitter mode on Mac. I tried to reproduce this on Linux, but
> without success.
> 
> I was also able to trigger an infinite loop with the same recipe one
> time, but I didn’t run that session under lldb. Anyway, we can focus on
> the crash first.
> 
> The backtrace is below. Eli, can you see anything in it?
> [...]
> (lldb) down
> frame #0: 0x0000000100250f3d emacs`ASIZE(array=0x00000001a1889245) at lisp.h:1768:3
>    1765	ASIZE (Lisp_Object array)
>    1766	{
>    1767	  ptrdiff_t size = XVECTOR (array)->header.size;
> -> 1768	  eassume (0 <= size);
>    1769	  return size;
>    1770	}
>    1771
> (lldb) p size
> (ptrdiff_t) $0 = -9223372036854775792

It looks like we are calling ASIZE in the context of GC, when the
vectors have their mark bit set, which makes ASIZE return negative
values: do

  (gdb) p/x (unsigned long long)-9223372036854775792
  $1 = 0x8000000000000010

So this is 16 (10 hex) with the array mark flag bit set.  The fix is
simple:

diff --git a/src/eval.c b/src/eval.c
index 2dd0c35..7e6b742 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -4190,7 +4190,7 @@ mark_specpdl (union specbinding *first, union specbinding *ptr)
 get_backtrace (Lisp_Object array)
 {
   union specbinding *pdl = backtrace_next (backtrace_top ());
-  ptrdiff_t i = 0, asize = ASIZE (array);
+  ptrdiff_t i = 0, asize = gc_asize (array);
 
   /* Copy the backtrace contents into working memory.  */
   for (; i < asize; i++)
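
To spell out the arithmetic, here is a small sketch.  It assumes the
mark flag is the top bit of the size word, which is what the value
above suggests.  The names are made up for illustration; the real
accessor is the gc_asize used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the mark flag is the most significant bit of the
   64-bit size word.  Hypothetical name, not the one Emacs uses.  */
#define SKETCH_MARK_FLAG ((uint64_t) 1 << 63)

/* Return nonzero if the mark flag is set in RAW_SIZE.  */
int
sketch_is_marked (int64_t raw_size)
{
  return ((uint64_t) raw_size & SKETCH_MARK_FLAG) != 0;
}

/* Return the element count stored in RAW_SIZE with the mark flag
   cleared, so the result is usable while GC is in progress.  */
int64_t
sketch_unmarked_size (int64_t raw_size)
{
  int64_t size = (int64_t) ((uint64_t) raw_size & ~SKETCH_MARK_FLAG);
  assert (0 <= size);
  return size;
}
```

Feeding it the value lldb printed, -9223372036854775792, reports the
flag as set and yields 16 once the flag is masked off.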

> I suspect I made some stupid mistake concerning the GC of tree-sitter
> objects. Do you see anything suspicious in the following description?
> 
> A Lisp_TS_Parser contains a TSParser and a TSTree, which are freed when
> the Lisp_TS_Parser is collected. A Lisp_TS_Node references the parser
> from which it is created, so that a Lisp_TS_Parser is only collected
> when no live node references it, because the Lisp_TS_Node references the
> TSTree stored in the Lisp_TS_Parser.

Sounds good, but do you understand why tree-sitter calls malloc when
you GC a parser?  This is what we see in the backtrace:

>   * frame #0: 0x0000000100250f3d emacs`ASIZE(array=0x00000001a1889245) at lisp.h:1768:3
>     frame #1: 0x0000000100250e5e emacs`get_backtrace(array=0x00000001a1889245) at eval.c:4193:28
>     frame #2: 0x00000001003001ce emacs`record_backtrace(log=0x00000001a1887d68, count=64) at profiler.c:162:3
>     frame #3: 0x000000010030016d emacs`malloc_probe(size=64) at profiler.c:509:3
>     frame #4: 0x0000000100204e6d emacs`xmalloc(size=64) at alloc.c:760:3
>     frame #5: 0x0000000100e6c0c9 libtree-sitter.0.dylib`ts_subtree_release + 158
>     frame #6: 0x0000000100e6f004 libtree-sitter.0.dylib`ts_tree_delete + 44
>     frame #7: 0x0000000100307379 emacs`treesit_delete_parser(lisp_parser=0x00000001a2c0f0e0) at treesit.c:1182:3

As you see, when we call ts_tree_delete, it calls ts_subtree_release,
which in turn calls malloc (redirected into our xmalloc).  Is this
expected?  Can you look in the tree-sitter sources and verify that
this is OK?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 25 Feb 2023 07:56:02 GMT)

Message #29 received at 60237 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: mickey <at> masteringemacs.org, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a 
 node
Date: Sat, 25 Feb 2023 09:55:33 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Fri, 24 Feb 2023 15:29:15 -0800
> Cc: eliz <at> gnu.org,
>  60237 <at> debbugs.gnu.org
> 
> In case it helps us understand the problem better, here is the
> backtrace for the infinite loop. I’m not sure why treesit_delete_parser
> would trigger GC, as it just calls two tree-sitter functions:
> 
> void
> treesit_delete_parser (struct Lisp_TS_Parser *lisp_parser)
> {
>   ts_tree_delete (lisp_parser->tree);
>   ts_parser_delete (lisp_parser->parser);
> }

According to the backtrace, it's the other way around: Emacs called
some function via funcall, and funcall decided it was a good time to
do a GC.  Then GC called treesit_delete_parser, presumably because
that parser object was no longer in use?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sun, 26 Feb 2023 02:02:01 GMT)

Message #32 received at 60237 <at> debbugs.gnu.org:

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Po Lu <luangruo <at> yahoo.com>, Mickey Petersen <mickey <at> masteringemacs.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sat, 25 Feb 2023 18:01:07 -0800
> GC has historically never called xmalloc, so the profiler will likely
> crash upon growing the mark stack as well.  I guess another important
> question is why treesit_delete_parser ends up calling xmalloc.
> 

> As you see, when we call ts_tree_delete, it calls ts_subtree_release,
> which in turn calls malloc (redirected into our xmalloc).  Is this
> expected?  Can you look in the tree-sitter sources and verify that
> this is OK?

I had a look, and it seems legit. In tree-sitter, a TSTree (or more precisely, a Subtree) is just some inlined data plus a refcounted pointer to the complete data. This way multiple trees share common subtrees/nodes. Eg, when incrementally parsing, you pass in an old tree and get a new tree, these two trees will share the unchanged part of the tree. 

Therefore, deleting a tree is not simply free(). Instead, you decrement the refcount of the subtree, and if the count == 0, free the data, traverse the subtree decrementing each child’s refcount, delete those whose count reaches 0, and so on. To traverse the tree, the function uses an array as a stack, pushing new elements with array_push, which may call malloc.

Yuan
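
The release scheme described in the message above can be sketched in plain C. This is only an illustration under stated assumptions, not tree-sitter's actual code: `Node`, `NodeStack`, `stack_push`, `node_new`, `node_retain` and `node_release` are hypothetical names, and the two counters exist only so the behaviour can be observed. The point it shows is that releasing a refcounted tree with an explicit, growable stack can allocate memory while freeing.

```c
#include <stdlib.h>

/* Hypothetical refcounted node; tree-sitter's real Subtree is more involved. */
typedef struct Node {
  unsigned refcount;
  size_t child_count;
  struct Node **children;
} Node;

/* Counters used only to observe what the sketch does. */
static size_t nodes_freed = 0;
static size_t stack_allocs = 0;

/* Growable array used as an explicit traversal stack. */
typedef struct {
  Node **items;
  size_t size, capacity;
} NodeStack;

static void stack_push(NodeStack *s, Node *n) {
  if (s->size == s->capacity) {
    /* Growing the stack allocates, even though the caller is freeing. */
    size_t cap = s->capacity ? s->capacity * 2 : 4;
    Node **items = realloc(s->items, cap * sizeof *items);
    if (!items) abort();
    s->items = items;
    s->capacity = cap;
    stack_allocs++;
  }
  s->items[s->size++] = n;
}

static Node *node_new(size_t child_count) {
  Node *n = malloc(sizeof *n);
  if (!n) abort();
  n->refcount = 1;
  n->child_count = child_count;
  n->children = child_count ? calloc(child_count, sizeof *n->children) : NULL;
  if (child_count && !n->children) abort();
  return n;
}

static void node_retain(Node *n) { n->refcount++; }

/* Drop one reference to ROOT; free every node whose count reaches 0. */
static void node_release(Node *root) {
  NodeStack stack = {0};
  if (--root->refcount == 0) stack_push(&stack, root);
  while (stack.size > 0) {
    Node *n = stack.items[--stack.size];
    for (size_t i = 0; i < n->child_count; i++) {
      Node *child = n->children[i];
      if (--child->refcount == 0) stack_push(&stack, child);
    }
    free(n->children);
    free(n);
    nodes_freed++;
  }
  free(stack.items);
}
```

If two roots share one child (the child retained once more with `node_retain`), releasing the first root frees only that root; the shared child survives until the second root is released. Each call to `node_release` that frees anything performs at least one allocation for the stack, which is the behaviour under discussion.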



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sun, 26 Feb 2023 02:03:01 GMT) Full text and rfc822 format available.

Message #35 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Mickey Petersen <mickey <at> masteringemacs.org>, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sat, 25 Feb 2023 18:02:00 -0800

> On Feb 24, 2023, at 11:55 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Fri, 24 Feb 2023 15:29:15 -0800
>> Cc: eliz <at> gnu.org,
>> 60237 <at> debbugs.gnu.org
>> 
>> Maybe it will help us understand the problem better, so here is the
>> backtrace for the infinite loop. I’m not sure why treesit_delete_parser
>> would trigger gc, as it just calls two tree_sitter functions:
>> 
>> void
>> treesit_delete_parser (struct Lisp_TS_Parser *lisp_parser)
>> {
>>  ts_tree_delete (lisp_parser->tree);
>>  ts_parser_delete (lisp_parser->parser);
>> }
> 
> According to the backtrace, it's the other way around: Emacs called
> some function via funcall, and funcall decided it was a good time to
> do a GC.  Then GC called treesit_delete_parser, presumably because
> that parser object was no longer in use?

Ah, right. I forgot it’s not a callstack.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sun, 26 Feb 2023 02:38:01 GMT) Full text and rfc822 format available.

Message #38 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Mickey Petersen <mickey <at> masteringemacs.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sun, 26 Feb 2023 10:37:17 +0800
Yuan Fu <casouri <at> gmail.com> writes:

> I had a look, and it seems legit. In tree-sitter, a TSTree (or more
> precisely, a Subtree) is just some inlined data plus a refcounted
> pointer to the complete data. This way multiple trees share common
> subtrees/nodes. Eg, when incrementally parsing, you pass in an old
> tree and get a new tree, these two trees will share the unchanged part
> of the tree.
>
> Therefore, deleting a tree is not simply free(). Instead, you
> decrement the refcount of the subtree, and if the count == 0, free the
> data, traverse the subtree decrementing each child’s
> refcount, delete those whose count reaches 0, and so on.

And what will happen if that malloc fails, while *freeing* memory?
Anyway, the profiler should either be fixed to not hook into xmalloc, or
(better) tree-sitter should be fixed to not call xmalloc during GC.

> To traverse the tree, the function uses an array as a stack, which
> calls array_push to push new elements, which may call malloc.

How deep are those trees?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sun, 26 Feb 2023 06:15:01 GMT) Full text and rfc822 format available.

Message #41 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: luangruo <at> yahoo.com, mickey <at> masteringemacs.org, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sun, 26 Feb 2023 08:14:00 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 25 Feb 2023 18:01:07 -0800
> Cc: Mickey Petersen <mickey <at> masteringemacs.org>,
>  60237 <at> debbugs.gnu.org,
>  Po Lu <luangruo <at> yahoo.com>
> 
> > GC has historically never called xmalloc, so the profiler will likely
> > crash upon growing the mark stack as well.  I guess another important
> > question is why ts_delete_parser is calling xmalloc.
> 
> > As you see, when we call ts_tree_delete, it calls ts_subtree_release,
> > which in turn calls malloc (redirected into our xmalloc).  Is this
> > expected?  Can you look in the tree-sitter sources and verify that
> > this is OK?
> 
> I had a look, and it seems legit. In tree-sitter, a TSTree (or more precisely, a Subtree) is just some inlined data plus a refcounted pointer to the complete data. This way multiple trees share common subtrees/nodes. Eg, when incrementally parsing, you pass in an old tree and get a new tree, these two trees will share the unchanged part of the tree. 
> 
> Therefore, deleting a tree is not simply free(). Instead, you decrement the refcount of the subtree, and if the count == 0, free the data, traverse the subtree decrementing each child’s refcount, delete those whose count reaches 0, and so on. To traverse the tree, the function uses an array as a stack, pushing new elements with array_push, which may call malloc.

Stefan, could it be a problem for us if garbage-collecting an object
calls xmalloc?  Including if the "memory" profiler is running at the
time of that GC?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sun, 26 Feb 2023 06:19:02 GMT) Full text and rfc822 format available.

Message #44 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: casouri <at> gmail.com, mickey <at> masteringemacs.org, 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sun, 26 Feb 2023 08:18:23 +0200
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  Mickey Petersen
>  <mickey <at> masteringemacs.org>,  60237 <at> debbugs.gnu.org
> Date: Sun, 26 Feb 2023 10:37:17 +0800
> 
> Anyway, the profiler should either be fixed to not hook into xmalloc, or
> (better) tree-sitter should be fixed to not call xmalloc during GC.

That's what the "memory" profiler does, AFAIU.  It uses xmalloc as a
poor-man's timer.  It is rather useless and misleading, but if we
remove it, platforms that don't have timers and SIGPROF will not be
able to profile.

But maybe there are no such platforms?  (DJGPP has setitimer and
SIGPROF, so the MSDOS build shouldn't be a problem, although I never
tried profiling in the MSDOS build.)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sun, 26 Feb 2023 09:43:02 GMT) Full text and rfc822 format available.

Message #47 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Po Lu <luangruo <at> yahoo.com>, Eli Zaretskii <eliz <at> gnu.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sun, 26 Feb 2023 09:41:18 +0000
Yuan Fu <casouri <at> gmail.com> writes:

>> GC has historically never called xmalloc, so the profiler will
>> likely
>> crash upon growing the mark stack as well.  I guess another
>> important
>> question is why ts_delete_parser is calling xmalloc.
>>
>
>> As you see, when we call ts_tree_delete, it calls
>> ts_subtree_release,
>> which in turn calls malloc (redirected into our xmalloc).  Is this
>> expected?  Can you look in the tree-sitter sources and verify that
>> this is OK?
>
> I had a look, and it seems legit. In tree-sitter, a TSTree (or more
> precisely, a Subtree) is just some inlined data plus a refcounted
> pointer to the complete data. This way multiple trees share common
> subtrees/nodes. Eg, when incrementally parsing, you pass in an old
> tree and get a new tree, these two trees will share the unchanged part
> of the tree.

Would that mean we could possibly preserve node instances -- either the real TS ones, or an Emacs-created facsimile -- between incremental parsing? That would be useful for refactoring.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sun, 26 Feb 2023 15:17:01 GMT) Full text and rfc822 format available.

Message #50 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, Yuan Fu <casouri <at> gmail.com>, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sun, 26 Feb 2023 10:16:25 -0500
> Stefan, could it be a problem for us if garbage-collecting an object
> calls xmalloc?  Including if the "memory" profiler is running at the
> time of that GC?

I can't think of a fundamental reason why this would be a problem, but
as you've seen some code may not be quite ready for it.

I suspect the simplest solution is to do something like what we do
for the cpu-profiler, i.e. handle the "time within GC" specially by
checking (EQ (backtrace_top_function (), QAutomatic_GC)) to determine
that we're within the GC.

We could just not count those xmalloc calls, though it would be better to
generalize `cpu_gc_count` so it's also used for the mem profiler.


        Stefan


PS: While the mem profiler was originally conceived as a poor man's option
in the absence of timers, I've occasionally found it handy to track down
problems where we're spending too much time in the GC.
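
The idea of treating allocations made during GC specially can be sketched as follows. This is a rough illustration only, not Emacs's profiler code: `in_gc`, `mem_samples`, `mem_gc_samples`, `profiler_note_alloc`, `counted_malloc` and `fake_gc` are all hypothetical names, and a plain flag stands in for the `backtrace_top_function` check mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not Emacs's actual profiler state. */
static bool in_gc = false;         /* set while the collector runs */
static size_t mem_samples = 0;     /* allocations attributed to normal code */
static size_t mem_gc_samples = 0;  /* allocations that happened inside GC */

/* Record one allocation, keeping GC-time allocations in their own bucket
   instead of walking a backtrace while the collector is running. */
static void profiler_note_alloc(size_t size) {
  (void) size;
  if (in_gc)
    mem_gc_samples++;
  else
    mem_samples++;
}

/* Allocation wrapper that reports every successful call to the profiler. */
static void *counted_malloc(size_t size) {
  void *p = malloc(size);
  if (!p) abort();
  profiler_note_alloc(size);
  return p;
}

/* Pretend collector whose cleanup work needs scratch memory. */
static void fake_gc(void) {
  in_gc = true;
  void *scratch = counted_malloc(64);
  free(scratch);
  in_gc = false;
}
```

With this split, an allocation made by a finalizer during collection lands in `mem_gc_samples`, so it neither disappears from the profile nor triggers the normal sampling path in the middle of GC.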





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Mon, 27 Feb 2023 00:36:01 GMT) Full text and rfc822 format available.

Message #53 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: Po Lu <luangruo <at> yahoo.com>, Eli Zaretskii <eliz <at> gnu.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sun, 26 Feb 2023 16:34:49 -0800

> On Feb 26, 2023, at 1:41 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
> 
> 
> Yuan Fu <casouri <at> gmail.com> writes:
> 
>>> GC has historically never called xmalloc, so the profiler will
>>> likely
>>> crash upon growing the mark stack as well.  I guess another
>>> important
>>> question is why ts_delete_parser is calling xmalloc.
>>> 
>> 
>>> As you see, when we call ts_tree_delete, it calls
>>> ts_subtree_release,
>>> which in turn calls malloc (redirected into our xmalloc).  Is this
>>> expected?  Can you look in the tree-sitter sources and verify that
>>> this is OK?
>> 
>> I had a look, and it seems legit. In tree-sitter, a TSTree (or more
>> precisely, a Subtree) is just some inlined data plus a refcounted
>> pointer to the complete data. This way multiple trees share common
>> subtrees/nodes. Eg, when incrementally parsing, you pass in an old
>> tree and get a new tree, these two trees will share the unchanged part
>> of the tree.
> 
> Would that mean we could possibly preserve node instances -- either the real TS ones, or an Emacs-created facsimile -- between incremental parsing? That would be useful for refactoring.

What kind of exact interface (function) do you want? The treesit-node-outdated error is solely Emacs’s product, tree-sitter itself doesn’t mark a node outdated. It is possible for Emacs to not delete the old tree and give it to you, or allow you to access information of an outdated node.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Mon, 27 Feb 2023 08:27:01 GMT) Full text and rfc822 format available.

Message #56 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Po Lu <luangruo <at> yahoo.com>, Eli Zaretskii <eliz <at> gnu.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Mon, 27 Feb 2023 08:22:04 +0000
Yuan Fu <casouri <at> gmail.com> writes:

>> On Feb 26, 2023, at 1:41 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
>>
>>
>> Yuan Fu <casouri <at> gmail.com> writes:
>>
>>>> GC has historically never called xmalloc, so the profiler will
>>>> likely
>>>> crash upon growing the mark stack as well.  I guess another
>>>> important
>>>> question is why ts_delete_parser is calling xmalloc.
>>>>
>>>
>>>> As you see, when we call ts_tree_delete, it calls
>>>> ts_subtree_release,
>>>> which in turn calls malloc (redirected into our xmalloc).  Is this
>>>> expected?  Can you look in the tree-sitter sources and verify that
>>>> this is OK?
>>>
>>> I had a look, and it seems legit. In tree-sitter, a TSTree (or more
>>> precisely, a Subtree) is just some inlined data plus a refcounted
>>> pointer to the complete data. This way multiple trees share common
>>> subtrees/nodes. Eg, when incrementally parsing, you pass in an old
>>> tree and get a new tree, these two trees will share the unchanged part
>>> of the tree.
>>
>> Would that mean we could possibly preserve node instances -- either
>> the real TS ones, or an Emacs-created facsimile -- between
>> incremental parsing? That would be useful for refactoring.
>
> What kind of exact interface (function) do you want? The
> treesit-node-outdated error is solely Emacs’s product, tree-sitter
> itself doesn’t mark a node outdated. It is possible for Emacs to not
> delete the old tree and give it to you, or allow you to access
> information of an outdated node.

OK, so let me explain:

Touching the buffer for any reason invalidates the whole tree; that's
not good. It's not good, because a lot of the information may still be
useful and viable. Outdating the node is not a bad idea as it avoids a
lot of 'traps' around accidental modifications that can corrupt things
without the developer's knowledge.

I'd like to be able to access all the information possible; perhaps
behind a flag variable like `treesit-allow-outdated-node-access'. What
I'm really mostly interested in is:

- How well the node references handle changes in byte positions in TS.

- Does changing something at X shift (like a `point-marker`) everything
below it? Does an outdated node correctly reference its new location
and state, such as changes to children or its position in the tree?

Right now, Combobulate can make a proxy node, which essentially
captures the basics of a live node and stores it in a defstruct. That
way I can at least retain the start/end, type, text, etc. of a node
and still do light refactoring without contorting myself to do things
in a particular order, which is not always possible (like delaying
editing to the very end.)





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Mon, 27 Feb 2023 09:07:01 GMT) Full text and rfc822 format available.

Message #59 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: Po Lu <luangruo <at> yahoo.com>, Eli Zaretskii <eliz <at> gnu.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Mon, 27 Feb 2023 01:05:49 -0800

> On Feb 27, 2023, at 12:22 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
> 
> 
> Yuan Fu <casouri <at> gmail.com> writes:
> 
>>> On Feb 26, 2023, at 1:41 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
>>> 
>>> 
>>> Yuan Fu <casouri <at> gmail.com> writes:
>>> 
>>>>> GC has historically never called xmalloc, so the profiler will
>>>>> likely
>>>>> crash upon growing the mark stack as well.  I guess another
>>>>> important
>>>>> question is why ts_delete_parser is calling xmalloc.
>>>>> 
>>>> 
>>>>> As you see, when we call ts_tree_delete, it calls
>>>>> ts_subtree_release,
>>>>> which in turn calls malloc (redirected into our xmalloc).  Is this
>>>>> expected?  Can you look in the tree-sitter sources and verify that
>>>>> this is OK?
>>>> 
>>>> I had a look, and it seems legit. In tree-sitter, a TSTree (or more
>>>> precisely, a Subtree) is just some inlined data plus a refcounted
>>>> pointer to the complete data. This way multiple trees share common
>>>> subtrees/nodes. Eg, when incrementally parsing, you pass in an old
>>>> tree and get a new tree, these two trees will share the unchanged part
>>>> of the tree.
>>> 
>>> Would that mean we could possibly preserve node instances -- either
>>> the real TS ones, or an Emacs-created facsimile -- between
>>> incremental parsing? That would be useful for refactoring.
>> 
>> What kind of exact interface (function) do you want? The
>> treesit-node-outdated error is solely Emacs’s product, tree-sitter
>> itself doesn’t mark a node outdated. It is possible for Emacs to not
>> delete the old tree and give it to you, or allow you to access
>> information of an outdated node.
> 
> OK, so let me explain:
> 
> Touching the buffer for any reason invalidates the whole tree; that's
> not good. It's not good, because a lot of the information may still be
> useful and viable. Outdating the node is not a bad idea as it avoids a
> lot of 'traps' around accidental modifications that can corrupt things
> without the developer's knowledge.
> 
> I'd like to be able to access all the information possible; perhaps
> behind a flag variable like `treesit-allow-outdated-node-access'. What
> I'm really mostly interested in is:
> 
> - How well the node references handle changes in byte positions in TS.

They don’t handle position changes. If the buffer content changes, we need to reparse. Once we reparse the buffer, a new tree is born. While it is true that the new tree shares some nodes with the old tree, tree-sitter does not expose any function or information that tells you which node in the new tree is “the same” as which node in the old tree; nor does it tell you whether a node in the old tree still “exists” in the new tree.

Now, there does exist a function (in tree-sitter’s API) that allows you to “edit” a node with position changes. But a) I’m not sure how it handles the case where the node is deleted by the change and b) it is not very useful because once you reparse the buffer, the new tree is completely independent from the old tree (ignoring the implementation detail which is not exposed).

> 
> - Does changing something at X shift (like a `point-marker`) everything
> below it? Does an outdated node correctly reference its new location
> and state, such as changes to children or its position in the tree?

Like I said above, any buffer change will create a new tree with no relation to the old tree, so there is no shifting.

And there really isn’t a “new location”: we don’t know if the old node is still in the new tree. Mind you, even if the node is completely outside of the changed region, it can still disappear from the new tree because of a change in its surrounding context. For example, in the following C code:

/*
int c = 1;

If I insert a closing comment delimiter, the buffer becomes

/*
int c = 1;
*/

Even though int c = 1; is not in the changed range, nor did its position move, all those nodes (int, c, =, etc) are not in the new tree anymore, because the whole thing becomes a comment.

I made any access to outdated nodes signal an error because there really isn’t any good reason to use them, at least I didn’t think of any at the time. And making them error out should help people catch errors.

> 
> Right now, Combobulate can make a proxy node, which essentially
> captures the basics of a live node and stores it in a defstruct. That
> way I can at least retain the start/end, type, text, etc. of a node
> and still do light refactoring without contorting myself to do things
> in a particular order, which is not always possible (like delaying
> editing to the very end.)

IIUC, you want to do some very minor whitespace edit to the buffer which doesn’t really change the parse tree, so you don’t want the nodes to be invalidated for no good reason? Not erroring on outdated nodes is easy. As you said, we can add a treesit-inhibit-error-outdated variable. But it’s not so easy to automatically update outdated nodes’ positions (with the aforementioned tree-sitter function). However, if you are making those changes, you must know how to adjust your nodes’ positions, right? So maybe it isn’t a must-have for your purpose.


Yuan
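
The "adjust the positions yourself" approach for a saved copy of a node can be sketched in C as below. This is a hedged illustration, not part of Emacs or Combobulate: `ProxyNode`, `ProxyStatus` and `proxy_apply_edit` are hypothetical names, and positions are plain offsets. It assumes the caller knows each edit as "`old_len` characters at `pos` were replaced by `new_len` characters".

```c
#include <stddef.h>

/* Hypothetical snapshot of a node; field names are illustrative only. */
typedef struct {
  size_t start;   /* inclusive start position */
  size_t end;     /* exclusive end position */
  char type[32];  /* node type name, e.g. "identifier" */
} ProxyNode;

typedef enum { PROXY_UNCHANGED, PROXY_SHIFTED, PROXY_STALE } ProxyStatus;

/* Adjust PROXY for an edit that replaced OLD_LEN characters at POS
   with NEW_LEN characters. */
static ProxyStatus proxy_apply_edit(ProxyNode *proxy, size_t pos,
                                    size_t old_len, size_t new_len) {
  size_t old_end = pos + old_len;
  if (old_end <= proxy->start) {
    /* Edit lies entirely before the node: shift both ends by the delta. */
    proxy->start = proxy->start - old_len + new_len;
    proxy->end = proxy->end - old_len + new_len;
    return old_len == new_len ? PROXY_UNCHANGED : PROXY_SHIFTED;
  }
  if (pos >= proxy->end)
    return PROXY_UNCHANGED;  /* edit lies entirely after the node */
  return PROXY_STALE;        /* edit overlaps the node: positions unreliable */
}
```

Note that `PROXY_UNCHANGED` and `PROXY_SHIFTED` only say the saved offsets are still consistent with the text. As the comment-delimiter example shows, an edit outside the node can still change what the parser makes of that text, so the saved type may no longer match after a reparse.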





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Mon, 27 Feb 2023 14:33:01 GMT) Full text and rfc822 format available.

Message #62 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: Po Lu <luangruo <at> yahoo.com>, Eli Zaretskii <eliz <at> gnu.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Mon, 27 Feb 2023 14:29:52 +0000
Yuan Fu <casouri <at> gmail.com> writes:

>> On Feb 27, 2023, at 12:22 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
>>
>>
>> Yuan Fu <casouri <at> gmail.com> writes:
>>
>>>> On Feb 26, 2023, at 1:41 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
>>>>
>>>>
>>>> Yuan Fu <casouri <at> gmail.com> writes:
>>>>
>>>>>> GC has historically never called xmalloc, so the profiler will
>>>>>> likely
>>>>>> crash upon growing the mark stack as well.  I guess another
>>>>>> important
>>>>>> question is why ts_delete_parser is calling xmalloc.
>>>>>>
>>>>>
>>>>>> As you see, when we call ts_tree_delete, it calls
>>>>>> ts_subtree_release,
>>>>>> which in turn calls malloc (redirected into our xmalloc).  Is this
>>>>>> expected?  Can you look in the tree-sitter sources and verify that
>>>>>> this is OK?
>>>>>
>>>>> I had a look, and it seems legit. In tree-sitter, a TSTree (or more
>>>>> precisely, a Subtree) is just some inlined data plus a refcounted
>>>>> pointer to the complete data. This way multiple trees share common
>>>>> subtrees/nodes. Eg, when incrementally parsing, you pass in an old
>>>>> tree and get a new tree, these two trees will share the unchanged part
>>>>> of the tree.
>>>>
>>>> Would that mean we could possibly preserve node instances -- either
>>>> the real TS ones, or an Emacs-created facsimile -- between
>>>> incremental parsing? That would be useful for refactoring.
>>>
>>> What kind of exact interface (function) do you want? The
>>> treesit-node-outdated error is solely Emacs’s product, tree-sitter
>>> itself doesn’t mark a node outdated. It is possible for Emacs to not
>>> delete the old tree and give it to you, or allow you to access
>>> information of an outdated node.
>>
>> OK, so let me explain:
>>
>> Touching the buffer for any reason invalidates the whole tree; that's
>> not good. It's not good, because a lot of the information may still be
>> useful and viable. Outdating the node is not a bad idea as it avoids a
>> lot of 'traps' around accidental modifications that can corrupt things
>> without the developer's knowledge.
>>
>> I'd like to be able to access all the information possible; perhaps
>> behind a flag variable like `treesit-allow-outdated-node-access'. What
>> I'm really mostly interested in is:
>>
>> - How well the node references handle changes in byte positions in TS.
>
> They don’t handle position changes. If the buffer content changes, we
> need to reparse. Once we reparse the buffer, a new tree is
> born. While it is true that the new tree shares some nodes with the old
> tree, tree-sitter does not expose any function or information that
> tells you which node in the new tree is “the same” as which node in
> the old tree; nor does it tell you whether a node in the old tree
> still “exists” in the new tree.
>
> Now, there does exist a function (in tree-sitter’s API) that allows
> you to “edit” a node with position changes. But a) I’m not sure how
> it handles the case where the node is deleted by the change and b)
> it is not very useful because once you reparse the buffer, the new
> tree is completely independent from the old tree (ignoring the
> implementation detail which is not exposed).
>
>>
>> - Does changing something at X shift (like a `point-marker`) everything
>> below it? Does an outdated node correctly reference its new location
>> and state, such as changes to children or its position in the tree?
>
> Like I said above, any buffer change will create a new tree with no relation to the old tree, so there is no shifting.
>
> And there really isn’t a “new location”: we don’t know if the old node
> is still in the new tree. Mind you, even if the node is completely
> outside of the changed region, it can still disappear from the new
> tree because of a change in its surrounding context. For example, in the
> following C code:
>
> /*
> int c = 1;
>
> If I insert a closing comment delimiter, the buffer becomes
>
> /*
> int c = 1;
> */
>
> Even though int c = 1; is not in the changed range, nor did its
> position move, all those nodes (int, c, =, etc) are not in the new
> tree anymore, because the whole thing becomes a comment.
>
> I made any access to outdated nodes signal an error because there really
> isn’t any good reason to use them, at least I didn’t think of any at the
> time. And making them error out should help people catch errors.
>
>>
>> Right now, Combobulate can make a proxy node, which essentially
>> captures the basics of a live node and stores it in a defstruct. That
>> way I can at least retain the start/end, type, text, etc. of a node
>> and still do light refactoring without contorting myself to do things
>> in a particular order, which is not always possible (like delaying
>> editing to the very end.)
>
> IIUC, you want to do some very minor whitespace edit to the buffer
> which doesn’t really change the parse tree, so you don’t want the
> nodes to be invalidated for no good reason? Not erroring on outdated
> nodes is easy. As you said, we can add a
> treesit-inhibit-error-outdated variable. But it’s not so easy to
> automatically update outdated nodes’ positions (with the aforementioned
> tree-sitter function). However, if you are making those changes, you
> must know how to adjust your nodes’ positions, right? So maybe it isn’t
> a must-have for your purpose.

It's a good point, but it's also easy to create a scenario where you
at least want to keep the position and esp. the type and text (for
reporting information to the user, or similar.)

My main interest is now refactoring and how to best do it. If TS can
do some of it, then all the better. I realise it was never meant to,
but if we can continue accessing the information contained in a node
even if it is outdated, then that could be useful, however niche.

Currently I use overlays and point markers, but they are not
infallible.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Mon, 27 Feb 2023 22:38:02 GMT) Full text and rfc822 format available.

Message #65 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: Po Lu <luangruo <at> yahoo.com>, Eli Zaretskii <eliz <at> gnu.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Mon, 27 Feb 2023 14:37:22 -0800

> On Feb 27, 2023, at 6:29 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
> 
> 
> Yuan Fu <casouri <at> gmail.com> writes:
> 
>>> On Feb 27, 2023, at 12:22 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
>>> 
>>> 
>>> Yuan Fu <casouri <at> gmail.com> writes:
>>> 
>>>>> On Feb 26, 2023, at 1:41 AM, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
>>>>> 
>>>>> 
>>>>> Yuan Fu <casouri <at> gmail.com> writes:
>>>>> 
>>>>>>> GC has historically never called xmalloc, so the profiler will
>>>>>>> likely
>>>>>>> crash upon growing the mark stack as well.  I guess another
>>>>>>> important
>>>>>>> question is why ts_delete_parser is calling xmalloc.
>>>>>>> 
>>>>>> 
>>>>>>> As you see, when we call ts_tree_delete, it calls
>>>>>>> ts_subtree_release,
>>>>>>> which in turn calls malloc (redirected into our xmalloc).  Is this
>>>>>>> expected?  Can you look in the tree-sitter sources and verify that
>>>>>>> this is OK?
>>>>>> 
>>>>>> I had a look, and it seems legit. In tree-sitter, a TSTree (or more
>>>>>> precisely, a Subtree) is just some inlined data plus a refcounted
>>>>>> pointer to the complete data. This way multiple trees share common
>>>>>> subtrees/nodes. Eg, when incrementally parsing, you pass in an old
>>>>>> tree and get a new tree, these two trees will share the unchanged part
>>>>>> of the tree.
>>>>> 
>>>>> Would that mean we could possibly preserve node instances -- either
>>>>> the real TS ones, or an Emacs-created facsimile -- between
>>>>> incremental parsing? That would be useful for refactoring.
>>>> 
>>>> What kind of exact interface (function) do you want? The
>>>> treesit-node-outdated error is solely Emacs’s product, tree-sitter
>>>> itself doesn’t mark a node outdated. It is possible for Emacs to not
>>>> delete the old tree and give it to you, or allow you to access
>>>> information of an outdated node.
>>> 
>>> OK, so let me explain:
>>> 
>>> Touching the buffer for any reason invalidates the whole tree; that's
>>> not good. It's not good, because a lot of the information may still be
>>> useful and viable. Outdating the node is not a bad idea as it avoids a
>>> lot of 'traps' around accidental modifications that can corrupt things
>>> without the developer's knowledge.
>>> 
>>> I'd like to be able to access all the information possible; perhaps
>>> behind a flag variable like `treesit-allow-outdated-node-access'. What
>>> I'm really mostly interested in is:
>>> 
>>> - How well the node references handle changes in byte positions in TS.
>> 
>> They don’t handle position changes. If the buffer content changes, we
>> need to reparse. Once we reparse the buffer, a new tree is
>> born. While it is true that the new tree shares some nodes with the old
>> tree, tree-sitter does not expose any function or information that
>> tells you which node in the new tree is “the same” as which node in
>> the old tree; nor does it tell you whether a node in the old tree
>> still “exists” in the new tree.
>> 
>> Now, there does exist a function (in tree-sitter’s API) that allows
>> you to “edit” a node with position changes. But a) I’m not sure how
>> it handles the case where the node is deleted by the change and b)
>> it is not very useful because once you reparse the buffer, the new
>> tree is completely independent from the old tree (ignoring the
>> implementation detail which is not exposed).
>> 
>>> 
>>> - Does changing something at X shift (like a `point-marker`) everything
>>> below it? Does an outdated node correctly reference its new location
>>> and state, such as changes to children or its position in the tree?
>> 
>> Like I said above, any buffer change will create a new tree with no relation to the old tree, so there is no shifting.
>> 
>> And there really isn’t a “new location”: we don’t know if the old node
>> is still in the new tree. Mind you, even if the node is completely
>> outside of the changed region, it can still disappear from the new
>> tree because of change of its surrounding context. For example, in the
>> following C code:
>> 
>> /*
>> int c = 1;
>> 
>> If I insert a closing comment delimiter, and the buffer becomes
>> 
>> /*
>> int c = 1;
>> */
>> 
>> Even though int c = 1; is not in the changed range, nor did its
>> position move, all those nodes (int, c, =, etc) are not in the new
>> tree anymore, because the whole thing becomes a comment.
>> 
>> I made any access to outdated nodes signal an error because there
>> really isn’t any good reason to use them, at least I didn’t think of
>> any at the time. And making them error out should help people catch errors.
>> 
>>> 
>>> Right now, Combobulate can make a proxy node, which essentially
>>> captures the basics of a live node and stores it in a defstruct. That
>>> way I can at least retain the start/end, type, text, etc. of a node
>>> and still do light refactoring without contorting myself to do things
>>> in a particular order, which is not always possible (like delaying
>>> editing to the very end.)
>> 
>> IIUC, you want to do some very minor whitespace edit to the buffer
>> which doesn’t really change the parse tree, so you don’t want the
>> nodes to be invalidated for no good reason? Not erroring on outdated
>> nodes is easy. As you said, we can add a
>> treesit-inhibit-error-outdated variable. But it’s not so easy to
>> automatically update outdated nodes’ positions (with the aforementioned
>> tree-sitter function). However, if you are making those changes, you
>> must know how to adjust your nodes’ positions, right? So maybe it isn’t
>> a must-have for your purpose.
> 
> It's a good point, but it's also easy to create a scenario where you
> at least want to keep the position and esp. the type and text (for
> reporting information to the user, or similar.)

I should be clearer. I meant that treesit-inhibit-error-outdated is reasonable and easy to implement. So if you want we can add it. OTOH auto-updating outdated nodes with position information is nontrivial, and might not be must-have for your purpose.

> 
> My main interest is now refactoring and how to best do it. If TS can
> do some of it, then all the better. I realise it was never meant to,
> but if we can continue accessing the information contained in a node
> even if it is outdated, then that could be useful, however niche.

I guess “refactoring” includes not only whitespace changes but also some structural changes like slurping (or whatever it’s called), right? If you want to do structural changes, tree-sitter probably can’t help you much, as you observed. Maybe it’s better to “export” the tree-sitter tree to your own tree and do transformations with it? Maybe that’s already what you do now.

> Currently I use overlays and point markers, but they are not
> infallible.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Mon, 27 Feb 2023 22:47:02 GMT) Full text and rfc822 format available.

Message #68 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Yuan Fu <casouri <at> gmail.com>, Mickey Petersen <mickey <at> masteringemacs.org>
Cc: Po Lu <luangruo <at> yahoo.com>, Eli Zaretskii <eliz <at> gnu.org>,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Tue, 28 Feb 2023 00:45:51 +0200
On 28/02/2023 00:37, Yuan Fu wrote:
>> My main interest is now refactoring and how to best do it. If TS can
>> do some of it, then all the better. I realise it was never meant to,
>> but if we can continue accessing the information contained in a node
>> even if it is outdated, then that could be useful, however niche.
> I guess “refactoring” includes not only whitespace changes but also some structural changes like slurping (or whatever it’s called), right? If you want to do structural changes, tree-sitter probably can’t help you much, as you observed. Maybe it’s better to “export” the tree-sitter tree to your own tree and do transformations with it? Maybe that’s already what you do now.
> 

Or simply produce all the editing information up front, and then modify 
the buffer in different places in one swoop.

Just like treesit-indent-region does.

That's almost the same as described, but doesn't require creating a 
parallel parse tree hierarchy.
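
The "compute everything up front, then edit in one swoop" idea can be
sketched outside of Emacs. The following is only an illustration in plain
C over a string, not how treesit-indent-region is implemented; the
`struct edit` record and the `apply_edits` helper are hypothetical names.
All offsets refer to the original text, and the edits are applied from the
highest offset down so that no edit shifts the positions of the ones still
pending.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical edit record: replace the bytes in [start, end) of the
   ORIGINAL text with TEXT.  Edits are assumed not to overlap.  */
struct edit
{
  size_t start;
  size_t end;
  const char *text;
};

/* Order edits by descending start offset.  */
static int
edit_cmp_desc (const void *a, const void *b)
{
  const struct edit *x = a, *y = b;
  return (x->start < y->start) - (x->start > y->start);
}

/* Apply N non-overlapping EDITS to SRC and return a newly allocated
   string, or NULL if allocation fails.  Because the edits are applied
   back to front, every offset recorded against the original text is
   still valid at the moment it is used.  */
static char *
apply_edits (const char *src, struct edit *edits, size_t n)
{
  char *buf = malloc (strlen (src) + 1);
  if (!buf)
    return NULL;
  strcpy (buf, src);

  qsort (edits, n, sizeof *edits, edit_cmp_desc);

  for (size_t i = 0; i < n; i++)
    {
      size_t len = strlen (buf);
      size_t tlen = strlen (edits[i].text);
      size_t tail = len - edits[i].end;
      char *out = malloc (edits[i].start + tlen + tail + 1);
      if (!out)
        {
          free (buf);
          return NULL;
        }
      /* Prefix, replacement text, then the tail including its NUL.  */
      memcpy (out, buf, edits[i].start);
      memcpy (out + edits[i].start, edits[i].text, tlen);
      memcpy (out + edits[i].start + tlen, buf + edits[i].end, tail + 1);
      free (buf);
      buf = out;
    }
  return buf;
}
```

In an Emacs buffer the same ordering concern applies: positions gathered
from live nodes before the first modification stay usable as long as the
modifications proceed from the end of the buffer towards the beginning.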




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Tue, 28 Feb 2023 14:02:02 GMT) Full text and rfc822 format available.

Message #71 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Tue, 28 Feb 2023 16:00:34 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Yuan Fu <casouri <at> gmail.com>,  luangruo <at> yahoo.com,
>   mickey <at> masteringemacs.org,  60237 <at> debbugs.gnu.org
> Date: Sun, 26 Feb 2023 10:16:25 -0500
> 
> > Stefan, could it be a problem for us if garbage-collecting an object
> > calls xmalloc?  Including if the "memory" profiler is running at the
> > time of that GC?
> 
> I can't think of a fundamental reason why this would be a problem, but
> as you've seen some code may not be quite ready for it.
> 
> I suspect the simplest solution is to do something like what we do
> for the cpu-profiler, i.e. handle the "time within GC" specially by
> checking (EQ (backtrace_top_function (), QAutomatic_GC)) to determine
> that we're within the GC.

Any reason not to install the patch that uses gcsize instead of ASIZE?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Wed, 01 Mar 2023 04:08:01 GMT) Full text and rfc822 format available.

Message #74 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Tue, 28 Feb 2023 23:07:47 -0500
>> > Stefan, could it be a problem for us if garbage-collecting an object
>> > calls xmalloc?  Including if the "memory" profiler is running at the
>> > time of that GC?
>> 
>> I can't think of a fundamental reason why this would be a problem, but
>> as you've seen some code may not be quite ready for it.
>> 
>> I suspect the simplest solution is to do something like what we do
>> for the cpu-profiler, i.e. handle the "time within GC" specially by
>> checking (EQ (backtrace_top_function (), QAutomatic_GC)) to determine
>> that we're within the GC.
>
> Any reason not to install the patch that uses gcsize instead of ASIZE?

That might work, but I suspect there's a good reason why I used
`cpu_gc_count`.  I think running the "normal" profiling code during GC
can cause other problems than just ASIZE because it can/will change
ELisp objects, and modifying the heap while we're doing GC is the
problem that concurrent GCs try to solve: our GC is not equipped
for that.
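
A minimal sketch of the pattern described above, assuming nothing about
the real profiler.c beyond what is quoted in this thread: while the
collector runs, the probe only bumps a plain C counter and never touches a
heap-allocated structure. Every identifier below (`in_gc`, `gc_samples`,
`log_samples`, `probe`, `take_gc_samples`) is a hypothetical stand-in, not
an Emacs name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the profiler state.  */
static bool in_gc;           /* true while the collector is running */
static size_t gc_samples;    /* plain counter, safe to update during GC */
static size_t log_samples;   /* stands in for the heap-allocated log */

/* Called for every sample.  During GC only the plain counter is
   touched, so no heap object is read or modified.  */
static void
probe (void)
{
  if (in_gc)
    gc_samples++;
  else
    log_samples++;   /* the real code would update a hash table here */
}

/* Return the number of samples seen during GC and reset the counter,
   so the caller can fold it into the report once GC is over.  */
static size_t
take_gc_samples (void)
{
  size_t n = gc_samples;
  gc_samples = 0;
  return n;
}
```

The point of the separate counter is that the samples taken during GC are
not lost; they are merely recorded somewhere the collector does not manage
and merged into the log later.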


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Wed, 01 Mar 2023 13:28:01 GMT) Full text and rfc822 format available.

Message #77 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Wed, 01 Mar 2023 15:27:26 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: casouri <at> gmail.com,  luangruo <at> yahoo.com,  mickey <at> masteringemacs.org,
>   60237 <at> debbugs.gnu.org
> Date: Tue, 28 Feb 2023 23:07:47 -0500
> 
> >> > Stefan, could it be a problem for us if garbage-collecting an object
> >> > calls xmalloc?  Including if the "memory" profiler is running at the
> >> > time of that GC?
> >> 
> >> I can't think of a fundamental reason why this would be a problem, but
> >> as you've seen some code may not be quite ready for it.
> >> 
> >> I suspect the simplest solution is to do something like what we do
> >> for the cpu-profiler, i.e. handle the "time within GC" specially by
> >> checking (EQ (backtrace_top_function (), QAutomatic_GC)) to determine
> >> that we're within the GC.
> >
> > Any reason not to install the patch that uses gcsize instead of ASIZE?
> 
> That might work, but I suspect there's a good reason why I used
> `cpu_gc_count`.  I think running the "normal" profiling code during GC
> can cause other problems than just ASIZE because it can/will change
> ELisp objects, and modifying the heap while we're doing GC is the
> problem that concurrent GCs try to solve: our GC is not equipped
> for that.

Would you mind installing a change along these lines on the emacs-29
branch?  I'm not familiar enough with profiler.c to experiment with
its code on the release branch.

TIA





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Wed, 01 Mar 2023 14:09:02 GMT) Full text and rfc822 format available.

Message #80 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Wed, 01 Mar 2023 09:08:03 -0500
Eli Zaretskii [2023-03-01 15:27:26] wrote:

>> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
>> Cc: casouri <at> gmail.com,  luangruo <at> yahoo.com,  mickey <at> masteringemacs.org,
>>   60237 <at> debbugs.gnu.org
>> Date: Tue, 28 Feb 2023 23:07:47 -0500
>> 
>> >> > Stefan, could it be a problem for us if garbage-collecting an object
>> >> > calls xmalloc?  Including if the "memory" profiler is running at the
>> >> > time of that GC?
>> >> 
>> >> I can't think of a fundamental reason why this would be a problem, but
>> >> as you've seen some code may not be quite ready for it.
>> >> 
>> >> I suspect the simplest solution is to do something like what we do
>> >> for the cpu-profiler, i.e. handle the "time within GC" specially by
>> >> checking (EQ (backtrace_top_function (), QAutomatic_GC)) to determine
>> >> that we're within the GC.
>> >
>> > Any reason not to install the patch that uses gcsize instead of ASIZE?
>> 
>> That might work, but I suspect there's a good reason why I used
>> `cpu_gc_count`.  I think running the "normal" profiling code during GC
>> can cause other problems than just ASIZE because it can/will change
>> ELisp objects, and modifying the heap while we're doing GC is the
>> problem that concurrent GCs try to solve: our GC is not equipped
>> for that.
>
> Would you mind installing a change along these lines on the emacs-29
> branch?  I'm not familiar enough with profiler.c to experiment with
> its code on the release branch.

For `emacs-29` I suggest we just use the patch below which should
circumvent the problem.


        Stefan


diff --git a/src/profiler.c b/src/profiler.c
index 81b5e7b0cf0..c99ed0a81a2 100644
--- a/src/profiler.c
+++ b/src/profiler.c
@@ -505,6 +505,8 @@ DEFUN ("profiler-memory-log",
 void
 malloc_probe (size_t size)
 {
+  if (EQ (backtrace_top_function (), QAutomatic_GC))
+    return;                     /* bug#60237 */
   eassert (HASH_TABLE_P (memory_log));
   record_backtrace (XHASH_TABLE (memory_log), min (size, MOST_POSITIVE_FIXNUM));
 }





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Wed, 01 Mar 2023 15:52:02 GMT) Full text and rfc822 format available.

Message #83 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Wed, 01 Mar 2023 17:51:42 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: casouri <at> gmail.com,  luangruo <at> yahoo.com,  mickey <at> masteringemacs.org,
>   60237 <at> debbugs.gnu.org
> Date: Wed, 01 Mar 2023 09:08:03 -0500
> 
> Eli Zaretskii [2023-03-01 15:27:26] wrote:
> 
> > Would you mind installing a change along these lines on the emacs-29
> > branch?  I'm not familiar enough with profiler.c to experiment with
> > its code on the release branch.
> 
> For `emacs-29` I suggest we just use the patch below which should
> circumvent the problem.

Fine with me, please install, and thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Wed, 01 Mar 2023 17:40:02 GMT) Full text and rfc822 format available.

Message #86 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Wed, 01 Mar 2023 12:39:03 -0500
>> For `emacs-29` I suggest we just use the patch below which should
>> circumvent the problem.
>
> Fine with me, please install, and thanks.

Thanks, pushed.  Hopefully we can do a bit better on `master`, but
I don't have time for it right now.  Maybe someone else?


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Thu, 02 Mar 2023 05:55:02 GMT) Full text and rfc822 format available.

Message #89 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, mickey <at> masteringemacs.org, casouri <at> gmail.com,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Thu, 02 Mar 2023 13:53:54 +0800
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> Eli Zaretskii [2023-03-01 15:27:26] wrote:
>
>>> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
>>> Cc: casouri <at> gmail.com,  luangruo <at> yahoo.com,  mickey <at> masteringemacs.org,
>>>   60237 <at> debbugs.gnu.org
>>> Date: Tue, 28 Feb 2023 23:07:47 -0500
>>> 
>>> >> > Stefan, could it be a problem for us if garbage-collecting an object
>>> >> > calls xmalloc?  Including if the "memory" profiler is running at the
>>> >> > time of that GC?
>>> >> 
>>> >> I can't think of a fundamental reason why this would be a problem, but
>>> >> as you've seen some code may not be quite ready for it.
>>> >> 
>>> >> I suspect the simplest solution is to do something like what we do
>>> >> for the cpu-profiler, i.e. handle the "time within GC" specially by
>>> >> checking (EQ (backtrace_top_function (), QAutomatic_GC)) to determine
>>> >> that we're within the GC.
>>> >
>>> > Any reason not to install the patch that uses gcsize instead of ASIZE?
>>> 
>>> That might work, but I suspect there's a good reason why I used
>>> `cpu_gc_count`.  I think running the "normal" profiling code during GC
>>> can cause other problems than just ASIZE because it can/will change
>>> ELisp objects, and modifying the heap while we're doing GC is the
>>> problem that concurrent GCs try to solve: our GC is not equipped
>>> for that.
>>
>> Would you mind installing a change along these lines on the emacs-29
>> branch?  I'm not familiar enough with profiler.c to experiment with
>> its code on the release branch.
>
> For `emacs-29` I suggest we just use the patch below which should
> circumvent the problem.
>
>
>         Stefan
>
>
> diff --git a/src/profiler.c b/src/profiler.c
> index 81b5e7b0cf0..c99ed0a81a2 100644
> --- a/src/profiler.c
> +++ b/src/profiler.c
> @@ -505,6 +505,8 @@ DEFUN ("profiler-memory-log",
>  void
>  malloc_probe (size_t size)
>  {
> +  if (EQ (backtrace_top_function (), QAutomatic_GC))
> +    return;                     /* bug#60237 */
>    eassert (HASH_TABLE_P (memory_log));
>    record_backtrace (XHASH_TABLE (memory_log), min (size, MOST_POSITIVE_FIXNUM));
>  }

Shouldn't this be:

  if (gc_in_progress)
    return;




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Thu, 02 Mar 2023 20:25:01 GMT) Full text and rfc822 format available.

Message #92 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Po Lu <luangruo <at> yahoo.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, mickey <at> masteringemacs.org, casouri <at> gmail.com,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Thu, 02 Mar 2023 15:24:07 -0500
>> diff --git a/src/profiler.c b/src/profiler.c
>> index 81b5e7b0cf0..c99ed0a81a2 100644
>> --- a/src/profiler.c
>> +++ b/src/profiler.c
>> @@ -505,6 +505,8 @@ DEFUN ("profiler-memory-log",
>>  void
>>  malloc_probe (size_t size)
>>  {
>> +  if (EQ (backtrace_top_function (), QAutomatic_GC))
>> +    return;                     /* bug#60237 */
>>    eassert (HASH_TABLE_P (memory_log));
>>    record_backtrace (XHASH_TABLE (memory_log), min (size, MOST_POSITIVE_FIXNUM));
>>  }
>
> Shouldn't this be:
>
>   if (gc_in_progress)
>     return;

Sounds like a good idea.  If so that should apply to the cpu profiler
code as well.  It might be worthwhile to check the details to see if
there might be subtle differences (e.g. when we're running
`post-gc-hook` maybe?).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 04 Mar 2023 12:23:01 GMT) Full text and rfc822 format available.

Message #95 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sat, 04 Mar 2023 14:21:41 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: casouri <at> gmail.com,  luangruo <at> yahoo.com,  mickey <at> masteringemacs.org,
>   60237 <at> debbugs.gnu.org
> Date: Wed, 01 Mar 2023 12:39:03 -0500
> 
> >> For `emacs-29` I suggest we just use the patch below which should
> >> circumvent the problem.
> >
> > Fine with me, please install, and thanks.
> 
> Thanks, pushed.  Hopefully we can do a bit better on `master`, but
> I don't have time for it right now.  Maybe someone else?

I tried cargo-culting the cpu_gc_count stuff for the memory profiler,
see the patch below.  However, something is amiss: this assertion in
profiler.el sometimes triggers:

    (maphash
     (lambda (backtrace _count)
       (let* ((max (1- (length backtrace)))
              (head (aref backtrace max))
              (best-parent nil)
              (best-match (1+ max))
              (parents (gethash head fun-map)))
         (pcase-dolist (`(,i . ,parent) parents)
           (when t ;; (<= (- max i) best-match) ;Else, it can't be better.
             (let ((match max)
                   (imatch i))
               (cl-assert (>= match imatch))  <<<<<<<<<<<<<<<<<<<<<<<<<<<<
               (cl-assert (function-equal (aref backtrace max)
                                          (aref parent i)))

I cannot reliably reproduce this, and don't understand what causes the
assertion.  Any hints?

Here's the patch:

diff --git a/src/profiler.c b/src/profiler.c
index 8247b2e..92d8a0a 100644
--- a/src/profiler.c
+++ b/src/profiler.c
@@ -227,6 +227,9 @@ record_backtrace (log_t *log, EMACS_INT count)
 /* Separate counter for the time spent in the GC.  */
 static EMACS_INT cpu_gc_count;
 
+/* Separate counter for the memory allocations during GC.  */
+static EMACS_INT mem_gc_count;
+
 /* The current sampling interval in nanoseconds.  */
 static EMACS_INT current_sampling_interval;
 
@@ -451,7 +454,10 @@ DEFUN ("profiler-memory-start", Fprofiler_memory_start, Sprofiler_memory_start,
     error ("Memory profiler is already running");
 
   if (NILP (memory_log))
-    memory_log = make_log ();
+    {
+      mem_gc_count = 0;
+      memory_log = make_log ();
+    }
 
   profiler_memory_running = true;
 
@@ -495,6 +501,10 @@ DEFUN ("profiler-memory-log",
      more for our use afterwards since we can't rely on its special
      pre-allocated keys anymore.  So we have to allocate a new one.  */
   memory_log = profiler_memory_running ? make_log () : Qnil;
+  Fputhash (make_vector (1, QAutomatic_GC),
+	    make_fixnum (mem_gc_count),
+	    result);
+  mem_gc_count = 0;
   return result;
 }
 
@@ -506,10 +516,19 @@ DEFUN ("profiler-memory-log",
 malloc_probe (size_t size)
 {
   if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
-    /* FIXME: We should do something like what we did with `cpu_gc_count`.  */
-    return;
-  eassert (HASH_TABLE_P (memory_log));
-  record_backtrace (XHASH_TABLE (memory_log), min (size, MOST_POSITIVE_FIXNUM));
+    /* Special case the malloc-count inside GC because the hash-table
+       code is not prepared to be used while the GC is running.
+       More specifically it uses ASIZE at many places where it does
+       not expect the ARRAY_MARK_FLAG to be set.  We could try and
+       harden the hash-table code, but it doesn't seem worth the
+       effort.  */
+    mem_gc_count = saturated_add (mem_gc_count, 1);
+  else
+    {
+      eassert (HASH_TABLE_P (memory_log));
+      record_backtrace (XHASH_TABLE (memory_log),
+			min (size, MOST_POSITIVE_FIXNUM));
+    }
 }
 
 DEFUN ("function-equal", Ffunction_equal, Sfunction_equal, 2, 2, 0,




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Wed, 08 Mar 2023 16:35:02 GMT) Full text and rfc822 format available.

Message #98 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Wed, 08 Mar 2023 11:34:14 -0500
> I tried cargo-culting the cpu_gc_count stuff for the memory profiler,
> see the patch below.  However, something is amiss: this assertion in
> profiler.el sometimes triggers:
>
>     (maphash
>      (lambda (backtrace _count)
>        (let* ((max (1- (length backtrace)))
>               (head (aref backtrace max))
>               (best-parent nil)
>               (best-match (1+ max))
>               (parents (gethash head fun-map)))
>          (pcase-dolist (`(,i . ,parent) parents)
>            (when t ;; (<= (- max i) best-match) ;Else, it can't be better.
>              (let ((match max)
>                    (imatch i))
>                (cl-assert (>= match imatch))  <<<<<<<<<<<<<<<<<<<<<<<<<<<<
>                (cl-assert (function-equal (aref backtrace max)
>                                           (aref parent i)))
>
> I cannot reliably reproduce this, and don't understand what causes the
> assertion.  Any hints?

Hmm... I just took a look but can see neither why your change would
be more likely to trigger this error than the existing code for the
`cpu` case, nor why this assertion should always be true.

IOW, I'm going to have to find the original author to ask him what he
was thinking back then.

> Here's the patch:

Looks good.  Just one nitpick:

>  malloc_probe (size_t size)
>  {
>    if (EQ (backtrace_top_function (), QAutomatic_GC)) /* bug#60237 */
> -    /* FIXME: We should do something like what we did with `cpu_gc_count`.  */
> -    return;
> -  eassert (HASH_TABLE_P (memory_log));
> -  record_backtrace (XHASH_TABLE (memory_log), min (size, MOST_POSITIVE_FIXNUM));
> +    /* Special case the malloc-count inside GC because the hash-table
> +       code is not prepared to be used while the GC is running.
> +       More specifically it uses ASIZE at many places where it does
> +       not expect the ARRAY_MARK_FLAG to be set.  We could try and
> +       harden the hash-table code, but it doesn't seem worth the
> +       effort.  */
> +    mem_gc_count = saturated_add (mem_gc_count, 1);

Here we should increase by `size` rather than by 1.
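
To illustrate the nitpick: a sketch of accumulating allocated bytes with a
saturating add, so the counter pins at its maximum instead of overflowing.
The patch above calls `saturated_add`; the helper below is a hypothetical
stand-in written over `long long`, and `mem_gc_bytes` and
`count_gc_allocation` are illustrative names only.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a saturating add.  Assumes A >= 0 and
   B >= 0; returns LLONG_MAX instead of overflowing.  */
static long long
saturated_add_ll (long long a, long long b)
{
  if (a > LLONG_MAX - b)
    return LLONG_MAX;
  return a + b;
}

/* Bytes allocated while GC was running (illustrative name).  */
static long long mem_gc_bytes;

/* Account for one allocation of SIZE bytes: add SIZE, not 1, and
   clamp SIZE first so the conversion to long long cannot overflow.  */
static void
count_gc_allocation (size_t size)
{
  long long n = ((unsigned long long) size > (unsigned long long) LLONG_MAX
                 ? LLONG_MAX
                 : (long long) size);
  mem_gc_bytes = saturated_add_ll (mem_gc_bytes, n);
}
```

Adding `size` keeps the GC entry in the same unit as the other entries of
the memory log, which record allocation sizes rather than call counts.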


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Fri, 10 Mar 2023 18:29:02 GMT) Full text and rfc822 format available.

Message #101 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Fri, 10 Mar 2023 13:28:20 -0500
>> I tried cargo-culting the cpu_gc_count stuff for the memory profiler,
>> see the patch below.  However, something is amiss: this assertion in
>> profiler.el sometimes triggers:
>>
>>     (maphash
>>      (lambda (backtrace _count)
>>        (let* ((max (1- (length backtrace)))
>>               (head (aref backtrace max))
>>               (best-parent nil)
>>               (best-match (1+ max))
>>               (parents (gethash head fun-map)))
>>          (pcase-dolist (`(,i . ,parent) parents)
>>            (when t ;; (<= (- max i) best-match) ;Else, it can't be better.
>>              (let ((match max)
>>                    (imatch i))
>>                (cl-assert (>= match imatch))  <<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>                (cl-assert (function-equal (aref backtrace max)
>>                                           (aref parent i)))
>>
>> I cannot reliably reproduce this, and don't understand what causes the
>> assertion.  Any hints?
>
> Hmm... I just took a look but can see neither why your change would
> be more likely to trigger this error than the existing code for the
> `cpu` case, nor why this assertion should always be true.

I can imagine corner cases where this could trigger, but they all
involve funny business where we change `profiler-max-stack-depth` during
a single profiling run (I think you'd need to write ad-hoc ELisp code
for that).  The only other explanation I can see is that we
somehow end up with a backtrace that includes `Automatic_GC` somewhere
not at the top (maybe this can happen with a `post-gc-hook`?).

If you manage to reproduce it, I'd be interested to know the value of
`backtrace` and `parent` when the assertion fails (and maybe just save
the `log` hash-table so we can look at it).  It might be a symptom of
another bug.

And I still can't see how/why this would happen only for the `memory`
profiler and not for the `cpu` profiler, so I assume it can also happen
for the `cpu` profiler and we've just been lucky not to bump into it yet.

This said, I think the patch below should fix it for the `cpu` profiler
and a similar change should fix it for your patch (and the patch is
arguably right in the sense that without this `nil` entry, the backtrace
entry created for `Automatic_GC` is not really complete).


        Stefan


diff --git a/src/profiler.c b/src/profiler.c
index 8247b2e90c6..295c47a2acd 100644
--- a/src/profiler.c
+++ b/src/profiler.c
@@ -423,7 +423,7 @@ DEFUN ("profiler-cpu-log", Fprofiler_cpu_log, Sprofiler_cpu_log,
      more for our use afterwards since we can't rely on its special
      pre-allocated keys anymore.  So we have to allocate a new one.  */
   cpu_log = profiler_cpu_running ? make_log () : Qnil;
-  Fputhash (make_vector (1, QAutomatic_GC),
+  Fputhash (CALLN (Fvector, QAutomatic_GC, Qnil),
 	    make_fixnum (cpu_gc_count),
 	    result);
   cpu_gc_count = 0;





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Fri, 10 Mar 2023 20:58:02 GMT) Full text and rfc822 format available.

Message #104 received at 60237 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Fri, 10 Mar 2023 15:56:56 -0500
I pushed your change to `master`, with my patch on top.  Plus a few
other patches to reduce redundancy a bit and fix a FIXME.


        Stefan


Stefan Monnier [2023-03-10 13:28:20] wrote:

>>> I tried cargo-culting the cpu_gc_count stuff for the memory profiler,
>>> see the patch below.  However, something is amiss: this assertion in
>>> profiler.el sometimes triggers:
>>>
>>>     (maphash
>>>      (lambda (backtrace _count)
>>>        (let* ((max (1- (length backtrace)))
>>>               (head (aref backtrace max))
>>>               (best-parent nil)
>>>               (best-match (1+ max))
>>>               (parents (gethash head fun-map)))
>>>          (pcase-dolist (`(,i . ,parent) parents)
>>>            (when t ;; (<= (- max i) best-match) ;Else, it can't be better.
>>>              (let ((match max)
>>>                    (imatch i))
>>>                (cl-assert (>= match imatch))  <<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>>                (cl-assert (function-equal (aref backtrace max)
>>>                                           (aref parent i)))
>>>
>>> I cannot reliably reproduce this, and don't understand what causes the
>>> assertion.  Any hints?
>>
>> Hmm... I just took a look but can see neither why your change would
>> be more likely to trigger this error than the existing code for the
>> `cpu` case, nor why this assertion should always be true.
>
> I can imagine corner cases where this could trigger, but they all
> involve funny business where we change `profiler-max-stack-depth` during
> a single profiling run (I think you'd need to write ad-hoc ELisp code
> for that).  The only other explanation I can see is that we
> somehow end up with a backtrace that includes `Automatic_GC` somewhere
> not at the top (maybe this can happen with a `post-gc-hook`?).
>
> If you manage to reproduce it, I'd be interested to know the value of
> `backtrace` and `parent` when the assertion fails (and maybe just save
> the `log` hash-table so we can look at it).  It might be a symptom of
> another bug.
>
> And I still can't see how/why this would happen only for the `memory`
> profiler and not for the `cpu` profiler, so I assume it can also happen
> for the `cpu` profiler and we've just been lucky not to bump into it yet.
>
> This said, I think the patch below should fix it for the `cpu` profiler
> and a similar change should fix it for your patch (and the patch is
> arguably right in the sense that without this `nil` entry, the backtrace
> entry created for `Automatic_GC` is not really complete).
>
>
>         Stefan
>
>
> diff --git a/src/profiler.c b/src/profiler.c
> index 8247b2e90c6..295c47a2acd 100644
> --- a/src/profiler.c
> +++ b/src/profiler.c
> @@ -423,7 +423,7 @@ DEFUN ("profiler-cpu-log", Fprofiler_cpu_log, Sprofiler_cpu_log,
>       more for our use afterwards since we can't rely on its special
>       pre-allocated keys anymore.  So we have to allocate a new one.  */
>    cpu_log = profiler_cpu_running ? make_log () : Qnil;
> -  Fputhash (make_vector (1, QAutomatic_GC),
> +  Fputhash (CALLN (Fvector, QAutomatic_GC, Qnil),
>  	    make_fixnum (cpu_gc_count),
>  	    result);
>    cpu_gc_count = 0;





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Fri, 10 Mar 2023 23:53:01 GMT)

Message #107 received at 60237 <at> debbugs.gnu.org:

From: Po Lu <luangruo <at> yahoo.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, mickey <at> masteringemacs.org, casouri <at> gmail.com,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sat, 11 Mar 2023 07:52:40 +0800
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>>> I tried cargo-culting the cpu_gc_count stuff for the memory profiler,
>>> see the patch below.  However, something is amiss: this assertion in
>>> profiler.el sometimes triggers:
>>>
>>>     (maphash
>>>      (lambda (backtrace _count)
>>>        (let* ((max (1- (length backtrace)))
>>>               (head (aref backtrace max))
>>>               (best-parent nil)
>>>               (best-match (1+ max))
>>>               (parents (gethash head fun-map)))
>>>          (pcase-dolist (`(,i . ,parent) parents)
>>>            (when t ;; (<= (- max i) best-match) ;Else, it can't be better.
>>>              (let ((match max)
>>>                    (imatch i))
>>>                (cl-assert (>= match imatch))  <<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>>                (cl-assert (function-equal (aref backtrace max)
>>>                                           (aref parent i)))
>>>
>>> I cannot reliably reproduce this, and don't understand what causes the
>>> assertion.  Any hints?
>>
>> Hmm... I just took a look but can't see either why your change would
>> be more likely to trigger this error than the existing code for the
>> `cpu` case, nor why this assertion should always be true.
>
> I can imagine corner cases where this could trigger, but they all
> involve funny business where we change `profiler-max-stack-depth` during
> a single profiling run (I think you'd need to write ad-hoc ELisp code
> for that).  The only other explanation I can see is that we
> somehow end up with a backtrace that includes `Automatic_GC` somewhere
> not at the top (maybe this can happen with a `post-gc-hook`?).

What about gc_in_progress? Why can't we use that?
This should avoid everything related to post-gc-hook.
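
For illustration only, here is a minimal C sketch of the `gc_in_progress`
idea: the sampling path consults a flag set by the collector instead of
looking for a marker frame in the backtrace. Every name in it
(`sample_log`, `gc_running`, `add_sample`) is a hypothetical stand-in and
not the actual src/profiler.c code; the real flag and log structures may
differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a profiler log; not the real profiler.c type.  */
struct sample_log
{
  size_t gc_count;      /* samples that arrived while the collector ran */
  size_t traced_count;  /* samples that would get a full backtrace */
};

/* Plays the role of `gc_in_progress`: true only while GC proper runs.  */
bool gc_running;

/* Record COUNT samples.  While the collector is running we only bump a
   counter; otherwise we take the "normal" path (here just a second
   counter standing in for recording a backtrace).  */
void
add_sample (struct sample_log *samples, size_t count)
{
  if (gc_running)
    samples->gc_count += count;
  else
    samples->traced_count += count;
}
```

Because the flag would already be clear by the time `post-gc-hook` runs,
samples taken inside the hook would go down the normal path in this
sketch.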

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 11 Mar 2023 02:43:02 GMT)

Message #110 received at 60237 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Po Lu <luangruo <at> yahoo.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, mickey <at> masteringemacs.org, casouri <at> gmail.com,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Fri, 10 Mar 2023 21:41:57 -0500
>> I can imagine corner cases where this could trigger, but they all
>> involve funny business where we change `profiler-max-stack-depth` during
>> a single profiling run (I think you'd need to write ad-hoc ELisp code
>> for that).  The only other explanation I can see is that we
>> somehow end up with a backtrace that includes `Automatic_GC` somewhere
>> not at the top (maybe this can happen with a `post-gc-hook`?).
>
> What about gc_in_progress? Why can't we use that?

In the text you quote I simply try and describe the kinds of situations
where I think the problem can appear.  I don't know which of those are
actually possible, nor do I suggest what should be done about it.

And yes, maybe we can use `gc_in_progress`.
So far I haven't taken a look at that, but feel free to do so.

> This should avoid everything related to post-gc-hook.

Probably.  At the same time, if we're sampling while running
`post-gc-hook`, then it's safe to do the "normal" job of the sampling
code (the GC proper is completed already), so maybe the better thing to
do in that case is to treat it as a backtrace which has `Automatic GC`
as its root (i.e. ignore the part of the backtrace that's above
`Automatic GC`).
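
A rough C sketch of that truncation follows, under two loudly stated
assumptions: frames are stored innermost (most recent call) first, and
"above" refers to the callers of `Automatic GC` in the call-tree sense, so
that the marker frame becomes the outermost entry. The function name
`truncate_at_gc_root` and the use of plain strings for frames are
hypothetical; the real profiler stores Lisp objects.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical marker name, chosen to mirror the pseudo-frame discussed
   in this thread.  */
static const char *const GC_MARKER = "Automatic GC";

/* FRAMES holds N function names, innermost call first.  Return how many
   leading frames to keep so that the marker frame, if present, is the
   last (outermost) one.  Return N unchanged when there is no marker.  */
size_t
truncate_at_gc_root (const char **frames, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (frames[i], GC_MARKER) == 0)
      return i + 1;
  return n;
}
```

With a backtrace such as { "my-hook-fn", "run-hooks", "Automatic GC",
"cons", "my-command" } (made-up names), the sketch keeps the first three
frames, so the hook's time is attributed under `Automatic GC` rather than
under whatever happened to trigger the collection.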


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 11 Mar 2023 03:30:03 GMT)

Message #113 received at 60237 <at> debbugs.gnu.org:

From: Po Lu <luangruo <at> yahoo.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, mickey <at> masteringemacs.org, casouri <at> gmail.com,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sat, 11 Mar 2023 11:29:44 +0800
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> In the text you quote I simply try and describe the kinds of situations
> where I think the problem can appear.  I don't know which of those are
> actually possible, nor do I suggest what should be done about it.
>
> And yes, maybe we can use `gc_in_progress`.
> So far I haven't taken a look at that, but feel free to do so.

I thought you'd know a problem or two with `gc_in_progress', which is
why you decided to check for QAutomatic_GC in the backtrace.

> Probably.  At the same time, if we're sampling while running
> `post-gc-hook`, then it's safe to do the "normal" job of the sampling
> code (the GC proper is completed already), so maybe the better thing to
> do in that case is to treat it as a backtrace which has `Automatic GC`
> as its root (i.e. ignore the part of the backtrace that's above
> `Automatic GC`).

That makes sense, yes.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 11 Mar 2023 03:39:02 GMT)

Message #116 received at 60237 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Po Lu <luangruo <at> yahoo.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, mickey <at> masteringemacs.org, casouri <at> gmail.com,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Fri, 10 Mar 2023 22:38:33 -0500
>> In the text you quote I simply try and describe the kinds of situations
>> where I think the problem can appear.  I don't know which of those are
>> actually possible, nor do I suggest what should be done about it.
>> And yes, maybe we can use `gc_in_progress`.
>> So far I haven't taken a look at that, but feel free to do so.
> I thought you'd know a problem or two with `gc_in_progress', which is
> why you decided to check for QAutomatic_GC in the backtrace.

No.  I can't remember why I used `QAutomatic_GC` back when
that profiler was first introduced, but I guess it's because I hadn't
realized I could use `gc_in_progress`.
And now I just preserved the code that was there.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60237; Package emacs. (Sat, 11 Mar 2023 06:46:01 GMT)

Message #119 received at 60237 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237 <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sat, 11 Mar 2023 08:45:22 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: casouri <at> gmail.com,  luangruo <at> yahoo.com,  mickey <at> masteringemacs.org,
>   60237 <at> debbugs.gnu.org
> Date: Fri, 10 Mar 2023 15:56:56 -0500
> 
> I pushed your change to `master`, with my patch on top.  Plus a few
> other patches to reduce redundancy a bit and fix a FIXME.

Thanks.  I guess we can now close this issue?




Reply sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
You have taken responsibility. (Sat, 11 Mar 2023 17:47:02 GMT)

Notification sent to Mickey Petersen <mickey <at> masteringemacs.org>:
bug acknowledged by developer. (Sat, 11 Mar 2023 17:47:02 GMT)

Message #124 received at 60237-done <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, casouri <at> gmail.com, mickey <at> masteringemacs.org,
 60237-done <at> debbugs.gnu.org
Subject: Re: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a
 node
Date: Sat, 11 Mar 2023 12:45:51 -0500
>> I pushed your change to `master`, with my patch on top.  Plus a few
>> other patches to reduce redundancy a bit and fix a FIXME.
> Thanks.  I guess we can now close this issue?

I think so, yes.


        Stefan





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 09 Apr 2023 11:24:08 GMT)

This bug report was last modified 354 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.