GNU bug report logs -
#76180
[feature/igc] Remaining known tracing issues
Previous Next
To reply to this bug, email your comments to 76180 AT debbugs.gnu.org.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Mon, 10 Feb 2025 15:17:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Pip Cet <pipcet <at> protonmail.com>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Mon, 10 Feb 2025 15:17:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Currently, I'm aware of three known issues which would potentially
result in live objects not being traced, which may lead to crashes that
are very hard to debug. These are:
1. Bytecode objects violating the stack limit prosaically called
HORRIBLE_ESTIMATE will fail to scan some of their excessively-large
bytecode stack if interrupted by GC in execution.
2. Nativecomp code is sometimes compiled without -fomit-frame-pointer
and uses %rbp as a general-purpose register: if said nativecomp code
calls SETJMP directly, and setjmp() "scrambles" the frame pointer
register, as it unfortunately does on most systems, the "scrambled"
general-purpose value will not be traced and will not pin or keep alive
its referenced object.
3. setjmp buffers in general are allocated using igc_alloc_handler,
which does not scan the setjmp buffer conservatively. As it is
unpredictable which callee-saved registers might contain live
references, we need to scan the setjmp buffer conservatively. That
still won't scan the stack pointer or frame pointer if they are
"scrambled", so we must also ensure the frame pointer is never omitted
in compiled code.
Note that there are good reasons we're not seeing frequent crashes due
to these bugs: 1. doesn't happen because most bytecode objects are small
(and nativecomp is often in use). 2. affects only nativecomp functions
calling setjmp directly, not those calling intermediate Emacs C code
(which saves the %rbp register on the stack where MPS would see it).
3. Most likely requires aggresssive link-time optimization options to
cause bugs.
My idea is to propose three stop-gap patches for these issues as soon as
this has a bug number; more satisfying patches will require help from
bytecode/bytecode GC experts (Mattias), the nativecomp maintainer
(Andrea), and a potential rewrite to always keep jump buffers on the C
stack (Paul).
My hope is that once these stop-gap patches are installed, it's a good
point to call again for general testing of the feature/igc branch, under
certain restrictions and with specific instructions to reduce the
possibility of data loss due to crashes (e.g. run in GDB, set a
breakpoint on kill, don't use TTY signals, xwidgets, unusual toolkits).
I hope that "PGTK" and "WIDE_EMACS_INT" can be removed from the list of
build constellations expected to be unstable ASAP.
The branch will continue to contain bugs and we can't take any
responsibility, but at least avoiding those known bugs seems to result
in a usable work environment here, and it might be a good opportunity to
regain some of the early testers who were too frustrated by stability
issues to let them know that we've fixed some of them.
Thoughts?
Pip
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Mon, 10 Feb 2025 19:07:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 76180 <at> debbugs.gnu.org (full text, mbox):
On 2/10/25 07:15, Pip Cet wrote:
> a potential rewrite to always keep jump buffers on the C
> stack (Paul).
Another possibility would be to change igc_alloc_handler so that jump
buffers on the heap are scanned conservatively. Both possibilities have
merits; I am not sure which would be better.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Mon, 10 Feb 2025 19:25:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 76180 <at> debbugs.gnu.org (full text, mbox):
"Paul Eggert" <eggert <at> cs.ucla.edu> writes:
> On 2/10/25 07:15, Pip Cet wrote:
>> a potential rewrite to always keep jump buffers on the C
>> stack (Paul).
>
> Another possibility would be to change igc_alloc_handler so that jump
> buffers on the heap are scanned conservatively.
Thanks!
That's the stop-gap patch, yes: turning it into an "ambiguous" root.
> Both possibilities have merits; I am not sure which would be better.
(Note that MPS does not allow ordinary heap objects to be scanned
conservatively, just roots; and while it does allow roots to be
protected with memory barriers, we don't make use of that yet. My
understanding from reading some comments in the MPS code is that it's
strongly optimized for not having additional large unprotected roots in
addition to the C stack).
My main argument for preferring the (slightly harder) solution of
keeping the jump buffers on the C stack is that the handlerlist could
shrink once again; right now, a large handlerlist never shrinks and
treats many disjoint areas as a permanent ambiguous root, IIUC. As my
hope is to make GC more frequent and use protection more, large
unprotected roots seem like a problem right now.
Pip
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Tue, 11 Feb 2025 06:16:02 GMT)
Full text and
rfc822 format available.
Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):
Pip Cet <pipcet <at> protonmail.com> writes:
> 1. Bytecode objects violating the stack limit prosaically called
> HORRIBLE_ESTIMATE will fail to scan some of their excessively-large
> bytecode stack if interrupted by GC in execution.
The current bytecode stack design is not really my thing, TBH. But I
haven't followed what has been happening the last decades, so what do I
know.
There were times back then when I allocated byte code stacks using
alloca for each frame, which of course implies that the maximum stack
size was known at the time. (The idea behind that was to use
conservative stack marking and so on.)
Why that isn't the case now, I don't know. Or maybe I just didn't find
it. Also, I'm not sure if that is still the case, but it might be worth
checking if an overflow of the large stacks that are currently used
(500K or so) is detected.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Tue, 11 Feb 2025 18:33:02 GMT)
Full text and
rfc822 format available.
Message #17 received at 76180 <at> debbugs.gnu.org (full text, mbox):
Pip Cet via "Bug reports for GNU Emacs, the Swiss army knife of text
editors" <bug-gnu-emacs <at> gnu.org> writes:
> My hope is that once these stop-gap patches are installed, it's a good
> point to call again for general testing of the feature/igc branch, under
> certain restrictions and with specific instructions to reduce the
> possibility of data loss due to crashes (e.g. run in GDB, set a
> breakpoint on kill, don't use TTY signals, xwidgets, unusual toolkits).
> I hope that "PGTK" and "WIDE_EMACS_INT" can be removed from the list of
> build constellations expected to be unstable ASAP.
>
> The branch will continue to contain bugs and we can't take any
> responsibility, but at least avoiding those known bugs seems to result
> in a usable work environment here, and it might be a good opportunity to
> regain some of the early testers who were too frustrated by stability
> issues to let them know that we've fixed some of them.
>
> Thoughts?
I don't have anything specific to add, but the above sounds like a good
plan to me. Meanwhile, we also continue seeing other useful bug reports
trickling in.
Thanks again for working on this.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Tue, 11 Feb 2025 20:17:01 GMT)
Full text and
rfc822 format available.
Message #20 received at 76180 <at> debbugs.gnu.org (full text, mbox):
"Stefan Kangas" <stefankangas <at> gmail.com> writes:
> Pip Cet via "Bug reports for GNU Emacs, the Swiss army knife of text
> editors" <bug-gnu-emacs <at> gnu.org> writes:
>
>> My hope is that once these stop-gap patches are installed, it's a good
>> point to call again for general testing of the feature/igc branch, under
>> certain restrictions and with specific instructions to reduce the
>> possibility of data loss due to crashes (e.g. run in GDB, set a
>> breakpoint on kill, don't use TTY signals, xwidgets, unusual toolkits).
>> I hope that "PGTK" and "WIDE_EMACS_INT" can be removed from the list of
>> build constellations expected to be unstable ASAP.
>>
>> The branch will continue to contain bugs and we can't take any
>> responsibility, but at least avoiding those known bugs seems to result
>> in a usable work environment here, and it might be a good opportunity to
>> regain some of the early testers who were too frustrated by stability
>> issues to let them know that we've fixed some of them.
>>
>> Thoughts?
>
> I don't have anything specific to add, but the above sounds like a good
> plan to me. Meanwhile, we also continue seeing other useful bug reports
> trickling in.
Well, here are the patches. I tried for a while to actually produce the
bugs they attempt to fix, but those efforts weren't successful. As
these bugs would be a bit subtle, that's not overly surprising.
I've decided against putting HORRIBLE_ESTIMATE in lisp.h because then
we'd probably have to explain what it is :-)
From 3cc41ab8ba5dda51774157dea5a976b931bf6a90 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet <at> protonmail.com>
Subject: [PATCH 1/3] [MPS] Force nativecomp code to be compiled with frame
pointers
If nativecomp code keeps a reference to an MPS object in %rbp and
calls setjmp, the reference will be scrambled and cause crashes if MPS
moves or frees it.
* lisp/emacs-lisp/comp.el (native-comp-compiler-options) [HAVE_MPS]:
Add "-fno-omit-frame-pointer".
---
lisp/emacs-lisp/comp.el | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lisp/emacs-lisp/comp.el b/lisp/emacs-lisp/comp.el
index 20b9c140228..86944b1a645 100644
--- a/lisp/emacs-lisp/comp.el
+++ b/lisp/emacs-lisp/comp.el
@@ -91,7 +91,8 @@ native-comp-bootstrap-deny-list
:type '(repeat regexp)
:version "28.1")
-(defcustom native-comp-compiler-options nil
+(defcustom native-comp-compiler-options
+ (cond ((featurep 'mps) (list "-fno-omit-frame-pointer")))
"Command line options passed verbatim to GCC compiler.
Note that not all options are meaningful and some options might even
break your Emacs. Use at your own risk.
--
2.48.1
From 2c8c6f8013aaf6a80ba0e97b99e8bc156c31b3ff Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet <at> protonmail.com>
Subject: [PATCH 2/3] [MPS] Refuse to read byte-code objects with deep stacks
HORRIBLE_ESTIMATE in igc.c provides an upper limit on how deep a byte
code object's stack can be and still be handled appropriately by igc.
Enforce that.
* src/alloc.c (Fmake_byte_code):
* src/lread.c (bytecode_from_rev_list): Error out rather than creating
a crashable byte-code object.-
---
src/alloc.c | 5 +++++
src/lread.c | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/src/alloc.c b/src/alloc.c
index 7bad029b858..f8bcb1ef717 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -3820,6 +3820,11 @@ and (optional) INTERACTIVE-SPEC.
&& FIXNATP (args[CLOSURE_STACK_DEPTH])))
error ("Invalid byte-code object");
+#ifdef HAVE_MPS
+ if (XFIXNAT (args[CLOSURE_STACK_DEPTH]) >= 1022)
+ error ("Byte-code object uses too much stack, increase HORRIBLE_ESTIMATE!");
+#endif
+
#ifndef HAVE_MPS
/* Bytecode must be immovable. */
pin_string (args[CLOSURE_CODE]);
diff --git a/src/lread.c b/src/lread.c
index e8cd689d794..67c1bbe5f7b 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3483,6 +3483,11 @@ bytecode_from_rev_list (Lisp_Object elems, Lisp_Object readcharfun)
|| NILP (vec[CLOSURE_CONSTANTS]))))))
invalid_syntax ("Invalid byte-code object", readcharfun);
+#ifdef HAVE_MPS
+ if (XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
+ error ("Byte-code object uses too much stack, increase HORRIBLE_ESTIMATE!");
+#endif
+
if (STRINGP (vec[CLOSURE_CODE]))
{
if (STRING_MULTIBYTE (vec[CLOSURE_CODE]))
--
2.48.1
From ad35aecca5c6270faeaff03cae4ff682d7503b35 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet <at> protonmail.com>
Subject: [PATCH 3/3] [MPS] Ensure sys_jmp buffers are scanned ambiguously
* src/igc.c (igc_alloc_handler): Use 'igc_xzalloc_ambig', not 'alloc'.
---
src/igc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/src/igc.c b/src/igc.c
index 9ed5c983a5d..17816c3a74c 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -4491,7 +4491,14 @@ igc_alloc_blv (void)
void *
igc_alloc_handler (void)
{
- struct handler *h = alloc (sizeof *h, IGC_OBJ_HANDLER);
+ /* struct handler contains a jmp_buf, which means it must be scanned
+ ambiguously in case one of the registers stored in the jmp_buf
+ refers to an MPS object.
+
+ On Windows 64, jmp_buf is also 16-byte aligned, which causes
+ trouble with the MPS default alignment.
+ */
+ struct handler *h = igc_xzalloc_ambig (sizeof *h);
return h;
}
--
2.48.1
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Tue, 11 Feb 2025 20:28:01 GMT)
Full text and
rfc822 format available.
Message #23 received at 76180 <at> debbugs.gnu.org (full text, mbox):
Pip Cet <pipcet <at> protonmail.com> writes:
> diff --git a/src/lread.c b/src/lread.c
> index e8cd689d794..67c1bbe5f7b 100644
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -3483,6 +3483,11 @@ bytecode_from_rev_list (Lisp_Object elems, Lisp_Object readcharfun)
> || NILP (vec[CLOSURE_CONSTANTS]))))))
> invalid_syntax ("Invalid byte-code object", readcharfun);
>
> +#ifdef HAVE_MPS
> + if (XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
> + error ("Byte-code object uses too much stack, increase HORRIBLE_ESTIMATE!");
> +#endif
> +
> if (STRINGP (vec[CLOSURE_CODE]))
> {
> if (STRING_MULTIBYTE (vec[CLOSURE_CODE]))
> --
> 2.48.1
Sorry, this hunk wasn't the one I meant to send. This one needs to be
applied after it:
diff --git a/src/lread.c b/src/lread.c
index a638b6eec8e..ae312874574 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3576,7 +3576,7 @@ bytecode_from_rev_list (Lisp_Object elems, Lisp_Object readcharfun)
invalid_syntax ("Invalid byte-code object", readcharfun);
#ifdef HAVE_MPS
- if (XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
+ if (size > CLOSURE_STACK_DEPTH && XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
error ("Byte-code object uses too much stack, increase HORRIBLE_ESTIMATE!");
#endif
to fix the obvious problem :-)
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Wed, 12 Feb 2025 12:36:01 GMT)
Full text and
rfc822 format available.
Message #26 received at 76180 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 11 Feb 2025 20:27:23 +0000
> From: Pip Cet <pipcet <at> protonmail.com>
> Cc: 76180 <at> debbugs.gnu.org, gerd.moellmann <at> gmail.com, mattiase <at> acm.org, acorallo <at> gnu.org, eggert <at> cs.ucla.edu, eliz <at> gnu.org
>
> Pip Cet <pipcet <at> protonmail.com> writes:
>
> > diff --git a/src/lread.c b/src/lread.c
> > index e8cd689d794..67c1bbe5f7b 100644
> > --- a/src/lread.c
> > +++ b/src/lread.c
> > @@ -3483,6 +3483,11 @@ bytecode_from_rev_list (Lisp_Object elems, Lisp_Object readcharfun)
> > || NILP (vec[CLOSURE_CONSTANTS]))))))
> > invalid_syntax ("Invalid byte-code object", readcharfun);
> >
> > +#ifdef HAVE_MPS
> > + if (XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
> > + error ("Byte-code object uses too much stack, increase HORRIBLE_ESTIMATE!");
> > +#endif
> > +
> > if (STRINGP (vec[CLOSURE_CODE]))
> > {
> > if (STRING_MULTIBYTE (vec[CLOSURE_CODE]))
> > --
> > 2.48.1
>
> Sorry, this hunk wasn't the one I meant to send. This one needs to be
> applied after it:
>
> diff --git a/src/lread.c b/src/lread.c
> index a638b6eec8e..ae312874574 100644
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -3576,7 +3576,7 @@ bytecode_from_rev_list (Lisp_Object elems, Lisp_Object readcharfun)
> invalid_syntax ("Invalid byte-code object", readcharfun);
>
> #ifdef HAVE_MPS
> - if (XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
> + if (size > CLOSURE_STACK_DEPTH && XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
> error ("Byte-code object uses too much stack, increase HORRIBLE_ESTIMATE!");
> #endif
>
>
> to fix the obvious problem :-)
Can't we figure out how far to scan, instead of using some
more-or-less arbitrary constant? The comments in scan_bc seem to
imply that figuring out the distance to scan should be possible?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Wed, 12 Feb 2025 12:42:02 GMT)
Full text and
rfc822 format available.
Message #29 received at 76180 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 11 Feb 2025 20:16:21 +0000
> From: Pip Cet <pipcet <at> protonmail.com>
> Cc: 76180 <at> debbugs.gnu.org, gerd.moellmann <at> gmail.com, mattiase <at> acm.org, acorallo <at> gnu.org, eggert <at> cs.ucla.edu, eliz <at> gnu.org
>
> --- a/lisp/emacs-lisp/comp.el
> +++ b/lisp/emacs-lisp/comp.el
> @@ -91,7 +91,8 @@ native-comp-bootstrap-deny-list
> :type '(repeat regexp)
> :version "28.1")
>
> -(defcustom native-comp-compiler-options nil
> +(defcustom native-comp-compiler-options
> + (cond ((featurep 'mps) (list "-fno-omit-frame-pointer")))
> "Command line options passed verbatim to GCC compiler.
How about always using -fno-omit-frame-pointer, not as part of a user
option? We could put it in some variable for those few (who?) who
might want to enable -fomit-frame-pointer. I see no reason to expose
this to users, do you?
Alternatively, maybe the nativecomp code could be modified not to use
RBP?
Andrea, WDYT?
Thanks.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Fri, 14 Feb 2025 12:32:01 GMT)
Full text and
rfc822 format available.
Message #32 received at 76180 <at> debbugs.gnu.org (full text, mbox):
"Eli Zaretskii" <eliz <at> gnu.org> writes:
>> Date: Tue, 11 Feb 2025 20:16:21 +0000
>> From: Pip Cet <pipcet <at> protonmail.com>
>> Cc: 76180 <at> debbugs.gnu.org, gerd.moellmann <at> gmail.com, mattiase <at> acm.org, acorallo <at> gnu.org, eggert <at> cs.ucla.edu, eliz <at> gnu.org
>>
>> --- a/lisp/emacs-lisp/comp.el
>> +++ b/lisp/emacs-lisp/comp.el
>> @@ -91,7 +91,8 @@ native-comp-bootstrap-deny-list
>> :type '(repeat regexp)
>> :version "28.1")
>>
>> -(defcustom native-comp-compiler-options nil
>> +(defcustom native-comp-compiler-options
>> + (cond ((featurep 'mps) (list "-fno-omit-frame-pointer")))
>> "Command line options passed verbatim to GCC compiler.
>
> How about always using -fno-omit-frame-pointer, not as part of a user
> option?
I agree that would be better. Will come up with a patch once I'm
convinced the phantom bug in my MPS build I was chasing for hours is
gone.
> We could put it in some variable for those few (who?) who might want
> to enable -fomit-frame-pointer. I see no reason to expose this to
> users, do you?
No, it's a mistake to give users the option to compile with
-fomit-frame-pointer, IMHO, so let's not do that.
> Alternatively, maybe the nativecomp code could be modified not to use
> RBP?
I'm not sure what you mean: the code is generated by libgccjit. We
could try to make all calls to setjmp go through a wrapper which saves
%rbp, but that is very difficult because setjmp is not an ordinary
function and not treated as one by GCC (bug#46824), so I'd rather not
(nevermind the very different API used by windows).
Thanks!
Pip
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Fri, 14 Feb 2025 12:39:02 GMT)
Full text and
rfc822 format available.
Message #35 received at 76180 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 14 Feb 2025 12:31:15 +0000
> From: Pip Cet <pipcet <at> protonmail.com>
> Cc: acorallo <at> gnu.org, stefankangas <at> gmail.com, 76180 <at> debbugs.gnu.org, gerd.moellmann <at> gmail.com, mattiase <at> acm.org, eggert <at> cs.ucla.edu
>
> "Eli Zaretskii" <eliz <at> gnu.org> writes:
>
> > Alternatively, maybe the nativecomp code could be modified not to use
> > RBP?
>
> I'm not sure what you mean: the code is generated by libgccjit.
Oh, I didn't realize that. If we really have no control over this,
then maybe a bug report against libgccjit is in order?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Fri, 14 Feb 2025 12:52:02 GMT)
Full text and
rfc822 format available.
Message #38 received at 76180 <at> debbugs.gnu.org (full text, mbox):
"Eli Zaretskii" <eliz <at> gnu.org> writes:
>> Date: Tue, 11 Feb 2025 20:27:23 +0000
>> From: Pip Cet <pipcet <at> protonmail.com>
>> Cc: 76180 <at> debbugs.gnu.org, gerd.moellmann <at> gmail.com, mattiase <at> acm.org, acorallo <at> gnu.org, eggert <at> cs.ucla.edu, eliz <at> gnu.org
>>
>> Pip Cet <pipcet <at> protonmail.com> writes:
>>
>> > diff --git a/src/lread.c b/src/lread.c
>> > index e8cd689d794..67c1bbe5f7b 100644
>> > --- a/src/lread.c
>> > +++ b/src/lread.c
>> > @@ -3483,6 +3483,11 @@ bytecode_from_rev_list (Lisp_Object elems, Lisp_Object readcharfun)
>> > || NILP (vec[CLOSURE_CONSTANTS]))))))
>> > invalid_syntax ("Invalid byte-code object", readcharfun);
>> >
>> > +#ifdef HAVE_MPS
>> > + if (XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
>> > + error ("Byte-code object uses too much stack, increase HORRIBLE_ESTIMATE!");
>> > +#endif
>> > +
>> > if (STRINGP (vec[CLOSURE_CODE]))
>> > {
>> > if (STRING_MULTIBYTE (vec[CLOSURE_CODE]))
>> > --
>> > 2.48.1
>>
>> Sorry, this hunk wasn't the one I meant to send. This one needs to be
>> applied after it:
>>
>> diff --git a/src/lread.c b/src/lread.c
>> index a638b6eec8e..ae312874574 100644
>> --- a/src/lread.c
>> +++ b/src/lread.c
>> @@ -3576,7 +3576,7 @@ bytecode_from_rev_list (Lisp_Object elems, Lisp_Object readcharfun)
>> invalid_syntax ("Invalid byte-code object", readcharfun);
>>
>> #ifdef HAVE_MPS
>> - if (XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
>> + if (size > CLOSURE_STACK_DEPTH && XFIXNAT (vec[CLOSURE_STACK_DEPTH]) >= 1022)
>> error ("Byte-code object uses too much stack, increase HORRIBLE_ESTIMATE!");
>> #endif
>>
>>
>> to fix the obvious problem :-)
>
> Can't we figure out how far to scan, instead of using some
> more-or-less arbitrary constant? The comments in scan_bc seem to
> imply that figuring out the distance to scan should be possible?
I've put a lot of debugging statements in, and there are bytecode stacks
in excess of 1024 bytes, but they're rare (org mode and cedet, in
particular). It also depends on how you build: nativecomp builds seem
to have larger bytecode stacks...
The problem is that the top of stack is usually kept in a register while
exec_byte_code executes, and we don't know which one. However, I don't
really understand why struct bc_frame can't be expanded to include the
max_stack value when we set up the frame; that way, we'd still be
scanning too much in some cases (since max_stack may not be reached),
but never too little.
We could scan the bytecode stack non-conservatively, but that would
require clearing the stack when we build a new frame. However, even if
we scan conservatively, it's a lot easier to skip zero words than to
look up random words left over from a previous iteration which are most
likely no longer valid.
I must confess that I've locally changed scan_bc to scan the entire
bytecode stack. It may be easier to do that and use memclear when we're
popping a frame...
Pip
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Fri, 14 Feb 2025 12:59:02 GMT)
Full text and
rfc822 format available.
Message #41 received at 76180 <at> debbugs.gnu.org (full text, mbox):
"Eli Zaretskii" <eliz <at> gnu.org> writes:
>> Date: Fri, 14 Feb 2025 12:31:15 +0000
>> From: Pip Cet <pipcet <at> protonmail.com>
>> Cc: acorallo <at> gnu.org, stefankangas <at> gmail.com, 76180 <at> debbugs.gnu.org, gerd.moellmann <at> gmail.com, mattiase <at> acm.org, eggert <at> cs.ucla.edu
>>
>> "Eli Zaretskii" <eliz <at> gnu.org> writes:
>>
>> > Alternatively, maybe the nativecomp code could be modified not to use
>> > RBP?
>>
>> I'm not sure what you mean: the code is generated by libgccjit.
>
> Oh, I didn't realize that. If we really have no control over this,
> then maybe a bug report against libgccjit is in order?
I think I misexpressed myself. The problem is that libgccjit acts like
GCC, generating new native code which may or may not use %rbp depending
on the compiler options specified by comp.c. Old versions of libgccjit
had no way of passing such compiler options, but current ones do.
The question is whether we're happy appending "-fno-omit-frame-pointer",
or whether we want to be ambitious and refuse to accept
"-fomit-frame-pointer".
I haven't tried this, but passing the wrong options to libgccjit might
make it assume the wrong ABI or hardware, so causing a crash for a
malicious user is probably not something we can avoid; the only question
is how we keep bug reports focussed so we don't have to keep this option
in mind for every single bug.
Pip
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#76180
; Package
emacs
.
(Fri, 14 Feb 2025 13:11:02 GMT)
Full text and
rfc822 format available.
Message #44 received at 76180 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 14 Feb 2025 12:57:49 +0000
> From: Pip Cet <pipcet <at> protonmail.com>
> Cc: acorallo <at> gnu.org, stefankangas <at> gmail.com, 76180 <at> debbugs.gnu.org, gerd.moellmann <at> gmail.com, mattiase <at> acm.org, eggert <at> cs.ucla.edu
>
> "Eli Zaretskii" <eliz <at> gnu.org> writes:
>
> >> I'm not sure what you mean: the code is generated by libgccjit.
> >
> > Oh, I didn't realize that. If we really have no control over this,
> > then maybe a bug report against libgccjit is in order?
>
> I think I misexpressed myself. The problem is that libgccjit acts like
> GCC, generating new native code which may or may not use %rbp depending
> on the compiler options specified by comp.c. Old versions of libgccjit
> had no way of passing such compiler options, but current ones do.
>
> The question is whether we're happy appending "-fno-omit-frame-pointer",
> or whether we want to be ambitious and refuse to accept
> "-fomit-frame-pointer".
I think the former. And maybe not to append that at the very end, so
as to leave a "fire escape" for those who, for some reasons, want to
use -fomit-frame-pointer, and presumably know what they are doing.
This bug report was last modified 7 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.