Package: emacs;
Reported by: Pip Cet <pipcet <at> protonmail.com>
Date: Wed, 22 Jan 2025 10:20:01 UTC
Severity: normal
Done: Pip Cet <pipcet <at> protonmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 75754 in the body.
You can then email your comments to 75754 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Wed, 22 Jan 2025 10:20:01 GMT) Full text and rfc822 format available.Pip Cet <pipcet <at> protonmail.com>
:bug-gnu-emacs <at> gnu.org
.
(Wed, 22 Jan 2025 10:20:01 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: bug-gnu-emacs <at> gnu.org, Paul Eggert <eggert <at> cs.ucla.edu> Subject: styled_format stack usage/GC protection Date: Wed, 22 Jan 2025 10:18:32 +0000
This popped up while trying to debug the feature/igc branch and looking into formatted output functions to replace sprintf/snprintf: The function styled_format in editfns.c does this: enum { /* Maximum precision for a %f conversion such that the trailing output digit might be nonzero. Any precision larger than this will not yield useful information. */ USEFUL_PRECISION_MAX = ((1 - LDBL_MIN_EXP) * (FLT_RADIX == 2 || FLT_RADIX == 10 ? 1 : FLT_RADIX == 16 ? 4 : -1)), /* Maximum number of bytes (including terminating null) generated by any format, if precision is no more than USEFUL_PRECISION_MAX. On all practical hosts, %Lf is the worst case. */ SPRINTF_BUFSIZE = (sizeof "-." + (LDBL_MAX_10_EXP + 1) + USEFUL_PRECISION_MAX) }; char initial_buffer[1000 + SPRINTF_BUFSIZE]; USE_SAFE_ALLOCA; sa_avail -= sizeof initial_buffer; On my system, the relevant values are: USEFUL_PRECISION_MAX 16382 SPRINTF_BUFSIZE 21318 MAX_ALLOCA 16384 After the code above executes, sa_avail is -5934. styled_format proceeds to allocate a structure using SAFE_ALLOCA: /* Information recorded for each format spec. */ struct info { /* The corresponding argument, converted to string if conversion was needed. */ Lisp_Object argument; /* The start and end bytepos in the output string. */ ptrdiff_t start, end; /* The start bytepos of the spec in the format string. */ ptrdiff_t fbeg; /* Whether the argument is a string with intervals. */ bool_bf intervals : 1; } *info; info = SAFE_ALLOCA (alloca_size); While the alloca_size value is small, sa_avail is negative when we enter SAFE_ALLOCA, so SAFE_ALLOCA uses xmalloc to allocate the memory on the heap. The structure contains a Lisp_Object. This Lisp_Object must be protected from GC by being present on the C stack if GC can ever happen in this function. SAFE_ALLOCA doesn't protect it. I'm not entirely sure this is a problem, but (let ((print-unreadable-function (lambda (&rest args) (garbage-collect)))) (format "%S" (symbol-function '+))) produces this backtrace: #0 garbage_collect () at alloc.c:6450 #1 0x00005555557f7700 in Fgarbage_collect () at alloc.c:6697 #2 0x000055555582e1ee in eval_sub (form=XIL(0x7ffff4dab073)) at eval.c:2584 #3 0x0000555555828d9f in Fprogn (body=XIL(0)) at eval.c:439 #4 0x0000555555830604 in funcall_lambda (fun=XIL(0x555556045c9d), nargs=2, arg_vector=0x7fffffff6b78) at eval.c:3339 #5 0x000055555582f62d in funcall_general (fun=XIL(0x555556045c9d), numargs=2, args=0x7fffffff6b78) at eval.c:3033 #6 0x000055555582f89a in Ffuncall (nargs=3, args=0x7fffffff6b70) at eval.c:3082 #7 0x0000555555863e8a in print_vectorlike_unreadable (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true, buf=0x7fffffff6ce0 "\220m\377\377\377\177") at print.c:1683 #8 0x0000555555866e5f in print_object (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true) at print.c:2647 #9 0x0000555555862ec2 in print (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true) at print.c:1296 #10 0x0000555555861cef in Fprin1_to_string (object=XIL(0x555555ee7765), noescape=XIL(0), overrides=XIL(0)) at print.c:814 #11 0x000055555581fd14 in styled_format (nargs=2, args=0x7fffffffca20, message=false) at editfns.c:3633 #12 0x000055555581f444 in Fformat (nargs=2, args=0x7fffffffca20) at editfns.c:3370 indicating that GC can happen. The code attempts to protect the current argument by keeping it in a redundant automatic variable: Lisp_Object arg = spec->argument; ... spec->argument = arg = Fprin1_to_string (arg, noescape, Qnil); I don't know whether this approach works; maybe a very smart compiler could eliminate the automatic variable and keep only the copy on the heap for some of the time, but it's unlikely that that's valid while calling GC. In any case, this protects only the current argument; if we detect a multibyte situation too late, we may restart the loop and, I think, reuse info->argument values which were unprotected during GC. The problem on the feature/igc branch may explain some of the crashes we've seen during heavy elisp usage. On the master branch, the problems are: 1. excessive stack usage: exceeding MAX_ALLOCA by creating a large temporary buffer on the stack seems unwise. 2. SAFE_ALLOCA is in use, but on many systems it's impossible to use stack space. 3. GC protection needs to be verified, or fixed if we accept that there is a possibility a Lisp_Object might be unprotected. Note that fixing (2) is strictly optional; however, fixing only (2) would make (3) a latent but still real bug, which may be worse than the current situation. On the feature/igc branch, where protection is definitely required, I'm thinking about a fix. A quick fix would be to replace all elements of struct info by Lisp_Object values and use SAFE_ALLOCA_LISP.
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Wed, 22 Jan 2025 15:57:02 GMT) Full text and rfc822 format available.Message #8 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: 75754 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Wed, 22 Jan 2025 17:56:09 +0200
> Date: Wed, 22 Jan 2025 10:18:32 +0000 > From: Pip Cet via "Bug reports for GNU Emacs, > the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> > > While the alloca_size value is small, sa_avail is negative when we enter > SAFE_ALLOCA, so SAFE_ALLOCA uses xmalloc to allocate the memory on the > heap. > > The structure contains a Lisp_Object. This Lisp_Object must be > protected from GC by being present on the C stack if GC can ever happen > in this function. SAFE_ALLOCA doesn't protect it. > > I'm not entirely sure this is a problem, but > > (let ((print-unreadable-function (lambda (&rest args) (garbage-collect)))) > (format "%S" (symbol-function '+))) > > produces this backtrace: > > #0 garbage_collect () at alloc.c:6450 > #1 0x00005555557f7700 in Fgarbage_collect () at alloc.c:6697 > #2 0x000055555582e1ee in eval_sub (form=XIL(0x7ffff4dab073)) at eval.c:2584 > #3 0x0000555555828d9f in Fprogn (body=XIL(0)) at eval.c:439 > #4 0x0000555555830604 in funcall_lambda (fun=XIL(0x555556045c9d), nargs=2, arg_vector=0x7fffffff6b78) at eval.c:3339 > #5 0x000055555582f62d in funcall_general (fun=XIL(0x555556045c9d), numargs=2, args=0x7fffffff6b78) at eval.c:3033 > #6 0x000055555582f89a in Ffuncall (nargs=3, args=0x7fffffff6b70) at eval.c:3082 > #7 0x0000555555863e8a in print_vectorlike_unreadable > (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true, buf=0x7fffffff6ce0 "\220m\377\377\377\177") at print.c:1683 > #8 0x0000555555866e5f in print_object (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true) at print.c:2647 > #9 0x0000555555862ec2 in print (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true) at print.c:1296 > #10 0x0000555555861cef in Fprin1_to_string (object=XIL(0x555555ee7765), noescape=XIL(0), overrides=XIL(0)) at print.c:814 > #11 0x000055555581fd14 in styled_format (nargs=2, args=0x7fffffffca20, message=false) at editfns.c:3633 > #12 0x000055555581f444 in Fformat (nargs=2, args=0x7fffffffca20) at editfns.c:3370 > > indicating that GC can happen. > > The code attempts to protect the current argument by keeping it in a > redundant automatic variable: > > Lisp_Object arg = spec->argument; > ... > spec->argument = arg = Fprin1_to_string (arg, noescape, Qnil); I think indeed the protection here is by having the Lisp object in an automatic variable. > In any case, this protects only the current argument; if we detect a > multibyte situation too late, we may restart the loop and, I think, > reuse info->argument values which were unprotected during GC. Can you elaborate how this could happen by walking through the relevant code? > On the feature/igc branch, where protection is definitely required, I'm > thinking about a fix. A quick fix would be to replace all elements of > struct info by Lisp_Object values and use SAFE_ALLOCA_LISP. Are automatic variables not protected on the igc branch as they are with the old GC? Can we use AUTO_STRING to solve the problem?
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Wed, 22 Jan 2025 17:19:01 GMT) Full text and rfc822 format available.Message #11 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: 75754 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Wed, 22 Jan 2025 17:17:47 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Wed, 22 Jan 2025 10:18:32 +0000 >> From: Pip Cet via "Bug reports for GNU Emacs, >> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> >> >> While the alloca_size value is small, sa_avail is negative when we enter >> SAFE_ALLOCA, so SAFE_ALLOCA uses xmalloc to allocate the memory on the >> heap. >> >> The structure contains a Lisp_Object. This Lisp_Object must be >> protected from GC by being present on the C stack if GC can ever happen >> in this function. SAFE_ALLOCA doesn't protect it. >> >> I'm not entirely sure this is a problem, but >> >> (let ((print-unreadable-function (lambda (&rest args) (garbage-collect)))) >> (format "%S" (symbol-function '+))) >> >> produces this backtrace: >> >> #0 garbage_collect () at alloc.c:6450 >> #1 0x00005555557f7700 in Fgarbage_collect () at alloc.c:6697 >> #2 0x000055555582e1ee in eval_sub (form=XIL(0x7ffff4dab073)) at eval.c:2584 >> #3 0x0000555555828d9f in Fprogn (body=XIL(0)) at eval.c:439 >> #4 0x0000555555830604 in funcall_lambda (fun=XIL(0x555556045c9d), nargs=2, arg_vector=0x7fffffff6b78) at eval.c:3339 >> #5 0x000055555582f62d in funcall_general (fun=XIL(0x555556045c9d), numargs=2, args=0x7fffffff6b78) at eval.c:3033 >> #6 0x000055555582f89a in Ffuncall (nargs=3, args=0x7fffffff6b70) at eval.c:3082 >> #7 0x0000555555863e8a in print_vectorlike_unreadable >> (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true, buf=0x7fffffff6ce0 "\220m\377\377\377\177") at print.c:1683 >> #8 0x0000555555866e5f in print_object (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true) at print.c:2647 >> #9 0x0000555555862ec2 in print (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true) at print.c:1296 >> #10 0x0000555555861cef in Fprin1_to_string (object=XIL(0x555555ee7765), noescape=XIL(0), overrides=XIL(0)) at print.c:814 >> #11 0x000055555581fd14 in styled_format (nargs=2, args=0x7fffffffca20, message=false) at editfns.c:3633 >> #12 0x000055555581f444 in Fformat (nargs=2, args=0x7fffffffca20) at editfns.c:3370 >> >> indicating that GC can happen. >> >> The code attempts to protect the current argument by keeping it in a ^^^^^^^ This part is important: if there are several arguments, only one of them (at most) is protected by the automatic variable. >> redundant automatic variable: >> >> Lisp_Object arg = spec->argument; >> ... >> spec->argument = arg = Fprin1_to_string (arg, noescape, Qnil); > > I think indeed the protection here is by having the Lisp object in an > automatic variable. The important point is that protection appears to me to be insufficient, because there's a single automatic variable but many args. The very minor potential issue is that it is often an important optimization for code to know it is the sole owner of a malloc'd region. That's part of the reason ATTRIBUTE_MALLOC was added. The GCC analyzer certainly does very smart things with this attribute. The technical details of conservative GC are that an automatic variable has to live on the stack or in a register because it's the only way for the code to reproduce its value (if it's not reused, there's no harm in freeing it early). In this case, a very smart compiler might realize that it is the sole owner of the malloc'd memory (that's what ATTRIBUTE_MALLOC is about) and doesn't have to redundantly keep a stack variable representing a value it can read back from such memory. I wouldn't usually worry about this, because SAFE_ALLOC uses alloca or xmalloc and alloca isn't optimized. In this case, the alloca case is unreachable and presumably optimized out, so it's merely about modifying a variable which lives in two places in only one of them; with -Os, it'd be natural to modify the heap location directly and optimize out the register location. >> In any case, this protects only the current argument; if we detect a >> multibyte situation too late, we may restart the loop and, I think, >> reuse info->argument values which were unprotected during GC. > > Can you elaborate how this could happen by walking through the > relevant code? It's about this label: /* If we start out planning a unibyte result, then discover it has to be multibyte, we jump back to retry. */ retry: When we reach the label, we reestablish ispec (the current index into the info array), but not nspec (the last valid index into that array). That means that this code: if ((conversion == 'S' || (conversion == 's' && ! STRINGP (arg) && ! SYMBOLP (arg)))) { if (EQ (arg, args[n])) { Lisp_Object noescape = conversion == 'S' ? Qnil : Qt; spec->argument = arg = Fprin1_to_string (arg, noescape, Qnil); if (STRING_MULTIBYTE (arg) && ! multibyte) { multibyte = true; goto retry; } } conversion = 's'; } only has a real effect during the first attempt. I'll describe a scenario, but it might be best to test this and provide a gdb log: The first argument is converted to a unibyte string S. S is stored in info[0]->argument. This is one reference to it, the second one is "arg". We advance to the next argument, making "arg" refer to it and leaving S unprotected. Fine, as long as we don't call out to Lisp. But Vprint_unreadable_function is bound and we do call out to Lisp while trying to convert the second argument. GC happens and collects the first string. Then, the Lisp call returns and provides us with an unexpected multibyte string. So we set multibyte = true and goto retry, set "arg" to spec->argument, which is the now-invalid pointer to S. We then print it, but the string data of the collected string has been compacted and now points to another string's data. We call copy_text on SDATA (S), which copies bogus data into buf, which is then converted to a string and returned. However, as far as I can tell, with the old GC, print.c calls out to Lisp very rarely, so this is unlikely to happen in practice. With IGC, the situation is much worse: GC can strike at any time, fail to see the reference to the string in the xmalloc'd buffer, move the string or its data or both, and usually overwrites the old string right away (we could poison the old memory in this case, but doing so will destroy the old data which might have provided the only hint as to what was moved and where). My plan is to test this now, and if it does happen, unconditionally allocate a second SAFE_ALLOCA_LISP area, which will protect the arguments; my first attempt put a pointer to this area in the info struct, but I'm pretty sure that's unnecessary: we simply have to check the right index in both areas. If it doesn't happen with alloc.c, we should probably still do that; it's not obviously less efficient and the slight impact on readability will be outweighed by readers having to wonder about the redundantly-synched arg variable. Once the massive stack allocation has been fixed (in the simplest case, we simply accept that we exceed MAX_ALLOCA and remove the subtraction from sa_avail which accounts for the extra stack usage), performance should return to normal, and while we will have fixed the bug we can sleep soundly knowing that the fix only applies to rare cases anyway (in the simplest case, one would have to provide a 818-character format string to even run out of alloca space with the default settings). I'd like to reiterate my objection to the fact that SAFE_ALLOCA doesn't unconditionally scan all memory it has allocated for all GC implementations. That change in semantics would make both branches safer and we could take our time investigating potential bugs such as this one: until we decide the extra protection is no longer needed, bugs such as this one couldn't happen. Gerd did change SAFE_NALLOCA to scan its buffer on feature/igc, but not SAFE_ALLOCA. Gerd, can we simply: #define SAFE_ALLOCA(size) ({ void *buf; SAFE_NALLOCA(buf, size, 1); buf; }) for the time being? It might reduce the rush to fix this bug. >> On the feature/igc branch, where protection is definitely required, I'm >> thinking about a fix. A quick fix would be to replace all elements of >> struct info by Lisp_Object values and use SAFE_ALLOCA_LISP. > > Are automatic variables not protected on the igc branch as they are > with the old GC? Apart from the SDATA difference (in which case igc protects against GC but the old GC requires a no-GC assumption while the pointer is in use), the protection should be equivalent. When this failed to happen with -fomit-frame-pointer builds, we saw difficult bugs. The problem is that conservative stack marking isn't guaranteed to work by any C standard or ABI that I'm aware of. It used to be the case that an automatic variable would reside in one place so that was good enough; now, it's common for the only live reference to be an internal pointer. If we give the compiler extra space to play with, it might decide to use it, and that's what ATTRIBUTE_MALLOC does. > Can we use AUTO_STRING to solve the problem? No. It's strictly optional and cannot solve a problem that affects all builds. Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Wed, 22 Jan 2025 18:19:02 GMT) Full text and rfc822 format available.Message #14 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: 75754 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Wed, 22 Jan 2025 18:18:48 +0000
(setq print-unreadable-function (lambda (&rest args) (garbage-collect) (make-string 100 256 t))) (message "%S" (format "%S %S" [1] (symbol-function '+))) fails to print the initial "[1]" here. Changing 256 to ?A and t to nil produces "[1] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" GDB session confirms the bug is as I've described. Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Wed, 22 Jan 2025 19:40:01 GMT) Full text and rfc822 format available.Message #17 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: 75754 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Wed, 22 Jan 2025 19:39:35 +0000
Pip Cet <pipcet <at> protonmail.com> writes: > (setq print-unreadable-function > (lambda (&rest args) > (garbage-collect) > (make-string 100 256 t))) > > (message "%S" (format "%S %S" [1] (symbol-function '+))) > > fails to print the initial "[1]" here. Changing 256 to ?A and t to nil > produces > > "[1] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" > > GDB session confirms the bug is as I've described. Patch: From caa546d3df5e6a23d36d6d60834da655bf407ba4 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet <at> protonmail.com> Subject: [PATCH] Fix GC bug causing incorrect 'format' output (Bug#75754) This fixes the correctness bug discovered in bug#75754, but not the performance issue or excessive stack usage. * src/editfns.c (styled_format): Allocate Lisp_Object array using SAFE_ALLOCA_LISP, not SAFE_ALLOCA. --- src/editfns.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/src/editfns.c b/src/editfns.c index 8a5fb673fe7..fa138061105 100644 --- a/src/editfns.c +++ b/src/editfns.c @@ -3431,10 +3431,6 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) /* Information recorded for each format spec. */ struct info { - /* The corresponding argument, converted to string if conversion - was needed. */ - Lisp_Object argument; - /* The start and end bytepos in the output string. */ ptrdiff_t start, end; @@ -3461,6 +3457,10 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) || SIZE_MAX < alloca_size) memory_full (SIZE_MAX); info = SAFE_ALLOCA (alloca_size); + /* Ose argument belonging to each spec, but needs to be allocated + separately so GC doesn't free the strings (bug#75754). */ + Lisp_Object *spec_arguments; + SAFE_ALLOCA_LISP (spec_arguments, nspec_bound); /* discarded[I] is 1 if byte I of the format string was not copied into the output. It is 2 if byte I was not the first byte of its character. */ @@ -3610,14 +3610,15 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) if (! (n < nargs)) error ("Not enough arguments for format string"); - struct info *spec = &info[ispec++]; + ptrdiff_t spec_index = ispec++; + struct info *spec = &info[spec_index]; if (nspec < ispec) { - spec->argument = args[n]; + spec_arguments[spec_index] = args[n]; spec->intervals = false; nspec = ispec; } - Lisp_Object arg = spec->argument; + Lisp_Object arg = spec_arguments[spec_index]; /* For 'S', prin1 the argument, and then treat like 's'. For 's', princ any argument that is not a string or @@ -3630,7 +3631,7 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) if (EQ (arg, args[n])) { Lisp_Object noescape = conversion == 'S' ? Qnil : Qt; - spec->argument = arg = Fprin1_to_string (arg, noescape, Qnil); + spec_arguments[spec_index] = arg = Fprin1_to_string (arg, noescape, Qnil); if (STRING_MULTIBYTE (arg) && ! multibyte) { multibyte = true; @@ -3648,7 +3649,7 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) multibyte = true; goto retry; } - spec->argument = arg = Fchar_to_string (arg); + spec_arguments[spec_index] = arg = Fchar_to_string (arg); } if (!EQ (arg, args[n])) @@ -3658,7 +3659,7 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) if (SYMBOLP (arg)) { - spec->argument = arg = SYMBOL_NAME (arg); + spec_arguments[spec_index] = arg = SYMBOL_NAME (arg); if (STRING_MULTIBYTE (arg) && ! multibyte) { multibyte = true; @@ -4303,9 +4304,9 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) for (ptrdiff_t i = 0; i < nspec; i++) if (info[i].intervals) { - len = make_fixnum (SCHARS (info[i].argument)); + len = make_fixnum (SCHARS (spec_arguments[i])); Lisp_Object new_len = make_fixnum (info[i].end - info[i].start); - props = text_property_list (info[i].argument, + props = text_property_list (spec_arguments[i], make_fixnum (0), len, Qnil); props = extend_property_ranges (props, len, new_len); /* If successive arguments have properties, be sure that -- 2.47.1
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Thu, 23 Jan 2025 05:48:01 GMT) Full text and rfc822 format available.Message #20 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com>, eggert <at> cs.ucla.edu Cc: 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Thu, 23 Jan 2025 07:47:42 +0200
> Date: Wed, 22 Jan 2025 19:39:35 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: 75754 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu > > Pip Cet <pipcet <at> protonmail.com> writes: > > > (setq print-unreadable-function > > (lambda (&rest args) > > (garbage-collect) > > (make-string 100 256 t))) > > > > (message "%S" (format "%S %S" [1] (symbol-function '+))) > > > > fails to print the initial "[1]" here. Changing 256 to ?A and t to nil > > produces > > > > "[1] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" > > > > GDB session confirms the bug is as I've described. > > Patch: Thanks. Paul, any comments on the issue and/or the patch?
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Thu, 23 Jan 2025 17:25:01 GMT) Full text and rfc822 format available.Message #23 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Paul Eggert <eggert <at> cs.ucla.edu> To: Eli Zaretskii <eliz <at> gnu.org>, Pip Cet <pipcet <at> protonmail.com> Cc: 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Thu, 23 Jan 2025 09:24:12 -0800
On 2025-01-22 21:47, Eli Zaretskii wrote: > Paul, any comments on the issue and/or the patch? Patch looks OK except for the "Ose" in a comment.
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Thu, 23 Jan 2025 17:51:02 GMT) Full text and rfc822 format available.Message #26 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Paul Eggert <eggert <at> cs.ucla.edu> Cc: pipcet <at> protonmail.com, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Thu, 23 Jan 2025 19:49:35 +0200
> Date: Thu, 23 Jan 2025 09:24:12 -0800 > Cc: 75754 <at> debbugs.gnu.org > From: Paul Eggert <eggert <at> cs.ucla.edu> > > On 2025-01-22 21:47, Eli Zaretskii wrote: > > Paul, any comments on the issue and/or the patch? > > Patch looks OK except for the "Ose" in a comment. Thanks. I think it would be good to have a test for this, if that is feasible.
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Thu, 23 Jan 2025 19:33:01 GMT) Full text and rfc822 format available.Message #29 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Thu, 23 Jan 2025 19:32:44 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Thu, 23 Jan 2025 09:24:12 -0800 >> Cc: 75754 <at> debbugs.gnu.org >> From: Paul Eggert <eggert <at> cs.ucla.edu> >> >> On 2025-01-22 21:47, Eli Zaretskii wrote: >> > Paul, any comments on the issue and/or the patch? >> >> Patch looks OK except for the "Ose" in a comment. > > Thanks. > > I think it would be good to have a test for this, if that is feasible. While I don't think a test would be a bad thing per se (feel free to use the code I posted, of course), I don't think it's the most sensible use of resources here: the precise situation is unlikely to occur again unless the patch is reverted, and similar bugs wouldn't be caught by this test. The most worrying aspect of this wasn't that there was a GC bug and we used a free'd string, it was that we didn't crash at that point, but silently treated the string as empty, but that's for bug#75768. And while I do think we should revisit the safety of SAFE_ALLOCA (scanning the extra memory as though it were on the stack seems easy, but I haven't tried) and SDATA (pinning the sblocks in memory is doable, I've got a patch, but there are subtle performance issue), those also could benefit from broader discussion. At the very least, a separate bug report. (In both of these cases, the feature/igc branch currently chooses the safer approach; in the case of SDATA, it must. In the case of SAFE_ALLOCA, we could introduce an unsafe version for some situations in which we are absolutely convinced no Lisp data can ever live in the xmalloc'd block, but so far it's not been required for performance. Many places use SDATA in situations where they expect no GC can happen, and there are no comments or macros indicating this to us, so deciding whether this is one of the cases is impossible. I mention this because this is likely to be the last such bug we find because it was buggy on both branches but caused problems only on feature/igc.) So, assuming no one volunteers to write a specific test, let's close this bug. Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Thu, 23 Jan 2025 20:34:03 GMT) Full text and rfc822 format available.Message #32 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Thu, 23 Jan 2025 22:32:31 +0200
> Date: Thu, 23 Jan 2025 19:32:44 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 75754 <at> debbugs.gnu.org > > So, assuming no one volunteers to write a specific test, let's close > this bug. I wrote a test: diff --git a/test/src/editfns-tests.el b/test/src/editfns-tests.el index 9fff425..6da3c73 100644 --- a/test/src/editfns-tests.el +++ b/test/src/editfns-tests.el @@ -534,4 +534,14 @@ editfns-tests--before/after-change-functions 'utf-8 nil (current-buffer)) (should (null (sanity-check-change-functions-errors)))))) +(ert-deftest editfns-tests-styled-print () + (let* ((print-unreadable-function + (lambda (&rest args) + (garbage-collect) + (make-string 100 ?Ā t))) + (str "\"[1] ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ\"")) + (should (string= + (format "%S" (format "%S %S" [1] (symbol-function '+))) str)))) + + ;;; editfns-tests.el ends here It fails (after rebuilding with commit ed5067e689a5): Test editfns-tests-styled-print backtrace: signal(ert-test-failed (((should (string= (format "%S" (format "%S % ert-fail(((should (string= (format "%S" (format "%S %S" [1] (symbol- (if (unwind-protect (setq value-602 (apply fn-600 args-601)) (setq f (let (form-description-604) (if (unwind-protect (setq value-602 (app (let ((value-602 'ert-form-evaluation-aborted-603)) (let (form-descr (let* ((fn-600 #'string=) (args-601 (condition-case err (list (forma (let* ((print-unreadable-function #'(lambda (&rest args) (garbage-co #f(lambda () [t] (let* ((print-unreadable-function #'...) (str "\"[1 #f(compiled-function () #<bytecode 0x118aa52bef4d292f>)() handler-bind-1(#f(compiled-function () #<bytecode 0x118aa52bef4d292f ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test ert-run-test(#s(ert-test :name editfns-tests-styled-print :documenta ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m ert-run-tests((not (or (tag :unstable) (tag :nativecomp))) #f(compil ert-run-tests-batch((not (or (tag :unstable) (tag :nativecomp)))) ert-run-tests-batch-and-exit((not (or (tag :unstable) (tag :nativeco eval((ert-run-tests-batch-and-exit '(not (or (tag :unstable) (tag :n command-line-1(("-L" ";." "-l" "ert" "--eval" "(setq treesit-extra-l command-line() normal-top-level() Test editfns-tests-styled-print condition: (ert-test-failed ((should (string= (format "%S" ...) str)) :form (string= "\"ion-a\"" "\"[1] ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ\"") :value nil :explanation (arrays-of-different-length 7 106 "\"ion-a\"" "\"[1] ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ\"" first-mismatch-at 1))) FAILED 4/24 editfns-tests-styled-print (0.041935 sec) at src/editfns-tests.el:537 What did I miss?
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Thu, 23 Jan 2025 22:38:01 GMT) Full text and rfc822 format available.Message #35 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Thu, 23 Jan 2025 22:37:43 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: > What did I miss? Can we make this a new bug? This one is SDATA, not SAFE_ALLOCA. diff --git a/src/editfns.c b/src/editfns.c index 4ba356d627c..23a5f9aeac6 100644 --- a/src/editfns.c +++ b/src/editfns.c @@ -3491,7 +3491,7 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) /* If we start out planning a unibyte result, then discover it has to be multibyte, we jump back to retry. */ retry: - + format_start = SSDATA (args[0]); p = buf; nchars = 0; should fix it. Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Thu, 23 Jan 2025 23:59:02 GMT) Full text and rfc822 format available.Message #38 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Thu, 23 Jan 2025 23:58:41 +0000
Pip Cet <pipcet <at> protonmail.com> writes: > "Eli Zaretskii" <eliz <at> gnu.org> writes: > >> What did I miss? > > Can we make this a new bug? Whether this is the same bug or a different one depends on how you look at it: both can be interpreted as a violated no-GC assumption, or the first one can be interpreted as a result of the excessive stack usage while the second one is due to SDATA relocation. There are three ways to fix the SDATA relocation issue: 1. Reload format_start and format (and end, and format0) after every call which might have GC'd. If you think we should do this, please tell me whether lisp_string_width can GC. More importantly, assuming it doesn't, document this in every function in the call tree starting at lisp_string_width so we don't accidentally change it. 2. memcpy the format string. Two-liner, more likely to fix the bug for good than (1), wastes more memory (since sa_avail has been negative since we entered the function, this is xmalloc'd memory). 3. replace format by a ptrdiff_t and all instances of *format by SREF (args[0], index). Faster than 2, but many changes hurting readability. > diff --git a/src/editfns.c b/src/editfns.c > index 4ba356d627c..23a5f9aeac6 100644 > --- a/src/editfns.c > +++ b/src/editfns.c > @@ -3491,7 +3491,7 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) > /* If we start out planning a unibyte result, > then discover it has to be multibyte, we jump back to retry. */ > retry: > - > + format_start = SSDATA (args[0]); > p = buf; > nchars = 0; > > > should fix it. Just be clear here: that fixes the specific test. It does not fix the bug. Here's (2), which seems most doable and, I think, may fix the bug, unless there's a subtlety I missed. diff --git a/src/editfns.c b/src/editfns.c index 4ba356d627c..72cbbb51c40 100644 --- a/src/editfns.c +++ b/src/editfns.c @@ -3442,9 +3442,10 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) } *info; CHECK_STRING (args[0]); - char *format_start = SSDATA (args[0]); bool multibyte_format = STRING_MULTIBYTE (args[0]); ptrdiff_t formatlen = SBYTES (args[0]); + char *format_start = SAFE_ALLOCA (formatlen); + memcpy (format_start, SSDATA (args[0]), formatlen); bool fmt_props = !!string_intervals (args[0]); /* Upper bound on number of format specs. Each uses at least 2 chars. */ Note that (3) technically introduces another bug: we call out to Lisp, which can modify args[0], causing it to be resized if it's multibyte. We'd use the old bytecount, and might read past the end of the string. feature/igc avoids all of these problems by keeping SDATA pointers valid, even when the string is resized. Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 07:43:02 GMT) Full text and rfc822 format available.Message #41 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 09:41:47 +0200
> Date: Thu, 23 Jan 2025 22:37:43 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org > > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > > What did I miss? > > Can we make this a new bug? We could, or we could keep discussing that in this bug (since this is still about styled_format). > This one is SDATA, not SAFE_ALLOCA. > > diff --git a/src/editfns.c b/src/editfns.c > index 4ba356d627c..23a5f9aeac6 100644 > --- a/src/editfns.c > +++ b/src/editfns.c > @@ -3491,7 +3491,7 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) > /* If we start out planning a unibyte result, > then discover it has to be multibyte, we jump back to retry. */ > retry: > - > + format_start = SSDATA (args[0]); > p = buf; > nchars = 0; > > > should fix it. Didn't try it yet, since we are still discussing what to do. But I would like to install the test, for now as expected to fail. WDYT?
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 08:17:02 GMT) Full text and rfc822 format available.Message #44 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 10:16:28 +0200
> Date: Thu, 23 Jan 2025 23:58:41 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org > > 1. Reload format_start and format (and end, and format0) after every > call which might have GC'd. If you think we should do this, please > tell me whether lisp_string_width can GC. It can, if called with the last argument 'true' (because find_automatic_composition calls into Lisp). Currently, we call it with 'false', so it cannot. > More importantly, assuming it doesn't, document this in every > function in the call tree starting at lisp_string_width so we don't > accidentally change it. > > 2. memcpy the format string. Two-liner, more likely to fix the bug for > good than (1), wastes more memory (since sa_avail has been negative > since we entered the function, this is xmalloc'd memory). > > 3. replace format by a ptrdiff_t and all instances of *format by > SREF (args[0], index). Faster than 2, but many changes hurting > readability. I prefer (2), I think. Assuming it indeed fixes the problem. > > + format_start = SSDATA (args[0]); > > p = buf; > > nchars = 0; > > > > > > should fix it. > > Just be clear here: that fixes the specific test. It does not fix the > bug. Can we have a test for "the rest" of the bug, which is not fixed by the above? > Here's (2), which seems most doable and, I think, may fix the bug, > unless there's a subtlety I missed. > > diff --git a/src/editfns.c b/src/editfns.c > index 4ba356d627c..72cbbb51c40 100644 > --- a/src/editfns.c > +++ b/src/editfns.c > @@ -3442,9 +3442,10 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) > } *info; > > CHECK_STRING (args[0]); > - char *format_start = SSDATA (args[0]); > bool multibyte_format = STRING_MULTIBYTE (args[0]); > ptrdiff_t formatlen = SBYTES (args[0]); > + char *format_start = SAFE_ALLOCA (formatlen); > + memcpy (format_start, SSDATA (args[0]), formatlen); > bool fmt_props = !!string_intervals (args[0]); Does this assume anything about formatlen, like that it doesn't exceed the alloca limit? If so, should we have an assertion about that? > feature/igc avoids all of these problems by keeping SDATA pointers > valid, even when the string is resized. Unless the string is GC'ed, right?
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 12:52:02 GMT) Full text and rfc822 format available.Message #47 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, Michael Albinus <michael.albinus <at> gmx.de>, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 12:51:22 +0000
Michael Albinus, could you please C-s ert-how in this message and comment briefly if appropriate? I think adding a new defstruct may help improve ert in at least three ways, including this one. Please indicate whether this would be *potentially* acceptable; if it is, I'll prepare a separate bug. If it isn't, I might have to think of something else. "Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Thu, 23 Jan 2025 23:58:41 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org >> >> 1. Reload format_start and format (and end, and format0) after every >> call which might have GC'd. If you think we should do this, please >> tell me whether lisp_string_width can GC. > > It can, if called with the last argument 'true' (because > find_automatic_composition calls into Lisp). Currently, we call it > with 'false', so it cannot. Is this documented anywhere? >> More importantly, assuming it doesn't, document this in every >> function in the call tree starting at lisp_string_width so we don't >> accidentally change it. >> >> 2. memcpy the format string. Two-liner, more likely to fix the bug for >> good than (1), wastes more memory (since sa_avail has been negative >> since we entered the function, this is xmalloc'd memory). >> >> 3. replace format by a ptrdiff_t and all instances of *format by >> SREF (args[0], index). Faster than 2, but many changes hurting >> readability. > > I prefer (2), I think. Assuming it indeed fixes the problem. I'll let you know when I'm (reasonably) certain. > >> > + format_start = SSDATA (args[0]); >> > p = buf; >> > nchars = 0; >> > >> > >> > should fix it. >> >> Just be clear here: that fixes the specific test. It does not fix the >> bug. > > Can we have a test for "the rest" of the bug, which is not fixed by > the above? Thank you very much for suggesting this: if I hadn't been wrong about the first test case being useful (it was), I'd probably have refused. While writing this test didn't uncover new bugs, it was an educational experience. New test: (ert-deftest editfns-tests-styled-print () (let* ((gc-me (copy-sequence "foo")) (print-unreadable-function (lambda (&rest args) (make-string 10 10 t) (garbage-collect) (make-string 10 10 t) (prog1 (make-string 100 ?Ā t) (setq gc-me nil) (make-string 10 10 t)))) (str "\"[1] ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ\"")) (should (string= (format "%S" (format (copy-sequence "%S %S %S %S") [1] (symbol-function '+) (symbol-function '*) (symbol-function '-))) str)))) The "problem" writing this test is that string compaction is very often idempotent: the only way a string is moved again in a second GC is when another string which is *older* than the string moved in the first GC is freed. That's 'gc-me', which must be live during the first call to garbage-collect but unreachable for the second call. While I figured out how to do this for this specific test, it'd be nicer to just run make check-gc and stress-test the garbage collector in ways known to have caused problems. (Brief proposal relevant to several issues: If we want to test GC bugs more thoroughly, we need a way to tell ert to run them in a special way. I proposed extending ert with an ert-how defstruct for the make benchmark target, indicating "how" a test is to be run, and implemented it, but I don't think I've sent the patch yet. Using this structure to create conditions for testing for GC bugs would be possible, and it would also be an easier way to make crash tests print a message before they're run. Of course, this would in no way imply a decision to use this structure, or ERT, for "make benchmark". Using it for GC testing or indicating crash tests would be sufficient reason to add it, IMHO. While this needs more thought, if it's obviously unacceptable for a simple reason I haven't thought of, please let me know so I won't waste time on it. Of course the final decision will have to be made based on an actual patch, but if we can have a quick "no" right now I'd appreciate it.) >> Here's (2), which seems most doable and, I think, may fix the bug, >> unless there's a subtlety I missed. >> >> diff --git a/src/editfns.c b/src/editfns.c >> index 4ba356d627c..72cbbb51c40 100644 >> --- a/src/editfns.c >> +++ b/src/editfns.c >> @@ -3442,9 +3442,10 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) >> } *info; >> >> CHECK_STRING (args[0]); >> - char *format_start = SSDATA (args[0]); >> bool multibyte_format = STRING_MULTIBYTE (args[0]); >> ptrdiff_t formatlen = SBYTES (args[0]); >> + char *format_start = SAFE_ALLOCA (formatlen); >> + memcpy (format_start, SSDATA (args[0]), formatlen); >> bool fmt_props = !!string_intervals (args[0]); > > Does this assume anything about formatlen, like that it doesn't exceed > the alloca limit? If so, should we have an assertion about that? No. We're already past the alloca limit at his point so SAFE_ALLOCA will always call xmalloc here (that's the "excessive stack usage" part of this bug, which remains unfixed but might be a performance issue in rare situations involving very many calls to styled_format which don't GC). Calling SAFE_ALLOCA, while unsafe for GC (I'd suggest renaming the macro, but not right now), is safe wrt large allocations. >> feature/igc avoids all of these problems by keeping SDATA pointers >> valid, even when the string is resized. > > Unless the string is GC'ed, right? Even if it is GC'd. The SDATA will stay around while there are live pointers to it. The SDATA doesn't keep the string alive, but there's no way to get a reference to the now-dead Lisp_String structure from the SDATA, so this won't cause bugs. Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 13:22:02 GMT) Full text and rfc822 format available.Message #50 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org, Stefan Kangas <stefankangas <at> gmail.com> Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 13:21:24 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Thu, 23 Jan 2025 22:37:43 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org >> >> "Eli Zaretskii" <eliz <at> gnu.org> writes: >> >> > What did I miss? >> >> Can we make this a new bug? > > We could, or we could keep discussing that in this bug (since this is > still about styled_format). Let's do the latter, too much discussion happened here already and the subject does kind of cover this bug, too (as well as the excessive stack usage which remains unfixed; I cannot easily fix it without a lot of self-study or Paul Eggert's help). >> This one is SDATA, not SAFE_ALLOCA. >> >> diff --git a/src/editfns.c b/src/editfns.c >> index 4ba356d627c..23a5f9aeac6 100644 >> --- a/src/editfns.c >> +++ b/src/editfns.c >> @@ -3491,7 +3491,7 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message) >> /* If we start out planning a unibyte result, >> then discover it has to be multibyte, we jump back to retry. */ >> retry: >> - >> + format_start = SSDATA (args[0]); >> p = buf; >> nchars = 0; >> >> >> should fix it. > > Didn't try it yet, since we are still discussing what to do. But I > would like to install the test, for now as expected to fail. WDYT? Please don't. The test without the bugfix has absolutely undefined behavior and will most likely crash Emacs in some circumstances, and might cause other tests to produce incorrect results as well. Let's wait with adding such tests until we have ERT infrastructure to deal with them as gracefully as we can (crash tests need to be run in timeout, with a ulimit, so this involves Makefile work, too). (Stefan Kangas: I do recall you asked me to sketch how ERT changes could improve the output of "make check" in cases where Emacs crashed, in bug#75648. As Michael rejected the initial approach I had taken, I decided to delay this until I could make a new proposal that hadn't been rejected already. This bug now looks like it may result in one). Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 13:58:02 GMT) Full text and rfc822 format available.Message #53 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 15:57:45 +0200
> Date: Fri, 24 Jan 2025 12:51:22 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org, Michael Albinus <michael.albinus <at> gmx.de> > > Michael Albinus, could you please C-s ert-how in this message and > comment briefly if appropriate? I think adding a new defstruct may help > improve ert in at least three ways, including this one. Please indicate > whether this would be *potentially* acceptable; if it is, I'll prepare a > separate bug. If it isn't, I might have to think of something else. I think it will take time for Michael to respond, due to his personal circumstances. So if you expect the answers soon, that might not happen. > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > >> Date: Thu, 23 Jan 2025 23:58:41 +0000 > >> From: Pip Cet <pipcet <at> protonmail.com> > >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org > >> > >> 1. Reload format_start and format (and end, and format0) after every > >> call which might have GC'd. If you think we should do this, please > >> tell me whether lisp_string_width can GC. > > > > It can, if called with the last argument 'true' (because > > find_automatic_composition calls into Lisp). Currently, we call it > > with 'false', so it cannot. > > Is this documented anywhere? If you mean under what circumstances some internal function calls Lisp (or can otherwise GC), then no, we don't document that anywhere. I just looked at the source code to answer your question. > Thank you very much for suggesting this: if I hadn't been wrong about > the first test case being useful (it was), I'd probably have refused. > While writing this test didn't uncover new bugs, it was an educational > experience. > > New test: > > (ert-deftest editfns-tests-styled-print () > (let* ((gc-me (copy-sequence "foo")) > (print-unreadable-function > (lambda (&rest args) > (make-string 10 10 t) > (garbage-collect) > (make-string 10 10 t) > (prog1 (make-string 100 ?Ā t) > (setq gc-me nil) > (make-string 10 10 t)))) > (str "\"[1] ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ\"")) > (should (string= > (format "%S" (format (copy-sequence "%S %S %S %S") [1] > (symbol-function '+) > (symbol-function '*) > (symbol-function '-))) > str)))) > > The "problem" writing this test is that string compaction is very often > idempotent: the only way a string is moved again in a second GC is when > another string which is *older* than the string moved in the first GC is > freed. That's 'gc-me', which must be live during the first call to > garbage-collect but unreachable for the second call. I suggest to have at least part of the above in comments to the test, since these details are so important for the test to be effective. > While I figured out how to do this for this specific test, it'd be nicer > to just run make check-gc and stress-test the garbage collector in ways > known to have caused problems. Yes; patches welcome. > (Brief proposal relevant to several issues: If we want to test GC bugs > more thoroughly, we need a way to tell ert to run them in a special way. > I proposed extending ert with an ert-how defstruct for the make > benchmark target, indicating "how" a test is to be run, and implemented > it, but I don't think I've sent the patch yet. Using this structure to > create conditions for testing for GC bugs would be possible, and it > would also be an easier way to make crash tests print a message before > they're run. > > Of course, this would in no way imply a decision to use this structure, > or ERT, for "make benchmark". Using it for GC testing or indicating > crash tests would be sufficient reason to add it, IMHO. > > While this needs more thought, if it's obviously unacceptable for a > simple reason I haven't thought of, please let me know so I won't waste > time on it. Of course the final decision will have to be made based on > an actual patch, but if we can have a quick "no" right now I'd > appreciate it.) Well, for now I don't have even a vague idea what you have in mind, so it is hard for me to answer the question. How do you extend ERT to be able to run tests in a way that makes testing GC-related bugs easier? What are the general ideas for that? > >> + char *format_start = SAFE_ALLOCA (formatlen); > >> + memcpy (format_start, SSDATA (args[0]), formatlen); > >> bool fmt_props = !!string_intervals (args[0]); > > > > Does this assume anything about formatlen, like that it doesn't exceed > > the alloca limit? If so, should we have an assertion about that? > > No. We're already past the alloca limit at his point so SAFE_ALLOCA > will always call xmalloc here (that's the "excessive stack usage" part > of this bug, which remains unfixed but might be a performance issue in > rare situations involving very many calls to styled_format which don't > GC). IMO, that fact should also be in comments to the code. > >> feature/igc avoids all of these problems by keeping SDATA pointers > >> valid, even when the string is resized. > > > > Unless the string is GC'ed, right? > > Even if it is GC'd. > > The SDATA will stay around while there are live pointers to it. The > SDATA doesn't keep the string alive, but there's no way to get a > reference to the now-dead Lisp_String structure from the SDATA, so this > won't cause bugs. Even if the code modifies the string after GC?
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 14:10:01 GMT) Full text and rfc822 format available.Message #56 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org, stefankangas <at> gmail.com Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 16:09:05 +0200
> Date: Fri, 24 Jan 2025 13:21:24 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org, Stefan Kangas <stefankangas <at> gmail.com> > > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > >> Can we make this a new bug? > > > > We could, or we could keep discussing that in this bug (since this is > > still about styled_format). > > Let's do the latter Fine by me. However, ... > (Stefan Kangas: I do recall you asked me to sketch how ERT changes could > improve the output of "make check" in cases where Emacs crashed, in > bug#75648. As Michael rejected the initial approach I had taken, I > decided to delay this until I could make a new proposal that hadn't been > rejected already. This bug now looks like it may result in one). ... _this_ part definitely belongs to a separate discussion, probably on emacs-devel and not here.
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 15:03:01 GMT) Full text and rfc822 format available.Message #59 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 15:02:20 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Fri, 24 Jan 2025 12:51:22 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org, Michael Albinus <michael.albinus <at> gmx.de> >> >> Michael Albinus, could you please C-s ert-how in this message and >> comment briefly if appropriate? I think adding a new defstruct may help >> improve ert in at least three ways, including this one. Please indicate >> whether this would be *potentially* acceptable; if it is, I'll prepare a >> separate bug. If it isn't, I might have to think of something else. > > I think it will take time for Michael to respond, due to his personal > circumstances. So if you expect the answers soon, that might not > happen. Will file a new bug, with an initial message hopefully clear enough to save Michael some time for when he gets around to it. >> "Eli Zaretskii" <eliz <at> gnu.org> writes: >> >> >> Date: Thu, 23 Jan 2025 23:58:41 +0000 >> >> From: Pip Cet <pipcet <at> protonmail.com> >> >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org >> >> >> >> 1. Reload format_start and format (and end, and format0) after every >> >> call which might have GC'd. If you think we should do this, please >> >> tell me whether lisp_string_width can GC. >> > >> > It can, if called with the last argument 'true' (because >> > find_automatic_composition calls into Lisp). Currently, we call it >> > with 'false', so it cannot. >> >> Is this documented anywhere? > > If you mean under what circumstances some internal function calls Lisp > (or can otherwise GC), then no, we don't document that anywhere. I > just looked at the source code to answer your question. (This is an honest question: do you believe Emacs would be worse off if we added an ASSUME_NO_GC macro like ASSUME_NO_GC (true); p = SDATA (str); 532 lines of code here use p for the last time ASSUME_NO_GC (false); This seems to me to be readable (better names/API welcome, of course), even if it's a nop. Faster than writing down these assumptions in English, and it does offer the ability to eventually find such cases automatically. That would have found this bug if Fprin1 contained /* Due to print-unreadable-function, all calls to prin1 may call Lisp, which may GC. Ensure we're not accidentally called from a critical section. */ #ifdef ENABLE_CHECK_NO_GC if (assume_no_gc >= 0) emacs_abort (); #endif It'd make life easier for relative Emacs newcomers like myself who do not know by heart which functions call GC and which don't.) >> Thank you very much for suggesting this: if I hadn't been wrong about >> the first test case being useful (it was), I'd probably have refused. >> While writing this test didn't uncover new bugs, it was an educational >> experience. >> >> New test: >> >> (ert-deftest editfns-tests-styled-print () >> (let* ((gc-me (copy-sequence "foo")) >> (print-unreadable-function >> (lambda (&rest args) >> (make-string 10 10 t) >> (garbage-collect) >> (make-string 10 10 t) >> (prog1 (make-string 100 ?Ā t) >> (setq gc-me nil) >> (make-string 10 10 t)))) >> (str "\"[1] ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ\"")) >> (should (string= >> (format "%S" (format (copy-sequence "%S %S %S %S") [1] >> (symbol-function '+) >> (symbol-function '*) >> (symbol-function '-))) >> str)))) >> >> The "problem" writing this test is that string compaction is very often >> idempotent: the only way a string is moved again in a second GC is when >> another string which is *older* than the string moved in the first GC is >> freed. That's 'gc-me', which must be live during the first call to >> garbage-collect but unreachable for the second call. > > I suggest to have at least part of the above in comments to the test, > since these details are so important for the test to be effective. Yes, definitely, if it is to be useful. Without comments this is so much line noise. > What are the general ideas for that? This doesn't belong here; I'll file a new bug even if I can't provide such details now (unless there already is one), as a wishlist item. >> >> feature/igc avoids all of these problems by keeping SDATA pointers >> >> valid, even when the string is resized. >> > >> > Unless the string is GC'ed, right? >> >> Even if it is GC'd. >> >> The SDATA will stay around while there are live pointers to it. The >> SDATA doesn't keep the string alive, but there's no way to get a >> reference to the now-dead Lisp_String structure from the SDATA, so this >> won't cause bugs. > > Even if the code modifies the string after GC? If the GC free'd the string, but not the SDATA structure, we cannot regenerate the string to call resize_string_data. So, no, this cannot happen. Note that resizing a live string while you have a live SDATA pointer to it might result in the SDATA pointer referring to the pre-modification SDATA, not the current one. Thanks for continuing to ask such questions! It's important we don't forget about one such question and cause crashes, particularly on feature/igc where there's a shortage of testers (which is not anyone's fault but mine; I'm working on making it more testable)! Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 15:30:02 GMT) Full text and rfc822 format available.Message #62 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 17:28:53 +0200
> Date: Fri, 24 Jan 2025 15:02:20 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org, michael.albinus <at> gmx.de > > > If you mean under what circumstances some internal function calls Lisp > > (or can otherwise GC), then no, we don't document that anywhere. I > > just looked at the source code to answer your question. > > (This is an honest question: do you believe Emacs would be worse off if > we added an ASSUME_NO_GC macro like > > ASSUME_NO_GC (true); > p = SDATA (str); > > 532 lines of code here > > use p for the last time > > ASSUME_NO_GC (false); No, I don't think Emacs will be worse. But I hope we will only have to do this in not too-many places. Although I'm unsure how to find them and how to discern between them and the rest. We'd need some rules where to place these. > >> >> feature/igc avoids all of these problems by keeping SDATA pointers > >> >> valid, even when the string is resized. > >> > > >> > Unless the string is GC'ed, right? > >> > >> Even if it is GC'd. > >> > >> The SDATA will stay around while there are live pointers to it. The > >> SDATA doesn't keep the string alive, but there's no way to get a > >> reference to the now-dead Lisp_String structure from the SDATA, so this > >> won't cause bugs. > > > > Even if the code modifies the string after GC? > > If the GC free'd the string, but not the SDATA structure, we cannot > regenerate the string to call resize_string_data. So, no, this cannot > happen. > > Note that resizing a live string while you have a live SDATA pointer to > it might result in the SDATA pointer referring to the pre-modification > SDATA, not the current one. Who said anything about resizing? I thought about changing a string with SSET, for example, or Faset. IOW, the scenario is: . we take a pointer to a string's SDATA . GC happens . we use SSET or Faset to modify the string . we use the above pointer With the old GC, this is a no-no, but you seem to be saying that on the igc branch it's okay to do that?
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Fri, 24 Jan 2025 23:41:01 GMT) Full text and rfc822 format available.Message #65 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Stefan Kangas <stefankangas <at> gmail.com> To: Eli Zaretskii <eliz <at> gnu.org>, Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 17:40:46 -0600
Eli Zaretskii <eliz <at> gnu.org> writes: >> (Stefan Kangas: I do recall you asked me to sketch how ERT changes could >> improve the output of "make check" in cases where Emacs crashed, in >> bug#75648. As Michael rejected the initial approach I had taken, I >> decided to delay this until I could make a new proposal that hadn't been >> rejected already. This bug now looks like it may result in one). > > ... _this_ part definitely belongs to a separate discussion, probably > on emacs-devel and not here. Moving this to emacs-devel sounds good to me. I think improving this aspect would be generally useful, so thanks for working on it.
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Sat, 25 Jan 2025 07:56:02 GMT) Full text and rfc822 format available.Message #68 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Sat, 25 Jan 2025 09:54:58 +0200
> Date: Fri, 24 Jan 2025 22:35:50 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > > >> I think we should eventually strive to call these macros in all but the > >> most obviously correct places where we use SDATA: > >> > >> memcpy (buf, SDATA (str), SBYTES (str)); > >> > >> is perfectly okay, but the current code in call_process is not, and we > >> can (IF we find bugs; most of my experiments don't, and that's okay) > >> decide where to draw the line between those. > > > > I won't have an opinion until I actually see the patches. > > Understood. > > >> My preference would be to store the "no-gc" nesting level in the specpdl > >> directly rather than allocating a new specpdl entry every time we > >> increase it. We definitely do unwind out of no-gc situations, in any > >> case, which is fine. > >> > >> (The alternative would be to call SDATA_FREE on sdata pointers when > >> we're done with them, but that seems excessive to me at this point, > >> particularly since we'd need to unwind that, too.) > > > > You lost me here. > > We could demand that every SDATA call be balanced by an SDATA_FREE call > or a nonlocal exit. > > At this point, I don't think we should, because there are other no-GC > assertions and we've seen where "technically necessary but usually nop" > macros got us with GCPRO. > > I think we'll have to ultimately have to find a compromise, for the old > GC, between reducing no-GC situations and asserting we're in one when we > need to be. If stack references pin SDATA and we scan SAFE_ALLOCA'd > memory conservatively, that would still leave SDATA pointers in > explicitly xmalloc'd memory, and temporarily invalid objects that would > crash the garbage collector if it were to look at them. Maybe more > situations. > > Note these proposals interact non-trivially. > > However, I think we should try hard to keep this as non-intrusive as > possible. ASSUME_NO_GC (); seems okay to me to mark the start of a > no-GC section, but ASSUME_NO_GC (true) balanced by ASSUME_NO_GC (false) > already feels like I'm doing the compiler's work for it, even if an > xsignal non-local exit implies the right numbers of ASSUME_NO_GC (false) > calls: the vast majority of critical sections begin and end in the same > function, and any early return means we've left the critical section. > > However, if we decided to make "return" imply an ASSUME_NO_GC reset, > that, while possible, would probably introduce system dependencies we > might not be able to accept easily. > > I don't know whether most functions that call SDATA have a critical > section which ends before the function returns. You seem to be discussing some details of the ASSUME_NO_GC macro implementation. I cannot follow this because I don't have a clear enough idea of what you have in mind, and you didn't yet describe the main ideas of the implementation. So let's defer this until you do. > >> >> > Even if the code modifies the string after GC? > >> >> > >> >> If the GC free'd the string, but not the SDATA structure, we cannot > >> >> regenerate the string to call resize_string_data. So, no, this cannot > >> >> happen. > >> >> > >> >> Note that resizing a live string while you have a live SDATA pointer to > >> >> it might result in the SDATA pointer referring to the pre-modification > >> >> SDATA, not the current one. > >> > > >> > Who said anything about resizing? > >> > >> Non-resizing Faset is boring, works as expected. > > > > But it will modify the string whose SDATA pointer is different, no? > > Different how? Different as in: "pointing to a location different from the data of the string that was moved by GC". > There's only one string in this example, and after the > string's SDATA changes, you can access the old SDATA, but it doesn't > belong to a string anymore. That's what I thought. So code that expects the old SDATA pointer to be pointing to the same string after GC will have a bug on the igc branch. It's a different bug than under the old GC, but still a bug. Right? > > Lisp_Object my_string; > > char *p = SDATA (my_string); > > /* trigger GC */ > > Faset (my_string, make_fixnum (0), make_fixnum (97)); > > /* code which expects p[0] to be 'a' */ > > If that is all the code, this is safe with feature/igc. You say "safe", but it sounds like we have different definitions of "safe". Because later you say: > MPS GC will never make pointers that live on the stack invalid. It's > transparent to correct client programs. So, yes, if you're asking about > MPS GC, the sequence is safe and doesn't crash, but may surprise you by > reproducing the old string data. Which seems to mean your "safe" means the code will not crash, whereas what I meant is the correctness of the code. Code which refers to the old string data is incorrect, even if it won't crash. So my conclusion is that even with MPS, referring in code after GC to an SDATA pointer taken before GC leads to incorrect code. IOW, pointers to SDATA of a string cannot be relied upon to be valid (i.e. pointing to the correct string) across GC, so the code must re-initialize such pointer after any fragment that could GC. This was the rule with the old GC, and it should continue to be the rule with MPS. > >> > With the old GC, this is a no-no, but you seem to be saying that on > >> > the igc branch it's okay to do that? > >> > >> Note that none of that is guaranteed. We might poison the old SDATA in > >> Faset, and possibly should. I don't think MPS officially allows you to > >> prematurely free an object, so that's probably not going to happen. > > > > Is that a yes? > > It's currently safe. "Safe" as in "won't crash" or "safe" as in "will point to the correct string data"? I think the former, whereas I was talking about the latter. > One scenario is we decide that no useful code > wants to do this, so we deliberately make it crash if we can. Another > scenario is that we decide this is so useful that we add an > SDATA->struct Lisp_String back pointer (alloc.c goes both ways, giving > you a back pointer but moving it around in memory so it'll be invalid > should you ever need it). As both of these scenarios require > significant work and a lot of thought, it's most likely the current > situation persists on feature/igc for a while, and no great loss either > way. That's not the issue that was bothering me.
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Sat, 25 Jan 2025 09:46:02 GMT) Full text and rfc822 format available.Message #71 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Sat, 25 Jan 2025 09:45:19 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Fri, 24 Jan 2025 22:35:50 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> >> >> > Even if the code modifies the string after GC? >> >> >> >> >> >> If the GC free'd the string, but not the SDATA structure, we cannot >> >> >> regenerate the string to call resize_string_data. So, no, this cannot >> >> >> happen. >> >> >> >> >> >> Note that resizing a live string while you have a live SDATA pointer to >> >> >> it might result in the SDATA pointer referring to the pre-modification >> >> >> SDATA, not the current one. >> >> > >> >> > Who said anything about resizing? >> >> >> >> Non-resizing Faset is boring, works as expected. >> > >> > But it will modify the string whose SDATA pointer is different, no? >> >> Different how? > > Different as in: "pointing to a location different from the data of > the string that was moved by GC". I'm sorry, this may sound like you as though I'm being pedantic, but I need to be sure that we're talking about the same thing here: The scenario you describe involves a single string object, but two sdata areas. The single string points to the first sdata area, but then is modified to point to the second. Modifying the first sdata area after the second one has been created will have no effect whatsoever on the string. >> There's only one string in this example, and after the >> string's SDATA changes, you can access the old SDATA, but it doesn't >> belong to a string anymore. > > That's what I thought. So code that expects the old SDATA pointer to > be pointing to the same string after GC will have a bug on the igc feature/igc: GC doesn't invalidate SDATA pointers. Resizing string modification does (the other way is to call pin_string). > branch. It's a different bug than under the old GC, but still a bug. > Right? If there's modification involved, it's a bug which will cause apparently random behavior but no crashes. If there's no modification, it's a bug only because the code might be used with alloc.c GC. >> > Lisp_Object my_string; >> > char *p = SDATA (my_string); >> > /* trigger GC */ >> > Faset (my_string, make_fixnum (0), make_fixnum (97)); >> > /* code which expects p[0] to be 'a' */ >> >> If that is all the code, this is safe with feature/igc. > > You say "safe", but it sounds like we have different definitions of > "safe". Because later you say: I meant "won't crash, but most likely useless". >> MPS GC will never make pointers that live on the stack invalid. It's >> transparent to correct client programs. So, yes, if you're asking about >> MPS GC, the sequence is safe and doesn't crash, but may surprise you by >> reproducing the old string data. > > Which seems to mean your "safe" means the code will not crash, whereas > what I meant is the correctness of the code. Code which refers to the > old string data is incorrect, even if it won't crash. I believe that's true. > So my conclusion is that even with MPS, referring in code after GC to > an SDATA pointer taken before GC leads to incorrect code. IOW, > pointers to SDATA of a string cannot be relied upon to be valid > (i.e. pointing to the correct string) across GC, so the code must > re-initialize such pointer after any fragment that could GC. This was > the rule with the old GC, and it should continue to be the rule with > MPS. No. GC doesn't invalidate SDATA pointers. Only string resizing does (but that's a technical detail: code should assume all string modification may resize the string, and always reload any SDATA pointers they might have). The only reason such code must be buggy is because it will fail to work with alloc.c. In some (rare) cases, we're fine with the undefined behavior: if we call out to Lisp code which unexpectedly modifies a string we're currently looking at, we shouldn't crash (this is hard for multibyte strings), but we shouldn't go out of our way to detect or handle this situation, either. (I posted a patch yesterday to fix mapconcat in this case, and it does precisely that: if we find ourselves in the middle of a multibyte sequence unexpectedly, that's an error(), not a crash. The patch does need an additional small change I'd overlooked). My proposal is to reload the SDATA pointer even in such cases, so it's more obvious that there is no bug in our code then. If we decide to establish that rule, we may also want to enforce it. Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Sat, 25 Jan 2025 13:25:02 GMT) Full text and rfc822 format available.Message #74 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Sat, 25 Jan 2025 15:23:56 +0200
> Date: Sat, 25 Jan 2025 09:45:19 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org > > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > >> Date: Fri, 24 Jan 2025 22:35:50 +0000 > >> From: Pip Cet <pipcet <at> protonmail.com> > > >> >> >> > Even if the code modifies the string after GC? > >> >> >> > >> >> >> If the GC free'd the string, but not the SDATA structure, we cannot > >> >> >> regenerate the string to call resize_string_data. So, no, this cannot > >> >> >> happen. > >> >> >> > >> >> >> Note that resizing a live string while you have a live SDATA pointer to > >> >> >> it might result in the SDATA pointer referring to the pre-modification > >> >> >> SDATA, not the current one. > >> >> > > >> >> > Who said anything about resizing? > >> >> > >> >> Non-resizing Faset is boring, works as expected. > >> > > >> > But it will modify the string whose SDATA pointer is different, no? > >> > >> Different how? > > > > Different as in: "pointing to a location different from the data of > > the string that was moved by GC". > > I'm sorry, this may sound like you as though I'm being pedantic, but I > need to be sure that we're talking about the same thing here: > > The scenario you describe involves a single string object, but two sdata > areas. The single string points to the first sdata area, but then is > modified to point to the second. Modifying the first sdata area after > the second one has been created will have no effect whatsoever on the > string. > > >> There's only one string in this example, and after the > >> string's SDATA changes, you can access the old SDATA, but it doesn't > >> belong to a string anymore. > > > > That's what I thought. So code that expects the old SDATA pointer to > > be pointing to the same string after GC will have a bug on the igc > > feature/igc: GC doesn't invalidate SDATA pointers. Resizing string > modification does (the other way is to call pin_string). > > > branch. It's a different bug than under the old GC, but still a bug. > > Right? > > If there's modification involved, it's a bug which will cause apparently > random behavior but no crashes. If there's no modification, it's a bug > only because the code might be used with alloc.c GC. > > >> > Lisp_Object my_string; > >> > char *p = SDATA (my_string); > >> > /* trigger GC */ > >> > Faset (my_string, make_fixnum (0), make_fixnum (97)); > >> > /* code which expects p[0] to be 'a' */ > >> > >> If that is all the code, this is safe with feature/igc. > > > > You say "safe", but it sounds like we have different definitions of > > "safe". Because later you say: > > I meant "won't crash, but most likely useless". > > >> MPS GC will never make pointers that live on the stack invalid. It's > >> transparent to correct client programs. So, yes, if you're asking about > >> MPS GC, the sequence is safe and doesn't crash, but may surprise you by > >> reproducing the old string data. > > > > Which seems to mean your "safe" means the code will not crash, whereas > > what I meant is the correctness of the code. Code which refers to the > > old string data is incorrect, even if it won't crash. > > I believe that's true. The above confirms what I thought about this, thanks. > > So my conclusion is that even with MPS, referring in code after GC to > > an SDATA pointer taken before GC leads to incorrect code. IOW, > > pointers to SDATA of a string cannot be relied upon to be valid > > (i.e. pointing to the correct string) across GC, so the code must > > re-initialize such pointer after any fragment that could GC. This was > > the rule with the old GC, and it should continue to be the rule with > > MPS. > > No. And this seems to contradict it. > GC doesn't invalidate SDATA pointers. I never said anything about invalidating the SDATA pointers. So if your "No" is because of the (lack of) invalidation, then it is not relevant to what I was asking about. Code can be incorrect even if it doesn't crash or otherwise refers to invalid pointers. It could be incorrect because it modifies incorrect memory believing that it modifies the data of a string.
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Sat, 25 Jan 2025 18:21:02 GMT) Full text and rfc822 format available.Message #77 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Sat, 25 Jan 2025 18:20:40 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Sat, 25 Jan 2025 09:45:19 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org >> >> "Eli Zaretskii" <eliz <at> gnu.org> writes: >> If there's modification involved, it's a bug which will cause apparently >> random behavior but no crashes. If there's no modification, it's a bug >> only because the code might be used with alloc.c GC. >> >> >> > Lisp_Object my_string; >> >> > char *p = SDATA (my_string); >> >> > /* trigger GC */ This line is not important. GC is transparent to client code like this, which doesn't violate the rules by hiding a pointer somewhere. >> >> > Faset (my_string, make_fixnum (0), make_fixnum (97)); ^^^^^ This is the important part. Faset can and does reallocate string data unpredictably. >> >> > /* code which expects p[0] to be 'a' */ >> >> >> >> If that is all the code, this is safe with feature/igc. >> > >> > You say "safe", but it sounds like we have different definitions of >> > "safe". Because later you say: >> >> I meant "won't crash, but most likely useless". >> >> >> MPS GC will never make pointers that live on the stack invalid. It's >> >> transparent to correct client programs. So, yes, if you're asking about >> >> MPS GC, the sequence is safe and doesn't crash, but may surprise you by >> >> reproducing the old string data. >> > >> > Which seems to mean your "safe" means the code will not crash, whereas >> > what I meant is the correctness of the code. Code which refers to the >> > old string data is incorrect, even if it won't crash. >> >> I believe that's true. > > The above confirms what I thought about this, thanks. The "sequence" I mentioned above included a string modification, which invalidates the SDATA pointer (sometimes). GC never does, so its inclusion in the sequence was irrelevant. >> > So my conclusion is that even with MPS, referring in code after GC to >> > an SDATA pointer taken before GC leads to incorrect code. IOW, >> > pointers to SDATA of a string cannot be relied upon to be valid >> > (i.e. pointing to the correct string) across GC, so the code must >> > re-initialize such pointer after any fragment that could GC. This was >> > the rule with the old GC, and it should continue to be the rule with >> > MPS. >> >> No. > > And this seems to contradict it. Here, you were talking ONLY about GC, the irrelevant part. GC never invalidates an SDATA pointer. Only string modification does. >> GC doesn't invalidate SDATA pointers. > > I never said anything about invalidating the SDATA pointers. So if Then they're valid. Period. Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Mon, 27 Jan 2025 09:10:02 GMT) Full text and rfc822 format available.Message #80 received at 75754 <at> debbugs.gnu.org (full text, mbox):
From: Michael Albinus <michael.albinus <at> gmx.de> To: Pip Cet <pipcet <at> protonmail.com> Cc: Eli Zaretskii <eliz <at> gnu.org>, eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Mon, 27 Jan 2025 10:08:58 +0100
Pip Cet <pipcet <at> protonmail.com> writes: Hi Pip, > Michael Albinus, could you please C-s ert-how in this message and > comment briefly if appropriate? I think adding a new defstruct may help > improve ert in at least three ways, including this one. Please indicate > whether this would be *potentially* acceptable; if it is, I'll prepare a > separate bug. If it isn't, I might have to think of something else. > (Brief proposal relevant to several issues: If we want to test GC bugs > more thoroughly, we need a way to tell ert to run them in a special way. > I proposed extending ert with an ert-how defstruct for the make > benchmark target, indicating "how" a test is to be run, and implemented > it, but I don't think I've sent the patch yet. Using this structure to > create conditions for testing for GC bugs would be possible, and it > would also be an easier way to make crash tests print a message before > they're run. No general objection from me. If you show the patch in a new bug, we'll be able to discuss. > Pip Best regards, Michael.
Pip Cet <pipcet <at> protonmail.com>
:Pip Cet <pipcet <at> protonmail.com>
:Message #85 received at 75754-done <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, 75754-done <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Mon, 03 Feb 2025 21:02:17 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Thu, 23 Jan 2025 23:58:41 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org >> >> 1. Reload format_start and format (and end, and format0) after every >> call which might have GC'd. If you think we should do this, please >> tell me whether lisp_string_width can GC. > > It can, if called with the last argument 'true' (because > find_automatic_composition calls into Lisp). Currently, we call it > with 'false', so it cannot. > >> More importantly, assuming it doesn't, document this in every >> function in the call tree starting at lisp_string_width so we don't >> accidentally change it. >> >> 2. memcpy the format string. Two-liner, more likely to fix the bug for >> good than (1), wastes more memory (since sa_avail has been negative >> since we entered the function, this is xmalloc'd memory). >> >> 3. replace format by a ptrdiff_t and all instances of *format by >> SREF (args[0], index). Faster than 2, but many changes hurting >> readability. > > I prefer (2), I think. Assuming it indeed fixes the problem. I had to modify it slightly to copy the final NUL character (no harm done if it isn't used, but it does appear to be relied upon in a few places). Pushed now. Please test, and I'm closing this bug with the following notes for followup bugs: 1. styled_format still uses excessive amounts of stack, both because it reserves enough space to print a long double using %f (eating up all of our sa_avail space on this system), and because it reserves strlen (format)/2 argument spec slots, rather than simply counting '%' characters as we copy the string. 2. the crash tests need to be installed once we have a way of presenting "Emacs crashed" messages nicely for ERT tets. Alternatively, just wait a few weeks; people who run new test suites on old Emacs versions can expect the occasional crash. 3. If anyone wants to avoid copying the format string again (which isn't really worth it because format strings aren't usually that long), please ensure that modifying the string from Lisp callbacks in the middle of the format won't cause crashes. Thanks! Pip
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Tue, 04 Feb 2025 12:22:01 GMT) Full text and rfc822 format available.Message #88 received at 75754-done <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: eggert <at> cs.ucla.edu, 75754-done <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Tue, 04 Feb 2025 14:21:00 +0200
> Date: Mon, 03 Feb 2025 21:02:17 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: eggert <at> cs.ucla.edu, 75754-done <at> debbugs.gnu.org > > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > >> Date: Thu, 23 Jan 2025 23:58:41 +0000 > >> From: Pip Cet <pipcet <at> protonmail.com> > >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org > >> > >> 1. Reload format_start and format (and end, and format0) after every > >> call which might have GC'd. If you think we should do this, please > >> tell me whether lisp_string_width can GC. > > > > It can, if called with the last argument 'true' (because > > find_automatic_composition calls into Lisp). Currently, we call it > > with 'false', so it cannot. > > > >> More importantly, assuming it doesn't, document this in every > >> function in the call tree starting at lisp_string_width so we don't > >> accidentally change it. > >> > >> 2. memcpy the format string. Two-liner, more likely to fix the bug for > >> good than (1), wastes more memory (since sa_avail has been negative > >> since we entered the function, this is xmalloc'd memory). > >> > >> 3. replace format by a ptrdiff_t and all instances of *format by > >> SREF (args[0], index). Faster than 2, but many changes hurting > >> readability. > > > > I prefer (2), I think. Assuming it indeed fixes the problem. > > I had to modify it slightly to copy the final NUL character (no harm > done if it isn't used, but it does appear to be relied upon in a few > places). > > Pushed now. Please test Thanks, the test I wrote now passes, so I've installed it. > 2. the crash tests need to be installed once we have a way of presenting > "Emacs crashed" messages nicely for ERT tets. Alternatively, just wait > a few weeks; people who run new test suites on old Emacs versions can > expect the occasional crash. Does this mean you didn't want me to install the test? It doesn't crash.
bug-gnu-emacs <at> gnu.org
:bug#75754
; Package emacs
.
(Tue, 04 Feb 2025 12:32:01 GMT) Full text and rfc822 format available.Message #91 received at 75754-done <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, 75754-done <at> debbugs.gnu.org Subject: Re: bug#75754: styled_format stack usage/GC protection Date: Tue, 04 Feb 2025 12:31:21 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> 2. the crash tests need to be installed once we have a way of presenting >> "Emacs crashed" messages nicely for ERT tets. Alternatively, just wait >> a few weeks; people who run new test suites on old Emacs versions can >> expect the occasional crash. > > Does this mean you didn't want me to install the test? It doesn't > crash. I was worried that people might run the current test suite with older versions of Emacs to get a which-test-was-fixed-when dataset, and that crashes may get in the way. But I don't really see a problem with that, to be honest, so no continued objections from me! Thanks for installing this! Pip
Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Wed, 05 Mar 2025 12:24:15 GMT) Full text and rfc822 format available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.