GNU bug report logs - #75768
Missing assertions to detect access to GC-freed strings

Previous Next

Package: emacs;

Reported by: Pip Cet <pipcet <at> protonmail.com>

Date: Wed, 22 Jan 2025 20:13:02 UTC

Severity: normal

To reply to this bug, email your comments to 75768 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#75768; Package emacs. (Wed, 22 Jan 2025 20:13:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Pip Cet <pipcet <at> protonmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 22 Jan 2025 20:13:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Missing assertions to detect access to GC-freed strings
Date: Wed, 22 Jan 2025 20:11:58 +0000
This is a spin-off from bug#75754.  I produced an example program to
demonstrate the correctness bug described therein:

(setq print-unreadable-function
      (lambda (&rest args)
        (garbage-collect)
        (make-string 100 ?θ)))

(message "%S" (format "%S %S" [1] (symbol-function '+)))

This program sometimes completes without throwing an error or causing a
segfault on the master branch, but produces incorrect output: the
correct output starts with "[1] "; the incorrect output omits the "[1]".

This is because the freed string looks like this:

(gdb) p info[0].argument
$7 = XIL(0x555555fdcea4)
(gdb) pp $
#<INVALID_LISP_OBJECT 0x555555fdcea4>
(gdb) p XSTRING($7)
$8 = (struct Lisp_String *) 0x555555fdcea0
(gdb) p *$
$9 = {
  u = {
    s = {
      size = 0,
      size_byte = -1,
      intervals = 0x0,
      data = 0x0
    },
    next = 0x0,
    gcaligned = 0 '\000'
  }
}

This is obviously invalid; as Emacs strings include a trailing NUL byte,
their data pointer must never be NULL.  However, as the size field is 0
(this is invalid for all but the pair of empty strings), we never
attempt to dereference the NULL pointer, and that means we do not abort
or crash soon enough.

This needs to be fixed.  GC bugs usually lead to segmentation faults and
crashes; producing incorrect output and silently corrupting user data is
much, much worse.

I'm proposing to add checks, at least temporarily, to the
--enable-checking builds: these checks would ensure that a string
contains a non-NULL data field if it is accessed in any way.

Even with this patch, we would fail to recognize impossible strings
automatically if even a single bit in the data pointer is set.  I don't
have a better solution yet, though, but the patch (to follow as soon as
this has a bug number) would have detected this particular problem a
little sooner.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75768; Package emacs. (Thu, 23 Jan 2025 17:43:02 GMT) Full text and rfc822 format available.

Message #8 received at 75768 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: 75768 <at> debbugs.gnu.org
Subject: Re: Missing assertions to detect access to GC-freed strings
Date: Thu, 23 Jan 2025 17:42:16 +0000
Pip Cet <pipcet <at> protonmail.com> writes:

> Even with this patch, we would fail to recognize impossible strings
> automatically if even a single bit in the data pointer is set.  I don't
> have a better solution yet, though, but the patch (to follow as soon as
> this has a bug number) would have detected this particular problem a
> little sooner.

Patch follows, I've confirmed that it would have detected this problem
very slightly sooner.

From 5ddd7c3cb24fe1d0b2f53a44500f11db69a09831 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet <at> protonmail.com>
Subject: [PATCH] Add debug assertions that we never access a free'd string
 (bug#75768)

* src/lisp.h (LIVE_STRING_P): New function.
(SDATA):
(SCHARS):
(STRING_BYTES):
(string_immovable_p): Use it in assertions.
---
 src/lisp.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/lisp.h b/src/lisp.h
index 28fa4c8037e..498aa2b98af 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -1617,6 +1617,13 @@ STRINGP (Lisp_Object x)
   return TAGGEDP (x, Lisp_String);
 }
 
+/* /\* Should always return true. */  */
+INLINE bool
+LIVE_STRING_P (Lisp_Object string)
+{
+  return XSTRING (string)->u.s.data != NULL;
+}
+
 INLINE void
 CHECK_STRING (Lisp_Object x)
 {
@@ -1685,6 +1692,7 @@ STRING_SET_MULTIBYTE (Lisp_Object str)
 INLINE unsigned char *
 SDATA (Lisp_Object string)
 {
+  eassert (LIVE_STRING_P (string));
   return XSTRING (string)->u.s.data;
 }
 INLINE char *
@@ -1706,6 +1714,7 @@ SSET (Lisp_Object string, ptrdiff_t index, unsigned char new)
 INLINE ptrdiff_t
 SCHARS (Lisp_Object string)
 {
+  eassert (LIVE_STRING_P (string));
   ptrdiff_t nchars = XSTRING (string)->u.s.size;
   eassume (0 <= nchars);
   return nchars;
@@ -1717,6 +1726,7 @@ SCHARS (Lisp_Object string)
 INLINE ptrdiff_t
 STRING_BYTES (struct Lisp_String *s)
 {
+  eassert (LIVE_STRING_P (make_lisp_ptr (s, Lisp_String)));
 #ifdef GC_CHECK_STRING_BYTES
   ptrdiff_t nbytes = string_bytes (s);
 #else
@@ -1753,6 +1763,7 @@ CHECK_STRING_NULL_BYTES (Lisp_Object string)
 INLINE bool
 string_immovable_p (Lisp_Object str)
 {
+  eassert (LIVE_STRING_P (str));
   return XSTRING (str)->u.s.size_byte == -3;
 }
 
-- 
2.47.1





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75768; Package emacs. (Thu, 23 Jan 2025 17:46:02 GMT) Full text and rfc822 format available.

Message #11 received at 75768 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: 75768 <at> debbugs.gnu.org
Subject: Re: Missing assertions to detect access to GC-freed strings
Date: Thu, 23 Jan 2025 17:44:54 +0000
"Pip Cet" <pipcet <at> protonmail.com> writes:

> Pip Cet <pipcet <at> protonmail.com> writes:
>
>> Even with this patch, we would fail to recognize impossible strings
>> automatically if even a single bit in the data pointer is set.  I don't
>> have a better solution yet, though, but the patch (to follow as soon as
>> this has a bug number) would have detected this particular problem a
>> little sooner.
>
> Patch follows, I've confirmed that it would have detected this problem

Sorry, wrong patch!

From 5018bdbf3729351f5273c7eecee3b67131aa2859 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet <at> protonmail.com>
Subject: [PATCH] Additional checks for access to freed strings (bug#75768)

* src/lisp.h (LIVE_STRING_P): New function.
(SDATA):
(SCHARS):
(STRING_BYTES):
(string_immovable_p): Use it in assertions.
---
 src/lisp.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/lisp.h b/src/lisp.h
index 28fa4c8037e..678f1ba0460 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -1672,6 +1672,13 @@ STRING_SET_MULTIBYTE (Lisp_Object str)
   XSTRING (str)->u.s.size_byte = XSTRING (str)->u.s.size;
 }
 
+/* Should always return true.  */
+INLINE bool
+LIVE_STRING_P (Lisp_Object string)
+{
+  return XSTRING (string)->u.s.data != NULL;
+}
+
 /* Convenience functions for dealing with Lisp strings.  */
 
 /* WARNING: Use the 'char *' pointers to string data with care in code
@@ -1685,6 +1692,7 @@ STRING_SET_MULTIBYTE (Lisp_Object str)
 INLINE unsigned char *
 SDATA (Lisp_Object string)
 {
+  eassert (LIVE_STRING_P (string));
   return XSTRING (string)->u.s.data;
 }
 INLINE char *
@@ -1706,6 +1714,7 @@ SSET (Lisp_Object string, ptrdiff_t index, unsigned char new)
 INLINE ptrdiff_t
 SCHARS (Lisp_Object string)
 {
+  eassert (LIVE_STRING_P (string));
   ptrdiff_t nchars = XSTRING (string)->u.s.size;
   eassume (0 <= nchars);
   return nchars;
@@ -1717,6 +1726,7 @@ SCHARS (Lisp_Object string)
 INLINE ptrdiff_t
 STRING_BYTES (struct Lisp_String *s)
 {
+  eassert (LIVE_STRING_P (make_lisp_ptr (s, Lisp_String)));
 #ifdef GC_CHECK_STRING_BYTES
   ptrdiff_t nbytes = string_bytes (s);
 #else
@@ -1753,6 +1763,7 @@ CHECK_STRING_NULL_BYTES (Lisp_Object string)
 INLINE bool
 string_immovable_p (Lisp_Object str)
 {
+  eassert (LIVE_STRING_P (str));
   return XSTRING (str)->u.s.size_byte == -3;
 }
 
-- 
2.47.1





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75768; Package emacs. (Thu, 23 Jan 2025 18:18:01 GMT) Full text and rfc822 format available.

Message #14 received at 75768 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> protonmail.com>
Cc: 75768 <at> debbugs.gnu.org
Subject: Re: bug#75768: Missing assertions to detect access to GC-freed strings
Date: Thu, 23 Jan 2025 20:16:46 +0200
> Date: Thu, 23 Jan 2025 17:42:16 +0000
> From:  Pip Cet via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> Pip Cet <pipcet <at> protonmail.com> writes:
> 
> > Even with this patch, we would fail to recognize impossible strings
> > automatically if even a single bit in the data pointer is set.  I don't
> > have a better solution yet, though, but the patch (to follow as soon as
> > this has a bug number) would have detected this particular problem a
> > little sooner.
> 
> Patch follows, I've confirmed that it would have detected this problem
> very slightly sooner.

Thanks.

> +/* /\* Should always return true. */  */

I wonder why did you comment out the comment?

> +INLINE bool
> +LIVE_STRING_P (Lisp_Object string)
> +{
> +  return XSTRING (string)->u.s.data != NULL;

Can we please have a comment here saying that sweep_strings sets the
data pointer to NULL, and thus this function checks whether a string
was GC'ed?

>  INLINE ptrdiff_t
>  SCHARS (Lisp_Object string)
>  {
> +  eassert (LIVE_STRING_P (string));
>    ptrdiff_t nchars = XSTRING (string)->u.s.size;
>    eassume (0 <= nchars);
>    return nchars;
> @@ -1717,6 +1726,7 @@ SCHARS (Lisp_Object string)
>  INLINE ptrdiff_t
>  STRING_BYTES (struct Lisp_String *s)
>  {
> +  eassert (LIVE_STRING_P (make_lisp_ptr (s, Lisp_String)));

I understand why we want an assertion in SDATA and SSDATA, but do we
really want it in SCHARS and STRING_BYTES?  Those do not access the
string text, so how can they be harmful?




This bug report was last modified 75 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.