GNU bug report logs - #78686
feature/igc: [PATCH] Make charset_table an exact root

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: emacs; Reported by: Helmut Eller <eller.helmut@HIDDEN>; Keywords: patch; dated Tue, 3 Jun 2025 19:45:08 UTC; Maintainer for emacs is bug-gnu-emacs@HIDDEN.

Message received at 78686 <at> debbugs.gnu.org:


Received: (at 78686) by debbugs.gnu.org; 5 Jun 2025 08:09:23 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Jun 05 04:09:23 2025
Received: from localhost ([127.0.0.1]:58436 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1uN5f8-0004Q9-UA
	for submit <at> debbugs.gnu.org; Thu, 05 Jun 2025 04:09:23 -0400
Received: from mail-wr1-x432.google.com ([2a00:1450:4864:20::432]:61820)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.84_2) (envelope-from <eller.helmut@HIDDEN>)
 id 1uN5f6-0004Pd-O3
 for 78686 <at> debbugs.gnu.org; Thu, 05 Jun 2025 04:09:21 -0400
Received: by mail-wr1-x432.google.com with SMTP id
 ffacd0b85a97d-3a503d9ef59so537091f8f.3
 for <78686 <at> debbugs.gnu.org>; Thu, 05 Jun 2025 01:09:20 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1749110954; x=1749715754; darn=debbugs.gnu.org;
 h=mime-version:user-agent:message-id:date:references:in-reply-to
 :subject:cc:to:from:from:to:cc:subject:date:message-id:reply-to;
 bh=4mIMfAm/zu3Jqj088+Nr/ChyUMPTYuICdbD4AOTMj88=;
 b=QE2Fgfonr9LwNuMEq5/7RdVZwZPWsZQ83OZJrsqjX3OfnQSdEYF204DYhI9tAh660N
 V/EZG1mMTA0+UKaAbNLfvyM2wiGetShUy3OYN+wqgzOTYbJapmAFN3TAE6J1CWc1tvjd
 dYiA2LvObwDZ881eAmoURiZFWsZKpQyvmlk6B4y0svLhXRHC+DSl4R7YvF83ZFyBq6CK
 /XV3HG9kyHifusdPOKmDhYLQ0+pMzT/oMaEC6dQFhmN/9PMDoU1kA1ycjm24EqBOJsK7
 XDW1ykNf9xvce1YvUSwuayljM/NdhaCWR+DBu+0pWnjxnri9FzlbGmQ7JdzqIupTcplP
 ymDQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1749110954; x=1749715754;
 h=mime-version:user-agent:message-id:date:references:in-reply-to
 :subject:cc:to:from:x-gm-message-state:from:to:cc:subject:date
 :message-id:reply-to;
 bh=4mIMfAm/zu3Jqj088+Nr/ChyUMPTYuICdbD4AOTMj88=;
 b=uRVr0uWAREuGOnSUrCNSiIxpt9FxHiJW+yMD76PkgU4wW0/sOP1Oi4iiTVyR3+x9dA
 yQR0WgaKhTlcScOZbM2r3Sc37t6hIWChMEAdcVner7eziU1lLSpvrrlRjUOThetXd051
 88wC+X4abeqMYleKTizLh+DTNdYpQMu2U2/QdLSqKV/SAX2IScoPw3YwEq2Z3WNCXMn3
 APEUMJuFRwlAq++9sguj8/uD2tArxMh44bBqGQFNtNZjrZZ1WKgmrb3qkwm8q0iMy8IT
 SNP2dHg3Aov0oAH0V1bwuGszo8yBn+PTQxTjut1vim87I7jbnXNl9hJQbTMCUYePcTMc
 RtxQ==
X-Gm-Message-State: AOJu0YwgZ7xQma1BmLBf4etNcSwHC/1KRVk7duqAYM3P0tCl9uMfYWvr
 ooH0rhnfMIB/IuxgT7/00KuXfHVnXNJKgPFUcyVHGQqDLzyEaib9ucXiGe7oqg==
X-Gm-Gg: ASbGncs/OMqubDEd4XkqHr2lV1j1kp+DqC4zXMnXQvbRY8z6l/M6NlfArgK1KwcJK8a
 mSf73A+95y7ZxOJV3loudvb6H2hdON+SGpdftHXOq0td6yxOex2V325fOE8w+3tEOU93Kw4vVqu
 2g7/uN1R9cBXtzU3Lv3JLxG+tSCpP2YubDkb5COmz8qMKnkEREW1360KS7C7xRgrMyZfRq/Cmiv
 DsWi0h7XoACNakgz8p7TPE+1dzKkeH8YjRr+yP+9B7xINcwlJwfK4W1Mtz0eWNwko+AtjbHN+JZ
 VQjJqnsUtu5KVPcauuZlkwRmxzc+rfnfnrCZjWfaNRTjTbz+
X-Google-Smtp-Source: AGHT+IFGbc2SCjp0p1XDGDN7Aqwrwmuf9e8/P3zhP6rckX8ZjqWZJOaf1/UZAYnYS7hjZWmM44UFTg==
X-Received: by 2002:a05:6000:4287:b0:3a4:d6ed:8e00 with SMTP id
 ffacd0b85a97d-3a51d959f72mr5108826f8f.33.1749110954076; 
 Thu, 05 Jun 2025 01:09:14 -0700 (PDT)
Received: from caladan ([89.107.110.32]) by smtp.gmail.com with ESMTPSA id
 ffacd0b85a97d-3a4efe5b87csm24164860f8f.12.2025.06.05.01.09.13
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Thu, 05 Jun 2025 01:09:13 -0700 (PDT)
From: Helmut Eller <eller.helmut@HIDDEN>
To: Pip Cet <pipcet@HIDDEN>
Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root
In-Reply-To: <87cybj8dqo.fsf@HIDDEN>
References: <87r000k3uv.fsf@HIDDEN> <87bjr3ejpr.fsf@HIDDEN>
 <87qzzziber.fsf@HIDDEN> <87cybj8dqo.fsf@HIDDEN>
Date: Thu, 05 Jun 2025 10:09:12 +0200
Message-ID: <875xhaip93.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 78686
Cc: 78686 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

On Wed, Jun 04 2025, Pip Cet wrote:

> As far as I can tell, there's just one significant problem: we have a
> root covering charset_table_init even though that array is unused in
> pdumped Emacs sessions.  It's not a huge problem, just 180 zero words
> fixed as Lisp objects, but it is unsatisfying and we should fix it.

It's more like 60kb of zeroes.

> So let's get rid of that unnecessary root.
>
> That leaves two very minor problems:
>
> 1. undumped Emacs sessions.  I don't think these are worth optimizing
> for.  Do you?

No. But there is also no need to make them gratuitously different from
normal sessions.

> 2. Emacs sessions which grow charset_table beyond the 180 entries in the
> dump.  Does this happen?  In any case, we do handle the case correctly,
> leaking memory but giving the user access to as many charsets as they
> desire.

Defining new charsets is presumably an exotic thing to do.

> There is also the long-term goal of making it so all objects in the
> pdump section can be relocated elsewhere, allowing us to free the
> section.  We should do so by tracing objects in the dump precisely so
> they can be moved out.

As my patch does, in contrast to your idea.

[...]
>> Second, I don't like the practice of using the GC header to create
>> "shadow types".
>
> Noted.  That's a good argument for not introducing new ones, but your
> patch doesn't introduce or delete any; it just copies things back from
> the dump heap (which is never freed) to an unprotected root in the BSS
> section or to xmalloc'd memory.

Exactly: my patch copies the charset table to the place where it
belongs, the space in the BSS section reserved for it.  Just like
builtin symbols and subrs are copied to their space in the BSS section.

[...]
>> Maybe, depending on how this charset table business goes.
>
> I don't see the dependency.

My motivation may depend on it.

> As far as I can tell, we need this patch.  It removes the extraneous
> root.

I like my patch better.

Helmut




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#78686; Package emacs. Full text available.

Message received at 78686 <at> debbugs.gnu.org:


Received: (at 78686) by debbugs.gnu.org; 4 Jun 2025 20:16:02 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Jun 04 16:16:02 2025
Received: from localhost ([127.0.0.1]:53199 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1uMuWm-0005Mr-TK
	for submit <at> debbugs.gnu.org; Wed, 04 Jun 2025 16:16:02 -0400
Received: from mail-10630.protonmail.ch ([79.135.106.30]:26851)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <pipcet@HIDDEN>)
 id 1uMuWk-00050q-82
 for 78686 <at> debbugs.gnu.org; Wed, 04 Jun 2025 16:15:59 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com;
 s=protonmail3; t=1749068149; x=1749327349;
 bh=DDFMohfGpxBOv3bOslRcaER2LKJn/PDA7RmYyupn6KI=;
 h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References:
 Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID:
 Message-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post;
 b=nTMWmnXiEg0c1e5szwwDGgWlK2orXO1w+WSxifbd2+JFXVS5ySpqc1D0MU2laDhcP
 3sCtf42wGmxLIyNB4zepGL3CGOurtpldb2uwhNJ2YfUan3n1ZQEjLuJTEcdtTBMXQT
 yqzJFKRGjd4QlwDotXaw5hVN8aYuGvjNVEPryEpurN5EXT8ntrBKkkdzx4rf3U317M
 HDQT6bxaIFRsPO+lcdan/um3apRwYpw52VXYQ2EAw95fQfMrfMaJp3QPYljkQ6jy3H
 n3E/3zcj+v742WN/3IpPFNuUiJS+F6m9pjf4PoW8tKC4IK39660VIt9KgAcjpTQ0zo
 LXOV/oae+nJaQ==
Date: Wed, 04 Jun 2025 20:15:46 +0000
To: Helmut Eller <eller.helmut@HIDDEN>
From: Pip Cet <pipcet@HIDDEN>
Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root
Message-ID: <87cybj8dqo.fsf@HIDDEN>
In-Reply-To: <87qzzziber.fsf@HIDDEN>
References: <87r000k3uv.fsf@HIDDEN> <87bjr3ejpr.fsf@HIDDEN>
 <87qzzziber.fsf@HIDDEN>
Feedback-ID: 112775352:user:proton
X-Pm-Message-ID: 445735e5b15450a204dfa8ea0c649b5bbcded4ae
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 78686
Cc: 78686 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

"Helmut Eller" <eller.helmut@HIDDEN> writes:

> On Wed, Jun 04 2025, Pip Cet wrote:
>
>> "Helmut Eller" <eller.helmut@HIDDEN> writes:
>>
>>> Here is a proposal to move charset_table out of the GC heap and make it
>>> an exact root area instead.
>>
>>> Previously, the charset_table was located in various places:
>>
>>> 1. in charset_table_init during bootstrap (ambiguous root)
>>> 2. on the GC heap (loaded from the dump; pinned)
>>> 3. on the malloc heap when it needed to grow (ambiguous root)
>>>
>>> Now, the charset table is copied back to charset_table_init after
>>> loading the dump and it is scanned as an exact root area.
>>
>> Thanks.  Seems like the wrong approach to me:
>>
>> This removes the case we want, which is case 2, by restoring the case we
>> don't want, by adding new and larger unprotected roots.  Shouldn't we
>> simply make it so if igc is in use, we only ever use case 2?
>
> Not necessarily.  My ideal solution would be to replace the
> charset.attributes field with something that doesn't need to be traced,
> e.g. an index into a Lisp_Vector of attributes.

I agree the charset design in general could do with some attention.  In
particular, we're still leaking the old charset tables when growing for
the second or subsequent times on the master branch, and I really think
if there were a good reason to do so someone would have updated the
comment by now.  Plus, there is no way to undefine charsets.

However, such changes should not be specific to feature/igc.

What problem are we trying to solve here?

As far as I can tell, there's just one significant problem: we have a
root covering charset_table_init even though that array is unused in
pdumped Emacs sessions.  It's not a huge problem, just 180 zero words
fixed as Lisp objects, but it is unsatisfying and we should fix it.

So let's get rid of that unnecessary root.

That leaves two very minor problems:

1. undumped Emacs sessions.  I don't think these are worth optimizing
for.  Do you?

2. Emacs sessions which grow charset_table beyond the 180 entries in the
dump.  Does this happen?  In any case, we do handle the case correctly,
leaking memory but giving the user access to as many charsets as they
desire.

There is also the long-term goal of making it so all objects in the
pdump section can be relocated elsewhere, allowing us to free the
section.  We should do so by tracing objects in the dump precisely so
they can be moved out.  We should not do so by explicitly memcpying
objects from the dump to xmalloc'd memory.

>> So why not just turn IGC_OBJ_DUMPED_CHARSET_TABLE into
>> IGC_OBJ_CHARSET_TABLE, remove some igc_asserts, and avoid roots
>> altogether, ensuring that root_create_charset_table is called only in
>> temacs, never in pdumped emacs?
>
> First, The IGC_OBJ_DUMPED_FOO types only exist because changing the GC
> heap format was easier than wrestling with the dumper.  There was the
> expectation that the dumper will eventually be extended to support what
> MPS actually needs: sections that correspond to MPS pools.

I'm not agreeing or disagreeing with that statement.  I think the MPS
tradeoff of not having snapshots and requiring client code to handle
dump/restore scenarios on its own was probably a good choice, and I see
no way of changing it now.

> Second, I don't like the practice of using the GC header to create
> "shadow types".

Noted.  That's a good argument for not introducing new ones, but your
patch doesn't introduce or delete any; it just copies things back from
the dump heap (which is never freed) to an unprotected root in the BSS
section or to xmalloc'd memory.

> I'd prefer to make them pvec types or keep 'em out of
> the GC heap.

I agree with the sentiment, but turning struct charset into a
pseudovector would be a significant change which should be done on the
master branch, too, IMHO, and would need to be discussed first.  I
assume there were good reasons for making it a flat table rather than a
vector of pseudovectors (it does save one level of indirection), but I'm
not sure those are still relevant.

>> Are you looking at kbd_buffer next?  That's a very large ambiguous root,
>> but a little tricky, not least due to the GCC bug.
>
> Maybe, depending on how this charset table business goes.

I don't see the dependency.

As far as I can tell, we need this patch.  It removes the extraneous
root.

From a13ef4ad9b97455c58682b4f616272a1c09c4b8f Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@HIDDEN>
Subject: [PATCH] Destroy the charset_table_init root when pdump is loaded
 (bug#78686)

At this point, pdumper has changed charset_table to point into the
dump, not to charset_table_init, which remains unused for the rest of
the Emacs session.  No need for a root.

* src/igc.c (igc_on_pdump_loaded): Destroy unnecessary root.
---
 src/igc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/igc.c b/src/igc.c
index f63b0fe8c88..ab61fa7d3d6 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -5204,6 +5204,8 @@ igc_on_pdump_loaded (void *dump_base, void *hot_start=
, void *hot_end,
 =09=09=09 (uint8_t *) pinned_objects_in_dump
 =09=09=09   + sizeof pinned_objects_in_dump,
 =09=09=09 "dump-pins");
+
+  igc_destroy_root_with_start (charset_table_init);
 }
=20
 void *
--=20
2.48.1

We should also add a comment explaining that pinned_objects_in_dump[1]
is necessary to keep the code space masks around:

From 48e21dc0e27e9d0a19579c5cd13cbad43533ada8 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@HIDDEN>
Subject: [PATCH] ; * src/igc.c (fix_charset_table): Add comment about untra=
ced
 ref.

---
 src/igc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/igc.c b/src/igc.c
index ab61fa7d3d6..7ff263255f9 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -1950,6 +1950,9 @@ fix_charset_table (mps_ss_t ss, struct charset *table=
, size_t nbytes)
   {
     for (size_t i =3D 0, len =3D nbytes / sizeof (struct charset); i < len=
; i++)
       IGC_FIX12_OBJ (ss, &table[i].attributes);
+    /* FIXME/igc: fix table[i].code_space_mask, once that's no longer an
+       interior pointer.  Currently, it's protected explicitly by
+       pinned_objects_in_dump[1] =3D=3D code_space_masks.  */
   }
   MPS_SCAN_END (ss);
   return MPS_RES_OK;
--=20
2.48.1

Finally, once we're convinced that it's not needed for anything else, we
can do this:

From 7111cb4562bbd89f12d6a7bf284ec95f93ccf69d Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@HIDDEN>
Subject: [PATCH] Make charset_table an exact root (bug#78686)

* src/igc.c (fix_charset_table): Adjust comment.
(igc_on_pdump_loaded): Turn charset_table into an exact rather than an
ambiguous (pinned) root.
---
 src/igc.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/igc.c b/src/igc.c
index 7ff263255f9..70e8eac49e9 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -1952,7 +1952,7 @@ fix_charset_table (mps_ss_t ss, struct charset *table=
, size_t nbytes)
       IGC_FIX12_OBJ (ss, &table[i].attributes);
     /* FIXME/igc: fix table[i].code_space_mask, once that's no longer an
        interior pointer.  Currently, it's protected explicitly by
-       pinned_objects_in_dump[1] =3D=3D code_space_masks.  */
+       pinned_objects_in_dump[0] =3D=3D code_space_masks.  */
   }
   MPS_SCAN_END (ss);
   return MPS_RES_OK;
@@ -5153,7 +5153,7 @@ check_dump (mps_addr_t start, mps_addr_t end)
   return true;
 }
=20
-static mps_addr_t pinned_objects_in_dump[3];
+static mps_addr_t pinned_objects_in_dump[2];
=20
 /* Called from pdumper_load.  [HOT_START, HOT_END) is the hot section of
    the dump.  [COL_START, COLD_END) is the cold section of the
@@ -5195,9 +5195,11 @@ igc_on_pdump_loaded (void *dump_base, void *hot_star=
t, void *hot_end,
   set_header (heap_end, IGC_OBJ_PAD, relocs_size, 0);
=20
   eassert (check_dump (h, cold_end));
+
+  igc_root_create_exact_ptr (&charset_table);
+
   /* Pin some stuff in the dump  */
   mps_addr_t pinned_roots[] =3D {
-    charset_table,
     cold_start, /* code_space_masks */
     cold_user_data_start,
   };
--=20
2.48.1

WDYT?

Pip





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#78686; Package emacs. Full text available.

Message received at 78686 <at> debbugs.gnu.org:


Received: (at 78686) by debbugs.gnu.org; 4 Jun 2025 18:56:07 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Jun 04 14:56:07 2025
Received: from localhost ([127.0.0.1]:52541 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1uMtHS-00037s-9Z
	for submit <at> debbugs.gnu.org; Wed, 04 Jun 2025 14:56:06 -0400
Received: from mail-wr1-x430.google.com ([2a00:1450:4864:20::430]:49177)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.84_2) (envelope-from <eller.helmut@HIDDEN>)
 id 1uMtHQ-000372-7t
 for 78686 <at> debbugs.gnu.org; Wed, 04 Jun 2025 14:56:04 -0400
Received: by mail-wr1-x430.google.com with SMTP id
 ffacd0b85a97d-3a3673e12c4so84566f8f.2
 for <78686 <at> debbugs.gnu.org>; Wed, 04 Jun 2025 11:56:04 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1749063358; x=1749668158; darn=debbugs.gnu.org;
 h=mime-version:user-agent:message-id:date:references:in-reply-to
 :subject:cc:to:from:from:to:cc:subject:date:message-id:reply-to;
 bh=uOPsM5WgB5hQGMN9zhVtaiSTSQs3P5OgMX1Tlu9LaYs=;
 b=jXmDtP+wknOlsn++UFAmRCMhzltd7FVdw5+g2ch5pydw7SY/yHTlTS1w+OmZ2IUlVQ
 rmOc/9UNfdQfUglwyB/tG1c3BRixjNQW/m/hBRSNAChB1frt5AqSIZpGXqrI4PqZPhFR
 AYiRS6631CwoSnx3756sdOg182bYrb3MDRMFmrbFzlBS5FjX9QJ7qyk0v94TRJE5d55e
 4LsBSvsTWx6xKDKvWTcouJg8jqVkDovSoWA11UtMTd1RKRw0puDVhWlj8iB8lhJBCUTy
 aNpfFNNNMzy7mppyC4ki4zmxTdpyK5l+msZflHt9hQAlqvfXpRlMxmlUASnWKCrQ4gUB
 EaMQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1749063358; x=1749668158;
 h=mime-version:user-agent:message-id:date:references:in-reply-to
 :subject:cc:to:from:x-gm-message-state:from:to:cc:subject:date
 :message-id:reply-to;
 bh=uOPsM5WgB5hQGMN9zhVtaiSTSQs3P5OgMX1Tlu9LaYs=;
 b=NtvSzXIub2adadtlDg7Ygahz29OskmryUR6PQ1r+Xiy8znHSoVsrF1px+VlHGrHKoK
 tQo4s14A6auSvOsGDgJGkyeO4/9JS1lieF6Xq5btUJ8Kd+zM0FUym5qC4akIiKyrehs9
 A4Eu0r6ma0kgoqQ43VYsYFxVLgn6U/YYpTtLOMHknLCcdCoIlsGAN7sT5e/Eukejp/YV
 0sdZ5cHpYrNjl8uOojlb6CyA7l0txGGRdFsxV+IYtWD+0q5PrEEgecFjqHwA+jjM4xC5
 TYwX++bjVKcqDpxrzhemGV2GzvL6lEHefJdFE+BUImtnEBPji3oqe50k5wdE9YgZ2RdV
 0vlQ==
X-Gm-Message-State: AOJu0YzXhQgSIlSFNrSpnBpKB5LLuZbJyB/2onLy0Jmmfx0W5T2LGD/T
 ouaxSe+JNZwXt4Fdj+YSVJ80jaw7o1oXxntbiMEGh+avt/jBpqKR9lC1K0k7vA==
X-Gm-Gg: ASbGncsMkKgwqBkjISsNszfcFdClDZnjyD0ZactoYlNnQ58w/61NOCNqRTNXnl0M1po
 Ym0O3JLP7Js7OhsYih4Ua/Ls0I0vGF/REd5RlN8dD77B/cdiYzFA/SJVg65FAUqd+TN5x7e042b
 DTXC6LL98TGL8uhmFP7Zef/bQN165jnrEEHm1H8NvSLKmOSU6NVskZsics7ZyUazSZi/hnMwwro
 O7CobJPUqNinwvCLlmeL+ZC7IGpzO12S2RuPEHTy7sXQ4dt2uuwi9EMc8UgnYeNplTNdBRE5pVo
 3GeHmao5zkMa+fo3FaOi6L26UT8N0rZMlMtg41uuVRScEZhh
X-Google-Smtp-Source: AGHT+IEoLkn0lASNi8amihdMcD5NEy9p8WKSR2/RVKLxgU9ZsRPu35HU/KNT4Z1GbGgDNmcfjt3JSw==
X-Received: by 2002:a05:6000:240d:b0:3a4:d99b:f51d with SMTP id
 ffacd0b85a97d-3a51d8efd25mr3834415f8f.11.1749063357661; 
 Wed, 04 Jun 2025 11:55:57 -0700 (PDT)
Received: from caladan ([89.107.110.32]) by smtp.gmail.com with ESMTPSA id
 ffacd0b85a97d-3a521c64d22sm2251045f8f.48.2025.06.04.11.55.56
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 04 Jun 2025 11:55:57 -0700 (PDT)
From: Helmut Eller <eller.helmut@HIDDEN>
To: Pip Cet <pipcet@HIDDEN>
Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root
In-Reply-To: <87bjr3ejpr.fsf@HIDDEN>
References: <87r000k3uv.fsf@HIDDEN> <87bjr3ejpr.fsf@HIDDEN>
Date: Wed, 04 Jun 2025 20:55:56 +0200
Message-ID: <87qzzziber.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 78686
Cc: 78686 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

On Wed, Jun 04 2025, Pip Cet wrote:

> "Helmut Eller" <eller.helmut@HIDDEN> writes:
>
>> Here is a proposal to move charset_table out of the GC heap and make it
>> an exact root area instead.
>
>> Previously, the charset_table was located in various places:
>
>> 1. in charset_table_init during bootstrap (ambiguous root)
>> 2. on the GC heap (loaded from the dump; pinned)
>> 3. on the malloc heap when it needed to grow (ambiguous root)
>>
>> Now, the charset table is copied back to charset_table_init after
>> loading the dump and it is scanned as an exact root area.
>
> Thanks.  Seems like the wrong approach to me:
>
> This removes the case we want, which is case 2, by restoring the case we
> don't want, by adding new and larger unprotected roots.  Shouldn't we
> simply make it so if igc is in use, we only ever use case 2?

Not necessarily.  My ideal solution would be to replace the
charset.attributes field with something that doesn't need to be traced,
e.g. an index into a Lisp_Vector of attributes.

[...]
> So why not just turn IGC_OBJ_DUMPED_CHARSET_TABLE into
> IGC_OBJ_CHARSET_TABLE, remove some igc_asserts, and avoid roots
> altogether, ensuring that root_create_charset_table is called only in
> temacs, never in pdumped emacs?

First, The IGC_OBJ_DUMPED_FOO types only exist because changing the GC
heap format was easier than wrestling with the dumper.  There was the
expectation that the dumper will eventually be extended to support what
MPS actually needs: sections that correspond to MPS pools.

Second, I don't like the practice of using the GC header to create
"shadow types".  I'd prefer to make them pvec types or keep 'em out of
the GC heap.

[...]
> Are you looking at kbd_buffer next?  That's a very large ambiguous root,
> but a little tricky, not least due to the GCC bug.

Maybe, depending on how this charset table business goes.

Helmut




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#78686; Package emacs. Full text available.

Message received at 78686 <at> debbugs.gnu.org:


Received: (at 78686) by debbugs.gnu.org; 4 Jun 2025 13:10:28 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Jun 04 09:10:28 2025
Received: from localhost ([127.0.0.1]:48737 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1uMnsy-0001Py-2K
	for submit <at> debbugs.gnu.org; Wed, 04 Jun 2025 09:10:28 -0400
Received: from mail-10630.protonmail.ch ([79.135.106.30]:18085)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <pipcet@HIDDEN>)
 id 1uMnsu-0001P5-7V
 for 78686 <at> debbugs.gnu.org; Wed, 04 Jun 2025 09:10:25 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com;
 s=protonmail3; t=1749042616; x=1749301816;
 bh=NEtdUDez1PpvWrNQWu+dXY2eNnhLRhd784sXWjRJ1kA=;
 h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References:
 Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID:
 Message-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post;
 b=UTAMjxyrxkyJ+zt0GSKpkh54OGmwP7/mSC/TVyBO2bSpp1DkbLQXxZUxo8gXLmsms
 YUyfl88/tlcEHyJqWobWZKZcgUXOX2wQIrLyU/M9cqy6UcbNYlE1eawk6SXR1X5ByL
 /F0sqHafLJdL5xq55YDkn6zhYElUKMU94OY5ph0d0tyRiB5vYRyh1N4dmGFcATAIuY
 6oxziU3yaq4DBV8PqSJ5Bo1L7aTpd1uW+iXVI5crukJZXkIszz1BCEdwR23jvCjmpv
 VpZV69WvwZAgdmYMO6K+0pzIitXP2uRIYFMfdbuA9uNbLwN18SXgpgEh3FyeI+80yk
 0CU5qcFihEm5w==
Date: Wed, 04 Jun 2025 13:10:12 +0000
To: Helmut Eller <eller.helmut@HIDDEN>
From: Pip Cet <pipcet@HIDDEN>
Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root
Message-ID: <87bjr3ejpr.fsf@HIDDEN>
In-Reply-To: <87r000k3uv.fsf@HIDDEN>
References: <87r000k3uv.fsf@HIDDEN>
Feedback-ID: 112775352:user:proton
X-Pm-Message-ID: ade9e6814df905f440c8c6d3f1332e4f1a0b9154
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 78686
Cc: 78686 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

"Helmut Eller" <eller.helmut@HIDDEN> writes:

> Here is a proposal to move charset_table out of the GC heap and make it
> an exact root area instead.

> Previously, the charset_table was located in various places:

> 1. in charset_table_init during bootstrap (ambiguous root)
> 2. on the GC heap (loaded from the dump; pinned)
> 3. on the malloc heap when it needed to grow (ambiguous root)
>
> Now, the charset table is copied back to charset_table_init after
> loading the dump and it is scanned as an exact root area.

Thanks.  Seems like the wrong approach to me:

This removes the case we want, which is case 2, by restoring the case we
don't want, by adding new and larger unprotected roots.  Shouldn't we
simply make it so if igc is in use, we only ever use case 2?

This code doesn't handle charset_table growing above the initial size,
then dumping.  We should attempt to support unusual dumps like that.

There is a warning about making charset_table_init[] large enough so it
doesn't grow during bootstrapping, but I think that comment is obsolete
and refers to unexec problems.  It should be fine now, I think, but
there really is no reason for an optimization that applies only to
temacs, anyway.

At the very least, this deserves an emacs_abort (not an igc_assert which
might be disabled) and a comment explaining why it's not usually hit: a
silent failure and crash is a bad way of indicating to a user they need
to bump the initial charset_table_size.

Scanning only to charset_used, rather than to the end of the array, is a
questionable optimization: scanning a small number of Qnils is better
than relying on what's potentially another thread making memory
modifications so they become visible in the right order.

charset_table should move to the cold section of the dump if it's no
longer expected to be used after restoring the dump.

I don't think "clone_charset_table" is a good name for the only thing
this function does, which is to restore a dumped charset table, making
the dumped copy unsafe to use in the process.

I don't understand why it needs to assert the arena is parked.  That
simply makes future changes more difficult.

So why not just turn IGC_OBJ_DUMPED_CHARSET_TABLE into
IGC_OBJ_CHARSET_TABLE, remove some igc_asserts, and avoid roots
altogether, ensuring that root_create_charset_table is called only in
temacs, never in pdumped emacs?

Also, I think removing pinned_objects_in_dump is a separate change.

Are you looking at kbd_buffer next?  That's a very large ambiguous root,
but a little tricky, not least due to the GCC bug.

Pip





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#78686; Package emacs. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 3 Jun 2025 19:44:23 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Jun 03 15:44:23 2025
Received: from localhost ([127.0.0.1]:41854 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1uMXYZ-0002LS-DH
	for submit <at> debbugs.gnu.org; Tue, 03 Jun 2025 15:44:22 -0400
Received: from lists.gnu.org ([2001:470:142::17]:58516)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eller.helmut@HIDDEN>)
 id 1uMXYS-0002KT-SE
 for submit <at> debbugs.gnu.org; Tue, 03 Jun 2025 15:44:14 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eller.helmut@HIDDEN>)
 id 1uMXYG-0007E2-9O
 for bug-gnu-emacs@HIDDEN; Tue, 03 Jun 2025 15:44:00 -0400
Received: from mail-wr1-x430.google.com ([2a00:1450:4864:20::430])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <eller.helmut@HIDDEN>)
 id 1uMXYC-0005SY-UZ
 for bug-gnu-emacs@HIDDEN; Tue, 03 Jun 2025 15:43:59 -0400
Received: by mail-wr1-x430.google.com with SMTP id
 ffacd0b85a97d-3a361b8a664so5814469f8f.3
 for <bug-gnu-emacs@HIDDEN>; Tue, 03 Jun 2025 12:43:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1748979834; x=1749584634; darn=gnu.org;
 h=mime-version:user-agent:message-id:date:subject:to:from:from:to:cc
 :subject:date:message-id:reply-to;
 bh=Id1Kfprj60ypRt//v5QW8hyjlk0VIXyT9qLsAW/rEgw=;
 b=ZUst/QAy7l7VkN5tauurKDI+ueVssIvVHwz3okS8yl2VBdEN0bFvSvI5HnAl/QQ8AP
 eRFPDx4pLi+X88/lYmmFmTFsw8lhp59+CJFvyHT79HuciAOsg4YFPE1vSYjGD0OxsM2S
 Wp7z7DxJTaXq6DiZntqiccQOxO1KSj/4qXX1zyeldFwYU4YRWQSVwLYwuzP6LG58hyi/
 fAg1+Xff5jdzaRkci2jaza24E3ryDN1FLBibdwqDmMZApPqnN3tL/LHQd1b4JUFR979e
 B6NsGRe3k7ubnTi61Jk2A5QgRWkGusCJAtvayAIyfwqNuiKPMa2uykd1L3ABq5F5eW1t
 hEfg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1748979834; x=1749584634;
 h=mime-version:user-agent:message-id:date:subject:to:from
 :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=Id1Kfprj60ypRt//v5QW8hyjlk0VIXyT9qLsAW/rEgw=;
 b=wMaM/FGIeHRCxRU6O9V+NUsQTJSjMcGyDjJZghEHa9xJX4NVWQ91pvOauGckecPaIo
 q1IOUlvHYduy2cMgfmO5YGeujalWYFjXP+RIikLJ08rGuw67/rda5Kveb7AVp2P/O/8u
 sAA0+gUvsDNaGLgYQJ6crttHUTRJSaTezgwWH63ykql+A1189oqlfZLlVXZfCa4BBFfz
 mvgpq2XgMCZ2Nms7dpNV2agySoOZhoTLdZ1F8bP10r/lwmeyWXhWuBwstH2IU/ULS4sZ
 QVsCGs42nQBIUkeNK1+vgXFIySkFr0c67RH9GIBxNojLoM/YQqYlZZp5XBo5/jHFeizj
 xSWg==
X-Gm-Message-State: AOJu0YyukDyn4EsBpG9ngJ3zV58/TO19m1cJWQhDbRmcnntNE4T/AyGa
 RfnFlrj+lPO6wwz6GERX8jnxmi7zMaJC57SY9LtDIpe58iWDbc5Yeq75q0SqAw==
X-Gm-Gg: ASbGnctzkEfwo49hmcqetcKw9i14hnsYpwht8wYytKEaNKtPDub8/vHW7w4sxyqEqn2
 klhGm9bVxctqKqUAguHaSAmK2ShdulmmPiTJ5p1QU3oR3IuY1kXphQSOxjm2UIdG3DrO4ELMSKY
 8PmRh4LQK5lyyOy+NoZRSZY9PuzgDfb3YhPFDgV5PJUXApmchiR7Ulc2zBywqd0FhLKt0y5OU1K
 WA6xTpdAmTmeprTjMqVircRLaR5aA3YjYM9WyUhPjvsfnw9a5eM1FkBn3c/luM8u2UDdOXv1S8j
 /cmuOkwjglqr6fr5hTXb5nR/djjAwHMUaZYrQxD4pyiguBOUyk/M90JiKy8=
X-Google-Smtp-Source: AGHT+IHNswQHst+f26a25fmVT9RTVtHVFU0tQjNlplMrwqXa4fYlICL8xd60L2kp8H/R+5F1J3jWMw==
X-Received: by 2002:a05:6000:2389:b0:3a4:deb9:8964 with SMTP id
 ffacd0b85a97d-3a51d91f899mr48273f8f.17.1748979834239; 
 Tue, 03 Jun 2025 12:43:54 -0700 (PDT)
Received: from caladan ([89.107.110.32]) by smtp.gmail.com with ESMTPSA id
 5b1f17b1804b1-450d7fc281csm179914285e9.40.2025.06.03.12.43.53
 for <bug-gnu-emacs@HIDDEN>
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 03 Jun 2025 12:43:53 -0700 (PDT)
From: Helmut Eller <eller.helmut@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: feature/igc: [PATCH] Make charset_table an exact root
X-Debbugs-Cc: 
Date: Tue, 03 Jun 2025 21:43:52 +0200
Message-ID: <87r000k3uv.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Received-SPF: pass client-ip=2a00:1450:4864:20::430;
 envelope-from=eller.helmut@HIDDEN; helo=mail-wr1-x430.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Score: 1.0 (+)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

--=-=-=
Content-Type: text/plain

Here is a proposal to move charset_table out of the GC heap and make it
an exact root area instead.


--=-=-=
Content-Type: text/x-diff
Content-Disposition: attachment;
 filename=0001-Make-charset_table-an-exact-root.patch

From ded35d23523a18b5c61dd7e7c2de92cb2dafc7ff Mon Sep 17 00:00:00 2001
From: Helmut Eller <eller.helmut@HIDDEN>
Date: Tue, 3 Jun 2025 17:30:33 +0200
Subject: [PATCH] Make charset_table an exact root

Previously, the charset_table was located in various places:

1. in charset_table_init during bootstrap (ambiguous root)
2. on the GC heap (loaded from the dump; pinned)
3. on the malloc heap when it needed to grow (ambiguous root)

Now, the charset table is copied back to charset_table_init after
loading the dump and it is scanned as an exact root area.

* src/igc.h (igc_root_create_area): New API.
(igc_scan_charset_table): New.
* src/igc.c (igc_scan_charset_table): Implementation.
(root_create_charset_table): Use it.
(clone_charset_table, clone_charset): New helpers.
(igc_on_pdump_loaded): Use it.
(fix_charset_table): Deleted.
* src/charset.c (Fdefine_charset_internal): Create/destory root areas
when allocating/freeing charset tables.
---
 src/charset.c | 19 +++++++-----
 src/igc.c     | 84 ++++++++++++++++++++++++++++++---------------------
 src/igc.h     |  4 +++
 3 files changed, 65 insertions(+), 42 deletions(-)

diff --git a/src/charset.c b/src/charset.c
index 9cbc899b354..2ee6fa0345c 100644
--- a/src/charset.c
+++ b/src/charset.c
@@ -1132,17 +1132,20 @@ DEFUN ("define-charset-internal", Fdefine_charset_internal,
 	     coding_system.charbuf[i] entries, which are 'int'.  */
 	  int old_size = charset_table_size;
 	  ptrdiff_t new_size = old_size;
-	  struct charset *new_table =
+	  struct charset *new_table
+	    = xpalloc (0, &new_size, 1,
+		       min (INT_MAX, MOST_POSITIVE_FIXNUM),
+		       sizeof *charset_table);
 #ifdef HAVE_MPS
-	    igc_xpalloc_ambig
-#else
-	    xpalloc
+	  memset (new_table, 0, new_size * sizeof *new_table);
+	  igc_root_create_area (new_table, new_table + new_size,
+			        igc_scan_charset_table, NULL, "charset-table");
 #endif
-	    (0, &new_size, 1,
-	     min (INT_MAX, MOST_POSITIVE_FIXNUM),
-	     sizeof *charset_table);
           memcpy (new_table, charset_table, old_size * sizeof *new_table);
-          charset_table = new_table;
+#ifdef HAVE_MPS
+	  igc_destroy_root_with_start (charset_table);
+#endif
+	  charset_table = new_table;
 	  charset_table_size = new_size;
 	  /* FIXME: This leaks memory, as the old charset_table becomes
 	     unreachable.  If the old charset table is charset_table_init
diff --git a/src/igc.c b/src/igc.c
index f63b0fe8c88..67aefb1b273 100644
--- a/src/igc.c
+++ b/src/igc.c
@@ -1725,6 +1725,20 @@ scan_tty_list (mps_ss_t ss, void *start, void *end, void *closure)
   return MPS_RES_OK;
 }
 
+mps_res_t
+igc_scan_charset_table (struct igc_ss *s, void *start, void *end, void *closure)
+{
+  mps_ss_t ss = (mps_ss_t) s;
+  struct charset *table = start;
+  MPS_SCAN_BEGIN (ss)
+  {
+    for (size_t i = 0, used = charset_table_used; i < used; i++)
+      IGC_FIX12_OBJ (ss, &table[i].attributes);
+  }
+  MPS_SCAN_END (ss);
+  return MPS_RES_OK;
+}
+
 /***********************************************************************
 			 Default pad, fwd, ...
  ***********************************************************************/
@@ -1941,20 +1955,6 @@ fix_handler (mps_ss_t ss, struct handler *h)
   return MPS_RES_OK;
 }
 
-static mps_res_t
-fix_charset_table (mps_ss_t ss, struct charset *table, size_t nbytes)
-{
-  igc_assert (table == charset_table);
-  igc_assert (nbytes == (charset_table_size * sizeof (struct charset)));
-  MPS_SCAN_BEGIN (ss)
-  {
-    for (size_t i = 0, len = nbytes / sizeof (struct charset); i < len; i++)
-      IGC_FIX12_OBJ (ss, &table[i].attributes);
-  }
-  MPS_SCAN_END (ss);
-  return MPS_RES_OK;
-}
-
 static mps_res_t fix_vector (mps_ss_t ss, struct Lisp_Vector *v);
 static mps_res_t fix_marker_vector (mps_ss_t ss, struct Lisp_Vector *v);
 static mps_res_t fix_weak_hash_table_strong_part (mps_ss_t ss, struct Lisp_Weak_Hash_Table_Strong_Part *t);
@@ -2025,6 +2025,7 @@ dflt_scan_obj (mps_ss_t ss, mps_addr_t base_start, mps_addr_t base_limit,
       case IGC_OBJ_DUMPED_BUFFER_TEXT:
       case IGC_OBJ_DUMPED_BIGNUM_DATA:
       case IGC_OBJ_DUMPED_BYTES:
+      case IGC_OBJ_DUMPED_CHARSET_TABLE:
 	/* Can occur in the dump.  */
 	break;
 
@@ -2080,11 +2081,6 @@ dflt_scan_obj (mps_ss_t ss, mps_addr_t base_start, mps_addr_t base_limit,
 			 fix_blv);
 	break;
 
-      case IGC_OBJ_DUMPED_CHARSET_TABLE:
-	IGC_FIX_CALL (ss, fix_charset_table (ss, (struct charset *)client,
-					     obj_size (header)));
-	break;
-
       case IGC_OBJ_WEAK_HASH_TABLE_STRONG_PART:
 	IGC_FIX_CALL_FN (ss, struct Lisp_Weak_Hash_Table_Strong_Part, client,
 			 fix_weak_hash_table_strong_part);
@@ -2982,8 +2978,8 @@ root_create_bc (struct igc_thread_list *t)
 static void
 root_create_charset_table (struct igc *gc)
 {
-  root_create_ambig (gc, charset_table_init,
-		     charset_table_init + ARRAYELTS (charset_table_init),
+  root_create_exact (gc, &charset_table_init, &charset_table_init + 1,
+		     (mps_area_scan_t) igc_scan_charset_table,
 		     "charset-table");
 }
 
@@ -3069,6 +3065,14 @@ igc_root_create_n (Lisp_Object start[], size_t n)
   return root_create_exact_n (start, n);
 }
 
+void
+igc_root_create_area (void *start, void *end, igc_scan_area_t f, void *closure,
+		      const char *label)
+{
+  root_create (global_igc, start, end, mps_rank_exact (), (mps_area_scan_t) f,
+	       closure, false, label);
+}
+
 static igc_root_list *
 root_find (void *start)
 {
@@ -5150,7 +5154,26 @@ check_dump (mps_addr_t start, mps_addr_t end)
   return true;
 }
 
-static mps_addr_t pinned_objects_in_dump[3];
+static void
+clone_charset (struct charset *src, struct charset *dst)
+{
+  *dst = *src;
+  if (src->code_space_mask)
+    {
+      size_t nbytes = 256;
+      dst->code_space_mask = xzalloc (nbytes);
+      memcpy (dst->code_space_mask, src->code_space_mask, nbytes);
+    }
+}
+
+static void
+clone_charset_table (size_t used, size_t size, struct charset src[size],
+		     struct charset dst[size])
+{
+  igc_assert (global_igc->park_count > 0);
+  for (size_t i = 0; i < used; i++)
+    clone_charset (&src[i], &dst[i]);
+}
 
 /* Called from pdumper_load.  [HOT_START, HOT_END) is the hot section of
    the dump.  [COL_START, COLD_END) is the cold section of the
@@ -5190,20 +5213,13 @@ igc_on_pdump_loaded (void *dump_base, void *hot_start, void *hot_end,
   set_header (hot_end, IGC_OBJ_PAD, discardable_size, 0);
   /* Ignore relocs */
   set_header (heap_end, IGC_OBJ_PAD, relocs_size, 0);
+  /* Copy charset_table out of the GC heap */
+  igc_assert (ARRAYELTS (charset_table_init) == charset_table_size);
+  clone_charset_table (charset_table_used, charset_table_size,
+		       charset_table, charset_table_init);
+  charset_table = charset_table_init;
 
   eassert (check_dump (h, cold_end));
-  /* Pin some stuff in the dump  */
-  mps_addr_t pinned_roots[] = {
-    charset_table,
-    cold_start, /* code_space_masks */
-    cold_user_data_start,
-  };
-  static_assert (sizeof pinned_roots == sizeof pinned_objects_in_dump);
-  memcpy (pinned_objects_in_dump, pinned_roots, sizeof pinned_roots);
-  igc_root_create_ambig (pinned_objects_in_dump,
-			 (uint8_t *) pinned_objects_in_dump
-			   + sizeof pinned_objects_in_dump,
-			 "dump-pins");
 }
 
 void *
diff --git a/src/igc.h b/src/igc.h
index 5e24c7c338a..7275eaf937f 100644
--- a/src/igc.h
+++ b/src/igc.h
@@ -152,7 +152,11 @@ #define EMACS_IGC_H
 void igc_root_destroy_comp_unit (struct Lisp_Native_Comp_Unit *u);
 void igc_root_destroy_comp_unit_eph (struct Lisp_Native_Comp_Unit *u);
 void *igc_root_create_n (Lisp_Object start[], size_t n);
+void igc_root_create_area (void *start, void *end, igc_scan_area_t f,
+			   void *closure, const char *label);
 void igc_destroy_root_with_start (void *start);
+igc_scan_result_t igc_scan_charset_table (struct igc_ss *, void *, void *,
+					  void *);
 size_t igc_header_size (void);
 char *igc_dump_finish_obj (void *client, enum igc_obj_type type,
 			   char *base, char *end);
-- 
2.39.5


--=-=-=--




Acknowledgement sent to Helmut Eller <eller.helmut@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#78686; Package emacs. Full text available.
Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.
Last modified: Thu, 5 Jun 2025 08:15:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.