GNU bug report logs - #70988
(read FUNCTION) uses Latin-1 [PATCH]


Package: emacs; Reported by: Mattias Engdegård <mattias.engdegard@HIDDEN>; Keywords: patch; dated Thu, 16 May 2024 18:14:01 UTC; Maintainer for emacs is bug-gnu-emacs@HIDDEN.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 13 Feb 2025 16:42:34 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Feb 13 11:42:34 2025
Received: from localhost ([127.0.0.1]:45197 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1ticIM-0003sT-Ch
	for submit <at> debbugs.gnu.org; Thu, 13 Feb 2025 11:42:34 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]:48760)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1ticIJ-0003s5-CB
 for 70988 <at> debbugs.gnu.org; Thu, 13 Feb 2025 11:42:32 -0500
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1ticID-00040h-4D; Thu, 13 Feb 2025 11:42:25 -0500
Date: Thu, 13 Feb 2025 18:42:13 +0200
Message-Id: <86v7tdhk4a.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Pip Cet <pipcet@HIDDEN>
In-Reply-To: <87h64yrl9a.fsf@HIDDEN> (message from Pip Cet on Thu, 13
 Feb 2025 14:08:02 +0000)
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
 <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
 <87h64yrl9a.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, mattias.engdegard@HIDDEN, stefankangas@HIDDEN,
 monnier@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

> Date: Thu, 13 Feb 2025 14:08:02 +0000
> From: Pip Cet <pipcet@HIDDEN>
> Cc: Mattias Engdegård <mattias.engdegard@HIDDEN>, 70988 <at> debbugs.gnu.org, Eli Zaretskii <eliz@HIDDEN>, monnier@HIDDEN
> 
> @@ -3058,7 +3058,8 @@ read_char_escape (Lisp_Object readcharfun, int next_char)
>        chr = c;
>        break;
>      }
> -  eassert (chr >= 0 && chr < (1 << CHARACTERBITS));
> +  if (chr < 0 || chr > (1 << CHARACTERBITS))
> +    invalid_syntax ("Invalid character", readcharfun);

Please leave the assertion in place here (in addition to the error
message), since an abort is easier to spot than an error message,
especially if someone catches errors at a higher level.

I'm okay with the rest of this patch, thanks.
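
As an illustration of the request above (not part of the patch under review), here is a minimal standalone sketch of a range check that keeps both the assertion and the recoverable error. All `sketch_` names are hypothetical, and `SKETCH_CHARACTERBITS = 22` is an assumption meant to mirror Emacs's `CHARACTERBITS`. Note that the exact negation of the original assertion is `chr < 0 || chr >= (1 << CHARACTERBITS)`; the quoted hunk tests `chr > (1 << CHARACTERBITS)`, which would let the value `1 << CHARACTERBITS` itself through.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: mirrors CHARACTERBITS in Emacs's lisp.h.  */
enum { SKETCH_CHARACTERBITS = 22 };

/* Possible outcomes of validating a character code.  */
enum sketch_outcome
{
  SKETCH_OK,     /* the value is a valid character code */
  SKETCH_ABORT,  /* checking build: the assertion would fire */
  SKETCH_ERROR   /* other builds: report a recoverable syntax error */
};

/* The condition stated by the original eassert.  */
static bool
sketch_char_in_range (long chr)
{
  return chr >= 0 && chr < (1L << SKETCH_CHARACTERBITS);
}

/* Model of "assertion plus error": when CHECKING is true the failed
   assertion is what the developer sees; otherwise the error path
   reports the invalid character instead of continuing.  */
static enum sketch_outcome
sketch_validate_char (long chr, bool checking)
{
  if (sketch_char_in_range (chr))
    return SKETCH_OK;
  return checking ? SKETCH_ABORT : SKETCH_ERROR;
}

/* Usage example covering both boundaries of the valid range.  */
static void
sketch_validate_demo (void)
{
  long limit = 1L << SKETCH_CHARACTERBITS;
  assert (sketch_validate_char ('a', true) == SKETCH_OK);
  assert (sketch_validate_char (limit - 1, false) == SKETCH_OK);
  /* The boundary value itself is already out of range.  */
  assert (sketch_validate_char (limit, false) == SKETCH_ERROR);
  assert (sketch_validate_char (-1, true) == SKETCH_ABORT);
}
```

In the real code the two outcomes would correspond to `eassert` aborting and `invalid_syntax` signaling; the sketch only models which of the two a given build would reach.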




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 13 Feb 2025 14:08:19 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Feb 13 09:08:19 2025
Received: from localhost ([127.0.0.1]:41467 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tiZt5-0004Ef-8g
	for submit <at> debbugs.gnu.org; Thu, 13 Feb 2025 09:08:19 -0500
Received: from mail-10631.protonmail.ch ([79.135.106.31]:47161)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <pipcet@HIDDEN>)
 id 1tiZt3-0004ER-F3
 for 70988 <at> debbugs.gnu.org; Thu, 13 Feb 2025 09:08:18 -0500
Date: Thu, 13 Feb 2025 14:08:02 +0000
To: Stefan Kangas <stefankangas@HIDDEN>
From: Pip Cet <pipcet@HIDDEN>
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
Message-ID: <87h64yrl9a.fsf@HIDDEN>
In-Reply-To: <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
 <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
Feedback-ID: 112775352:user:proton
X-Pm-Message-ID: 5859837e0ca61a25895985addd2d89ec7bc97378
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org,
 Mattias Engdegård <mattias.engdegard@HIDDEN>,
 Eli Zaretskii <eliz@HIDDEN>, monnier@HIDDEN
X-Spam-Score: -1.0 (-)

"Stefan Kangas" <stefankangas@HIDDEN> writes:

> Mattias Engdegård <mattias.engdegard@HIDDEN> writes:
>
>> After looking further into the Lisp reader/printer I found two more
>> silent Latin-1 assumptions. In all three cases, I firmly believe the
>> following to be true:
>>
>> * The behaviour is not intended but just code accidents.
>> * They should hardly affect any user code at all.
>> * They are nevertheless clear bugs which should be fixed.
>>
>> Further on this will have to wait until after Emacs 30 has been branched
>> to avoid delaying that more important task.
>
> FWIW, the proposed patch looks like a bug fix to me as well, so I think
> we should install it.

Here's a patch which avoids a few crashes in lread.c.  If we change
readchar, as proposed, to always treat a readcharfun's return value as
multibyte, but fail to verify that the return value satisfies
CHARACTERP, further problems will arise.

In the event that there is interest in fixing Emacs not to crash in
these circumstances, let me know and I can commit this.

Pip

From e9d48abeb199db7d76386639adadba5c7e45177c Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@HIDDEN>
Subject: [PATCH] Avoid crashes in lread.c when invalid characters are read

* src/lread.c (readchar): Don't crash for non-fixnum return values.
(read_filtered_event): Don't crash for invalid symbol properties.
(Fread_char):
(Fread_char_exclusive):
(character_name_to_code): Check 'FIXNUMP' before using 'XFIXNUM'.
(read_char_escape): Use 'invalid_syntax' rather than crashing for
invalid characters.
---
 src/lread.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/lread.c b/src/lread.c
index 6af95873bb8..5875b489c97 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -398,7 +398,7 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
 
   tem = call0 (readcharfun);
 
-  if (NILP (tem))
+  if (!FIXNUMP (tem))
     return -1;
   return XFIXNUM (tem);
 
@@ -816,7 +816,7 @@ read_filtered_event (bool no_switch_frame, bool ascii_required,
 	      tem1 = Fget (Fcar (tem), Qascii_character);
 	      /* Merge this symbol's modifier bits
 		 with the ASCII equivalent of its basic code.  */
-	      if (!NILP (tem1))
+	      if (FIXNUMP (tem1) && FIXNUMP (Fcar (Fcdr (tem))))
 		XSETFASTINT (val, XFIXNUM (tem1) | XFIXNUM (Fcar (Fcdr (tem))));
 	    }
 	}
@@ -898,7 +898,7 @@ DEFUN ("read-char", Fread_char, Sread_char, 0, 3, 0,
     }
   val = read_filtered_event (1, 1, 1, ! NILP (inherit_input_method), seconds);
 
-  return (NILP (val) ? Qnil
+  return (!FIXNUMP (val) ? Qnil
 	  : make_fixnum (char_resolve_modifier_mask (XFIXNUM (val))));
 }
 
@@ -976,7 +976,7 @@ DEFUN ("read-char-exclusive", Fread_char_exclusive, Sread_char_exclusive, 0, 3,
 
   val = read_filtered_event (1, 1, 0, ! NILP (inherit_input_method), seconds);
 
-  return (NILP (val) ? Qnil
+  return (!FIXNUMP (val) ? Qnil
 	  : make_fixnum (char_resolve_modifier_mask (XFIXNUM (val))));
 }
 
@@ -2820,7 +2820,7 @@ character_name_to_code (char const *name, ptrdiff_t name_len,
       invalid_syntax_lisp (CALLN (Fformat, format, namestr), readcharfun);
     }
 
-  return XFIXNUM (code);
+  return FIXNUMP (code) ? XFIXNUM (code) : -1;
 }
 
 /* Bound on the length of a Unicode character name.  As of
@@ -3058,7 +3058,8 @@ read_char_escape (Lisp_Object readcharfun, int next_char)
       chr = c;
       break;
     }
-  eassert (chr >= 0 && chr < (1 << CHARACTERBITS));
+  if (chr < 0 || chr > (1 << CHARACTERBITS))
+    invalid_syntax ("Invalid character", readcharfun);
 
   /* Apply Control modifiers, using the rules:
      \C-X = ascii_ctrl(nomod(X)) | mods(X)  if nomod(X) is one of:
-- 
2.48.1





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 13 Feb 2025 10:23:35 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Feb 13 05:23:35 2025
Received: from localhost ([127.0.0.1]:40759 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tiWNb-00067R-EZ
	for submit <at> debbugs.gnu.org; Thu, 13 Feb 2025 05:23:35 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]:40768)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1tiWNY-000677-Bm
 for 70988 <at> debbugs.gnu.org; Thu, 13 Feb 2025 05:23:33 -0500
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1tiWNS-00007E-SK; Thu, 13 Feb 2025 05:23:27 -0500
Date: Thu, 13 Feb 2025 12:23:21 +0200
Message-Id: <86ed02i1nq.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Pip Cet <pipcet@HIDDEN>
In-Reply-To: <87a5aqcg2z.fsf@HIDDEN> (message from Pip Cet on Thu, 13
 Feb 2025 10:08:54 +0000)
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
 <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
 <877c5vf730.fsf@HIDDEN> <86r043j77n.fsf@HIDDEN>
 <87ed02ewnl.fsf@HIDDEN> <86lduajsdi.fsf@HIDDEN>
 <87a5aqcg2z.fsf@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, mattias.engdegard@HIDDEN, stefankangas@HIDDEN,
 monnier@HIDDEN
X-Spam-Score: -3.3 (---)

> Date: Thu, 13 Feb 2025 10:08:54 +0000
> From: Pip Cet <pipcet@HIDDEN>
> Cc: stefankangas@HIDDEN, mattias.engdegard@HIDDEN, 70988 <at> debbugs.gnu.org, monnier@HIDDEN
> 
> "Eli Zaretskii" <eliz@HIDDEN> writes:

[...]

This style of "discussion" leads nowhere useful, so I'm bowing out of
it.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 13 Feb 2025 10:09:11 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Feb 13 05:09:11 2025
Received: from localhost ([127.0.0.1]:40690 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tiW9f-0002Mo-09
	for submit <at> debbugs.gnu.org; Thu, 13 Feb 2025 05:09:11 -0500
Received: from mail-10630.protonmail.ch ([79.135.106.30]:12931)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <pipcet@HIDDEN>)
 id 1tiW9b-0002M9-B6
 for 70988 <at> debbugs.gnu.org; Thu, 13 Feb 2025 05:09:08 -0500
Date: Thu, 13 Feb 2025 10:08:54 +0000
To: Eli Zaretskii <eliz@HIDDEN>
From: Pip Cet <pipcet@HIDDEN>
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
Message-ID: <87a5aqcg2z.fsf@HIDDEN>
In-Reply-To: <86lduajsdi.fsf@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
 <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
 <877c5vf730.fsf@HIDDEN> <86r043j77n.fsf@HIDDEN>
 <87ed02ewnl.fsf@HIDDEN> <86lduajsdi.fsf@HIDDEN>
Feedback-ID: 112775352:user:proton
X-Pm-Message-ID: ee9fc3ea3b0c996611da08624e35f180b4d1fb5d
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 1.0 (+)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, mattias.engdegard@HIDDEN, stefankangas@HIDDEN,
 monnier@HIDDEN
X-Spam-Score: -1.0 (-)

"Eli Zaretskii" <eliz@HIDDEN> writes:

>> Date: Wed, 12 Feb 2025 20:27:58 +0000
>> From: Pip Cet <pipcet@HIDDEN>
>> Cc: stefankangas@HIDDEN, mattias.engdegard@HIDDEN, 70988 <at> debbugs.gnu.org, monnier@HIDDEN
>>
>> "Eli Zaretskii" <eliz@HIDDEN> writes:
>>
>> >> --- a/src/lread.c
>> >> +++ b/src/lread.c
>> >> @@ -398,9 +398,12 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
>> >>
>> >>    tem = call0 (readcharfun);
>> >>
>> >> -  if (NILP (tem))
>> >> +  if (!CHARACTERP (tem))
>> >>      return -1;
>> >> -  return XFIXNUM (tem);
>> >> +  if (multibyte && !ASCII_CHAR_P (XFIXNAT (tem)))
>> >> +    *multibyte = true;
>> >> +
>> >> +  return XFIXNAT (tem);
>> >
>> > AFAIU, the proposed patch was just a bugfix, whereas the above also
>> > changes behavior in backward-incompatible ways.
>>
>> The other way around, I think: the first proposed patch changed the
>> behavior of readchar to always set the multibyte flag when a function
>> was used, resulting in the creation of symbols whose ASCII names are
>> multibyte strings.  The previous behavior was never to set the multibyte
>> flag, which was correct for ASCII strings but not multibyte ones.
>>
>> This patch retains the previous behavior for ASCII symbols, but sets the
>> multibyte flag for non-ASCII symbols, which seems the best we can do if
>> we're given a simple function.
>
> I'm talking about the CHARACTERP test (why not FIXNUMP?), and the

The function is supposed to return a character, not just any fixnum.

> addition of ASCII_CHAR_P test (why would we want an ASCII character
> to never be considered multibyte?).

It's the other way around, again: if there's a non-ASCII character, we
treat the stream as multibyte; if there are ONLY ASCII characters, we
treat it as unibyte.
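
The rule described in the paragraph above can be sketched as a small standalone function (an illustration only, not code from lread.c). The `sketch_` names are hypothetical, and the ASCII test assumes that ASCII characters are exactly the codes 0 through 127.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ASCII characters are assumed to be the codes 0..127.  */
static bool
sketch_ascii_char_p (long c)
{
  return c >= 0 && c < 0x80;
}

/* Return true if any of the N character codes in CHARS is non-ASCII.
   The flag starts out false (unibyte) and, once set, is never
   cleared, so a single non-ASCII character marks the whole stream
   as multibyte.  */
static bool
sketch_stream_is_multibyte (const long *chars, size_t n)
{
  bool multibyte = false;
  for (size_t i = 0; i < n; i++)
    if (!sketch_ascii_char_p (chars[i]))
      multibyte = true;
  return multibyte;
}

/* Usage example: an all-ASCII name stays unibyte, while a name
   containing U+00E5 is treated as multibyte.  */
static void
sketch_multibyte_demo (void)
{
  const long ascii_only[] = { 'f', 'o', 'o' };
  const long with_a_ring[] = { 'b', 0xE5, 'r' };
  assert (!sketch_stream_is_multibyte (ascii_only, 3));
  assert (sketch_stream_is_multibyte (with_a_ring, 3));
}
```
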

>> If we want to change symbol names to always be multibyte strings, we can
>> do that, but then we probably want to do that for all streams.
>
> I don't understand why you are talking about symbols: AFAIU this code
> is used in many other cases as well.  But even for symbols: why change
> the current behavior of making their names multibyte?

The current behavior is to make their names unibyte!  The current
behavior is *changed* by the first patch, and *retained* by my patch.

>> It also fixes yet another XFIXNUM crash, but those (there are more in
>> lread.c, it seems) should be fixed independently.
>
> I'm okay with adding a FIXNUMP test (which happens in the debugging
> builds anyway, so any violations probably never happen), but using
> CHARACTERP changes behavior.

If you count "avoids further crashes" as "changes behavior", yes.

readcharfun is supposed to return a character or -1.  Some callers
assume the return value is a valid character, and will crash otherwise.
I haven't checked all of them because there are many.
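
To make the difference between the two checks concrete, here is a hedged standalone model (not Emacs code). `struct sketch_value` is a hypothetical stand-in for a Lisp object, and `SKETCH_MAX_CHAR = 0x3FFFFF` is an assumption meant to mirror Emacs's `MAX_CHAR`. A fixnum-only test accepts negative or oversized integers, whereas a character test rejects them, so callers only ever see a valid character or -1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a Lisp value.  */
enum sketch_tag { SKETCH_NIL, SKETCH_FIXNUM, SKETCH_OTHER };

struct sketch_value
{
  enum sketch_tag tag;
  long fixnum;  /* meaningful only when tag == SKETCH_FIXNUM */
};

/* Assumption: mirrors MAX_CHAR in Emacs's character.h.  */
#define SKETCH_MAX_CHAR 0x3FFFFFL

/* Any integer value passes the fixnum test.  */
static bool
sketch_fixnump (struct sketch_value v)
{
  return v.tag == SKETCH_FIXNUM;
}

/* Only integers in [0, SKETCH_MAX_CHAR] pass the character test.  */
static bool
sketch_characterp (struct sketch_value v)
{
  return sketch_fixnump (v) && v.fixnum >= 0 && v.fixnum <= SKETCH_MAX_CHAR;
}

/* Model of reading from a function stream: anything that is not a
   character is reported as -1.  */
static long
sketch_readchar (struct sketch_value (*readcharfun) (void))
{
  struct sketch_value tem = readcharfun ();
  if (!sketch_characterp (tem))
    return -1;
  return tem.fixnum;
}

/* Sample stream functions used by the usage example.  */
static struct sketch_value
sketch_return_a (void)
{
  return (struct sketch_value) { SKETCH_FIXNUM, 'a' };
}

static struct sketch_value
sketch_return_huge (void)
{
  return (struct sketch_value) { SKETCH_FIXNUM, SKETCH_MAX_CHAR + 1 };
}

static struct sketch_value
sketch_return_other (void)
{
  return (struct sketch_value) { SKETCH_OTHER, 0 };
}

/* Usage example: the oversized fixnum passes the fixnum test but is
   still rejected by the character test.  */
static void
sketch_readchar_demo (void)
{
  assert (sketch_readchar (sketch_return_a) == 'a');
  assert (sketch_fixnump (sketch_return_huge ()));
  assert (sketch_readchar (sketch_return_huge) == -1);
  assert (sketch_readchar (sketch_return_other) == -1);
}
```
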

>> However, it does give us the ability to extend the API so
>> readcharfun could return a single character string, unibyte or
>> multibyte, to be handled appropriately.
>
> This is also a change in behavior.

Yes, of course, which is why it's a separate proposal and not part of
the patch.

Pip





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 13 Feb 2025 06:01:28 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Feb 13 01:01:28 2025
Received: from localhost ([127.0.0.1]:39648 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tiSHs-0002gR-Dj
	for submit <at> debbugs.gnu.org; Thu, 13 Feb 2025 01:01:27 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]:46544)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1tiSHp-0002gA-RN
 for 70988 <at> debbugs.gnu.org; Thu, 13 Feb 2025 01:01:22 -0500
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1tiSHg-0002jl-Ft; Thu, 13 Feb 2025 01:01:13 -0500
Date: Thu, 13 Feb 2025 08:00:57 +0200
Message-Id: <86lduajsdi.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Pip Cet <pipcet@HIDDEN>
In-Reply-To: <87ed02ewnl.fsf@HIDDEN> (message from Pip Cet on Wed, 12
 Feb 2025 20:27:58 +0000)
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
 <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
 <877c5vf730.fsf@HIDDEN> <86r043j77n.fsf@HIDDEN>
 <87ed02ewnl.fsf@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, mattias.engdegard@HIDDEN, stefankangas@HIDDEN,
 monnier@HIDDEN
X-Spam-Score: -3.3 (---)

> Date: Wed, 12 Feb 2025 20:27:58 +0000
> From: Pip Cet <pipcet@HIDDEN>
> Cc: stefankangas@HIDDEN, mattias.engdegard@HIDDEN, 70988 <at> debbugs.gnu.org, monnier@HIDDEN
> 
> "Eli Zaretskii" <eliz@HIDDEN> writes:
> 
> >> --- a/src/lread.c
> >> +++ b/src/lread.c
> >> @@ -398,9 +398,12 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
> >>
> >>    tem = call0 (readcharfun);
> >>
> >> -  if (NILP (tem))
> >> +  if (!CHARACTERP (tem))
> >>      return -1;
> >> -  return XFIXNUM (tem);
> >> +  if (multibyte && !ASCII_CHAR_P (XFIXNAT (tem)))
> >> +    *multibyte = true;
> >> +
> >> +  return XFIXNAT (tem);
> >
> > AFAIU, the proposed patch was just a bugfix, whereas the above also
> > changes behavior in backward-incompatible ways.
> 
> The other way around, I think: the first proposed patch changed the
> behavior of readchar to always set the multibyte flag when a function
> was used, resulting in the creation of symbols whose ASCII names are
> multibyte strings.  The previous behavior was never to set the multibyte
> flag, which was correct for ASCII strings but not multibyte ones.
> 
> This patch retains the previous behavior for ASCII symbols, but sets the
> multibyte flag for non-ASCII symbols, which seems the best we can do if
> we're given a simple function.

I'm talking about the CHARACTERP test (why not FIXNUMP?), and the
addition of ASCII_CHAR_P test (why would we want an ASCII character
to never be considered multibyte?).

> If we want to change symbol names to always be multibyte strings, we can
> do that, but then we probably want to do that for all streams.

I don't understand why you are talking about symbols: AFAIU this code
is used in many other cases as well.  But even for symbols: why change
the current behavior of making their names multibyte?

> It also fixes yet another XFIXNUM crash, but those (there are more in
> lread.c, it seems) should be fixed independently.

I'm okay with adding a FIXNUMP test (which happens in the debugging
builds anyway, so any violations probably never happen), but using
CHARACTERP changes behavior.

> However, it does give us the ability to extend the API so
> readcharfun could return a single character string, unibyte or
> multibyte, to be handled appropriately.

This is also a change in behavior.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 12 Feb 2025 20:28:15 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Feb 12 15:28:15 2025
Received: from localhost ([127.0.0.1]:38541 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tiJLD-0001M6-AD
	for submit <at> debbugs.gnu.org; Wed, 12 Feb 2025 15:28:15 -0500
Received: from mail-10631.protonmail.ch ([79.135.106.31]:52263)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <pipcet@HIDDEN>)
 id 1tiJLA-0001Le-4u
 for 70988 <at> debbugs.gnu.org; Wed, 12 Feb 2025 15:28:13 -0500
Date: Wed, 12 Feb 2025 20:27:58 +0000
To: Eli Zaretskii <eliz@HIDDEN>
From: Pip Cet <pipcet@HIDDEN>
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
Message-ID: <87ed02ewnl.fsf@HIDDEN>
In-Reply-To: <86r043j77n.fsf@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
 <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
 <877c5vf730.fsf@HIDDEN> <86r043j77n.fsf@HIDDEN>
Feedback-ID: 112775352:user:proton
X-Pm-Message-ID: de393a67019a052cc13c333ef2c17ec680ce0d35
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, mattias.engdegard@HIDDEN, stefankangas@HIDDEN,
 monnier@HIDDEN
X-Spam-Score: -1.0 (-)

"Eli Zaretskii" <eliz@HIDDEN> writes:

>> Date: Wed, 12 Feb 2025 16:42:43 +0000
>> From: Pip Cet <pipcet@HIDDEN>
>> Cc: Mattias Engdegård <mattias.engdegard@HIDDEN>, 70988@debbugs.gnu.org, Eli Zaretskii <eliz@HIDDEN>, monnier@HIDDEN
>>
>> The alternative patch would look something like this:
>>
>> From bbc65c9be7ccebf034f4d10f018a076ef1e8a4e9 Mon Sep 17 00:00:00 2001
>> From: Pip Cet <pipcet@HIDDEN>
>> Subject: [PATCH] Auto-detect multibyteness of readchar funs (bug#70988)
>>
>> * src/lread.c (readchar): Set *MULTIBYTE if we detect a multibyte
>> character.  Return -1 for non-characters rather than crashing.
>> ---
>>  src/lread.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/lread.c b/src/lread.c
>> index 6af95873bb8..c18c1be3cf5 100644
>> --- a/src/lread.c
>> +++ b/src/lread.c
>> @@ -398,9 +398,12 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
>>
>>    tem = call0 (readcharfun);
>>
>> -  if (NILP (tem))
>> +  if (!CHARACTERP (tem))
>>      return -1;
>> -  return XFIXNUM (tem);
>> +  if (multibyte && !ASCII_CHAR_P (XFIXNAT (tem)))
>> +    *multibyte = true;
>> +
>> +  return XFIXNAT (tem);
>
> AFAIU, the proposed patch was just a bugfix, whereas the above also
> changes behavior in backward-incompatible ways.

The other way around, I think: the first proposed patch changed the
behavior of readchar to always set the multibyte flag when a function
was used, resulting in the creation of symbols whose ASCII names are
multibyte strings.  The previous behavior was never to set the multibyte
flag, which was correct for ASCII strings but not multibyte ones.

This patch retains the previous behavior for ASCII symbols, but sets the
multibyte flag for non-ASCII symbols, which seems the best we can do if
we're given a simple function.

If we want to change symbol names to always be multibyte strings, we can
do that, but then we probably want to do that for all streams.

It also fixes yet another XFIXNUM crash, but those (there are more in
lread.c, it seems) should be fixed independently.  However, it does give
us the ability to extend the API so readcharfun could return a
single-character string, unibyte or multibyte, to be handled appropriately.
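
For illustration only, here is a standalone sketch of the logic the
alternative patch puts in the function-source branch of readchar.  It
does not use the real Emacs definitions: the `_sketch` helpers, the
`NOT_A_CHAR` marker and the three sample sources are hypothetical
stand-ins, and a plain `long` stands in for the Lisp_Object returned
by call0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Emacs internals; these are not the
   definitions used by lread.c.  */
#define MAX_CHAR_SKETCH 0x3FFFFF  /* largest Emacs character code */
#define NOT_A_CHAR (-2)           /* models nil or any non-character */

typedef long (*readcharfun_t) (void);

/* Models CHARACTERP: a non-negative integer within character range.  */
static bool
characterp_sketch (long c)
{
  return 0 <= c && c <= MAX_CHAR_SKETCH;
}

/* Models ASCII_CHAR_P.  */
static bool
ascii_char_p_sketch (long c)
{
  return 0 <= c && c < 0x80;
}

/* Models the patched branch: return -1 for anything that is not a
   character, and set *MULTIBYTE only when a non-ASCII character is
   seen.  The flag is never cleared here.  */
int
readchar_sketch (readcharfun_t fun, bool *multibyte)
{
  assert (fun != NULL);
  long tem = fun ();

  if (!characterp_sketch (tem))
    return -1;
  if (multibyte && !ascii_char_p_sketch (tem))
    *multibyte = true;

  return (int) tem;
}

/* Sample sources, purely for demonstration.  */
long ascii_source (void) { return 'a'; }
long nonascii_source (void) { return 0xE5; }  /* U+00E5, a-ring */
long eof_source (void) { return NOT_A_CHAR; }
```

With these sample sources, `ascii_source` leaves the caller's flag
untouched, `nonascii_source` sets it, and `eof_source` yields -1
instead of being passed to an unchecked integer extraction.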

Pip





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 12 Feb 2025 19:26:03 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Feb 12 14:26:03 2025
Received: from localhost ([127.0.0.1]:38341 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tiIN0-0003hg-Ph
	for submit <at> debbugs.gnu.org; Wed, 12 Feb 2025 14:26:03 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]:45976)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1tiIMy-0003h9-QO
 for 70988 <at> debbugs.gnu.org; Wed, 12 Feb 2025 14:26:01 -0500
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1tiIMs-00070b-BT; Wed, 12 Feb 2025 14:25:54 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=yI+GCQTgIINZ4Q7AX7ArfqFZSC7kV5DjX7WMX993psc=; b=SkLBVUEFmpb0CFYx9EVp
 a5z2I4ABtyf6XIH4XSlMjCdautkAtAlCk3LO2TVZVjw1iduf22p6e8PTgPnU3MkWGuuLvgravuR2S
 SCO559UR2HSi+N/T3bBXbjSb2bSsumZltKxc3269SPhTatztYdWeUZUeMFZIOApiM39hOgdBgI0Ap
 OrYBoVwyCFq8Q0Pb1s4W1p5F0f4lPsbGH55IJUN1v2RPs/mF9YLBG0pu4dAuKljZlB0UkrX0bc8kj
 r+8okrap6dp8HSambnDdWc+1c1fA1MXfv7yTRQdutj77OoTs/yPSN1+5lXUcD7HDR+yQ0ercmehjd
 rAs/H/Pw6RAsEg==;
Date: Wed, 12 Feb 2025 21:25:48 +0200
Message-Id: <86r043j77n.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Pip Cet <pipcet@HIDDEN>
In-Reply-To: <877c5vf730.fsf@HIDDEN> (message from Pip Cet on Wed, 12
 Feb 2025 16:42:43 +0000)
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
 <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
 <877c5vf730.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, mattias.engdegard@HIDDEN, stefankangas@HIDDEN,
 monnier@HIDDEN

> Date: Wed, 12 Feb 2025 16:42:43 +0000
> From: Pip Cet <pipcet@HIDDEN>
> Cc: Mattias Engdegård <mattias.engdegard@HIDDEN>, 70988 <at> debbugs.gnu.org, Eli Zaretskii <eliz@HIDDEN>, monnier@HIDDEN
> 
> The alternative patch would look something like this:
> 
> >From bbc65c9be7ccebf034f4d10f018a076ef1e8a4e9 Mon Sep 17 00:00:00 2001
> From: Pip Cet <pipcet@HIDDEN>
> Subject: [PATCH] Auto-detect multibyteness of readchar funs (bug#70988)
> 
> * src/lread.c (readchar): Set *MULTIBYTE if we detect a multibyte
> character.  Return -1 for non-characters rather than crashing.
> ---
>  src/lread.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/src/lread.c b/src/lread.c
> index 6af95873bb8..c18c1be3cf5 100644
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -398,9 +398,12 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
>  
>    tem = call0 (readcharfun);
>  
> -  if (NILP (tem))
> +  if (!CHARACTERP (tem))
>      return -1;
> -  return XFIXNUM (tem);
> +  if (multibyte && !ASCII_CHAR_P (XFIXNAT (tem)))
> +    *multibyte = true;
> +
> +  return XFIXNAT (tem);

AFAIU, the proposed patch was just a bugfix, whereas the above also
changes behavior in backward-incompatible ways.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 12 Feb 2025 16:43:00 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Feb 12 11:43:00 2025
Received: from localhost ([127.0.0.1]:37985 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tiFpE-00015O-1U
	for submit <at> debbugs.gnu.org; Wed, 12 Feb 2025 11:43:00 -0500
Received: from mail-4316.protonmail.ch ([185.70.43.16]:48057)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <pipcet@HIDDEN>)
 id 1tiFpB-000155-2W
 for 70988 <at> debbugs.gnu.org; Wed, 12 Feb 2025 11:42:57 -0500
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com;
 s=protonmail3; t=1739378570; x=1739637770;
 bh=zIAQkgJNK8GYMVqSzepUJEOUwv5OAihfD4LqunkRgL4=;
 h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References:
 Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID:
 Message-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post;
 b=wDZXy6xi/OafCATKKecPhxKwlUDHvbecVavHY8i80oUnkmH2hMCxbKTFpM4STo5tC
 yQQQ+6ZpMxt/zpsUQrNFQIdyJYHph027nyfx5+cut0RhLx++lA+eySNbC5fe/ox7pC
 LV5evWUUcFgHCkhh9tuN/5d0mArZQgu3JsrWTt/V+61DP1jBHjAvU0smgG79/RDb+s
 v5Z1P5+MQMGk9Tf8PlZUOmp3ZHGQzlQmwkxFqTJF4TwAu7iAdSXBqvkLkDFbcGkNbA
 OWdk503nIoqm3BZSt+Gfi0XrqEUmLHTWd4JLLB29zruCeobOHVEqTJKhXoAcerz2+W
 IV2sUm2cmWVRQ==
Date: Wed, 12 Feb 2025 16:42:43 +0000
To: Stefan Kangas <stefankangas@HIDDEN>
From: Pip Cet <pipcet@HIDDEN>
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
Message-ID: <877c5vf730.fsf@HIDDEN>
In-Reply-To: <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
 <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
Feedback-ID: 112775352:user:proton
X-Pm-Message-ID: 6accb3f2b485f3248cd09e52cd5c2fbf5292c26b
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org,
 =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= <mattias.engdegard@HIDDEN>,
 Eli Zaretskii <eliz@HIDDEN>, monnier@HIDDEN

"Stefan Kangas" <stefankangas@HIDDEN> writes:

> Mattias Engdegård <mattias.engdegard@HIDDEN> writes:
>
>> After looking further into the Lisp reader/printer I found two more
>> silent Latin-1 assumptions. In all three cases, I firmly believe the
>> following to be true:
>>
>> * The behaviour is not intended but just code accidents.
>> * They should hardly affect any user code at all.
>> * They are nevertheless clear bugs which should be fixed.
>>
>> Further on this will have to wait until after Emacs 30 has been branched
>> to avoid delaying that more important task.
>
> FWIW, the proposed patch looks like a bug fix to me as well, so I think
> we should install it.

I think we should consider whether we want to force multibyte to true
for all functions, even those that never return non-ASCII chars.  Also,
the code appears to use XFIXNUM on a Lisp_Object that might not be a
fixnum.

IIUC, the difference is that all-ASCII strings would be unibyte strings
in some circumstances.

The alternative patch would look something like this:

From bbc65c9be7ccebf034f4d10f018a076ef1e8a4e9 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@HIDDEN>
Subject: [PATCH] Auto-detect multibyteness of readchar funs (bug#70988)

* src/lread.c (readchar): Set *MULTIBYTE if we detect a multibyte
character.  Return -1 for non-characters rather than crashing.
---
 src/lread.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/lread.c b/src/lread.c
index 6af95873bb8..c18c1be3cf5 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -398,9 +398,12 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
 
   tem = call0 (readcharfun);
 
-  if (NILP (tem))
+  if (!CHARACTERP (tem))
     return -1;
-  return XFIXNUM (tem);
+  if (multibyte && !ASCII_CHAR_P (XFIXNAT (tem)))
+    *multibyte = true;
+
+  return XFIXNAT (tem);
 
  read_multibyte:
   if (unread_char >= 0)
-- 
2.48.1






Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 12 Feb 2025 14:53:09 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Feb 12 09:53:09 2025
Received: from localhost ([127.0.0.1]:33571 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1tiE6u-0008SF-Fl
	for submit <at> debbugs.gnu.org; Wed, 12 Feb 2025 09:53:08 -0500
Received: from mail-ej1-x62c.google.com ([2a00:1450:4864:20::62c]:53652)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.84_2) (envelope-from <stefankangas@HIDDEN>)
 id 1tiE6r-0008Ra-Aa
 for 70988 <at> debbugs.gnu.org; Wed, 12 Feb 2025 09:53:05 -0500
Received: by mail-ej1-x62c.google.com with SMTP id
 a640c23a62f3a-ab7f76aeedbso87809966b.3
 for <70988 <at> debbugs.gnu.org>; Wed, 12 Feb 2025 06:53:05 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1739371979; x=1739976779; darn=debbugs.gnu.org;
 h=content-transfer-encoding:cc:to:subject:message-id:date
 :mime-version:references:in-reply-to:from:from:to:cc:subject:date
 :message-id:reply-to;
 bh=tHYJKzY6PioyJd2McsVMUaOy588AphsoY0Pxx0bHJ0U=;
 b=MnItIGoC/uKSgC5kp1MfZb7nPWkpVqnPdlp/TEgeyPhV8vA+bAl8xkckl/iTPCaQe5
 ozFHb88Yq1g20UITrt9e9PgCAmIVZ6Pv2xc7RK87d8cqW8oHCnJLSrhBuw6cXMIc4uc7
 9k02lJ5dD17QuE9vzgYFp/ARlDgoZpgop8B/FjG+bY+KgpR3+GWSKHgl9BtergiPMcRf
 Gmsl9KJBSPsdmzo16jNeTgVIwWdx3umQ3pIeLnGEN1PzMJsM8NExxaeAXhup0RaE/j9U
 SwCya4xqD0OtD1OdhYVPyLecHvz9fxdtosnd6hGuwDhOWBtn6KQyMJdRQ/xtqxxKIfm6
 FjyQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1739371979; x=1739976779;
 h=content-transfer-encoding:cc:to:subject:message-id:date
 :mime-version:references:in-reply-to:from:x-gm-message-state:from:to
 :cc:subject:date:message-id:reply-to;
 bh=tHYJKzY6PioyJd2McsVMUaOy588AphsoY0Pxx0bHJ0U=;
 b=Tu2ukn6GZK6zFJHYdmQ2u420VC9JtA/75LydWUB1LWaAvgPey4TNeK14W5+fa5Ty6F
 izCTzvN2NjL31mSCUs/cNjzztjpT2+J3gK13ort/Dks7+IpQ0QWxlxpMfI3jo4XXPAZ0
 GAqxVObJeu88VjtppsnzkOqWVfKOzhE5PI4CZtPWoIXH1jldMemvJiwdNFsxAGx7/sss
 Owf7XzdGb0R/D3rAceCjxWQJFNbYbLi/xm84szZAniuss7xUNK2AqqOal1Wjk7Gnrs6z
 jHMaqLvtLT9LprKt25JRErIJk6rwLmIWGpkbr9aW+LgoZACZTsIeZexLIKsZOngwF0Id
 pSbQ==
X-Forwarded-Encrypted: i=1;
 AJvYcCWNAJX2v4ShQ8Kx3JIJGu061W/RKsily1aPnEGRirXA+eNZ11/1WlUcuJbpJWRsW/BNJRRvfg==@debbugs.gnu.org
X-Gm-Message-State: AOJu0YyBLNO6Slclb8q2cGeLGzM3iu23mi3F4w26rhTAbtzJQwrlbDE6
 CmJzkzvJPUqN2av/7iNC4I1mZuuRIA46ll1kNvLutvgDOvAzEhlnods+pHea2sMtQVyEpehEf4P
 ebhzxmD4EReU/tWLssnpmysuIDDPci2TZqmJfuQ==
X-Gm-Gg: ASbGncvluArfgdoObFAuHVPim6IEC8YaUbOb2pBSF4qZDG/iF2KejBwvpCPKoyfEHCT
 eNQMRCDvkwt+c6mDdjiUlZFyS8CS9HGCrfZ5wK7+KlMZpJ/9lopxAxe4dcVNF8aJ28zIClc5E40
 E=
X-Google-Smtp-Source: AGHT+IEIm7r6NexmT5/0LHMhHMn2+L+Y5Ma6SYsUXQ5GKACptrBmlg07l/r3KRFNC8RhzE+zBf1q4IUK33lFB6VFV1I=
X-Received: by 2002:a05:6402:51cf:b0:5dc:796f:fc86 with SMTP id
 4fb4d7f45d1cf-5deadd9d31cmr8517127a12.16.1739371978926; Wed, 12 Feb 2025
 06:52:58 -0800 (PST)
Received: from 753933720722 named unknown by gmailapi.google.com with
 HTTPREST; Wed, 12 Feb 2025 06:52:58 -0800
From: Stefan Kangas <stefankangas@HIDDEN>
In-Reply-To: <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
 <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
MIME-Version: 1.0
Date: Wed, 12 Feb 2025 06:52:58 -0800
X-Gm-Features: AWEUYZmOJ-oSFx_tghElPWrXNxNdRuIEQEO3o-3y5GFEfFLXh60uts3sHS1ZmIA
Message-ID: <CADwFkm=7A1XvBma-aTEkjYjJhrzCVwLDL2XYR2kJbfu0cJKBBA@HIDDEN>
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
To: =?UTF-8?Q?Mattias_Engdeg=C3=A5rd?= <mattias.engdegard@HIDDEN>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, Eli Zaretskii <eliz@HIDDEN>,
 monnier@HIDDEN

Mattias Engdegård <mattias.engdegard@HIDDEN> writes:

> After looking further into the Lisp reader/printer I found two more
> silent Latin-1 assumptions. In all three cases, I firmly believe the
> following to be true:
>
> * The behaviour is not intended but just code accidents.
> * They should hardly affect any user code at all.
> * They are nevertheless clear bugs which should be fixed.
>
> Further on this will have to wait until after Emacs 30 has been branched
> to avoid delaying that more important task.

FWIW, the proposed patch looks like a bug fix to me as well, so I think
we should install it.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 30 May 2024 15:44:39 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu May 30 11:44:39 2024
Received: from localhost ([127.0.0.1]:43578 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sChxH-0003wX-7J
	for submit <at> debbugs.gnu.org; Thu, 30 May 2024 11:44:39 -0400
Received: from mail-lf1-f48.google.com ([209.85.167.48]:55464)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mattias.engdegard@HIDDEN>) id 1sChxF-0003wE-CN
 for 70988 <at> debbugs.gnu.org; Thu, 30 May 2024 11:44:37 -0400
Received: by mail-lf1-f48.google.com with SMTP id
 2adb3069b0e04-5295e488248so1161678e87.2
 for <70988 <at> debbugs.gnu.org>; Thu, 30 May 2024 08:44:26 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1717083801; x=1717688601; darn=debbugs.gnu.org;
 h=to:references:message-id:content-transfer-encoding:cc:date
 :in-reply-to:from:subject:mime-version:sender:from:to:cc:subject
 :date:message-id:reply-to;
 bh=F76qA3naXA6dbOpkal3jxcXa25NQiaoi5E13Q14KveY=;
 b=OWY8vQe4PcwZr9s3O25S/Whsrmmn9EDS02uHMUbCmIX0jbH7vUAvGSg2lIoXCSi696
 /mLQI/IWBEEVf8uzKFjQ4BTqaSvWvooIQiGQB1+p09/NkowPMdmNvkVR+rQm8noCGkLQ
 KUgCYJAf3JkGLytnJ/jSADN1lOcjbA0slsNzpumDxZsqNwLcQ+y4EMoFuRP6fOcTHHlp
 q833Qi/sWbaG7roc6K6t1U1mcKd/xCPdzUzbxVsw9HWlZSh30RwCmkGbdw/7qYT5Iu6C
 Lvo/7piIM80cpCEIBosQBSloJpie8AnbDUd8y4iwJQ8KEUh0FvGCxfsJl1rPz9v+mZx+
 Ez5A==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1717083801; x=1717688601;
 h=to:references:message-id:content-transfer-encoding:cc:date
 :in-reply-to:from:subject:mime-version:sender:x-gm-message-state
 :from:to:cc:subject:date:message-id:reply-to;
 bh=F76qA3naXA6dbOpkal3jxcXa25NQiaoi5E13Q14KveY=;
 b=v975XZfLR1pZYPD9ezBqFQcmzZHPtudupvS+n4o0al1WcCDQDCQZhS3FK3yOvwv4D7
 F87pmYRBTA9zemcTlYd6+pqEK6LUWzLYbpSG+1O4tBw0EdjFX0vOwOSwbN5sbe79A+OE
 Rdm9kTwWWkpg5wYP+swFqofArrxIN5UICIzzrzm0cH7hnvbPmTVnnOV8WR8z9rvblBlF
 SdjXQBnfTYM8foptDwp0NG/YL4w6C09hJiu9m8ovOGc04ykOIMiSlUrlu/sAwrmrp9zs
 zLHDtX90tA3le6sthHG4Vv8OM+v88XTG492T7ZDtgUAMelbaW8deOdsuKaDu7ssQiCg8
 f5pQ==
X-Gm-Message-State: AOJu0YyAeVh5bCbJko8LNy+hKb0Kje0Tp3gng1/zXWUNXyx2METHUV1b
 mcXFgg+YZSJS5PuaJRz7cuYfmUYYOBfVAY21gNqmTOy9+C74HyRo
X-Google-Smtp-Source: AGHT+IGG1/Om2QiBGQNfyEyp2obKGvfJZsrH7LZNYgsRvejMhwqZx6PvgTvkiEhjPMQRMUmxv7QVXA==
X-Received: by 2002:ac2:5496:0:b0:529:1dd4:3e76 with SMTP id
 2adb3069b0e04-52b7d480865mr1425913e87.59.1717083800308; 
 Thu, 30 May 2024 08:43:20 -0700 (PDT)
Received: from smtpclient.apple (c80-217-1-132.bredband.tele2.se.
 [80.217.1.132]) by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-52b6f7abff1sm401491e87.260.2024.05.30.08.43.19
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 30 May 2024 08:43:19 -0700 (PDT)
Content-Type: text/plain;
	charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.15\))
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= <mattias.engdegard@HIDDEN>
In-Reply-To: <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
Date: Thu, 30 May 2024 17:43:19 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <EF4677D0-BE1D-46FF-8BD2-60F553756F0D@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN> <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
X-Mailer: Apple Mail (2.3654.120.0.1.15)
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, monnier@HIDDEN

After looking further into the Lisp reader/printer I found two more
silent Latin-1 assumptions. In all three cases, I firmly believe the
following to be true:

* The behaviour is not intended but just code accidents.
* They should hardly affect any user code at all.
* They are nevertheless clear bugs which should be fixed.

Further on this will have to wait until after Emacs 30 has been branched
to avoid delaying that more important task.





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 17 May 2024 17:09:27 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri May 17 13:09:27 2024
Received: from localhost ([127.0.0.1]:56239 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1s815D-0004Cm-0m
	for submit <at> debbugs.gnu.org; Fri, 17 May 2024 13:09:27 -0400
Received: from mail-lf1-f54.google.com ([209.85.167.54]:45306)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mattias.engdegard@HIDDEN>) id 1s815B-0004Bq-P7
 for 70988 <at> debbugs.gnu.org; Fri, 17 May 2024 13:09:26 -0400
Received: by mail-lf1-f54.google.com with SMTP id
 2adb3069b0e04-51f2ebbd8a7so2795206e87.2
 for <70988 <at> debbugs.gnu.org>; Fri, 17 May 2024 10:09:22 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1715965697; x=1716570497; darn=debbugs.gnu.org;
 h=to:references:message-id:content-transfer-encoding:cc:date
 :in-reply-to:from:subject:mime-version:sender:from:to:cc:subject
 :date:message-id:reply-to;
 bh=cMem+VUmFnLOtC5CHR1mAmg8csBzq41/14DGUnnpirk=;
 b=YXD7nGYSYRMUK9Fc7oJKw9bwnzRdgJsTB9pnIRR02pImLM10kBWMra+iHf5g4Fn9kE
 tHN4rTfganiUSN290N0NimXOjIHfIhCnyw3seJjuxbSjejVqaN4GSOL3pY2lMXUBBqQy
 JNHxWDbAfjZW2itCjV0PfNBHscf92F7+LiU+FqX3xXdObDw4EKvsUrseyBMFFfa7Vtlx
 e8sBnf1MtLNHo2B2j+RPxiBriXys2qI2Ssj1pPntbAW6aqUOHjVienrx67k9F2ztyR57
 Hj14TYKFeHb50ksTiOom0F0DipRW4MWY3Y948nudwijzW3wzAN//mezcEPpXA/SomN9H
 lOsg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1715965697; x=1716570497;
 h=to:references:message-id:content-transfer-encoding:cc:date
 :in-reply-to:from:subject:mime-version:sender:x-gm-message-state
 :from:to:cc:subject:date:message-id:reply-to;
 bh=cMem+VUmFnLOtC5CHR1mAmg8csBzq41/14DGUnnpirk=;
 b=RCinhl2F53NeWRxrlTzKawyBOLg4LwyDaQ12Jxtrj8IYbOrYgn5D1LJ15vIy1kThF4
 iGzVehiyqZbVkiSGL+AxMEVj2vhtSq7awlLQ9OUkLcLy9RLGjITglhTSJh7+MMcYZ25M
 KREV4472CcOGYL+Ig5hFIURxPeZcHBrk7/5nEhok7wGvzWkQybe1BAtDsYWUybxiuW5o
 SJVbaHQqLPJ/zp0exOoyE+/3flYOWyDc3nLdvf2hqz6d+CyVMcHskbfoFSgK5yW9XB/i
 LKW/d9A1qN9n3v/RSBeGqNFUwCDX0vQqMDhHLWe/O7sQA925CDxNGZr8ByIGL5JdFtcD
 Pcjg==
X-Gm-Message-State: AOJu0YzQiR2one/UiNx43xWzZln37E4z6U25Uv5FEnPiRdWS0SuEInik
 s9XxFdR1skB7w4zE6rXTm/EPqD09l3D18FZbvZVu1YyaFBZM8tAD
X-Google-Smtp-Source: AGHT+IFd7vD4L5YLK5vKH5BZaodPofsIO7ey8+rtizceDnctrAXF2axv6c8ywN7OITZrC9YChu19WQ==
X-Received: by 2002:a05:6512:32ae:b0:523:8a14:9149 with SMTP id
 2adb3069b0e04-5238a1491b0mr6251486e87.21.1715965696554; 
 Fri, 17 May 2024 10:08:16 -0700 (PDT)
Received: from smtpclient.apple (c80-217-1-132.bredband.tele2.se.
 [80.217.1.132]) by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-521f38d8b44sm3376307e87.209.2024.05.17.10.08.16
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Fri, 17 May 2024 10:08:16 -0700 (PDT)
Content-Type: text/plain;
	charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.15\))
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= <mattias.engdegard@HIDDEN>
In-Reply-To: <8634qghg2j.fsf@HIDDEN>
Date: Fri, 17 May 2024 19:08:15 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <74B41A66-5B3C-4A09-A5F4-A389464BDA27@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
 <8634qghg2j.fsf@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
X-Mailer: Apple Mail (2.3654.120.0.1.15)
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, monnier@HIDDEN

17 maj 2024 kl. 12.45 skrev Eli Zaretskii <eliz@HIDDEN>:

>>>>> Is it an accident that the code does the same only _after_ the call to
>>>>> readbyte?
>>>>
>>>> Yes, I have no reason to believe otherwise.
>>>
>>> To me, it actually looks as done on purpose.
>>
>> You could very well be right about that. What I meant is that the
>> order doesn't matter at all.
>
> Doesn't it affect what the readbyte call does?

No -- the `*multibyte = ...` assignment is just an extra return value,
which indicates whether the returned values come from a unibyte or
multibyte source. For any given source (READCHARFUN, in the terminology
of lread.c), the characters will all be unibyte or multibyte, so this
returned `multibyte` flag will typically only be used once by the caller
and saved for future reference.

But you are right to question it because lread.c is a royal mess and
many changes have not been made in a clean way. It is unclear whether
it's worth returning the `multibyte` flag at all; it's only used in
special cases.
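
To make the "extra return value" point concrete, here is a small
standalone sketch of the pattern.  It is not code from lread.c:
`struct source_sketch`, `next_char_sketch` and `count_chars_sketch`
are hypothetical names, and the source is modelled as a plain array
of character codes with a fixed multibyteness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical source: a fixed array of character codes plus a flag
   saying whether the source as a whole is multibyte.  */
struct source_sketch
{
  const int *chars;
  size_t len;
  size_t pos;
  bool multibyte;
};

/* Return the next character, or -1 at the end.  The flag is written
   through the out-parameter purely as an extra return value; it has
   no influence on how the character is fetched, so assigning it
   before or after the fetch gives the same result.  */
int
next_char_sketch (struct source_sketch *src, bool *multibyte)
{
  assert (src != NULL);
  int c = src->pos < src->len ? src->chars[src->pos++] : -1;
  if (multibyte)
    *multibyte = src->multibyte;
  return c;
}

/* A caller that asks for the flag once, on the first read, and then
   passes NULL for the remaining reads.  */
size_t
count_chars_sketch (struct source_sketch *src, bool *source_is_multibyte)
{
  bool flag = false;
  size_t n = 0;

  if (next_char_sketch (src, &flag) >= 0)
    {
      n = 1;
      while (next_char_sketch (src, NULL) >= 0)
        n++;
    }
  if (source_is_multibyte)
    *source_is_multibyte = flag;
  return n;
}
```

Because the flag describes the source and not an individual
character, the caller only needs to request it on one call.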







Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 17 May 2024 10:48:21 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri May 17 06:48:21 2024
Received: from localhost ([127.0.0.1]:54476 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1s7v8P-00053u-3L
	for submit <at> debbugs.gnu.org; Fri, 17 May 2024 06:48:21 -0400
Received: from eggs.gnu.org ([209.51.188.92]:47900)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1s7v8M-00053b-Tm
 for 70988 <at> debbugs.gnu.org; Fri, 17 May 2024 06:48:19 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1s7v66-00009c-Oh; Fri, 17 May 2024 06:45:58 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=Vpdii4anCedPOBeN+wrwEJG+fbfiUCbKuIKW6QbpXSM=; b=JLXU/5SiFo745zP/ephu
 NTeyUsiAtcLm8mkCxP8+xrBGV/sJnEFea+cHU4YAm8um0aWOl4y9DuGV21NqrniZe6iz1Q+t7EpFG
 S0YcX9xY2QC+sEJBmA21/GDNoKK9rfJxT5TbLKD51/2ghV62JEPi1HkEuheIY38hOrB8LePP8beJZ
 7JcR9Uhx40268BQivH/FaYZN0TxblocbIvb0B6ONeCcD3M2LrcElZxm7nuaPikGgaspAqotrR3Sjd
 7BPr1DcQLCSvOOwiwx7x7e0ynUZL+S47UuCQ4rf4L3aLAOD9vQog8oQbrp9cb/1UqSSlXzxa5F/WM
 ALelADJDTSDSrw==;
Date: Fri, 17 May 2024 13:45:56 +0300
Message-Id: <8634qghg2j.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= <mattias.engdegard@HIDDEN>
In-Reply-To: <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN> (message from
 Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Fri, 17 May 2024 10:08:58 +0200)
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN> <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, monnier@HIDDEN

> From: Mattias Engdegård <mattias.engdegard@HIDDEN>
> Date: Fri, 17 May 2024 10:08:58 +0200
> Cc: 70988 <at> debbugs.gnu.org,
>  monnier@HIDDEN
> 
> 16 maj 2024 kl. 21.54 skrev Eli Zaretskii <eliz@HIDDEN>:
> 
> >>> Is it an accident that the code does the same only _after_ the call to
> >>> readbyte?
> >> 
> >> Yes, I have no reason to believe otherwise.
> > 
> > To me, it actually looks as done on purpose.
> 
> You could very well be right about that. What I meant is that the order doesn't matter at all.

Doesn't it affect what the readbyte call does?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 17 May 2024 08:10:14 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri May 17 04:10:14 2024
Received: from localhost ([127.0.0.1]:53836 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1s7sfO-00037e-54
	for submit <at> debbugs.gnu.org; Fri, 17 May 2024 04:10:14 -0400
Received: from mail-lj1-f169.google.com ([209.85.208.169]:44317)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mattias.engdegard@HIDDEN>) id 1s7sfI-00037V-O3
 for 70988 <at> debbugs.gnu.org; Fri, 17 May 2024 04:10:13 -0400
Received: by mail-lj1-f169.google.com with SMTP id
 38308e7fff4ca-2e4b90b03a9so18668821fa.1
 for <70988 <at> debbugs.gnu.org>; Fri, 17 May 2024 01:10:05 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1715933340; x=1716538140; darn=debbugs.gnu.org;
 h=to:references:message-id:content-transfer-encoding:cc:date
 :in-reply-to:from:subject:mime-version:sender:from:to:cc:subject
 :date:message-id:reply-to;
 bh=BDwUzo/+a295j5JeqNJ6jIDTlIIXiO6spdnXijMjKFg=;
 b=gmfiCtK9KkCC2k4TXeMS5A53o4q0SalFogfBLN+Y3Xe3U0jEPhmt9MmkGTaFI5PwCl
 8x1AEo0nQZW2cqq0WaKQ5Qr9Ci+VDmcc+oNgitUI2aibua+hoy1B/hSryCZ2Om64V/x+
 VVOo/0nOaQRVswkRqoXbPSSEAubvVSFjPKnmpqGc+ay3WYVbuhlIQ+D8tXBQiXocSsfj
 Gxxo7CSTuOQxchg/yvr1tEhu+rQ/iIxhL8d1SB1hvjcGXaQcV+eaCR9zWZGbOr/kp5e7
 XoarEAhuQQEl0wHrihKiY//MjuKRmZEcgJ/kvso3B9xdnQ8QOFpP+N46tA/5hY8KIp1V
 45oQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1715933340; x=1716538140;
 h=to:references:message-id:content-transfer-encoding:cc:date
 :in-reply-to:from:subject:mime-version:sender:x-gm-message-state
 :from:to:cc:subject:date:message-id:reply-to;
 bh=BDwUzo/+a295j5JeqNJ6jIDTlIIXiO6spdnXijMjKFg=;
 b=FJsX4W4QkmGCJ9IA8YTfSqM+EgZPdpr6kdZsxEjov/jkUEuTV+CA6ypCVo/XNfzgj4
 noGIb9PkpKNS4JRQcUMRC6HYUtUn8fvsUKOVywq2cs3Kl2T3dL6QyZAsD4pqrwzpVFPw
 R+trfb6obehzfesayq3dkLuCR/nUtnR16Nzm4ce65b7soNd2B6j6Mf1Vsu6UvJHqMRZE
 e8WUoYnVWp8/rADhqtm18n8JM/xWRDwKZuVsOKEhZIwcj4hpV7uePIvFPLN4Tz15CTHA
 /7DIF5OnPKte1PAwOclN4/T4mlDMURzMnVfG5QxuAK7+ik//sIllhwJHuW9kiZ8qNXjl
 hn1g==
X-Gm-Message-State: AOJu0YyDLhZA5Khe2GbgrjTvhZKDr44CcbIgQ2pTXwFLjrZ8UtrZK1mc
 OLzzyzjNCurzkiDk3fF+ULLc7zcBUyhF6A83l6GJjE0cO0csp7Cn
X-Google-Smtp-Source: AGHT+IHoT9oZ5+AQCoL6QpAPQl5bHGHM0ni3FAbwop1vi3Wff7E727nsX1RTQm+DxGtsRm+SkdSMtQ==
X-Received: by 2002:a2e:98cf:0:b0:2e2:b61:aa97 with SMTP id
 38308e7fff4ca-2e5204cce6fmr143761651fa.48.1715933339811; 
 Fri, 17 May 2024 01:08:59 -0700 (PDT)
Received: from smtpclient.apple (c80-217-1-132.bredband.tele2.se.
 [80.217.1.132]) by smtp.gmail.com with ESMTPSA id
 38308e7fff4ca-2e6e5037e94sm9543631fa.52.2024.05.17.01.08.58
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Fri, 17 May 2024 01:08:59 -0700 (PDT)
Content-Type: text/plain;
	charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.15\))
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
From: Mattias Engdegård <mattias.engdegard@HIDDEN>
In-Reply-To: <86le49h6sm.fsf@HIDDEN>
Date: Fri, 17 May 2024 10:08:58 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <BBC28443-929B-4EE8-8773-984C5CD948CA@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
 <86le49h6sm.fsf@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
X-Mailer: Apple Mail (2.3654.120.0.1.15)
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, monnier@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

On 16 May 2024 at 21:54, Eli Zaretskii <eliz@HIDDEN> wrote:

>>> Is it an accident that the code does the same only _after_ the call to
>>> readbyte?
>>
>> Yes, I have no reason to believe otherwise.
>
> To me, it actually looks as done on purpose.

You could very well be right about that. What I meant is that the order doesn't matter at all.





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 16 May 2024 19:54:14 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu May 16 15:54:14 2024
Received: from localhost ([127.0.0.1]:50666 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1s7hB8-00036O-1T
	for submit <at> debbugs.gnu.org; Thu, 16 May 2024 15:54:14 -0400
Received: from eggs.gnu.org ([209.51.188.92]:34504)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1s7hB5-00036G-NT
 for 70988 <at> debbugs.gnu.org; Thu, 16 May 2024 15:54:12 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1s7hAy-0003I3-0B; Thu, 16 May 2024 15:54:04 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=DkVuF7ZYZ/mECwjWw7xhUZo/iSF3LY/unfoD6jR8S20=; b=U3lv4I9OEgZZD6ajpvo7
 oOuyRfQoU4Qq85X/me+i+CclgocRsxFTzvrHtZ+HpDpbWJVxFjAB2WtvEcIdxPiwZADzn2Zmatais
 2T2w/ms8LcE4dO0cVmOiWKVX0TQRt/DpViQZ5lGl12EEX290D44sJylk0tbVDn1ffXY9Ttfkjg/LD
 MN16zCeHq7wpauXH/UdGjQGgDizcf8QDasVL6K+as6E48Z0NRCZI1w6KvUoVaLj+LI947hR0nCFrq
 KD8VoxtofifxC4WDn70FSkqSKW6mNFd4uNE1CkxMuAeyGWbYvkyAyrZIjkUDYpl1ZUfex4VoZa//d
 /M0xZSmV8SA1ew==;
Date: Thu, 16 May 2024 22:54:01 +0300
Message-Id: <86le49h6sm.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Mattias Engdegård <mattias.engdegard@HIDDEN>
In-Reply-To: <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN> (message from
 Mattias Engdegård on Thu, 16 May 2024 21:45:56 +0200)
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN> <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, monnier@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

> From: Mattias Engdegård <mattias.engdegard@HIDDEN>
> Date: Thu, 16 May 2024 21:45:56 +0200
> Cc: 70988 <at> debbugs.gnu.org,
>  monnier@HIDDEN
> 
> > Is it an accident that the code does the same only _after_ the call to
> > readbyte?
> 
> Yes, I have no reason to believe otherwise.

To me, it actually looks as done on purpose.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 16 May 2024 19:47:16 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu May 16 15:47:16 2024
Received: from localhost ([127.0.0.1]:50618 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1s7h4K-00031s-8d
	for submit <at> debbugs.gnu.org; Thu, 16 May 2024 15:47:16 -0400
Received: from mail-lf1-f42.google.com ([209.85.167.42]:52435)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mattias.engdegard@HIDDEN>) id 1s7h4E-00031a-SF
 for 70988 <at> debbugs.gnu.org; Thu, 16 May 2024 15:47:10 -0400
Received: by mail-lf1-f42.google.com with SMTP id
 2adb3069b0e04-52388d9ca98so2280988e87.0
 for <70988 <at> debbugs.gnu.org>; Thu, 16 May 2024 12:47:04 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1715888758; x=1716493558; darn=debbugs.gnu.org;
 h=to:references:message-id:content-transfer-encoding:cc:date
 :in-reply-to:from:subject:mime-version:sender:from:to:cc:subject
 :date:message-id:reply-to;
 bh=zTO8RNwfZEG/9W0EWHsalnvGBH+Mi3uxXkLUvFqvt74=;
 b=ULFxohlgIY6DJHpOR7yzht4XEYUUCrUXs93EDWplmpDqPO+3ZxRAp/HKtoGtbg6qYY
 fqV7AkPi2J85ZcMEb3Qt/wwmeBx+u8Q1FOZBbddHauGT4PCFrQdwFhlbWtO4SY7qD7jS
 41lb6j7u4N+5HKY797SNIw/9K1z93sVJamGPjX3UzvNxXwOXPhI7bE1O4QkVcQEesyML
 ySpbnGWTDO9jtoT+67jmpf/0uzWHKH5Odu1yKC0anh/sR7+wCInnvw5KW+QIk9alXsd8
 VF224+pzyzHqLYWzhzojbYKCEYXb9JVUEo+LwnIp+xXxq746+2DKtNRW52YOMcHIKswL
 lZjg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1715888758; x=1716493558;
 h=to:references:message-id:content-transfer-encoding:cc:date
 :in-reply-to:from:subject:mime-version:sender:x-gm-message-state
 :from:to:cc:subject:date:message-id:reply-to;
 bh=zTO8RNwfZEG/9W0EWHsalnvGBH+Mi3uxXkLUvFqvt74=;
 b=dYHZyDG2nssN9CfjdtEaHODI3IFzquDfF/LIauiRPX0iemcleZePRWfRx4X5FX1pGp
 1+35ZNaNLjcsO1cDnF/Z921eHatCMggeqjyWtfOQxiJj4+e0If1AyVhMoiPGF2UE85Us
 9g5P/lOMRtDp90UkAtkoVYCDOc1EL8il43V7xh9L21kNrjkLCTdiOxZp1aPg1QEr2XWp
 ol6PpWBXZA3tFj6iAJrg+Hwwv3OKETIQKTtGMAKYPNB3TJD2KcKpqhA0PFIx1JWEcGnE
 l+ZRa2EJnj7y97TrvHwhT9DUpMjXOsf1f5/rqXnmuo3fiKXfQVUaj7oUPB80h//0KTtg
 5LSQ==
X-Gm-Message-State: AOJu0Yxp93fXwtxLZMuRNO7KdQhPJmGNvc7wwXH5YceJFUuA9uxkK2hQ
 0mIhU1L8yKqOksfHN08alL9gqUktfmuMxGbpqsTyL0XH97pg5wuh
X-Google-Smtp-Source: AGHT+IFdTvxvsZtZSbKRmXv1Sexj3vyxBO8xXZmo9ObEU7NzMmk9cmtK57eiZRV+W7bCCQ9H/OkJNw==
X-Received: by 2002:ac2:4c85:0:b0:51f:5d1a:b320 with SMTP id
 2adb3069b0e04-5221047585dmr14920154e87.68.1715888758306; 
 Thu, 16 May 2024 12:45:58 -0700 (PDT)
Received: from smtpclient.apple (c80-217-1-132.bredband.tele2.se.
 [80.217.1.132]) by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-521f35ba59bsm3049807e87.65.2024.05.16.12.45.57
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 16 May 2024 12:45:57 -0700 (PDT)
Content-Type: text/plain;
	charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.15\))
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
From: Mattias Engdegård <mattias.engdegard@HIDDEN>
In-Reply-To: <86seyhh9uv.fsf@HIDDEN>
Date: Thu, 16 May 2024 21:45:56 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <E13B82E6-8A2F-4D1B-B0A0-8D251270685F@HIDDEN>
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
 <86seyhh9uv.fsf@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
X-Mailer: Apple Mail (2.3654.120.0.1.15)
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, monnier@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

On 16 May 2024 at 20:47, Eli Zaretskii <eliz@HIDDEN> wrote:

> When is this situation relevant?  How many uses of
> function-as-a-stream are there out there?

Not many is my guess, which is perhaps why it wasn't found before.
I'm doing some performance work on the reader, and quirks in the code like these become obvious.

> Is it an accident that the code does the same only _after_ the call to
> readbyte?

Yes, I have no reason to believe otherwise.





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at 70988 <at> debbugs.gnu.org:


Received: (at 70988) by debbugs.gnu.org; 16 May 2024 18:48:07 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu May 16 14:48:07 2024
Received: from localhost ([127.0.0.1]:50351 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1s7g99-0002Jy-56
	for submit <at> debbugs.gnu.org; Thu, 16 May 2024 14:48:07 -0400
Received: from eggs.gnu.org ([209.51.188.92]:41950)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1s7g94-0002JR-Uw
 for 70988 <at> debbugs.gnu.org; Thu, 16 May 2024 14:48:05 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1s7g8x-0005HX-4V; Thu, 16 May 2024 14:47:55 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=WuSDusavepW5zeVFHFNG9WEzTnEyFCXmTpez/9Br28E=; b=jy9tcBcbSz1M3UajjfFL
 sZvySlfKhpp/4XB/v329PQtTHSUsMLGxZntAXhPdD49uYuClfHoQPNmaLOgIR2zEhlDhy2grPUh17
 /T6Grw9uf/a4TeHEpHvqDkjyt5RfeFDeeb4LbhH+Bfq2Q3x2Y644Hl3aAaT4D7F3vV9+K57fvYo9P
 C1fiTklf3Sbm0ttuZHlA82YhcQUBozrCHDOih5+mjtrzs8sRVHrX+5GG5ff7/KJ1zORGP7zdw9rB1
 ym9sIfX1Ih70ncxLxhY5zGwWtZxQWpENpwUcYBNikE2jS8UsKNV+/Vt8hjtoayjzG8qRWON2zrByM
 SLECUV//AWuX/g==;
Date: Thu, 16 May 2024 21:47:52 +0300
Message-Id: <86seyhh9uv.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Mattias Engdegård <mattias.engdegard@HIDDEN>
In-Reply-To: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN> (message from
 Mattias Engdegård on Thu, 16 May 2024 20:13:18 +0200)
Subject: Re: bug#70988: (read FUNCTION) uses Latin-1 [PATCH]
References: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 70988
Cc: 70988 <at> debbugs.gnu.org, monnier@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

> Cc: Stefan Monnier <monnier@HIDDEN>
> From: Mattias Engdegård <mattias.engdegard@HIDDEN>
> Date: Thu, 16 May 2024 20:13:18 +0200
> 
> When `read` is called with a function as stream argument, the return values of that function are often interpreted as Latin-1 characters with only the 8 low bits used. Example:
> 
> (let* ((next '(?A #x12a nil))
>        (f (lambda (&rest args)
>             (if args
>                 (push (car args) next)
>               (pop next)))))
>   (read f))
> => A*   ; expected: AĪ
> 
> This is a result of `readchar` setting *multibyte to 0 on this code path.

When is this situation relevant?  How many uses of
function-as-a-stream are there out there?

In general, I wouldn't touch these rare cases with a 3-mile pole.  The
gain is generally very small (satisfaction from some abstract sense of
correctness aside), while the risk to break some code is usually high.
It is better to document this behavior and move on.

> The fix is straightforward (attached).
> 
> diff --git a/src/lread.c b/src/lread.c
> index c92b2ede932..2626272c4e2 100644
> --- a/src/lread.c
> +++ b/src/lread.c
> @@ -422,6 +422,8 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
>        goto read_multibyte;
>      }
>  
> +  if (multibyte)
> +    *multibyte = 1;
>    tem = call0 (readcharfun);

Is it an accident that the code does the same only _after_ the call to
readbyte?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 16 May 2024 18:13:42 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu May 16 14:13:42 2024
Received: from localhost ([127.0.0.1]:50161 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1s7fbq-0001qD-4K
	for submit <at> debbugs.gnu.org; Thu, 16 May 2024 14:13:42 -0400
Received: from lists.gnu.org ([209.51.188.17]:51594)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mattias.engdegard@HIDDEN>) id 1s7fbl-0001q7-7k
 for submit <at> debbugs.gnu.org; Thu, 16 May 2024 14:13:41 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <mattias.engdegard@HIDDEN>)
 id 1s7fbi-0005ZH-Fp
 for bug-gnu-emacs@HIDDEN; Thu, 16 May 2024 14:13:34 -0400
Received: from mail-lj1-x22b.google.com ([2a00:1450:4864:20::22b])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <mattias.engdegard@HIDDEN>)
 id 1s7fbW-0005Gt-2e
 for bug-gnu-emacs@HIDDEN; Thu, 16 May 2024 14:13:33 -0400
Received: by mail-lj1-x22b.google.com with SMTP id
 38308e7fff4ca-2e3e18c24c1so12557171fa.1
 for <bug-gnu-emacs@HIDDEN>; Thu, 16 May 2024 11:13:21 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1715883200; x=1716488000; darn=gnu.org;
 h=to:cc:date:message-id:subject:mime-version:from:sender:from:to:cc
 :subject:date:message-id:reply-to;
 bh=NsCLKWBALa4YtKzEy2+S9ageaZN3CXP1MrTdqWzs2KI=;
 b=IkSYYLrJSKqTFe5fiLHHcvHOAoKGdydOtiLUOPIrCsqd11zV1jiOQQzZfii7S575nt
 YqhNhYGN8/DvhXELMam8KyLtJyMH1YCYoreUd+EMla7akJlFvErf32EVzHHkq+jSAEf1
 5pmT/yXgGu1QgdhupaX2laWFCHowM0WiYZxKptx6YzwLdbSOjfsNyselaTDxyYX6Z8Su
 MpGYn0pI6RXkN3MfzbLH5oAnR878inXh+0IpX76PfIThz5/xQzFapokGsTcJV6wZgJdG
 T07f8oZZZFKTW5tMfN0xDqFA1QXwPLGjs9vSyB/e2LJbs0PX/g9bDR5uLw+ylyGI4VNv
 Hpyw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1715883200; x=1716488000;
 h=to:cc:date:message-id:subject:mime-version:from:sender
 :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=NsCLKWBALa4YtKzEy2+S9ageaZN3CXP1MrTdqWzs2KI=;
 b=NCB2OW++XlQobglAXErmJ5KYpG1JaTOm3NKkKmpK9aXh2qu0bI0QHAJ5uIC49iUg+p
 b9mZniUjBtqkm14wa/QQvEvjA5IXyuTM7vEDvDxRtSne42bp/erJeohxd8xcaqt/+KjP
 LE/BZzvtJUn/431UZU61ltzM3UC+WMGkZoafdDQvqfJ1Mnd/xjq5hjp2lRLmf7WS1JsS
 1f7MRoskQXfxclXKgT3swsBrSlhqJpjlVOr+M+rDpqUFd6Wpnv6i6cqZJCEARJevssJy
 8f5s0qqY+TziISesFaddbBOu5HORkzZtBLwi4nDT6wBB/f3CXD5iuDutw7LXzVjlrGnB
 uQsw==
X-Gm-Message-State: AOJu0YxsO0n7YHQJp2fLLMqAaBEKOpUKWuF5ruhDzA2O6xOZlzWSIK4j
 Q9LxNI13UaBTyRhnLQJ31T5zpI3uU5jktvuZde4oPV7cj6jxVvcq5wHm+g==
X-Google-Smtp-Source: AGHT+IG1uY5pMMTqEBlASy7TlgdfDXzcfXukOSQfzpmpI1mtW0N+jvb+KcNDw7kYavcH1/4tTVg9ZQ==
X-Received: by 2002:a05:6512:3d19:b0:51a:b110:3214 with SMTP id
 2adb3069b0e04-5221007029cmr15444321e87.49.1715883199401; 
 Thu, 16 May 2024 11:13:19 -0700 (PDT)
Received: from smtpclient.apple (c80-217-1-132.bredband.tele2.se.
 [80.217.1.132]) by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-523b261ad51sm458599e87.224.2024.05.16.11.13.18
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 16 May 2024 11:13:19 -0700 (PDT)
From: Mattias Engdegård <mattias.engdegard@HIDDEN>
Content-Type: multipart/mixed;
 boundary="Apple-Mail=_2685EA9B-3ACD-4D1D-81B9-D9031379C60E"
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.15\))
Subject: (read FUNCTION) uses Latin-1 [PATCH]
Message-Id: <37B5B5D0-9C0B-4E1C-9F3C-6CA647612E08@HIDDEN>
Date: Thu, 16 May 2024 20:13:18 +0200
To: Emacs Bug Report <bug-gnu-emacs@HIDDEN>
X-Mailer: Apple Mail (2.3654.120.0.1.15)
Received-SPF: pass client-ip=2a00:1450:4864:20::22b;
 envelope-from=mattias.engdegard@HIDDEN; helo=mail-lj1-x22b.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Score: -1.3 (-)
X-Debbugs-Envelope-To: submit
Cc: Stefan Monnier <monnier@HIDDEN>
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -2.3 (--)


--Apple-Mail=_2685EA9B-3ACD-4D1D-81B9-D9031379C60E
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8

When `read` is called with a function as stream argument, the return values of that function are often interpreted as Latin-1 characters with only the 8 low bits used. Example:

(let* ((next '(?A #x12a nil))
       (f (lambda (&rest args)
            (if args
                (push (car args) next)
              (pop next)))))
  (read f))
=> A*   ; expected: AĪ

This is a result of `readchar` setting *multibyte to 0 on this code path.

The reader is not very consistent: inside string and character literals, the code seems to work as expected.

The fix is straightforward (attached).
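
The `A*` in the example is what keeping only the 8 low bits predicts:
#x12a masked to one byte is #x2a, the code of `*`.  A minimal sketch of
that arithmetic in C follows.  The helper name low_byte is hypothetical
and the code is not taken from lread.c; it only illustrates the
truncation, not how the reader is implemented.

```c
#include <assert.h>

/* Hypothetical helper, not taken from lread.c: keep only the 8 low bits
   of a character code, as happens when it is handled as a single byte.  */
static unsigned int
low_byte (unsigned int c)
{
  return c & 0xFF;
}

/* The two values from the example: ?A is unchanged, #x12a becomes ?*.  */
static void
check_example (void)
{
  assert (low_byte ('A') == 'A');
  assert (low_byte (0x12A) == 0x2A);
  assert (0x2A == '*');
}
```

Characters below 256 pass through unchanged, which is why the first
character of the example still reads as A.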


--Apple-Mail=_2685EA9B-3ACD-4D1D-81B9-D9031379C60E
Content-Disposition: attachment;
	filename=read-from-function.diff
Content-Type: application/octet-stream;
	x-unix-mode=0644;
	name="read-from-function.diff"
Content-Transfer-Encoding: quoted-printable

diff --git a/src/lread.c b/src/lread.c
index c92b2ede932..2626272c4e2 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -422,6 +422,8 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
       goto read_multibyte;
     }
 
+  if (multibyte)
+    *multibyte = 1;
   tem = call0 (readcharfun);
 
   if (NILP (tem))
diff --git a/test/src/lread-tests.el b/test/src/lread-tests.el
index cc17f7eb3fa..41c9256a9bf 100644
--- a/test/src/lread-tests.el
+++ b/test/src/lread-tests.el
@@ -387,4 +387,19 @@ lread-skip-to-eof
     (goto-char (point-min))
     (should-error (read (current-buffer)) :type 'end-of-file)))
 
+(ert-deftest lread-from-function ()
+  ;; Test reading from a stream defined by a function.
+  (let ((make-reader (lambda (chars)
+                       (lambda (&rest args)
+                         (if args
+                             (push (car args) chars)
+                           (pop chars))))))
+    (dolist (seq '((?A ?B) (?E ?ä ?ÿ) (?A ?Ω) (?* ?☃) (?a #o303 #o245 ?b)))
+      (let ((str (apply #'string seq)))
+        (should (eq (read (funcall make-reader seq)) (intern str)))
+        (let ((quoted-seq `(?\" ,@seq ?\")))
+          (should (equal (read (funcall make-reader quoted-seq)) str)))))
+    (dolist (c '(?A ?ä ?ÿ ?Ω ?☃))
+      (should (eq (read (funcall make-reader `(?? ,c))) c)))))
+
 ;;; lread-tests.el ends here

--Apple-Mail=_2685EA9B-3ACD-4D1D-81B9-D9031379C60E--




Acknowledgement sent to Mattias Engdegård <mattias.engdegard@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#70988; Package emacs. Full text available.
Last modified: Thu, 13 Feb 2025 16:45:03 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.