GNU bug report logs - #56469
29.0.50; Unibyte dir in directory_files_internal

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: emacs; Reported by: Stefan Monnier <monnier@HIDDEN>; Keywords: patch; dated Sat, 9 Jul 2022 17:46:01 UTC; Maintainer for emacs is bug-gnu-emacs@HIDDEN.
Added tag(s) patch. Request was from Stefan Kangas <stefan@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 56469 <at> debbugs.gnu.org:


Date: Sat, 09 Jul 2022 21:53:25 +0300
Message-Id: <83wncm15ju.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Stefan Monnier <monnier@HIDDEN>
In-Reply-To: <jwvwncm9mjw.fsf-monnier+emacs@HIDDEN> (message from Stefan
 Monnier on Sat, 09 Jul 2022 14:20:37 -0400)
Subject: Re: bug#56469: 29.0.50; Unibyte dir in directory_files_internal
References: <jwvy1x2p4dn.fsf@HIDDEN> <83y1x2177x.fsf@HIDDEN>
 <jwvwncm9mjw.fsf-monnier+emacs@HIDDEN>
Cc: 56469 <at> debbugs.gnu.org

> From: Stefan Monnier <monnier@HIDDEN>
> Cc: 56469 <at> debbugs.gnu.org
> Date: Sat, 09 Jul 2022 14:20:37 -0400
> 
> >> I suggest the patch below.  In a comment I suggest we don't try to use
> >> unibyte strings when a multibyte string would work as well.  This is
> >> because for those ASCII-only strings, it's cheaper to test bytes==chars
> >> to (re)discover that they are ASCII-only (when they're multibyte) than
> >> having to loop through the bytes (when they're unibyte).
> >
> > Please bootstrap Emacs in a directory with such a name, and if that
> > works, I'm okay with installing this change.
> 
> Just to clarify: by "this change" you refer to the change in the patch
> or the change suggested in the comment?

I meant the patch.  The comment I didn't understand at all.  It seemed
to be unrelated to the code and the change you were proposing.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#56469; Package emacs. Full text available.

Message received at 56469 <at> debbugs.gnu.org:


From: Stefan Monnier <monnier@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Subject: Re: bug#56469: 29.0.50; Unibyte dir in directory_files_internal
In-Reply-To: <83y1x2177x.fsf@HIDDEN> (Eli Zaretskii's message of "Sat, 09 Jul
 2022 21:17:22 +0300")
Message-ID: <jwvwncm9mjw.fsf-monnier+emacs@HIDDEN>
References: <jwvy1x2p4dn.fsf@HIDDEN> <83y1x2177x.fsf@HIDDEN>
Date: Sat, 09 Jul 2022 14:20:37 -0400
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Cc: 56469 <at> debbugs.gnu.org

>> I suggest the patch below.  In a comment I suggest we don't try to use
>> unibyte strings when a multibyte string would work as well.  This is
>> because for those ASCII-only strings, it's cheaper to test bytes==chars
>> to (re)discover that they are ASCII-only (when they're multibyte) than
>> having to loop through the bytes (when they're unibyte).
>
> Please bootstrap Emacs in a directory with such a name, and if that
> works, I'm okay with installing this change.

Just to clarify: by "this change" you refer to the change in the patch
or the change suggested in the comment?


        Stefan





Information forwarded to bug-gnu-emacs@HIDDEN:
bug#56469; Package emacs. Full text available.

Message received at 56469 <at> debbugs.gnu.org:


Date: Sat, 09 Jul 2022 21:17:22 +0300
Message-Id: <83y1x2177x.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Stefan Monnier <monnier@HIDDEN>
In-Reply-To: <jwvy1x2p4dn.fsf@HIDDEN> (bug-gnu-emacs@HIDDEN)
Subject: Re: bug#56469: 29.0.50; Unibyte dir in directory_files_internal
References: <jwvy1x2p4dn.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Cc: 56469 <at> debbugs.gnu.org

> Date: Sat, 09 Jul 2022 13:44:52 -0400
> From:  Stefan Monnier via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@HIDDEN>
> 
> If you have a directory named "/tmp/\303a" with a file named "fée"
> inside, then (directory-files "/tmp/\303a" 'full) is likely to return
> a funny string which is multibyte but contains an invalid
> utf-8 sequence (its bytes spell "/tmp/\303a/f\303\251e").
> That string seems to be printed as "/tmp/¡/fée" which corresponds
> to "/tmp/\303\241/f\303\251e".
> 
> Such a string with an invalid UTF-8 sequence is handled quite gracefully
> by Emacs, so I wasn't able to get an actual crash out of it, but it's
> still something we should avoid.
> 
> I suggest the patch below.  In a comment I suggest we don't try to use
> unibyte strings when a multibyte string would work as well.  This is
> because for those ASCII-only strings, it's cheaper to test bytes==chars
> to (re)discover that they are ASCII-only (when they're multibyte) than
> having to loop through the bytes (when they're unibyte).

Please bootstrap Emacs in a directory with such a name, and if that
works, I'm okay with installing this change.

Thanks.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#56469; Package emacs. Full text available.

Message received at submit <at> debbugs.gnu.org:


From: Stefan Monnier <monnier@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: 29.0.50; Unibyte dir in directory_files_internal
Date: Sat, 09 Jul 2022 13:44:52 -0400
Message-ID: <jwvy1x2p4dn.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Package: Emacs
Version: 29.0.50


If you have a directory named "/tmp/\303a" with a file named "fée"
inside, then (directory-files "/tmp/\303a" 'full) is likely to return
a funny string which is multibyte but contains an invalid
UTF-8 sequence (its bytes spell "/tmp/\303a/f\303\251e").
That string seems to be printed as "/tmp/¡/fée" which corresponds
to "/tmp/\303\241/f\303\251e".

Such a string with an invalid UTF-8 sequence is handled quite gracefully
by Emacs, so I wasn't able to get an actual crash out of it, but it's
still something we should avoid.

I suggest the patch below.  In a comment I suggest we don't try to use
unibyte strings when a multibyte string would work as well.  This is
because for those ASCII-only strings, it's cheaper to test bytes==chars
to (re)discover that they are ASCII-only (when they're multibyte) than
having to loop through the bytes (when they're unibyte).


        Stefan


diff --git a/src/dired.c b/src/dired.c
index 6bb8c2fcb9f..33ddfafd8e7 100644
--- a/src/dired.c
+++ b/src/dired.c
@@ -219,6 +219,13 @@ directory_files_internal (Lisp_Object directory, Lisp_Object full,
     }
 #endif
 
+  if (!NILP (full) && !STRING_MULTIBYTE (directory))
+    { /* We will be concatenating 'directory' with local file name.
+         We always decode local file names, so in order to safely concatenate
+         them we need 'directory' to be multibyte.  */
+      directory = Fstring_to_multibyte (directory);
+    }
+
   ptrdiff_t directory_nbytes = SBYTES (directory);
   re_match_object = Qt;
 
@@ -263,9 +270,10 @@ directory_files_internal (Lisp_Object directory, Lisp_Object full,
 	  ptrdiff_t name_nbytes = SBYTES (name);
 	  ptrdiff_t nbytes = directory_nbytes + needsep + name_nbytes;
 	  ptrdiff_t nchars = SCHARS (directory) + needsep + SCHARS (name);
-	  finalname = make_uninit_multibyte_string (nchars, nbytes);
-	  if (nchars == nbytes)
-	    STRING_SET_UNIBYTE (finalname);
+	  /* FIXME: Why not make them all multibyte?  */
+	  finalname = (nchars == nbytes)
+	              ? make_uninit_string (nchars, nbytes)
+	              : make_uninit_multibyte_string (nchars, nbytes);
 	  memcpy (SDATA (finalname), SDATA (directory), directory_nbytes);
 	  if (needsep)
 	    SSET (finalname, directory_nbytes, DIRECTORY_SEP);





Acknowledgement sent to Stefan Monnier <monnier@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#56469; Package emacs. Full text available.
Last modified: Sun, 10 Jul 2022 02:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.