GNU bug report logs - #53236
26.1; encode-coding-string does not encode the string as expected

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: emacs; Reported by: Markus Triska <triska@HIDDEN>; Done: Markus Triska <triska@HIDDEN>; Maintainer for emacs is bug-gnu-emacs@HIDDEN.
Bug closed; send any further explanations to 53236 <at> debbugs.gnu.org and Markus Triska <triska@HIDDEN>. Request was from Markus Triska <triska@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 53236 <at> debbugs.gnu.org:


From: Andreas Schwab <schwab@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Subject: Re: bug#53236: 26.1; encode-coding-string does not encode the
 string as expected
References: <8735lra07e.fsf@HIDDEN> <838rvi3ixp.fsf@HIDDEN>
X-Yow: ..  I'm in a jet plane headed for the year 53 B.C...
 I land in ancient Rome...  some gladiators are playing Scrabble...
 I smell PIZZA...
Date: Fri, 14 Jan 2022 11:00:17 +0100
In-Reply-To: <838rvi3ixp.fsf@HIDDEN> (Eli Zaretskii's message of "Fri, 14 Jan
 2022 08:55:30 +0200")
Message-ID: <87sftq63im.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.91 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -0.4 (/)
X-Debbugs-Envelope-To: 53236
Cc: 53236 <at> debbugs.gnu.org, Markus Triska <triska@HIDDEN>
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.4 (-)

On Jan 14 2022, Eli Zaretskii wrote:

>> From: Markus Triska <triska@HIDDEN>
>> Date: Thu, 13 Jan 2022 20:45:57 +0100
>> 
>> Correspondingly, I expect (encode-coding-string "\200" 'utf-8) to yield
>> a string equivalent to "\xC2\x80", but that seems not to be the case. I get:
>> 
>>     (encode-coding-string "\200" 'utf-8) --> "\200"
>> 
>> And therefore, unexpectedly:
>> 
>>     (string= (encode-coding-string "\200" 'utf-8) "\xC2\x80") --> nil
>
> "\200" is a unibyte string, and encoding unibyte strings returns those
> strings without changing them.
>
> This is not a bug, just a dark corner of encoding/decoding stuff.

Or a dark corner of the string syntax.

ELISP> (multibyte-string-p "\200")
nil
ELISP> (multibyte-string-p "\x80")
nil
ELISP> (multibyte-string-p "\x0080")
t
ELISP> (encode-coding-string "\x0080" 'utf-8)
"\302\200"
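
The checks above can be wrapped in a small helper. This is only a sketch: `my/describe-string` is a hypothetical name, not something from this thread or from Emacs itself, and it relies only on the standard functions `multibyte-string-p`, `string-to-list`, and `string-bytes`.

```elisp
;; Hypothetical helper (not part of Emacs): report how the Lisp reader
;; represented a string literal.
(defun my/describe-string (s)
  "Return a plist with the representation, elements, and byte length of S."
  (list :multibyte (multibyte-string-p s)
        :elements (string-to-list s)
        :bytes (string-bytes s)))

;; Octal escape: a unibyte string holding the single raw byte #x80.
(my/describe-string "\200")
;; ⇒ (:multibyte nil :elements (128) :bytes 1)

;; \u escape: a multibyte string holding the character U+0080, which
;; occupies two bytes in Emacs's internal representation.
(my/describe-string "\u0080")
;; ⇒ (:multibyte t :elements (128) :bytes 2)
```

Both strings have 128 as their only element; the representation flag and the byte length are what tell them apart.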

-- 
Andreas Schwab, schwab@HIDDEN
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#53236; Package emacs. Full text available.

Message received at 53236 <at> debbugs.gnu.org:


Date: Fri, 14 Jan 2022 08:55:30 +0200
Message-Id: <838rvi3ixp.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Markus Triska <triska@HIDDEN>
In-Reply-To: <8735lra07e.fsf@HIDDEN> (message from Markus Triska on Thu, 
 13 Jan 2022 20:45:57 +0100)
Subject: Re: bug#53236: 26.1;
 encode-coding-string does not encode the string as expected
References: <8735lra07e.fsf@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 53236
Cc: 53236 <at> debbugs.gnu.org

> From: Markus Triska <triska@HIDDEN>
> Date: Thu, 13 Jan 2022 20:45:57 +0100
> 
> Correspondingly, I expect (encode-coding-string "\200" 'utf-8) to yield
> a string equivalent to "\xC2\x80", but that seems not to be the case. I get:
> 
>     (encode-coding-string "\200" 'utf-8) --> "\200"
> 
> And therefore, unexpectedly:
> 
>     (string= (encode-coding-string "\200" 'utf-8) "\xC2\x80") --> nil

"\200" is a unibyte string, and encoding unibyte strings returns those
strings without changing them.

This is not a bug, just a dark corner of encoding/decoding stuff.
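
Because the pass-through is silent, code that must never emit raw bytes can guard against it. The following is a minimal sketch under stated assumptions: `my/raw-high-bytes-p` and `my/encode-utf-8-strict` are hypothetical helpers, not Emacs functions, and treating unibyte non-ASCII input as an error is a design choice made here for illustration.

```elisp
;; Hypothetical helpers (not part of Emacs) that make the pass-through visible.
(defun my/raw-high-bytes-p (s)
  "Return non-nil if S is unibyte and contains a byte above #x7F."
  (and (not (multibyte-string-p s))
       (let ((found nil))
         (dotimes (i (length s))
           (when (> (aref s i) #x7F)
             (setq found t)))
         found)))

(defun my/encode-utf-8-strict (s)
  "Encode S as UTF-8, but signal an error for unibyte non-ASCII input."
  (when (my/raw-high-bytes-p s)
    (error "Unibyte string with raw bytes; decode it first: %S" s))
  (encode-coding-string s 'utf-8))

(my/encode-utf-8-strict "\u0080")   ; ⇒ "\302\200"
;; (my/encode-utf-8-strict "\200")  ; signals an error instead of returning "\200"
```

Pure-ASCII unibyte strings such as "abc" still pass through, since their UTF-8 encoding is identical to the input.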





Message received at 53236 <at> debbugs.gnu.org:


MIME-Version: 1.0
References: <8735lra07e.fsf@HIDDEN>
In-Reply-To: <8735lra07e.fsf@HIDDEN>
From: Philipp Stephani <p.stephani2@HIDDEN>
Date: Thu, 13 Jan 2022 21:23:33 +0100
Message-ID: <CAArVCkTBfkb-7sm3R8+0BQ7fgRsmF23LXEmLxZNtRz4DZVs62Q@HIDDEN>
Subject: Re: bug#53236: 26.1; encode-coding-string does not encode the string
 as expected
To: Markus Triska <triska@HIDDEN>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.2 (/)
X-Debbugs-Envelope-To: 53236
Cc: 53236 <at> debbugs.gnu.org

On Thu, 13 Jan 2022 at 21:14, Markus Triska <triska@metalevel.at> wrote:
>
> Dear all,
>
> please consider the UTF-8 encoding of the Unicode codepoint 0x80, which
> is formed by two bytes. In hexadecimal notation, they are: 0xC2 0x80.
>
> We can use decode-coding-string to verify that this byte sequence is
> decoded to 0x80 when specifying utf-8, which works exactly as expected:
>
>     (decode-coding-string "\xC2\x80" 'utf-8)
>
> This yields "\200", which is the same as "\x80", as verified via:
>
>     (string= "\200" "\x80") --> t

There are two possible interpretations of "\200":
1. The unibyte string containing the byte #x80
2. The multibyte string containing the Unicode character U+0080

The string literal "\200" gives you the former, while
(decode-coding-string "\xC2\x80" 'utf-8) gives you the latter. In fact,
    (string= (decode-coding-string "\xC2\x80" 'utf-8) "\200") ⇒ nil
but
    (string= (decode-coding-string "\xC2\x80" 'utf-8) "\u0080") ⇒ t

>
> Correspondingly, I expect (encode-coding-string "\200" 'utf-8) to yield
> a string equivalent to "\xC2\x80", but that seems not to be the case. I get:
>
>     (encode-coding-string "\200" 'utf-8) --> "\200"

Here "\200" gives you the unibyte string that contains the byte #x80.
That can't be encoded as UTF-8 (since UTF-8 encodes Unicode scalar
values, not raw bytes), so it's left alone.
However,
    (encode-coding-string "\u0080" 'utf-8) ⇒ "\302\200"

There's some background in the chapter "Text representations" in the
ELisp manual.
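
If the input is already a unibyte string and its bytes are meant as the code points U+0000..U+00FF, one possible way to get the desired result is to decode it as Latin-1 before encoding. This is a sketch, not an official recipe: `my/utf-8-encode-code-points` is a hypothetical helper name, and the Latin-1 reading of the bytes is an assumption about what the caller intends.

```elisp
;; Hypothetical helper (not part of Emacs).  Assumption: the bytes of a
;; unibyte argument are meant as the code points U+0000..U+00FF.
(defun my/utf-8-encode-code-points (s)
  "Encode S as UTF-8, reading a unibyte S as Latin-1 text first."
  (encode-coding-string
   (if (multibyte-string-p s)
       s
     (decode-coding-string s 'latin-1))
   'utf-8))

(string= (my/utf-8-encode-code-points "\200") "\xC2\x80")    ; ⇒ t
(string= (my/utf-8-encode-code-points "\u0080") "\xC2\x80")  ; ⇒ t
```

Decoding as Latin-1 is only one interpretation. If the bytes came from a file or the network in some other encoding, decoding with that coding system would be the appropriate first step instead.
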
HTH





Message received at submit <at> debbugs.gnu.org:


From: Markus Triska <triska@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: 26.1; encode-coding-string does not encode the string as expected
Date: Thu, 13 Jan 2022 20:45:57 +0100
Message-ID: <8735lra07e.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: submit

Dear all,

please consider the UTF-8 encoding of the Unicode codepoint 0x80, which
is formed by two bytes. In hexadecimal notation, they are: 0xC2 0x80.

We can use decode-coding-string to verify that this byte sequence is
decoded to 0x80 when specifying utf-8, which works exactly as expected:

    (decode-coding-string "\xC2\x80" 'utf-8)

This yields "\200", which is the same as "\x80", as verified via:

    (string= "\200" "\x80") --> t

Correspondingly, I expect (encode-coding-string "\200" 'utf-8) to yield
a string equivalent to "\xC2\x80", but that seems not to be the case. I get:

    (encode-coding-string "\200" 'utf-8) --> "\200"

And therefore, unexpectedly:

    (string= (encode-coding-string "\200" 'utf-8) "\xC2\x80") --> nil

It appears that encode-coding-string does not encode the string in UTF-8
as expected. Is there any way to obtain the desired encoding with
encode-coding-string, i.e., the UTF-8-encoded string "\xC2\x80"?

Thank you and all the best!
Markus

In GNU Emacs 26.1 (build 3, x86_64-pc-linux-gnu, X toolkit, Xaw scroll bars)
 of 2019-04-09 built on mt-laptop
Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
System Description:	Ubuntu 19.04

Configured features:
XPM JPEG GIF PNG SOUND GSETTINGS NOTIFY GNUTLS LIBXML2 FREETYPE XFT ZLIB
TOOLKIT_SCROLL_BARS LUCID X11 THREADS

Important settings:
  value of $LC_MONETARY: en_GB.UTF-8
  value of $LC_NUMERIC: en_GB.UTF-8
  value of $LC_TIME: en_GB.UTF-8
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix





Acknowledgement sent to Markus Triska <triska@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#53236; Package emacs. Full text available.
Last modified: Sat, 15 Jan 2022 06:45:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.