From: "JD" <john1doe@HIDDEN>
To: 54124 <at> debbugs.gnu.org
Subject: bug#54124: fmt inserts garbage in certain cases?
Date: Wed, 23 Feb 2022 12:58:54 +0200
Message-Id: <CI3DGCYKZW8W.C8AILPSC6NEH@HIDDEN>
Content-Type: text/plain; charset=UTF-8

Hi!

I have fmt from coreutils 8.32.1 installed via MacPorts.

If I run the following command: `echo х х х х х х х х х х х х х х х х х х х х х х х х х х | gfmt -sw 10` (which is just echoing 26 Cyrillic 'х' ('kha') letters), I get the following results:

https://i.imgur.com/yRx7uuz.png (iTerm2)
https://i.imgur.com/7oQ0UPz.png (iTerm2 if passed via `more`)
https://i.imgur.com/UlLrEMy.png (Alacritty)

And if I delete just two 'х' letters, like this: `echo х х х х х х х х х х х х х х х х х х х х х х х х | gfmt -sw 10`, everything shows just fine: https://i.imgur.com/DwuWxyx.png

Would be grateful for any advice :)

--
JD

From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: "JD" <john1doe@HIDDEN>
Subject: bug#54124: Acknowledgement (fmt inserts garbage in certain cases?)
Date: Wed, 23 Feb 2022 11:28:01 +0000
Message-ID: <handler.54124.B.16456156701704.ack <at> debbugs.gnu.org>
References: <CI3DGCYKZW8W.C8AILPSC6NEH@HIDDEN>
Reply-To: 54124 <at> debbugs.gnu.org
Content-Type: text/plain; charset=utf-8

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-coreutils@HIDDEN

If you wish to submit further information on this problem, please
send it to 54124 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

--
54124: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=54124
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems

From: Pádraig Brady <P@HIDDEN>
To: JD <john1doe@HIDDEN>, 54124 <at> debbugs.gnu.org
Subject: bug#54124: fmt inserts garbage in certain cases?
Date: Wed, 23 Feb 2022 17:55:49 +0000
Message-ID: <74f1591a-b7b6-525a-0a15-85d2bf017769@HIDDEN>
In-Reply-To: <CI3DGCYKZW8W.C8AILPSC6NEH@HIDDEN>
References: <CI3DGCYKZW8W.C8AILPSC6NEH@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------4iW4YB0vtD8l2k0PcckECv41"

This is a multi-part message in MIME format.
--------------4iW4YB0vtD8l2k0PcckECv41
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 23/02/2022 10:58, JD wrote:
> Hi!
>
> I have fmt from coreutils 8.32.1 installed via MacPorts.
>
> If I run the following command: `echo х х х х х х х х х х х х х х х х х х х х х х х х х х | gfmt -sw 10` (which is just echoing 26 Cyrillic 'х' ('kha') letters), I get the following results:
>
> https://i.imgur.com/yRx7uuz.png (iTerm2)
> https://i.imgur.com/7oQ0UPz.png (iTerm2 if passed via `more`)
> https://i.imgur.com/UlLrEMy.png (Alacritty)
>
> And if I delete just two 'х' letters, like this: `echo х х х х х х х х х х х х х х х х х х х х х х х х | gfmt -sw 10`, everything shows just fine: https://i.imgur.com/DwuWxyx.png
>
> Would be grateful for any advice :)

The issue here is that (on macOS 10.15.7 at least),
isspace(0x85) returns true for UTF-8 locales
(but not for "C" or "iso8859-1" locales).
BTW iscntrl() returns true for 0x85 in all non-C locales
on both Linux and macOS.

Now gnulib says wrt isspace() that:
"This function's behaviour depends on the locale, but does not support
the multibyte characters that occur in strings in locales with
@code{MB_CUR_MAX > 1} (this includes all the common UTF-8 locales)."

I think isspace(0x85) returning true on macOS is a bug,
but we should probably avoid isspace() in fmt altogether
given its inconsistency with multibyte locales.
The attached uses c_isspace() instead.

cheers, Pádraig --------------4iW4YB0vtD8l2k0PcckECv41 Content-Type: text/x-patch; charset=UTF-8; name="fmt-utf8-macOS.patch" Content-Disposition: attachment; filename="fmt-utf8-macOS.patch" Content-Transfer-Encoding: base64 RnJvbSAxNjZiNjc4M2JjMWE2ZTBjZTIwNjExNGMxZDU5M2MyNTI4ZTNjZmExIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiA9P1VURi04P3E/UD1DMz1BMWRyYWlnPTIwQnJhZHk/ PSA8UEBkcmFpZ0JyYWR5LmNvbT4KRGF0ZTogV2VkLCAyMyBGZWIgMjAyMiAxNzo1MDo0NiAr MDAwMApTdWJqZWN0OiBbUEFUQ0hdIGZtdDogZml4IGludmFsaWQgbXVsdGktYnl0ZSBzcGxp dHRpbmcgb24gbWFjT1MKCk9uIG1hY09TLCBpc3NwYWNlKDB4ODUpIHJldHVybnMgdHJ1ZSwK d2hpY2ggcmVzdWx0cyBpbiBzcGxpdHRpbmcgd2l0aGluIG11bHRpLWJ5dGUgY2hhcmFjdGVy cy4KCiogc3JjL2ZtdC5jIChnZXRfbGluZSk6IHMvaXNzcGFjZS9jX2lzc3BhY2UvLgoqIHRl c3RzL2ZtdC9ub24tc3BhY2Uuc2g6IEFkZCBhIG5ldyB0ZXN0LgoqIHRlc3RzL2xvY2FsLm1r OiBSZWZlcmVuY2UgbmV3IHRlc3QuCiogTkVXUzogTWVudGlvbiB0aGUgZml4LgpBZGRyZXNz ZXMgaHR0cHM6Ly9idWdzLmdudS5vcmcvNTQxMjQKLS0tCiBORVdTICAgICAgICAgICAgICAg ICAgIHwgIDQgKysrKwogc3JjL2ZtdC5jICAgICAgICAgICAgICB8ICAzICsrLQogdGVzdHMv Zm10L25vbi1zcGFjZS5zaCB8IDQ5ICsrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysr KysrKysrKysrKwogdGVzdHMvbG9jYWwubWsgICAgICAgICB8ICAzICsrLQogNCBmaWxlcyBj aGFuZ2VkLCA1NyBpbnNlcnRpb25zKCspLCAyIGRlbGV0aW9ucygtKQogY3JlYXRlIG1vZGUg MTAwNzU1IHRlc3RzL2ZtdC9ub24tc3BhY2Uuc2gKCmRpZmYgLS1naXQgYS9ORVdTIGIvTkVX UwppbmRleCBlZjY1YjRhYjguLjM1ZDlhNTBkZCAxMDA2NDQKLS0tIGEvTkVXUworKysgYi9O RVdTCkBAIC0yMSw2ICsyMSwxMCBAQCBHTlUgY29yZXV0aWxzIE5FV1MgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAtKi0gb3V0bGluZSAtKi0KICAgYW5kIEIgaXMgaW4g c29tZSBvdGhlciBmaWxlIHN5c3RlbS4KICAgW2J1ZyBpbnRyb2R1Y2VkIGluIGNvcmV1dGls cy05LjBdCiAKKyAgT24gbWFjT1MsIGZtdCBubyBsb25nZXIgY29ycnVwdHMgbXVsdGktYnl0 ZSBjaGFyYWN0ZXJzCisgIGJ5IG1pc2RldGVjdGluZyB0aGVpciBjb21wb25lbnQgYnl0ZXMg YXMgc3BhY2VzLgorICBbVGhpcyBidWcgd2FzIHByZXNlbnQgaW4gInRoZSBiZWdpbm5pbmci Ll0KKwogICAnaWQgeHl6JyBub3cgdXNlcyB0aGUgbmFtZSAneHl6JyB0byBkZXRlcm1pbmUg Z3JvdXBzLCBpbnN0ZWFkIG9mIHh5eidzIHVpZC4KICAgW2J1ZyBpbnRyb2R1Y2VkIGluIGNv 
cmV1dGlscy04LjIyXQogCmRpZmYgLS1naXQgYS9zcmMvZm10LmMgYi9zcmMvZm10LmMKaW5k ZXggMWViNzAxOWIwLi4wNWJhZmFiZDYgMTAwNjQ0Ci0tLSBhL3NyYy9mbXQuYworKysgYi9z cmMvZm10LmMKQEAgLTI2LDYgKzI2LDcgQEAKICAgIGl0IHRvIGJlIGEgdHlwZSBnZXQgc3lu dGF4IGVycm9ycyBmb3IgdGhlIHZhcmlhYmxlIGRlY2xhcmF0aW9uIGJlbG93LiAgKi8KICNk ZWZpbmUgd29yZCB1bnVzZWRfd29yZF90eXBlCiAKKyNpbmNsdWRlICJjLWN0eXBlLmgiCiAj aW5jbHVkZSAic3lzdGVtLmgiCiAjaW5jbHVkZSAiZXJyb3IuaCIKICNpbmNsdWRlICJkaWUu aCIKQEAgLTcwMiw3ICs3MDMsNyBAQCBnZXRfbGluZSAoRklMRSAqZiwgaW50IGMpCiAgICAg ICAgICAgKndwdHIrKyA9IGM7CiAgICAgICAgICAgYyA9IGdldGMgKGYpOwogICAgICAgICB9 Ci0gICAgICB3aGlsZSAoYyAhPSBFT0YgJiYgIWlzc3BhY2UgKGMpKTsKKyAgICAgIHdoaWxl IChjICE9IEVPRiAmJiAhY19pc3NwYWNlIChjKSk7CiAgICAgICBpbl9jb2x1bW4gKz0gd29y ZF9saW1pdC0+bGVuZ3RoID0gd3B0ciAtIHdvcmRfbGltaXQtPnRleHQ7CiAgICAgICBjaGVj a19wdW5jdHVhdGlvbiAod29yZF9saW1pdCk7CiAKZGlmZiAtLWdpdCBhL3Rlc3RzL2ZtdC9u b24tc3BhY2Uuc2ggYi90ZXN0cy9mbXQvbm9uLXNwYWNlLnNoCm5ldyBmaWxlIG1vZGUgMTAw NzU1CmluZGV4IDAwMDAwMDAwMC4uYjU5ODM4OTgzCi0tLSAvZGV2L251bGwKKysrIGIvdGVz dHMvZm10L25vbi1zcGFjZS5zaApAQCAtMCwwICsxLDQ5IEBACisjIS9iaW4vc2gKKyMgVGVz dCBmbXQgc3BhY2UgaGFuZGxpbmcKKworIyBDb3B5cmlnaHQgKEMpIDIwMjIgRnJlZSBTb2Z0 d2FyZSBGb3VuZGF0aW9uLCBJbmMuCisKKyMgVGhpcyBwcm9ncmFtIGlzIGZyZWUgc29mdHdh cmU6IHlvdSBjYW4gcmVkaXN0cmlidXRlIGl0IGFuZC9vciBtb2RpZnkKKyMgaXQgdW5kZXIg dGhlIHRlcm1zIG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSBhcyBwdWJsaXNo ZWQgYnkKKyMgdGhlIEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbiwgZWl0aGVyIHZlcnNpb24g MyBvZiB0aGUgTGljZW5zZSwgb3IKKyMgKGF0IHlvdXIgb3B0aW9uKSBhbnkgbGF0ZXIgdmVy c2lvbi4KKworIyBUaGlzIHByb2dyYW0gaXMgZGlzdHJpYnV0ZWQgaW4gdGhlIGhvcGUgdGhh dCBpdCB3aWxsIGJlIHVzZWZ1bCwKKyMgYnV0IFdJVEhPVVQgQU5ZIFdBUlJBTlRZOyB3aXRo b3V0IGV2ZW4gdGhlIGltcGxpZWQgd2FycmFudHkgb2YKKyMgTUVSQ0hBTlRBQklMSVRZIG9y IEZJVE5FU1MgRk9SIEEgUEFSVElDVUxBUiBQVVJQT1NFLiAgU2VlIHRoZQorIyBHTlUgR2Vu ZXJhbCBQdWJsaWMgTGljZW5zZSBmb3IgbW9yZSBkZXRhaWxzLgorCisjIFlvdSBzaG91bGQg aGF2ZSByZWNlaXZlZCBhIGNvcHkgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNl 
CisjIGFsb25nIHdpdGggdGhpcyBwcm9ncmFtLiAgSWYgbm90LCBzZWUgPGh0dHBzOi8vd3d3 LmdudS5vcmcvbGljZW5zZXMvPi4KKworLiAiJHtzcmNkaXI9Ln0vdGVzdHMvaW5pdC5zaCI7 IHBhdGhfcHJlcGVuZF8gLi9zcmMKK3ByaW50X3Zlcl8gZm10IHByaW50ZgorCisjIEJlZm9y ZSBjb3JldXRpbHMgOS4xIG1hY09TIHRyZWF0ZWQgYnl0ZXMgbGlrZSAweDg1CisjIGFzIHNw YWNlIGNoYXJhY3RlcnMgaW4gbXVsdGktYnl0ZSBsb2NhbGVzIChpbmNsdWRpbmcgVVRGLTgp CisKK2NoZWNrX25vbl9zcGFjZSgpIHsKKyAgY2hhcj0iJDEiCisgIHRlc3QgIiQoZW52IHBy aW50ZiAiPSRjaGFyPSIgfCBmbXQgLXMgLXcxIHwgd2MgLWwpIiA9IDEgfHwgZmFpbD0xCit9 CisKK2V4cG9ydCBMQ19BTEw9ZW5fVVMuaXNvODg1OS0xICAjIG9ubHkgbG93ZXJjYXNlIGZv cm0gd29ya3Mgb24gbWFjT1MgMTAuMTUuNworaWYgdGVzdCAiJChsb2NhbGUgY2hhcm1hcCAy Pi9kZXYvbnVsbCB8IHNlZCAncy9pc28vSVNPLS8nKSIgPSBJU08tODg1OS0xOyB0aGVuCisg IGNoZWNrX25vbl9zcGFjZSAnXHhBMCcKK2ZpCisKK2V4cG9ydCBMQ19BTEw9ZW5fVVMuVVRG LTgKK2lmIHRlc3QgIiQobG9jYWxlIGNoYXJtYXAgMj4vZGV2L251bGwpIiA9IFVURi04OyB0 aGVuCisgIGNoZWNrX25vbl9zcGFjZSAnXHUwMEEwJyAgIyBObyBicmVhayBzcGFjZQorICBj aGVja19ub25fc3BhY2UgJ1x1MjAwNycgICMgVE9ETzogc2hvdWxkIHByb2JhYmx5IHNwbGl0 IG9uIGZpZ3VyZSBzcGFjZQorICBjaGVja19ub25fc3BhY2UgJ1x1MjAyRicgICMgTmFycm93 IG5vIGJyZWFrIHNwYWNlCisgIGNoZWNrX25vbl9zcGFjZSAnXHUyMDYwJyAgIyB6ZXJvLXdp ZHRoIG5vIGJyZWFrIHNwYWNlCisgIGNoZWNrX25vbl9zcGFjZSAnXHUwNDQ1JyAgIyBDeXJp bGxpYyBraGEgaGFzIDB4ODUsIHdoaWNoIG1hY09TIGlzc3BhY2UoKT10cnVlCitmaQorCitl eHBvcnQgTENfQUxMPXJ1X1JVLktPSTgtUgoraWYgdGVzdCAiJChsb2NhbGUgY2hhcm1hcCAy Pi9kZXYvbnVsbCkiID0gS09JOC1SOyB0aGVuCisgIGNoZWNrX25vbl9zcGFjZSAnXHg5QScK K2ZpCisKK0V4aXQgJGZhaWwKZGlmZiAtLWdpdCBhL3Rlc3RzL2xvY2FsLm1rIGIvdGVzdHMv bG9jYWwubWsKaW5kZXggZjEzNzZmYjcxLi5mOTdkZGNiOTggMTAwNjQ0Ci0tLSBhL3Rlc3Rz L2xvY2FsLm1rCisrKyBiL3Rlc3RzL2xvY2FsLm1rCkBAIC0yMzcsOCArMjM3LDkgQEAgYWxs X3Rlc3RzID0JCQkJCVwKICAgdGVzdHMvY2hncnAvcG9zaXgtSC5zaAkJCVwKICAgdGVzdHMv Y2hncnAvcmVjdXJzZS5zaAkJCVwKICAgdGVzdHMvZm10L2Jhc2UucGwJCQkJXAotICB0ZXN0 cy9mbXQvbG9uZy1saW5lLnNoCQkJXAogICB0ZXN0cy9mbXQvZ29hbC1vcHRpb24uc2gJCQlc CisgIHRlc3RzL2ZtdC9sb25nLWxpbmUuc2gJCQlcCisgIHRlc3RzL2ZtdC9ub24tc3BhY2Uu 
c2gJCQlcCiAgIHRlc3RzL21pc2MvZWNoby5zaAkJCQlcCiAgIHRlc3RzL21pc2MvZW52LnNo CQkJCVwKICAgdGVzdHMvbWlzYy9lbnYtc2lnbmFsLWhhbmRsZXIuc2gJCVwKLS0gCjIuMjYu MgoK --------------4iW4YB0vtD8l2k0PcckECv41--
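
A minimal C sketch of the failure mode described in the message above; it is not taken from the attached patch. It assumes that words are split wherever a per-byte whitespace predicate returns true. `ascii_isspace`, `macos_like_isspace` and `count_words` are hypothetical names: the first mimics the locale-independent behaviour of gnulib's c_isspace(), the second is only a stand-in for the reported macOS isspace() result for byte 0x85 in UTF-8 locales.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ASCII-only whitespace test, in the spirit of gnulib's c_isspace():
   the result never depends on the current locale. */
static bool ascii_isspace(int c)
{
    return c == ' ' || c == '\t' || c == '\n'
        || c == '\v' || c == '\f' || c == '\r';
}

/* Hypothetical stand-in for the reported macOS behaviour in UTF-8
   locales: byte 0x85 is additionally classified as whitespace. */
static bool macos_like_isspace(int c)
{
    return ascii_isspace(c) || c == 0x85;
}

/* Count words in a byte string, where a word is a maximal run of
   bytes for which is_space() returns false. */
static size_t count_words(const char *s, bool (*is_space)(int))
{
    assert(s != NULL && is_space != NULL);
    size_t words = 0;
    bool in_word = false;
    for (; *s != '\0'; s++) {
        bool sp = is_space((unsigned char) *s);
        if (!sp && !in_word)
            words++;        /* first byte of a new word */
        in_word = !sp;
    }
    return words;
}
```

For the byte string "=\xd1\x85=" (Cyrillic kha between two equals signs), count_words() finds one word with ascii_isspace and two with macos_like_isspace, because the second predicate splits between the bytes d1 and 85 of a single character.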
From: Pádraig Brady <P@HIDDEN>
To: JD <john1doe@HIDDEN>, 54124 <at> debbugs.gnu.org
Subject: bug#54124: fmt inserts garbage in certain cases?
Date: Thu, 24 Feb 2022 01:29:56 +0000
Message-ID: <cb3e1d02-0c4f-11c2-21ca-f148bda09cde@HIDDEN>
In-Reply-To: <74f1591a-b7b6-525a-0a15-85d2bf017769@HIDDEN>
References: <CI3DGCYKZW8W.C8AILPSC6NEH@HIDDEN> <74f1591a-b7b6-525a-0a15-85d2bf017769@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 23/02/2022 17:55, Pádraig Brady wrote:
> I think isspace(0x85) returning true on macOS is a bug,

Bug is a bit of a strong word here.
A digression into why 0x85 is being treated specially here.

Note Cyrillic kha "х" is encoded in UTF-8 as:

  $ printf '\u0445' | od -tx1
  0000000 d1 85

What I think is happening is that \u0085 represents "Next Line" in Unicode.
This is present in Unicode to support mapping to/from
the corresponding char in EBCDIC, which had a distinct char for this
in addition to CR and LF.
Given isspace('\n') returns true, then it makes some sense that
isspace("Next Line") would return true, and I guess through
implementation details isspace(int) is operating on UTF-32 on macOS
in UTF-8 locales, and thus returning true for this value.

BTW 0xA0 is the only other value that isspace() returns true for
(other than the standard c_isspace() values of course).
This is the non-breaking space, so it's best we don't split on it anyway.
I.e. this is another benefit of the change.

I still think using c_isspace() to avoid this issue is best,
and intend to push the change tomorrow.

cheers,
Pádraig

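
The byte arithmetic behind the od output above can be checked with a short C sketch. `utf8_encode` is a hypothetical helper written for this illustration, not a coreutils or gnulib function. It shows that U+0445 encodes as d1 85, so its trailing byte has the same numeric value as the code point U+0085 (Next Line), which itself encodes as c2 85.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a Unicode code point as UTF-8 into out (room for 4 bytes).
   Returns the number of bytes written, or 0 if cp is not a valid
   Unicode scalar value. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;       /* surrogates are not scalar values */
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char) (0xF0 | (cp >> 18));
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;               /* beyond the Unicode range */
}
```

A byte-at-a-time caller that passes the lone trailing byte 0x85 to a classifier working on code-point values would therefore be asking about U+0085, which is consistent with the guess in the message above.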
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
To: Pádraig Brady <P@HIDDEN>, JD <john1doe@HIDDEN>, 54124 <at> debbugs.gnu.org
Subject: bug#54124: fmt inserts garbage in certain cases?
Date: Wed, 23 Feb 2022 19:06:29 -0800
Message-ID: <239351e3-01e9-3a64-1336-b049b7250d4d@HIDDEN>
In-Reply-To: <cb3e1d02-0c4f-11c2-21ca-f148bda09cde@HIDDEN>
References: <CI3DGCYKZW8W.C8AILPSC6NEH@HIDDEN> <74f1591a-b7b6-525a-0a15-85d2bf017769@HIDDEN> <cb3e1d02-0c4f-11c2-21ca-f148bda09cde@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 2/23/22 17:29, Pádraig Brady wrote:
> Given isspace('\n') returns true, then it makes some sense that
> isspace("Next Line")
> would return true,

POSIX says that the application must ensure that the argument to isspace is
either EOF or "a character representable as an unsigned char", and
arguably since 0x85 is not either one of those things the behavior of
isspace(0x85) is undefined.

However, the C standard does not have this wording, and since POSIX is
supposed to defer to the C standard here, this appears to be a bug in
POSIX (as well as a bug in macOS). It's understandable if the Apple C
library's developers got confused by the POSIX wording.

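
As a side note on the value-range half of that requirement, here is a minimal C sketch of the usual idiom for keeping the argument to isspace() within the range of unsigned char. `byte_isspace` is a hypothetical wrapper name. The cast only avoids passing a negative plain char; it does not settle the question raised above of whether a lone byte such as 0x85 counts as "a character" in a UTF-8 locale.

```c
#include <ctype.h>
#include <stdbool.h>

/* Classify one byte with the locale-dependent isspace().  Converting
   to unsigned char first guarantees the argument is in the range
   0..UCHAR_MAX even where plain char is signed. */
static bool byte_isspace(char c)
{
    return isspace((unsigned char) c) != 0;
}
```

In the "C" locale, which is in effect until the program calls setlocale(), the C standard has isspace() return true only for the standard white-space characters, so byte_isspace((char) 0x85) is false there. The result in other locales depends on the C library, which is the behaviour the thread is about.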