X-Loop: help-debbugs@HIDDEN
Subject: bug#33775: fold: counting multi-byte utf-8 sequences as separate columns
Resent-From: Michael Siegel <msi@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-coreutils@HIDDEN
Resent-Date: Mon, 17 Dec 2018 02:15:01 +0000
Resent-Message-ID: <handler.33775.B.15450128704606 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 33775
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords:
To: 33775 <at> debbugs.gnu.org
X-Debbugs-Original-To: bug-coreutils@HIDDEN
Received: via spool by submit <at> debbugs.gnu.org id=B.15450128704606
(code B ref -1); Mon, 17 Dec 2018 02:15:01 +0000
Received: (at submit) by debbugs.gnu.org; 17 Dec 2018 02:14:30 +0000
Received: from localhost ([127.0.0.1]:50731 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1gYiQP-0001CB-KH
for submit <at> debbugs.gnu.org; Sun, 16 Dec 2018 21:14:30 -0500
Received: from eggs.gnu.org ([208.118.235.92]:47016)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <msi@HIDDEN>) id 1gYhmS-0000AN-KO
for submit <at> debbugs.gnu.org; Sun, 16 Dec 2018 20:33:13 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
(envelope-from <msi@HIDDEN>) id 1gYhmM-0007dJ-G1
for submit <at> debbugs.gnu.org; Sun, 16 Dec 2018 20:33:07 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20 autolearn=disabled
version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:33694)
by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
(Exim 4.71) (envelope-from <msi@HIDDEN>) id 1gYhmL-0007bw-KN
for submit <at> debbugs.gnu.org; Sun, 16 Dec 2018 20:33:06 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56626)
by lists.gnu.org with esmtp (Exim 4.71)
(envelope-from <msi@HIDDEN>) id 1gYhmK-0003aN-Nk
for bug-coreutils@HIDDEN; Sun, 16 Dec 2018 20:33:05 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
(envelope-from <msi@HIDDEN>) id 1gYhmF-0007UU-KI
for bug-coreutils@HIDDEN; Sun, 16 Dec 2018 20:33:04 -0500
Received: from poseidon.malbolge.net ([5.45.108.48]:34886)
by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
(Exim 4.71) (envelope-from <msi@HIDDEN>) id 1gYhmF-0007RP-92
for bug-coreutils@HIDDEN; Sun, 16 Dec 2018 20:32:59 -0500
Received: from hermes.malbolge.net (hermes.malbolge.net [192.168.123.201])
by poseidon.malbolge.net (OpenSMTPD) with ESMTPSA id 7a4d17b6
(TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256:NO)
for <bug-coreutils@HIDDEN>; Mon, 17 Dec 2018 02:32:57 +0100 (CET)
Received: from kerberos.malbolge.net ([192.168.123.128] helo=127.0.0.1)
by hermes.malbolge.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
(Exim 4.89) (envelope-from <msi@HIDDEN>) id 1gYhmC-0007dp-Ce
for bug-coreutils@HIDDEN; Mon, 17 Dec 2018 02:32:56 +0100
From: Michael Siegel <msi@HIDDEN>
Message-ID: <cb32cf5c-2f40-ab75-2c03-113bcd19d7ad@HIDDEN>
Date: Mon, 17 Dec 2018 02:32:55 +0100
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:60.0) Gecko/20100101
Thunderbird/60.3.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Language: de-DE
Content-Transfer-Encoding: 8bit
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
recognized.
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.1 (----)
X-Mailman-Approved-At: Sun, 16 Dec 2018 21:14:29 -0500
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -5.1 (-----)
Hello,
I've just discovered an odd behavior of `fold' while trying to wrap a
piece of text containing phonetic characters.
Take the following line, for example:
Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level,
It is 71 characters long. Still, running
echo "Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level," | fold -w 72 -s
produces
Tcl (pronounced tickle or tee cee ell /ˈtiː siː ɛl/) is a
high-level,
I've had someone test this with FreeBSD's `fold', which didn't behave
that way. Instead, it filled out the line as expected.
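To make the discrepancy concrete, here is a minimal Python sketch (an illustration only, not taken from the fold sources) that compares the number of characters in the sample line with the number of bytes in its UTF-8 encoding. The variable names are arbitrary. The point it shows is that the line fits in 72 columns when counted in characters but not when counted in bytes.

```python
# Sample line from the report, with the inner double quotes kept.
line = 'Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level,'

chars = len(line)                    # number of Unicode code points
nbytes = len(line.encode("utf-8"))   # number of bytes in the UTF-8 encoding

# Each phonetic character here occupies two bytes in UTF-8,
# so the byte count is larger than the character count.
print(chars, nbytes)                 # prints: 71 75
print(chars <= 72, nbytes <= 72)     # prints: True False
```

A tool that charges one column per byte would therefore consider this line too long for a width of 72, even though it displays as 71 columns.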
Further investigation by developers of Adélie Linux revealed that GNU's
`fold' is counting each byte of a multi-byte UTF-8 sequence (in this
case, the phonetic characters) as a separate column:
awilcox on gwyn [pts/11 Sun 16 19:01] ~: cat testing.txt
1234567890 234567890 234567890 234567890 234567890 234567890 234567890
/ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70 chars ^
yep.
awilcox on gwyn [pts/11 Sun 16 19:01] ~: fold -w 72 -s testing.txt
1234567890 234567890 234567890 234567890 234567890 234567890 234567890
/ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70
chars ^
yep.
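The transcript above is consistent with a fold that charges one column per byte. The following is a simplified Python sketch of `fold -s' style line breaking, written only to illustrate the difference; it is not GNU's implementation. The names `fold_s`, `byte_cost`, and `char_cost` are hypothetical, and the sketch ignores tabs, backspaces, and double-width characters.

```python
def fold_s(line, width, cost):
    """Simplified `fold -s`: break after the last blank that still fits."""
    out, cur, col = [], "", 0
    for ch in line:
        w = cost(ch)
        if col + w > width:
            cut = cur.rfind(" ") + 1      # position just after the last blank
            if cut == 0:                  # no blank found: hard break
                cut = len(cur)
            out.append(cur[:cut])
            cur = cur[cut:]               # carry the remainder to the next line
            col = sum(cost(c) for c in cur)
        cur += ch
        col += w
    out.append(cur)
    return out


def byte_cost(ch):
    # One column per byte of the UTF-8 encoding (the behavior reported here).
    return len(ch.encode("utf-8"))


def char_cost(ch):
    # One column per character (the behavior the reporter expected).
    return 1


line = "/ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70 chars ^"
by_bytes = fold_s(line, 72, byte_cost)
by_chars = fold_s(line, 72, char_cost)
print(len(by_bytes), len(by_chars))       # prints: 2 1
```

With the per-byte cost the 70-character line is split in two, as in the GNU output above; with the per-character cost it stays on one line.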
msi
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Michael Siegel <msi@HIDDEN>
Subject: bug#33775: Acknowledgement (fold: counting multi-byte utf-8 sequences as separate columns)
Message-ID: <handler.33775.B.15450128704606.ack <at> debbugs.gnu.org>
References: <cb32cf5c-2f40-ab75-2c03-113bcd19d7ad@HIDDEN>
X-Gnu-PR-Message: ack 33775
X-Gnu-PR-Package: coreutils
Reply-To: 33775 <at> debbugs.gnu.org
Date: Mon, 17 Dec 2018 02:15:01 +0000
Thank you for filing a new bug report with debbugs.gnu.org.
This is an automatically generated reply to let you know your message
has been received.
Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.
Your message has been sent to the package maintainer(s):
bug-coreutils@HIDDEN
If you wish to submit further information on this problem, please
send it to 33775 <at> debbugs.gnu.org.
Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.
--
33775: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=33775
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems
X-Loop: help-debbugs@HIDDEN
Subject: bug#33775: fold: counting multi-byte utf-8 sequences as separate columns
Resent-From: Assaf Gordon <assafgordon@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-coreutils@HIDDEN
Resent-Date: Sun, 23 Dec 2018 06:05:01 +0000
Resent-Message-ID: <handler.33775.B33775.154554504330517 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 33775
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords:
To: Michael Siegel <msi@HIDDEN>, 33775 <at> debbugs.gnu.org
Received: via spool by 33775-submit <at> debbugs.gnu.org id=B33775.154554504330517
(code B ref 33775); Sun, 23 Dec 2018 06:05:01 +0000
Received: (at 33775) by debbugs.gnu.org; 23 Dec 2018 06:04:03 +0000
Received: from localhost ([127.0.0.1]:60454 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1gawrq-0007w7-QO
for submit <at> debbugs.gnu.org; Sun, 23 Dec 2018 01:04:03 -0500
Received: from mail-pf1-f170.google.com ([209.85.210.170]:46578)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <assafgordon@HIDDEN>)
id 1gawrm-0007vG-Hc; Sun, 23 Dec 2018 01:03:58 -0500
Received: by mail-pf1-f170.google.com with SMTP id c73so4491646pfe.13;
Sat, 22 Dec 2018 22:03:58 -0800 (PST)
X-Received: by 2002:a63:ba4d:: with SMTP id l13mr8294855pgu.194.1545545032160;
Sat, 22 Dec 2018 22:03:52 -0800 (PST)
Received: from tomato.housegordon.com (moose.housegordon.com. [184.68.105.38])
by smtp.googlemail.com with ESMTPSA id
4sm52445335pfq.10.2018.12.22.22.03.50
(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
Sat, 22 Dec 2018 22:03:51 -0800 (PST)
References: <cb32cf5c-2f40-ab75-2c03-113bcd19d7ad@HIDDEN>
From: Assaf Gordon <assafgordon@HIDDEN>
Message-ID: <4e9b7e51-4020-133d-0b3f-0cc89076a062@HIDDEN>
Date: Sat, 22 Dec 2018 23:03:50 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
Thunderbird/60.3.0
MIME-Version: 1.0
In-Reply-To: <cb32cf5c-2f40-ab75-2c03-113bcd19d7ad@HIDDEN>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.0 (/)
X-Spam-Score: -1.0 (-)
severity 33775 wishlist
retitle 33775 multibyte: fold: multi-byte sequences as separate columns
stop
Hello,
On 2018-12-16 6:32 p.m., Michael Siegel wrote:
> I've just discovered an odd behavior of `fold' while trying to wrap a
> piece of text containing phonetic characters.
>
> Take the following line, for example:
Thank you for reporting this issue and
providing clear, reproducible examples.
Adding complete multibyte/UTF-8 support to all coreutils
programs is an ongoing effort.
I'm marking this as a "wishlist" item, which will remain
open until we complete the implementation.
Related multibyte items are listed here (with "multibyte" prefix):
https://debbugs.gnu.org/cgi/pkgreport.cgi?which=pkg&data=coreutils
regards,
- assaf