Subject: bug#33775: fold: counting multi-byte utf-8 sequences as separate columns
From: Michael Siegel <msi@HIDDEN>
To: 33775 <at> debbugs.gnu.org
X-Debbugs-Original-To: bug-coreutils@HIDDEN
Date: Mon, 17 Dec 2018 02:32:55 +0100
Message-ID: <cb32cf5c-2f40-ab75-2c03-113bcd19d7ad@HIDDEN>

Hello,

I've just discovered an odd behavior of `fold' while trying to wrap a
piece of text containing phonetic characters.

Take the following line, for example:

  Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level,

It is 71 characters long. Still, running

  echo "Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level," | fold -w 72 -s

produces

  Tcl (pronounced tickle or tee cee ell /ˈtiː siː ɛl/) is a
  high-level,

I've had someone test this with FreeBSD's `fold', which didn't behave
that way. Instead, it filled out the line as expected.

Further investigation by developers of Adélie Linux revealed that GNU's
`fold' is counting multi-byte utf-8 sequences (in this case, the
phonetic characters) as separate columns:

  awilcox on gwyn [pts/11 Sun 16 19:01] ~: cat testing.txt
  1234567890 234567890 234567890 234567890 234567890 234567890 234567890
  /ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70 chars ^ yep.
  awilcox on gwyn [pts/11 Sun 16 19:01] ~: fold -w 72 -s testing.txt
  1234567890 234567890 234567890 234567890 234567890 234567890 234567890
  /ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70
  chars ^ yep.

msi
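The difference between the two ways of counting can be illustrated with a
short Python sketch. This is only an illustration of byte counting versus
character counting for the sample line from the report; it is not fold's
implementation, and the helper name `fits` is hypothetical. It also assumes
one column per code point, which does not hold for wide or combining
characters.

```python
# Sketch only: compares two ways of measuring the sample line from the report.
# "fits" is a hypothetical helper, not part of any coreutils interface.

line = 'Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level,'


def fits(text: str, width: int, count_bytes: bool) -> bool:
    """Return True if text is no longer than width under the chosen measure."""
    if count_bytes:
        length = len(text.encode("utf-8"))  # one unit per UTF-8 byte
    else:
        length = len(text)                  # one unit per Unicode code point
    return length <= width


char_count = len(line)
byte_count = len(line.encode("utf-8"))

print("characters:", char_count)
print("bytes:     ", byte_count)
print("fits in 72 by characters:", fits(line, 72, count_bytes=False))
print("fits in 72 by bytes:     ", fits(line, 72, count_bytes=True))
```

Under these assumptions the line fits within 72 when measured in characters
but not when measured in bytes, which matches the behavior described above.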
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Michael Siegel <msi@HIDDEN>
Subject: bug#33775: Acknowledgement (fold: counting multi-byte utf-8 sequences as separate columns)
Message-ID: <handler.33775.B.15450128704606.ack <at> debbugs.gnu.org>
References: <cb32cf5c-2f40-ab75-2c03-113bcd19d7ad@HIDDEN>
Reply-To: 33775 <at> debbugs.gnu.org
Date: Mon, 17 Dec 2018 02:15:01 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-coreutils@HIDDEN

If you wish to submit further information on this problem, please
send it to 33775 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
33775: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=33775
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems
Subject: Re: bug#33775: fold: counting multi-byte utf-8 sequences as separate columns
To: Michael Siegel <msi@HIDDEN>, 33775 <at> debbugs.gnu.org
From: Assaf Gordon <assafgordon@HIDDEN>
Date: Sat, 22 Dec 2018 23:03:50 -0700
Message-ID: <4e9b7e51-4020-133d-0b3f-0cc89076a062@HIDDEN>
In-Reply-To: <cb32cf5c-2f40-ab75-2c03-113bcd19d7ad@HIDDEN>
References: <cb32cf5c-2f40-ab75-2c03-113bcd19d7ad@HIDDEN>

severity 33775 wishlist
retitle 33775 multibyte: fold: multi-byte sequences as separate columns
stop

Hello,

On 2018-12-16 6:32 p.m., Michael Siegel wrote:
> I've just discovered an odd behavior of `fold' while trying to wrap a
> piece of text containing phonetic characters.
>
> Take the following line, for example:

Thank you for reporting this issue and providing clear, reproducible
examples.

Adding complete multibyte/utf8 support to all coreutils programs is an
on-going effort. I'm marking this as a "wishlist" item, which will
remain open until we complete the implementation.

Related multibyte items are listed here (with "multibyte" prefix):
https://debbugs.gnu.org/cgi/pkgreport.cgi?which=pkg&data=coreutils

regards,
 - assaf
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.