From: Jim Meyering <jim@HIDDEN>
To: Michael Stummvoll <michael@HIDDEN>
Cc: 12192 <at> debbugs.gnu.org
Subject: Re: bug#12192: tr - bytes vs characters
Date: Sat, 15 Sep 2012 12:28:54 +0200

forcemerge 12192 9365
thanks

Michael Stummvoll wrote:
> Hi gnu folks,
>
> as already known, tr cannot handle multibyte encodings like UTF-8:
>
>> mst@eddie:~$ echo "foo" | tr o ö
>> fÃÃ
>
> I know that multibyte encoding support is not needed for
> POSIX compliance, BUT:
>
> the manpage of tr says the following:
>
>> Translate, squeeze, and/or delete characters from standard input,
>> writing to standard output.
>
> and that's the inconsistency, IMHO.
>
> The typical interpretation of "character" in such a context is one
> character on display, regardless of which encoding is used or how many
> bytes are used to display it. So, if tr really translates "characters",
> it should preserve the encoding. If it doesn't, it translates not
> "characters" but "bytes". So I see two ways:
>
> - add multibyte-encoding support to tr
> or
> - change the manpage and help text to say "bytes" instead of "characters"
>
> Since it doesn't seem that anybody wants to add the support to tr, an
> update of the manpage would be the easier way to ensure consistency.

Thanks for the report.
I'm merging this issue with the others that relate to tr and
multi-byte support.
bug-coreutils@HIDDEN: bug#12192; Package coreutils.
From: Michael Stummvoll <michael@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: 12192 <at> debbugs.gnu.org
Subject: Re: bug#12192: tr - bytes vs characters
Date: Fri, 17 Aug 2012 14:03:42 +0200

Hi there,

> But yes, the main thing is for someone to contribute
> correct, easy-to-maintain, and efficient code.

Just for the record, in case somebody wants to work on this some day:
I just noticed that the "tr" from 9base handles UTF-8 correctly.
9base is a Unix port of the Plan 9 utilities:
http://tools.suckless.org/9base

I haven't yet taken a closer look at the sources of either GNU tr or
9base tr, but maybe somebody else can benefit from them.

Kind regards,
Michael
From: Paul Eggert <eggert@HIDDEN>
To: Eric Blake <eblake@HIDDEN>
Cc: 12192 <at> debbugs.gnu.org, Michael Stummvoll <michael@HIDDEN>
Subject: Re: bug#12192: tr - bytes vs characters
Date: Tue, 14 Aug 2012 00:44:24 -0700

On 08/13/2012 10:34 PM, Eric Blake wrote:
> But POSIX _does_ require that tr be
> locale-aware, and therefore if an implementation provides multibyte
> locales (which most desktop glibc-based GNU/Linux systems do), then tr
> should honor those locales, including multibyte character support.

All this is absolutely correct; but still, if the issue is merely
POSIX conformance, these glibc-based GNU/Linux systems do conform
to POSIX, since the POSIX-conformance document for these systems
can state that the supported locales are merely the single-byte
locales. Admittedly this is legal hairsplitting, but if POSIX
compliance is the issue then one is in legal-hairsplitting mode
already....
From: Eric Blake <eblake@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: 12192 <at> debbugs.gnu.org, Michael Stummvoll <michael@HIDDEN>
Subject: Re: bug#12192: tr - bytes vs characters
Date: Mon, 13 Aug 2012 23:34:16 -0600

On 08/13/2012 08:45 PM, Paul Eggert wrote:
> On 08/13/2012 06:54 AM, Eric Blake wrote:
>> POSIX _does_ require multi-byte support
>
> The last time I checked, POSIX did not require
> the implementation to provide any multibyte locales.
> Has this changed?

Fair enough - POSIX does not require the existence of a multibyte
locale; an embedded system that provides only single-byte encodings can
still be POSIX-compliant. But POSIX _does_ require that tr be
locale-aware, and therefore if an implementation provides multibyte
locales (which most desktop glibc-based GNU/Linux systems do), then tr
should honor those locales, including multibyte character support.

>
> But yes, the main thing is for someone to contribute
> correct, easy-to-maintain, and efficient code.

We're in violent agreement on this point :)

--
Eric Blake eblake@HIDDEN +1-919-301-3266
Libvirt virtualization library http://libvirt.org
From: Paul Eggert <eggert@HIDDEN>
To: Eric Blake <eblake@HIDDEN>
Cc: 12192 <at> debbugs.gnu.org, Michael Stummvoll <michael@HIDDEN>
Subject: Re: bug#12192: tr - bytes vs characters
Date: Mon, 13 Aug 2012 19:45:54 -0700

On 08/13/2012 06:54 AM, Eric Blake wrote:
> POSIX _does_ require multi-byte support

The last time I checked, POSIX did not require
the implementation to provide any multibyte locales.
Has this changed?

But yes, the main thing is for someone to contribute
correct, easy-to-maintain, and efficient code.
From: Eric Blake <eblake@HIDDEN>
To: Michael Stummvoll <michael@HIDDEN>
Cc: 12192 <at> debbugs.gnu.org
Subject: Re: bug#12192: tr - bytes vs characters
Date: Mon, 13 Aug 2012 07:54:02 -0600

On 08/13/2012 06:52 AM, Michael Stummvoll wrote:
> Hi gnu folks,
>
> as already known, tr cannot handle multibyte encodings like UTF-8:
>
>> mst@eddie:~$ echo "foo" | tr o ö
>> fÃÃ
>
> I know that multibyte encoding support is not needed for
> POSIX compliance,

Actually, POSIX _does_ require multi-byte support; it's just that no one
has yet contributed code for this upstream that is easy enough to
maintain and does not penalize single-byte locales. Patches are welcome.

--
Eric Blake eblake@HIDDEN +1-919-301-3266
Libvirt virtualization library http://libvirt.org
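
The constraint mentioned above (multibyte support that does not penalize single-byte locales) can be pictured with a rough Python sketch. Everything here is an assumption made for illustration: the function name `translate`, the small `SINGLE_BYTE` set, and the idea of choosing the path from the encoding name. None of it is taken from coreutils.

```python
import codecs

# Hypothetical, illustrative list of encodings where one character is
# always one byte. Names are normalized so aliases such as "latin-1" match.
SINGLE_BYTE = {codecs.lookup(n).name for n in ("ascii", "latin-1", "iso8859-15")}


def translate(data: bytes, src: str, dst: str, encoding: str) -> bytes:
    """Translate characters of src to dst; src and dst have equal length."""
    name = codecs.lookup(encoding).name
    if name in SINGLE_BYTE:
        # Fast path: a 256-entry byte table, no decoding of the input.
        table = bytes.maketrans(src.encode(name), dst.encode(name))
        return data.translate(table)
    # Slow path: decode, translate whole characters, encode again.
    text = data.decode(name)
    return text.translate(str.maketrans(src, dst)).encode(name)


utf8_out = translate("foo".encode("utf-8"), "o", "\u00f6", "utf-8")
latin1_out = translate(b"foo", "o", "\u00f6", "latin-1")
```

A real implementation would also have to deal with invalid input sequences, ranges, character classes and streaming input, which this sketch leaves out; the slow path here simply raises on bytes that do not decode.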
From: Michael Stummvoll <michael@HIDDEN>
To: bug-coreutils@HIDDEN
Subject: tr - bytes vs characters
Date: Mon, 13 Aug 2012 14:52:22 +0200

Hi gnu folks,

as already known, tr cannot handle multibyte encodings like UTF-8:

> mst@eddie:~$ echo "foo" | tr o ö
> fÃÃ

I know that multibyte encoding support is not needed for
POSIX compliance, BUT:

the manpage of tr says the following:

> Translate, squeeze, and/or delete characters from standard input,
> writing to standard output.

and that's the inconsistency, IMHO.

The typical interpretation of "character" in such a context is one
character on display, regardless of which encoding is used or how many
bytes are used to display it. So, if tr really translates "characters",
it should preserve the encoding. If it doesn't, it translates not
"characters" but "bytes". So I see two ways:

- add multibyte-encoding support to tr
or
- change the manpage and help text to say "bytes" instead of "characters"

Since it doesn't seem that anybody wants to add the support to tr, an
update of the manpage would be the easier way to ensure consistency.

Kind regards,
Michael
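
The difference between translating bytes and translating characters can be reproduced outside of tr. The following Python sketch is only an illustration: the helper names `translate_bytes` and `translate_chars` are made up for this example, and pairing just the first byte of each operand is a simplifying assumption that matches the one-byte SET1 in the report, not a description of tr's source.

```python
# Illustration only: hypothetical helpers that mimic the behaviour in the
# report. This is not the source of any tr implementation.

O_UMLAUT = "\u00f6"  # the character "ö", two bytes (C3 B6) in UTF-8


def translate_bytes(data: bytes, src: str, dst: str) -> bytes:
    """Byte-oriented translation: only the first byte of each operand is
    paired up, as happens when SET1 is a single byte long."""
    table = bytes.maketrans(src.encode("utf-8")[:1], dst.encode("utf-8")[:1])
    return data.translate(table)


def translate_chars(text: str, src: str, dst: str) -> str:
    """Character-oriented translation on decoded text."""
    return text.translate(str.maketrans(src, dst))


byte_result = translate_bytes("foo".encode("utf-8"), "o", O_UMLAUT)
char_result = translate_chars("foo", "o", O_UMLAUT)

# byte_result is b"f\xc3\xc3": each "o" became the lone lead byte C3,
# which is not valid UTF-8 and displays as "Ã" when read as Latin-1.
# char_result is "f" followed by two "ö" characters.
```

The byte-oriented result matches the `fÃÃ` output shown in the report, while the character-oriented result is what a reader of the manpage wording would expect.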