Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org.
From: Jim Meyering <jim@HIDDEN>
To: Pádraig Brady <P@HIDDEN>
Cc: Bjoern Jacke <bjoern@HIDDEN>, Assaf Gordon <assafgordon@HIDDEN>, Ondrej Oprala <ooprala@HIDDEN>, 20114 <at> debbugs.gnu.org, Daiki Ueno <ueno@HIDDEN>, Bruno Haible <bruno@HIDDEN>
Subject: Re: bug#20114: tr does not support multibyte characters in the first argument
Date: Tue, 17 Mar 2015 08:12:55 -0700

On Mon, Mar 16, 2015 at 5:15 AM, Pádraig Brady <P@HIDDEN> wrote:
...
> Yes you're right Bruno.
> Multi-byte support in coreutils in general has languished,
> but we hope to start improving support in the next major release (9?)
> after the current imminent 8.24 stable release.
>
> To that end I've put together a plan:
> http://www.pixelbeat.org/docs/coreutils_i18n/

Very nice plan!

bug-coreutils@HIDDEN: bug#20114; Package coreutils.
From: Pádraig Brady <P@HIDDEN>
To: Bruno Haible <bruno@HIDDEN>, 20114 <at> debbugs.gnu.org
Cc: Bjoern Jacke <bjoern@HIDDEN>, Daiki Ueno <ueno@HIDDEN>, Assaf Gordon <assafgordon@HIDDEN>, Ondrej Oprala <ooprala@HIDDEN>
Subject: Re: bug#20114: tr does not support multibyte characters in the first argument
Date: Mon, 16 Mar 2015 12:15:07 +0000

On 16/03/15 02:30, Bruno Haible wrote:
> POSIX [1] specifies that the recognition of characters in 'tr' depends on
> the environment variables LANG, etc.
>
> But trying to replace a multibyte character by another character does not
> work:
>
> $ echo $LANG
> de_DE.UTF-8
> $ enspace=`printf '\u2002'`
> $ echo -n "X${enspace}Y" | tr "${enspace}" ' ' | od -t x1
> 0000000 58 20 20 20 59
> 0000005
>
> Expected output would be:
> $ echo -n "X${enspace}Y" | tr "${enspace}" ' ' | od -t x1
> 0000000 58 20 59
> 0000003
>
> With 'sed' it works:
>
> $ echo -n "X${enspace}Y" | sed -e "s/${enspace}/ /g" | od -t x1
> 0000000 58 20 59
> 0000003
>
> Bruno
>
> [1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/tr.html

Yes you're right Bruno.
Multi-byte support in coreutils in general has languished,
but we hope to start improving support in the next major release (9?)
after the current imminent 8.24 stable release.

To that end I've put together a plan:
http://www.pixelbeat.org/docs/coreutils_i18n/

cheers,
Pádraig.

bug-coreutils@HIDDEN: bug#20114; Package coreutils.
From: Bruno Haible <bruno@HIDDEN>
To: bug-coreutils@HIDDEN
Cc: Bjoern Jacke <bjoern@HIDDEN>
Subject: tr does not support multibyte characters in the first argument
Date: Mon, 16 Mar 2015 03:30:25 +0100

POSIX [1] specifies that the recognition of characters in 'tr' depends on
the environment variables LANG, etc.

But trying to replace a multibyte character by another character does not
work:

  $ echo $LANG
  de_DE.UTF-8
  $ enspace=`printf '\u2002'`
  $ echo -n "X${enspace}Y" | tr "${enspace}" ' ' | od -t x1
  0000000 58 20 20 20 59
  0000005

Expected output would be:
  $ echo -n "X${enspace}Y" | tr "${enspace}" ' ' | od -t x1
  0000000 58 20 59
  0000003

With 'sed' it works:

  $ echo -n "X${enspace}Y" | sed -e "s/${enspace}/ /g" | od -t x1
  0000000 58 20 59
  0000003

Bruno

[1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/tr.html

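Note on the byte counts in the report: U+2002 (EN SPACE) is encoded in UTF-8 as the three bytes E2 80 82, so a tr that works byte by byte maps each of those bytes to its own space, which would account for the "58 20 20 20 59" output. The sketch below reproduces both results without depending on the locale. It is only an illustration: the function names bytewise and sequencewise are made up here, the en space is spelled out in octal instead of relying on printf '\u2002', and the C locale is forced to make the byte-wise behaviour explicit. The second function assumes a sed that does not append a newline to unterminated input, as GNU sed does not.

```shell
#!/bin/sh
# U+2002 (EN SPACE) as its three UTF-8 bytes, written in octal: E2 80 82.

# Byte-wise translation: each of the three bytes in the first set is mapped
# to the corresponding space in the second set, so one en space becomes
# three spaces.
bytewise() {
  printf 'X\342\200\202Y' | LC_ALL=C tr '\342\200\202' '   ' | od -An -t x1
}

# Sequence-wise replacement: the whole three-byte sequence is replaced by a
# single space, which is what the report expects from tr in a UTF-8 locale.
# (Assumes a sed that leaves unterminated input unterminated, e.g. GNU sed.)
sequencewise() {
  enspace=$(printf '\342\200\202')
  printf 'X\342\200\202Y' | LC_ALL=C sed -e "s/${enspace}/ /g" | od -Abs -An -t x1 2>/dev/null ||
    printf 'X\342\200\202Y' | LC_ALL=C sed -e "s/${enspace}/ /g" | od -An -t x1
}

bytewise       # hex bytes: 58 20 20 20 59
sequencewise   # hex bytes: 58 20 59 (with GNU sed)
```

The first function passes three spaces as the second set so that the result does not depend on how a given tr pads a shorter second set.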
Bruno Haible <bruno@HIDDEN>: bug-coreutils@HIDDEN.
bug-coreutils@HIDDEN: bug#20114; Package coreutils.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.