GNU bug report logs - #9365
multibyte: tr: TR operates on bytes, not characters

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: coreutils; Severity: wishlist; Reported by: "Urmas" <davian818@HIDDEN>; merged with #9569, #10880, #12192, #13362; dated Thu, 25 Aug 2011 04:51:01 UTC; Maintainer for coreutils is bug-coreutils@HIDDEN.
Changed bug title to 'multibyte: tr: TR operates on bytes, not characters' from 'TR operates on bytes, not characters'. Request was from Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org.
Severity set to 'wishlist' from 'normal'. Request was from Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org.
Forcibly Merged 9365 9569 10880 12192 13362. Request was from Pádraig Brady <P@HIDDEN> to control <at> debbugs.gnu.org.
Forcibly Merged 9365 9569 10880 12192. Request was from Jim Meyering <jim@HIDDEN> to control <at> debbugs.gnu.org.
Forcibly Merged 9365 9569 10880. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org.
Forcibly Merged 9365 9569. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 9365 <at> debbugs.gnu.org:


Content-Type: text/plain; charset="utf-8"
Date: Fri, 24 Feb 2012 09:18:24 -0500
From: "Marton Kadar" <marton.kadar@HIDDEN>
Message-ID: <20120224141824.107150@HIDDEN>
MIME-Version: 1.0
Subject: Example
To: 9365 <at> debbugs.gnu.org

The environment is set to a Hungarian locale, where á and í are proper
lowercase letters (Spanish, for example, has these letters too):

$ set | grep ^L
LANG=hu_HU.UTF-8
LC_ALL=hu_HU.UTF-8
LINES=73
LOGNAME=kadar1marto518

Now let's look at the byte stream for the following string
(which means "flood" in Hungarian):

$ echo árvíz | od -c
0000000 303 241   r   v 303 255   z  \n
0000010
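
The octal values printed by od -c can be cross-checked outside the shell.
The following Python sketch is not part of the original report, and the
helper name octal_dump is made up for illustration. It encodes the same
string as UTF-8 and lists every byte in octal, which shows that á and í
each occupy two bytes (303 241 and 303 255):

```python
def octal_dump(text: str) -> list:
    """Return the UTF-8 bytes of text, each as a 3-digit octal string."""
    return [format(b, "03o") for b in text.encode("utf-8")]


# Same input as `echo árvíz`, including the trailing newline.
dump = octal_dump("árvíz\n")
print(" ".join(dump))  # prints: 303 241 162 166 303 255 172 012
```

od -c shows printable bytes as characters, so 162, 166 and 172 appear
there as r, v and z, and 012 appears as \n.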

Let us try to delete a character and see whether it works:

$ echo árvíz | tr -d á | od -c
0000000   r   v 255   z  \n
0000005

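Both bytes of á (303 and 241) were deleted wherever they occurred, so í
lost its lead byte 303 and a stray 255 remains. The correct behavior
would instead be: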

$ echo árvíz | tr -d á | od -c
0000000   r   v 303 255   z  \n
0000006
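
The difference between the observed and the expected output can be
reproduced with a small Python sketch. It assumes UTF-8 input, and the
function names delete_bytewise and delete_charwise are hypothetical; it
models the two behaviors and is not taken from the tr source:

```python
def delete_bytewise(data: bytes, set1: bytes) -> bytes:
    """Drop every input byte that occurs anywhere in set1 (observed behavior)."""
    unwanted = set(set1)
    return bytes(b for b in data if b not in unwanted)


def delete_charwise(data: bytes, set1: bytes, encoding: str = "utf-8") -> bytes:
    """Decode, drop whole characters, re-encode (expected behavior)."""
    unwanted = set(set1.decode(encoding))
    text = data.decode(encoding)
    return "".join(ch for ch in text if ch not in unwanted).encode(encoding)


data = "árvíz\n".encode("utf-8")
set1 = "á".encode("utf-8")  # two bytes: 0xC3 0xA1 (octal 303 241)

observed = delete_bytewise(data, set1)
expected = delete_charwise(data, set1)
print(observed)  # prints: b'rv\xadz\n'      (5 bytes, lone 0xAD = octal 255)
print(expected)  # prints: b'rv\xc3\xadz\n'  (6 bytes, í intact)
```

The byte-wise variant treats SET1 as the two separate bytes 0xC3 and
0xA1, which is why the lead byte of í disappears as well.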

I'll check the source for tr myself, although I have never coded in C.
This should be a trivial fix. The problem is especially annoying
because we currently have no simple, good general-purpose case
conversion tool (correct me if I'm wrong, but tr should be this
tool).
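
The same byte-versus-character distinction affects case conversion. The
Python sketch below is only an analogy and does not show how tr is
implemented: bytes.upper() maps ASCII letters only, while str.upper()
applies Unicode case mappings independently of the locale.

```python
word = "árvíz"

# Byte-wise: bytes.upper() only maps ASCII a-z, so the two-byte UTF-8
# sequences for á and í pass through unchanged.
bytewise = word.encode("utf-8").upper().decode("utf-8")

# Character-wise: str.upper() works on whole Unicode characters.
charwise = word.upper()

print(bytewise)  # prints: áRVíZ
print(charwise)  # prints: ÁRVÍZ
```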

Marton Kadar




Information forwarded to bug-coreutils@HIDDEN:
bug#9365; Package coreutils.

Message received at submit <at> debbugs.gnu.org:


Message-ID: <95F385CFAFAF4C3886AF4F8C513EDB02@sandy>
From: "Urmas" <davian818@HIDDEN>
To: <bug-coreutils@HIDDEN>
Subject: TR operates on bytes, not characters
Date: Thu, 25 Aug 2011 09:01:14 +0700

[coreutils 8.5]
tr is treating UTF-8 characters in SET1/SET2 as byte sequences.
Correct either this or the manual, which states that SET1/2 are of
'characters', not bytes.





Acknowledgement sent to "Urmas" <davian818@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN.
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils@HIDDEN:
bug#9365; Package coreutils.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.