GNU bug report logs - #28038
multibyte: expand: expand(1) lacks MBC support

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: coreutils; Severity: wishlist; Reported by: Tilman Schmidt <tschmidt@HIDDEN>; dated Thu, 10 Aug 2017 16:11:01 UTC; Maintainer for coreutils is bug-coreutils@HIDDEN.
Changed bug title to 'multibyte: expand: expand(1) lacks MBC support' from 'expand(1) lacks MBC support'. Request was from Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org.
Severity set to 'wishlist' from 'normal'. Request was from Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 28038 <at> debbugs.gnu.org:


Subject: Re: bug#28038: expand(1) lacks MBC support
To: Tilman Schmidt <tschmidt@HIDDEN>, 28038 <at> debbugs.gnu.org
References: <E1dfq2P-00055w-Vm@HIDDEN>
From: Assaf Gordon <assafgordon@HIDDEN>
Message-ID: <1ae2b1f2-103d-c6e8-7474-39e463055e2a@HIDDEN>
Date: Fri, 11 Aug 2017 17:58:48 -0600

Hello Tilman,

On 10/08/17 10:10 AM, Tilman Schmidt wrote:
> it seems the expand(1) command does not properly support multi-byte
> characters.

That is correct.

> tschmidt@sl-vm-redmine01:~$ cat test.txt
> Text	ohne	Umlaute
> Täxt	müt	Umläuten
> tschmidt@sl-vm-redmine01:~$ expand test.txt
> Text    ohne    Umlaute
> Täxt   müt    Umläuten
> 
> Using Ubuntu 14.04.5 LTS with coreutils 8.21-1ubuntu.

Multibyte support is not available yet (neither in version 8.21, which is
4 years old, nor in the current version 8.27).

However, there is an ongoing effort to add multibyte support
to all coreutils programs, including 'expand'.

You can read more technical details about it here:
  http://crashcourse.housegordon.org/coreutils-multibyte-support.html

In the current (work-in-progress) internationalization patch,
the 'expand' program does support multibyte locales, and expands
your input correctly:

In a multibyte locale:

   $ ./src/expand bug28038.txt
   Text    ohne    Umlaute
   Täxt    müt     Umläuten

versus forcing a single-byte locale:

   $ LC_ALL=C ./src/expand bug28038.txt
   Text    ohne    Umlaute
   Täxt   müt    Umläuten
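
The difference between the two runs above comes down to what advances
the column counter: bytes or characters. The following is a minimal
Python sketch of that idea, not the coreutils implementation. The
function name expand_tabs and its count_bytes flag are made up for
illustration, and tab stops are assumed to sit every 8 columns. In
UTF-8, 'ä' and 'ü' take two bytes each, so counting bytes puts the
column one position too far and the next tab receives one space fewer.

```python
def expand_tabs(line: str, tabsize: int = 8, count_bytes: bool = False) -> str:
    """Replace tabs with spaces up to the next tab stop.

    count_bytes=True mimics a single-byte view of UTF-8 input, where each
    byte advances the column. count_bytes=False advances the column once
    per character (code point). Both modes are illustrative assumptions.
    """
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            # Pad to the next multiple of tabsize.
            pad = tabsize - col % tabsize
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += len(ch.encode("utf-8")) if count_bytes else 1
    return "".join(out)


if __name__ == "__main__":
    sample = "Täxt\tmüt\tUmläuten"
    print(expand_tabs(sample, count_bytes=True))   # byte-counting view
    print(expand_tabs(sample, count_bytes=False))  # character-counting view
```

Counting code points is itself a simplification: correct alignment
depends on display width, so wide or combining characters would need
further handling that this sketch does not attempt.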


The latest version of the patch is available for download and
experimentation here:
  http://lists.gnu.org/archive/html/coreutils/2017-04/msg00009.html
However, it should not be considered stable.

regards,
 - assaf






Information forwarded to bug-coreutils@HIDDEN:
bug#28038; Package coreutils.

Message received at submit <at> debbugs.gnu.org:


To: bug-coreutils@HIDDEN
From: Tilman Schmidt <tschmidt@HIDDEN>
Subject: expand(1) lacks MBC support
Organization: cardtech Card & POS Service GmbH
Date: Thu, 10 Aug 2017 18:10:02 +0200
Message-ID: <E1dfq2P-00055w-Vm@HIDDEN>

Hi,

it seems the expand(1) command does not properly support multi-byte
characters.

tschmidt@sl-vm-redmine01:~$ echo $LANG
de_DE.UTF-8
tschmidt@sl-vm-redmine01:~$ cat test.txt
Text	ohne	Umlaute
Täxt	müt	Umläuten
tschmidt@sl-vm-redmine01:~$ expand test.txt
Text    ohne    Umlaute
Täxt   müt    Umläuten

Using Ubuntu 14.04.5 LTS with coreutils 8.21-1ubuntu.

Regards,
Tilman

-- 
Tilman Schmidt
Team Lead, System Administration

Tel. 0221 / 95 64 95 . 417
Fax  0221 / 95 64 95 . 699

Email tschmidt@HIDDEN

cardtech
Card & POS Service GmbH
Richard-Byrd-Straße 37
50829 Köln
www.cardtech.de

AG Köln, HRB 20164
Managing Directors: Dr. Dietrich Gottwald, Christof Kohns






Acknowledgement sent to Tilman Schmidt <tschmidt@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN.
Report forwarded to bug-coreutils@HIDDEN:
bug#28038; Package coreutils.
Last modified: Tue, 30 Oct 2018 01:15:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.