Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org. Full text available.
Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Received: (at 24924) by debbugs.gnu.org; 1 Dec 2016 08:49:49 +0000
Date: Thu, 1 Dec 2016 08:49:39 +0000
From: Stephane Chazelas <stephane.chazelas@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: 24924 <at> debbugs.gnu.org
Subject: Re: bug#24924: GNU pr only working with singlebyte 1-width characters
Message-ID: <20161201084939.GA11768@HIDDEN>
In-Reply-To: <20161201070405.GB4922@HIDDEN>

2016-12-01 07:04:05 +0000, Stephane Chazelas:
> 2016-11-30 18:37:05 -0800, Paul Eggert:
> [...]
> > In the meantime if you could submit a patch for the
> > documentation that should fix the immediate documentation
> > problem.
> [...]
>
> What about:
[...]
> +Please note that @command{pr} currently doesn't support multi-byte characters
> +or non-ASCII characters that have a null or double width. If such characters
> +occur in the input or column separators, column alignment may be off or lines
> +may exceed the page width. There is also no provision to support bidirectional
> +text.
[...]

Actually, it seems it can also truncate lines in the middle of
some characters though it seems it's confined to multibyte
characters that have byte values <= 127 like:

$ locale charmap
BIG5-HKSCS
$ printf '\ue9\ue9\ue9\n' | pr -w5 -t2 | hd
00000000  88 6d 88 6d 88 0a                                 |.m.m..|
00000006

See how that third é (0x88 0x6d in BIG5-HKSCS) was truncated in
the middle.

It's as if it was considering all byte values >= 128 as having
zero width in multi-byte locales (and only in multi-byte
locales, that doesn't seem to occur in single-byte ones).

So maybe:

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index cc85f22..15088ce 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -1838,6 +1838,13 @@
 For single column output no line truncation occurs by default.  Use
 @option{-W} option to truncate lines in that case.
 
+Please note that @command{pr} currently doesn't support multi-byte characters
+or non-ASCII characters that have a null or double width. If such characters
+occur in the input or column separators, column alignment may be off or lines
+may exceed the page width, or truncation may occur in the middle of some
+characters producing invalid text output. There is also no provision to support
+bidirectional text.
+
 The following changes were made in version 1.22i and apply to later
 versions of @command{pr}:
 @c FIXME: this whole section here sounds very awkward to me. I

-- 
Stephane
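The guess made in the message above (every byte >= 128 is treated as having zero width in multi-byte locales) can be modelled in a few lines of Python. This is only a sketch of that hypothesis, not the code pr actually runs: the names `assumed_width` and `truncate_by_assumed_width` are hypothetical, and the column budget of 2 is an assumption chosen because it reproduces the hex dump.

```python
def assumed_width(byte: int) -> int:
    """Hypothetical rule from the report: bytes >= 128 take no columns."""
    return 0 if byte >= 128 else 1


def truncate_by_assumed_width(line: bytes, columns: int) -> bytes:
    """Keep bytes until the next one would exceed the column budget."""
    kept = bytearray()
    used = 0
    for byte in line:
        width = assumed_width(byte)
        if used + width > columns:
            break
        kept.append(byte)
        used += width
    return bytes(kept)


# Three copies of the two-byte sequence 0x88 0x6d quoted in the report.
line = bytes.fromhex("886d886d886d")

# Assumed budget of 2 columns (not taken from pr's source).
result = truncate_by_assumed_width(line, 2)
print(result.hex(" "))
```

Under these assumptions the output is `88 6d 88 6d 88`, the same bytes `hd` shows before the newline: the lead byte of the third character is kept because it costs nothing, and its trail byte 0x6d is the one that overflows the budget.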
bug-coreutils@HIDDEN: bug#24924; Package coreutils. Full text available.

Received: (at 24924) by debbugs.gnu.org; 1 Dec 2016 07:04:16 +0000
Date: Thu, 1 Dec 2016 07:04:05 +0000
From: Stephane Chazelas <stephane.chazelas@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: 24924 <at> debbugs.gnu.org
Subject: Re: bug#24924: GNU pr only working with singlebyte 1-width characters
Message-ID: <20161201070405.GB4922@HIDDEN>
In-Reply-To: <69288e76-9f5d-c792-9bba-6a984461463b@HIDDEN>

2016-11-30 18:37:05 -0800, Paul Eggert:
[...]
> In the meantime if you could submit a patch for the
> documentation that should fix the immediate documentation
> problem.
[...]

What about:

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index cc85f22..6eb497b 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -1838,6 +1838,12 @@
 For single column output no line truncation occurs by default.  Use
 @option{-W} option to truncate lines in that case.
 
+Please note that @command{pr} currently doesn't support multi-byte characters
+or non-ASCII characters that have a null or double width. If such characters
+occur in the input or column separators, column alignment may be off or lines
+may exceed the page width. There is also no provision to support bidirectional
+text.
+
 The following changes were made in version 1.22i and apply to later
 versions of @command{pr}:
 @c FIXME: this whole section here sounds very awkward to me. I
bug-coreutils@HIDDEN: bug#24924; Package coreutils. Full text available.

Received: (at 24924) by debbugs.gnu.org; 1 Dec 2016 06:32:33 +0000
Date: Thu, 1 Dec 2016 06:32:22 +0000
From: Stephane Chazelas <stephane.chazelas@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: 24924 <at> debbugs.gnu.org
Subject: Re: bug#24924: GNU pr only working with singlebyte 1-width characters
Message-ID: <20161201063222.GA4922@HIDDEN>
In-Reply-To: <69288e76-9f5d-c792-9bba-6a984461463b@HIDDEN>

2016-11-30 18:37:05 -0800, Paul Eggert:
> On 11/30/2016 03:30 AM, Stephane Chazelas wrote:
> >That can also be seen as a POSIX conformance bug
>
> Not really, as POSIX does not require support for UTF-8 (except in
> the pax utility, which is not part of coreutils).
[...]

POSIX does not require support for any charset. It only
specifies one locale (C/POSIX), doesn't specify the charset in
that locale other than it should be a single byte charset that
covers the portable character set. Examples of such charsets are
ASCII, iso8859-x or EBCDIC. In practice, that tends to be ASCII
(except for some rare EBCDIC based IBM systems) as tha

But it does support a localisation API and allows system to
support other locales with other charsets. That API does support
multi-byte encodings, including stateful ones (though how they
are /defined/ is implementation defined for lock-shift ones and
in practice those are unworkable so I'd expect those would
eventually be removed from the standard).

It doesn't require compliant systems to have locales with
multi-byte character sets, but if they have (if they show up in
the output of locale -a), then they have to be supported
throughout (as specified, for all the utilities for instance).

Basically, on systems that have locales with multi-byte
encodings --UTF-8 or other-- (most Unix-like ones including GNU
systems like Debian), GNU pr (and many other GNU utilities) is
not POSIX compliant.

See
http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/basedefs/V1_chap06.html
for details.

-- 
Stephane
bug-coreutils@HIDDEN: bug#24924; Package coreutils. Full text available.

Received: (at 24924) by debbugs.gnu.org; 1 Dec 2016 02:37:14 +0000
Date: Wed, 30 Nov 2016 18:37:05 -0800
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
To: Stephane Chazelas <stephane.chazelas@HIDDEN>, 24924 <at> debbugs.gnu.org
Subject: Re: bug#24924: GNU pr only working with singlebyte 1-width characters
Message-ID: <69288e76-9f5d-c792-9bba-6a984461463b@HIDDEN>
In-Reply-To: <20161130113034.GA7005@HIDDEN>

On 11/30/2016 03:30 AM, Stephane Chazelas wrote:
> That can also be seen as a POSIX conformance bug

Not really, as POSIX does not require support for UTF-8 (except in the
pax utility, which is not part of coreutils).

It'd be nice if pr etc. could be made to work cleanly for UTF-8. In the
meantime if you could submit a patch for the documentation that should
fix the immediate documentation problem.
bug-coreutils@HIDDEN: bug#24924; Package coreutils. Full text available.

Received: (at 24924) by debbugs.gnu.org; 30 Nov 2016 11:30:44 +0000
Date: Wed, 30 Nov 2016 11:30:34 +0000
From: Stephane Chazelas <stephane.chazelas@HIDDEN>
To: 24924 <at> debbugs.gnu.org
Subject: GNU pr only working with singlebyte 1-width characters
Message-ID: <20161130113034.GA7005@HIDDEN>

Only arguing on the classification of this bug here. Let's call
a cat a cat. When something doesn't work as documented, it's a
bug, not a wishlist entry.

AFAICT, there's nothing in the GNU coreutils documentation that
states that pr only works on input that consists exclusively of
single-byte characters that are neither zero-width (though it
copes OK with ASCII BS and TAB) nor double-width (or on
ASCII-only input).

Today, UTF-8 is the most commonly used character set, so it even
affects English text (where £ (the British currency symbol) is
encoded on two bytes in UTF-8 for instance), and even US-English
text like for the ‘quoting characters’ (3 bytes each in UTF-8)
now that ASCII ' has been demoted to just an apostrophe.

That can also be seen as a POSIX conformance bug (though GNU
coreutils doesn't claim POSIX conformance, only "The GNU
utilities documented here are /mostly/ compatible with the POSIX
standard").

$ pr -tm --sep-string='|' <(du --version) <(truncate --version)
du (GNU coreutils) 8.25            |truncate (GNU coreutils) 8.25
Copyright (C) 2016 Free Software Fo|Copyright (C) 2016 Free Software Fo
License GPLv3+: GNU GPL version 3 o|License GPLv3+: GNU GPL version 3 o
This is free software: you are free|This is free software: you are free
There is NO WARRANTY, to the extent|There is NO WARRANTY, to the extent
                                   |
Written by Torbjörn Granlund, David |Written by Pádraig Brady.
and Jim Meyering.                  |

-- 
Stephane
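The byte counts quoted in this message are easy to check. The short Python sketch below (the helper name `utf8_length` is made up for illustration) encodes the characters mentioned and the visible part of the du author line from the output above, and compares character counts with byte counts.

```python
def utf8_length(text: str) -> int:
    """Number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Characters mentioned in the report, written as escapes to avoid
# any ambiguity about which code point is meant.
samples = {
    "pound sign": "\u00a3",
    "left quote": "\u2018",
    "right quote": "\u2019",
    "ASCII apostrophe": "'",
}

for name, char in samples.items():
    print(f"{name}: {utf8_length(char)} byte(s)")

# The du author line up to and including "David", as shown in the pr output.
line = "Written by Torbj\u00f6rn Granlund, David"
print(len(line), "characters,", utf8_length(line), "bytes")
```

The two counts differ by one for that line because ö takes two bytes, so any tool that equates one byte with one column cannot line the separator up with the ASCII-only rows.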
bug-coreutils@HIDDEN: bug#24924; Package coreutils. Full text available.

Received: (at 24924) by debbugs.gnu.org; 12 Nov 2016 10:12:46 +0000
Date: Sat, 12 Nov 2016 18:12:36 +0800
From: 積丹尼 Dan Jacobson <jidanni@HIDDEN>
To: Assaf Gordon <assafgordon@HIDDEN>
Cc: 24924 <at> debbugs.gnu.org
Subject: Re: bug#24924: pr has no concept of wide characters
Message-ID: <878tsp6m7f.fsf@HIDDEN>
References: <8737iyatfd.fsf@HIDDEN>

>>>>> "AG" == Assaf Gordon <assafgordon@HIDDEN> writes:

AG> I would very much appreciate if you could help me test it as there
AG> are many edge-cases with multibyte support and wide-characters.

Sure but you need to send me a .deb or

$ which pr|xargs file
/usr/bin/pr: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=14376d20f6383ec9348da986ecc693c6bb45a0ee, stripped

AG> As a curiosity,
AG> are you using UTF-8 locales exclusively, or do you have experience
AG> with Shift-JIS or EUC-JP locales?

Nope I just use zh_TW.utf8 all the time.
bug-coreutils@HIDDEN
:bug#24924
; Package coreutils
.
Subject: Re: bug#24924: pr has no concept of wide characters
To: 積丹尼 Dan Jacobson <jidanni@HIDDEN>
Cc: 24924 <at> debbugs.gnu.org
From: Assaf Gordon <assafgordon@HIDDEN>
Message-ID: <c758a694-ba58-8e9b-7d97-81c0c44bff41@HIDDEN>
In-Reply-To: <8737iyatfd.fsf@HIDDEN>
References: <8737iyatfd.fsf@HIDDEN>
Date: Fri, 11 Nov 2016 11:36:20 -0500

severity 24924 wishlist
tags 24924 wishlist notabug
thanks

Hello Dan,

On 11/11/2016 11:10 AM, 積丹尼 Dan Jacobson wrote:
> The pr documentation (man, info) doesn't mention how it has no concept
> of wide characters.
> $ pr -m --sep-string='^^^' file file

Indeed, most of the current coreutils programs do not support wide or
multi-byte characters correctly. The current official implementation of
'pr' does not support them (which is why I marked this item as 'wishlist'
and not a bug).

On Red Hat systems, there is the 'i18n' patch, which adds some support but
also introduces some problematic issues:
https://github.com/pixelb/coreutils/tree/i18n

However, there is an active effort to make all of the coreutils programs
multibyte-aware. The latest updates are (in reverse chronological order;
these are somewhat long threads):
http://lists.gnu.org/archive/html/coreutils/2016-09/msg00026.html
http://lists.gnu.org/archive/html/coreutils/2016-09/msg00011.html
http://lists.gnu.org/archive/html/coreutils/2016-07/msg00013.html

'cut' and 'expand' were the first two programs I worked on. 'pr' is
definitely on the list. Once I have a proof-of-concept working, I would
very much appreciate it if you could help me test it, as there are many
edge cases with multibyte support and wide characters.

As a curiosity, are you using UTF-8 locales exclusively, or do you have
experience with Shift-JIS or EUC-JP locales?

I'm leaving this ticket open, and welcome discussion and comments.

regards,
 - assaf

P.S.
The usual disclaimer applies: there is currently no ETA for multibyte
support in coreutils.
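For readers following this report, the gap between bytes, characters, and screen columns can be sketched as below. This is only an illustration under stated assumptions, not how 'pr' or the in-progress multibyte work is implemented: the helper names display_width and truncate_to_columns are hypothetical, and the width rule (East Asian Wide/Fullwidth characters take two columns, combining marks take none) is a simplification of what a terminal actually does.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns `text` occupies.

    Assumption: East Asian Wide ("W") and Fullwidth ("F") characters
    take two columns, combining marks take none, everything else one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn over the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def truncate_to_columns(text: str, columns: int) -> str:
    """Cut `text` so that it fits in `columns` screen columns.

    Hypothetical helper: a column-aware tool would truncate on display
    width like this, rather than on a byte count.
    """
    out = []
    used = 0
    for ch in text:
        w = display_width(ch)
        if used + w > columns:
            break
        out.append(ch)
        used += w
    return "".join(out)


if __name__ == "__main__":
    sample = "<dd> 5 o 台灣同志遊行聯盟 Taiwan LGBT Pride Co"
    print("bytes:     ", len(sample.encode("utf-8")))
    print("characters:", len(sample))
    print("columns:   ", display_width(sample))
    print(truncate_to_columns(sample, 34))
```

The three counts differ for any line containing CJK text, which is why truncating or padding by anything other than display width misaligns the columns.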
bug-coreutils@HIDDEN: bug#24924; Package coreutils.
From: 積丹尼 Dan Jacobson <jidanni@HIDDEN>
To: bug-coreutils@HIDDEN
Subject: pr has no concept of wide characters
Date: Sat, 12 Nov 2016 00:10:46 +0800
Message-ID: <8737iyatfd.fsf@HIDDEN>

The pr documentation (man, info) doesn't mention how it has no concept
of wide characters.

$ pr -m --sep-string='^^^' file file

2016-11-12 00:06 Page 1

<!DOCTYPE HTML PUBLIC "-//W3C//DTD^^^<!DOCTYPE HTML PUBLIC "-//W3C//DTD
"http://www.w3.org/TR/html4/strict^^^"http://www.w3.org/TR/html4/strict
<html lang="zh-tw"> ^^^<html lang="zh-tw">
<head> ^^^<head>
 <meta http-equiv="Content-Type" c^^^ <meta http-equiv="Content-Type" c
 "text/html; charset=utf-8"> ^^^ "text/html; charset=utf-8">
 <meta name="viewport" content="wi^^^ <meta name="viewport" content="wi
 <title>My groups ordered by ...</^^^ <title>My groups ordered by ...</
 <base href="https://www.facebook.^^^ <base href="https://www.facebook.
</head> ^^^</head>
<body> ^^^<body>
 <dl> ^^^ <dl>
 <dt>"同志|Queer|Gdi"</dt> ^^^ <dt>"同志|Queer|Gdi"</dt>
 <dd> 5 o 台灣同志遊行聯盟 Taiwan LGBT Pride Co^^^ <dd> 5 o 台灣同志遊行聯盟 Taiwan LGBT Pride Co
 <dd> 0 o 台灣同志交友聯盟 301797916498866<BR> ^^^ <dd> 0 o 台灣同志交友聯盟 301797916498866<BR>
 <dd> 25 o 我是(直)同志，我很驕傲! 185779952675<BR> ^^^ <dd> 25 o 我是(直)同志，我很驕傲! 185779952675<BR>
 <dd> 25 o 台灣酷兒權益推動聯盟 Taiwan Gender Queer ^^^ <dd> 25 o 台灣酷兒權益推動聯盟 Taiwan Gender Queer
 <dt>"性別|蝶園" BUT NOT "TV"</dt> ^^^ <dt>"性別|蝶園" BUT NOT "TV"</dt>
 <dd> 0 c 跨性別與女性主義 Transgender&Femi^^^ <dd> 0 c 跨性別與女性主義 Transgender&Femi
 <dd> 2 c 中部性別團體聯盟 293589073985313<BR> ^^^ <dd> 2 c 中部性別團體聯盟 293589073985313<BR>
 <dd> 1 o 台灣TG蝶園 320448571355058<BR^^^ <dd> 1 o 台灣TG蝶園 320448571355058<BR
 <dd> 0 o 中華民國跨性別者生活權益促進合作社訊息發布站 252346365161476<BR> ^^^ <dd> 0 o 中華民國跨性別者生活權益促進合作社訊息發布站 252346365161476<BR>
 <dd> 3 o 性別不明關懷協會(Beyond Gender) 17160^^^ <dd> 3 o 性別不明關懷協會(Beyond Gender) 17160
 <dd> 0 o 偽百合與偽娘、跨性別們的哲學、思想交流社群 810661859077873<BR> ^^^ <dd> 0 o 偽百合與偽娘、跨性別們的哲學、思想交流社群 810661859077873<BR>

$ pr --version
pr (GNU coreutils) 8.25
積丹尼 Dan Jacobson <jidanni@HIDDEN>: bug-coreutils@HIDDEN.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.