From: Jonathan Buchanan <jonathan.russ.buchanan@HIDDEN>
To: bug-coreutils@HIDDEN
Subject: [PATCH 2/2] unexpand: Reimplemented the unexpand algorithm to satisfy
the standard
Date: Thu, 21 Apr 2016 20:33:36 -0400
Message-Id: <1461285216-14656-2-git-send-email-jonathan.russ.buchanan@HIDDEN>
X-Mailer: git-send-email 2.8.0
In-Reply-To: <1461285216-14656-1-git-send-email-jonathan.russ.buchanan@HIDDEN>
References: <1461285216-14656-1-git-send-email-jonathan.russ.buchanan@HIDDEN>
* TODO: Removed the entry describing how unexpand did not
conform to the standard.
* src/unexpand.c (unexpand): Reimplemented the conversion algorithm.
The program now handles both cases listed in the removed TODO entry.
---
TODO | 4 --
src/unexpand.c | 176 ++++++++++++++++++++++-----------------------------------
2 files changed, 69 insertions(+), 111 deletions(-)
diff --git a/TODO b/TODO
index de95e5a..dc1a9e2 100644
--- a/TODO
+++ b/TODO
@@ -67,10 +67,6 @@ lib/strftime.c: Since %N is the only format that we need but that
would expand /%(-_)?\d*N/ to the desired string and then pass the
resulting string to glibc's strftime.
-unexpand: [http://www.opengroup.org/onlinepubs/007908799/xcu/unexpand.html]
- printf 'x\t \t y\n'|unexpand -t 8,9 should print its input, unmodified.
- printf 'x\t \t y\n'|unexpand -t 5,8 should print "x\ty\n"
-
sort: Investigate better sorting algorithms; see Knuth vol. 3.
We tried list merge sort, but it was about 50% slower than the
diff --git a/src/unexpand.c b/src/unexpand.c
index a758756..dcd40de 100644
--- a/src/unexpand.c
+++ b/src/unexpand.c
@@ -303,13 +303,6 @@ unexpand (void)
/* Input character, or EOF. */
int c;
- /* If true, perform translations. */
- bool convert = true;
-
-
- /* The following variables have valid values only when CONVERT
- is true: */
-
/* Column of next input character. */
uintmax_t column = 0;
@@ -319,127 +312,96 @@ unexpand (void)
/* Index in TAB_LIST of next tab stop to examine. */
size_t tab_index = 0;
- /* If true, the first pending blank came just before a tab stop. */
- bool one_blank_before_tab_stop = false;
-
- /* If true, the previous input character was a blank. This is
- initially true, since initial strings of blanks are treated
- as if the line was preceded by a blank. */
- bool prev_blank = true;
-
/* Number of pending columns of blanks. */
size_t pending = 0;
-
- /* Convert a line of text. */
+ /* If true, the previous input character was not a blank. */
+ bool previous_non_blank = false;
do
{
while ((c = getc (fp)) < 0 && (fp = next_file (fp)))
continue;
- if (convert)
+ if (c < 0)
+ {
+ free (pending_blank);
+ return;
+ }
+
+ /* Update the next tab column */
+ if (next_tab_column <= column)
{
- bool blank = !! isblank (c);
+ if (tab_size)
+ next_tab_column = (column + (tab_size - column % tab_size));
+ else
+ if (tab_index < first_free_tab)
+ next_tab_column = tab_list[tab_index++];
+ else
+ next_tab_column = -1;
+ }
- if (blank)
+ bool blank = !! isblank (c);
+ if (!blank)
+ {
+ /* If no -a, stop converting once a non-blank is reached. */
+ if (!convert_entire_line)
+ next_tab_column = -1;
+ if (fwrite (pending_blank, sizeof (char), pending, stdout)
+ != pending)
+ error (EXIT_FAILURE, errno, _("write error"));
+ pending = 0;
+ if (putchar (c) < 0)
+ error (EXIT_FAILURE, errno, _("write error"));
+ previous_non_blank = true;
+ }
+ else
+ {
+ pending_blank[pending] = c;
+ pending++;
+ /* POSIX says spaces should not precede tabs, so remove spaces
+ if a tab is found after spaces. */
+ if (pending_blank[0] != '\t' && c == '\t')
{
- if (next_tab_column <= column)
+ pending = 1;
+ pending_blank[0] = '\t';
+ }
+ if (column + 1 == next_tab_column)
+ {
+ /* POSIX says single trailing spaces should not be converted
+ to tabs if they are followed by a non-blank. */
+ if (c == ' ' && pending == 1 && previous_non_blank)
{
- if (tab_size)
- next_tab_column =
- column + (tab_size - column % tab_size);
+ previous_non_blank = false;
+ if ((c = getc (fp)) >= 0)
+ blank = !! isblank (c);
else
- while (true)
- if (tab_index == first_free_tab)
- {
- convert = false;
- break;
- }
- else
- {
- uintmax_t tab = tab_list[tab_index++];
- if (column < tab)
- {
- next_tab_column = tab;
- break;
- }
- }
- }
-
- if (convert)
- {
- if (next_tab_column < column)
- error (EXIT_FAILURE, 0, _("input line is too long"));
-
- if (c == '\t')
{
- column = next_tab_column;
-
- if (pending)
- pending_blank[0] = '\t';
+ /* End of file, do not convert to tab. */
+ if (putchar (' ') < 0)
+ error (EXIT_FAILURE, errno, _("write error"));
+ continue;
}
+ if (!blank)
+ c = ' ';
else
- {
- column++;
-
- if (! (prev_blank && column == next_tab_column))
- {
- /* It is not yet known whether the pending blanks
- will be replaced by tabs. */
- if (column == next_tab_column)
- one_blank_before_tab_stop = true;
- pending_blank[pending++] = c;
- prev_blank = true;
- continue;
- }
-
- /* Replace the pending blanks by a tab or two. */
- pending_blank[0] = c = '\t';
- }
-
- /* Discard pending blanks, unless it was a single
- blank just before the previous tab stop. */
- pending = one_blank_before_tab_stop;
+ c = '\t';
+ if (putchar (c) < 0)
+ error (EXIT_FAILURE, errno, _("write error"));
+ column += 1;
+ pending = 0;
+ /* Move the position in the file back and continue. */
+ fseek (fp, -1, SEEK_CUR);
+ continue;
}
- }
- else if (c == '\b')
- {
- /* Go back one column, and force recalculation of the
- next tab stop. */
- column -= !!column;
- next_tab_column = column;
- tab_index -= !!tab_index;
- }
- else
- {
- column++;
- if (!column)
- error (EXIT_FAILURE, 0, _("input line is too long"));
- }
-
- if (pending)
- {
- if (pending > 1 && one_blank_before_tab_stop)
- pending_blank[0] = '\t';
- if (fwrite (pending_blank, 1, pending, stdout) != pending)
- error (EXIT_FAILURE, errno, _("write error"));
+ previous_non_blank = false;
pending = 0;
- one_blank_before_tab_stop = false;
+ putchar ('\t');
}
-
- prev_blank = blank;
- convert &= convert_entire_line || blank;
- }
-
- if (c < 0)
- {
- free (pending_blank);
- return;
}
-
- if (putchar (c) < 0)
- error (EXIT_FAILURE, errno, _("write error"));
+ column++;
+ if (!column)
+ error (EXIT_FAILURE, 0, _("input line is too long"));
}
while (c != '\n');
}
--
2.8.0
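The tab-stop update at the top of the rewritten loop can be sketched in isolation as follows. This is an illustrative sketch rather than code from the patch: the function name `next_tab_stop`, its parameters, and the `NO_TAB_STOP` sentinel are hypothetical stand-ins for the patch's in-line logic and its `-1` assignment (assuming `next_tab_column` is unsigned, as `column` is). Unlike the patch, the sketch skips list entries at or before the current column, as the code being removed did.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no further tab stops".  Assigning -1
   to an unsigned column type wraps to this same value.  */
#define NO_TAB_STOP UINTMAX_MAX

/* Return the first tab stop strictly greater than COLUMN.
   If TAB_SIZE is nonzero, stops occur every TAB_SIZE columns.
   Otherwise scan the ascending list TAB_LIST of N_TABS entries,
   resuming at *TAB_INDEX and advancing it past each entry examined.
   Return NO_TAB_STOP once the list is exhausted.  */
static uintmax_t
next_tab_stop (uintmax_t column, uintmax_t tab_size,
               const uintmax_t *tab_list, size_t n_tabs,
               size_t *tab_index)
{
  if (tab_size)
    return column + (tab_size - column % tab_size);

  assert (tab_index != NULL);
  while (*tab_index < n_tabs)
    {
      uintmax_t tab = tab_list[(*tab_index)++];
      if (column < tab)
        return tab;
    }
  return NO_TAB_STOP;
}
```

For example, with the stop list 5,8 and the index starting at 0, a call at column 1 yields 5 and a following call at column 5 yields 8; any later call yields `NO_TAB_STOP`, which corresponds to the point where conversion stops.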
bug-coreutils@HIDDEN:bug#23335; Package coreutils.
From: Jonathan Buchanan <jonathan.russ.buchanan@HIDDEN>
To: bug-coreutils@HIDDEN
Subject: [PATCH 1/2] tests: Added two new tests for unexpand from TODO
Date: Thu, 21 Apr 2016 20:33:35 -0400
Message-Id: <1461285216-14656-1-git-send-email-jonathan.russ.buchanan@HIDDEN>
X-Mailer: git-send-email 2.8.0
* tests/misc/unexpand.pl: Added two tests from TODO (b-2 and b-3) that
should pass according to the specification but currently fail.
---
tests/misc/unexpand.pl | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tests/misc/unexpand.pl b/tests/misc/unexpand.pl
index c592c5a..2cd84a1 100755
--- a/tests/misc/unexpand.pl
+++ b/tests/misc/unexpand.pl
@@ -48,6 +48,8 @@ my @Tests =
['aa-8', '-a', {IN=> 'w'.' 'x 8 ."y\n"}, {OUT=> "w\t y\n"}],
['b-1', '-t', '2,4', {IN=> " ."}, {OUT=>"\t\t ."}],
+ ['b-2', '-t', '8,9', {IN=> "x\t \t y\n"}, {OUT=>"x\t \t y\n"}],
+ ['b-3', '-t', '5,8', {IN=> "x\t \t y\n"}, {OUT=>"x\ty\n"}],
# These would infloop prior to textutils-2.0d.
['infloop-1', '-t', '1,2', {IN=> " \t\t .\n"}, {OUT=>"\t\t\t .\n"}],
--
2.8.0