GNU bug report logs - #5832: Feature request: uniq -k
To reply to this bug, email your comments to 5832 AT debbugs.gnu.org.
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5832; Package coreutils. (Sat, 03 Apr 2010 18:50:03 GMT)
Acknowledgement sent to Raphael Clifford <drraph <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 03 Apr 2010 18:50:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Please excuse the cross-post, but I have been told this is the
appropriate place to file a feature request.

Is it possible to make a feature request for uniq to add a "-k"
option to specify fields? Interestingly, uniq already has such things as

  -f, --skip-fields=N
         avoid comparing the first N fields

and

  -s, --skip-chars=N
         avoid comparing the first N characters

but no explicit option to specify which fields should be considered
when doing the comparison. This would be very useful, for example,
when removing duplicates from time series data (where you only
care about consecutive duplicates on certain fields). The awk
equivalent would be something like

  awk '$2$3$4$5 != p; {p=$2$3$4$5}'

for using fields 2 to 5 as comparators.

Raphael

P.S. http://www.opengroup.org/onlinepubs/9699919799/utilities/uniq.html
is the POSIX specification for uniq, if that is of any interest.
Curiously, it says nothing about which duplicate line to keep when you
don't consider all fields in the comparison.
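
The awk one-liner in this message joins fields 2 to 5 with no separator, so lines such as `x ab c d e` and `x a bc d e` produce the same key, and a first line whose key is empty is dropped because `p` starts out empty. A minimal sketch that avoids both, assuming whitespace-separated fields; the function name `uniq_fields_2_5` is hypothetical and is not part of coreutils:

```shell
# Print a line only when fields 2-5 differ from those of the previous line.
# uniq_fields_2_5 is an illustrative name, not an existing tool.
uniq_fields_2_5() {
    awk '{
        # Join with SUBSEP so adjacent fields cannot run together.
        key = $2 SUBSEP $3 SUBSEP $4 SUBSEP $5
        # Always keep the first line; afterwards keep only key changes.
        if (NR == 1 || key != prev) print
        prev = key
    }' "$@"
}
```

For example, `printf '%s\n' 't1 a b c d' 't2 a b c d' 't3 a b c e' | uniq_fields_2_5` keeps the `t1` and `t3` lines: like uniq, it keeps the first line of each run.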
Severity set to 'wishlist' from 'normal'. Request was from bob <at> proulx.com (Bob Proulx) to control <at> debbugs.gnu.org. (Sat, 03 Apr 2010 21:43:02 GMT)
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5832; Package coreutils. (Sun, 04 Apr 2010 14:31:03 GMT)
Message #10 received at 5832 <at> debbugs.gnu.org (full text, mbox):
This might be relevant:
uniq: missing option -W / --check-fields=N
http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00168.html
Steve
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5832; Package coreutils. (Mon, 05 Apr 2010 09:39:02 GMT)
Message #13 received at 5832 <at> debbugs.gnu.org (full text, mbox):
Yes, http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00211.html
in particular is pretty much exactly the same feature request.

What is the current thinking on this?

Raphael
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5832; Package coreutils. (Fri, 09 Apr 2010 06:43:02 GMT)
Message #16 received at 5832 <at> debbugs.gnu.org (full text, mbox):
Raphael Clifford wrote:
> Yes http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00211.html
> in particular is pretty much exactly the same feature request.
>
> What is the current thinking on this?
uniq's -k is still something we'd like.
>> uniq: missing option -W / --check-fields=N
>> http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00168.html
I glanced through most of that thread, and the guidance is still valid.
If you are interested, be sure to start the copyright
assignment paperwork:
http://git.savannah.gnu.org/cgit/coreutils.git/tree/HACKING#n327 copyright
and to read/follow the other guidelines in HACKING.
2nd most important: to save yourself the pain of reworking big chunks
of code, and to keep review request size manageable, I suggest
you keep the mailing list in the loop on what you're doing/planning.
Information forwarded to bug-coreutils <at> gnu.org: bug#5832; Package coreutils. (Mon, 26 Dec 2011 17:43:04 GMT)
Message #19 received at 5832 <at> debbugs.gnu.org (full text, mbox):
On 12/26/11 08:35, Pádraig Brady wrote:
> supporting --key would not provide this functionality.
It would support it in the most common cases, no?
That is, if every line has (say) 10 fields, then
the proposed 'uniq -F3' would be equivalent to
the proposed 'uniq -k1,7'.
I can't offhand think of good use cases for uniq -F
that would not be subsumed by uniq -k.
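
Neither `-F` nor `-k` exists in uniq, so the equivalence claimed above can only be sketched. The awk helpers below are hypothetical: `uniq_head_fields M` assumes `-k1,M` would mean "compare fields 1 to M", and `uniq_drop_tail_fields N` assumes `-FN` would mean "compare everything except the last N fields", as the thread describes it. Fields are assumed to be whitespace-separated.

```shell
# Hypothetical helpers; neither -F nor -k exists in uniq.

# uniq_head_fields M: compare only fields 1..M (assumed meaning of -k1,M).
uniq_head_fields() {
    awk -v m="$1" '{
        key = ""
        for (i = 1; i <= m && i <= NF; i++) key = key SUBSEP $i
        if (NR == 1 || key != prev) print
        prev = key
    }'
}

# uniq_drop_tail_fields N: compare all but the last N fields (assumed meaning of -FN).
uniq_drop_tail_fields() {
    awk -v n="$1" '{
        key = ""
        for (i = 1; i <= NF - n; i++) key = key SUBSEP $i
        if (NR == 1 || key != prev) print
        prev = key
    }'
}
```

On input where every line has exactly 10 fields, `uniq_head_fields 7` and `uniq_drop_tail_fields 3` build the same key for every line and so print the same lines. Once field counts vary from line to line, the two keys no longer cover the same fields.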
Information forwarded to bug-coreutils <at> gnu.org: bug#5832; Package coreutils. (Mon, 26 Dec 2011 18:07:02 GMT)
Message #22 received at 5832 <at> debbugs.gnu.org (full text, mbox):
On 12/26/2011 05:39 PM, Paul Eggert wrote:
> On 12/26/11 08:35, Pádraig Brady wrote:
>> supporting --key would not provide this functionality.
>
> It would support it in the most common cases, no?
> That is, if every line has (say) 10 fields, then
> the proposed 'uniq -F3' would be equivalent to
> the proposed 'uniq -k1,7'.
That's what I thought at first too,
but then why didn't Adrien propose the
more normal --check-fields=7 rather than
the unusual -F3?
> I can't offhand think of good use cases for uniq -F
> that would not be subsumed by uniq -k.
Me neither. Having a variable number of fields per line,
but ignoring the last constant N fields, is very unusual,
which is why I asked for a concrete example.
Personally I'm leaning towards suggesting that `rev | uniq -f N | rev`
is fine for this edge case.
cheers,
Pádraig.
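
The `rev | uniq -f | rev` workaround mentioned above can be sketched as a small wrapper. `rev` reverses the characters of each line, so the last N fields become the first N and `uniq -f N` skips them; the second `rev` restores the surviving lines. The wrapper name `uniq_ignore_last` is illustrative only, and `rev` is not part of coreutils (on GNU/Linux it usually comes from util-linux).

```shell
# Drop consecutive lines that differ only in their last N fields.
# uniq_ignore_last is an illustrative wrapper name, not an existing tool.
uniq_ignore_last() {
    # $1 = number of trailing fields to ignore
    rev | uniq -f "$1" | rev
}
```

For example, `printf '%s\n' 'a b 1' 'a b 2' 'x c d 3' | uniq_ignore_last 1` keeps `a b 1` and `x c d 3`, even though the lines have different numbers of fields. The remaining fields are compared in reversed form, which does not affect an equality test.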
This bug report was last modified 13 years and 151 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.