From: Chris Elvidge <celvidge001@HIDDEN>
To: 44704 <at> debbugs.gnu.org
Cc: "Brian J. Murrell" <brian@HIDDEN>
Subject: Re: bug#44704: uniq: replace repeated lines with a message about how many repeated lines
Date: Wed, 18 Nov 2020 11:25:11 +0000

On 17/11/2020 01:32 pm, Brian J. Murrell wrote:
> It would be a useful enhancement to uniq to replace all lines
> considered non-uniq (i.e. those that would be removed from the output)
> with a message about how many times the previous line was repeated.
>
> I.e.
>
> $ cat <<EOF | uniq --replace-with-message '[previous line repeated %d times]'
> first line
> second line
> repeated line
> repeated line
> repeated line
> repeated line
> repeated line
> third line
> EOF
> first line
> second line
> repeated line
> [previous line repeated 4 times]
> third line
>
> Cheers,
> b.

You could write your own function to do it. E.g.

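unique() {
    # Require a readable file as the first argument.
    [ -r "$1" ] || { echo "Needs a readable file to test" >&2; return 1; }
    local R="" N=0 L first=1
    # IFS= and -r keep leading whitespace and backslashes intact;
    # the || test also handles a last line with no trailing newline.
    while IFS= read -r L || [ -n "$L" ]; do
        # Same as the previous line: count it and print nothing.
        if [ -z "$first" ] && [ "$L" = "$R" ]; then N=$((N + 1)); continue; fi
        [ "$N" -gt 0 ] && { echo "[Previous line repeated $N times]"; N=0; }
        first=""; R="$L"
        printf '%s\n' "$L"
    done <"$1"
    # Report a run of repeats that ends at end of file.
    [ "$N" -gt 0 ] && echo "[Previous line repeated $N times]"
    return 0
}

-- 
Chris Elvidge
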
bug-coreutils@HIDDEN: bug#44704; Package coreutils.

From: "Brian J. Murrell" <brian@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: 44704 <at> debbugs.gnu.org
Subject: Re: bug#44704: uniq: replace repeated lines with a message about how many repeated lines
Date: Tue, 17 Nov 2020 17:18:07 -0500

On Tue, 2020-11-17 at 14:10 -0800, Paul Eggert wrote:
> On 11/17/20 5:32 AM, Brian J. Murrell wrote:
> > [previous line repeated 4 times]
>
> uniq -c already does something like that, though it outputs "5"
> instead of "4".

Right.  I had considered that.  Something like:

$ cat /tmp/in | uniq -c | while read c line; do
>     echo $line
>     if [ $c -gt 1 ]; then
>         echo "Last line repeated $((c-1)) times"
>     fi
> done

But that eats leading whitespace on $line.

> Not sure it's worth gussying up 'uniq' to provide exactly the
> functionality requested, as output reformatting is easy enough
> to do yourself using awk or Python or whatever.

Right.  But if I were going to pull out such a big hammer, I'd just,
again, eliminate uniq and do everything in awk or Python or whatever.

Anyway, it was just a suggestion.  Doesn't seem like it will go much of
anywhere.  That's fine.  If it really itched me enough, I guess I'd
just submit a patch.

Cheers,
b.

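Editorial sketch, not part of the thread: the whitespace loss in the "uniq -c | while read c line" loop comes from read splitting its input on whitespace. One possible way around it is to strip the count with awk instead. This assumes the count is followed by exactly one space, as in GNU uniq's "-c" output; the function name count_repeats is hypothetical.

```shell
# Hypothetical helper: collapse runs of adjacent duplicate lines read from
# standard input via `uniq -c`, keeping each line's leading whitespace.
count_repeats() {
    uniq -c | awk '
        {
            # uniq -c emits optional padding, the count, then one space.
            match($0, /^ *[0-9]+ /)
            count = substr($0, 1, RLENGTH) + 0
            line  = substr($0, RLENGTH + 1)
            print line
            if (count > 1)
                printf "[previous line repeated %d times]\n", count - 1
        }'
}
```

It would be called as "count_repeats < /tmp/in". Subtracting one from the count gives the number of suppressed repeats rather than the group size, which is the "5" versus "4" difference discussed in the thread.
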
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
To: "Brian J. Murrell" <brian@HIDDEN>
Cc: 44704 <at> debbugs.gnu.org
Subject: Re: bug#44704: uniq: replace repeated lines with a message about how many repeated lines
Date: Tue, 17 Nov 2020 14:10:56 -0800

On 11/17/20 5:32 AM, Brian J. Murrell wrote:
> [previous line repeated 4 times]

uniq -c already does something like that, though it outputs "5" instead
of "4".

Not sure it's worth gussying up 'uniq' to provide exactly the
functionality requested, as output reformatting is easy enough to do
yourself using awk or Python or whatever.

From: "Brian J. Murrell" <brian@HIDDEN>
To: Assaf Gordon <assafgordon@HIDDEN>, 44704 <at> debbugs.gnu.org
Subject: Re: bug#44704: uniq: replace repeated lines with a message about how many repeated lines
Date: Tue, 17 Nov 2020 10:28:53 -0500

On Tue, 2020-11-17 at 08:05 -0700, Assaf Gordon wrote:
>
> Hello,

Hi,

> uniq supports the "--group" option, which adds a blank line after
> each group of identical lines - this can be used down-stream to
> process groups in any way you want.

But there is no way to have it remove the repeated lines also, correct?

By down-stream process, I feel like you are leaving it up to the
down-stream to remove the duplicate lines as well as add the
"repeated %s times" messages.  Is that correct?  If so, uniq really
adds no value.  The down-stream might as well just do the adjacent line
comparison also in such a case.

> And with counting:
>
> $ cat in | uniq --group=append \
>     | awk 'BEGIN { c = 0 } ;
>            $0=="" { print "Group has " c " lines" ; c=0 ; next } ;
>            1 { print ; c++ }'
> first line
> Group has 1 lines
> second line
> Group has 1 lines
> repeated line
> repeated line
> repeated line
> repeated line
> repeated line
> Group has 5 lines
> third line
> Group has 1 lines

This still doesn't really achieve the original stated goal as the
repeated lines are not being replaced by your "Group has %d lines".

I think once you add the repeated line suppression, you will see that
adding a simple adjacent line comparison, and just not using uniq at
all, is only slightly more work in the down-stream (which is now the
main).

Cheers,
b.

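Editorial sketch, not part of the thread: the "adjacent line comparison" without uniq that the message above describes could look roughly like this in awk. The function name collapse_repeats and the message wording are assumptions chosen to match the original request.

```shell
# Hypothetical helper: print each line once and replace the repeats that
# follow it with a single message, without calling uniq at all.
collapse_repeats() {
    awk '
        # Emit the pending message for the run that just ended, if any.
        function report() {
            if (repeats > 0)
                printf "[previous line repeated %d times]\n", repeats
            repeats = 0
        }
        # Appending "" forces a string comparison, so "1" and "1.0" differ.
        NR > 1 && ($0 "") == (prev "") { repeats++; next }
        { report(); print; prev = $0 }
        END { report() }
    ' "$@"
}
```

It reads standard input or the files named as arguments, e.g. "collapse_repeats in". Because awk leaves $0 untouched, leading whitespace survives, and the END rule covers a run of repeats that ends at end of input.
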
From: Assaf Gordon <assafgordon@HIDDEN>
To: "Brian J. Murrell" <brian@HIDDEN>, 44704 <at> debbugs.gnu.org
Subject: Re: bug#44704: uniq: replace repeated lines with a message about how many repeated lines
Date: Tue, 17 Nov 2020 08:05:46 -0700

tag 44704 notabug
severity 44704 wishlist
stop

Hello,

On 2020-11-17 6:32 a.m., Brian J. Murrell wrote:
> It would be a useful enhancement to uniq to replace all lines
> considered non-uniq (i.e. those that would be removed from the output)
> with a message about how many times the previous line was repeated.
>
> I.e.
>
> $ cat <<EOF | uniq --replace-with-message '[previous line repeated %d times]'
[...]

uniq supports the "--group" option, which adds a blank line after each
group of identical lines - this can be used down-stream to process
groups in any way you want.

Example:

$ cat <<EOF > in
first line
second line
repeated line
repeated line
repeated line
repeated line
repeated line
third line
EOF

$ cat in | uniq --group=append
first line

second line

repeated line
repeated line
repeated line
repeated line
repeated line

third line

$ cat in | uniq --group=append \
    | awk '$0=="" { print "do something after group" ; next } ;
           1 { print }'
first line
do something after group
second line
do something after group
repeated line
repeated line
repeated line
repeated line
repeated line
do something after group
third line
do something after group

And with counting:

$ cat in | uniq --group=append \
    | awk 'BEGIN { c = 0 } ;
           $0=="" { print "Group has " c " lines" ; c=0 ; next } ;
           1 { print ; c++ }'
first line
Group has 1 lines
second line
Group has 1 lines
repeated line
repeated line
repeated line
repeated line
repeated line
Group has 5 lines
third line
Group has 1 lines

Hope this helps.

More information about "uniq --group=X" is here:
https://www.gnu.org/software/coreutils/manual/html_node/uniq-invocation.html

I'm marking this as "notabug/wishlist", but will likely close soon as
"wontfix" unless we come up with a convincing argument why "--group" is
not sufficient for your use case.

Regardless of the status, discussion can continue by replying to this
thread.

regards,
 - assaf

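Editorial sketch, not part of the thread: the "--group" approach can be pushed one step further so that the down-stream also suppresses the repeats, which is what the original request asked for. This assumes GNU uniq with "--group" support and input that contains no empty lines, since the blank group delimiter cannot be told apart from an empty input line. The function name group_repeats is hypothetical.

```shell
# Hypothetical helper built on `uniq --group=append` (GNU coreutils).
# Assumes standard input contains no empty lines.
group_repeats() {
    uniq --group=append | awk '
        # A blank line closes the current group.
        $0 == "" {
            if (count > 1)
                printf "[previous line repeated %d times]\n", count - 1
            count = 0
            next
        }
        # Print only the first line of each group; count the rest.
        count++ == 0 { print }
    '
}
```

With the sample file from the message above, "group_repeats < in" would print the three distinct leading lines, one "[previous line repeated 4 times]" message after "repeated line", and then "third line".
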
From: "Brian J. Murrell" <brian@HIDDEN>
To: bug-coreutils@HIDDEN
Subject: uniq: replace repeated lines with a message about how many repeated lines
Date: Tue, 17 Nov 2020 08:32:36 -0500

It would be a useful enhancement to uniq to replace all lines
considered non-uniq (i.e. those that would be removed from the output)
with a message about how many times the previous line was repeated.

I.e.

$ cat <<EOF | uniq --replace-with-message '[previous line repeated %d times]'
first line
second line
repeated line
repeated line
repeated line
repeated line
repeated line
third line
EOF
first line
second line
repeated line
[previous line repeated 4 times]
third line

Cheers,
b.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.