GNU bug report logs - #37754
wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)

Package: grep; Severity: wishlist; Reported by: "Trent W. Buck" <trentbuck@HIDDEN>; dated Tue, 15 Oct 2019 01:49:01 UTC; Maintainer for grep is bug-grep@HIDDEN.

Message received at 37754 <at> debbugs.gnu.org:


Date: Sat, 19 Oct 2019 15:57:39 +0900
From: Norihiro Tanaka <noritnk@HIDDEN>
To: "Trent W. Buck" <trentbuck@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-Id: <20191019155738.1011.27F6AC2D@HIDDEN>
In-Reply-To: <20191015014817.GA3082@HIDDEN>


On Tue, 15 Oct 2019 12:48:17 +1100
"Trent W. Buck" <trentbuck@HIDDEN> wrote:

> Package: grep
> Version: 3.3-1
> Severity: wishlist
> 
> This bug was originally reported as
> https://bugs.debian.org/940464
> 
> Trent W. Buck wrote:
> > (Surely someone has already asked for this, but I can't see where.
> > I may have already reported this myself, and forgotten.
> > If so, sorry!)
> >
> > Right now if you do
> >
> >     grep -eX -eY -eZ
> >
> > You'll get lines that match *any of* X, Y, or Z.
> > Quite often I want to search for lines that match *all of* X, Y, and Z, but in any order.
> > For example,
> >
> >     # all 2TB 2.5-inch SATA products
> >     grep -Fwi -eSATA -e2TB -e2.5in products.csv
> >
> > Below is a short discussion of the workarounds I know about.
> >
> > Is "grep --and" something that has already been discussed and rejected?
> > I looked through debbugs.gnu.org and the source tarball, but
> > I couldn't find anything about this.
> >
> >
> > PS: grep -v --and would intuitively mean "not all",
> > i.e. "grep -v --and -eX -eY" would return lines matching X *or* Y, but
> > omit lines matching *both* X and Y.
> >
> > PS: I can't decide if "--and" or "--intersection" is a better name.
> > I put both in the bug subject so people searching for either will find this ticket.
> > I think "--all" is probably too confusing.
> >
> >
> >
> > Workaround #1
> > =============
> > I can work around this by listing every possible order, but 1) this
> > scales poorly with the number of patterns; and 2) it can't be used
> > with -F.  For example,
> >
> >     grep --and -eX -eY -eZ input*.txt   # becomes
> >
> >     grep -eZ.*Y.*X \
> >          -eZ.*X.*Y \
> >          -eY.*Z.*X \
> >          -eY.*X.*Z \
> >          -eX.*Z.*Y \
> >          -eX.*Y.*Z \
> >          input*.txt
> >
> >
> > Workaround #2
> > =============
> > I can pipe greps together.  This is what I currently do.
> > This is more convenient and feels faster than workaround #1, but
> > I suspect the inter-process overhead is significant.
> >
> > If grep implemented this internally, it could zero-copy.
> > Being able to "grep -rnH --and" &c would also be convenient.
> >
> > For example,
> >
> >     grep --and -F -eX -eY -eZ input*.txt   # becomes
> >
> >     cat input*.txt |
> >     grep -F -eX |
> >     grep -F -eY |
> >     grep -F -eZ
> 
> 

> Workaround #1
> =============
> I can work around this by listing every possible order, but 1) this
> scales poorly with the number of patterns; and 2) it can't be used
> with -F.  For example,
>
>     grep --and -eX -eY -eZ input*.txt   # becomes
>
>     grep -eZ.*Y.*X \
>          -eZ.*X.*Y \
>          -eY.*Z.*X \
>          -eY.*X.*Z \
>          -eX.*Z.*Y \
>          -eX.*Y.*Z \
>          input*.txt

I have noticed that the above two do not necessarily produce the same results.
For example:

    grep --and -e123 -e234 input*.txt

    grep -e '123.*234' -e '234.*123' input*.txt

A line containing "1234" matches the first but not the second, because the
matches for 123 and 234 overlap.
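
The difference comes from overlap: in "1234" the matches for 123 and 234
share the characters "23", while an ordered pattern such as '123.*234' needs
the two matches to be disjoint.  A small sketch, assuming the piped-grep
workaround is an acceptable stand-in for the proposed --and (which grep does
not have); the function names are only illustrative:

```shell
# Stand-in for the proposed "grep --and -e123 -e234": one grep per pattern.
and_by_pipeline() {
    grep -e '123' | grep -e '234'
}

# Workaround #1 for two patterns: one alternative per ordering.
and_by_orderings() {
    grep -e '123.*234' -e '234.*123'
}

printf '1234\n' | and_by_pipeline                       # prints the line
printf '1234\n' | and_by_orderings || echo 'no match'   # grep finds nothing
```

So the permutation workaround is only equivalent when the patterns cannot
overlap in the input.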





Information forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.

Message received at submit <at> debbugs.gnu.org:


Date: Fri, 18 Oct 2019 17:35:38 -0500
From: "Paul Jackson" <pj@HIDDEN>
To: bug-grep@HIDDEN
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-Id: <5e64c0d8-3a8f-4b31-a08a-05f335915878@HIDDEN>
In-Reply-To: <4dc7854d-a8c6-5d0d-59de-8b5e27d23280@HIDDEN>

I'm currently working on rewriting and packaging up a tool that
I use to handle such high-volume [and/or/not] filters on long lists
of file pathnames and of log file entries.  It's a tool I've had in
my private toolbox for decades.  I call it "ftest".  It has a rich set
of "test"-like flags for testing stat(2) attributes of files, but is
optimized for working in pipelines (as a filter, hence the "f").

Trent - do you need regular expression matching, or is glob matching
easily sufficient, or would even just fixed string matching be useful?

For [and/or/not] logical combinations of full regular expressions, 
I'll probably continue to use awk, as Paul Eggert suggested, though
that might be because I've long been an awk user, since teaching
an awk class to other engineers inside Bell Labs, some 40 years ago.
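
As a rough sketch of the awk approach mentioned above: awk lets regular
expressions be combined with &&, || and ! inside a single process.  The
patterns X, Y and Z and the function names are placeholders for
illustration, and have nothing to do with ftest:

```shell
# Lines matching X and Y but not Z, in any order, in one awk process.
x_and_y_not_z() {
    awk '/X/ && /Y/ && !/Z/'
}

# Lines matching all three regular expressions, in any order.
all_of_xyz() {
    awk '/X/ && /Y/ && /Z/'
}

printf 'aXbY\nYZX\nX only\n' | x_and_y_not_z   # prints: aXbY
printf 'aXbY\nYZX\nX only\n' | all_of_xyz      # prints: YZX
```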

Perhaps sometime, months into the future, I'll follow up with an
update pointing to my "ftest" command on GitHub.

-- 
                Paul Jackson
                pj@HIDDEN




Information forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.

Message received at 37754 <at> debbugs.gnu.org:


Date: Fri, 18 Oct 2019 10:51:29 -0700
From: Paul Eggert <eggert@HIDDEN>
To: "Trent W. Buck" <trentbuck@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <4dc7854d-a8c6-5d0d-59de-8b5e27d23280@HIDDEN>
In-Reply-To: <20191018114921.GA21858@HIDDEN>

On 10/18/19 4:49 AM, Trent W. Buck wrote:
> In that case, each grep in the pipeline has to pay the costs to
> de-serialize input from the previous grep

Sure, but grep is designed to be a simple tool and we need to draw the 
line somewhere. For something more complicated there are already sed and 
awk (if you want to write to POSIX) or Perl or Python or whatever.
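
A minimal sketch of the sed route, assuming X, Y and Z are placeholder basic
regular expressions and that the function name is only illustrative: each
expression deletes the line as soon as one pattern fails to match, so only
lines matching all of them survive, in any order and in a single process.

```shell
# Keep only lines that match every pattern; "!d" deletes non-matching lines.
sed_all_of_xyz() {
    sed -e '/X/!d' -e '/Y/!d' -e '/Z/!d'
}

printf 'ZYX\nXY\nZ X Y\n' | sed_all_of_xyz   # prints ZYX and Z X Y
```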

I mildly prefer the A\&B notation because it could be used 
everywhere, not just in grep. (But of course someone would have to 
implement it. :-)




Information forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.

Message received at 37754 <at> debbugs.gnu.org:


Date: Fri, 18 Oct 2019 22:49:23 +1100
From: "Trent W. Buck" <trentbuck@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <20191018114921.GA21858@HIDDEN>
In-Reply-To: <7038d337-f127-74ef-aff6-782f2ad42969@HIDDEN>

Paul Eggert wrote:
> On 10/16/19 5:19 PM, Trent W. Buck wrote:
> > I would expect "grep -Fw -e 4GB -e DDR4 --and" to print the same thing as
> >
> >      grep -Fw 4GB | grep -Fw DDR4 | grep -Fw -e 4GB -e DDR4 -o
>
> You're right, it's not obvious. :-)
>
> It may be better to just pipe greps together, as you do now. That's simple
> and fast enough for this relatively-uncommon case, and it's portable to all
> greps.

I admit that most of the time, I want "grep --and" for a small dataset
(<1MB computer_parts.txt), so it's merely a convenience.

Sometimes I grep audit logs (~1TB uncompressed), which takes anywhere
from 15 minutes to 3 days, depending on how I tweak my grep calls.

In that case, each grep in the pipeline has to pay the costs to
de-serialize input from the previous grep, and re-serialize output to
the next grep.  If the first grep matches (say) 200GB of the 1TB,
that can be a lot of overhead (I assume).

I was basically hoping that if it was all in a single grep process,
the de/serialization steps could be skipped completely.
I think the buzzword for that is "zero-copy"?

I've noticed "grep" is about 30% slower than either "grep -F" or
"LC_COLLATE=C grep", because (I think) those two avoid the cost of
decoding from UTF-8 to Unicode and back.  So I was basically expecting a
similar saving from --and.

I'm only speaking as an end user - I haven't dug through the grep
source, so those expectations might be unrealistic, and implementing
it might be painful/impossible.  I figured I should at least ask :-)

If your expert opinion is that it's a pain to implement (and
maintain!) and there's not enough demand, then I'm OK with that.
This is NOT something that's burning me every day.

Regardless, I appreciate you taking the time to discuss it. :-)


PS: Regarding portability, I'm personally not worried because when I
need a GNUism badly enough (e.g. du --threshold), I can usually get
permission to install the relevant GNU software, even if it's only
into %APPDATA% or $HOME.

PS: I noticed on bugs.gnu.org something about grep being
single-threaded, which might mean "grep --and" would end up being
SLOWER than the existing pipelines, since the kernel can distribute
a pipeline's elements across multiple CPUs/cores.




Information forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.

Message received at 37754 <at> debbugs.gnu.org:


Date: Thu, 17 Oct 2019 01:27:35 -0700
From: Paul Eggert <eggert@HIDDEN>
To: "Trent W. Buck" <trentbuck@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <7038d337-f127-74ef-aff6-782f2ad42969@HIDDEN>
In-Reply-To: <20191017001952.GA2991@HIDDEN>

On 10/16/19 5:19 PM, Trent W. Buck wrote:
> I would expect "grep -Fw -e 4GB -e DDR4 --and" to print the same thing as
> 
>      grep -Fw 4GB | grep -Fw DDR4 | grep -Fw -e 4GB -e DDR4 -o

You're right, it's not obvious. :-)

It may be better to just pipe greps together, as you do now. That's simple and 
fast enough for this relatively-uncommon case, and it's portable to all greps.




Information forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.

Message received at 37754 <at> debbugs.gnu.org:


Date: Thu, 17 Oct 2019 11:19:54 +1100
From: "Trent W. Buck" <trentbuck@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: Norihiro Tanaka <noritnk@HIDDEN>, 37754 <at> debbugs.gnu.org
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <20191017001952.GA2991@HIDDEN>
In-Reply-To: <07cfaedb-b581-c46e-6f47-433ca8f3a76d@HIDDEN>

Paul Eggert wrote:
> Wouldn't it be more useful to have an intersection operator in regular
> expressions? That is, the pattern 'A\&B' would match anything that is
> matched by both A and B. If A and B have parenthesized subexpressions, both
> sets of parentheses would match and would count.

Not for me personally, because I almost always want to use it with -Fwi :-)

(-F is a lot faster - about as fast as LC_COLLATE=C - and it also means
I don't have to think about escaping special characters.)

> [...]
>
> This approach would allow intersection to be nested inside other operations.
> Also, it would clarify how other features work. For example, grep -o has
> clear semantics with this approach, whereas the semantics of grep -o are not
> so clear with the proposed --and option.

I hadn't thought about -o, and I agree that is not very obvious.

Given an input file like

   30$	Gamdias EROS (M2) USB Multi-Color Lighting Gaming Headset
   30$	Gamdias POSEIDON E1 Gaming Combo 3-in-1 K/B+3200dpi Optical Mouse+Stereo Headset
   30$	GeIL (GP34GB1600C11SC) 4GB DDR3 1600 Desktop RAM
   30$	GeIL Pristine (GP44GB2400C17SC) 4GB Single DDR4 2400 Desktop RAM
   30$	GeIL SO-DIMM 4GB (GGS34GB1600C11SC) 1.35V (Low Voltage) 4GB DDR3 1600 Notebook Ram

where "grep -Fw -e 4GB -e DDR4 -o" currently prints

    4GB
    4GB
    DDR4
    4GB
    4GB

I would expect "grep -Fw -e 4GB -e DDR4 -o --and" to print the same thing as

    grep -Fw 4GB | grep -Fw DDR4 | grep -Fw -e 4GB -e DDR4 -o

i.e.

    4GB
    DDR4




Information forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.

Message received at 37754 <at> debbugs.gnu.org:


Received: (at 37754) by debbugs.gnu.org; 16 Oct 2019 18:57:46 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 16 14:57:46 2019
Received: from localhost ([127.0.0.1]:46551 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1iKoUU-0008TT-Ag
	for submit <at> debbugs.gnu.org; Wed, 16 Oct 2019 14:57:46 -0400
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:42136)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eggert@HIDDEN>) id 1iKoUR-0008TF-VB
 for 37754 <at> debbugs.gnu.org; Wed, 16 Oct 2019 14:57:45 -0400
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 4EAFA160624;
 Wed, 16 Oct 2019 11:57:37 -0700 (PDT)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id VRGuNKRSKZ8s; Wed, 16 Oct 2019 11:57:36 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 94C32160629;
 Wed, 16 Oct 2019 11:57:36 -0700 (PDT)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id Fe0vmkAc8eig; Wed, 16 Oct 2019 11:57:36 -0700 (PDT)
Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200])
 by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 79D54160624;
 Wed, 16 Oct 2019 11:57:36 -0700 (PDT)
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
To: Norihiro Tanaka <noritnk@HIDDEN>, "Trent W. Buck" <trentbuck@HIDDEN>
References: <156859974060.2726.6189784814472309967.reportbug@HIDDEN>
 <20191015014817.GA3082@HIDDEN> <20191016212619.31E9.27F6AC2D@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <07cfaedb-b581-c46e-6f47-433ca8f3a76d@HIDDEN>
Date: Wed, 16 Oct 2019 11:57:31 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.1.1
MIME-Version: 1.0
In-Reply-To: <20191016212619.31E9.27F6AC2D@HIDDEN>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 37754
Cc: 37754 <at> debbugs.gnu.org
X-Spam-Score: -3.3 (---)

Wouldn't it be more useful to have an intersection operator in regular 
expressions? That is, the pattern 'A\&B' would match anything that is 
matched by both A and B. If A and B have parenthesized subexpressions, 
both sets of parentheses would match and would count.

Assuming concatenation has higher precedence than \&, the requested 
behavior could be achieved via:

   grep '.*X.*\&.*Y.*\&.*Z.*'

This approach would allow intersection to be nested inside other 
operations. Also, it would clarify how other features work. For example, 
grep -o has clear semantics with this approach, whereas the semantics of 
grep -o are not so clear with the proposed --and option.
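
For comparison, the '\&' operator above is a proposal and is not something
grep implements. A line-level "all of" test can already be written with POSIX
awk, which lets regular-expression matches be combined with &&. The sketch
below uses the placeholder patterns X, Y and Z from this thread; match_all is
a hypothetical helper name.

```shell
#!/bin/sh
# Print lines that match all three regular expressions, in any order.
# X, Y and Z are the placeholder patterns used in this bug report.
match_all() {
    awk '/X/ && /Y/ && /Z/' "$@"
}

# Two of these four lines contain X, Y and Z.
printf '%s\n' 'aXbYcZ' 'ZYX' 'X and Y only' 'nothing' | match_all
```

This selects whole lines only. It has no equivalent of grep -o, which is the
case where the two proposals differ.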




Information forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.

Message received at 37754 <at> debbugs.gnu.org:


Received: (at 37754) by debbugs.gnu.org; 16 Oct 2019 12:26:36 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 16 08:26:36 2019
Received: from localhost ([127.0.0.1]:45367 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1iKiNv-00070E-Nw
	for submit <at> debbugs.gnu.org; Wed, 16 Oct 2019 08:26:36 -0400
Received: from mailgw07.kcn.ne.jp ([61.86.7.214]:56578)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <noritnk@HIDDEN>) id 1iKiNt-0006zz-OX
 for 37754 <at> debbugs.gnu.org; Wed, 16 Oct 2019 08:26:34 -0400
Received: from mxs02-s (mailgw2.kcn.ne.jp [61.86.15.234])
 by mailgw07.kcn.ne.jp (Postfix) with ESMTP id 0556541013
 for <37754 <at> debbugs.gnu.org>; Wed, 16 Oct 2019 21:26:26 +0900 (JST)
X-matriXscan-loop-detect: b972eb63d34b8bcdeae18fd144fa2a18b8bea1b0
Received: from mail11.kcn.ne.jp ([61.86.6.129]) by mxs02-s with ESMTP;
 Wed, 16 Oct 2019 21:26:23 +0900 (JST)
Received: from [10.120.1.123] (i118-21-128-66.s30.a048.ap.plala.or.jp
 [118.21.128.66])
 by mail11.kcn.ne.jp (Postfix) with ESMTPA id 3C1F0415AF43;
 Wed, 16 Oct 2019 21:26:23 +0900 (JST)
Date: Wed, 16 Oct 2019 21:26:21 +0900
From: Norihiro Tanaka <noritnk@HIDDEN>
To: "Trent W. Buck" <trentbuck@HIDDEN>
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
In-Reply-To: <20191015014817.GA3082@HIDDEN>
References: <156859974060.2726.6189784814472309967.reportbug@HIDDEN>
 <20191015014817.GA3082@HIDDEN>
Message-Id: <20191016212619.31E9.27F6AC2D@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-2022-JP"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.74.02 [ja]
X-matriXscan-msec-AV: Clean
X-matriXscan-Action: Approve
X-matriXscan: Uncategorized
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 37754
Cc: 37754 <at> debbugs.gnu.org
X-Spam-Score: -1.0 (-)


On Tue, 15 Oct 2019 12:48:17 +1100
"Trent W. Buck" <trentbuck@HIDDEN> wrote:

> Package: grep
> Version: 3.3-1
> Severity: wishlist
> 
> This bug was originally reported as
> https://bugs.debian.org/940464
> 
> Trent W. Buck wrote:
> > (Surely someone has already asked for this, but I can't see where.
> > I may have already reported this myself, and forgotten.
> > If so, sorry!)
> >
> > Right now if you do
> >
> >     grep -eX -eY -eZ
> >
> > You'll get lines that match *any of* X, Y, or Z.
> > Quite often I want to search for lines that match *all of* X, Y, and Z - but in any order.
> > For example,
> >
> >     # all 2TB 2.5-inch SATA products
> >     grep -Fwi -eSATA -e2TB -e2.5in products.csv
> >
> > Below is a short discussion of the workarounds I know about.
> >
> > Is "grep --and" something that has already been discussed and rejected?
> > I looked through debbugs.gnu.org and the source tarball, but
> > I couldn't find anything about this.
> >
> >
> > PS: grep -v --and would intuitively mean "not all",
> > i.e. "grep -v --and -eX -eY" would return lines matching X *or* Y, but
> > omit lines matching *both* X and Y.
> >
> > PS: I can't decide if "--and" or "--intersection" is a better name.
> > I put both in the bug subject so people searching for either will find this ticket.
> > I think "--all" is probably too confusing.
> >
> >
> >
> > Workaround #1
> > =============
> > I can work around this by listing every possible order, but 1) this
> > scales poorly with the number of patterns; and 2) it can't be used
> > with -F.  For example,
> >
> >     grep --and -eX -eY -eZ input*.txt   # becomes
> >
> >     grep -eZ.*Y.*X \
> >          -eZ.*X.*Y \
> >          -eY.*Z.*X \
> >          -eY.*X.*Z \
> >          -eX.*Z.*Y \
> >          -eX.*Y.*Z \
> >          input*.txt
> >
> >
> > Workaround #2
> > =============
> > I can pipe greps together.  This is what I currently do.
> > This is more convenient and feels faster than workaround #1, but
> > I suspect the inter-process overhead is significant.
> >
> > If grep implemented this internally, it could zero-copy.
> > Being able to "grep -rnH --and" &c would also be convenient.
> >
> > For example,
> >
> >     grep --and -F -eX -eY -eZ input*.txt   # becomes
> >
> >     cat input*.txt |
> >     grep -F -eX |
> >     grep -F -eY |
> >     grep -F -eZ
> 

Although I do not know whether this has been discussed or rejected, adding
this feature to grep would probably mean implementing an internal
conversion like workaround #1.  However, that scales poorly, as you say,
and it would be slower than workaround #2 in many cases.
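
To make the scaling concern concrete, here is a rough sketch of the
conversion in workaround #1: it generates every ordering of the given words
joined by ".*". The helper name permute is hypothetical, and the words are
assumed to contain no whitespace or glob characters. The number of generated
patterns is n! for n words.

```shell
#!/bin/sh
# permute PREFIX WORD... : print every ordering of WORD..., joined by ".*".
# The body runs in a subshell ( ... ), so variables are local to each call.
permute() (
    prefix=$1
    shift
    if [ "$#" -eq 0 ]; then
        printf '%s\n' "$prefix"
        exit 0
    fi
    i=1
    while [ "$i" -le "$#" ]; do
        # Pick the i-th remaining word; collect the others in $rest.
        pick=
        rest=
        n=0
        for w in "$@"; do
            n=$((n + 1))
            if [ "$n" -eq "$i" ]; then pick=$w; else rest="$rest $w"; fi
        done
        # $rest is deliberately unquoted so it splits back into words.
        permute "${prefix:+$prefix.*}$pick" $rest
        i=$((i + 1))
    done
)

permute '' X Y Z              # the six patterns from workaround #1
permute '' A B C D | wc -l    # 24 patterns for four words
```

The output could be saved to a file and passed to grep with -f, but the
pattern list grows factorially with the number of words.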





Information forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 15 Oct 2019 01:48:32 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 14 21:48:32 2019
Received: from localhost ([127.0.0.1]:42493 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1iKBwu-0003Q0-0z
	for submit <at> debbugs.gnu.org; Mon, 14 Oct 2019 21:48:32 -0400
Received: from mail-pg1-f195.google.com ([209.85.215.195]:41746)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <trentbuck@HIDDEN>) id 1iKBwq-0003Pl-Ly
 for submit <at> debbugs.gnu.org; Mon, 14 Oct 2019 21:48:29 -0400
Received: by mail-pg1-f195.google.com with SMTP id t3so11096071pga.8
 for <submit <at> debbugs.gnu.org>; Mon, 14 Oct 2019 18:48:28 -0700 (PDT)
Received: from localhost ([203.7.155.117])
 by smtp.gmail.com with ESMTPSA id y4sm17053927pfr.118.2019.10.14.18.48.19
 for <submit <at> debbugs.gnu.org>
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Mon, 14 Oct 2019 18:48:20 -0700 (PDT)
Date: Tue, 15 Oct 2019 12:48:17 +1100
From: "Trent W. Buck" <trentbuck@HIDDEN>
To: submit <at> debbugs.gnu.org
Subject: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <20191015014817.GA3082@HIDDEN>
References: <156859974060.2726.6189784814472309967.reportbug@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <156859974060.2726.6189784814472309967.reportbug@HIDDEN>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: submit
X-Spam-Score: -1.0 (-)

Package: grep
Version: 3.3-1
Severity: wishlist

This bug was originally reported as
https://bugs.debian.org/940464

Trent W. Buck wrote:
> (Surely someone has already asked for this, but I can't see where.
> I may have already reported this myself, and forgotten.
> If so, sorry!)
>
> Right now if you do
>
>     grep -eX -eY -eZ
>
> You'll get lines that match *any of* X, Y, or Z.
> Quite often I want to search for lines that match *all of* X, Y, and Z — but in any order.
> For example,
>
>     # all 2TB 2.5-inch SATA products
>     grep -Fwi -eSATA -e2TB -e2.5in products.csv
>
> Below is a short discussion of the workarounds I know about.
>
> Is "grep --and" something that has already been discussed and rejected?
> I looked through debbugs.gnu.org and the source tarball, but
> I couldn't find anything about this.
>
>
> PS: grep -v --and would intuitively mean "not all",
> i.e. "grep -v --and -eX -eY" would return lines matching X *or* Y, but
> omit lines matching *both* X and Y.
>
> PS: I can't decide if "--and" or "--intersection" is a better name.
> I put both in the bug subject so people searching for either will find this ticket.
> I think "--all" is probably too confusing.
>
>
>
> Workaround #1
> =============
> I can work around this by listing every possible order, but 1) this
> scales poorly with the number of patterns; and 2) it can't be used
> with -F.  For example,
>
>     grep --and -eX -eY -eZ input*.txt   # becomes
>
>     grep -eZ.*Y.*X \
>          -eZ.*X.*Y \
>          -eY.*Z.*X \
>          -eY.*X.*Z \
>          -eX.*Z.*Y \
>          -eX.*Y.*Z \
>          input*.txt
>
>
> Workaround #2
> =============
> I can pipe greps together.  This is what I currently do.
> This is more convenient and feels faster than workaround #1, but
> I suspect the inter-process overhead is significant.
>
> If grep implemented this internally, it could zero-copy.
> Being able to "grep -rnH --and" &c would also be convenient.
>
> For example,
>
>     grep --and -F -eX -eY -eZ input*.txt   # becomes
>
>     cat input*.txt |
>     grep -F -eX |
>     grep -F -eY |
>     grep -F -eZ
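
Workaround #2 can be wrapped in a small shell function so that the number of
patterns is not hard-coded. This is a sketch only: grep_and is a hypothetical
helper, not a grep feature. It reads standard input and still starts one grep
process per pattern, so it does not remove the inter-process overhead.

```shell
#!/bin/sh
# grep_and PATTERN... : keep the lines of stdin that contain every PATTERN
# as a fixed string (grep -F), in any order.
grep_and() {
    if [ "$#" -eq 0 ]; then
        echo 'usage: grep_and PATTERN...' >&2
        return 2
    fi
    first=$1
    shift
    if [ "$#" -eq 0 ]; then
        # Last pattern: its exit status becomes the function's status.
        grep -F -e "$first"
    else
        # Filter on this pattern, then recurse on the remaining ones.
        grep -F -e "$first" | grep_and "$@"
    fi
}

# Two of these three lines contain X, Y and Z.
printf '%s\n' 'Z then Y then X' 'only X and Y' 'X, Z, Y' | grep_and X Y Z
```

As with plain grep, the exit status is 0 only if at least one line survives
every filter.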




Acknowledgement sent to "Trent W. Buck" <trentbuck@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-grep@HIDDEN. Full text available.
Report forwarded to bug-grep@HIDDEN:
bug#37754; Package grep. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.