From: Norihiro Tanaka <noritnk@HIDDEN>
To: "Trent W. Buck" <trentbuck@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Date: Sat, 19 Oct 2019 15:57:39 +0900
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-Id: <20191019155738.1011.27F6AC2D@HIDDEN>
In-Reply-To: <20191015014817.GA3082@HIDDEN>

On Tue, 15 Oct 2019 12:48:17 +1100
"Trent W. Buck" <trentbuck@HIDDEN> wrote:

> Package: grep
> Version: 3.3-1
> Severity: wishlist
>
> This bug was originally reported as
> https://bugs.debian.org/940464
>
> Trent W. Buck wrote:
> > (Surely someone has already asked for this, but I can't see where.
> > I may have already reported this myself, and forgotten.
> > If so, sorry!)
> >
> > Right now if you do
> >
> >     grep -eX -eY -eZ
> >
> > You'll get lines that match *any of* X, Y, or Z.
> > Quite often I want to search for lines that match *all of* X, Y, and Z, but in any order.
> > For example,
> >
> >     # all 4TB 2.5-inch SATA products
> >     grep -Fwi -eSATA -e2TB -e2.5in products.csv
> >
> > Below is a short discussion of the workarounds I know about.
> >
> > Is "grep --and" something that has already been discussed and rejected?
> > I looked through debbugs.gnu.org and the source tarball, but
> > I couldn't find anything about this.
> >
> >
> > PS: grep -v --and would intuitively mean "not all",
> > i.e. "grep -v --and -eX -eY" would return lines matching X *or* Y, but
> > omit lines matching *both* X and Y.
> >
> > PS: I can't decide if "--and" or "--intersection" is a better name.
> > I put both in the bug subject so people searching for either will find this ticket.
> > I think "--all" is probably too confusing.
> >
> >
> >
> > Workaround #1
> > =============
> > I can work around this by listing every possible order, but 1) this
> > scales poorly with the number of patterns; and 2) it can't be used
> > with -F.  For example,
> >
> >     grep --and -eX -eY -eZ input*.txt    # becomes
> >
> >     grep -eZ.*Y.*X \
> >          -eZ.*X.*Y \
> >          -eY.*Z.*X \
> >          -eY.*X.*Z \
> >          -eX.*Z.*Y \
> >          -eX.*Y.*Z \
> >          input*.txt
> >
> >
> > Workaround #2
> > =============
> > I can pipe greps together.  This is what I currently do.
> > This is more convenient and feels faster than workaround #1, but
> > I suspect the inter-process overhead is significant.
> >
> > If grep implemented this internally, it could zero-copy.
> > Being able to "grep -rnH --and" &c would also be convenient.
> >
> > For example,
> >
> >     grep --and -F -eX -eY -eZ input*.txt    # becomes
> >
> >     cat input*.txt |
> >     grep -F -eX |
> >     grep -F -eY |
> >     grep -F -eZ
> >
> Workaround #1
> =============
> I can work around this by listing every possible order, but 1) this
> scales poorly with the number of patterns; and 2) it can't be used
> with -F.  For example,
>
>     grep --and -eX -eY -eZ input*.txt    # becomes
>
>     grep -eZ.*Y.*X \
>          -eZ.*X.*Y \
>          -eY.*Z.*X \
>          -eY.*X.*Z \
>          -eX.*Z.*Y \
>          -eX.*Y.*Z \
>          input*.txt

I have noticed that the above two do not necessarily produce the same
results.

    grep --and -e123 -e234 input*.txt
    grep --and -e '123.*234' -e '234.*123' input*.txt

"1234" matches the first, but it does not match the second.
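
The difference can be reproduced with tools that exist today. The following is only a sketch: since grep has no --and option, the pipeline from workaround #2 stands in for it, and the single input line "1234" is the example from the message above.

```shell
# One input line in which the two patterns overlap: "123" and "234" share "23".
line='1234'

# Workaround #2 (pipeline): each grep tests the whole line on its own,
# so overlapping matches are accepted.
pipeline=$(printf '%s\n' "$line" | grep -e 123 | grep -e 234 || true)

# Workaround #1 (every ordering joined by .*): the patterns must occupy
# disjoint parts of the line, so the overlapping line is missed.
orderings=$(printf '%s\n' "$line" | grep -e '123.*234' -e '234.*123' || true)

printf 'pipeline:  [%s]\n' "$pipeline"
printf 'orderings: [%s]\n' "$orderings"
```

The `|| true` only keeps the script going when grep finds nothing and exits with status 1.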
bug-grep@HIDDEN: bug#37754; Package grep.

From: "Paul Jackson" <pj@HIDDEN>
To: bug-grep@HIDDEN
Date: Fri, 18 Oct 2019 17:35:38 -0500
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-Id: <5e64c0d8-3a8f-4b31-a08a-05f335915878@HIDDEN>
In-Reply-To: <4dc7854d-a8c6-5d0d-59de-8b5e27d23280@HIDDEN>

I'm currently working on rewriting and packaging up a tool that I use
to handle such high volume [and/or/not] filters on long lists of file
pathnames and of log file entries.

It's a tool I've had in my private toolbox for decades.  I call it
"ftest".  It has a rich set of "test"-like flags for testing stat(2)
attributes of files, but is optimized for working in pipelines (as a
filter, hence the "f").

Trent - do you need regular expression matching, or is glob matching
easily sufficient, or would even just fixed string matching be useful?

For [and/or/not] logical combinations of full regular expressions,
I'll probably continue to use awk, as Paul Eggert suggested, though
that might be because I've long been an awk user, since teaching an
awk class to other engineers inside Bell Labs, some 40 years ago.

Perhaps sometime, months into the future, I'll follow up with an
update pointing to my "ftest" command on GitHub.

-- 
Paul Jackson
pj@HIDDEN
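
For readers who have not used awk this way, here is a minimal sketch of [and/or/not] combinations of regular expressions. The sample lines and the patterns alpha, beta, gamma and delta are made up for illustration; only standard awk pattern syntax is used.

```shell
# Hypothetical sample data: three log-like lines.
input='alpha beta gamma
alpha beta
beta gamma delta'

# AND: lines matching both /alpha/ and /beta/.
both=$(printf '%s\n' "$input" | awk '/alpha/ && /beta/')

# AND NOT: lines matching /beta/ but not /alpha/.
beta_only=$(printf '%s\n' "$input" | awk '/beta/ && !/alpha/')

# AND combined with OR: /gamma/ together with either /alpha/ or /delta/.
mixed=$(printf '%s\n' "$input" | awk '/gamma/ && (/alpha/ || /delta/)')

printf '%s\n' "$both" "--" "$beta_only" "--" "$mixed"
```

A pattern with no action prints the matching line, which is why no explicit `{ print }` is needed.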

From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
To: "Trent W. Buck" <trentbuck@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Date: Fri, 18 Oct 2019 10:51:29 -0700
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <4dc7854d-a8c6-5d0d-59de-8b5e27d23280@HIDDEN>
In-Reply-To: <20191018114921.GA21858@HIDDEN>

On 10/18/19 4:49 AM, Trent W. Buck wrote:
> In that case, each grep in the pipeline has to pay the costs to
> de-serialize input from the previous grep

Sure, but grep is designed to be a simple tool and we need to draw the
line somewhere.  For something more complicated there are already sed
and awk (if you want to write to POSIX) or Perl or Python or whatever.

I mildly prefer the A\&B notation because it could be used everywhere,
not just in grep.  (But of course someone would have to implement it. :-)
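
As an illustration of the sed route mentioned above, here is a sketch of an "all of X, Y and Z" filter in a single sed process. The input lines are invented; the `/re/!d` commands are standard POSIX sed.

```shell
# Hypothetical input: only the third line contains X, Y and Z together.
input='X only
Y and X
Z, Y and X
Y and Z'

# Delete every line that lacks one of the patterns; whatever survives
# all three tests is printed by sed's default auto-print.
all_three=$(printf '%s\n' "$input" | sed -e '/X/!d' -e '/Y/!d' -e '/Z/!d')

printf '%s\n' "$all_three"
```

Unlike the grep pipeline, this runs one process, but it gives up grep-specific options such as -F, -w and -o.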

From: "Trent W. Buck" <trentbuck@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Date: Fri, 18 Oct 2019 22:49:23 +1100
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <20191018114921.GA21858@HIDDEN>
In-Reply-To: <7038d337-f127-74ef-aff6-782f2ad42969@HIDDEN>

Paul Eggert wrote:
> On 10/16/19 5:19 PM, Trent W. Buck wrote:
> > I would expect "grep -Fw -e 4GB -e DDR4 --and" to print the same thing as
> >
> >     grep -Fw 4GB | grep -Fw DDR4 | grep -Fw -e 4GB -e DDR4 -o
>
> You're right, it's not obvious. :-)
>
> It may be better to just pipe greps together, as you do now. That's simple
> and fast enough for this relatively-uncommon case, and it's portable to all
> greps.

I admit that most of the time, I want "grep --and" for a small dataset
(<1MB computer_parts.txt), so it's merely a convenience.

Sometimes I grep audit logs (~1TB uncompressed), which takes anywhere
from 15 minutes to 3 days, depending on how I tweak my grep calls.

In that case, each grep in the pipeline has to pay the costs to
de-serialize input from the previous grep, and re-serialize output to
the next grep.  If the first grep matches (say) 200GB of the 1TB,
that can be a lot of overhead (I assume).

I was basically hoping that if it was all in a single grep process,
the de/serialization steps could be skipped completely.
I think the buzzword for that is "zero-copy"?

I've noticed "grep" is about 30% slower than either "grep -F" or
"LC_COLLATE=C grep", because (I think) the latter two avoid the costs
of decoding from UTF-8 to Unicode and back.  So I was basically
expecting a similar saving from --and.

I'm only speaking as an end user - I haven't dug through the grep
source, so those expectations might be unrealistic, and implementing
it might be painful/impossible.  I figured I should at least ask :-)

If your expert opinion is that it's a pain to implement (and
maintain!) and there's not enough demand, then I'm OK with that.
This is NOT something that's burning me every day.

Regardless, I appreciate you taking the time to discuss it. :-)

PS: Regarding portability, I'm personally not worried because when I
need a GNUism badly enough (e.g. du --threshold), I can usually get
permission to install the relevant GNU software, even if it's only
into %APPDATA% or $HOME.

PS: I noticed on bugs.gnu.org something about grep being
single-threaded, which might mean "grep --and" would end up being
SLOWER than the existing pipelines, since the kernel can distribute a
pipeline's elements across multiple CPUs/cores.
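
One way to limit the inter-process traffic described above, without any change to grep, is to put the most selective pattern first so that later stages read less data. This is only a sketch with invented sample data. LC_ALL=C is an assumption standing in for the LC_COLLATE=C setting in the message, and no speed-up is measured here.

```shell
# Hypothetical sample: "SATA" is common, "2.5in" is rare.
input='SATA 3.5in 2TB
SATA 3.5in 4TB
SATA 2.5in 2TB
SATA 3.5in 1TB'

# Lines handed to the second grep when the common pattern runs first...
common_first=$(printf '%s\n' "$input" | LC_ALL=C grep -F -e SATA | wc -l)
# ...and when the rare pattern runs first.
rare_first=$(printf '%s\n' "$input" | LC_ALL=C grep -F -e 2.5in | wc -l)

# Either order gives the same final result; only the intermediate volume differs.
result=$(printf '%s\n' "$input" | LC_ALL=C grep -F -e 2.5in | LC_ALL=C grep -F -e SATA)

printf 'common first: %d lines, rare first: %d lines\n' "$common_first" "$rare_first"
printf '%s\n' "$result"
```

On a real log, the intermediate volume could be measured the same way with `wc -c` before committing to an ordering.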

From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
To: "Trent W. Buck" <trentbuck@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Date: Thu, 17 Oct 2019 01:27:35 -0700
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <7038d337-f127-74ef-aff6-782f2ad42969@HIDDEN>
In-Reply-To: <20191017001952.GA2991@HIDDEN>

On 10/16/19 5:19 PM, Trent W. Buck wrote:
> I would expect "grep -Fw -e 4GB -e DDR4 --and" to print the same thing as
>
>     grep -Fw 4GB | grep -Fw DDR4 | grep -Fw -e 4GB -e DDR4 -o

You're right, it's not obvious. :-)

It may be better to just pipe greps together, as you do now. That's
simple and fast enough for this relatively-uncommon case, and it's
portable to all greps.

From: "Trent W. Buck" <trentbuck@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Cc: Norihiro Tanaka <noritnk@HIDDEN>, 37754 <at> debbugs.gnu.org
Date: Thu, 17 Oct 2019 11:19:54 +1100
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Message-ID: <20191017001952.GA2991@HIDDEN>
In-Reply-To: <07cfaedb-b581-c46e-6f47-433ca8f3a76d@HIDDEN>

Paul Eggert wrote:
> Wouldn't it be more useful to have an intersection operator in regular
> expressions? That is, the pattern 'A\&B' would match anything that is
> matched by both A and B. If A and B have parenthesized subexpressions, both
> sets of parentheses would match and would count.

Not for me personally, because I almost always want to use it with -Fwi :-)

(-F is a lot faster - about as fast as LC_COLLATE=C - and it also means
I don't have to think about escaping special characters.)

> [...]
>
> This approach would allow intersection to be nested inside other operations.
> Also, it would clarify how other features work. For example, grep -o has
> clear semantics with this approach, whereas the semantics of grep -o are not
> so clear with the proposed --and option.

I hadn't thought about -o, and I agree that is not very obvious.

Given an input file like

    30$ Gamdias EROS (M2) USB Multi-Color Lighting Gaming Headset
    30$ Gamdias POSEIDON E1 Gaming Combo 3-in-1 K/B+3200dpi Optical Mouse+Stereo Headset
    30$ GeIL (GP34GB1600C11SC) 4GB DDR3 1600 Desktop RAM
    30$ GeIL Pristine (GP44GB2400C17SC) 4GB Single DDR4 2400 Desktop RAM
    30$ GeIL SO-DIMM 4GB (GGS34GB1600C11SC) 1.35V (Low Voltage) 4GB DDR3 1600 Notebook Ram

Where currently "grep -Fw -e 4GB -e DDR4 -o" prints

    4GB
    4GB
    DDR4
    4GB
    4GB

I would expect "grep -Fw -e 4GB -e DDR4 --and" to print the same thing as

    grep -Fw 4GB | grep -Fw DDR4 | grep -Fw -e 4GB -e DDR4 -o

i.e.

    4GB
    DDR4
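
The comparison above can be run as written. The sketch below uses the five sample lines from the message, stored in a temporary file (the file itself is an assumption), and emulates the proposed --and with the three-stage pipeline, since grep has no such option.

```shell
# The sample lines from the message; the quoted 'EOF' keeps "$" literal.
parts=$(mktemp)
cat >"$parts" <<'EOF'
30$ Gamdias EROS (M2) USB Multi-Color Lighting Gaming Headset
30$ Gamdias POSEIDON E1 Gaming Combo 3-in-1 K/B+3200dpi Optical Mouse+Stereo Headset
30$ GeIL (GP34GB1600C11SC) 4GB DDR3 1600 Desktop RAM
30$ GeIL Pristine (GP44GB2400C17SC) 4GB Single DDR4 2400 Desktop RAM
30$ GeIL SO-DIMM 4GB (GGS34GB1600C11SC) 1.35V (Low Voltage) 4GB DDR3 1600 Notebook Ram
EOF

# Current behaviour: -o prints every match from any line matching either word.
any=$(grep -Fw -o -e 4GB -e DDR4 "$parts")

# Emulated "--and": keep only lines containing both words, then extract matches.
all=$(grep -Fw -e 4GB "$parts" | grep -Fw -e DDR4 | grep -Fw -o -e 4GB -e DDR4)

rm -f "$parts"

printf 'any of:\n%s\n' "$any"
printf 'all of:\n%s\n' "$all"
```

The -w option is what keeps "4GB" from matching inside part numbers such as GP34GB1600C11SC.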
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
To: Norihiro Tanaka <noritnk@HIDDEN>, "Trent W. Buck" <trentbuck@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Date: Wed, 16 Oct 2019 11:57:31 -0700
Message-ID: <07cfaedb-b581-c46e-6f47-433ca8f3a76d@HIDDEN>
In-Reply-To: <20191016212619.31E9.27F6AC2D@HIDDEN>

Wouldn't it be more useful to have an intersection operator in regular
expressions? That is, the pattern 'A\&B' would match anything that is matched by
both A and B. If A and B have parenthesized subexpressions, both sets of
parentheses would match and would count. Assuming concatenation has higher
precedence than \&, the requested behavior could be achieved via:

    grep '.*X.*\&.*Y.*\&.*Z.*'

This approach would allow intersection to be nested inside other operations.
Also, it would clarify how other features work. For example, grep -o has clear
semantics with this approach, whereas the semantics of grep -o are not so clear
with the proposed --and option.
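The \& operator is a proposal in this message, not something grep implements. For the whole-line case, where each operand is wrapped in .* as in the example, the same lines can be selected today in one process with awk's && operator. This is a rough sketch only: `and_grep` is a hypothetical name, it takes exactly two extended regular expressions, and because awk -v processes backslash escapes in its values, patterns containing backslashes would need extra care.

```shell
# and_grep ERE1 ERE2 [FILE...]   (hypothetical helper, not part of grep)
# Prints the lines that match both extended regular expressions, in
# any order. Reads stdin when no files are given.
and_grep() {
    re1=$1
    re2=$2
    shift 2
    # $0 is the current line; print it only if both patterns match.
    awk -v a="$re1" -v b="$re2" '$0 ~ a && $0 ~ b' "$@"
}
```

Unlike the proposed operator, this cannot be nested inside a larger expression, and it offers no equivalent of -o.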
From: Norihiro Tanaka <noritnk@HIDDEN>
To: "Trent W. Buck" <trentbuck@HIDDEN>
Cc: 37754 <at> debbugs.gnu.org
Subject: Re: bug#37754: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Date: Wed, 16 Oct 2019 21:26:21 +0900
Message-Id: <20191016212619.31E9.27F6AC2D@HIDDEN>
In-Reply-To: <20191015014817.GA3082@HIDDEN>

On Tue, 15 Oct 2019 12:48:17 +1100
"Trent W. Buck" <trentbuck@HIDDEN> wrote:

> Package: grep
> Version: 3.3-1
> Severity: wishlist
>
> This bug was originally reported as
> https://bugs.debian.org/940464
>
> Trent W. Buck wrote:
> > (Surely someone has already asked for this, but I can't see where.
> > I may have already reported this myself, and forgotten.
> > If so, sorry!)
> >
> > Right now if you do
> >
> >     grep -eX -eY -eZ
> >
> > You'll get lines that match *any of* X, Y, or Z.
> > Quite often I want to search for lines that match *all of* X, Y, and Z — but in any order.
> > For example,
> >
> >     # all 2TB 2.5-inch SATA products
> >     grep -Fwi -eSATA -e2TB -e2.5in products.csv
> >
> > Below is a short discussion of the workarounds I know about.
> >
> > Is "grep --and" something that has already been discussed and rejected?
> > I looked through debbugs.gnu.org and the source tarball, but
> > I couldn't find anything about this.
> >
> >
> > PS: grep -v --and would intuitively mean "not all",
> > i.e. "grep -v --and -eX -eY" would return lines matching X *or* Y, but
> > omit lines matching *both* X and Y.
> >
> > PS: I can't decide if "--and" or "--intersection" is a better name.
> > I put both in the bug subject so people searching for either will find this ticket.
> > I think "--all" is probably too confusing.
> >
> >
> >
> > Workaround #1
> > =============
> > I can work around this by listing every possible order, but 1) this
> > scales poorly with the number of patterns; and 2) it can't be used
> > with -F. For example,
> >
> >     grep --and -eX -eY -eZ input*.txt    # becomes
> >
> >     grep -eZ.*Y.*X \
> >          -eZ.*X.*Y \
> >          -eY.*Z.*X \
> >          -eY.*X.*Z \
> >          -eX.*Z.*Y \
> >          -eX.*Y.*Z \
> >          input*.txt
> >
> >
> > Workaround #2
> > =============
> > I can pipe greps together. This is what I currently do.
> > This is more convenient and feels faster than workaround #1, but
> > I suspect the inter-process overhead is significant.
> >
> > If grep implemented this internally, it could zero-copy.
> > Being able to "grep -rnH --and" &c would also be convenient.
> >
> > For example,
> >
> >     grep --and -F -eX -eY -eZ input*.txt    # becomes
> >
> >     cat input*.txt |
> >     grep -F -eX |
> >     grep -F -eY |
> >     grep -F -eZ

I do not know whether this has been discussed or rejected before. To add
this feature, grep would have to implement an internal conversion like
workaround #1. However, that scales poorly, as you say, and it would be
slower than workaround #2 in many cases.
From: "Trent W. Buck" <trentbuck@HIDDEN>
To: submit <at> debbugs.gnu.org
Subject: wish for grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)
Date: Tue, 15 Oct 2019 12:48:17 +1100
Message-ID: <20191015014817.GA3082@HIDDEN>
References: <156859974060.2726.6189784814472309967.reportbug@HIDDEN>

Package: grep
Version: 3.3-1
Severity: wishlist

This bug was originally reported as
https://bugs.debian.org/940464

Trent W. Buck wrote:
> (Surely someone has already asked for this, but I can't see where.
> I may have already reported this myself, and forgotten.
> If so, sorry!)
>
> Right now if you do
>
>     grep -eX -eY -eZ
>
> You'll get lines that match *any of* X, Y, or Z.
> Quite often I want to search for lines that match *all of* X, Y, and Z — but in any order.
> For example,
>
>     # all 2TB 2.5-inch SATA products
>     grep -Fwi -eSATA -e2TB -e2.5in products.csv
>
> Below is a short discussion of the workarounds I know about.
>
> Is "grep --and" something that has already been discussed and rejected?
> I looked through debbugs.gnu.org and the source tarball, but
> I couldn't find anything about this.
>
>
> PS: grep -v --and would intuitively mean "not all",
> i.e. "grep -v --and -eX -eY" would return lines matching X *or* Y, but
> omit lines matching *both* X and Y.
>
> PS: I can't decide if "--and" or "--intersection" is a better name.
> I put both in the bug subject so people searching for either will find this ticket.
> I think "--all" is probably too confusing.
>
>
>
> Workaround #1
> =============
> I can work around this by listing every possible order, but 1) this
> scales poorly with the number of patterns; and 2) it can't be used
> with -F. For example,
>
>     grep --and -eX -eY -eZ input*.txt    # becomes
>
>     grep -eZ.*Y.*X \
>          -eZ.*X.*Y \
>          -eY.*Z.*X \
>          -eY.*X.*Z \
>          -eX.*Z.*Y \
>          -eX.*Y.*Z \
>          input*.txt
>
>
> Workaround #2
> =============
> I can pipe greps together. This is what I currently do.
> This is more convenient and feels faster than workaround #1, but
> I suspect the inter-process overhead is significant.
>
> If grep implemented this internally, it could zero-copy.
> Being able to "grep -rnH --and" &c would also be convenient.
>
> For example,
>
>     grep --and -F -eX -eY -eZ input*.txt    # becomes
>
>     cat input*.txt |
>     grep -F -eX |
>     grep -F -eY |
>     grep -F -eZ
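Workaround #2 extends to any number of patterns with a small shell function. The sketch below is one possible shape, not a proposed grep interface: `grep_all` is a hypothetical name, and it assumes fixed-string patterns and input on stdin. It still starts one grep process per pattern, so it does not remove the inter-process overhead the report mentions.

```shell
# grep_all PATTERN...   (hypothetical helper, not part of grep)
# Reads lines on stdin and prints those that contain every fixed-string
# pattern, in any order, by chaining one "grep -F" per pattern.
grep_all() {
    if [ "$#" -eq 0 ]; then
        cat    # no patterns left: pass the surviving lines through
        return
    fi
    # Filter on the first pattern, then recurse on the remaining ones.
    grep -F -e "$1" | { shift; grep_all "$@"; }
}
```

For example, `cat input*.txt | grep_all X Y Z` selects the same lines as the three-stage pipeline in workaround #2.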
"Trent W. Buck" <trentbuck@HIDDEN>
:bug-grep@HIDDEN
.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.