GNU bug report logs - #40634
Massive pattern list handling with -E format seems very slow since 2.28.

Please note: This is a static page, with minimal formatting, updated once a day.

Package: grep; Reported by: fryasu@HIDDEN; dated Wed, 15 Apr 2020 02:21:01 UTC; Maintainer for grep is bug-grep@HIDDEN.

Message received at 40634 <at> debbugs.gnu.org:


Date: Fri, 17 Apr 2020 09:35:36 +0900
From: Norihiro Tanaka <noritnk@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Subject: Re: bug#40634: Massive pattern list handling with -E format seems
 very slow since 2.28.
In-Reply-To: <6503eb8e-e6fd-b4dd-aab7-76bb6955d87b@HIDDEN>
References: <20200417075312.8757.27F6AC2D@HIDDEN>
 <6503eb8e-e6fd-b4dd-aab7-76bb6955d87b@HIDDEN>
Message-Id: <20200417093536.875E.27F6AC2D@HIDDEN>
Cc: fryasu@HIDDEN, 40634 <at> debbugs.gnu.org


On Thu, 16 Apr 2020 16:00:29 -0700
Paul Eggert <eggert@HIDDEN> wrote:

> On 4/16/20 3:53 PM, Norihiro Tanaka wrote:
> 
> > I have no idea how to solve the problem yet.  If we revert it, bug#33357
> > will come back.
> 
> Yes, I'd rather not revert if we can help it.
> 
> My own thought was to not analyze the regular expression if we discover that the input is empty. :-)

Now I have an idea: build an index of the epsilon nodes included in the
follow sets before removing the epsilon nodes.
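
One way to read the idea above, as a toy illustration only and not the
gnulib dfa.c code: before deleting epsilon nodes, record for each epsilon
node which positions have it in their follow set, so that removal can visit
just those positions instead of scanning every follow set.  The follow sets,
the node numbers and the names follows, epsilon and containing below are all
made up for the example, and bash 4 or later is assumed for associative
arrays.

```bash
#!/bin/bash
# Toy follow sets (hypothetical data): position -> positions that may follow it.
declare -A follows=(
  [0]="1 2"
  [1]="3"
  [2]="1 4"
  [3]="4"
  [4]=""
)
# Positions 1 and 3 are treated as epsilon (empty) nodes in this example.
declare -A epsilon=( [1]=1 [3]=1 )

# Reverse index: epsilon node -> positions whose follow set contains it.
declare -A containing=()
# Iterate in numeric order so the index contents are deterministic.
for pos in $(printf '%s\n' "${!follows[@]}" | sort -n); do
  for f in ${follows[$pos]}; do
    if [[ -n ${epsilon[$f]:-} ]]; then
      containing[$f]+="${containing[$f]:+ }$pos"
    fi
  done
done

# Show which follow sets would need rewriting when each epsilon node is removed.
for e in $(printf '%s\n' "${!containing[@]}" | sort -n); do
  echo "epsilon node $e is in follows of: ${containing[$e]}"
done
```

With the toy data this prints "epsilon node 1 is in follows of: 0 2" and
"epsilon node 3 is in follows of: 1".  The index is built in one pass over
the follow sets; a real implementation would also have to keep it up to
date while nodes are being removed, which this sketch does not attempt.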





Information forwarded to bug-grep@HIDDEN:
bug#40634; Package grep. Full text available.

Message received at 40634 <at> debbugs.gnu.org:


Subject: Re: bug#40634: Massive pattern list handling with -E format seems
 very slow since 2.28.
To: Norihiro Tanaka <noritnk@HIDDEN>
References: <20200416155657.8753.27F6AC2D@HIDDEN>
 <0f97b14a-bfd8-7c24-c3cf-b4d370589433@HIDDEN>
 <20200417075312.8757.27F6AC2D@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <6503eb8e-e6fd-b4dd-aab7-76bb6955d87b@HIDDEN>
Date: Thu, 16 Apr 2020 16:00:29 -0700
In-Reply-To: <20200417075312.8757.27F6AC2D@HIDDEN>
Cc: fryasu@HIDDEN, 40634 <at> debbugs.gnu.org

On 4/16/20 3:53 PM, Norihiro Tanaka wrote:

> I have no idea how to solve the problem yet.  If we revert it, bug#33357
> will come back.

Yes, I'd rather not revert if we can help it.

My own thought was to not analyze the regular expression if we discover that the 
input is empty. :-)




Information forwarded to bug-grep@HIDDEN:
bug#40634; Package grep. Full text available.

Message received at 40634 <at> debbugs.gnu.org:


Date: Fri, 17 Apr 2020 07:53:12 +0900
From: Norihiro Tanaka <noritnk@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Subject: Re: bug#40634: Massive pattern list handling with -E format seems
 very slow since 2.28.
In-Reply-To: <0f97b14a-bfd8-7c24-c3cf-b4d370589433@HIDDEN>
References: <20200416155657.8753.27F6AC2D@HIDDEN>
 <0f97b14a-bfd8-7c24-c3cf-b4d370589433@HIDDEN>
Message-Id: <20200417075312.8757.27F6AC2D@HIDDEN>
Cc: fryasu@HIDDEN, 40634 <at> debbugs.gnu.org


On Thu, 16 Apr 2020 09:31:32 -0700
Paul Eggert <eggert@HIDDEN> wrote:

> On 4/15/20 11:56 PM, Norihiro Tanaka wrote:
> 
> > It seems that a lot of time is spent in dfa.c:replace().
> > It was added in gnulib commit d6df3873c7abc243683d0e8fccbfde4e76f23e53.
> 
> It would be pretty drastic to revert that patch. Is there some better way to move forward?

I have no idea how to solve the problem yet.  If we revert it, bug#33357
will come back.





Information forwarded to bug-grep@HIDDEN:
bug#40634; Package grep. Full text available.

Message received at 40634 <at> debbugs.gnu.org:


Subject: Re: bug#40634: Massive pattern list handling with -E format seems
 very slow since 2.28.
To: Norihiro Tanaka <noritnk@HIDDEN>, 40634 <at> debbugs.gnu.org
References: <581950183.837648.1586910415729.JavaMail.yahoo.ref@HIDDEN>
 <581950183.837648.1586910415729.JavaMail.yahoo@HIDDEN>
 <20200416155657.8753.27F6AC2D@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <0f97b14a-bfd8-7c24-c3cf-b4d370589433@HIDDEN>
Date: Thu, 16 Apr 2020 09:31:32 -0700
In-Reply-To: <20200416155657.8753.27F6AC2D@HIDDEN>
Cc: fryasu@HIDDEN

On 4/15/20 11:56 PM, Norihiro Tanaka wrote:

> It seems that a lot of time is spent in dfa.c:replace().
> It was added in gnulib commit d6df3873c7abc243683d0e8fccbfde4e76f23e53.

It would be pretty drastic to revert that patch. Is there some better way to 
move forward?




Information forwarded to bug-grep@HIDDEN:
bug#40634; Package grep. Full text available.

Message received at 40634 <at> debbugs.gnu.org:


Date: Thu, 16 Apr 2020 15:56:58 +0900
From: Norihiro Tanaka <noritnk@HIDDEN>
To: 40634 <at> debbugs.gnu.org
Subject: Re: bug#40634: Massive pattern list handling with -E format seems
 very slow since 2.28.
In-Reply-To: <581950183.837648.1586910415729.JavaMail.yahoo@HIDDEN>
References: <581950183.837648.1586910415729.JavaMail.yahoo.ref@HIDDEN>
 <581950183.837648.1586910415729.JavaMail.yahoo@HIDDEN>
Message-Id: <20200416155657.8753.27F6AC2D@HIDDEN>
Cc: fryasu@HIDDEN

+ grep-2.2/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
grep-2.2/src/grep: invalid option -- 'm'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
real 0.00
user 0.00
sys 0.00
+ grep-2.3/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
grep-2.3/src/grep: invalid option -- 'm'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
real 0.00
user 0.00
sys 0.00
+ grep-2.4/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
grep-2.4/src/grep: invalid option -- 'm'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
real 0.00
user 0.00
sys 0.00
+ grep-2.4.1/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
grep-2.4.1/src/grep: invalid option -- 'm'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
real 0.00
user 0.00
sys 0.00
+ grep-2.4.2/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
grep-2.4.2/src/grep: invalid option -- 'm'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
real 0.00
user 0.00
sys 0.00
+ grep-2.5.4/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 13.58
user 13.43
sys 0.14
+ grep-2.6/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.95
user 3.51
sys 0.42
+ grep-2.6.1/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.97
user 3.86
sys 0.11
+ grep-2.6.2/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.98
user 3.69
sys 0.28
+ grep-2.6.3/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.99
user 3.94
sys 0.04
+ grep-2.7/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.93
user 3.68
sys 0.24
+ grep-2.8/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.89
user 3.83
sys 0.05
+ grep-2.9/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.92
user 3.63
sys 0.27
+ grep-2.10/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.93
user 3.65
sys 0.27
+ grep-2.11/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.98
user 3.89
sys 0.08
+ grep-2.12/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.95
user 3.87
sys 0.06
+ grep-2.13/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.97
user 3.85
sys 0.11
+ grep-2.14/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 4.01
user 3.91
sys 0.09
+ grep-2.15/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 4.05
user 3.99
sys 0.05
+ grep-2.16/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.91
user 3.50
sys 0.40
+ grep-2.17/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.98
user 3.94
sys 0.03
+ grep-2.18/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 3.98
user 3.87
sys 0.10
+ grep-2.19/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.54
user 0.50
sys 0.03
+ grep-2.20/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.62
user 0.57
sys 0.04
+ grep-2.21/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.43
user 0.41
sys 0.02
+ grep-2.22/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.18
user 0.16
sys 0.01
+ grep-2.23/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.18
user 0.13
sys 0.04
+ grep-2.24/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.18
user 0.13
sys 0.04
+ grep-2.25/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.18
user 0.16
sys 0.01
+ grep-2.26/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.18
user 0.14
sys 0.04
+ grep-2.27/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.17
user 0.15
sys 0.02
+ grep-2.28/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 7.22
user 7.14
sys 0.07
+ grep-3.0/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 7.26
user 7.11
sys 0.14
+ grep-3.1/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 7.22
user 6.88
sys 0.33
+ grep-3.2/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 7.20
user 7.04
sys 0.15
+ grep-3.3/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 7.24
user 7.17
sys 0.07
+ grep-3.4/src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 7.13
user 6.49
sys 0.63

It seems that a lot of time is spent in dfa.c:replace().
It was added in gnulib commit d6df3873c7abc243683d0e8fccbfde4e76f23e53.
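
The trace above has the shape of a loop that times each release build in
turn.  The script that actually produced it is not part of this report, so
the following is only a sketch under assumptions: each release is built
under ./grep-<version>/src/grep, GNU sort is available for "sort -V", and
the function name bench_grep_builds is invented for the example.

```bash
#!/bin/bash
# Time every locally built grep against the same pattern file.
# Assumed (hypothetical) layout: ./grep-<version>/src/grep for each release.
bench_grep_builds() {
  local patterns=$1 grep_bin
  # sort -V (GNU sort) orders versions naturally, e.g. 2.9 before 2.10.
  for grep_bin in $(printf '%s\n' grep-*/src/grep | sort -V); do
    # Skip builds that are missing; also covers the unmatched-glob case.
    [ -x "$grep_bin" ] || continue
    # Print the command in the same style as a "set -x" trace line.
    echo "+ $grep_bin -E -v -m1 -f $patterns /dev/null"
    # "time -p" reports real/user/sys in seconds; it writes to stderr,
    # so the subshell's stderr is merged into stdout.  grep exits with
    # status 1 here (no lines selected), which is not treated as an error.
    ( time -p "$grep_bin" -E -v -m1 -f "$patterns" /dev/null ) 2>&1 || true
  done
}

bench_grep_builds "${1:-grep-patterns.txt}"
```

Run from the directory that holds the build trees, after generating
grep-patterns.txt with the reporter's script.  Builds that predate the -m
option print a usage error and finish immediately, as in the trace above.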





Information forwarded to bug-grep@HIDDEN:
bug#40634; Package grep. Full text available.

Message received at submit <at> debbugs.gnu.org:


Date: Wed, 15 Apr 2020 09:26:55 +0900 (JST)
From: fryasu@HIDDEN
To: "bug-grep@HIDDEN" <bug-grep@HIDDEN>
Message-ID: <581950183.837648.1586910415729.JavaMail.yahoo@HIDDEN>
Subject: Massive pattern list handling with -E format seems very slow since
 2.28.
References: <581950183.837648.1586910415729.JavaMail.yahoo.ref@HIDDEN>

Hi,

Handling of a massive pattern list with -E seems very
slow since grep 2.28.

The conversion from -E format to -F format may have a
performance problem.

When the processing time is measured with the script below,
the result is obviously different between 2.28 and 2.27.

----
#!/bin/bash
export LC_ALL=C

rm -f grep-patterns.txt
for i in {1..2000}; do
     pat=$(echo -n "$i" | sha1sum | cut -f1 -d ' ')
     echo -e "$pat$pat(\$|$pat)" >> grep-patterns.txt
done

echo executing grep...
time grep -E -v -m1 -f grep-patterns.txt /dev/null
----

The following are the results on my PC with Fedora's RPMs:
https://koji.fedoraproject.org/koji/packageinfo?packageID=1023

- result with grep 2.28

  real 0m11.087s / user 0m11.027s / sys 0m0.037s

- result with grep 2.27

  real 0m0.144s / user 0m0.116s / sys 0m0.027s

With the recent 3.4, the result is the same.


I hope you find it useful.


regards,





Acknowledgement sent to fryasu@HIDDEN:
New bug report received and forwarded. Copy sent to bug-grep@HIDDEN. Full text available.
Report forwarded to bug-grep@HIDDEN:
bug#40634; Package grep. Full text available.
Last modified: Fri, 17 Apr 2020 00:45:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.