Received: (at 62483) by debbugs.gnu.org; 2 Apr 2023 18:29:50 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Apr 02 14:29:50 2023 Received: from localhost ([127.0.0.1]:42640 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pj2Sc-0004O8-7q for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 14:29:50 -0400 Received: from mail.cs.ucla.edu ([131.179.128.66]:39294) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <eggert@HIDDEN>) id 1pj2SX-0004Ns-KW for 62483 <at> debbugs.gnu.org; Sun, 02 Apr 2023 14:29:48 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id AEBCE3C09FA02; Sun, 2 Apr 2023 11:29:39 -0700 (PDT) Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id oyIQv3a3PBiJ; Sun, 2 Apr 2023 11:29:39 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id 752483C09FA01; Sun, 2 Apr 2023 11:29:39 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.cs.ucla.edu 752483C09FA01 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=9D0B346E-2AEB-11ED-9476-E14B719DCE6C; t=1680460179; bh=2f3sGrU+5qn62ukHicLKudSRcxmJsffhDbks/B5+0OI=; h=Message-ID:Date:MIME-Version:To:From; b=mnBN4wAUHPlu43V8zbNB9T9pxfGf6HyO7gPzgnJgbfqqBXvW/gp9J17F6/nI3rVGT +rrkdp/1IUPB5eBvCcQVeBn2RpKvqC13iaomLjQ4AVagHIh4UuiftUMSw2cWamRB6W Y2dQr4gZNvJvDgfPnhopgZd8M4Vmb2Ndoa3j7c9n6TEAp74gomzh9DqX43neTH8sVs NlK6CuUZbnfuu3E69cBE1NrvW6XQqr9vz0xfj5qILzwTiilGWONSjPfOer4tt5nzRl ZLb3+e1lUwDk6I/8kVFh3q3CE7Py8BCBNZuXWPyDbQPObf+WY4giKoTmeImHUqXHFh 23i2Aa5w9ntag== X-Virus-Scanned: amavisd-new at mail.cs.ucla.edu Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id xT4zfD34fpKg; Sun, 2 Apr 2023 11:29:39 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by mail.cs.ucla.edu (Postfix) with ESMTPSA id 4A99F3C097AFE; Sun, 2 Apr 2023 11:29:39 -0700 (PDT) Message-ID: <576a26bb-b195-dd33-a5a4-ba38115889d5@HIDDEN> Date: Sun, 2 Apr 2023 11:29:39 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0 Content-Language: en-US To: arnold@HIDDEN, dimitry@HIDDEN References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> <202304020652.3326q2Ut024466@HIDDEN> From: Paul Eggert <eggert@HIDDEN> Organization: UCLA Computer Science Department Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate In-Reply-To: <202304020652.3326q2Ut024466@HIDDEN> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -1.1 (-) X-Debbugs-Envelope-To: 62483 Cc: 62483 <at> debbugs.gnu.org, koen@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -2.1 (--) On 2023-04-01 23:52, arnold@HIDDEN wrote: > It's interesting, as gawk uses the same regex, but with different flags. Also, GNU grep -w passes the following more-complicated regexp to dfaparse: (^|[^[:alnum:]_])(((()|a)|())*)([^[:alnum:]_]|$) and quite possibly the bug is related to this extra complexity.
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at 62483) by debbugs.gnu.org; 2 Apr 2023 10:04:12 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Apr 02 06:04:12 2023 Received: from localhost ([127.0.0.1]:39237 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1piuZI-0000fR-43 for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 06:04:12 -0400 Received: from frenzy.freefriends.org ([198.99.81.75]:59844 helo=freefriends.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <arnold@HIDDEN>) id 1piuZG-0000fJ-Gp for 62483 <at> debbugs.gnu.org; Sun, 02 Apr 2023 06:04:11 -0400 X-Envelope-From: arnold@HIDDEN Received: from freefriends.org (frenzy.freefriends.org [198.99.81.75]) by freefriends.org (8.14.7/8.14.7) with ESMTP id 332A3umP023104 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sun, 2 Apr 2023 04:03:57 -0600 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id 332A3uGT023103; Sun, 2 Apr 2023 04:03:56 -0600 From: arnold@HIDDEN Message-Id: <202304021003.332A3uGT023103@HIDDEN> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@HIDDEN using -f Date: Sun, 02 Apr 2023 04:03:56 -0600 To: dimitry@HIDDEN, arnold@HIDDEN Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> <202304020652.3326q2Ut024466@HIDDEN> <4DC78C45-1C0C-4A9E-BBFB-6F752AC047F5@HIDDEN> In-Reply-To: <4DC78C45-1C0C-4A9E-BBFB-6F752AC047F5@HIDDEN> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 62483 Cc: 62483 <at> debbugs.gnu.org, koen@HIDDEN, bug-grep@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: 0.0 (/) OK, thanks! Dimitry Andric <dimitry@HIDDEN> wrote: > Ah sorry, I did indeed rebuild grep with --with-included-regex, and for > debugging purposes added CFLAGS="-O0 -g". > > In any case, the problematic code is both in glibc and grep, as I > believe these are originating from the same source. > > -Dimitry > > > On 2 Apr 2023, at 08:52, arnold@HIDDEN wrote: > > > > Hi. > > > > Dimitry Andric <dimitry@HIDDEN> wrote: > > > >> Yes, it still reproduces when I configure the latest grep using > >> --without-included-regex: > > > > I assume you meant --with-included-regex? > > > > If you really used --without-included-regex, that doesn't prove anything... :-) > > > > It's interesting, as gawk uses the same regex, but with different flags. > > It might be worth trying to understand which of the syntax bits > > is causing this. > > > > Thanks, > > > > Arnold > > > >> > >> 1400 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > >> 1: idx = 0 > >> (gdb) > >> 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > >> 1: idx = 0 > >> (gdb) > >> 1404 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > >> 1: idx = 0 > >> (gdb) > >> 1405 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > >> 1: idx = 0 > >> (gdb) > >> 1428 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > >> 1: idx = 0 > >> (gdb) > >> 1432 if (__glibc_unlikely (cur_node < 0)) > >> 1: idx = 0 > >> (gdb) > >> 1400 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > >> 1: idx = 0 > >> (gdb) > >> 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > >> 1: idx = 0 > >> (gdb) > >> 1404 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > >> 1: idx = 0 > >> (gdb) > >> 1405 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > >> 1: idx = 0 > >> > >> The endless loop looks the same. grep's regexec.c is mostly the same as > >> glibc's, except for the latter having a bunch of #if RE_ENABLE_I18N > >> directives added. > >> > >> -Dimitry > >> > >>> On 28 Mar 2023, at 08:46, arnold@HIDDEN wrote: > >>> > >>> This does not reproduce with gawk, even when I force use of the regex > >>> matcher. > >>> > >>> What happens if grep is built with the included regex files instead of > >>> relying on the ones in the local glibc? > >>> > >>> Arnold > >>> > >>> Dimitry Andric <dimitry@HIDDEN> wrote: > >>> > >>>> On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: > >>>>> > >>>>> Running the command: > >>>>> > >>>>> echo a | grep -E -w '((()|a)|())*' > >>>>> > >>>>> does not terminate, and uses a LOT of processor time, for all versions of > >>>>> grep I have tried. > >>>>> > >>>>> This is the smallest case that could be found; simplifying anything in the > >>>>> input and/or expression leads to correct behavior. > >>>> > >>>> Reproducible with GNU grep 3.7 on Ubuntu 22: > >>>> > >>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > >>>> 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 0:08.32 grep -E -w ((()|a)|())* > >>>> > >>>> It seems that at least on Ubuntu, grep in this mode uses glibc's regexec(3), and it is that implementation which ends up in an endless loop. > >>>> > >>>> It loops between lines 1415, 1417 and 1443, but idx and cur_node never change from 0: > >>>> > >>>> 1378 static reg_errcode_t > >>>> 1379 __attribute_warn_unused_result__ > >>>> 1380 set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch, > >>>> 1381 regmatch_t *pmatch, bool fl_backtrack) > >>>> 1382 { > >>>> ... > >>>> 1415 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > >>>> 1416 { > >>>> 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > >>>> 1418 > >>>> 1419 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > >>>> 1420 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > >>>> ... > >>>> 1442 /* Proceed to next node. */ > >>>> 1443 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > >>>> 1444 &idx, cur_node, > >>>> 1445 &eps_via_nodes, fs); > >>>> 1446 > >>>> 1447 if (__glibc_unlikely (cur_node < 0)) > >>>> ... > >>>> 1465 } > >>>> 1466 } > >>>> > >>>> -Dimitry > >>>> > >>>> P.S.: Interestingly this does not reproduce with BSD grep, which returns immediately with "a". > >>>> > >> >
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at submit) by debbugs.gnu.org; 2 Apr 2023 10:04:18 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Apr 02 06:04:18 2023 Received: from localhost ([127.0.0.1]:39240 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1piuZN-0000fk-QN for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 06:04:18 -0400 Received: from lists.gnu.org ([209.51.188.17]:39736) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <arnold@HIDDEN>) id 1piuZL-0000fb-ID for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 06:04:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <arnold@HIDDEN>) id 1piuZK-000719-JP for bug-grep@HIDDEN; Sun, 02 Apr 2023 06:04:14 -0400 Received: from frenzy.freefriends.org ([198.99.81.75] helo=freefriends.org) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <arnold@HIDDEN>) id 1piuZI-0002di-Og for bug-grep@HIDDEN; Sun, 02 Apr 2023 06:04:14 -0400 X-Envelope-From: arnold@HIDDEN Received: from freefriends.org (frenzy.freefriends.org [198.99.81.75]) by freefriends.org (8.14.7/8.14.7) with ESMTP id 332A3umP023104 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sun, 2 Apr 2023 04:03:57 -0600 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id 332A3uGT023103; Sun, 2 Apr 2023 04:03:56 -0600 From: arnold@HIDDEN Message-Id: <202304021003.332A3uGT023103@HIDDEN> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@HIDDEN using -f Date: Sun, 02 Apr 2023 04:03:56 -0600 To: dimitry@HIDDEN, arnold@HIDDEN Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> <202304020652.3326q2Ut024466@HIDDEN> <4DC78C45-1C0C-4A9E-BBFB-6F752AC047F5@HIDDEN> In-Reply-To: <4DC78C45-1C0C-4A9E-BBFB-6F752AC047F5@HIDDEN> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Received-SPF: none client-ip=198.99.81.75; envelope-from=arnold@HIDDEN; helo=freefriends.org X-Spam_score_int: -14 X-Spam_score: -1.5 X-Spam_bar: - X-Spam_report: (-1.5 / 5.0 requ) BAYES_00=-1.9, FORGED_SPF_HELO=0.001, KHOP_HELO_FCRDNS=0.399, SPF_HELO_PASS=-0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: submit Cc: 62483 <at> debbugs.gnu.org, koen@HIDDEN, bug-grep@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) OK, thanks! Dimitry Andric <dimitry@HIDDEN> wrote: > Ah sorry, I did indeed rebuild grep with --with-included-regex, and for > debugging purposes added CFLAGS="-O0 -g". > > In any case, the problematic code is both in glibc and grep, as I > believe these are originating from the same source. > > -Dimitry > > > On 2 Apr 2023, at 08:52, arnold@HIDDEN wrote: > > > > Hi. > > > > Dimitry Andric <dimitry@HIDDEN> wrote: > > > >> Yes, it still reproduces when I configure the latest grep using > >> --without-included-regex: > > > > I assume you meant --with-included-regex? > > > > If you really used --without-included-regex, that doesn't prove anything... :-) > > > > It's interesting, as gawk uses the same regex, but with different flags. > > It might be worth trying to understand which of the syntax bits > > is causing this. > > > > Thanks, > > > > Arnold > > > >> > >> 1400 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > >> 1: idx = 0 > >> (gdb) > >> 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > >> 1: idx = 0 > >> (gdb) > >> 1404 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > >> 1: idx = 0 > >> (gdb) > >> 1405 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > >> 1: idx = 0 > >> (gdb) > >> 1428 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > >> 1: idx = 0 > >> (gdb) > >> 1432 if (__glibc_unlikely (cur_node < 0)) > >> 1: idx = 0 > >> (gdb) > >> 1400 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > >> 1: idx = 0 > >> (gdb) > >> 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > >> 1: idx = 0 > >> (gdb) > >> 1404 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > >> 1: idx = 0 > >> (gdb) > >> 1405 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > >> 1: idx = 0 > >> > >> The endless loop looks the same. grep's regexec.c is mostly the same as > >> glibc's, except for the latter having a bunch of #if RE_ENABLE_I18N > >> directives added. > >> > >> -Dimitry > >> > >>> On 28 Mar 2023, at 08:46, arnold@HIDDEN wrote: > >>> > >>> This does not reproduce with gawk, even when I force use of the regex > >>> matcher. > >>> > >>> What happens if grep is built with the included regex files instead of > >>> relying on the ones in the local glibc? > >>> > >>> Arnold > >>> > >>> Dimitry Andric <dimitry@HIDDEN> wrote: > >>> > >>>> On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: > >>>>> > >>>>> Running the command: > >>>>> > >>>>> echo a | grep -E -w '((()|a)|())*' > >>>>> > >>>>> does not terminate, and uses a LOT of processor time, for all versions of > >>>>> grep I have tried. > >>>>> > >>>>> This is the smallest case that could be found; simplifying anything in the > >>>>> input and/or expression leads to correct behavior. > >>>> > >>>> Reproducible with GNU grep 3.7 on Ubuntu 22: > >>>> > >>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > >>>> 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 0:08.32 grep -E -w ((()|a)|())* > >>>> > >>>> It seems that at least on Ubuntu, grep in this mode uses glibc's regexec(3), and it is that implementation which ends up in an endless loop. > >>>> > >>>> It loops between lines 1415, 1417 and 1443, but idx and cur_node never change from 0: > >>>> > >>>> 1378 static reg_errcode_t > >>>> 1379 __attribute_warn_unused_result__ > >>>> 1380 set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch, > >>>> 1381 regmatch_t *pmatch, bool fl_backtrack) > >>>> 1382 { > >>>> ... > >>>> 1415 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > >>>> 1416 { > >>>> 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > >>>> 1418 > >>>> 1419 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > >>>> 1420 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > >>>> ... > >>>> 1442 /* Proceed to next node. */ > >>>> 1443 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > >>>> 1444 &idx, cur_node, > >>>> 1445 &eps_via_nodes, fs); > >>>> 1446 > >>>> 1447 if (__glibc_unlikely (cur_node < 0)) > >>>> ... > >>>> 1465 } > >>>> 1466 } > >>>> > >>>> -Dimitry > >>>> > >>>> P.S.: Interestingly this does not reproduce with BSD grep, which returns immediately with "a". > >>>> > >> >
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at submit) by debbugs.gnu.org; 2 Apr 2023 09:34:02 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Apr 02 05:34:02 2023 Received: from localhost ([127.0.0.1]:39227 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1piu65-0008AY-Hx for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 05:34:02 -0400 Received: from lists.gnu.org ([209.51.188.17]:45166) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <dimitry@HIDDEN>) id 1piu64-0008AQ-1g for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 05:34:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <dimitry@HIDDEN>) id 1piu62-0002Ka-6N for bug-grep@HIDDEN; Sun, 02 Apr 2023 05:33:59 -0400 Received: from tensor.andric.com ([87.251.56.140]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <dimitry@HIDDEN>) id 1piu5y-00074R-0e for bug-grep@HIDDEN; Sun, 02 Apr 2023 05:33:55 -0400 Received: from smtpclient.apple (longrow.home.andric.com [192.168.0.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by tensor.andric.com (Postfix) with ESMTPSA id 6ACDF6C235; Sun, 2 Apr 2023 11:33:48 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=andric.com; s=201904; t=1680428028; bh=j++piMyInCb64AqGnHXFa/agOcQZYX5hFJCloNa82XE=; h=Subject:From:In-Reply-To:Date:Cc:Message-Id:References:To:From; b=OvpoxsYZyxsgQkhRl/xlrsYK62rT5hKW4sMbC0u3aDCYe7gYbalcM0EcbVcuZmsEp Ajq9eUTN6tzSLEXBOVFlEwFzoS/gkuLXqmKFqU4TKOJpsjfoXca+iNS3RN7w7k6HAP FyHBj1TZpHtlFZQW6GWjX1LTGv9+R+XmXi7cSlUAHzDwjONBsFO40E29XbZDoqHYdD xzUcUJSN1hpy7an+xzhBje/vvne3qpIEbReGf/Aab0z0j/KCjM6GT81+SKIog+el7H vLVb8tU69x6+y1CyoGlsfZ0IY17+aUdgLalblC2o7G4l8UObQXMmN1jOZ7QLq9QPgr OWfWv7UbnXYdg== Content-Type: multipart/signed; boundary="Apple-Mail=_C7CE2C4A-4BF8-4D9F-ABFD-9F42FF447F13"; protocol="application/pgp-signature"; micalg=pgp-sha1 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.500.231\)) Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate From: Dimitry Andric <dimitry@HIDDEN> In-Reply-To: <202304020652.3326q2Ut024466@HIDDEN> Date: Sun, 2 Apr 2023 11:33:42 +0200 Message-Id: <4DC78C45-1C0C-4A9E-BBFB-6F752AC047F5@HIDDEN> References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> <202304020652.3326q2Ut024466@HIDDEN> To: arnold@HIDDEN X-Mailer: Apple Mail (2.3731.500.231) Received-SPF: pass client-ip=87.251.56.140; envelope-from=dimitry@HIDDEN; helo=tensor.andric.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.4 (-) X-Debbugs-Envelope-To: submit Cc: 62483 <at> debbugs.gnu.org, Koen Claessen <koen@HIDDEN>, bug-grep@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -2.4 (--) --Apple-Mail=_C7CE2C4A-4BF8-4D9F-ABFD-9F42FF447F13 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Ah sorry, I did indeed rebuild grep with --with-included-regex, and for debugging purposes added CFLAGS=3D"-O0 -g". In any case, the problematic code is both in glibc and grep, as I believe these are originating from the same source. -Dimitry > On 2 Apr 2023, at 08:52, arnold@HIDDEN wrote: >=20 > Hi. >=20 > Dimitry Andric <dimitry@HIDDEN> wrote: >=20 >> Yes, it still reproduces when I configure the latest grep using >> --without-included-regex: >=20 > I assume you meant --with-included-regex? >=20 > If you really used --without-included-regex, that doesn't prove = anything... :-) >=20 > It's interesting, as gawk uses the same regex, but with different = flags. > It might be worth trying to understand which of the syntax bits > is causing this. >=20 > Thanks, >=20 > Arnold >=20 >>=20 >> 1400 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) >> 1: idx =3D 0 >> (gdb) >> 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, = idx, nmatch); >> 1: idx =3D 0 >> (gdb) >> 1404 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) >> 1: idx =3D 0 >> (gdb) >> 1405 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) >> 1: idx =3D 0 >> (gdb) >> 1428 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, >> 1: idx =3D 0 >> (gdb) >> 1432 if (__glibc_unlikely (cur_node < 0)) >> 1: idx =3D 0 >> (gdb) >> 1400 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) >> 1: idx =3D 0 >> (gdb) >> 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, = idx, nmatch); >> 1: idx =3D 0 >> (gdb) >> 1404 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) >> 1: idx =3D 0 >> (gdb) >> 1405 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) >> 1: idx =3D 0 >>=20 >> The endless loop looks the same. grep's regexec.c is mostly the same = as >> glibc's, except for the latter having a bunch of #if RE_ENABLE_I18N >> directives added. >>=20 >> -Dimitry >>=20 >>> On 28 Mar 2023, at 08:46, arnold@HIDDEN wrote: >>>=20 >>> This does not reproduce with gawk, even when I force use of the = regex >>> matcher. >>>=20 >>> What happens if grep is built with the included regex files instead = of >>> relying on the ones in the local glibc? >>>=20 >>> Arnold >>>=20 >>> Dimitry Andric <dimitry@HIDDEN> wrote: >>>=20 >>>> On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: >>>>>=20 >>>>> Running the command: >>>>>=20 >>>>> echo a | grep -E -w '((()|a)|())*' >>>>>=20 >>>>> does not terminate, and uses a LOT of processor time, for all = versions of >>>>> grep I have tried. >>>>>=20 >>>>> This is the smallest case that could be found; simplifying = anything in the >>>>> input and/or expression leads to correct behavior. >>>>=20 >>>> Reproducible with GNU grep 3.7 on Ubuntu 22: >>>>=20 >>>> PID USER PR NI VIRT RES SHR S %CPU %MEM = TIME+ COMMAND >>>> 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 = 0:08.32 grep -E -w ((()|a)|())* >>>>=20 >>>> It seems that at least on Ubuntu, grep in this mode uses glibc's = regexec(3), and it is that implementation which ends up in an endless = loop. >>>>=20 >>>> It loops between lines 1415, 1417 and 1443, but idx and cur_node = never change from 0: >>>>=20 >>>> 1378 static reg_errcode_t >>>> 1379 __attribute_warn_unused_result__ >>>> 1380 set_regs (const regex_t *preg, const re_match_context_t = *mctx, size_t nmatch, >>>> 1381 regmatch_t *pmatch, bool fl_backtrack) >>>> 1382 { >>>> ... >>>> 1415 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) >>>> 1416 { >>>> 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, = idx, nmatch); >>>> 1418 >>>> 1419 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) >>>> 1420 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) >>>> ... >>>> 1442 /* Proceed to next node. */ >>>> 1443 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, >>>> 1444 &idx, cur_node, >>>> 1445 &eps_via_nodes, fs); >>>> 1446 >>>> 1447 if (__glibc_unlikely (cur_node < 0)) >>>> ... >>>> 1465 } >>>> 1466 } >>>>=20 >>>> -Dimitry >>>>=20 >>>> P.S.: Interestingly this does not reproduce with BSD grep, which = returns immediately with "a". >>>>=20 >>=20 --Apple-Mail=_C7CE2C4A-4BF8-4D9F-ABFD-9F42FF447F13 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.2 iF0EARECAB0WIQR6tGLSzjX8bUI5T82wXqMKLiCWowUCZClL9gAKCRCwXqMKLiCW o0KTAKCiIhcduUQaGUuiJfykNQNnR6+9UACg7EEpahAec8VtdrBL/qhtw0BjprA= =F/9f -----END PGP SIGNATURE----- --Apple-Mail=_C7CE2C4A-4BF8-4D9F-ABFD-9F42FF447F13--
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at 62483) by debbugs.gnu.org; 2 Apr 2023 09:33:54 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Apr 02 05:33:54 2023 Received: from localhost ([127.0.0.1]:39224 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1piu5x-0008AC-Vd for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 05:33:54 -0400 Received: from tensor.andric.com ([87.251.56.140]:53462 ident=postfix) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <dimitry@HIDDEN>) id 1piu5u-0008A0-Gk for 62483 <at> debbugs.gnu.org; Sun, 02 Apr 2023 05:33:52 -0400 Received: from smtpclient.apple (longrow.home.andric.com [192.168.0.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by tensor.andric.com (Postfix) with ESMTPSA id 6ACDF6C235; Sun, 2 Apr 2023 11:33:48 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=andric.com; s=201904; t=1680428028; bh=j++piMyInCb64AqGnHXFa/agOcQZYX5hFJCloNa82XE=; h=Subject:From:In-Reply-To:Date:Cc:Message-Id:References:To:From; b=OvpoxsYZyxsgQkhRl/xlrsYK62rT5hKW4sMbC0u3aDCYe7gYbalcM0EcbVcuZmsEp Ajq9eUTN6tzSLEXBOVFlEwFzoS/gkuLXqmKFqU4TKOJpsjfoXca+iNS3RN7w7k6HAP FyHBj1TZpHtlFZQW6GWjX1LTGv9+R+XmXi7cSlUAHzDwjONBsFO40E29XbZDoqHYdD xzUcUJSN1hpy7an+xzhBje/vvne3qpIEbReGf/Aab0z0j/KCjM6GT81+SKIog+el7H vLVb8tU69x6+y1CyoGlsfZ0IY17+aUdgLalblC2o7G4l8UObQXMmN1jOZ7QLq9QPgr OWfWv7UbnXYdg== Content-Type: multipart/signed; boundary="Apple-Mail=_C7CE2C4A-4BF8-4D9F-ABFD-9F42FF447F13"; protocol="application/pgp-signature"; micalg=pgp-sha1 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.500.231\)) Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate From: Dimitry Andric <dimitry@HIDDEN> In-Reply-To: <202304020652.3326q2Ut024466@HIDDEN> Date: Sun, 2 Apr 2023 11:33:42 +0200 Message-Id: <4DC78C45-1C0C-4A9E-BBFB-6F752AC047F5@HIDDEN> References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> <202304020652.3326q2Ut024466@HIDDEN> To: arnold@HIDDEN X-Mailer: Apple Mail (2.3731.500.231) X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 62483 Cc: 62483 <at> debbugs.gnu.org, Koen Claessen <koen@HIDDEN>, bug-grep@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-) --Apple-Mail=_C7CE2C4A-4BF8-4D9F-ABFD-9F42FF447F13 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Ah sorry, I did indeed rebuild grep with --with-included-regex, and for debugging purposes added CFLAGS=3D"-O0 -g". In any case, the problematic code is both in glibc and grep, as I believe these are originating from the same source. -Dimitry > On 2 Apr 2023, at 08:52, arnold@HIDDEN wrote: >=20 > Hi. >=20 > Dimitry Andric <dimitry@HIDDEN> wrote: >=20 >> Yes, it still reproduces when I configure the latest grep using >> --without-included-regex: >=20 > I assume you meant --with-included-regex? >=20 > If you really used --without-included-regex, that doesn't prove = anything... :-) >=20 > It's interesting, as gawk uses the same regex, but with different = flags. > It might be worth trying to understand which of the syntax bits > is causing this. >=20 > Thanks, >=20 > Arnold >=20 >>=20 >> 1400 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) >> 1: idx =3D 0 >> (gdb) >> 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, = idx, nmatch); >> 1: idx =3D 0 >> (gdb) >> 1404 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) >> 1: idx =3D 0 >> (gdb) >> 1405 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) >> 1: idx =3D 0 >> (gdb) >> 1428 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, >> 1: idx =3D 0 >> (gdb) >> 1432 if (__glibc_unlikely (cur_node < 0)) >> 1: idx =3D 0 >> (gdb) >> 1400 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) >> 1: idx =3D 0 >> (gdb) >> 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, = idx, nmatch); >> 1: idx =3D 0 >> (gdb) >> 1404 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) >> 1: idx =3D 0 >> (gdb) >> 1405 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) >> 1: idx =3D 0 >>=20 >> The endless loop looks the same. grep's regexec.c is mostly the same = as >> glibc's, except for the latter having a bunch of #if RE_ENABLE_I18N >> directives added. >>=20 >> -Dimitry >>=20 >>> On 28 Mar 2023, at 08:46, arnold@HIDDEN wrote: >>>=20 >>> This does not reproduce with gawk, even when I force use of the = regex >>> matcher. >>>=20 >>> What happens if grep is built with the included regex files instead = of >>> relying on the ones in the local glibc? >>>=20 >>> Arnold >>>=20 >>> Dimitry Andric <dimitry@HIDDEN> wrote: >>>=20 >>>> On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: >>>>>=20 >>>>> Running the command: >>>>>=20 >>>>> echo a | grep -E -w '((()|a)|())*' >>>>>=20 >>>>> does not terminate, and uses a LOT of processor time, for all = versions of >>>>> grep I have tried. >>>>>=20 >>>>> This is the smallest case that could be found; simplifying = anything in the >>>>> input and/or expression leads to correct behavior. >>>>=20 >>>> Reproducible with GNU grep 3.7 on Ubuntu 22: >>>>=20 >>>> PID USER PR NI VIRT RES SHR S %CPU %MEM = TIME+ COMMAND >>>> 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 = 0:08.32 grep -E -w ((()|a)|())* >>>>=20 >>>> It seems that at least on Ubuntu, grep in this mode uses glibc's = regexec(3), and it is that implementation which ends up in an endless = loop. >>>>=20 >>>> It loops between lines 1415, 1417 and 1443, but idx and cur_node = never change from 0: >>>>=20 >>>> 1378 static reg_errcode_t >>>> 1379 __attribute_warn_unused_result__ >>>> 1380 set_regs (const regex_t *preg, const re_match_context_t = *mctx, size_t nmatch, >>>> 1381 regmatch_t *pmatch, bool fl_backtrack) >>>> 1382 { >>>> ... >>>> 1415 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) >>>> 1416 { >>>> 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, = idx, nmatch); >>>> 1418 >>>> 1419 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) >>>> 1420 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) >>>> ... >>>> 1442 /* Proceed to next node. */ >>>> 1443 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, >>>> 1444 &idx, cur_node, >>>> 1445 &eps_via_nodes, fs); >>>> 1446 >>>> 1447 if (__glibc_unlikely (cur_node < 0)) >>>> ... >>>> 1465 } >>>> 1466 } >>>>=20 >>>> -Dimitry >>>>=20 >>>> P.S.: Interestingly this does not reproduce with BSD grep, which = returns immediately with "a". >>>>=20 >>=20 --Apple-Mail=_C7CE2C4A-4BF8-4D9F-ABFD-9F42FF447F13 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.2 iF0EARECAB0WIQR6tGLSzjX8bUI5T82wXqMKLiCWowUCZClL9gAKCRCwXqMKLiCW o0KTAKCiIhcduUQaGUuiJfykNQNnR6+9UACg7EEpahAec8VtdrBL/qhtw0BjprA= =F/9f -----END PGP SIGNATURE----- --Apple-Mail=_C7CE2C4A-4BF8-4D9F-ABFD-9F42FF447F13--
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at 62483) by debbugs.gnu.org; 2 Apr 2023 06:52:22 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Apr 02 02:52:22 2023 Received: from localhost ([127.0.0.1]:39032 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pirZe-0003Td-3X for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 02:52:22 -0400 Received: from frenzy.freefriends.org ([198.99.81.75]:57092 helo=freefriends.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <arnold@HIDDEN>) id 1pirZb-0003TU-Kl for 62483 <at> debbugs.gnu.org; Sun, 02 Apr 2023 02:52:20 -0400 X-Envelope-From: arnold@HIDDEN Received: from freefriends.org (frenzy.freefriends.org [198.99.81.75]) by freefriends.org (8.14.7/8.14.7) with ESMTP id 3326q3Cs024467 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sun, 2 Apr 2023 00:52:04 -0600 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id 3326q2Ut024466; Sun, 2 Apr 2023 00:52:02 -0600 From: arnold@HIDDEN Message-Id: <202304020652.3326q2Ut024466@HIDDEN> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@HIDDEN using -f Date: Sun, 02 Apr 2023 00:52:02 -0600 To: dimitry@HIDDEN, arnold@HIDDEN Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> In-Reply-To: <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 62483 Cc: 62483 <at> debbugs.gnu.org, koen@HIDDEN, bug-grep@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: 0.0 (/) Hi. Dimitry Andric <dimitry@HIDDEN> wrote: > Yes, it still reproduces when I configure the latest grep using > --without-included-regex: I assume you meant --with-included-regex? If you really used --without-included-regex, that doesn't prove anything... :-) It's interesting, as gawk uses the same regex, but with different flags. It might be worth trying to understand which of the syntax bits is causing this. Thanks, Arnold > > 1400 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > 1: idx = 0 > (gdb) > 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > 1: idx = 0 > (gdb) > 1404 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > 1: idx = 0 > (gdb) > 1405 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > 1: idx = 0 > (gdb) > 1428 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > 1: idx = 0 > (gdb) > 1432 if (__glibc_unlikely (cur_node < 0)) > 1: idx = 0 > (gdb) > 1400 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > 1: idx = 0 > (gdb) > 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > 1: idx = 0 > (gdb) > 1404 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > 1: idx = 0 > (gdb) > 1405 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > 1: idx = 0 > > The endless loop looks the same. grep's regexec.c is mostly the same as > glibc's, except for the latter having a bunch of #if RE_ENABLE_I18N > directives added. > > -Dimitry > > > On 28 Mar 2023, at 08:46, arnold@HIDDEN wrote: > > > > This does not reproduce with gawk, even when I force use of the regex > > matcher. > > > > What happens if grep is built with the included regex files instead of > > relying on the ones in the local glibc? > > > > Arnold > > > > Dimitry Andric <dimitry@HIDDEN> wrote: > > > >> On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: > >>> > >>> Running the command: > >>> > >>> echo a | grep -E -w '((()|a)|())*' > >>> > >>> does not terminate, and uses a LOT of processor time, for all versions of > >>> grep I have tried. > >>> > >>> This is the smallest case that could be found; simplifying anything in the > >>> input and/or expression leads to correct behavior. > >> > >> Reproducible with GNU grep 3.7 on Ubuntu 22: > >> > >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > >> 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 0:08.32 grep -E -w ((()|a)|())* > >> > >> It seems that at least on Ubuntu, grep in this mode uses glibc's regexec(3), and it is that implementation which ends up in an endless loop. > >> > >> It loops between lines 1415, 1417 and 1443, but idx and cur_node never change from 0: > >> > >> 1378 static reg_errcode_t > >> 1379 __attribute_warn_unused_result__ > >> 1380 set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch, > >> 1381 regmatch_t *pmatch, bool fl_backtrack) > >> 1382 { > >> ... > >> 1415 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > >> 1416 { > >> 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > >> 1418 > >> 1419 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > >> 1420 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > >> ... > >> 1442 /* Proceed to next node. */ > >> 1443 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > >> 1444 &idx, cur_node, > >> 1445 &eps_via_nodes, fs); > >> 1446 > >> 1447 if (__glibc_unlikely (cur_node < 0)) > >> ... > >> 1465 } > >> 1466 } > >> > >> -Dimitry > >> > >> P.S.: Interestingly this does not reproduce with BSD grep, which returns immediately with "a". > >> >
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at submit) by debbugs.gnu.org; 2 Apr 2023 06:52:31 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Apr 02 02:52:31 2023 Received: from localhost ([127.0.0.1]:39035 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pirZm-0003Tz-R9 for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 02:52:31 -0400 Received: from lists.gnu.org ([209.51.188.17]:50340) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <arnold@HIDDEN>) id 1pirZk-0003To-83 for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 02:52:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <arnold@HIDDEN>) id 1pirZh-0008FE-Kz for bug-grep@HIDDEN; Sun, 02 Apr 2023 02:52:25 -0400 Received: from frenzy.freefriends.org ([198.99.81.75] helo=freefriends.org) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <arnold@HIDDEN>) id 1pirZf-0006yO-Kg for bug-grep@HIDDEN; Sun, 02 Apr 2023 02:52:25 -0400 X-Envelope-From: arnold@HIDDEN Received: from freefriends.org (frenzy.freefriends.org [198.99.81.75]) by freefriends.org (8.14.7/8.14.7) with ESMTP id 3326q3Cs024467 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sun, 2 Apr 2023 00:52:04 -0600 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id 3326q2Ut024466; Sun, 2 Apr 2023 00:52:02 -0600 From: arnold@HIDDEN Message-Id: <202304020652.3326q2Ut024466@HIDDEN> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@HIDDEN using -f Date: Sun, 02 Apr 2023 00:52:02 -0600 To: dimitry@HIDDEN, arnold@HIDDEN Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> In-Reply-To: <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Received-SPF: none client-ip=198.99.81.75; envelope-from=arnold@HIDDEN; helo=freefriends.org X-Spam_score_int: -14 X-Spam_score: -1.5 X-Spam_bar: - X-Spam_report: (-1.5 / 5.0 requ) BAYES_00=-1.9, FORGED_SPF_HELO=0.001, KHOP_HELO_FCRDNS=0.399, SPF_HELO_PASS=-0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: submit Cc: 62483 <at> debbugs.gnu.org, koen@HIDDEN, bug-grep@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Hi. Dimitry Andric <dimitry@HIDDEN> wrote: > Yes, it still reproduces when I configure the latest grep using > --without-included-regex: I assume you meant --with-included-regex? If you really used --without-included-regex, that doesn't prove anything... :-) It's interesting, as gawk uses the same regex, but with different flags. It might be worth trying to understand which of the syntax bits is causing this. Thanks, Arnold > > 1400 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > 1: idx = 0 > (gdb) > 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > 1: idx = 0 > (gdb) > 1404 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > 1: idx = 0 > (gdb) > 1405 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > 1: idx = 0 > (gdb) > 1428 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > 1: idx = 0 > (gdb) > 1432 if (__glibc_unlikely (cur_node < 0)) > 1: idx = 0 > (gdb) > 1400 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > 1: idx = 0 > (gdb) > 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > 1: idx = 0 > (gdb) > 1404 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > 1: idx = 0 > (gdb) > 1405 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > 1: idx = 0 > > The endless loop looks the same. grep's regexec.c is mostly the same as > glibc's, except for the latter having a bunch of #if RE_ENABLE_I18N > directives added. > > -Dimitry > > > On 28 Mar 2023, at 08:46, arnold@HIDDEN wrote: > > > > This does not reproduce with gawk, even when I force use of the regex > > matcher. > > > > What happens if grep is built with the included regex files instead of > > relying on the ones in the local glibc? > > > > Arnold > > > > Dimitry Andric <dimitry@HIDDEN> wrote: > > > >> On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: > >>> > >>> Running the command: > >>> > >>> echo a | grep -E -w '((()|a)|())*' > >>> > >>> does not terminate, and uses a LOT of processor time, for all versions of > >>> grep I have tried. > >>> > >>> This is the smallest case that could be found; simplifying anything in the > >>> input and/or expression leads to correct behavior. > >> > >> Reproducible with GNU grep 3.7 on Ubuntu 22: > >> > >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > >> 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 0:08.32 grep -E -w ((()|a)|())* > >> > >> It seems that at least on Ubuntu, grep in this mode uses glibc's regexec(3), and it is that implementation which ends up in an endless loop. > >> > >> It loops between lines 1415, 1417 and 1443, but idx and cur_node never change from 0: > >> > >> 1378 static reg_errcode_t > >> 1379 __attribute_warn_unused_result__ > >> 1380 set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch, > >> 1381 regmatch_t *pmatch, bool fl_backtrack) > >> 1382 { > >> ... > >> 1415 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > >> 1416 { > >> 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > >> 1418 > >> 1419 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > >> 1420 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > >> ... > >> 1442 /* Proceed to next node. */ > >> 1443 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > >> 1444 &idx, cur_node, > >> 1445 &eps_via_nodes, fs); > >> 1446 > >> 1447 if (__glibc_unlikely (cur_node < 0)) > >> ... > >> 1465 } > >> 1466 } > >> > >> -Dimitry > >> > >> P.S.: Interestingly this does not reproduce with BSD grep, which returns immediately with "a". > >> >
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at 62483) by debbugs.gnu.org; 2 Apr 2023 04:15:29 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Apr 02 00:15:29 2023 Received: from localhost ([127.0.0.1]:38815 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pip7p-00072N-2s for submit <at> debbugs.gnu.org; Sun, 02 Apr 2023 00:15:29 -0400 Received: from mail-lj1-f177.google.com ([209.85.208.177]:41554) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <meyering@HIDDEN>) id 1pip7m-000728-2d for 62483 <at> debbugs.gnu.org; Sun, 02 Apr 2023 00:15:27 -0400 Received: by mail-lj1-f177.google.com with SMTP id bx10so8705350ljb.8 for <62483 <at> debbugs.gnu.org>; Sat, 01 Apr 2023 21:15:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680408920; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=DodkkdU3opFDpkTR7I38Cd8sH5Ml6OzzLr47sEPUB+w=; b=xz/uM78Xeo+KZkaGtVa1P9mQjfLV1aivcei2WEfZNoOFMTOkBwpKonRG/ba1JbNr83 pYR+P1Z7IOyoWVpnhev0bwAe6bG4XFDHsU+cW10ImcfNNcmiK1ArWvyofz13j+DNKQ5n /vWdkbrQkfucGebgx/sLK7Jdh8OA0tZt1gbZNJnPjeTPcopGYPDCigA2gGQoEBVawv7g 4x6kAG7avDpCkKtc8U9mnvqk+UI472MyheOtQNwoE3M/v2Hv8m9zsDYT6bHBxxGdDYle 4lcLUHMuIgzGnfkj3/hevOpflvPGZxjPyKTr8WEjYK3ZapFlvI9VQTWpnbJK5hO6Xz77 hAxw== X-Gm-Message-State: AAQBX9e7GYXZxDzQFlrGuxweB9jVGPGIK9j3o02zJXm7N/ETVFXjGl10 mYjRz0gfA7+RtfFFOJlo5APyBjtu4o0gOz8KPHQ= X-Google-Smtp-Source: AKy350ZblXwqXc5nXQDfPymdw4nPR2IN4imJcGePAqD1vZp9OXVu/8aqbvMVAAgh/JkXmi0+y0PDQvbteeLWQRYc1Jg= X-Received: by 2002:a2e:95d8:0:b0:29a:9053:ed1b with SMTP id y24-20020a2e95d8000000b0029a9053ed1bmr9738568ljh.3.1680408919857; Sat, 01 Apr 2023 21:15:19 -0700 (PDT) MIME-Version: 1.0 References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> In-Reply-To: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> From: Jim Meyering <jim@HIDDEN> Date: Sat, 1 Apr 2023 21:15:07 -0700 Message-ID: <CA+8g5KHc9W4dBJufWv96wd53ZE4fcYxrynuUEh9vNR-_1LFYbQ@HIDDEN> Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate To: Koen Claessen <koen@HIDDEN> Content-Type: multipart/mixed; boundary="000000000000bfb5b205f852b0a6" X-Spam-Score: 0.2 (/) X-Debbugs-Envelope-To: 62483 Cc: 62483 <at> debbugs.gnu.org X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -0.8 (/) --000000000000bfb5b205f852b0a6 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, Mar 27, 2023 at 6:15=E2=80=AFAM Koen Claessen <koen@HIDDEN> wr= ote: > Running the command: > > echo a | grep -E -w '((()|a)|())*' > > does not terminate, and uses a LOT of processor time, for all versions of > grep I have tried. > > This is the smallest case that could be found; simplifying anything in th= e > input and/or expression leads to correct behavior. Thank you! How did you find that? FYI, this strikes grep-3.10 (on Fedora 37/glibc-2.36-9.fc37.x86_64) when using LC_ALL=3Den_US.UTF-8, but not with LC_ALL=3DC. I.e., this infloops: echo a | LC_ALL=3Den_US.UTF-8 grep -E -w '((()|a)|())*' but this works as expected and promptly prints its line of input: echo a | LC_ALL=3DC grep -E -w '((()|a)|())*' For now, I've added an expected-failing test case for this bug: --000000000000bfb5b205f852b0a6 Content-Type: application/octet-stream; name="grep-glibc-infloop.patch" Content-Disposition: attachment; filename="grep-glibc-infloop.patch" Content-Transfer-Encoding: base64 Content-ID: <f_lfyw0lqb0> X-Attachment-Id: f_lfyw0lqb0 RnJvbSBlNDQ4ZTg2YjVjMDFjMGE5OGIxNmU4NGQ2MWVkZTc5NDAxNzBiYjViIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBKaW0gTWV5ZXJpbmcgPG1leWVyaW5nQGZiLmNvbT4KRGF0ZTog U2F0LCAxIEFwciAyMDIzIDIxOjAxOjAxIC0wNzAwClN1YmplY3Q6IFtQQVRDSF0gdGVzdHM6IGFk ZCBhIGtub3duLWZhaWxpbmcgZ2xpYmMtaW5mbG9vcCB0ZXN0CgoqIHRlc3RzL2dsaWJjLWluZmxv b3A6IE5ldyBmaWxlLgpCYXNlZCBvbiB0aGUgY29tbWFuZCBmcm9tIEtvZW4gQ2xhZXNzZW4KcmVw b3J0ZWQgaW4gaHR0cHM6Ly9idWdzLmdudS5vcmcvNjI0ODMKKiB0ZXN0cy9NYWtlZmlsZS5hbSAo VEVTVFMpOiBBZGQgdGhlIGZpbGUgbmFtZQoqIFRIQU5LUy5pbjogQWRkIG5hbWUgb2YgcmVwb3J0 ZXIuCi0tLQogVEhBTktTLmluICAgICAgICAgICB8ICAxICsKIHRlc3RzL01ha2VmaWxlLmFtICAg fCAgMyArKysKIHRlc3RzL2dsaWJjLWluZmxvb3AgfCAxMyArKysrKysrKysrKysrCiAzIGZpbGVz IGNoYW5nZWQsIDE3IGluc2VydGlvbnMoKykKIGNyZWF0ZSBtb2RlIDEwMDc1NSB0ZXN0cy9nbGli Yy1pbmZsb29wCgpkaWZmIC0tZ2l0IGEvVEhBTktTLmluIGIvVEhBTktTLmluCmluZGV4IGZkMzYx OTIuLjA2ZjhjNTQgMTAwNjQ0Ci0tLSBhL1RIQU5LUy5pbgorKysgYi9USEFOS1MuaW4KQEAgLTU2 LDYgKzU2LDcgQEAgS2FybCBQZXR0ZXJzc29uICAgICAgICAgICAgICAgICAgICAga2FybC5wZXR0 ZXJzc29uQGtscG4uc2UKIEthdmVoIFIuIEdoYXppICAgICAgICAgICAgICAgICAgICAgIGdoYXpp QGNhaXAucnV0Z2Vycy5lZHUKIEthenVybyBGdXJ1a2F3YSAgICAgICAgICAgICAgICAgICAgIGZ1 cnVrYXdhQGFwcmljb3Qua2VrLmpwCiBLZWl0aCBCb3N0aWMgICAgICAgICAgICAgICAgICAgICAg ICBib3N0aWNAYnNkaS5jb20KK0tvZW4gQ2xhZXNzZW4gICAgICAgICAgICAgICAgICAgICAgIGtv ZW5AY2hhbG1lcnMuc2UKIEtyaXNobmEgU2V0aHVyYW1hbiAgICAgICAgICAgICAgICAgIGtyaXNo bmFAc2dpaHViLmNvcnAuc2dpLmNvbQogS3VydCBEIFNjaHdlaHIgICAgICAgICAgICAgICAgICAg ICAga2RzY2h3ZWhAaW5zY2kxNC51Y3NkLmVkdQogTHVkb3ZpYyBDb3VydMOocyAgICAgICAgICAg ICAgICAgICAgIGx1ZG9AZ251Lm9yZwpkaWZmIC0tZ2l0IGEvdGVzdHMvTWFrZWZpbGUuYW0gYi90 ZXN0cy9NYWtlZmlsZS5hbQppbmRleCBmMTk1YzhkLi4wYzNhZGJlIDEwMDY0NAotLS0gYS90ZXN0 cy9NYWtlZmlsZS5hbQorKysgYi90ZXN0cy9NYWtlZmlsZS5hbQpAQCAtNDksNiArNDksOCBAQCBM REFERCA9IC4uL2xpYi9saWJncmVwdXRpbHMuYSAkKExJQklOVEwpIC4uL2xpYi9saWJncmVwdXRp bHMuYQogIyBGSVhNRS0yMDE1OiBSZW1vdmUgdGhpcyBvbmNlIHRoZSBnbGliYyBhbmQgZ251bGli IGJ1Z3MgYXJlIGZpeGVkLgogWEZBSUxfVEVTVFMgPSB0cmlwbGUtYmFja3JlZgoKK1hGQUlMX1RF U1RTICs9IGdsaWJjLWluZmxvb3AKKwogIyBFcXVpdmFsZW5jZSBjbGFzc2VzIGFyZSBvbmx5IHN1 cHBvcnRlZCB3aGVuIHVzaW5nIHRoZSBzeXN0ZW0KICMgbWF0Y2hlciAod2hpY2ggbWVhbnMgb25s eSB3aXRoIGdsaWJjKS4KICMgVGhlIGluY2x1ZGVkIG1hdGNoZXIgbmVlZHMgdG8gYmUgZml4ZWQu CkBAIC0xMDgsNiArMTEwLDcgQEAgVEVTVFMgPQkJCQkJCVwKICAgZmlsbGJ1Zi1sb25nLWxpbmUJ CQkJXAogICBmbWJ0ZXN0CQkJCQlcCiAgIGZvYWQxCQkJCQkJXAorICBnbGliYy1pbmZsb29wCQkJ CQlcCiAgIGdyZXAtZGV2LW51bGwJCQkJCVwKICAgZ3JlcC1kZXYtbnVsbC1vdXQJCQkJXAogICBn cmVwLWRpcgkJCQkJXApkaWZmIC0tZ2l0IGEvdGVzdHMvZ2xpYmMtaW5mbG9vcCBiL3Rlc3RzL2ds aWJjLWluZmxvb3AKbmV3IGZpbGUgbW9kZSAxMDA3NTUKaW5kZXggMDAwMDAwMC4uNjI0NGI2Ygot LS0gL2Rldi9udWxsCisrKyBiL3Rlc3RzL2dsaWJjLWluZmxvb3AKQEAgLTAsMCArMSwxMyBAQAor IyEvYmluL3NoCisjIFRoaXMgd291bGQgaW5mbG9vcCB3aGVuIHVzaW5nIGdsaWJjJ3MgcmVnZXgg YXQgbGVhc3QgdW50aWwgZ2xpYmMtMi4zNi4KKy4gIiR7c3JjZGlyPS59L2luaXQuc2giOyBwYXRo X3ByZXBlbmRfIC4uL3NyYworCityZXF1aXJlX3RpbWVvdXRfCityZXF1aXJlX2VuX3V0ZjhfbG9j YWxlXworCitmYWlsPTAKKworZWNobyBhID4gaW4gfHwgZnJhbWV3b3JrX2ZhaWx1cmVfCityZXR1 cm5zXyAwIHRpbWVvdXQgMiBlbnYgTENfQUxMPWVuX1VTLlVURi04IGdyZXAgLUUgLXcgJygoKCl8 YSl8KCkpKicgaW4gfHwgZmFpbD0xCisKK0V4aXQgJGZhaWwKLS0gCjIuNDAuMC4xNjYuZzE0MGI5 NDc4ZGEKCg== --000000000000bfb5b205f852b0a6--
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at 62483) by debbugs.gnu.org; 31 Mar 2023 15:25:45 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Mar 31 11:25:45 2023 Received: from localhost ([127.0.0.1]:34254 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1piGdM-0007El-K8 for submit <at> debbugs.gnu.org; Fri, 31 Mar 2023 11:25:45 -0400 Received: from tensor.andric.com ([87.251.56.140]:64446 ident=postfix) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <dimitry@HIDDEN>) id 1piGdI-0007EW-7c for 62483 <at> debbugs.gnu.org; Fri, 31 Mar 2023 11:25:42 -0400 Received: from smtpclient.apple (longrow.home.andric.com [192.168.0.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by tensor.andric.com (Postfix) with ESMTPSA id 559306059D; Fri, 31 Mar 2023 17:25:37 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=andric.com; s=201904; t=1680276337; bh=tv3wdsLbUzJuawKCrRN9gRq8xh79liGVtP2VlXgo/zE=; h=Subject:From:In-Reply-To:Date:Cc:Message-Id:References:To:From; b=Nz2cHPImbM0SK6j1NdD5Aygk3RL4epuqwPgTaTrK8MurzsrOZO/aL+KQbRgKDBhbj aplfthjlXLsiJZDVL2pSxwJVm4yxAp7BR65GpLgbw2PUwYPVGA4Hx3T4CCKcuH/M+R p85inkdVWLNGlChp1juhxPiFh2YP1gtM31fXWxnU8876mdCg1/vBX1C4O3bYA9g7eA cQkApkFp7WRvHQpBYsuiYsFA1e5QyXFSbf6ARiPW5O9EWj/MxPeymomqcOYvoj+k43 /NU1qEh1dbeXftTI+hKE+QfbZO3MWzD3rSKS1tz0nRW8/kPQB2qwKW6ZDDWY5lwk42 JgFZiA0j/f23g== Content-Type: multipart/signed; boundary="Apple-Mail=_3EB2C97A-5F5F-4754-9624-FDC38F6B0BB2"; protocol="application/pgp-signature"; micalg=pgp-sha1 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.500.231\)) Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate From: Dimitry Andric <dimitry@HIDDEN> In-Reply-To: <202303280646.32S6krUr030677@HIDDEN> Date: Fri, 31 Mar 2023 17:25:21 +0200 Message-Id: <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> To: arnold@HIDDEN X-Mailer: Apple Mail (2.3731.500.231) X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 62483 Cc: 62483 <at> debbugs.gnu.org, Koen Claessen <koen@HIDDEN>, bug-grep@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-) --Apple-Mail=_3EB2C97A-5F5F-4754-9624-FDC38F6B0BB2 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Yes, it still reproduces when I configure the latest grep using --without-included-regex: 1400 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) 1: idx =3D 0 (gdb) 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, = nmatch); 1: idx =3D 0 (gdb) 1404 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) 1: idx =3D 0 (gdb) 1405 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) 1: idx =3D 0 (gdb) 1428 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, 1: idx =3D 0 (gdb) 1432 if (__glibc_unlikely (cur_node < 0)) 1: idx =3D 0 (gdb) 1400 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) 1: idx =3D 0 (gdb) 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, = nmatch); 1: idx =3D 0 (gdb) 1404 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) 1: idx =3D 0 (gdb) 1405 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) 1: idx =3D 0 The endless loop looks the same. grep's regexec.c is mostly the same as glibc's, except for the latter having a bunch of #if RE_ENABLE_I18N directives added. -Dimitry > On 28 Mar 2023, at 08:46, arnold@HIDDEN wrote: >=20 > This does not reproduce with gawk, even when I force use of the regex > matcher. >=20 > What happens if grep is built with the included regex files instead of > relying on the ones in the local glibc? >=20 > Arnold >=20 > Dimitry Andric <dimitry@HIDDEN> wrote: >=20 >> On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: >>>=20 >>> Running the command: >>>=20 >>> echo a | grep -E -w '((()|a)|())*' >>>=20 >>> does not terminate, and uses a LOT of processor time, for all = versions of >>> grep I have tried. >>>=20 >>> This is the smallest case that could be found; simplifying anything = in the >>> input and/or expression leads to correct behavior. >>=20 >> Reproducible with GNU grep 3.7 on Ubuntu 22: >>=20 >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ = COMMAND >> 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 0:08.32 = grep -E -w ((()|a)|())* >>=20 >> It seems that at least on Ubuntu, grep in this mode uses glibc's = regexec(3), and it is that implementation which ends up in an endless = loop. >>=20 >> It loops between lines 1415, 1417 and 1443, but idx and cur_node = never change from 0: >>=20 >> 1378 static reg_errcode_t >> 1379 __attribute_warn_unused_result__ >> 1380 set_regs (const regex_t *preg, const re_match_context_t *mctx, = size_t nmatch, >> 1381 regmatch_t *pmatch, bool fl_backtrack) >> 1382 { >> ... >> 1415 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) >> 1416 { >> 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, = nmatch); >> 1418 >> 1419 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) >> 1420 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) >> ... >> 1442 /* Proceed to next node. */ >> 1443 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, >> 1444 &idx, cur_node, >> 1445 &eps_via_nodes, fs); >> 1446 >> 1447 if (__glibc_unlikely (cur_node < 0)) >> ... >> 1465 } >> 1466 } >>=20 >> -Dimitry >>=20 >> P.S.: Interestingly this does not reproduce with BSD grep, which = returns immediately with "a". >>=20 --Apple-Mail=_3EB2C97A-5F5F-4754-9624-FDC38F6B0BB2 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.2 iF0EARECAB0WIQR6tGLSzjX8bUI5T82wXqMKLiCWowUCZCb7YQAKCRCwXqMKLiCW o+dZAJoC/FUUV4nkFDYncYF4RZt9TDj6FACfZpp9LHb3L1pn/aWcAXlxCWJKnU0= =R/1i -----END PGP SIGNATURE----- --Apple-Mail=_3EB2C97A-5F5F-4754-9624-FDC38F6B0BB2--
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at submit) by debbugs.gnu.org; 31 Mar 2023 15:25:51 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Mar 31 11:25:51 2023 Received: from localhost ([127.0.0.1]:34257 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1piGdT-0007F5-3t for submit <at> debbugs.gnu.org; Fri, 31 Mar 2023 11:25:51 -0400 Received: from lists.gnu.org ([209.51.188.17]:56940) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <dimitry@HIDDEN>) id 1piGdN-0007Et-MP for submit <at> debbugs.gnu.org; Fri, 31 Mar 2023 11:25:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <dimitry@HIDDEN>) id 1piGdL-0008Od-T2 for bug-grep@HIDDEN; Fri, 31 Mar 2023 11:25:45 -0400 Received: from tensor.andric.com ([87.251.56.140]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <dimitry@HIDDEN>) id 1piGdJ-0007lc-Tf for bug-grep@HIDDEN; Fri, 31 Mar 2023 11:25:43 -0400 Received: from smtpclient.apple (longrow.home.andric.com [192.168.0.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by tensor.andric.com (Postfix) with ESMTPSA id 559306059D; Fri, 31 Mar 2023 17:25:37 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=andric.com; s=201904; t=1680276337; bh=tv3wdsLbUzJuawKCrRN9gRq8xh79liGVtP2VlXgo/zE=; h=Subject:From:In-Reply-To:Date:Cc:Message-Id:References:To:From; b=Nz2cHPImbM0SK6j1NdD5Aygk3RL4epuqwPgTaTrK8MurzsrOZO/aL+KQbRgKDBhbj aplfthjlXLsiJZDVL2pSxwJVm4yxAp7BR65GpLgbw2PUwYPVGA4Hx3T4CCKcuH/M+R p85inkdVWLNGlChp1juhxPiFh2YP1gtM31fXWxnU8876mdCg1/vBX1C4O3bYA9g7eA cQkApkFp7WRvHQpBYsuiYsFA1e5QyXFSbf6ARiPW5O9EWj/MxPeymomqcOYvoj+k43 /NU1qEh1dbeXftTI+hKE+QfbZO3MWzD3rSKS1tz0nRW8/kPQB2qwKW6ZDDWY5lwk42 JgFZiA0j/f23g== Content-Type: multipart/signed; boundary="Apple-Mail=_3EB2C97A-5F5F-4754-9624-FDC38F6B0BB2"; protocol="application/pgp-signature"; micalg=pgp-sha1 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.500.231\)) Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate From: Dimitry Andric <dimitry@HIDDEN> In-Reply-To: <202303280646.32S6krUr030677@HIDDEN> Date: Fri, 31 Mar 2023 17:25:21 +0200 Message-Id: <EC26823E-6BD3-4EC6-8959-4C5AF60D83FB@HIDDEN> References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> <202303280646.32S6krUr030677@HIDDEN> To: arnold@HIDDEN X-Mailer: Apple Mail (2.3731.500.231) Received-SPF: pass client-ip=87.251.56.140; envelope-from=dimitry@HIDDEN; helo=tensor.andric.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.4 (-) X-Debbugs-Envelope-To: submit Cc: 62483 <at> debbugs.gnu.org, Koen Claessen <koen@HIDDEN>, bug-grep@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -2.4 (--) --Apple-Mail=_3EB2C97A-5F5F-4754-9624-FDC38F6B0BB2 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Yes, it still reproduces when I configure the latest grep using --without-included-regex: 1400 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) 1: idx =3D 0 (gdb) 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, = nmatch); 1: idx =3D 0 (gdb) 1404 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) 1: idx =3D 0 (gdb) 1405 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) 1: idx =3D 0 (gdb) 1428 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, 1: idx =3D 0 (gdb) 1432 if (__glibc_unlikely (cur_node < 0)) 1: idx =3D 0 (gdb) 1400 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) 1: idx =3D 0 (gdb) 1402 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, = nmatch); 1: idx =3D 0 (gdb) 1404 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) 1: idx =3D 0 (gdb) 1405 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) 1: idx =3D 0 The endless loop looks the same. grep's regexec.c is mostly the same as glibc's, except for the latter having a bunch of #if RE_ENABLE_I18N directives added. -Dimitry > On 28 Mar 2023, at 08:46, arnold@HIDDEN wrote: >=20 > This does not reproduce with gawk, even when I force use of the regex > matcher. >=20 > What happens if grep is built with the included regex files instead of > relying on the ones in the local glibc? >=20 > Arnold >=20 > Dimitry Andric <dimitry@HIDDEN> wrote: >=20 >> On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: >>>=20 >>> Running the command: >>>=20 >>> echo a | grep -E -w '((()|a)|())*' >>>=20 >>> does not terminate, and uses a LOT of processor time, for all = versions of >>> grep I have tried. >>>=20 >>> This is the smallest case that could be found; simplifying anything = in the >>> input and/or expression leads to correct behavior. >>=20 >> Reproducible with GNU grep 3.7 on Ubuntu 22: >>=20 >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ = COMMAND >> 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 0:08.32 = grep -E -w ((()|a)|())* >>=20 >> It seems that at least on Ubuntu, grep in this mode uses glibc's = regexec(3), and it is that implementation which ends up in an endless = loop. >>=20 >> It loops between lines 1415, 1417 and 1443, but idx and cur_node = never change from 0: >>=20 >> 1378 static reg_errcode_t >> 1379 __attribute_warn_unused_result__ >> 1380 set_regs (const regex_t *preg, const re_match_context_t *mctx, = size_t nmatch, >> 1381 regmatch_t *pmatch, bool fl_backtrack) >> 1382 { >> ... >> 1415 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) >> 1416 { >> 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, = nmatch); >> 1418 >> 1419 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) >> 1420 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) >> ... >> 1442 /* Proceed to next node. */ >> 1443 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, >> 1444 &idx, cur_node, >> 1445 &eps_via_nodes, fs); >> 1446 >> 1447 if (__glibc_unlikely (cur_node < 0)) >> ... >> 1465 } >> 1466 } >>=20 >> -Dimitry >>=20 >> P.S.: Interestingly this does not reproduce with BSD grep, which = returns immediately with "a". >>=20 --Apple-Mail=_3EB2C97A-5F5F-4754-9624-FDC38F6B0BB2 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.2 iF0EARECAB0WIQR6tGLSzjX8bUI5T82wXqMKLiCWowUCZCb7YQAKCRCwXqMKLiCW o+dZAJoC/FUUV4nkFDYncYF4RZt9TDj6FACfZpp9LHb3L1pn/aWcAXlxCWJKnU0= =R/1i -----END PGP SIGNATURE----- --Apple-Mail=_3EB2C97A-5F5F-4754-9624-FDC38F6B0BB2--
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at 62483) by debbugs.gnu.org; 28 Mar 2023 06:47:07 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Tue Mar 28 02:47:07 2023 Received: from localhost ([127.0.0.1]:48750 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1ph36p-0001Yd-2p for submit <at> debbugs.gnu.org; Tue, 28 Mar 2023 02:47:07 -0400 Received: from frenzy.freefriends.org ([198.99.81.75]:41354 helo=freefriends.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <arnold@HIDDEN>) id 1ph36m-0001YU-4b for 62483 <at> debbugs.gnu.org; Tue, 28 Mar 2023 02:47:05 -0400 X-Envelope-From: arnold@HIDDEN Received: from freefriends.org (frenzy.freefriends.org [198.99.81.75]) by freefriends.org (8.14.7/8.14.7) with ESMTP id 32S6ksEG030678 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Mar 2023 00:46:55 -0600 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id 32S6krUr030677; Tue, 28 Mar 2023 00:46:53 -0600 From: arnold@HIDDEN Message-Id: <202303280646.32S6krUr030677@HIDDEN> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@HIDDEN using -f Date: Tue, 28 Mar 2023 00:46:53 -0600 To: koen@HIDDEN, dimitry@HIDDEN Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> In-Reply-To: <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 62483 Cc: 62483 <at> debbugs.gnu.org X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: 0.0 (/) This does not reproduce with gawk, even when I force use of the regex matcher. What happens if grep is built with the included regex files instead of relying on the ones in the local glibc? Arnold Dimitry Andric <dimitry@HIDDEN> wrote: > On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: > > > > Running the command: > > > > echo a | grep -E -w '((()|a)|())*' > > > > does not terminate, and uses a LOT of processor time, for all versions of > > grep I have tried. > > > > This is the smallest case that could be found; simplifying anything in the > > input and/or expression leads to correct behavior. > > Reproducible with GNU grep 3.7 on Ubuntu 22: > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 0:08.32 grep -E -w ((()|a)|())* > > It seems that at least on Ubuntu, grep in this mode uses glibc's regexec(3), and it is that implementation which ends up in an endless loop. > > It loops between lines 1415, 1417 and 1443, but idx and cur_node never change from 0: > > 1378 static reg_errcode_t > 1379 __attribute_warn_unused_result__ > 1380 set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch, > 1381 regmatch_t *pmatch, bool fl_backtrack) > 1382 { > ... > 1415 for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;) > 1416 { > 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch); > 1418 > 1419 if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node) > 1420 || (fs && re_node_set_contains (&eps_via_nodes, cur_node))) > ... > 1442 /* Proceed to next node. */ > 1443 cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match, > 1444 &idx, cur_node, > 1445 &eps_via_nodes, fs); > 1446 > 1447 if (__glibc_unlikely (cur_node < 0)) > ... > 1465 } > 1466 } > > -Dimitry > > P.S.: Interestingly this does not reproduce with BSD grep, which returns immediately with "a". >
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at 62483) by debbugs.gnu.org; 27 Mar 2023 17:54:48 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Mar 27 13:54:48 2023 Received: from localhost ([127.0.0.1]:48396 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pgr3P-0006af-MB for submit <at> debbugs.gnu.org; Mon, 27 Mar 2023 13:54:48 -0400 Received: from tensor.andric.com ([87.251.56.140]:61183 ident=postfix) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <dimitry@HIDDEN>) id 1pgr3O-0006aT-2P for 62483 <at> debbugs.gnu.org; Mon, 27 Mar 2023 13:54:47 -0400 Received: from smtpclient.apple (longrow.home.andric.com [192.168.0.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by tensor.andric.com (Postfix) with ESMTPSA id C41BC316CF; Mon, 27 Mar 2023 19:54:42 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=andric.com; s=201904; t=1679939682; bh=v6pn1ABwmo2Zk9WlVVTSX2HlDTRrAfOORf4/PvUrqeQ=; h=Subject:From:In-Reply-To:Date:Cc:Message-Id:References:To:From; b=L1SvqBeqaLZyAehXMuCh/SjMza9jiRzcc6wkwkKkhHpwRIOL4sLOoM6Adxphr00Mv xOL8dQszUSuyFlnwvqZUlYHmLFQ+pt73Y6zor1+WRuB4As5/OBjFlhOX0O8EEiel1a 4eGWJAZR/FmRt7E9t7txMFx7spGuIY+3VLv80eDqwp908gbxEcFBKDhSjNz4v5BUXV j9ThNMHLQeqo+ask6CkYXCsHyidfXdi9Sy6rn8F23AMb6RbhD4UoL0T+kZFM2bRcC6 7NCeb6tEcgKfRDGlNJgvpyBcdxyYTgE3iCgIP1L7BXpBhN5P6lA0j7K3OMeadi3Pc4 XR7NOl0WTc3pw== Content-Type: multipart/signed; boundary="Apple-Mail=_D6F6E370-5190-456A-AFDD-7684509FEA23"; protocol="application/pgp-signature"; micalg=pgp-sha1 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.400.51.1.1\)) Subject: Re: bug#62483: echo a | grep -E -w '((()|a)|())*' # does not terminate From: Dimitry Andric <dimitry@HIDDEN> In-Reply-To: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> Date: Mon, 27 Mar 2023 19:54:27 +0200 Message-Id: <C23CEBBF-8231-46B4-8224-E4FCEED45A7D@HIDDEN> References: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> To: Koen Claessen <koen@HIDDEN> X-Mailer: Apple Mail (2.3731.400.51.1.1) X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 62483 Cc: 62483 <at> debbugs.gnu.org X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-) --Apple-Mail=_D6F6E370-5190-456A-AFDD-7684509FEA23 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On 27 Mar 2023, at 11:14, Koen Claessen <koen@HIDDEN> wrote: >=20 > Running the command: >=20 > echo a | grep -E -w '((()|a)|())*' >=20 > does not terminate, and uses a LOT of processor time, for all versions = of > grep I have tried. >=20 > This is the smallest case that could be found; simplifying anything in = the > input and/or expression leads to correct behavior. Reproducible with GNU grep 3.7 on Ubuntu 22: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ = COMMAND 93938 dim 20 0 9.0m 2.1m 2.0m R 100.0 0.0 0:08.32 = grep -E -w ((()|a)|())* It seems that at least on Ubuntu, grep in this mode uses glibc's = regexec(3), and it is that implementation which ends up in an endless = loop. It loops between lines 1415, 1417 and 1443, but idx and cur_node never = change from 0: 1378 static reg_errcode_t 1379 __attribute_warn_unused_result__ 1380 set_regs (const regex_t *preg, const re_match_context_t *mctx, = size_t nmatch, 1381 regmatch_t *pmatch, bool fl_backtrack) 1382 { ... 1415 for (idx =3D pmatch[0].rm_so; idx <=3D pmatch[0].rm_eo ;) 1416 { 1417 update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, = nmatch); 1418 1419 if ((idx =3D=3D pmatch[0].rm_eo && cur_node =3D=3D = mctx->last_node) 1420 || (fs && re_node_set_contains (&eps_via_nodes, = cur_node))) ... 1442 /* Proceed to next node. */ 1443 cur_node =3D proceed_next_node (mctx, nmatch, pmatch, = prev_idx_match, 1444 &idx, cur_node, 1445 &eps_via_nodes, fs); 1446 1447 if (__glibc_unlikely (cur_node < 0)) ... 1465 } 1466 } -Dimitry P.S.: Interestingly this does not reproduce with BSD grep, which returns = immediately with "a". --Apple-Mail=_D6F6E370-5190-456A-AFDD-7684509FEA23 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.2 iF0EARECAB0WIQR6tGLSzjX8bUI5T82wXqMKLiCWowUCZCHYUwAKCRCwXqMKLiCW o+9rAKD66Kl1mgPSqRVaG1i0Y05AnBOdBACfYshg0HktpRuLAQQBLyyAQHYSENU= =qC4Q -----END PGP SIGNATURE----- --Apple-Mail=_D6F6E370-5190-456A-AFDD-7684509FEA23--
bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.Received: (at submit) by debbugs.gnu.org; 27 Mar 2023 13:14:11 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Mar 27 09:14:11 2023 Received: from localhost ([127.0.0.1]:46774 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1pgmfq-0005Kz-Vo for submit <at> debbugs.gnu.org; Mon, 27 Mar 2023 09:14:11 -0400 Received: from lists.gnu.org ([209.51.188.17]:59458) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <koen.claessen@HIDDEN>) id 1pgivx-0004UW-6G for submit <at> debbugs.gnu.org; Mon, 27 Mar 2023 05:14:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <koen.claessen@HIDDEN>) id 1pgivw-0002tj-DW for bug-grep@HIDDEN; Mon, 27 Mar 2023 05:14:32 -0400 Received: from mail-oi1-f175.google.com ([209.85.167.175]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from <koen.claessen@HIDDEN>) id 1pgivu-0000lO-J0 for bug-grep@HIDDEN; Mon, 27 Mar 2023 05:14:32 -0400 Received: by mail-oi1-f175.google.com with SMTP id bi31so5781083oib.9 for <bug-grep@HIDDEN>; Mon, 27 Mar 2023 02:14:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679908468; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=I0nsVRHdMBEFAIktPr2QorHZIFkZGpyQddZo2ESBGJ8=; b=2KbYRm6gB/nY5JRtjAjJaxhmjLHPSJBO8Cy78zB7X/9GPD5KOF/l9yexPGoJaB9nY1 43kJglMMfnl5tW61bsgJXRntfquVQhc5ksyWEkvo92vqmW6JbZZgr9CH99P7xAm1dzW9 q9/iMRMAttFeav8z95fj1lrvGNrCQaQ6x8r86bs0a2K5jbIvmOdNi6fpA5A97mNpMkuk MHlXMX630nG1TqFihmHmzmxhsEHn+EEoPH1ghbEXSVLfXsm+x97xq088/pRTPVPnEscC jBQhrDStI6y8q+HtPWQ4Nl7YIWMd/KADw3+N9UH/+CQDQcNipBBtyAc3Btc4wk6umvAY yQsw== X-Gm-Message-State: AO0yUKWR05lCph9vF+pSaIybDbqBKWRRVrJeEwzH0KBdtHCnzIspBUj/ U+hWrGx2XjyYZBmUJaJ1E2qYlJKqSkrFn9E2NfbZc/ug+Q4= X-Google-Smtp-Source: AK7set+FJwBk1ZfnerizAPa/0MtB+zNp+gLKRAc9psjQo/PlnuSfdZVsLPCMkXJvQJtu5+bxXOV+O9UTykY9L0q4dpc= X-Received: by 2002:a05:6808:b38:b0:387:1afd:5924 with SMTP id t24-20020a0568080b3800b003871afd5924mr2930420oij.8.1679908468569; Mon, 27 Mar 2023 02:14:28 -0700 (PDT) MIME-Version: 1.0 From: Koen Claessen <koen@HIDDEN> Date: Mon, 27 Mar 2023 11:14:17 +0200 Message-ID: <CAHJ_xkTYjONGLYWUL8uhnOVHHsBYEVqZ8WSPbr4nn8RrdfykvA@HIDDEN> Subject: echo a | grep -E -w '((()|a)|())*' # does not terminate To: bug-grep@HIDDEN Content-Type: multipart/alternative; boundary="000000000000863aec05f7de2b71" Received-SPF: pass client-ip=209.85.167.175; envelope-from=koen.claessen@HIDDEN; helo=mail-oi1-f175.google.com X-Spam_score_int: -13 X-Spam_score: -1.4 X-Spam_bar: - X-Spam_report: (-1.4 / 5.0 requ) BAYES_00=-1.9, FREEMAIL_FORGED_FROMDOMAIN=0.25, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.1 (-) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Mon, 27 Mar 2023 09:14:05 -0400 X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -2.1 (--) --000000000000863aec05f7de2b71 Content-Type: text/plain; charset="UTF-8" Hello! Running the command: echo a | grep -E -w '((()|a)|())*' does not terminate, and uses a LOT of processor time, for all versions of grep I have tried. This is the smallest case that could be found; simplifying anything in the input and/or expression leads to correct behavior. Kind regards, /Koen --000000000000863aec05f7de2b71 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir=3D"ltr">Hello!<div><br></div><div>Running the command:</div><div><= br></div><div>=C2=A0=C2=A0echo a | grep -E -w '((()|a)|())*'</div><= div><br></div><div>does not terminate, and uses a LOT of processor time, fo= r all versions of grep I have tried.</div><div><br></div><div>This is the s= mallest case that could be found; simplifying anything in the input and/or = expression leads to correct behavior.</div><div><br></div><div>Kind regards= ,</div><div>/Koen</div></div> --000000000000863aec05f7de2b71--
Koen Claessen <koen@HIDDEN>
:bug-grep@HIDDEN
.
Full text available.bug-grep@HIDDEN
:bug#62483
; Package grep
.
Full text available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.