GNU bug report logs - #62657
PCRE2-related workarounds that GNU grep might need


Package: grep

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Mon, 3 Apr 2023 21:50:02 UTC

Severity: normal

To reply to this bug, email your comments to 62657 AT debbugs.gnu.org.


Report forwarded to bug-grep <at> gnu.org:
bug#62657; Package grep. (Mon, 03 Apr 2023 21:50:02 GMT)

Acknowledgement sent to Paul Eggert <eggert <at> cs.ucla.edu>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 03 Apr 2023 21:50:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: bug-grep <at> gnu.org
Subject: PCRE2-related workarounds that GNU grep might need
Date: Mon, 3 Apr 2023 14:48:30 -0700

Recent commits in Git do the following to work around bugs in PCRE2.
Quite possibly GNU grep -P should do the same, when in a UTF-8 locale.

  * Disable PCRE2_UCP unless PCRE2 10.35 or higher.

  * If ignoring case and PCRE2_MATCH_INVALID_UTF is defined, then 
enable PCRE2_NO_START_OPTIMIZE unless PCRE2 10.36 or higher.
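
A minimal sketch of the two version checks listed above, written as plain C so it can be read without the PCRE2 headers. The helper names (pcre2_at_least, choose_workarounds) and the struct are hypothetical and are not taken from Git or grep. In real code the version numbers would presumably come from the PCRE2_MAJOR and PCRE2_MINOR macros in pcre2.h, and have_match_invalid_utf from an #ifdef PCRE2_MATCH_INVALID_UTF test.

```c
#include <stdbool.h>

/* Hypothetical helper: true if PCRE2 version MAJOR.MINOR is at least
   WANT_MAJOR.WANT_MINOR.  */
static bool
pcre2_at_least (int major, int minor, int want_major, int want_minor)
{
  return (major > want_major
          || (major == want_major && minor >= want_minor));
}

/* Illustrative result type; the field names are assumptions.  */
struct pcre2_workarounds
{
  bool use_ucp;            /* safe to pass PCRE2_UCP when compiling */
  bool no_start_optimize;  /* should pass PCRE2_NO_START_OPTIMIZE */
};

/* Apply the two rules from the report:
   1. Disable PCRE2_UCP unless PCRE2 is 10.35 or higher.
   2. If ignoring case and PCRE2_MATCH_INVALID_UTF is defined, enable
      PCRE2_NO_START_OPTIMIZE unless PCRE2 is 10.36 or higher.  */
static struct pcre2_workarounds
choose_workarounds (int major, int minor, bool ignore_case,
                    bool have_match_invalid_utf)
{
  struct pcre2_workarounds w;
  w.use_ucp = pcre2_at_least (major, minor, 10, 35);
  w.no_start_optimize = (ignore_case && have_match_invalid_utf
                         && !pcre2_at_least (major, minor, 10, 36));
  return w;
}
```

A caller would then OR PCRE2_UCP or PCRE2_NO_START_OPTIMIZE into the compile options only when the corresponding field is true.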




Information forwarded to bug-grep <at> gnu.org:
bug#62657; Package grep. (Tue, 04 Apr 2023 06:18:01 GMT)

Message #8 received at 62657 <at> debbugs.gnu.org:

From: Carlo Arenas <carenas <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 62657 <at> debbugs.gnu.org
Subject: Re: bug#62657: PCRE2-related workarounds that GNU grep might need
Date: Mon, 3 Apr 2023 23:17:18 -0700

On Mon, Apr 3, 2023 at 2:50 PM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
>
>    * Disable PCRE2_UCP unless PCRE2 10.35 or higher.

this is because of a bug in JIT; alternatively, JIT could be disabled

>    * If ignoring case and PCRE2_MATCH_INVALID_UTF is defined, then
> enable PCRE2_NO_START_OPTIMIZE unless PCRE2 10.36 or higher.

this one is only triggered when PCRE2_MULTILINE is used, which is not
the case for GNU grep




Information forwarded to bug-grep <at> gnu.org:
bug#62657; Package grep. (Tue, 04 Apr 2023 06:24:02 GMT)

Message #11 received at 62657 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Carlo Arenas <carenas <at> gmail.com>
Cc: 62657 <at> debbugs.gnu.org
Subject: Re: bug#62657: PCRE2-related workarounds that GNU grep might need
Date: Mon, 3 Apr 2023 23:23:20 -0700

On 2023-04-03 23:17, Carlo Arenas wrote:
> On Mon, Apr 3, 2023 at 2:50 PM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
>>
>>     * Disable PCRE2_UCP unless PCRE2 10.35 or higher.
> 
> this is because of a bug in JIT, alternatively JIT could be disabled

Oh, that might be better as it doesn't affect behavior (just performance).

>>     * If ignoring case and PCRE2_MATCH_INVALID_UTF is defined, then
>> enable PCRE2_NO_START_OPTIMIZE unless PCRE2 10.36 or higher.
> 
> this one is only triggered when PCRE2_MULTILINE is used, which is not
> the case for GNU grep

Thanks for letting us know.




Information forwarded to bug-grep <at> gnu.org:
bug#62657; Package grep. (Tue, 04 Apr 2023 07:35:02 GMT)

Message #14 received at 62657 <at> debbugs.gnu.org:

From: Carlo Arenas <carenas <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 62657 <at> debbugs.gnu.org
Subject: Re: bug#62657: PCRE2-related workarounds that GNU grep might need
Date: Tue, 4 Apr 2023 00:34:20 -0700

On Mon, Apr 3, 2023 at 11:23 PM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
>
> On 2023-04-03 23:17, Carlo Arenas wrote:
> > On Mon, Apr 3, 2023 at 2:50 PM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> >>
> >>     * Disable PCRE2_UCP unless PCRE2 10.35 or higher.
> >
> > this is because of a bug in JIT, alternatively JIT could be disabled
>
> Oh, that might be better as it doesn't affect behavior (just performance).

Also, unlike `git`, GNU grep doesn't use the fast-path JIT API or skip
UTF validation, so this crash can only be triggered in 10.34 and not in
older versions, even with JIT and PCRE2_UCP enabled.

Carlo
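
One possible reading of the alternative discussed in this thread, sketched in plain C: keep PCRE2_UCP and skip JIT only for the one release where the crash is said to be reachable from GNU grep. The function name should_disable_jit is hypothetical, and this is not necessarily what grep does. In real code the version would presumably come from PCRE2_MAJOR and PCRE2_MINOR, and "disabling JIT" would mean not calling pcre2_jit_compile.

```c
#include <stdbool.h>

/* Hypothetical policy helper: per the thread, the JIT crash with
   PCRE2_UCP can only be triggered from GNU grep in PCRE2 10.34, so
   JIT is skipped only for that exact version and only when UCP is
   wanted.  Behavior is unchanged; only performance is affected.  */
static bool
should_disable_jit (int major, int minor, bool want_ucp)
{
  return want_ucp && major == 10 && minor == 34;
}
```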




This bug report was last modified 1 year and 17 days ago.
