GNU bug report logs - #15758
grep 2.15 calls abort() on larger searches with -P

Package: grep

Reported by: Dave Reisner <dreisner <at> archlinux.org>

Date: Wed, 30 Oct 2013 17:40:05 UTC

Severity: normal

Merged with 15759

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 15758 in the body.
You can then email your comments to 15758 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Wed, 30 Oct 2013 17:40:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Dave Reisner <dreisner <at> archlinux.org>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Wed, 30 Oct 2013 17:40:05 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Dave Reisner <dreisner <at> archlinux.org>
To: bug-grep <at> gnu.org
Subject: grep 2.15 calls abort() on larger searches with -P
Date: Wed, 30 Oct 2013 12:42:35 -0400
Hi,

A user reported a regression with grep 2.15 which is reasonably easy to
reproduce with an invocation such as: ``grep -Pr foo''. The root cause
is that pcre_exec returns an unhandled error (PCRE_ERROR_BADUTF8)
causing grep to call abort().

I bisected the breakage to commit 67436786c110bbb565 (and verified that
it still exists at git HEAD) which essentially introduces utf-8
validation for data. On a large enough file hierarchy, I suppose it's
inevitable that invalid UTF-8 data is encountered. I was able to fix
this with the inline diff which follows:

  diff --git a/src/pcresearch.c b/src/pcresearch.c
  index ad5999d..ce55ab3 100644
  --- a/src/pcresearch.c
  +++ b/src/pcresearch.c
  @@ -176,6 +176,9 @@ Pexecute (char const *buf, size_t size, size_t *match_size,
         switch (e)
           {
           case PCRE_ERROR_NOMATCH:
  +#ifdef HAVE_LANGINFO_CODESET
  +        case PCRE_ERROR_BADUTF8:
  +#endif
             return -1;

           case PCRE_ERROR_NOMEMORY:

I don't know if this is considered to be a correct fix, but I offer it
as a starting point for a discussion.

Cheers,
Dave

P.S. Please CC me on replies as I am not subscribed to the list.
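
A minimal sketch of the control flow described in this report, assuming the switch is only reached with a negative result code and that any code without its own case falls through to abort(). The enum names and the classify() helper are hypothetical stand-ins, not identifiers from grep or <pcre.h>, and the NOMEMORY outcome is an assumption since the diff only shows the case label.

```c
#include <assert.h>

/* Hypothetical stand-ins for the PCRE result codes named in the report;
   the real definitions live in <pcre.h>. */
enum sketch_code
{
  SKETCH_NOMATCH = -1,
  SKETCH_NOMEMORY = -6,
  SKETCH_BADUTF8 = -10
};

/* What the caller ends up doing for a given result code. */
enum sketch_outcome
{
  SKETCH_NO_MATCH,  /* report "no match" for this buffer */
  SKETCH_FATAL,     /* assumed: print a diagnostic and exit */
  SKETCH_ABORT      /* unhandled code: abort() */
};

/* Classify result code E.  When TREAT_BADUTF8_AS_NOMATCH is nonzero the
   function models the diff proposed above; otherwise a bad-UTF-8 result
   has no case of its own and ends in the default branch. */
enum sketch_outcome
classify (int e, int treat_badutf8_as_nomatch)
{
  assert (e < 0);  /* assumption: only error codes reach this switch */
  switch (e)
    {
    case SKETCH_NOMATCH:
      return SKETCH_NO_MATCH;

    case SKETCH_BADUTF8:
      if (treat_badutf8_as_nomatch)
        return SKETCH_NO_MATCH;
      return SKETCH_ABORT;

    case SKETCH_NOMEMORY:
      return SKETCH_FATAL;

    default:
      return SKETCH_ABORT;
    }
}
```

Calling classify (SKETCH_BADUTF8, 0) models the reported behavior, and classify (SKETCH_BADUTF8, 1) models the proposed change, where invalid input is treated as a plain non-match.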




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Wed, 30 Oct 2013 21:20:02 GMT) Full text and rfc822 format available.

Message #8 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Stefano Lattarini <stefano.lattarini <at> gmail.com>
To: Dave Reisner <dreisner <at> archlinux.org>
Cc: 15759 <at> debbugs.gnu.org, 15758 <at> debbugs.gnu.org
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Wed, 30 Oct 2013 21:19:33 +0000
merge 15758 15759
stop

bug#15758 is the same as bug#15759, so I'm merging them,
to avoid confusion or the risk of dispersing the discussion.

Regards,
  Stefano




Merged 15758 15759. Request was from Stefano Lattarini <stefano.lattarini <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 30 Oct 2013 21:20:05 GMT) Full text and rfc822 format available.

Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Thu, 31 Oct 2013 15:27:02 GMT) Full text and rfc822 format available.

Message #13 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Stefano Lattarini <stefano.lattarini <at> gmail.com>
Cc: 15759 <at> debbugs.gnu.org, 15758 <at> debbugs.gnu.org,
 Dave Reisner <dreisner <at> archlinux.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Thu, 31 Oct 2013 08:26:10 -0700
> bug#15758 is the same as bug#15759, so I'm merging them,
> to avoid confusion or the risk of dispersing the discussion.

Thanks, Stefano and Dave.
With this and the nit about --version output being wrong, I now have
two reasons to make a new release.  I will take a look at the mass of
PCRE_ERROR* cases today.




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Sat, 02 Nov 2013 23:07:02 GMT) Full text and rfc822 format available.

Message #16 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Stefano Lattarini <stefano.lattarini <at> gmail.com>
Cc: 15759 <at> debbugs.gnu.org, 15758 <at> debbugs.gnu.org,
 Dave Reisner <dreisner <at> archlinux.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Sat, 2 Nov 2013 16:05:52 -0700
On Thu, Oct 31, 2013 at 8:26 AM, Jim Meyering <jim <at> meyering.net> wrote:
...
> With this and the nit about --version output being wrong, I now have
> two reasons to make a new release.

Thanks again for the report, Dave.
Here's the fix I expect to push:
[k.txt (text/plain, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Mon, 04 Nov 2013 19:39:02 GMT) Full text and rfc822 format available.

Message #19 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Dave Reisner <d <at> falconindy.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 15759 <at> debbugs.gnu.org, 15758 <at> debbugs.gnu.org,
 Dave Reisner <dreisner <at> archlinux.org>,
 Stefano Lattarini <stefano.lattarini <at> gmail.com>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Mon, 4 Nov 2013 14:38:40 -0500
On Sat, Nov 02, 2013 at 04:05:52PM -0700, Jim Meyering wrote:
> On Thu, Oct 31, 2013 at 8:26 AM, Jim Meyering <jim <at> meyering.net> wrote:
> ...
> > With this and the nit about --version output being wrong, I now have
> > two reasons to make a new release.
> 
> Thanks again for the report, Dave.
> Here's the fix I expect to push:

Thanks Jim.

Apologies for not responding to this sooner. I tested your patch and can
confirm that the behavior is better, but the new behavior still seems
like a regression. Take, for example, the simple instance of grep'ing
grep's own git repo.

# with grep 2.14
$ grep -rPw GNULIB
gnulib/m4/bison.m4:dnl Declaring YACC & YFLAGS precious will not be necessary after GNULIB
gnulib/lib/glob.c:   HAVE_STRUCT_DIRENT_D_TYPE plays the same role in GNULIB.  */
gnulib/lib/netdb.in.h:   GNULIB getaddrinfo() replacement, so are not yet needed.
gnulib/lib/argp.h:/* GNULIB makes sure both program_invocation_name and

# with grep built from HEAD
$ ./src/grep -rPw GNULIB
./src/grep: invalid UTF-8 byte sequence in input

I would expect that the invalid UTF-8 wouldn't stop grep cold, but
continue on, ignoring the non-matching data, just as grep without the -P
flag does.

Cheers,
Dave




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Tue, 05 Nov 2013 16:18:02 GMT) Full text and rfc822 format available.

Message #22 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Dave Reisner <d <at> falconindy.com>
Cc: 15759 <at> debbugs.gnu.org, 15758 <at> debbugs.gnu.org,
 Dave Reisner <dreisner <at> archlinux.org>,
 Stefano Lattarini <stefano.lattarini <at> gmail.com>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Tue, 5 Nov 2013 08:17:15 -0800
On Mon, Nov 4, 2013 at 11:38 AM, Dave Reisner <d <at> falconindy.com> wrote:
> On Sat, Nov 02, 2013 at 04:05:52PM -0700, Jim Meyering wrote:
>> On Thu, Oct 31, 2013 at 8:26 AM, Jim Meyering <jim <at> meyering.net> wrote:
>> ...
>> > With this and the nit about --version output being wrong, I now have
>> > two reasons to make a new release.
>>
>> Thanks again for the report, Dave.
>> Here's the fix I expect to push:
>
> Thanks Jim.
>
> Apologies for not responding to this sooner. I tested your patch and can
> confirm that the behavior is better, but the new behavior still seems
> like a regression. Take, for example, the simple instance of grep'ing
> grep's own git repo.
>
> # with grep 2.14
> $ grep -rPw GNULIB
> gnulib/m4/bison.m4:dnl Declaring YACC & YFLAGS precious will not be necessary after GNULIB
> gnulib/lib/glob.c:   HAVE_STRUCT_DIRENT_D_TYPE plays the same role in GNULIB.  */
> gnulib/lib/netdb.in.h:   GNULIB getaddrinfo() replacement, so are not yet needed.
> gnulib/lib/argp.h:/* GNULIB makes sure both program_invocation_name and
>
> # with grep built from HEAD
> $ ./src/grep -rPw GNULIB
> ./src/grep: invalid UTF-8 byte sequence in input
>
> I would expect that the invalid UTF-8 wouldn't stop grep cold, but
> continue on, ignoring the non-matching data, just as grep without the -P
> flag does.

Hi Dave,

I agree, and so does pcregrep.  There are a few other problems with
grep's PCRE driver code: for example, a problem (no matter how serious)
in one file should not cause the entire grep run to exit; grep should
continue processing remaining files. And when grep reports the problem,
it should include at least the file name in the diagnostic.

I will fix those before the upcoming snapshot.

Thanks,
Jim
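
A minimal sketch of the two behaviors asked for here: continue with the remaining files after a problem in one of them, and name the file in the diagnostic. The struct, the search_all() helper and the exact diagnostic layout are hypothetical; this is not grep's driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-file result; not a structure from grep itself. */
struct sketch_file
{
  const char *name;  /* file name to show in diagnostics */
  int bad_input;     /* nonzero if the matcher rejected the file's data */
  int has_match;     /* nonzero if the file contains a match */
};

/* Walk all N files.  A problem in one file writes a diagnostic naming
   that file to DIAG and sets *TROUBLE, then the loop moves on to the
   next file.  Return the number of files that matched. */
size_t
search_all (const struct sketch_file *files, size_t n, FILE *diag,
            int *trouble)
{
  size_t matched = 0;

  assert (files != NULL || n == 0);
  assert (diag != NULL && trouble != NULL);

  *trouble = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (files[i].bad_input)
        {
          /* Assumed message layout: program, file name, then the text. */
          fprintf (diag, "grep: %s: invalid UTF-8 byte sequence in input\n",
                   files[i].name);
          *trouble = 1;
          continue;  /* keep going instead of exiting */
        }
      if (files[i].has_match)
        matched++;
    }
  return matched;
}
```

The caller can still turn *trouble into a nonzero exit status once every file has been examined.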




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Tue, 26 Nov 2013 14:31:03 GMT) Full text and rfc822 format available.

Message #25 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Santiago <santiago <at> debian.org>
To: Jim Meyering <jim <at> meyering.net>
Cc: 15758 <at> debbugs.gnu.org, 730472 <at> bugs.debian.org
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Tue, 26 Nov 2013 15:30:03 +0100
On Tue, Nov 05, 2013 at 08:17:15AM -0800, Jim Meyering wrote:
...
> 
> Hi Dave,
> 
> I agree, and so does pcregrep.  There are a few other problems with
> grep's PCRE driver code: for example, a problem (no matter how serious)
> in one file should not cause the entire grep run to exit; grep should
> continue processing remaining files. And when grep reports the problem,
> it should include at least the file name in the diagnostic.
> 
> I will fix those before the upcoming snapshot.
> 
> Thanks,
> Jim
> 
> 
> 

Hi there,

This bug was also reported in Debian ( http://bugs.debian.org/730472 ).

Having looked at it, I think the most suitable solution for the moment
is to set the PCRE_NO_UTF8_CHECK flag instead of PCRE_UTF8, so that
PCRE does not check whether its input is valid UTF-8. The resulting behavior
is similar to that of grep before 2.15. (See 15758-PCRE-no-check-UTF8.patch)

$ grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/DEFINE_SINGLE_EVENT\((.*?),/) {
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/DEFINE_EVENT\((.*?),(.*?),/) {
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:## if ($prototype =~ m/SYSCALL_DEFINE0\s*\(\s*(a-zA-Z0-9_)*\s*\)/) {
...


I have also tested printing a message when a file was invalid, but the results
can be annoying (15758-PCRE-no-exit-UTF8.patch), since a warning is shown even
if files do not match:

$ grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
grep: invalid UTF-8 byte sequence in input
grep: invalid UTF-8 byte sequence in input
grep: invalid UTF-8 byte sequence in input
grep: invalid UTF-8 byte sequence in input
grep: invalid UTF-8 byte sequence in input
grep: invalid UTF-8 byte sequence in input
...
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/DEFINE_SINGLE_EVENT\((.*?),/) {
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/DEFINE_EVENT\((.*?),(.*?),/) {
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:## if ($prototype =~ m/SYSCALL_DEFINE0\s*\(\s*(a-zA-Z0-9_)*\s*\)/) {
...

I propose 15758-PCRE-no-check-UTF8.patch as a solution, at least a temporary one.

Regards,

Santiago

[15758-PCRE-no-check-UTF8.patch (text/x-diff, attachment)]
[15758-PCRE-no-exit-UTF8.patch (text/x-diff, attachment)]

Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Fri, 13 Dec 2013 18:34:02 GMT) Full text and rfc822 format available.

Notification sent to Dave Reisner <dreisner <at> archlinux.org>:
bug acknowledged by developer. (Fri, 13 Dec 2013 18:34:03 GMT) Full text and rfc822 format available.

Message #30 received at 15758-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Santiago <santiago <at> debian.org>
Cc: 15758-done <at> debbugs.gnu.org, 730472 <at> bugs.debian.org
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Fri, 13 Dec 2013 10:33:35 -0800
On Tue, Nov 26, 2013 at 6:30 AM, Santiago <santiago <at> debian.org> wrote:
> This bug was also reported in Debian ( http://bugs.debian.org/730472 ).
>
> Having looked at it, I think the most suitable solution for the moment
> is to set the PCRE_NO_UTF8_CHECK flag instead of PCRE_UTF8, so that
> PCRE does not check whether its input is valid UTF-8. The resulting behavior
> is similar to that of grep before 2.15. (See 15758-PCRE-no-check-UTF8.patch)

Thanks for the suggested patches and report.  Your first patch is
almost right.  The problem is that we cannot remove the PCRE_UTF8 flag.
If we did that, it would disable UTF-8, reverting an older fix.
See tests/pcre-utf8 for examples, or run this:

  printf '\342\202\254\n' | LC_ALL=en_US.UTF-8 src/grep -P '^\p{S}'

I've added a commit log, improved a related test and attached
a slightly different patch, but left you as the "Author".
I'll wait for an explicit ACK before pushing it.

With that, there is no need to handle PCRE_ERROR_BADUTF8
because that should not happen.
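
For reference, the octal bytes \342\202\254 in the command above are 0xE2 0x82 0xAC, the UTF-8 encoding of U+20AC (EURO SIGN), a currency symbol. It can only be recognized as a symbol if the three bytes are read as one character, which is why UTF-8 mode has to stay enabled. A minimal sketch of that decoding step follows; utf8_decode_first() is a hypothetical helper, not part of PCRE or grep, and it deliberately skips the overlong-form and surrogate checks a full validator would make.

```c
#include <assert.h>
#include <stddef.h>

/* Decode the first UTF-8 sequence in the LEN bytes at S and return its
   code point, or -1 if the bytes do not start a complete sequence.
   Simplified: overlong forms and surrogates are not rejected. */
long
utf8_decode_first (const unsigned char *s, size_t len)
{
  size_t need;  /* number of continuation bytes expected */
  long cp;      /* code point being assembled */

  assert (s != NULL);
  if (len == 0)
    return -1;

  if (s[0] < 0x80)
    return s[0];                      /* single ASCII byte */
  else if ((s[0] & 0xE0) == 0xC0)
    { need = 1; cp = s[0] & 0x1F; }   /* 110xxxxx */
  else if ((s[0] & 0xF0) == 0xE0)
    { need = 2; cp = s[0] & 0x0F; }   /* 1110xxxx */
  else if ((s[0] & 0xF8) == 0xF0)
    { need = 3; cp = s[0] & 0x07; }   /* 11110xxx */
  else
    return -1;                        /* continuation or invalid lead byte */

  if (len - 1 < need)
    return -1;                        /* sequence is cut short */

  for (size_t k = 1; k <= need; k++)
    {
      if ((s[k] & 0xC0) != 0x80)
        return -1;                    /* not a 10xxxxxx continuation byte */
      cp = (cp << 6) | (s[k] & 0x3F);
    }
  return cp;
}
```

Passing the three bytes 0xE2 0x82 0xAC yields 0x20AC, while a lone 0x82 is rejected because it is a continuation byte with no lead byte.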




Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Fri, 13 Dec 2013 18:34:03 GMT) Full text and rfc822 format available.

Notification sent to Dave Reisner <d <at> falconindy.com>:
bug acknowledged by developer. (Fri, 13 Dec 2013 18:34:04 GMT) Full text and rfc822 format available.

Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Fri, 13 Dec 2013 19:06:02 GMT) Full text and rfc822 format available.

Message #38 received at 15758-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Santiago <santiago <at> debian.org>
Cc: 15758-done <15758-done <at> debbugs.gnu.org>, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Fri, 13 Dec 2013 11:05:24 -0800
On Fri, Dec 13, 2013 at 10:33 AM, Jim Meyering <jim <at> meyering.net> wrote:
...
> Thanks for the suggested patches and report.  Your first patch is
> almost right.  The problem is that we cannot remove the PCRE_UTF8 flag.
> If we did that, it would disable UTF-8, reverting an older fix.
> See tests/pcre-utf8 for examples, or run this:
>
>   printf '\342\202\254\n' | LC_ALL=en_US.UTF-8 src/grep -P '^\p{S}'
>
> I've added a commit log, improved a related test and attached
> a slightly different patch, but left you as the "Author".
> I'll wait for an explicit ACK before pushing it.
>
> With that, there is no need to handle PCRE_ERROR_BADUTF8
> because that should not happen.

Patch attached, this time.
Thanks to Eric Blake for the quick off-list prod :-)
[k.txt (text/plain, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Wed, 18 Dec 2013 16:52:02 GMT) Full text and rfc822 format available.

Message #41 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Santiago <santiago <at> debian.org>
To: Jim Meyering <jim <at> meyering.net>
Cc: 15758-done <15758 <at> debbugs.gnu.org>, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Wed, 18 Dec 2013 11:53:01 -0500
On 13/12/13 at 11:05, Jim Meyering wrote:
> On Fri, Dec 13, 2013 at 10:33 AM, Jim Meyering <jim <at> meyering.net> wrote:
> ...
> > Thanks for the suggested patches and report.  Your first patch is
> > almost right.  The problem is that we cannot remove the PCRE_UTF8 flag.
> > If we did that, it would disable UTF-8, reverting an older fix.
> > See tests/pcre-utf8 for examples, or run this:
> >
> >   printf '\342\202\254\n' | LC_ALL=en_US.UTF-8 src/grep -P '^\p{S}'
> >
> > I've added a commit log, improved a related test and attached
> > a slightly different patch, but left you as the "Author".
> > I'll wait for an explicit ACK before pushing it.
> >
> > With that, there is no need to handle PCRE_ERROR_BADUTF8
> > because that should not happen.
> 
> Patch attached, this time.
> Thanks to Eric Blake for the quick off-list prod :-)

Hi Jim,

Thanks for your work, but I'm not sure using both flags works as we
need. Actually, I had tried that before submitting my patch. I got this
using your changes:

$ src/grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
src/grep: invalid UTF-8 byte sequence in input

When I'd expected something like:

$ LC_ALL=C src/grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/DEFINE_SINGLE_EVENT\((.*?),/) {
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/DEFINE_EVENT\((.*?),(.*?),/) {
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:## if ($prototype =~ m/SYSCALL_DEFINE0\s*\(\s*(a-zA-Z0-9_)*\s*\)/) {
/usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/SYSCALL_DEFINE0/) {
...

Maybe, it is a pcre (v. 8.31) issue. 

Regards,

Santiago




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Wed, 18 Dec 2013 17:47:02 GMT) Full text and rfc822 format available.

Message #44 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Santiago <santiago <at> debian.org>
Cc: 15758-done <15758 <at> debbugs.gnu.org>, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Wed, 18 Dec 2013 09:45:51 -0800
On Wed, Dec 18, 2013 at 8:53 AM, Santiago <santiago <at> debian.org> wrote:
...
> $ src/grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
> src/grep: invalid UTF-8 byte sequence in input
>
> When I'd expected something like:
>
> $ LC_ALL=C src/grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/DEFINE_SINGLE_EVENT\((.*?),/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/DEFINE_EVENT\((.*?),(.*?),/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc:## if ($prototype =~ m/SYSCALL_DEFINE0\s*\(\s*(a-zA-Z0-9_)*\s*\)/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc:   if ($prototype =~ m/SYSCALL_DEFINE0/) {
> ...
>
> Maybe, it is a pcre (v. 8.31) issue.

Hi Santiago,
Thanks for testing that.
What do you get when you run the stand-alone example I gave in the
commit log and in the test?

  printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 grep -P j|cat -A; echo $?

For me (using pcre-8.33), it works the way I want and both lines match:

      jM-^B$
      j$
      0

Hmm... I see that with debian unstable's 8.31-2, it does indeed act differently.
I may have to think about excluding pcre support when the version
doesn't work the way I want.
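
In the test input above, the byte 0x82 is a UTF-8 continuation byte that follows the ASCII "j" with no lead byte before it, so the first line is not well-formed UTF-8 while the second line is. A minimal sketch of a well-formedness check that makes the distinction concrete; utf8_is_valid() is a hypothetical helper written for illustration, not PCRE's validator or grep's code.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the LEN bytes at BUF are well-formed UTF-8, else 0.
   Rejects stray continuation bytes, truncated sequences, overlong
   forms, surrogates, and code points above U+10FFFF. */
int
utf8_is_valid (const unsigned char *buf, size_t len)
{
  size_t i = 0;

  assert (buf != NULL || len == 0);
  while (i < len)
    {
      unsigned char c = buf[i];
      size_t need;        /* continuation bytes expected */
      unsigned long cp;   /* decoded code point */
      unsigned long min;  /* smallest code point valid at this length */

      if (c < 0x80)
        {
          i++;            /* plain ASCII */
          continue;
        }
      else if ((c & 0xE0) == 0xC0)
        { need = 1; cp = c & 0x1F; min = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { need = 2; cp = c & 0x0F; min = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { need = 3; cp = c & 0x07; min = 0x10000; }
      else
        return 0;         /* stray continuation or invalid lead byte */

      if (len - i - 1 < need)
        return 0;         /* sequence runs past the end of the buffer */

      for (size_t k = 1; k <= need; k++)
        {
          unsigned char t = buf[i + k];
          if ((t & 0xC0) != 0x80)
            return 0;     /* expected a 10xxxxxx byte */
          cp = (cp << 6) | (t & 0x3F);
        }

      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;         /* overlong, out of range, or surrogate */

      i += need + 1;
    }
  return 1;
}
```

With this check, the two bytes "j" and 0x82 are reported as invalid and the single byte "j" as valid, which matches the difference between the two input lines.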




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Wed, 18 Dec 2013 23:09:02 GMT) Full text and rfc822 format available.

Message #47 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Santiago <santiago <at> debian.org>
To: Jim Meyering <jim <at> meyering.net>
Cc: 15758 <at> debbugs.gnu.org, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Wed, 18 Dec 2013 18:09:55 -0500
On 18/12/13 at 09:45, Jim Meyering wrote:
...
> 
> Hi Santiago,
> Thanks for testing that.
> What do you get when you run the stand-alone example I gave in the
> commit log and in the test?
> 
>   printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 grep -P j|cat -A; echo $?
> 
> For me (using pcre-8.33), it works the way I want and both lines match:
> 
>       jM-^B$
>       j$
>       0
> 
> Hmm... I see that with debian unstable's 8.31-2, it does indeed act differently.
> I may have to think about excluding pcre support when the version
> doesn't work the way I want.

I get this:

$ printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 src/grep -P j|cat -A; echo $?
src/grep: invalid UTF-8 byte sequence in input
0

I've also tried building Debian packages for pcre 8.33 and 8.34, with the same
results. I need to check whether a Debian patch is causing trouble.

Cheers!

Santiago




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Thu, 19 Dec 2013 18:35:02 GMT) Full text and rfc822 format available.

Message #50 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Santiago <santiago <at> debian.org>
Cc: 15758-done <15758 <at> debbugs.gnu.org>, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Thu, 19 Dec 2013 10:34:05 -0800
On Wed, Dec 18, 2013 at 3:09 PM, Santiago <santiago <at> debian.org> wrote:
> On 18/12/13 at 09:45, Jim Meyering wrote:
...
>>   printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 grep -P j|cat -A; echo $?
>>
>> For me (using pcre-8.33), it works the way I want and both lines match:
>>
>>       jM-^B$
>>       j$
>>       0
>>
>> Hmm... I see that with debian unstable's 8.31-2, it does indeed act differently.
>> I may have to think about excluding pcre support when the version
>> doesn't work the way I want.
>
> I get this:
>
> $ printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 src/grep -P j|cat -A; echo $?
> src/grep: invalid UTF-8 byte sequence in input
> 0
>
> I've also tried building Debian packages for pcre 8.33 and 8.34, with the same
> results. I need to check whether a Debian patch is causing trouble.

I have confirmed that grep linked with libpcre.a built from upstream
sources [commit f9d3a72ea5e86a674a9836b462e1231ecce0d739] (8.34) also
works the way I expect.




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Sat, 21 Dec 2013 18:47:01 GMT) Full text and rfc822 format available.

Message #53 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Santiago <santiago <at> debian.org>
Cc: 15758-done <15758 <at> debbugs.gnu.org>, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Sat, 21 Dec 2013 10:46:18 -0800
On Thu, Dec 19, 2013 at 10:34 AM, Jim Meyering <jim <at> meyering.net> wrote:
> I have confirmed that grep linked with libpcre.a built from upstream
> sources [commit f9d3a72ea5e86a674a9836b462e1231ecce0d739] (8.34) also
> works the way I expect.

More data points: Fedora 20 and OS X both work with pcre-8.33, so I
conclude this is a problem caused by some Debian-specific patch.

I expect to push that patch as-is and defer to a separate commit
(or maybe even skip altogether) any portability hack that might warn
or disable PCRE support when detecting the broken library.




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Sat, 21 Dec 2013 19:02:02 GMT) Full text and rfc822 format available.

Message #56 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Santiago <santiago <at> debian.org>
Cc: 15758-done <15758 <at> debbugs.gnu.org>, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Sat, 21 Dec 2013 11:01:24 -0800
On Sat, Dec 21, 2013 at 10:46 AM, Jim Meyering <jim <at> meyering.net> wrote:
> I expect to push that patch as-is and defer to a separate commit
> (or maybe even skip altogether) any portability hack that might warn
> or disable PCRE support when detecting the broken library.

Pushed.  Let's take any discussion of grep's workaround for Debian's
PCRE problem to a new thread/issue.




Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Tue, 31 Dec 2013 18:54:01 GMT) Full text and rfc822 format available.

Message #59 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Santiago <santiago <at> debian.org>
Cc: 15758-done <15758 <at> debbugs.gnu.org>, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Tue, 31 Dec 2013 10:53:13 -0800
On Sat, Dec 21, 2013 at 11:01 AM, Jim Meyering <jim <at> meyering.net> wrote:
> On Sat, Dec 21, 2013 at 10:46 AM, Jim Meyering <jim <at> meyering.net> wrote:
>> I expect to push that patch as-is and defer to a separate commit
>> (or maybe even skip altogether) any portability hack that might warn
>> or disable PCRE support when detecting the broken library.
>
> Pushed.  Let's take any discussion of grep's workaround for Debian's
> PCRE problem to a new thread/issue.

Hmm... I was chagrined not to be able to reproduce the output I quoted
above, so dug into it and found the real error (mine), fixed it and
adjusted the test:
[k.txt (text/plain, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Tue, 07 Jan 2014 11:02:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-grep <at> gnu.org:
bug#15758; Package grep. (Thu, 23 Jan 2014 09:05:02 GMT) Full text and rfc822 format available.

Message #104 received at 15758 <at> debbugs.gnu.org (full text, mbox):

From: Santiago <santiago <at> debian.org>
To: Jim Meyering <jim <at> meyering.net>
Cc: 15758 <15758 <at> debbugs.gnu.org>, 730472 <730472 <at> bugs.debian.org>
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Thu, 23 Jan 2014 10:03:25 +0100
On Tue, Dec 31, 2013 at 10:53:13AM -0800, Jim Meyering wrote:
> On Sat, Dec 21, 2013 at 11:01 AM, Jim Meyering <jim <at> meyering.net> wrote:
> > On Sat, Dec 21, 2013 at 10:46 AM, Jim Meyering <jim <at> meyering.net> wrote:
> >> I expect to push that patch as-is and defer to a separate commit
> >> (or maybe even skip altogether) any portability hack that might warn
> >> or disable PCRE support when detecting the broken library.
> >
> > Pushed.  Let's take any discussion of grep's workaround for Debian's
> > PCRE problem to a new thread/issue.
> 
> Hmm... I was chagrined not to be able to reproduce the output I quoted
> above, so dug into it and found the real error (mine), fixed it and
> adjusted the test:

(Sorry, I was forgetting to answer you, my holidays were quite long.)

Great! It works and it's on debian unstable now. 

Thanks,

Santiago




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 20 Feb 2014 12:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 10 years and 60 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.