GNU bug report logs - #63272
29.0.90; xref fails on long lines

Package: emacs;

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Thu, 4 May 2023 15:16:03 UTC

Severity: normal

Tags: notabug

Found in version 29.0.90

Fixed in version 29.0.60

Done: Juri Linkov <juri <at> linkov.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 63272 in the body.
You can then email your comments to 63272 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#63272; Package emacs. (Thu, 04 May 2023 15:16:03 GMT)

Acknowledgement sent to Juri Linkov <juri <at> linkov.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 04 May 2023 15:16:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Juri Linkov <juri <at> linkov.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.90; xref fails on long lines
Date: Thu, 04 May 2023 18:07:46 +0300
1. Create a file with a long line, e.g. type

   a C-u 500000 b c

Save the file and commit to git.
(long-line-optimizations-p returns t)

2. Try to search a regexp that matches the whole long line, e.g.

   C-x p g a.*c RET

Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")
  xref--collect-matches-1("a.*c" "/tmp/file" 1 1 500003 nil)
  xref--collect-matches((1 "/tmp/file" "abbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb...") "a.*c" #<buffer  *xref-temp*> nil)
  #f(compiled-function (hit) #<bytecode 0x122b1e80d7055e69>)((1 "/tmp/file" "abbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb..."))
  xref--convert-hits(((1 "/tmp/file" "abbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb...")) "a.*c")
  xref-matches-in-files("a.*c" ("/tmp/file"))
  project--find-regexp-in-files("a.*c" ("/tmp/file"))
  apply(project--find-regexp-in-files ("a.*c" ("/tmp/file")))
  #f(compiled-function (&rest args2) #<bytecode -0xae28f07f9498cbf>)()
  xref--show-xref-buffer(#f(compiled-function (&rest args2) #<bytecode -0xae28f07f9498cbf>) ((window . #<window 3 on tmp>) (display-action) (auto-jump)))
  xref--show-xrefs(#f(compiled-function (&rest args2) #<bytecode -0xae28f07f9498cbf>) nil)
  xref-show-xrefs(#f(compiled-function (&rest args2) #<bytecode -0xae28f07f9498cbf>) nil)
  project-find-regexp("a.*c")
  funcall-interactively(project-find-regexp "a.*c")
  call-interactively(project-find-regexp nil nil)
  command-execute(project-find-regexp)
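
Step 1 of the recipe can also be prepared outside Emacs. The following is a minimal sketch, not part of the original report: the helper name `write_long_line_file` and the temporary path are hypothetical. It writes one line consisting of `a`, 500000 copies of `b`, and `c`, which is 500002 characters long and so agrees with the buffer positions 1 to 500003 shown in the backtrace.

```python
import tempfile
from pathlib import Path


def write_long_line_file(path: Path, repeat: int = 500_000) -> int:
    """Write 'a', then `repeat` copies of 'b', then 'c' as a single line.

    Returns the length of that line (excluding the trailing newline).
    """
    line = "a" + "b" * repeat + "c"
    path.write_text(line + "\n", encoding="ascii")
    return len(line)


if __name__ == "__main__":
    # Hypothetical location; the report used /tmp/file inside a git repository.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "file"
        print(write_long_line_file(target))
```

The file still has to be committed to git before `C-x p g` will search it, as the report notes.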




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63272; Package emacs. (Thu, 04 May 2023 15:34:02 GMT)

Message #8 received at 63272 <at> debbugs.gnu.org:

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Juri Linkov <juri <at> linkov.net>, 63272 <at> debbugs.gnu.org
Subject: Re: bug#63272: 29.0.90; xref fails on long lines
Date: Thu, 4 May 2023 18:33:17 +0300
On 04/05/2023 18:07, Juri Linkov wrote:
> 1. Create a file with a long line, e.g. type
> 
>     a C-u 500000 b c
> 
> Save the file and commit to git.
> (long-line-optimizations-p returns t)
> 
> 2. Try to search a regexp that matches the whole long line, e.g.
> 
>     C-x p g a.*c RET
> 
> Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")

Isn't that more like a problem with the regexp you entered, or with our 
regexp engine?

E.g. try this:

  C-x p g a[^c]*c RET
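
The two regexps match the same text on the reported line, but they are not equivalent in general. The sketch below illustrates this with Python's `re` module, which is a different engine from the Emacs matcher and does not reproduce the stack overflow. The helper `first_match` and the sample strings are hypothetical.

```python
import re

GREEDY_DOT = re.compile(r"a.*c")        # '.' may run past intermediate 'c's
NEGATED_CLASS = re.compile(r"a[^c]*c")  # stops at the first 'c' after 'a'


def first_match(pattern: re.Pattern, text: str):
    """Return the text of the first match, or None if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else None


# A shortened stand-in for the line from the report.
long_line = "a" + "b" * 1000 + "c"

if __name__ == "__main__":
    # Both patterns cover the whole line when it contains a single 'c'.
    print(first_match(GREEDY_DOT, long_line) == first_match(NEGATED_CLASS, long_line))
    # With several 'c's the greedy form extends to the last one.
    print(first_match(GREEDY_DOT, "abcbc"), first_match(NEGATED_CLASS, "abcbc"))
```

Because `[^c]*` can never consume a `c`, the matcher has only one place to look for the closing `c`, whereas `.*` first consumes the rest of the line and then has to back up.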




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63272; Package emacs. (Thu, 04 May 2023 15:59:02 GMT)

Message #11 received at submit <at> debbugs.gnu.org:

From: Gregory Heytings <gregory <at> heytings.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 63272 <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
Subject: Re: bug#63272: 29.0.90; xref fails on long lines
Date: Thu, 04 May 2023 15:58:02 +0000
>
> 1. Create a file with a long line, e.g. type
>
> a C-u 500000 b c
>
> Save the file and commit to git.
>
> (long-line-optimizations-p returns t)
>
> 2. Try to search a regexp that matches the whole long line, e.g.
>
>   C-x p g a.*c RET
>
> Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")
>

That seems to be a problem with/limitation of the regexp engine that is 
not immediately related to (displaying) long lines.  After (setq 
long-line-threshold nil) you will get the same error.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63272; Package emacs. (Fri, 05 May 2023 09:14:01 GMT)

Message #17 received at 63272 <at> debbugs.gnu.org:

From: Gregory Heytings <gregory <at> heytings.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 63272 <at> debbugs.gnu.org
Subject: Re: bug#63272: 29.0.90; xref fails on long lines
Date: Fri, 05 May 2023 09:13:43 +0000
>> 2. Try to search a regexp that matches the whole long line, e.g.
>>
>> C-x p g a.*c RET
>>
>> Debugger entered--Lisp error: (error "Stack overflow in regexp 
>> matcher")
>
> That seems to be a problem with/limitation of the regexp engine that is 
> not immediately related to (displaying) long lines.  After (setq 
> long-line-threshold nil) you will get the same error.
>

By the way, you may want to have a look at bug#61514, in which such 
problematic regexps are discussed.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63272; Package emacs. (Fri, 05 May 2023 17:54:02 GMT)

Message #20 received at 63272 <at> debbugs.gnu.org:

From: Juri Linkov <juri <at> linkov.net>
To: Gregory Heytings <gregory <at> heytings.org>
Cc: 63272 <at> debbugs.gnu.org
Subject: Re: bug#63272: 29.0.90; xref fails on long lines
Date: Fri, 05 May 2023 20:39:11 +0300
>>> 2. Try to search a regexp that matches the whole long line, e.g.
>>>
>>> C-x p g a.*c RET
>>>
>>> Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")
>>
>> That seems to be a problem with/limitation of the regexp engine that is
>> not immediately related to (displaying) long lines.  After (setq
>> long-line-threshold nil) you will get the same error.
>
> By the way, you may want to have a look at bug#61514, in which such
> problematic regexps are discussed.

Thanks for the reference, I see it's fixed, and I can't find more problems.




Message #23 received at 63272 <at> debbugs.gnu.org:

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 63272 <at> debbugs.gnu.org
Subject: Re: bug#63272: 29.0.90; xref fails on long lines
Date: Fri, 05 May 2023 20:50:19 +0300
tags 63272 notabug
close 63272 29.0.60
thanks

>> 1. Create a file with a long line, e.g. type
>>     a C-u 500000 b c
>> Save the file and commit to git.
>> (long-line-optimizations-p returns t)
>> 2. Try to search a regexp that matches the whole long line, e.g.
>>     C-x p g a.*c RET
>> Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")
>
> Isn't that more like a problem with the regexp you entered, or with our
> regexp engine?
>
> E.g. try this:
>
>   C-x p g a[^c]*c RET

In the real case the prefix and suffix were unique, so I didn't expect
that in a generated file there was a very long distance between prefix
and suffix.  To limit the distance between prefix and suffix I tried:

  C-x p g prefix.{0,100}suffix

but xref that uses ripgrep fails to find matches.
So I needed to fall back to ripgrep-based rgrep in this case:

  M-x rgrep prefix.{0,100}suffix

that works successfully.

It seems the problem is because of different regexp syntax
used by ripgrep and re-search-forward in xref--collect-matches-1.

Since this is a separate problem, I'm closing this one,
then a new feature request could be opened if you want.
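
The effect of bounding the gap between prefix and suffix can be sketched as follows. This is an illustration in Python, whose `re` module writes the interval operator with unescaped braces; Emacs regexps spell it `\{0,100\}`. The helper `has_bounded_match` and the sample strings are hypothetical.

```python
import re

# At most 100 arbitrary characters (no newlines) between the two literals.
BOUNDED = re.compile(r"prefix.{0,100}suffix")


def has_bounded_match(text: str) -> bool:
    """Return True if prefix and suffix occur within 100 characters of each other."""
    return BOUNDED.search(text) is not None


near = "prefix" + "x" * 100 + "suffix"  # gap of exactly 100: matches
far = "prefix" + "x" * 101 + "suffix"   # gap of 101: no match

if __name__ == "__main__":
    print(has_bounded_match(near), has_bounded_match(far))
```

Unlike `prefix.*suffix`, the bounded form never needs to scan more than 100 characters past an occurrence of the prefix, which is what makes it safe on very long generated lines.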




Added tag(s) notabug. Request was from Juri Linkov <juri <at> linkov.net> to control <at> debbugs.gnu.org. (Fri, 05 May 2023 17:54:03 GMT)

bug marked as fixed in version 29.0.60, send any further explanations to 63272 <at> debbugs.gnu.org and Juri Linkov <juri <at> linkov.net>. Request was from Juri Linkov <juri <at> linkov.net> to control <at> debbugs.gnu.org. (Fri, 05 May 2023 17:54:03 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63272; Package emacs. (Fri, 05 May 2023 20:50:02 GMT)

Message #30 received at 63272 <at> debbugs.gnu.org:

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Juri Linkov <juri <at> linkov.net>
Cc: 63272 <at> debbugs.gnu.org
Subject: Re: bug#63272: 29.0.90; xref fails on long lines
Date: Fri, 5 May 2023 23:49:14 +0300
On 05/05/2023 20:50, Juri Linkov wrote:
> In the real case the prefix and suffix were unique, so I didn't expect
> that in a generated file there was a very long distance between prefix
> and suffix.  To limit the distance between prefix and suffix I tried:
> 
>    C-x p g prefix.{0,100}suffix

Try escaping { and }:

  C-x p g prefix.\{0,100\}suffix

The regexp needs to use the syntax that Emacs can understand.

Just the subset of it that can be translated to the command line, so that 
Grep and Ripgrep can work with it too.

> but xref that uses ripgrep fails to find matches.
> So needed to fall back to ripgrep-based rgrep in this case:
> 
>    M-x rgrep prefix.{0,100}suffix
> 
> that works successfully.
> 
> It seems the problem is because of different regexp syntax
> used by ripgrep and re-search-forward in xref--collect-matches-1.

The conversion is performed by xref--regexp-to-extended.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63272; Package emacs. (Mon, 08 May 2023 15:57:02 GMT)

Message #33 received at 63272 <at> debbugs.gnu.org:

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 63272 <at> debbugs.gnu.org
Subject: Re: bug#63272: 29.0.90; xref fails on long lines
Date: Mon, 08 May 2023 18:41:13 +0300
>> In the real case the prefix and suffix were unique, so I didn't expect
>> that in a generated file there was a very long distance between prefix
>> and suffix.  To limit the distance between prefix and suffix I tried:
>>    C-x p g prefix.{0,100}suffix
>
> Try escaping { and }:
>
>   C-x p g prefix.\{0,100\}suffix

Thanks, this works.

> The regexp needs to use the syntax that Emacs can understand.
>
> Just the subset of it that can be translated to the command line, so that Grep
> and Ripgrep can work with it too.
>
>> but xref that uses ripgrep fails to find matches.
>> So needed to fall back to ripgrep-based rgrep in this case:
>>    M-x rgrep prefix.{0,100}suffix
>> that works successfully.
>> It seems the problem is because of different regexp syntax
>> used by ripgrep and re-search-forward in xref--collect-matches-1.
>
> The conversion is performed by xref--regexp-to-extended.

Shouldn't the conversion also escape { and } ?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63272; Package emacs. (Mon, 08 May 2023 19:00:02 GMT)

Message #36 received at 63272 <at> debbugs.gnu.org:

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Juri Linkov <juri <at> linkov.net>
Cc: 63272 <at> debbugs.gnu.org
Subject: Re: bug#63272: 29.0.90; xref fails on long lines
Date: Mon, 8 May 2023 21:59:05 +0300
On 08/05/2023 18:41, Juri Linkov wrote:
>>> In the real case the prefix and suffix were unique, so I didn't expect
>>> that in a generated file there was a very long distance between prefix
>>> and suffix.  To limit the distance between prefix and suffix I tried:
>>>     C-x p g prefix.{0,100}suffix
>>
>> Try escaping { and }:
>>
>>    C-x p g prefix.\{0,100\}suffix
> 
> Thanks, this works.
> 
>> The regexp needs to use the syntax that Emacs can understand.
>>
>> Just the subset of it that can be translated to the command line, so that Grep
>> and Ripgrep can work with it too.
>>
>>> but xref that uses ripgrep fails to find matches.
>>> So needed to fall back to ripgrep-based rgrep in this case:
>>>     M-x rgrep prefix.{0,100}suffix
>>> that works successfully.
>>> It seems the problem is because of different regexp syntax
>>> used by ripgrep and re-search-forward in xref--collect-matches-1.
>>
>> The conversion is performed by xref--regexp-to-extended.
> 
> Shouldn't the conversion also escape { and } ?

It _un_escapes them in this case: { and } have to be escaped in Emacs 
regexps, but they don't need to be escaped in "extended regular expressions".

So the conversion toggles escaping.
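
The idea of toggling can be sketched as below. This is a simplified Python model, not the actual code of xref--regexp-to-extended: the function name `toggle_escaping` is hypothetical, and the toggled set (parentheses, braces, and the alternation bar) is chosen here because those are the operators that are written with a backslash in Emacs regexps and without one in extended regular expressions. Bracket expressions are copied unchanged.

```python
# Operators that are escaped in Emacs syntax and bare in extended syntax.
TOGGLED = set("(){}|")


def toggle_escaping(regexp: str) -> str:
    """Swap escaped and bare forms of ( ) { } | outside bracket expressions."""
    out = []
    i, n = 0, len(regexp)
    while i < n:
        ch = regexp[i]
        if ch == "\\" and i + 1 < n:
            nxt = regexp[i + 1]
            # Escaped operator becomes bare; other escape pairs are kept as-is.
            out.append(nxt if nxt in TOGGLED else ch + nxt)
            i += 2
        elif ch == "[":
            # Copy a bracket expression verbatim; a leading ']' is literal.
            j = i + 1
            if j < n and regexp[j] == "^":
                j += 1
            if j < n and regexp[j] == "]":
                j += 1
            while j < n and regexp[j] != "]":
                j += 1
            out.append(regexp[i:j + 1])
            i = j + 1
        elif ch in TOGGLED:
            # Bare operator character becomes a literal in the other syntax.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(toggle_escaping(r"prefix.\{0,100\}suffix"))
```

Under this model, the Emacs-style input `prefix.\{0,100\}suffix` is handed to the external tool as `prefix.{0,100}suffix`, while an unescaped `{` in the input would be passed on as a literal brace.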




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 06 Jun 2023 11:24:10 GMT)

This bug report was last modified 318 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.