GNU bug report logs - #49836
Support ripgrep in semantic-symref-tool-grep
To reply to this bug, email your comments to 49836 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Mon, 02 Aug 2021 21:40:01 GMT)
Acknowledgement sent to Juri Linkov <juri <at> linkov.net>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 02 Aug 2021 21:40:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Creating a separate bug report from bug#49731 because this is a real problem.
Now grep.el completely supports ripgrep when 'grep-find-template'
is customized to a command line that uses 'rg', e.g.:
"find <D> <X> -type f <F> -print0 | sort -z | xargs -0 -e
rg <C> -nH --no-heading -j8 --sort path -M 200 --max-columns-preview -e <R>"
But such a grep setting breaks the command 'xref-find-references':
>>>> 1. while xref-find-references works fine in `emacs -Q`,
>>>> I don't know why with my customization typing e.g.
>>>> 'M-? isearch-lazy-highlight RET' reports
>>>> "No references found for: isearch-lazy-highlight".
>>>
>>> Try and see which of the "tools" semantic-symref-perform-search ends
>>> up using.
>>
>> Thanks for the pointers to semantic-symref-perform-search.
>> It prepends "-n " to my customized pattern "rg -nH",
>> so the arg "-n" is duplicated on the command line:
>> `rg -n -nH`
>> and signals the error:
>> error: The argument '--line-number' was provided more than once, but
>> cannot be used multiple times
>> This error is caused by the bug in the command line parser used by
>> ripgrep:
>> https://github.com/clap-rs/clap/issues/2171
>> that was fixed only 6 months ago, so it will take a long time
>> before this fix reaches ripgrep and this bug is closed:
>> https://github.com/BurntSushi/ripgrep/issues/1701
>
> The above might be worked around by creating a symref-grep specific user
> option for grep-find-template which would default to the "global" value of
> that variable.
Maybe like the existing option 'semantic-symref-grep-shell', e.g.:
(defcustom semantic-symref-grep-program 'grep
"The program to use for regexp search inside files."
:type `(choice
(const :tag "Use Grep" grep)
(const :tag "Use ripgrep" ripgrep)
(symbol :tag "User defined"))
:version "28.1")
But the problem is that it's hard for users to see the connection
between the broken 'xref-find-references' and the need to customize an option
with the unrelated name 'semantic-symref-grep'.
>> But even without the duplicated "-n", semantic-symref-perform-search
>> doesn't work with ripgrep because it doesn't find such a pattern:
>> \\\\\\(\\^\\\\\\|\\\\W\\\\\\)isearch-lazy-highlight\\\\\\(\\\\W\\\\\\|\\$\\\\\\)
>> Maybe semantic-symref-perform-search could be improved to
>> support ripgrep?
>> Because without these two problems it works fine with ripgrep.
>
> ...but the above tells us (I think) that semantic-symref-perform-search is
> trying to use the basic regexp syntax, and ripgrep doesn't support that
> (only Extended, or PCRE).
>
> For your personal consumption, perhaps the best approach is to create
> a separate "tool", like Grep (by copying symref/grep.el and tweaking some
> of its definitions), and then register it in semantic-symref-tool-alist.
>
> I don't know if ripgrep is that much faster for this particular purpose. So
> maybe it's too much work for little benefit.
A more general solution would be to add to grep.el the same options
that you added to xref:
xref-search-program grep/ripgrep
xref-search-program-alist
'((grep . "xargs -0 grep <C> -snHE -e <R>")
(ripgrep . "xargs -0 rg <C> -nH --no-messages -g '!*/' -e <R> | sort -t: -k1,1 -k2n,2"))
This means turning the existing variable 'grep-program' into a user option,
as the following patch does.
Also, grep.el could later use the value "rg" of 'grep-program'
to create the corresponding grep-find-template in grep-compute-defaults.
But I don't know if it's ok to mention ripgrep in grep.el?
Anyway, here is the patch that fixes 'xref-find-references':
[grep-program.patch (text/x-diff, inline)]
diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index 8f0a5acf70..aba4d59371 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -484,9 +484,13 @@ grep-mode-font-lock-keywords
This gets tacked on the end of the generated expressions.")
;;;###autoload
-(defvar grep-program (purecopy "grep")
+(defcustom grep-program (purecopy "grep")
"The default grep program for `grep-command' and `grep-find-command'.
-This variable's value takes effect when `grep-compute-defaults' is called.")
+This variable's value takes effect when `grep-compute-defaults' is called."
+ :type `(choice
+ (const :tag "Use Grep" "grep")
+ (string :tag "User defined"))
+ :version "28.1")
;;;###autoload
(defvar find-program (purecopy "find")
diff --git a/lisp/cedet/semantic/symref/grep.el b/lisp/cedet/semantic/symref/grep.el
index 180d779a78..e13c21bc07 100644
--- a/lisp/cedet/semantic/symref/grep.el
+++ b/lisp/cedet/semantic/symref/grep.el
@@ -150,15 +150,17 @@ semantic-symref-perform-search
"-l ")
((eq (oref tool searchtype) 'regexp)
"-nE ")
- (t "-n ")))
+ (t (if (equal grep-program "rg") "" "-n "))))
(greppat (cond ((eq (oref tool searchtype) 'regexp)
(oref tool searchfor))
(t
;; Can't use the word boundaries: Grep
;; doesn't always agree with the language
;; syntax on those.
- (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
- (oref tool searchfor)))))
+ (if (equal grep-program "rg")
+ (oref tool searchfor)
+ (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
+ (oref tool searchfor))))))
;; Misc
(b (get-buffer-create "*Semantic SymRef*"))
(ans nil)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Tue, 03 Aug 2021 08:16:03 GMT)
Message #8 received at 49836 <at> debbugs.gnu.org (full text, mbox):
tags 49836 + patch
quit
> This means turning the existing variable 'grep-program' into a user option,
> as the following patch does.
>
> Also, grep.el could later use the value "rg" of 'grep-program'
> to create the corresponding grep-find-template in grep-compute-defaults.
>
> But I don't know if it's ok to mention ripgrep in grep.el?
So here is the complete support for ripgrep in grep.el:
[grep-program-ripgrep.patch (text/x-diff, inline)]
diff --git a/lisp/progmodes/xref.el b/lisp/progmodes/xref.el
index 7453dbed99..c4adee609c 100644
--- a/lisp/progmodes/xref.el
+++ b/lisp/progmodes/xref.el
@@ -1544,7 +1544,7 @@ xref-search-program
"The program to use for regexp search inside files.
This must reference a corresponding entry in `xref-search-program-alist'."
- :type `(choice
+ :type '(choice
(const :tag "Use Grep" grep)
(const :tag "Use ripgrep" ripgrep)
(symbol :tag "User defined"))
diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index 8f0a5acf70..cc3375a284 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -484,9 +484,14 @@ grep-mode-font-lock-keywords
This gets tacked on the end of the generated expressions.")
;;;###autoload
-(defvar grep-program (purecopy "grep")
+(defcustom grep-program (purecopy "grep")
"The default grep program for `grep-command' and `grep-find-command'.
-This variable's value takes effect when `grep-compute-defaults' is called.")
+This variable's value takes effect when `grep-compute-defaults' is called."
+ :type '(choice
+ (const :tag "Use Grep" "grep")
+ (const :tag "Use ripgrep" "rg")
+ (string :tag "User defined"))
+ :version "28.1")
;;;###autoload
(defvar find-program (purecopy "find")
@@ -709,13 +714,14 @@ grep-compute-defaults
(let ((grep-options
(concat (if grep-use-null-device "-n" "-nH")
(if grep-use-null-filename-separator " --null")
+ (if (equal grep-program "rg") " --no-heading")
(when (grep-probe grep-program
`(nil nil nil "-e" "foo" ,(null-device))
nil 1)
" -e"))))
(unless grep-command
(setq grep-command
- (format "%s %s %s " grep-program
+ (format "%s%s %s " grep-program
(or
(and grep-highlight-matches
(grep-probe
@@ -723,7 +729,7 @@ grep-compute-defaults
`(nil nil nil "--color" "x" ,(null-device))
nil 1)
(if (eq grep-highlight-matches 'always)
- "--color=always" "--color=auto"))
+ " --color=always" " --color=auto"))
"")
grep-options)))
(unless grep-template
@@ -983,6 +989,8 @@ grep-expand-template
(push "--color=always" opts))
((eq grep-highlight-matches 'auto)
(push "--color=auto" opts)))
+ (when (equal grep-program "rg")
+ (push "--no-heading" opts))
opts))
(excl . ,excl)
(dir . ,dir)
@@ -1131,7 +1139,7 @@ lgrep
files
nil
(and grep-find-ignored-files
- (concat " --exclude="
+ (concat (if (equal grep-program "rg") " -g=!" " --exclude=")
(mapconcat
(lambda (ignore)
(cond ((stringp ignore)
@@ -1141,7 +1149,7 @@ lgrep
(shell-quote-argument
(cdr ignore))))))
grep-find-ignored-files
- " --exclude=")))
+ (if (equal grep-program "rg") " -g=!" " --exclude="))))
(and (eq grep-use-directories-skip t)
'("--directories=skip"))))
(when command
@@ -1339,7 +1347,8 @@ zrgrep
nil default-directory t))
(confirm (equal current-prefix-arg '(4))))
(list regexp files dir confirm grep-find-template)))))))
- (let ((grep-find-template template)
+ (let ((grep-program "zgrep")
+ (grep-find-template template)
;; Set `grep-highlight-matches' to `always'
;; since `zgrep' puts filters in the grep output.
(grep-highlight-matches 'always))
diff --git a/lisp/cedet/semantic/symref/grep.el b/lisp/cedet/semantic/symref/grep.el
index 180d779a78..034f797076 100644
--- a/lisp/cedet/semantic/symref/grep.el
+++ b/lisp/cedet/semantic/symref/grep.el
@@ -150,15 +150,14 @@ semantic-symref-perform-search
"-l ")
((eq (oref tool searchtype) 'regexp)
"-nE ")
- (t "-n ")))
+ (t (if (equal grep-program "rg") "" "-n "))))
(greppat (cond ((eq (oref tool searchtype) 'regexp)
(oref tool searchfor))
(t
;; Can't use the word boundaries: Grep
;; doesn't always agree with the language
;; syntax on those.
- (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
- (oref tool searchfor)))))
+ (format "\\b%s\\b" (oref tool searchfor)))))
;; Misc
(b (get-buffer-create "*Semantic SymRef*"))
(ans nil)
Added tag(s) patch. Request was from Juri Linkov <juri <at> linkov.net> to control <at> debbugs.gnu.org. (Tue, 03 Aug 2021 08:16:03 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Wed, 04 Aug 2021 03:16:02 GMT)
Message #13 received at 49836 <at> debbugs.gnu.org (full text, mbox):
I think we can improve this part:
On 03.08.2021 11:10, Juri Linkov wrote:
> diff --git a/lisp/cedet/semantic/symref/grep.el b/lisp/cedet/semantic/symref/grep.el
> index 180d779a78..034f797076 100644
> --- a/lisp/cedet/semantic/symref/grep.el
> +++ b/lisp/cedet/semantic/symref/grep.el
> @@ -150,15 +150,14 @@ semantic-symref-perform-search
> "-l ")
> ((eq (oref tool searchtype) 'regexp)
> "-nE ")
> - (t "-n ")))
> + (t (if (equal grep-program "rg") "" "-n "))))
It might be cleaner to see whether grep-find-template already includes
that flag, and if so, omit it. Though the search might be non-trivial if
it's in the form like "-abcn", still, that's searchable by regexp.
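A rough sketch of such a check, assuming the flags of interest follow the <C> placeholder and that short options are written as a single-dash cluster of letters. The function name is hypothetical and is not part of grep.el or symref/grep.el:

```elisp
(defun my-grep-template-has-line-number-flag-p (template)
  "Return non-nil if TEMPLATE seems to pass -n after the <C> placeholder.
Hypothetical helper.  It recognizes clusters such as -n, -nH or -abcn,
and ignores long options such as --no-heading."
  (let* ((case-fold-search nil)
         (start (string-match "<C>" template)))
    (and start
         ;; A single dash, any letters, an `n', any letters, then a
         ;; separator or the end of the string.
         (string-match-p
          "[ \t]-[[:alpha:]]*n[[:alpha:]]*\\(?:[ \t]\\|\\'\\)"
          template start)
         t)))
```

Searching only after <C> keeps find's own "-print0" from being mistaken for the flag. This is a heuristic: anything later in the pipeline that happens to take a short "-n" option would also be reported.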
> (greppat (cond ((eq (oref tool searchtype) 'regexp)
> (oref tool searchfor))
> (t
> ;; Can't use the word boundaries: Grep
> ;; doesn't always agree with the language
> ;; syntax on those.
> - (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
> - (oref tool searchfor)))))
> + (format "\\b%s\\b" (oref tool searchfor)))))
> ;; Misc
> (b (get-buffer-create "*Semantic SymRef*"))
> (ans nil)
I think the original idea (surrounding with \W) is sound: after all, not
every symbol boundary in Emacs sense is a word boundary in Grep or RG.
If a method, say, ends with ?, then it won't be.
The problem with the above regexp is that it uses the basic syntax,
instead of Extended. But we can flip it.
As long as we're able to ask Grep to search with Extended syntax, we can
use (format "(^|\\W)%s(\\W|$)" (oref tool searchfor)). And that can be
achieved with the same method as is used in xref-matches-in-directory:
Something like (replace-regexp-in-string "grep <C>" "grep <C> -E"
grep-find-template t t),
to be sure it's not ripgrep in there.
The new user option can be used too, but I'd probably prefer a more
independent solution here.
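A minimal sketch of that idea, assuming the template invokes the program literally as "grep <C>". The function names are hypothetical; only the replace-regexp-in-string call and the pattern format are taken from the suggestion above:

```elisp
(defun my-symref-extended-template (template)
  "Return TEMPLATE with -E added after \"grep <C>\", if that text occurs.
Templates that run another program, such as rg, are returned unchanged."
  (replace-regexp-in-string "grep <C>" "grep <C> -E" template t t))

(defun my-symref-extended-pattern (quoted-symbol)
  "Wrap QUOTED-SYMBOL in non-word guards, using Extended syntax.
QUOTED-SYMBOL is assumed to be quoted for Extended syntax already."
  (format "(^|\\W)%s(\\W|$)" quoted-symbol))
```

With both pieces in place, the same pattern string can be handed to Grep and to programs that only understand Extended syntax.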
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Wed, 04 Aug 2021 21:26:02 GMT)
Message #16 received at 49836 <at> debbugs.gnu.org (full text, mbox):
>> @@ -150,15 +150,14 @@ semantic-symref-perform-search
>> "-l ")
>> ((eq (oref tool searchtype) 'regexp)
>> "-nE ")
>> - (t "-n ")))
>> + (t (if (equal grep-program "rg") "" "-n "))))
>
> It might be cleaner to see whether grep-find-template already includes that
> flag, and if so, omit it. Though the search might be non-trivial if it's in
> the form like "-abcn", still, that's searchable by regexp.
Indeed, but such a hack is a temporary measure and can be removed later,
once ripgrep is fixed.
>> ;; Can't use the word boundaries: Grep
>> ;; doesn't always agree with the language
>> ;; syntax on those.
>> - (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
>> - (oref tool searchfor)))))
>> + (format "\\b%s\\b" (oref tool searchfor)))))
>
> I think the original idea (surrounding with \W) is sound: after all, not
> every symbol boundary in Emacs sense is a word boundary in Grep or RG. If
> a method, say, ends with ?, then it won't be.
I tried to search for 'soap-type-is-array?' in the Emacs tree,
and ripgrep can find it with "\\b%s\\b", but Grep can't.
> The problem with the above regexp is that it uses the basic syntax, instead
> of Extended. But we can flip it.
>
> As long as we're able to ask Grep to search with Extended syntax, we can
> use (format "(^|\\W)%s(\\W|$)" (oref tool searchfor)). And that can be
> achieved with the same method as is used in xref-matches-in-directory:
>
> Something like (replace-regexp-in-string "grep <C>" "grep <C> -E"
> grep-find-template t t), to be sure it's not ripgrep in there.
>
> The new user option can be used too, but I'd probably prefer a more
> independent solution here.
It would be preferable not to change the existing default logic,
to avoid possible trouble. Since Grep with Basic syntax works fine,
it is better not to switch to Extended syntax.
The new user option is already used in many places in grep.el
in the previous patch, so it should be ok to use it in semantic-symref
as well:
[semantic-symref-perform-search-rg.patch (text/x-diff, inline)]
diff --git a/lisp/cedet/semantic/symref/grep.el b/lisp/cedet/semantic/symref/grep.el
index 180d779a78..b7d08409aa 100644
--- a/lisp/cedet/semantic/symref/grep.el
+++ b/lisp/cedet/semantic/symref/grep.el
@@ -150,15 +150,22 @@ semantic-symref-perform-search
"-l ")
((eq (oref tool searchtype) 'regexp)
"-nE ")
- (t "-n ")))
+ (t (if (equal grep-program "rg")
+ ;; TODO: remove this after ripgrep is fixed (bug#49836)
+ (unless (string-search "rg <C> -nH" grep-find-template)
+ "-n ")
+ "-n "))))
(greppat (cond ((eq (oref tool searchtype) 'regexp)
(oref tool searchfor))
(t
;; Can't use the word boundaries: Grep
;; doesn't always agree with the language
;; syntax on those.
- (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
- (oref tool searchfor)))))
+ (if (equal grep-program "rg")
+ (format "(^|\\W)%s(\\W|$)"
+ (oref tool searchfor))
+ (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
+ (oref tool searchfor))))))
;; Misc
(b (get-buffer-create "*Semantic SymRef*"))
(ans nil)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Thu, 05 Aug 2021 03:04:01 GMT)
Message #19 received at 49836 <at> debbugs.gnu.org (full text, mbox):
On 05.08.2021 00:23, Juri Linkov wrote:
>> I think the original idea (surrounding with \W) is sound: after all, not
>> every symbol boundary in Emacs sense is a word boundary in Grep or RG. If
>> a method, say, ends with ?, then it won't be.
> I tried to search for 'soap-type-is-array?' in the Emacs tree,
> and ripgrep can find it with "\\b%s\\b", but Grep can't.
Did you search through symref, or in the console? If the former, it seems
some regexp-quoting is missing somewhere (the question mark was not
escaped). Because see what the console says:
$ rg "\bsoap-type-is-array?\b"
lisp/net/soap-client.el
950:(defun soap-type-is-array? (type)
990: (if (soap-type-is-array? type)
ChangeLog.2
19080: * lisp/net/soap-client.el (soap-type-is-array?): new defun
$ rg "\bsoap-type-is-array\?\b"
^^ no matches
And
$ rg "\bsoap-type-is-array\?"
has matches, of course.
> It would be preferable not to change the existing default logic,
> to avoid possible trouble. Since Grep with Basic syntax works fine,
> it is better not to switch to Extended syntax.
See above. But also consider what happens if a user sees that
grep-program is now customizable and ripgrep is an officially supported
value. They change it to "rg", and then suddenly their 'M-x rgrep' input
has to use the extended regexp format?
Worse than that, any third-party package that uses grep-find-template
will suddenly have a high chance of failing if they pass any nontrivial
regexps to it, especially if those have groupings or alternations.
It's a hard problem: grep.el is not prepared for abstracting like that.
If we at least standardized it internally on Extended format, that would
at least remove one source of uncertainty and bugs. The user still can
input basic regexps interactively, we can translate them easily.
Further: seeing xref-search-program-alist, people asked for support for
other similar programs, such as ag and pt. Any solution we end up with,
we should try to ensure they are valid values of grep-program as well.
> The new user option is already used in many places in grep.el
> in the previous patch, so it should be ok to use it in semantic-symref
> as well:
>
> diff --git a/lisp/cedet/semantic/symref/grep.el b/lisp/cedet/semantic/symref/grep.el
> index 180d779a78..b7d08409aa 100644
> --- a/lisp/cedet/semantic/symref/grep.el
> +++ b/lisp/cedet/semantic/symref/grep.el
> @@ -150,15 +150,22 @@ semantic-symref-perform-search
> "-l ")
> ((eq (oref tool searchtype) 'regexp)
> "-nE ")
> - (t "-n ")))
> + (t (if (equal grep-program "rg")
> + ;; TODO: remove this after ripgrep is fixed (bug#49836)
> + (unless (string-search "rg <C> -nH" grep-find-template)
> + "-n ")
> + "-n "))))
I'm actually fine with this part.
> (greppat (cond ((eq (oref tool searchtype) 'regexp)
> (oref tool searchfor))
> (t
> ;; Can't use the word boundaries: Grep
> ;; doesn't always agree with the language
> ;; syntax on those.
> - (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
> - (oref tool searchfor)))))
> + (if (equal grep-program "rg")
> + (format "(^|\\W)%s(\\W|$)"
> + (oref tool searchfor))
> + (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
> + (oref tool searchfor))))))
This can work. Except the comparison should be with "grep", I think: all
other alternatives only work with the Extended format.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Fri, 06 Aug 2021 00:51:01 GMT)
Message #22 received at 49836 <at> debbugs.gnu.org (full text, mbox):
>>> I think the original idea (surrounding with \W) is sound: after all, not
>>> every symbol boundary in Emacs sense is a word boundary in Grep or RG. If
>>> a method, say, ends with ?, then it won't be.
>> I tried to search for 'soap-type-is-array?' in the Emacs tree,
>> and ripgrep can find it with "\\b%s\\b", but Grep can't.
>
> Did you search through symref, or in the console? If the former, it seems some
> regexp-quoting is missing somewhere (the question mark was not
> escaped). Because see what the console says:
>
> $ rg "\bsoap-type-is-array?\b"
> lisp/net/soap-client.el
> 950:(defun soap-type-is-array? (type)
> 990: (if (soap-type-is-array? type)
>
> ChangeLog.2
> 19080: * lisp/net/soap-client.el (soap-type-is-array?): new defun
>
> $ rg "\bsoap-type-is-array\?\b"
>
> ^^ no matches
>
> And
>
> $ rg "\bsoap-type-is-array\?"
>
> has matches, of course.
semantic-symref-grep-use-template constructs a command line like this:
"rg ... -e \\\\bsoap-type-is-array\\?\\\\b"
that finds matches.
>> It would be preferable not to change the existing default logic,
>> to avoid possible trouble. Since Grep with Basic syntax works fine,
>> it is better not to switch to Extended syntax.
>
> See above. But also consider what happens if a user sees that grep-program
> is now customizable and ripgrep is an officially supported value. They
> change it to "rg", and then suddenly their 'M-x rgrep' input has to use the
> extended regexp format?
This difference could be explained in the documentation.
> Worse than that, any third-party package that uses grep-find-template will
> suddenly have a high chance of failing if they pass any nontrivial regexps
> to it, especially if those have groupings or alternations.
This has already happened: customizing grep-find-template
to use rg broke xref-find-references, so the problem already exists
and needs to be solved.
> It's a hard problem: grep.el is not prepared for abstracting like that. If
> we at least standardized it internally on Extended format, that would at
> least remove one source of uncertainty and bugs. The user still can input
> basic regexps interactively, we can translate them easily.
Is there a package that can translate between them reliably?
> Further: seeing xref-search-program-alist, people asked for support for
> other similar programs, such as ag and pt. Any solution we end up with, we
> should try to ensure they are valid values of grep-program as well.
Why not? semantic-symref already supports alternative tools
such as cscope, global, idutils. So xref could support more too.
>> + (if (equal grep-program "rg")
>> + (format "(^|\\W)%s(\\W|$)"
>> + (oref tool searchfor))
>> + (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
>> + (oref tool searchfor))))))
>
> This can work. Except the comparison should be with "grep", I think: all
> other alternatives only work with the Extended format.
I'm worried about the case when the user customizes
'grep-program' to e.g. an absolute path "/bin/grep"
or "/usr/local/bin/grep", etc.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sat, 07 Aug 2021 14:13:02 GMT)
Message #25 received at 49836 <at> debbugs.gnu.org (full text, mbox):
On 06.08.2021 03:35, Juri Linkov wrote:
> semantic-symref-grep-use-template constructs a command line like this:
>
> "rg ... -e \\\\bsoap-type-is-array\\?\\\\b"
>
> that finds matches.
The correct one will probably look like
"rg ... -e \\\\bsoap-type-is-array\\\\?\\\\b"
(same number of backslashes before '?' as before 'b'), and it won't find
any. The one you mentioned will find false positives.
E.g., try searching for 'file-name-as-directory?'. Or 'carr?'.
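The effect can be illustrated with Emacs's own regexp matcher standing in for the external program. That substitution is an assumption made only to keep the example self-contained; Emacs regexps are not identical to ripgrep's. The helper name is hypothetical:

```elisp
(defun my-demo-line-matches-p (regexp line)
  "Return t if REGEXP matches LINE under the standard syntax table."
  (with-syntax-table (standard-syntax-table)
    (let ((case-fold-search nil))
      (and (string-match-p regexp line) t))))

;; Unescaped `?' makes the preceding `r' optional, so a search for
;; `carr?' also hits plain `car': a false positive.
(my-demo-line-matches-p "\\bcarr?\\b" "(car x)")          ; => t

;; With `?' escaped, the trailing \b can no longer match, because
;; there is no word boundary between `?' and the following space.
(my-demo-line-matches-p "\\bcarr\\?\\b" "(carr? x)")      ; => nil

;; Guarding with non-word characters instead of \b finds the symbol.
(my-demo-line-matches-p "\\(^\\|\\W\\)carr\\?\\(\\W\\|$\\)"
                        "(carr? x)")                      ; => t
```

The second case is the reason the \W guards were used in the first place: a symbol that ends in punctuation has no word boundary after it.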
>> See above. But also consider what happens if a user sees that grep-program
>> is now customizable and ripgrep is an officially supported value. They
>> change it to "rg", and then suddenly their 'M-x rgrep' input has to use the
>> extended regexp format?
>
> This difference could be explained in the documentation.
If it comes to that, yes, but it's usually better to fix usability
problems than just document them.
>> Worse than that, any third-party package that uses grep-find-template will
>> suddenly have a high chance of failing if they pass any nontrivial regexps
>> to it, especially if those have groupings or alternations.
>
> This has already happened: customizing grep-find-template
> to use rg broke xref-find-references, so the problem already exists
> and needs to be solved.
The problem exists, and has for a long time: grep.el doesn't
properly support the "alternative" search programs, which are very
popular now. Its abstraction is leaky and doesn't work with anything but
grep. But I think that means we need a better abstraction.
Let's try to make sure we don't create bigger problems when fixing it.
And "packages stop working when I customize grep-program" sounds worse
than "I can't customize grep-program to 'rg', so my searches are a bit
slower than they could have been".
>> It's a hard problem: grep.el is not prepared for abstracting like that. If
>> we at least standardized it internally on Extended format, that would at
>> least remove one source of uncertainty and bugs. The user still can input
>> basic regexps interactively, we can translate them easily.
>
> Is there a package that can translate between them reliably?
For the limited purpose of symref/grep, we could use
xref--regexp-to-extended. It's already used in xref-matches-in-directory
and xref-matches-in-files. Better name/documentation and tests are pending.
Note that it actually translates from a (subset of) Emacs regexp to
Extended and back (it's reversible). The proper basic regexp syntax
treats '+' and '?' as normal characters unless escaped, but they're
special in Emacs regexps.
The above function is how one can use Emacs syntax (though only a limited
subset, for now) in project-find-regexp.
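The kind of translation discussed here can be sketched as follows. This is a simplified, hypothetical converter, not the code of xref--regexp-to-extended: it only swaps the escaping of grouping, alternation and brace characters, and it does not treat bracket expressions specially.

```elisp
(defun my-emacs-regexp-to-extended (regexp)
  "Swap the escaping of ( ) { } | in REGEXP.
Hypothetical, simplified converter.  Escaped grouping characters lose
their backslash, bare ones gain one, and all other escapes are copied
unchanged.  Bracket expressions are not handled."
  (let ((i 0)
        (len (length regexp))
        (out '()))
    (while (< i len)
      (let ((c (aref regexp i)))
        (cond
         ;; Backslash followed by a grouping char: drop the backslash.
         ((and (eq c ?\\) (< (1+ i) len)
               (memq (aref regexp (1+ i)) '(?\( ?\) ?{ ?} ?|)))
          (push (string (aref regexp (1+ i))) out)
          (setq i (+ i 2)))
         ;; Any other escape: copy both characters unchanged.
         ((and (eq c ?\\) (< (1+ i) len))
          (push (substring regexp i (+ i 2)) out)
          (setq i (+ i 2)))
         ;; Bare grouping char is literal in Emacs syntax: escape it.
         ((memq c '(?\( ?\) ?{ ?} ?|))
          (push (string ?\\ c) out)
          (setq i (1+ i)))
         (t
          (push (string c) out)
          (setq i (1+ i))))))
    (apply #'concat (nreverse out))))
```

Because the function only toggles escaping, applying it twice returns the original string, which matches the reversibility mentioned above.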
I also saw some commits to ELPA yesterday, that show that Consult
includes a more advanced version of this feature:
https://git.savannah.gnu.org/cgit/emacs/elpa.git/commit/?h=externals/consult&id=7bd3e44929d44cf0e17f38e943e9be2bd6014237
https://git.savannah.gnu.org/cgit/emacs/elpa.git/commit/?h=externals/consult&id=95dadd98a6a0f08955f67f1e9a7cc312435a86b8
Not sure how mature it is (seems still in development), but perhaps we
could move it to the core sooner or later. And use it instead, if it
does provide any improvement for our use case here.
>> Further: seeing xref-search-program-alist, people asked for support for
>> other similar programs, such as ag and pt. Any solution we end up with, we
>> should try to ensure they are valid values of grep-program as well.
>
> Why not, semantic-symref already supports alternative tools
> such as cscope, global, idutils. So xref could support more too.
It's easy enough for Xref, yes. It only has to support one single,
well-defined scenario.
>>> + (if (equal grep-program "rg")
>>> + (format "(^|\\W)%s(\\W|$)"
>>> + (oref tool searchfor))
>>> + (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
>>> + (oref tool searchfor))))))
>>
>> This can work. Except the comparison should be with "grep", I think: all
>> other alternatives only work with the Extended format.
>
> I'm worried about the case when the user customizes
> 'grep-program' to e.g. an absolute path "/bin/grep"
> or "/usr/local/bin/grep", etc.
(string-match "\\bgrep\\b" grep-program) could take care of this.
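A small sketch of how that test could drive the choice of syntax. The function names are hypothetical, and the symbol is assumed to be quoted appropriately by the caller:

```elisp
(defun my-symref-basic-syntax-p (program)
  "Return non-nil if PROGRAM looks like plain grep, possibly with a directory.
Uses the standard syntax table so that `/' counts as a non-word character."
  (with-syntax-table (standard-syntax-table)
    (and (string-match-p "\\bgrep\\b" program) t)))

(defun my-symref-symbol-pattern (program quoted-symbol)
  "Return a guard pattern around QUOTED-SYMBOL in the syntax PROGRAM expects."
  (if (my-symref-basic-syntax-p program)
      ;; Basic syntax: groups and alternation need backslashes.
      (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)" quoted-symbol)
    ;; Everything else is assumed to use Extended syntax.
    (format "(^|\\W)%s(\\W|$)" quoted-symbol)))
```

Note that names such as "zgrep" or "egrep" do not match this test, because the letter before "grep" is a word character, so they would be treated as Extended-syntax programs here.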
To sum up, I'm all for adding some crutches to symref/grep.el, to
support your advanced scenario, right now.
As for having grep-program customizable, perhaps we should add some new
feature/abstraction/package? To avoid breakage, and for it to be opt-in
for any new callers from Lisp.
Or indeed have templates use Extended syntax, and grep-expand-template
translate REGEXP to it. That can cause breakage for existing users,
though, those who already customize grep-find-template, etc, to their
particular programs.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Fri, 17 Sep 2021 16:10:03 GMT)
Message #28 received at 49836 <at> debbugs.gnu.org (full text, mbox):
Redirecting here from
https://lists.gnu.org/archive/html/emacs-devel/2021-09/msg01132.html
> Speaking of Ripgrep, the compatible behavior of -w is only with recent
> versions (reported and fixed in
> https://github.com/BurntSushi/ripgrep/issues/389), starting with
> 0.10.0. Debian 10 and Fedora 31 include that version or newer
> (https://repology.org/project/ripgrep/versions).
>
> Not that it's really important: we don't support Ripgrep officially.
Thanks to Mattias, part of the reported problem is solved.
What remains to do here is to support Ripgrep officially.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Fri, 17 Sep 2021 16:25:02 GMT)
Message #31 received at 49836 <at> debbugs.gnu.org (full text, mbox):
Juri Linkov <juri <at> linkov.net> writes:
> What remains to do here is to support Ripgrep officially.
What would that entail? Just documenting it?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sat, 18 Sep 2021 13:55:02 GMT)
Message #34 received at 49836 <at> debbugs.gnu.org (full text, mbox):
Attached is a straw-man patch that is not production-quality but illustrates some of the concepts I'd like to see:
* ripgrep supported for M-?
* ripgrep auto-detected and used by default if available
* treat ripgrep as a search method of its own and not just a different grep program
* corollary: selection of search method should be made symbolically and not by supplying shell command strings
Concerns not addressed by the patch:
* unify back-ends and customisation options in xref.el and symref/grep.el
* tramp
* correct way to auto-detect ripgrep -- I have no idea, really, and would gladly settle for something dead simple, perhaps looking at the output of rg --version, instead of the voodoo code in the patch
* other search methods -- for example, it would be interesting to allow use of Spotlight or similar indexed searching tools in some cases.
* speeding up the parts that are not ripgrep. If the actual command (ripgrep or something else) takes zero seconds, what if anything prevents a crisp snappy response from Emacs?
That said, it appears to work. Right now I'm not near my Linux machine and can just compare against the fairly slow BSD grep that comes with macOS, so obviously the speed-up is tremendous.
[0001-xref-ripgrep-support.patch (application/octet-stream, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sat, 18 Sep 2021 14:15:01 GMT)
Message #37 received at 49836 <at> debbugs.gnu.org (full text, mbox):
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Sat, 18 Sep 2021 15:53:51 +0200
> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 49836 <at> debbugs.gnu.org,
> Dmitry Gutov <dgutov <at> yandex.ru>
>
> If the actual command (ripgrep or something else) takes zero seconds, what if anything prevents a crisp snappy response from Emacs?
The sub-process communications infrastructure, of course.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sat, 18 Sep 2021 14:20:02 GMT)
Message #40 received at 49836 <at> debbugs.gnu.org (full text, mbox):
18 sep. 2021 kl. 16.14 skrev Eli Zaretskii <eliz <at> gnu.org>:
>> If the actual command (ripgrep or something else) takes zero seconds, what if anything prevents a crisp snappy response from Emacs?
>
> The sub-process communications infrastructure, of course.
Yes, very likely, but also post-processing the output from the search process (matching, sorting, etc) and preparing it for display. It would be interesting to see a break-down and to see what, if anything, can be done to make it go faster.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sat, 18 Sep 2021 14:26:01 GMT)
Message #43 received at 49836 <at> debbugs.gnu.org (full text, mbox):
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Sat, 18 Sep 2021 16:18:54 +0200
> Cc: juri <at> linkov.net, larsi <at> gnus.org, 49836 <at> debbugs.gnu.org, dgutov <at> yandex.ru
>
> > The sub-process communications infrastructure, of course.
>
> Yes, very likely, but also post-processing the output from the search process (matching, sorting, etc) and preparing it for display. It would be interesting to see a break-down and to see what, if anything, can be done to make it go faster.
You could build Emacs with profiling and run that, I guess.
And on GNU/Linux, there's Perf.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sat, 18 Sep 2021 18:46:02 GMT)
Message #46 received at 49836 <at> debbugs.gnu.org (full text, mbox):
>> What remains to do here is to support Ripgrep officially.
>
> What would that entail? Just documenting it?
Mattias is adding support for ripgrep to xref and symref/grep.el in
https://debbugs.gnu.org/49836#34
and I sent the first version of ripgrep support for progmodes/grep.el in
https://debbugs.gnu.org/49836#8
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sat, 18 Sep 2021 21:49:01 GMT)
Message #49 received at 49836 <at> debbugs.gnu.org (full text, mbox):
On 18.09.2021 17:18, Mattias Engdegård wrote:
> 18 sep. 2021 kl. 16.14 skrev Eli Zaretskii<eliz <at> gnu.org>:
>
>>> If the actual command (ripgrep or something else) takes zero seconds, what if anything prevents a crisp snappy response from Emacs?
>> The sub-process communications infrastructure, of course.
> Yes, very likely, but also post-processing the output from the search process (matching, sorting, etc) and preparing it for display. It would be interesting to see a break-down and to see what, if anything, can be done to make it go faster.
>
Before you dig in with an OS-level debugger, here are some things to try to
narrow down the problem:
- Do the search for a common term (many matches).
- Do the search for a rare term (with 1-2 matches).
See how the timings inside and outside Emacs compare. I'm guessing the
latter case should be snappy, with a relatively small ratio. Whatever
overhead Emacs has will probably be in the sub-process infrastructure.
To investigate the former case (many matches), I suggest starting with
'benchmark-progn'.
Wrapping the 'process-file' call will measure the shell invocation and
getting the output into the buffer.
diff --git a/lisp/cedet/semantic/symref/grep.el b/lisp/cedet/semantic/symref/grep.el
index 53745b429a..7db5e79c91 100644
--- a/lisp/cedet/semantic/symref/grep.el
+++ b/lisp/cedet/semantic/symref/grep.el
@@ -163,8 +163,9 @@ semantic-symref-perform-search
         (let ((cmd (semantic-symref-grep-use-template
                     (directory-file-name (file-local-name rootdir))
                     filepattern grepflags greppat)))
-          (process-file semantic-symref-grep-shell nil b nil
-                        shell-command-switch cmd)))
+          (benchmark-progn
+            (process-file semantic-symref-grep-shell nil b nil
+                          shell-command-switch cmd))))
       (setq ans (semantic-symref-parse-tool-output tool b))
       ;; Return the answer
       ans))
Measuring the subsequent semantic-symref-parse-tool-output call can also
show something, but it's usually fast.
Wrapping the xref--convert-hits call alone inside
xref-references-in-directory should be more interesting: its work is
less trivial. I don't know how much further it can be optimized, but
help is welcome, of course.
diff --git a/lisp/progmodes/xref.el b/lisp/progmodes/xref.el
index 69cabd0b5a..ab0476b2bb 100644
--- a/lisp/progmodes/xref.el
+++ b/lisp/progmodes/xref.el
@@ -1548,9 +1548,11 @@ xref-references-in-directory
          (inst (semantic-symref-instantiate :searchfor symbol
                                             :searchtype 'symbol
                                             :searchscope 'subdirs
-                                            :resulttype 'line-and-text)))
-    (xref--convert-hits (semantic-symref-perform-search inst)
-                        (format "\\_<%s\\_>" (regexp-quote symbol)))))
+                                            :resulttype 'line-and-text))
+         (search-hits (semantic-symref-perform-search inst)))
+    (benchmark-progn
+      (xref--convert-hits search-hits
+                          (format "\\_<%s\\_>" (regexp-quote symbol))))))

 (define-obsolete-function-alias
   'xref-collect-references
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sat, 18 Sep 2021 23:54:02 GMT)
Message #52 received at 49836 <at> debbugs.gnu.org (full text, mbox):
On 18.09.2021 16:53, Mattias Engdegård wrote:
> Attached is a straw-man patch that is not production-quality but illustrates some of the concepts I'd like to see:
>
> * ripgrep supported for M-?
> * ripgrep auto-detected and used by default if available
> * treat ripgrep as a search method of its own and not just a different grep program
> * corollary: selection of search method should be made symbolically and not by supplying shell command strings
Perhaps we should split off the auto-detection feature and consider the
patch without it first. If people don't mind adding yet another
grep-or-ripgrep custom variable, this can be a reasonable change.
After landing that we could discuss the auto-detection approach, on
local and remote machines, and whether we could manage to do it only
once per host.
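As a rough illustration of what a per-host probe could boil down to, here is a minimal shell sketch. It is not taken from the straw-man patch: the function name `detect_search_program` and the variable `search_program` are hypothetical, and it only assumes that ripgrep, when installed, is on PATH as `rg`.

```shell
#!/bin/sh
# Hypothetical probe (not from the patch): report which search program
# this host offers.  Prints "ripgrep" when an `rg` executable is found
# on PATH, otherwise falls back to "grep".
detect_search_program() {
    if command -v rg >/dev/null 2>&1; then
        echo ripgrep
    else
        echo grep
    fi
}

# Run the probe once and keep the answer, so later searches on the same
# host do not have to repeat it.
search_program=$(detect_search_program)
echo "search program: $search_program"
```

The result is a symbolic name rather than a command string, in line with the idea of selecting the search method symbolically. Doing this only once per host would need a cache keyed by host on the Emacs side, which this sketch does not cover.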
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sun, 19 Sep 2021 00:22:01 GMT)
Message #55 received at 49836 <at> debbugs.gnu.org (full text, mbox):
On 9/18/2021 4:53 PM, Dmitry Gutov wrote:
> Perhaps we should split off the auto-detection feature and consider the
> patch without it first. If people don't mind adding yet another
> grep-or-ripgrep custom variable, this can be a reasonable change.
>
> After landing that we could discuss the auto-detection approach, on
> local and remote machines, and whether we could manage to do it only
> once per host.
I've done something along these lines for `urgrep'[1], a package to
provide something like `M-x rgrep' that works across the seemingly
ever-growing list of grep-like tools out there. I'm still working on
improving the documentation before I think about putting it on ELPA, but
maybe there are some bits in there that would be useful here. I'd be
happy to coordinate on this if there's interest.
[1] https://github.com/jimporter/urgrep
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Sun, 19 Sep 2021 10:13:02 GMT)
Message #58 received at 49836 <at> debbugs.gnu.org (full text, mbox):
19 sep. 2021 kl. 02.21 skrev Jim Porter <jporterbugs <at> gmail.com>:
> [1] https://github.com/jimporter/urgrep
Thank you -- many fine ideas here and I'll be sure to steal at least something, even if the goals are slightly different.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Mon, 20 Sep 2021 00:15:02 GMT)
Message #61 received at 49836 <at> debbugs.gnu.org (full text, mbox):
On 19.09.2021 13:11, Mattias Engdegård wrote:
> even if the goals are slightly different
Are the goals different?
It seems like a good direction: the search program is specified as a
symbol, and the package knows how to generate search queries based on
given requirements (within a certain space).
I'm not sure it's flexible enough to be used in both
xref-matches-in-files and semantic/symref (yet?), but when I tried to
imagine a package that would, it looked fairly similar.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Mon, 20 Sep 2021 05:10:01 GMT)
Message #64 received at 49836 <at> debbugs.gnu.org (full text, mbox):
On 9/19/2021 5:14 PM, Dmitry Gutov wrote:
> On 19.09.2021 13:11, Mattias Engdegård wrote:
>> even if the goals are slightly different
>
> Are the goals different?
I think in the long run, the goals are very much the same. But in the
short run, my goal with urgrep was just to make something that would
work like `rgrep', but support multiple tools. There are already a
multitude of Emacs packages that provide `rgrep'-like functionality for
a particular tool, but I wanted something that worked (almost) the same
no matter what happens to be installed on the system.
> I'm not sure it's flexible enough to be used in both
> xref-matches-in-files and semantic/symref (yet?), but when I tried to
> imagine a package that would, it looked fairly similar.
If there are any (useful) commands that can't be generated with
`urgrep-command', but which most grep-like tools support, I definitely
want to add that capability. The current set of options is really just
what I use semi-regularly, so there's bound to be stuff I missed,
especially regarding semantic/symref.
That said, I don't want to slow things down too much in this bug. Maybe
for Emacs 29 though, it would make sense to put (parts of?) urgrep into
Emacs, since a unified solution would probably be helpful. I'll try to
find some time to post a message to emacs-devel to discuss this and get
some feedback.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#49836; Package emacs. (Mon, 20 Sep 2021 17:05:01 GMT)
Message #67 received at 49836 <at> debbugs.gnu.org (full text, mbox):
On 20.09.2021 08:09, Jim Porter wrote:
> On 9/19/2021 5:14 PM, Dmitry Gutov wrote:
>> On 19.09.2021 13:11, Mattias Engdegård wrote:
>>> even if the goals are slightly different
>>
>> Are the goals different?
>
> I think in the long run, the goals are very much the same. But in the
> short run, my goal with urgrep was just to make something that would
> work like `rgrep', but support multiple tools. There are already a
> multitude of Emacs packages that provide `rgrep'-like functionality for
> a particular tool, but I wanted something that worked (almost) the same
> no matter what happens to be installed on the system.
>
>> I'm not sure it's flexible enough to be used in both
>> xref-matches-in-files and semantic/symref (yet?), but when I tried to
>> imagine a package that would, it looked fairly similar.
>
> If there are any (useful) commands that can't be generated with
> `urgrep-command', but which most grep-like tools support, I definitely
> want to add that capability. The current set of options is really just
> what I use semi-regularly, so there's bound to be stuff I missed,
> especially regarding semantic/symref.
semantic/symref/grep is not too complicated in that regard: the command
looks like this, for example:
find -H ~/vc/emacs-master -type f \( -name \*..\*emacs -o -name \*.ede -o -name \*.el \) -exec grep -nw -nH --null -e mhtml-mode \{\} +
xref's use is slightly different, but ultimately simpler: it assumes
files are piped from stdin. So it's either
xargs -0 rg -i -nH --no-messages -g '!*/' -e xref-search-program
or
xargs -0 grep -i -snHE -e <R>
And the xargs prefix can be just added on outside of your package.
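To make the stdin convention concrete, here is a small self-contained sketch of the grep variant. The temporary directory, the two sample files and the `matches` variable are made up for the illustration; only the `xargs -0 grep -i -snHE -e` part mirrors the command above, with a literal pattern substituted for <R>.

```shell
#!/bin/sh
# Build a throwaway tree so the pipeline has something to search.
dir=$(mktemp -d)
printf 'first line\n(xref-search-program here)\n' > "$dir/a.el"
printf 'nothing relevant\n' > "$dir/b.el"

# The file list arrives on stdin, NUL-separated; the "xargs -0" prefix
# is added outside the search command itself.
matches=$(find "$dir" -type f -name '*.el' -print0 |
              xargs -0 grep -i -snHE -e 'xref-search-program')

# Each hit is printed as FILE:LINE:TEXT because of -n and -H.
echo "$matches"
rm -rf "$dir"
```

Because the search command only ever sees file names from stdin, whatever produces the list (find, a project file listing, etc.) can be swapped without touching the grep or rg part.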
> That said, I don't want to slow things down too much in this bug. Maybe
> for Emacs 29 though, it would make sense to put (parts of?) urgrep into
> Emacs, since a unified solution would probably be helpful. I'll try to
> find some time to post a message to emacs-devel to discuss this and get
> some feedback.
Sure. Thank you.
This bug report was last modified 3 years and 60 days ago.