GNU bug report logs - #44983
Truncate long lines of grep output

Package: emacs

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Tue, 1 Dec 2020 08:56:01 UTC

Severity: normal

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 44983 in the body.
You can then email your comments to 44983 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 01 Dec 2020 08:56:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Juri Linkov <juri <at> linkov.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 01 Dec 2020 08:56:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: bug-gnu-emacs <at> gnu.org
Subject: Truncate long lines of grep output
Date: Tue, 01 Dec 2020 10:45:29 +0200

[New bug report from emacs-devel]
>>>> For grep output a bigger problem is that grep on binary data
>>>> might output too long lines before the terminating newline.
>>>
>>> (*) We already have this kind of problem with "normal" files which contain
>>> minified assets (JS or CSS). The file contents are usually normal ASCII,
>>> but it's just one line which can reach several MBs in length.
>>>
>>> The usual way to deal with that is with project-ignores and
>>> grep-find-ignored-files. That works for both cases.
>> This is a big problem - often grep output lines are so long
>> that Emacs freezes, so one needs to kill the process.  Manually
>> updating ignored-files every time a new file causes a freeze
>> is a very unreliable and time-consuming workaround.
>
> And a non-obvious one (for an average user).
>
> Is the same problem exhibited by commands using the Xref UI? I don't
> remember seeing it, but of course our projects can be very different.

No difference from grep, Xref output has the same problem.

>> I tried to fix this problem, and fortunately the fix is simple
>> with the 1-liner patch.
>> It does exactly the same thing that we recently did to hide
>> overly long grep command lines with 'grep-find-abbreviate'.
>> The patch even uses the same 'grep-find-abbreviate-properties'
>> to allow clicking the hidden part to expand it.
>> diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
>> index dafba22f77..e0df2402ee 100644
>> --- a/lisp/progmodes/grep.el
>> +++ b/lisp/progmodes/grep.el
>> @@ -492,6 +492,9 @@ grep-mode-font-lock-keywords
>>         (0 grep-context-face)
>>         (1 (if (eq (char-after (match-beginning 1)) ?\0)
>>                `(face nil display ,(match-string 2)))))
>> +     ;; Hide excessive parts of grep output lines
>> +     ("^.+?:.\\{,64\\}\\(.*\\).\\{10\\}$"
>> +      1 grep-find-abbreviate-properties)
>>        ;; Hide excessive part of rgrep command
>>        ("^find \\(\\. -type d .*\\\\)\\)"
>>         (1 (if grep-find-abbreviate grep-find-abbreviate-properties
>>
>> More customizability could be added later to define the
>> length of the hidden part, etc.
>
> Maybe we'll want it to be dynamically determined by fill-column.
>
> Or just be a big enough value (e.g. 256) that the only lines where this
> rule is hit are obviously too long.

Or maybe determined by the frame width.

This will avoid the need of using such workarounds as in bug#44941:

grep -a "$@" | cut -c -200

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 01 Dec 2020 15:03:02 GMT) Full text and rfc822 format available.

Message #8 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>, 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 1 Dec 2020 17:02:09 +0200

On 01.12.2020 10:45, Juri Linkov wrote:
> [New bug report from emacs-devel]
>>>>> For grep output a bigger problem is that grep on binary data
>>>>> might output too long lines before the terminating newline.
>>>>
>>>> (*) We already have this kind of problem with "normal" files which contain
>>>> minified assets (JS or CSS). The file contents are usually normal ASCII,
>>>> but it's just one line which can reach several MBs in length.
>>>>
>>>> The usual way to deal with that is with project-ignores and
>>>> grep-find-ignored-files. That works for both cases.
>>> This is a big problem - often grep output lines are so long
>>> that Emacs freezes, so one needs to kill the process.  Manually
>>> updating ignored-files every time a new file causes a freeze
>>> is a very unreliable and time-consuming workaround.
>>
>> And a non-obvious one (for an average user).
>>
>> Is the same problem exhibited by commands using the Xref UI? I don't
>> remember seeing it, but of course our projects can be very different.
> 
> No difference from grep, Xref output has the same problem.

Perhaps (setq truncate-lines t) could help in that case?

Then the lines would be cut at the window width, as you suggest below.

> This will avoid the need of using such workarounds as in bug#44941:
> 
> grep -a "$@" | cut -c -200

That might cut filenames unnecessarily. Even when those are long, we need
them in their entirety.

The Grep results parsing code could be changed to only take the first XY 
characters of each line, though.
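
A minimal sketch of that idea outside Emacs, for illustration only: instead of `cut`, a small awk filter keeps the `FILE:LINE:` prefix printed by `grep -nH` whole and truncates only the text after it. The helper name `truncate_match_text` is hypothetical, and the sketch assumes file names that contain no `:`. Whether `substr` counts bytes or characters depends on the awk implementation and the locale.

```shell
# Hypothetical helper (not part of grep.el): truncate only the matched text,
# leaving the "FILE:LINE:" prefix that `grep -nH` prints untouched.
# Assumption: file names contain no ':' characters.
truncate_match_text() {
    max=${1:-200}
    awk -v max="$max" '{
        # Locate the end of the "FILE:LINE:" prefix.
        if (match($0, /^[^:]*:[0-9]+:/)) {
            prefix = substr($0, 1, RLENGTH)
            text = substr($0, RLENGTH + 1)
            print prefix substr(text, 1, max)
        } else {
            print   # not a match line; pass it through unchanged
        }
    }'
}
```

It would be used as the last stage of the pipeline, for example `grep -nH -e PATTERN FILES... | truncate_match_text 200`.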

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 01 Dec 2020 16:11:01 GMT) Full text and rfc822 format available.

Message #11 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 01 Dec 2020 18:09:50 +0200

> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 1 Dec 2020 17:02:09 +0200
> 
> >>> This is a big problem - often grep output lines are so long
> >>> that Emacs freezes, so one needs to kill the process.  Manually
> >>> updating ignored-files every time a new file causes a freeze
> >>> is a very unreliable and time-consuming workaround.
> >>
> >> And a non-obvious one (for an average user).
> >>
> >> Is the same problem exhibited by commands using the Xref UI? I don't
> >> remember seeing it, but of course our projects can be very different.
> > 
> > No difference from grep, Xref output has the same problem.
> 
> Perhaps (setq truncate-lines t) could help in that case?

Not necessarily, because the truncated parts are still in the buffer,
and the display code which is slow in that case basically moves
through the buffer one character at a time in many cases.  Only some
specific scenarios (read: a small number of commands) can jump to the
next physical line disregarding the truncated parts.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 01 Dec 2020 16:47:02 GMT) Full text and rfc822 format available.

Message #14 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: juri <at> linkov.net, 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 01 Dec 2020 17:46:33 +0100

On Dec 01 2020, Eli Zaretskii wrote:

>> From: Dmitry Gutov <dgutov <at> yandex.ru>
>> Date: Tue, 1 Dec 2020 17:02:09 +0200
>> 
>> >>> This is a big problem - often grep output lines are so long
>> >>> that Emacs freezes, so one needs to kill the process.  Manually
>> >>> updating ignored-files every time a new file causes a freeze
>> >>> is a very unreliable and time-consuming workaround.
>> >>
>> >> And a non-obvious one (for an average user).
>> >>
>> >> Is the same problem exhibited by commands using the Xref UI? I don't
>> >> remember seeing it, but of course our projects can be very different.
>> > 
>> > No difference from grep, Xref output has the same problem.
>> 
>> Perhaps (setq truncate-lines t) could help in that case?
>
> Not necessarily, because the truncated parts are still in the buffer,
> and the display code which is slow in that case basically moves
> through the buffer one character at a time in many cases.  Only some
> specific scenarios (read: a small number of commands) can jump to the
> next physical line disregarding the truncated parts.

But moving through the buffer is much faster than rendering it.

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 01 Dec 2020 18:28:01 GMT) Full text and rfc822 format available.

Message #17 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: juri <at> linkov.net, 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 01 Dec 2020 20:26:59 +0200

> From: Andreas Schwab <schwab <at> linux-m68k.org>
> Cc: Dmitry Gutov <dgutov <at> yandex.ru>,  44983 <at> debbugs.gnu.org,  juri <at> linkov.net
> Date: Tue, 01 Dec 2020 17:46:33 +0100
> 
> > Not necessarily, because the truncated parts are still in the buffer,
> > and the display code which is slow in that case basically moves
> > through the buffer one character at a time in many cases.  Only some
> > specific scenarios (read: a small number of commands) can jump to the
> > next physical line disregarding the truncated parts.
> 
> But moving through the buffer is much faster than rendering it.

I meant moving in the likes of move_it_to.  These simulate display, so
they are almost as slow as rendering itself.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 01 Dec 2020 20:38:02 GMT) Full text and rfc822 format available.

Message #20 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 01 Dec 2020 22:34:37 +0200

>>> Is the same problem exhibited by commands using the Xref UI? I don't
>>> remember seeing it, but of course our projects can be very different.
>> No difference from grep, Xref output has the same problem.
>
> Perhaps (setq truncate-lines t) could help in that case?

I customized truncate-lines to t long ago, and still this doesn't help
to improve performance on long lines in grep output.

> Then the lines would be cut at the window width, as you suggest below.
>
>> This will avoid the need of using such workarounds as in bug#44941:
>> grep -a "$@" | cut -c -200
>
> That might cut filenames unnecessarily. Even when those are long, we need them
> in their entirety.
>
> The Grep results parsing code could be changed to only take the first XY
> characters of each line, though.

The proposed patch doesn't cut filenames, it hides only endings of long lines.
But still performance is not much better on very long lines.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 01 Dec 2020 20:38:02 GMT) Full text and rfc822 format available.

Message #23 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 01 Dec 2020 22:35:55 +0200

>> Perhaps (setq truncate-lines t) could help in that case?
>
> Not necessarily, because the truncated parts are still in the buffer,
> and the display code which is slow in that case basically moves
> through the buffer one character at a time in many cases.  Only some
> specific scenarios (read: a small number of commands) can jump to the
> next physical line disregarding the truncated parts.

It's very strange that after adding the text property 'display "[…]"
on a very long line, motion commands are still very slow in that buffer.

Could you help me understand why hiding long regions
doesn't improve performance?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Wed, 02 Dec 2020 03:23:01 GMT) Full text and rfc822 format available.

Message #26 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Wed, 02 Dec 2020 05:21:58 +0200

> From: Juri Linkov <juri <at> linkov.net>
> Cc: Dmitry Gutov <dgutov <at> yandex.ru>,  44983 <at> debbugs.gnu.org
> Date: Tue, 01 Dec 2020 22:35:55 +0200
> 
> >> Perhaps (setq truncate-lines t) could help in that case?
> >
> > Not necessarily, because the truncated parts are still in the buffer,
> > and the display code which is slow in that case basically moves
> > through the buffer one character at a time in many cases.  Only some
> > specific scenarios (read: a small number of commands) can jump to the
> > next physical line disregarding the truncated parts.
> 
> It's very strange that after adding the text property 'display "[…]"
> on a very long line, motion commands are still very slow in that buffer.
> 
> Could you help me understand why hiding long regions
> doesn't improve performance?

I can try, but please tell which commands are slow.  Is it C-f/C-b,
C-n/C-p, C-v/M-v, something else?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Wed, 02 Dec 2020 09:40:02 GMT) Full text and rfc822 format available.

Message #29 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Wed, 02 Dec 2020 11:35:38 +0200

>> It's very strange that after adding the text property 'display "[…]"
>> on a very long line, motion commands are still very slow in that buffer.
>>
>> Could you help me understand why hiding long regions
>> doesn't improve performance?
>
> I can try, but please tell which commands are slow.  Is it C-f/C-b,
> C-n/C-p, C-v/M-v, something else?

Hmm, something strange is going on.  After inserting million-char lines:

(dotimes (_ 10)
  (insert (propertize (make-string 1000000 ?a)
           'display "[…]" 'invisible t) "\n"))

No problem, everything is still fast, C-f/C-b, C-n/C-p, C-v/M-v
move fast.  After saving to a file, grep on this file is fast
with the previous patch that hides long lines.

However, when grepping on minified web assets files
where all styles and scripts are on one long line,
then output becomes slower and slower as the line
inserted by the grep process filter grows longer.

It works this way: compilation-filter/grep-filter
inserts the next chunk of the long line, then
font-lock applies the rule from the previous patch
that hides the inserted substring starting from the
fixed position from the beginning of the line until
the end of the line, and repeats the same for every
new inserted chunk of the long line.

Maybe instead of using font-lock to hide long parts
of grep lines, it would be better to do the same
directly in compilation-filter/grep-filter?

Or maybe the problem is caused by special characters
used in minified web assets, which contain many '{' chars.
And indeed, after inserting 100 thousand '{' characters

(insert (propertize (make-string 100000 ?{)
         'display "[…]" 'invisible t) "\n")

and saving to a file, visiting that file later makes
Emacs unresponsive indefinitely.
But visiting the file with 100 thousand '{' characters
with find-file-literally causes no such problem.
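
For anyone who wants comparable test files without going through Emacs, here is a minimal shell sketch. It only reproduces the raw long lines as they exist on disk, not the `display`/`invisible` text properties from the snippets above. The function name `long_line` is hypothetical, and `head -c` is assumed to be available (it is in GNU and BSD userlands, but is not required by POSIX).

```shell
# Hypothetical helper: write COUNT copies of CHAR, then a newline, to stdout.
long_line() {
    char=$1
    count=$2
    # Take COUNT NUL bytes and translate each one into CHAR.
    head -c "$count" /dev/zero | tr '\0' "$char"
    printf '\n'
}

# Example invocations (left commented out so nothing is written by default):
#   for i in 1 2 3 4 5 6 7 8 9 10; do long_line a 1000000; done > long-a.txt
#   long_line '{' 100000 > braces.txt
```

Running grep on `long-a.txt` or visiting `braces.txt` should then exercise the same two cases described in this message.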

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Wed, 02 Dec 2020 10:29:02 GMT) Full text and rfc822 format available.

Message #32 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Wed, 02 Dec 2020 12:28:17 +0200

On December 2, 2020 11:35:38 AM GMT+02:00, Juri Linkov <juri <at> linkov.net> wrote:
>
> Or maybe the problem is caused by special characters
> used in minified web assets that contain many '{' chars.
> And indeed, after inserting 100 thousands of '{'
> 
> (insert (propertize (make-string 100000 ?{)
>          'display "[…]" 'invisible t) "\n")
> 
> and saving to a file, later visiting such file
> Emacs becomes unresponsive for indefinite time.
> But visiting the file with 100 thousands '{'
> with find-file-literally causes no such problem.

Does it help to set bidi-inhibit-bpa non-nil?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Wed, 02 Dec 2020 21:36:02 GMT) Full text and rfc822 format available.

Message #35 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Wed, 02 Dec 2020 22:53:18 +0200

>> Or maybe the problem is caused by special characters
>> used in minified web assets that contain many '{' chars.
>> And indeed, after inserting 100 thousands of '{'
>>
>> (insert (propertize (make-string 100000 ?{)
>>          'display "[…]" 'invisible t) "\n")
>>
>> and saving to a file, later visiting such file
>> Emacs becomes unresponsive for indefinite time.
>> But visiting the file with 100 thousands '{'
>> with find-file-literally causes no such problem.
>
> Does it help to set bidi-inhibit-bpa non-nil?

This helped to open the file with a lot of '{'.
But on minified files grep.el is still very slow.

Then instead of hiding long lines using font-lock,
I tried to do the same using the process filter:

(defun grep-filter ()
  ;; Hide everything past column 64 on each newly inserted line.
  (save-excursion
    (let ((end (point-marker)))
      ;; Start at the beginning of the line where this chunk begins.
      (goto-char compilation-filter-start)
      (forward-line 0)
      (while (< (point) end)
        (let ((eol (line-end-position)))
          (when (> (- eol (point)) 64)
            (put-text-property (+ 64 (point)) eol
                               'display "[…]"))
          (forward-line 1))))))

Still very slow.  Then tried to delete the excessive parts of long lines:

(defun grep-filter-try ()
  ;; Delete everything past column 64 on each newly inserted line.
  (save-excursion
    (let ((end (point-marker)))
      ;; Start at the beginning of the line where this chunk begins.
      (goto-char compilation-filter-start)
      (forward-line 0)
      (while (< (point) end)
        (let ((eol (line-end-position)))
          (when (> (- eol (point)) 64)
            (delete-region (+ 64 (point)) eol))
          (forward-line 1))))))

Now Emacs becomes more responsive.  But still output processing
takes too much time.

Finally, the last thing to try was the same solution that Richard
showed in bug#44941:

  grep -a "$@" | cut -c -200

that gives the best possible result.

I doubt that it would be possible to invent something better.

So the question is should this be customizable for adding
`cut -c` automatically to the end of the grep command line,
possibly with `stdbuf -oL` suggested by Andreas.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 03 Dec 2020 14:49:01 GMT) Full text and rfc822 format available.

Message #38 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 03 Dec 2020 16:47:51 +0200

> From: Juri Linkov <juri <at> linkov.net>
> Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org
> Date: Wed, 02 Dec 2020 22:53:18 +0200
> 
> > Does it help to set bidi-inhibit-bpa non-nil?
> 
> This helped to open the file with a lot of '{'.
> But on minified files grep.el is still very slow.

What are "minified files"?

And when you say "slow" do you mean slow in receiving Grep output,
slow in displaying the received output, or slow in moving through the
*grep* buffer after everything was displayed?

> Then instead of hiding long lines using font-lock,
> I tried to do the same using the process filter:
> 
> (defun grep-filter ()
>   (save-excursion
>     (let ((end (point-marker)))
>       (goto-char compilation-filter-start)
>       (forward-line 0)
>       (while (< (point) end)
>         (let ((eol (line-end-position)))
>           (when (> (- eol (point)) 64)
>             (put-text-property (+ 64 (point)) (line-end-position)
>                                'display "[…]"))
>           (forward-line 1))))))
> 
> Still very slow.

Same question as above.

> Then tried to delete the excessive parts of long lines:
> 
> (defun grep-filter-try ()
>   (save-excursion
>     (let ((end (point-marker)))
>       (goto-char compilation-filter-start)
>       (forward-line 0)
>       (while (< (point) end)
>         (let ((eol (line-end-position)))
>           (when (> (- eol (point)) 64)
>             (delete-region (min (+ 64 (point)) (point-max)) (line-end-position)))
>           (forward-line 1))))))
> 
> Now Emacs becomes more responsive.  But still output processing
> takes too much time.

What is "output processing", and how did you measure the time it
takes?

> Finally, the last thing to try was the same solution that Richard
> showed in bug#44941:
> 
>   grep -a "$@" | cut -c -200
> 
> that gives the best possible result.
> 
> I doubt that it would be possible to invent something better.
> 
> So the question is should this be customizable for adding
> `cut -c` automatically to the end of the grep command line,
> possibly with `stdbuf -oL` suggested by Andreas.

I suggested to request the equivalent of "cut -c" to be a feature
added to Grep.

Failing that, I don't think Emacs should do something like that,
especially since 'cut' is not guaranteed to be available.  Users who
have such problems can, of course, modify the Grep command to do that.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 03 Dec 2020 16:31:02 GMT) Full text and rfc822 format available.

Message #41 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Rudolf Schlatte <rudi <at> constantly.at>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 03 Dec 2020 17:30:10 +0100

Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Juri Linkov <juri <at> linkov.net>
>> Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org
>> Date: Wed, 02 Dec 2020 22:53:18 +0200
>> 
>> > Does it help to set bidi-inhibit-bpa non-nil?
>> 
>> This helped to open the file with a lot of '{'.
>> But on minified files grep.el is still very slow.
>
> What are "minified files"?

Javascript libraries are often “minified” for deployment by shortening
identifiers and eliminating whitespace, including linebreaks.  So a
300kb library might be compressed into a 200kb one-line file.  Trying to
open such files makes Emacs unresponsive.

Rudi

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 03 Dec 2020 21:40:04 GMT) Full text and rfc822 format available.

Message #44 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 03 Dec 2020 23:17:08 +0200

> And when you say "slow" do you mean slow in receiving Grep output,
> slow in displaying the received output, or slow in moving through the
> *grep* buffer after everything was displayed?

Slow in receiving and slow in displaying, but not slow in moving
through the hidden parts of long lines.

>> Then instead of hiding long lines using font-lock,
>> I tried to do the same using the process filter:
>> 
>> (defun grep-filter ()
>>   (save-excursion
>>     (let ((end (point-marker)))
>>       (goto-char compilation-filter-start)
>>       (forward-line 0)
>>       (while (< (point) end)
>>         (let ((eol (line-end-position)))
>>           (when (> (- eol (point)) 64)
>>             (put-text-property (+ 64 (point)) (line-end-position)
>>                                'display "[…]"))
>>           (forward-line 1))))))
>> 
>> Still very slow.
>
> Same question as above.

Slow in receiving and slow in displaying.

>> Then tried to delete the excessive parts of long lines:
>> 
>> (defun grep-filter-try ()
>>   (save-excursion
>>     (let ((end (point-marker)))
>>       (goto-char compilation-filter-start)
>>       (forward-line 0)
>>       (while (< (point) end)
>>         (let ((eol (line-end-position)))
>>           (when (> (- eol (point)) 64)
>>             (delete-region (min (+ 64 (point)) (point-max)) (line-end-position)))
>>           (forward-line 1))))))
>> 
>> Now Emacs becomes more responsive.  But still output processing
>> takes too much time.
>
> What is "output processing", and how did you measure the time it
> takes?

Measuring visually, it takes too much time to output the long lines.

>> Finally, the last thing to try was the same solution that Richard
>> showed in bug#44941:
>> 
>>   grep -a "$@" | cut -c -200
>> 
>> that gives the best possible result.
>> 
>> I doubt that it would be possible to invent something better.
>> 
>> So the question is should this be customizable for adding
>> `cut -c` automatically to the end of the grep command line,
>> possibly with `stdbuf -oL` suggested by Andreas.
>
> I suggested to request the equivalent of "cut -c" to be a feature
> added to Grep.
>
> Failing that, I don't think Emacs should do something like that,
> especially since 'cut' is not guaranteed to be available.  Users who
> have such problems can, of course, modify the Grep command to do that.

Finally I solved the long-standing problem by customizing
grep-find-template to

  "find <D> <X> -type f <F> -print0 | sort -z | xargs -0 -e grep <C> --color=always -inH -e <R> | cut -c -200"

I'm not sure if something like this could be added to grep, but
here is an example how such a new option could look:

[gnu-sort-cut.patch (text/x-diff, inline)]

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index dafba22f77..a5a2142a9e 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -534,6 +534,7 @@ grep-find-use-xargs
                  (const :tag "find -exec {} +" exec-plus)
                  (const :tag "find -print0 | xargs -0" gnu)
                  (const :tag "find -print0 | sort -z | xargs -0'" gnu-sort)
+                 (const :tag "find -print0 | sort -z | xargs -0' ... | cut -c -200" gnu-sort-cut)
                  string
 		 (const :tag "Not Set" nil))
   :set #'grep-apply-setting
@@ -722,7 +723,8 @@ grep-compute-defaults
 		     (goto-char (point-min))
 		     (search-forward "--color" nil t))
 		   ;; Windows and DOS pipes fail `isatty' detection in Grep.
-		   (if (memq system-type '(windows-nt ms-dos))
+		   (if (or (eq grep-find-use-xargs 'gnu-sort-cut)
+                           (memq system-type '(windows-nt ms-dos)))
 		       'always 'auto)))))
 
     (unless (and grep-command grep-find-command
@@ -775,6 +777,9 @@ grep-compute-defaults
 		      ((eq grep-find-use-xargs 'gnu-sort)
 		       (format "%s . -type f -print0 | sort -z | \"%s\" -0 %s"
 			       find-program xargs-program grep-command))
+		      ((eq grep-find-use-xargs 'gnu-sort-cut)
+		       (format "%s . -type f -print0 | sort -z | \"%s\" -0 %s | cut -c -200"
+			       find-program xargs-program grep-command))
 		      ((memq grep-find-use-xargs '(exec exec-plus))
 		       (let ((cmd0 (format "%s . -type f -exec %s"
 					   find-program grep-command))
@@ -803,6 +808,9 @@ grep-compute-defaults
 			((eq grep-find-use-xargs 'gnu-sort)
 			 (format "%s <D> <X> -type f <F> -print0 | sort -z | \"%s\" -0 %s"
 				 find-program xargs-program gcmd))
+			((eq grep-find-use-xargs 'gnu-sort-cut)
+			 (format "%s <D> <X> -type f <F> -print0 | sort -z | \"%s\" -0 %s | cut -c -200"
+				 find-program xargs-program gcmd))
 			((eq grep-find-use-xargs 'exec)
 			 (format "%s <D> <X> -type f <F> -exec %s %s %s%s"
 				 find-program gcmd quot-braces null quot-scolon))
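
For trying the same pipeline directly in a shell, here is a minimal sketch of roughly what the customized template expands to. The function name `grep_find_cut` is hypothetical; `dir` and `pattern` stand in for `<D>` and `<R>`, while `<X>`, `<F>`, `<C>`, `--color=always` and the `-e` flag to xargs are left out to keep it short. It assumes a find/sort/xargs set that supports `-print0`, `-z` and `-0`.

```shell
# Hypothetical stand-in for the customized grep-find-template:
# search every regular file under DIR and cap each output line at 200 columns.
grep_find_cut() {
    dir=$1
    pattern=$2
    find "$dir" -type f -print0 | sort -z |
        xargs -0 grep -inH -e "$pattern" | cut -c -200
}
```

Note that `cut` truncates the whole output line here, so a very long file name eats into the 200 columns as well.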

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 05 Dec 2020 19:54:01 GMT) Full text and rfc822 format available.

Message #47 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 05 Dec 2020 21:47:06 +0200

>> I suggested to request the equivalent of "cut -c" to be a feature
>> added to Grep.
>>
>> Failing that, I don't think Emacs should do something like that,
>> especially since 'cut' is not guaranteed to be available.  Users who
>> have such problems can, of course, modify the Grep command to do that.
>
> Finally I solved the long-standing problem by customizing
> grep-find-template to
>
>   "find <D> <X> -type f <F> -print0 | sort -z | xargs -0 -e grep <C> --color=always -inH -e <R> | cut -c -200"

I noticed a problem caused by "cut -c": it counts bytes,
not multi-byte characters.  Even though its documentation says
that -b selects bytes and -c selects characters,
"cut -c -200" still selects bytes, not UTF-8 characters.

Often it cuts in the middle of a multi-byte UTF-8 character,
so octal codes are displayed at the end of grep lines.

This is like the 160-character limit for an SMS message,
which is really a limit on encoded size rather than on characters:
for Unicode text the SMS limit is only 70 characters.
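
The effect can be illustrated with a small sketch that truncates by bytes explicitly, using `cut -b` so that the result does not depend on how a particular `cut` implements `-c`. The helper names `e_acute` and `first_byte` are hypothetical.

```shell
# "é" is two bytes in UTF-8 (octal 303 251); write it with octal escapes so
# the example does not depend on the encoding of this script.
e_acute() { printf '\303\251'; }

# Byte-oriented truncation: keep only the first byte of each line.
first_byte() { cut -b -1; }

# The character occupies two bytes ...
e_acute | wc -c

# ... so keeping one byte leaves a lone lead byte, which is not valid UTF-8
# on its own.  od shows the surviving bytes in octal.
{ e_acute; echo; } | first_byte | od -An -to1
```

A lone lead byte like this is what Emacs then displays as an octal escape at the end of the line.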

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sun, 06 Dec 2020 21:17:02 GMT) Full text and rfc822 format available.

Message #50 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sun, 06 Dec 2020 22:39:15 +0200

> I noticed the problems caused by "cut -c": it counts bytes,
> not multi-byte characters.  Even though its documentation says
> that -b selects bytes, and -c selects characters, still
> when used with "cut -c -200" it selects bytes, not UTF characters.
>
> Often it cuts in the middle of a multi-byte UTF-8 character,
> so octal codes are displayed at the end of grep lines.

OTOH, ripgrep has the suitable options:

  -M, --max-columns NUM
      Don’t print lines longer than this limit in bytes. Longer lines are omitted,
      and only the number of matches in that line is printed.

  --max-columns-preview
      When the --max-columns flag is used, ripgrep will by default completely
      replace any line that is too long with a message indicating that a matching
      line was removed.  When this flag is combined with --max-columns, a preview
      of the line (corresponding to the limit size) is shown instead, where the
      part of the line exceeding the limit is not shown.

Would it be unthinkable to add ripgrep support to grep.el?
This will allow switching to ripgrep when there is a need to
search in files with long lines.
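
As a rough sketch of what such switching could look like at the shell level (not a proposal for grep.el itself): the hypothetical wrapper below uses the two ripgrep options quoted above when `rg` is installed, and otherwise falls back to the `grep | cut` workaround. The function name `search_capped` and its argument order are assumptions made for this example.

```shell
# Hypothetical wrapper: prefer ripgrep's built-in line-length limit and
# fall back to grep piped through cut when rg is not installed.
search_capped() {
    pattern=$1
    file=$2
    max=${3:-200}
    if command -v rg >/dev/null 2>&1; then
        # --max-columns-preview shows the start of an over-long line
        # instead of replacing the whole line with a notice.
        rg --no-filename --no-line-number \
           --max-columns "$max" --max-columns-preview -e "$pattern" "$file"
    else
        grep -e "$pattern" "$file" | cut -c "-$max"
    fi
}
```

The two branches do not print identical text, since ripgrep appends its own notice to a shortened line, so this is only meant to show where the options would go.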

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sun, 06 Dec 2020 21:38:02 GMT) Full text and rfc822 format available.

Message #53 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sun, 6 Dec 2020 23:37:11 +0200

On 06.12.2020 22:39, Juri Linkov wrote:
>> I noticed the problems caused by "cut -c": it counts bytes,
>> not multi-byte characters.  Even though it documentation says
>> that -b selects bytes, and -c selects characters, still
>> when used with "cut -c -200" it selects bytes, not UTF characters.
>>
>> Often it cuts in the middle of a multi-byte UTF-8 character,
>> so octal codes are displayed at the end of grep lines.
> 
> OTOH, ripgrep has the suitable options:
> 
>    -M, --max-columns NUM
>        Don’t print lines longer than this limit in bytes. Longer lines are omitted,
>        and only the number of matches in that line is printed.
> 
>    --max-columns-preview
>        When the --max-columns flag is used, ripgrep will by default completely
>        replace any line that is too long with a message indicating that a matching
>        line was removed.  When this flag is combined with --max-columns, a preview
>        of the line (corresponding to the limit size) is shown instead, where the
>        part of the line exceeding the limit is not shown.

You can experiment with these Right Now(tm) by customizing 
xref-search-program-alist (as well as xref-search-program). They'll only 
affect commands that use xref-matches-in-files, though.

> Would it be unthinkable to add support for ripgrep to grep.el?
> This will allow switching to ripgrep when there is a need to
> search in files with long lines.

I'm fairly sure nothing in terms of politics is stopping us here, but if 
we wanted to update grep.el's abstractions to use different search 
programs, it looks like a bigger job to me.

Though maybe you can get away with customizing a select number of 
variables? Like grep-template, grep-find-template, etc.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sun, 06 Dec 2020 21:56:02 GMT) Full text and rfc822 format available.

Message #56 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sun, 06 Dec 2020 23:54:53 +0200

>> OTOH, ripgrep has the suitable options:
>>    -M, --max-columns NUM
>>        Don’t print lines longer than this limit in bytes. Longer lines are omitted,
>>        and only the number of matches in that line is printed.
>>    --max-columns-preview
>>        When the --max-columns flag is used, ripgrep will by default completely
>>        replace any line that is too long with a message indicating that a matching
>>        line was removed.  When this flag is combined with --max-columns, a preview
>>        of the line (corresponding to the limit size) is shown instead, where the
>>        part of the line exceeding the limit is not shown.
>
> You can experiment with these Right Now(tm) by customizing
> xref-search-program-alist (as well as xref-search-program). They'll only
> affect commands that use xref-matches-in-files, though.

You mean adding "-M 200 --max-columns-preview" to xref-search-program-alist?
It works nice, thanks.  Should this be added by default?

>> Would it be unthinkable to add support for ripgrep to grep.el?
>> This will allow switching to ripgrep when there is a need to
>> search in files with long lines.
>
> I'm fairly sure nothing in terms of politics is stopping us here, but if we
> wanted to update grep.el's abstractions to use different search programs,
> it looks like a bigger job to me.
>
> Though maybe you can get away with customizing a select number of
> variables? Like grep-template, grep-find-template, etc.

I customized grep-find-template to "find <D> <X> -type f <F> -print0 | sort -z |
 xargs -0 -e rg -inH --color always --no-heading -M 200 --max-columns-preview -e <R>"

But this also requires customizing grep-match-regexp to the value
"\033\\[[0-9]*m\033\\[[0-9]*1m\033\\[[0-9]*1m\\(.*?\\)\033\\[[0-9]*0m"
provided by Simon in bug#41766.

It also required a small fix in grep.el:

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index dafba22f77..0a5fd6bf5d 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -412,7 +412,7 @@ grep-regexp-alist
                (- mend beg))))))
      nil nil
      (3 '(face nil display ":")))
-    ("^Binary file \\(.+\\) matches$" 1 nil nil 0 1))
+    ("^Binary file \\(.+\\) matches" 1 nil nil 0 1))
   "Regexp used to match grep hits.
 See `compilation-error-regexp-alist' for format details.")

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Mon, 07 Dec 2020 02:42:02 GMT) Full text and rfc822 format available.

Message #59 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Mon, 7 Dec 2020 04:41:09 +0200

On 06.12.2020 23:54, Juri Linkov wrote:
>>> OTOH, ripgrep has the suitable options:
>>>     -M, --max-columns NUM
>>>         Don’t print lines longer than this limit in bytes. Longer lines are omitted,
>>>         and only the number of matches in that line is printed.
>>>     --max-columns-preview
>>>         When the --max-columns flag is used, ripgrep will by default completely
>>>         replace any line that is too long with a message indicating that a matching
>>>         line was removed.  When this flag is combined with --max-columns, a preview
>>>         of the line (corresponding to the limit size) is shown instead, where the
>>>         part of the line exceeding the limit is not shown.
>>
>> You can experiment with these Right Now(tm) by customizing
>> xref-search-program-alist (as well as xref-search-program). They'll only
>> affect commands that use xref-matches-in-files, though.
> 
> You mean adding "-M 200 --max-columns-preview" to xref-search-program-alist?

Yup.

> It works nice, thanks.  Should this be added by default?

Maybe someday?

Currently, it has a certain side-effect: whenever there are matches that 
don't fit the specified width, they will be omitted from the resulting 
xref buffer. Depending on the user's intent, it can be a problem.

Perhaps they did, after all, intend to search that minified JS file as well?

This should be fixable (in xref--collect-matches-1, probably), but we'd 
have to consider carefully what to do in situations like that. E.g., 
if we put some placeholder there, that would mean that "search and 
replace" won't work.

Alternatively, xref--collect-matches-1 could apply the limit itself, no 
matter whether grep or rg is used. And it could make sure to only do 
that after the last match. This might be the slower option, but hard to 
say in advance, some comparison benchmark could help here.
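
The "only after the last match" rule can be sketched in a few lines of
shell. This is only an illustration of the idea, not code from
xref--collect-matches-1: the helper name and its arguments are
hypothetical, the caller is assumed to already know where the last match
ends, and the cut is byte-based.

```sh
#!/bin/sh
# truncate_keeping_match LIMIT MATCH_END: cut each input line at LIMIT bytes,
# but never before byte MATCH_END (the end of the last match on the line).
truncate_keeping_match() {
    limit=$1
    match_end=$2
    # Extend the cutoff when the match reaches past the configured limit.
    if [ "$match_end" -gt "$limit" ]; then
        limit=$match_end
    fi
    cut -b "1-$limit"
}

printf 'abcdefghij\n' | truncate_keeping_match 4 2   # prints "abcd"
printf 'abcdefghij\n' | truncate_keeping_match 4 7   # prints "abcdefg"
```

With this rule every match stays fully visible, at the price of lines
that can still exceed the limit when a match sits far to the right.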

>>> Would it be unthinkable to add support for ripgrep to grep.el?
>>> This will allow switching to ripgrep when there is a need to
>>> search in files with long lines.
>>
>> I'm fairly sure nothing in terms of politics is stopping us here, but if we
>> wanted to update grep.el's abstractions to use different search programs,
>> it looks like a bigger job to me.
>>
>> Though maybe you can get away with customizing a select number of
>> variables? Like grep-template, grep-find-template, etc.
> 
> I customized grep-find-template to "find <D> <X> -type f <F> -print0 | sort -z |
>   xargs -0 -e rg -inH --color always --no-heading -M 200 --max-columns-preview -e <R>"
> 
> But this also requires customizing grep-match-regexp to the value
> "\033\\[[0-9]*m\033\\[[0-9]*1m\033\\[[0-9]*1m\\(.*?\\)\033\\[[0-9]*0m"
> provided by Simon in bug#41766.

It's odd your last suggestion in that bug was not applied (adding :type 
'(choice) to grep-match-regexp). Perhaps do that now?

Although, personally, I've found a symbolic value to work better for a 
var like that (example: xref-search-program). This way we can ultimately 
consolidate info about a particular program in one place (some alist).

That aside, could you explain the difference between the regexps? Do 
grep and rg use different colors or something like that? Ideally, of 
course, that would be just 1 regexp (if that's possible without loss in 
performance, or significant loss in clarity).

> And also required a small fix in grep.el:
> 
> diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
> index dafba22f77..0a5fd6bf5d 100644
> --- a/lisp/progmodes/grep.el
> +++ b/lisp/progmodes/grep.el
> @@ -412,7 +412,7 @@ grep-regexp-alist
>                  (- mend beg))))))
>        nil nil
>        (3 '(face nil display ":")))
> -    ("^Binary file \\(.+\\) matches$" 1 nil nil 0 1))
> +    ("^Binary file \\(.+\\) matches" 1 nil nil 0 1))
>     "Regexp used to match grep hits.
>   See `compilation-error-regexp-alist' for format details.")

Nice.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 08 Dec 2020 05:36:02 GMT) Full text and rfc822 format available.

Message #62 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Richard Stallman <rms <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: eliz <at> gnu.org, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 08 Dec 2020 00:35:11 -0500

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

What is xref-search?

Is this something I could use instead of cut, to truncate
long lines of grep output?

-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 08 Dec 2020 19:16:02 GMT) Full text and rfc822 format available.

Message #65 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: rms <at> gnu.org
Cc: eliz <at> gnu.org, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 8 Dec 2020 21:15:37 +0200

On 08.12.2020 07:35, Richard Stallman wrote:
> What is xref-search?

We don't actually employ such a notion, but if I was asked to define it, 
it would be the act of using a command based on xref-matches-in-files 
(which see). The main thing that separates that from 'M-x grep', though, 
is the implementation approach.

> Is this something I could use instead of cut, to truncate
> long lines of grep output?

You can use the commands based on it. And we could change the 
implementation of the aforementioned function so that it would "cut" such 
long lines. In that case, the cutting could be performed using Emacs 
Lisp. 'cut' could still be used instead, though. Or 'ripgrep' could be 
instructed to do that.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Tue, 08 Dec 2020 19:49:02 GMT) Full text and rfc822 format available.

Message #68 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Tue, 08 Dec 2020 21:41:28 +0200

> Alternatively, xref--collect-matches-1 could apply the limit itself, no
> matter whether grep or rg is used. And it could make sure to only do that
> after the last match. This might be the slower option, but hard to say in
> advance, some comparison benchmark could help here.

I think until a long string is inserted to the buffer, truncating the
string in the variable in xref--collect-matches-1 should be much faster.

>> But this also requires customizing grep-match-regexp to the value
>> "\033\\[[0-9]*m\033\\[[0-9]*1m\033\\[[0-9]*1m\\(.*?\\)\033\\[[0-9]*0m"
>> provided by Simon in bug#41766.
>
> It's odd your last suggestion in that bug was not applied (adding :type
> '(choice) to grep-match-regexp). Perhaps do that now?
>
> Although, personally, I've found a symbolic value to work better for a var
> like that (example: xref-search-program). This way we can ultimately
> consolidate info about a particular program in one place (some alist).
>
> That aside, could you explain the difference between the regexps? Do grep
> and rg use different colors or something like that? Ideally, of course,
> that would be just 1 regexp (if that's possible without loss in
> performance, or significant loss in clarity).

They should be merged into one regexp indeed.  Because after customizing it
to the rg regexp, grep output doesn't highlight matches anymore (I use both
grep and rg interchangeably by different commands).

Currently their separate regexps are:

grep:
"\033\\[0?1;31m
 \\(.*?\\)
 \033\\[[0-9]*m"

rg:
"\033\\[[0-9]*m
 \033\\[[0-9]*1m
 \033\\[[0-9]*1m
 \\(.*?\\)
 \033\\[[0-9]*0m"

That could be combined into one regexp:

"\033\\[[0-9?;]*m
 \\(?:\033\\[[0-9]*1m\\)\\{0,2\\}
 \\(.*?\\)
 \033\\[[0-9]*0?m"

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Wed, 09 Dec 2020 03:01:02 GMT) Full text and rfc822 format available.

Message #71 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Wed, 9 Dec 2020 05:00:19 +0200

On 08.12.2020 21:41, Juri Linkov wrote:
>> Alternatively, xref--collect-matches-1 could apply the limit itself, no
>> matter whether grep or rg is used. And it could make sure to only do that
>> after the last match. This might be the slower option, but hard to say in
>> advance, some comparison benchmark could help here.
> 
> I think until a long string is inserted to the buffer, truncating the
> string in the variable in xref--collect-matches-1 should be much faster.

It would surely be faster, but how would that overhead compare to the 
whole operation?

Could be negligible, except in the most extreme cases. After all, the 
main slowdown factor with long strings is the display engine, and it 
won't be in play there.

The upside is we'd be able to support column limiting with Grep too. 
Which is the default configuration. And we'd extract the cutoff column 
into a more visible user option.

>> That aside, could you explain the difference between the regexps? Do grep
>> and rg use different colors or something like that? Ideally, of course,
>> that would be just 1 regexp (if that's possible without loss in
>> performance, or significant loss in clarity).
> 
> They should be merged into one regexp indeed.  Because after customizing it
> to the rg regexp, grep output doesn't highlight matches anymore (I use both
> grep and rg interchangeably by different commands).
> 
> Currently their separate regexps are:
> 
> grep:
> "\033\\[0?1;31m
>   \\(.*?\\)
>   \033\\[[0-9]*m"
> 
> rg:
> "\033\\[[0-9]*m
>   \033\\[[0-9]*1m
>   \033\\[[0-9]*1m
>   \\(.*?\\)
>   \033\\[[0-9]*0m"
> 
> That could be combined into one regexp:
> 
> "\033\\[[0-9?;]*m
>   \\(?:\033\\[[0-9]*1m\\)\\{0,2\\}
>   \\(.*?\\)
>   \033\\[[0-9]*0?m"

Makes sense. Is the parsing performance the same?

Also, with the increased complexity, I'd rather we added a couple of 
tests, or a comment with output examples. Or maybe both.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Wed, 09 Dec 2020 19:23:02 GMT) Full text and rfc822 format available.

Message #74 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Wed, 09 Dec 2020 21:17:28 +0200

>>> Alternatively, xref--collect-matches-1 could apply the limit itself, no
>>> matter whether grep or rg is used. And it could make sure to only do that
>>> after the last match. This might be the slower option, but hard to say in
>>> advance, some comparison benchmark could help here.
>> I think until a long string is inserted to the buffer, truncating the
>> string in the variable in xref--collect-matches-1 should be much faster.
>
> It would surely be faster, but how would that overhead compare to the
> whole operation?
>
> Could be negligible, except in the most extreme cases. After all, the main
> slowdown factor with long strings is the display engine, and it won't be in
> play there.
>
> The upside is we'd be able to support column limiting with Grep too. Which
> is the default configuration. And we'd extract the cutoff column into
> a more visible user option.

This is exactly what we need.  After that this bug report/feature request
can be closed.

BTW, for sorting currently xref-search-program-alist uses:

    "| sort -t: -k1,1 -k2n,2"

but fortunately ripgrep has a special option to do the same with:

    "--sort path"

>>> That aside, could you explain the difference between the regexps? Do grep
>>> and rg use different colors or something like that? Ideally, of course,
>>> that would be just 1 regexp (if that's possible without loss in
>>> performance, or significant loss in clarity).
>> They should be merged into one regexp indeed.  Because after customizing
>> it
>> to the rg regexp, grep output doesn't highlight matches anymore (I use both
>> grep and rg interchangeably by different commands).
>> Currently their separate regexps are:
>> grep:
>> "\033\\[0?1;31m
>>   \\(.*?\\)
>>   \033\\[[0-9]*m"
>> rg:
>> "\033\\[[0-9]*m
>>   \033\\[[0-9]*1m
>>   \033\\[[0-9]*1m
>>   \\(.*?\\)
>>   \033\\[[0-9]*0m"
>> That could be combined into one regexp:
>> "\033\\[[0-9?;]*m
>>   \\(?:\033\\[[0-9]*1m\\)\\{0,2\\}
>>   \\(.*?\\)
>>   \033\\[[0-9]*0?m"
>
> Makes sense. Is the parsing performance the same?

Performance is not a problem.  The problem is that a more lax regexp
causes more false positives.  So the above regexp highlighted even
the separator colons (':') between file names and column numbers.

BTW, it's possible to see all highlighted parts of the output
by changing the argument 'MODE' of 'compilation-start' in 'grep'
from #'grep-mode to t (so it uses comint-mode in grep buffers).

Anyway, I found the shortest change needed to support ripgrep,
and pushed to master.

> Also, with the increased complexity, I'd rather we added a couple of tests,
> or a comment with output examples. Or maybe both.

Fortunately, we have all possible cases listed in etc/grep.txt,
so it was easy to check if everything is highlighted correctly now.
Also I added ripgrep samples to etc/grep.txt.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Wed, 09 Dec 2020 20:07:02 GMT) Full text and rfc822 format available.

Message #77 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Wed, 9 Dec 2020 22:06:01 +0200

On 09.12.2020 21:17, Juri Linkov wrote:
>>> I think until a long string is inserted to the buffer, truncating the
>>> string in the variable in xref--collect-matches-1 should be much faster.
>>
>> It would surely be faster, but how would that overhead compare to the
>> whole operation?
>>
>> Could be negligible, except in the most extreme cases. After all, the main
>> slowdown factor with long strings is the display engine, and it won't be in
>> play there.
>>
>> The upside is we'd be able to support column limiting with Grep too. Which
>> is the default configuration. And we'd extract the cutoff column into
>> a more visible user option.
> 
> This is exactly what we need.  After that this bug report/feature request
> can be closed.

Perhaps you would like to come up with the name for the new user option? 
The changes to xref--collect-matches-1 should be straightforward (it 
will include a choice, though: whether to cut off matches when they 
don't fit). Since you're the one who has experienced poor performance 
because of this, though, you can do the benchmarking. Basically, what we 
need to know is whether the new option indeed makes performance acceptable.

> BTW, for sorting currently xref-search-program-alist uses:
> 
>      "| sort -t: -k1,1 -k2n,2"
> 
> but fortunately ripgrep has a special option to do the same with:
> 
>      "--sort path"

Somehow, that option came out to be consistently slower in my 
benchmarking. Even when the results are only a few lines (that's 
actually when the difference should be most apparent, because with many 
lines Elisp takes up most of the CPU time). You can try it yourself:

(benchmark 10 '(project-find-regexp ":package-version '(xref"))

  0.86 with '| sort'
  1.33 with '--sort path'

$ rg --version
ripgrep 12.1.1 (rev 7cb211378a)
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

We can also document it in the docstring, though. For those who don't 
have 'sort' installed.

>>> They should be merged into one regexp indeed.  Because after customizing
>>> it
>>> to the rg regexp, grep output doesn't highlight matches anymore (I use both
>>> grep and rg interchangeably by different commands).
>>> Currently their separate regexps are:
>>> grep:
>>> "\033\\[0?1;31m
>>>    \\(.*?\\)
>>>    \033\\[[0-9]*m"
>>> rg:
>>> "\033\\[[0-9]*m
>>>    \033\\[[0-9]*1m
>>>    \033\\[[0-9]*1m
>>>    \\(.*?\\)
>>>    \033\\[[0-9]*0m"
>>> That could be combined into one regexp:
>>> "\033\\[[0-9?;]*m
>>>    \\(?:\033\\[[0-9]*1m\\)\\{0,2\\}
>>>    \\(.*?\\)
>>>    \033\\[[0-9]*0?m"
>>
>> Makes sense. Is the parsing performance the same?
> 
> Performance is not a problem.  The problem is that more lax regexp
> causes more false positives.  So the above regexp highlighted even
> the separator colons (':') between file names and column numbers.
> 
> BTW, it's possible to see all highlighted parts of the output
> by changing the argument 'MODE' of 'compilation-start' in 'grep'
> from #'grep-mode to t (so it uses comint-mode in grep buffers).

Because ansi-color-process-output is in comint-output-filter-functions?

> Anyway, I found the shortest change needed to support ripgrep,
> and pushed to master.

Excellent.

>> Also, with the increased complexity, I'd rather we added a couple of tests,
>> or a comment with output examples. Or maybe both.
> 
> Fortunately, we have all possible cases listed in etc/grep.txt,
> so it was easy to check if everything is highlighted correctly now.
> Also I added ripgrep samples to etc/grep.txt.

Thanks!

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Wed, 09 Dec 2020 21:47:01 GMT) Full text and rfc822 format available.

Message #80 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Jean Louis <bugs <at> gnu.support>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 10 Dec 2020 00:43:23 +0300

Also see this:

https://www.topbug.net/blog/2016/08/18/truncate-long-matching-lines-of-grep-a-solution-that-preserves-color/

,----
| For the example above, the following command should print only 20
| characters before and after the searching keyword (This requires GNU
| grep. If you are on Mac OS X and using the BSD grep, please consider
| following this article to install GNU grep):
| 
| grep -oE '.{0,20}jQuery.{0,20}' bootstrap.min.js
`----

where I get this:

grep -o --color -nH --null -E  ".{0,20}setting.{0,20}" tmp-2020-11-26-01:3*
tmp-2020-11-26-01:32:17986egO3: supported, but its setting does not have prior

Grep finished with 1 match found at Thu Dec 10 00:42:21

from this long made-up line:

‘--color[=WHEN]’ ‘--colour[=WHEN]’ Surround the matched (non-empty) strings, matching lines, context lines, file names, line numbers, byte offsets, and separators (for fields and groups of context lines) with escape sequences to display them in color on the terminal.  The colors are defined by the environment variable ‘GREP_COLORS’ and default to ‘ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36’ for bold red matched text, magenta file names, green line numbers, green byte offsets, cyan separators, and default terminal colors otherwise. The deprecated environment variable ‘GREP_COLOR’ is still supported, but its setting does not have priority; it defaults to ‘01;31’ (bold red) which only covers the color for matched text. WHEN is ‘never’, ‘always’, or ‘auto’. ‘-L’ ‘--files-without-match’ Suppress normal output; instead print the name of each input file from which no output would normally have been printed.  The scanning of each file stops on the first match. ‘-l’ ‘--files-with-matches’ Suppress normal output; instead print the name of each input file from which output would normally have been printed.  The scanning of each file stops on the first match.  (‘-l’ is specified by POSIX.)

and that solves the problem of truncating long lines.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 10 Dec 2020 08:35:02 GMT) Full text and rfc822 format available.

Message #83 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Jean Louis <bugs <at> gnu.support>
Cc: 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 10 Dec 2020 10:06:53 +0200

> Also see this:
> ,----
> | grep -oE '.{0,20}jQuery.{0,20}' bootstrap.min.js
> `----

But what if the user enters a regexp such as "abc|xyz"?
Then it will be composed into this command:

  grep -oE '.{0,20}abc|xyz.{0,20}'

that matches either "abc" preceded by up to 20 characters, or "xyz"
followed by up to 20 characters.  So parentheses need to be added:

  grep -oE '.{0,20}(abc|xyz).{0,20}'

What is worse is that the whole match is highlighted,
including 20 characters before and after the real match.
So it seems this solution is not perfect.
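
The precedence problem can be reproduced with a made-up sample line. In
this sketch the helper name context_regexp is hypothetical and the
context width is shortened to 3 for readability. It fixes only the
grouping, not the highlighting issue.

```sh
#!/bin/sh
# context_regexp REGEXP N: wrap a user-supplied extended regexp in a group
# so that alternations inside it keep context on both sides.
context_regexp() {
    printf '.{0,%s}(%s).{0,%s}' "$2" "$1" "$2"
}

line='0123456789abc0123456789xyz0123456789'

# Naive composition: '|' splits the whole pattern in two, so "abc" gets
# only leading context and "xyz" only trailing context.
printf '%s\n' "$line" | grep -oE '.{0,3}abc|xyz.{0,3}'
# Prints:
# 789abc
# xyz012

# Grouped composition: both alternatives get context on both sides.
printf '%s\n' "$line" | grep -oE "$(context_regexp 'abc|xyz' 3)"
# Prints:
# 789abc012
# 789xyz012
```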

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 10 Dec 2020 08:35:02 GMT) Full text and rfc822 format available.

Message #86 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 10 Dec 2020 10:18:16 +0200

> Perhaps you would like to come up with the name for the new user option?

Maybe something like 'xref-search-truncate' with a number of columns,
nil by default.

>> BTW, for sorting currently xref-search-program-alist uses:
>>      "| sort -t: -k1,1 -k2n,2"
>> but fortunately ripgrep has a special option to do the same with:
>>      "--sort path"
>
> Somehow, that option came out to be consistently slower in my
> benchmarking. Even when the results are only a few lines (that's actually
> when the difference should be most apparent, because with many lines Elisp
> takes up the most of CPU time). You can try it yourself:
>
> (benchmark 10 '(project-find-regexp ":package-version '(xref"))
>
>   0.86 with '| sort'
>   1.33 with '--sort path'

I confirm that in my tests '--sort path' is 2 times slower than '| sort'.

>> BTW, it's possible to see all highlighted parts of the output
>> by changing the argument 'MODE' of 'compilation-start' in 'grep'
>> from #'grep-mode to t (so it uses comint-mode in grep buffers).
>
> Because ansi-color-process-output is in comint-output-filter-functions?

Exactly.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 10 Dec 2020 10:44:02 GMT) Full text and rfc822 format available.

Message #89 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Jean Louis <bugs <at> gnu.support>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 10 Dec 2020 13:08:10 +0300

* Juri Linkov <juri <at> linkov.net> [2020-12-10 11:34]:
> > Also see this:
> > ,----
> > | grep -oE '.{0,20}jQuery.{0,20}' bootstrap.min.js
> > `----
> 
> But what if the user enters such a regexp as "abc|xyz",
> then it will be composed into such command:
> 
>   grep -oE '.{0,20}abc|xyz.{0,20}'
> 
> that matches either 20 characters before "abc", or 20 characters
> after "xyz".  Then needs to add parentheses:
> 
>   grep -oE '.{0,20}(abc|xyz).{0,20}'

I do not find it problematic. Grep is a fairly advanced tool anyway. I
think that the Emacs "Search for files (grep)" menu option is not user
friendly in any case. It is made for those who know GNU/Linux, UNIX, or
BSD. A user faced with that option will most probably give up on it
soon, because the prompt asks the user to enter something like:

grep --color -nH --null -e

but does not tell the user what it means, nor that one has to put a
wildcard or file names after the search term. Usability suffers because
the function is only for advanced users. The majority of GNU/Linux
users use a GUI for all their work.

In that sense, advanced users should know how to use grep to at least
get the results they need and want.  Your intention to beautify the
grep output is good, but it is probably not necessary. They will not
mind the highlighting. They can do:

grep -nH --null -e

And there will be no highlighting. It gives the result. 

What would be more user friendly is a form or wizard that asks whether
all files are to be searched, whether to search recursively, and what
the search term is. That would reduce the power of grep, but it would
be more user friendly to many people.

In my opinion, the majority of users who ever clicked
"Search Files (grep)" gave up after a few attempts.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 10 Dec 2020 20:49:02 GMT) Full text and rfc822 format available.

Message #92 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 10 Dec 2020 22:48:10 +0200

On 10.12.2020 10:18, Juri Linkov wrote:
>>> BTW, for sorting currently xref-search-program-alist uses:
>>>       "| sort -t: -k1,1 -k2n,2"
>>> but fortunately ripgrep has a special option to do the same with:
>>>       "--sort path"
>> Somehow, that option came out to be consistently slower in my
>> benchmarking. Even when the results are only a few lines (that's actually
>> when the difference should be most apparent, because with many lines Elisp
>> takes up the most of CPU time). You can try it yourself:
>>
>> (benchmark 10 '(project-find-regexp ":package-version '(xref"))
>>
>>    0.86 with '| sort'
>>    1.33 with '--sort path'
> I confirm that in my tests '--sort path' is 2 times slower than '| sort'.

And that's because '--sort path' forces single-threaded mode: 
https://github.com/BurntSushi/ripgrep/issues/152
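
For reference, the '| sort' pipeline being compared here orders hits by
file name and then numerically by line number. A tiny illustration with
made-up hits (the helper name sort_hits is hypothetical, and LC_ALL=C is
added only to make the ordering of names predictable):

```sh
#!/bin/sh
# Sort "FILE:LINE:TEXT" grep hits: ':' is the field separator, field 1
# (the file name) is the primary key, field 2 (the line number) is
# compared numerically as the secondary key.
sort_hits() {
    LC_ALL=C sort -t: -k1,1 -k2n,2
}

printf '%s\n' 'b.el:2:(foo)' 'a.el:10:(bar)' 'a.el:9:(baz)' | sort_hits
# Prints:
# a.el:9:(baz)
# a.el:10:(bar)
# b.el:2:(foo)
```

Because sort runs as a separate process, rg itself stays free to search
in parallel, which matches the timing difference reported above.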

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 12 Dec 2020 21:09:03 GMT) Full text and rfc822 format available.

Message #95 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Jean Louis <bugs <at> gnu.support>
Cc: 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 12 Dec 2020 22:42:13 +0200

> I do not find it problematic. Grep is anyway kind of advanced tool. I
> think that Emacs "Search for files (grep)" menu option is anyway not
> user friendly.
> ...
> What would be more user friendly would be a form or wizard that would
> specify if all files are to be searched or recursively, and what would
> be the search term. That would degrade power of grep but it would be
> more user friendly to many people.
>
> In my opinion I believe that majority of users who ever clicked
> "Search Files (grep)" gave up after few attempts.

Indeed, "Search for files (grep)" menu option is not user friendly.
This is why we added a wizard command "Recursive Grep..." under it
in the same menu.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sun, 13 Dec 2020 15:12:02 GMT) Full text and rfc822 format available.

Message #98 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, bugs <at> gnu.support, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sun, 13 Dec 2020 17:11:41 +0200

> From: Juri Linkov <juri <at> linkov.net>
> Date: Sat, 12 Dec 2020 22:42:13 +0200
> Cc: 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
> 
> > I do not find it problematic. Grep is anyway kind of advanced tool. I
> > think that Emacs "Search for files (grep)" menu option is anyway not
> > user friendly.
> > ...
> > What would be more user friendly would be a form or wizard that would
> > specify if all files are to be searched or recursively, and what would
> > be the search term. That would degrade power of grep but it would be
> > more user friendly to many people.
> >
> > In my opinion I believe that majority of users who ever clicked
> > "Search Files (grep)" gave up after few attempts.
> 
> Indeed, "Search for files (grep)" menu option is not user friendly.

In what way is it not user-friendly?  It just invokes "M-x grep".

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sun, 13 Dec 2020 17:33:02 GMT) Full text and rfc822 format available.

Message #101 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Jean Louis <bugs <at> gnu.support>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sun, 13 Dec 2020 13:57:26 +0300

* Juri Linkov <juri <at> linkov.net> [2020-12-13 00:09]:
> > I do not find it problematic. Grep is anyway kind of advanced tool. I
> > think that Emacs "Search for files (grep)" menu option is anyway not
> > user friendly.
> > ...
> > What would be more user friendly would be a form or wizard that would
> > specify if all files are to be searched or recursively, and what would
> > be the search term. That would degrade power of grep but it would be
> > more user friendly to many people.
> >
> > In my opinion I believe that majority of users who ever clicked
> > "Search Files (grep)" gave up after few attempts.
> 
> Indeed, "Search for files (grep)" menu option is not user friendly.
> This is why we added a wizard command "Recursive Grep..." under it
> in the same menu.

Good for programmers, good for you and good for me. From that
viewpoint Emacs is for advanced users, and everything fits into
place.

From the viewpoint of users new to Emacs, "Recursive Grep" will not
have that meaning, or any meaning at all.

It would be good to have a popularity-contest package similar to
Debian's, where one could gather statistics on what users actually
use and submit those statistics.

Another good test would be to put together 5 people who have used
computers for the last 10 years, regardless of operating system, tell
them to open Emacs and find files containing the term "Emacs", and
watch how they do.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sun, 13 Dec 2020 17:37:02 GMT) Full text and rfc822 format available.

Message #104 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Jean Louis <bugs <at> gnu.support>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sun, 13 Dec 2020 18:37:49 +0300

* Eli Zaretskii <eliz <at> gnu.org> [2020-12-13 18:12]:
> > From: Juri Linkov <juri <at> linkov.net>
> > Date: Sat, 12 Dec 2020 22:42:13 +0200
> > Cc: 44983 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
> > 
> > > I do not find it problematic. Grep is anyway kind of advanced tool. I
> > > think that Emacs "Search for files (grep)" menu option is anyway not
> > > user friendly.
> > > ...
> > > What would be more user friendly would be a form or wizard that would
> > > specify if all files are to be searched or recursively, and what would
> > > be the search term. That would degrade power of grep but it would be
> > > more user friendly to many people.
> > >
> > > In my opinion I believe that majority of users who ever clicked
> > > "Search Files (grep)" gave up after few attempts.
> > 
> > Indeed, "Search for files (grep)" menu option is not user friendly.
> 
> In what way is it not user-friendly?  It just invokes "M-x grep".

Users of Emacs are many: Debian GNU/Linux alone reports 16000 users
known from the popularity contest package, which is probably a small
percentage of the overall number of users. Recently there was an Emacs
survey that interviewed 7000 users. Emacs has many bugs but we do not
get enough bugs reported. The number of reported bugs does not nearly
correspond to the number of users.

From our viewpoint it is user friendly. For me it is user friendly if
we place Emacs functions in the menu without their descriptions.

From the viewpoint of many thousands of users it is not user friendly
and means nothing.

What does "Recursive grep" mean? You have to know the command line to
know what it means. The majority of GNU/Linux users do not even use
the command line or terminals. We use it, but we are not a
representative sample of users.

"Search files recursively" would be a more useful label.

"Recursive grep" is reserved for power users. It is user friendly for
a subset of users, not for the majority of users.

A message from a staff member of mine who has been using Emacs and
went thoroughly through the Tutorial:

[18:34] Happiness > > I have one analysis question, without expectation:
> Would you know how to search files by using Emacs?
> Do you know what means "grep"?
> Do you know what is "recursive grep"?
> No need to look up, just tell me
I have learned it but it might need me to repeat again as in the
tutorial I was practising, not yet well captured these terms on memory

But the tutorial does not cover those terms, so she cannot possibly
know what I mean. She can write reports but would not, without special
explanation, understand what "Recursive grep" means.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sun, 13 Dec 2020 20:44:02 GMT) Full text and rfc822 format available.

Message #107 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 44983 <at> debbugs.gnu.org, bugs <at> gnu.support, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sun, 13 Dec 2020 22:17:23 +0200

>> > In my opinion I believe that majority of users who ever clicked
>> > "Search Files (grep)" gave up after few attempts.
>> 
>> Indeed, "Search for files (grep)" menu option is not user friendly.
>
> In what way is it not user-friendly?  It just invokes "M-x grep".

It's not friendly for users who don't know syntax of grep command line.

OTOH, "Recursive grep" (rgrep) is easier to use, but its menu item text
is not clear to users who don't know what is grep.  Maybe a better title
for 'rgrep' would be "Search text in files"?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Mon, 14 Dec 2020 16:16:02 GMT) Full text and rfc822 format available.

Message #110 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, bugs <at> gnu.support, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Mon, 14 Dec 2020 18:15:08 +0200

> From: Juri Linkov <juri <at> linkov.net>
> Cc: bugs <at> gnu.support,  44983 <at> debbugs.gnu.org,  dgutov <at> yandex.ru
> Date: Sun, 13 Dec 2020 22:17:23 +0200
> 
> >> > In my opinion I believe that majority of users who ever clicked
> >> > "Search Files (grep)" gave up after few attempts.
> >> 
> >> Indeed, "Search for files (grep)" menu option is not user friendly.
> >
> > In what way is it not user-friendly?  It just invokes "M-x grep".
> 
> It's not friendly for users who don't know syntax of grep command line.

If someone wants to add a more user-friendly dialog for searching text
(or perhaps reuse a dialog provided by the GUI toolkits), I think it
will be welcome.  It is not a simple job, though, because the dialog
should allow access to most of the advanced features of Grep.

> OTOH, "Recursive grep" (rgrep) is easier to use, but its menu item text
> is not clear to users who don't know what is grep.  Maybe a better title
> for 'rgrep' would be "Search text in files"?

FWIW, I don't think rgrep is significantly more user-friendly, so IMO
it is not the model on which to base a better UI.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Mon, 14 Dec 2020 20:10:02 GMT) Full text and rfc822 format available.

Message #113 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org, bugs <at> gnu.support
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Mon, 14 Dec 2020 22:09:34 +0200

On 14.12.2020 18:15, Eli Zaretskii wrote:
>>>>> In my opinion I believe that majority of users who ever clicked
>>>>> "Search Files (grep)" gave up after few attempts.
>>>> Indeed, "Search for files (grep)" menu option is not user friendly.
>>> In what way is it not user-friendly?  It just invokes "M-x grep".
>> It's not friendly for users who don't know syntax of grep command line.
> If someone wants to add a more user-friendly dialog for searching text
> (or perhaps reuse a dialog provided by the GUI toolkits), I think it
> will be welcome.  It is not a simple job, though, because the dialog
> should allow access to most of the advanced features of Grep.

Perhaps a better option would be to take advantage of the 'transient' 
package (currently in GNU ELPA, but unreleased). Here's an example of 
its UI (bottom window):

https://camo.githubusercontent.com/f87497aec74dd0efee4ef78ba2b33b24d5535446b5d5cbef768653f4b945c38c/687474703a2f2f726561646d652e656d6163736169722e6d652f7472616e7369656e742e706e67

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 24 Dec 2020 20:39:03 GMT) Full text and rfc822 format available.

Message #116 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Thu, 24 Dec 2020 22:33:18 +0200

[Message part 1 (text/plain, inline)]

> Anyway, I found the shortest change needed to support ripgrep,
> and pushed to master.

Here is another patch needed to support rg because currently rg fails
when --color is used without a value.  OTOH, in grep --color is the same as
--color=auto, so this is a win-win situation:

[grep-color-auto.patch (text/x-diff, inline)]

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index 5dc99cc7e9..ef73dac4c0 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -79,7 +79,7 @@ grep-highlight-matches
 markers for highlighting and adds the --color option in front of
 any explicit grep options before starting the grep.
 
-When this option is `auto', grep uses `--color' to highlight
+When this option is `auto', grep uses `--color=auto' to highlight
 matches only when it outputs to a terminal (when `grep' is the last
 command in the pipe), thus avoiding the use of any potentially-harmful
 escape sequences when standard output goes to a file or pipe.
@@ -95,7 +95,7 @@ grep-highlight-matches
   :type '(choice (const :tag "Do not highlight matches with grep markers" nil)
 		 (const :tag "Highlight matches with grep markers" t)
 		 (const :tag "Use --color=always" always)
-		 (const :tag "Use --color" auto)
+		 (const :tag "Use --color=auto" auto)
 		 (other :tag "Not Set" auto-detect))
   :set #'grep-apply-setting
   :version "22.1")
@@ -743,7 +743,7 @@ grep-compute-defaults
                                `(nil nil nil "--color" "x" ,(null-device))
                                nil 1)
                               (if (eq grep-highlight-matches 'always)
-                                  "--color=always" "--color"))
+                                  "--color=always" "--color=auto"))
                          "")
                         grep-options)))
 	(unless grep-template
@@ -1000,7 +1000,7 @@ grep-expand-template
                             ((eq grep-highlight-matches 'always)
                              (push "--color=always" opts))
                             ((eq grep-highlight-matches 'auto)
-                             (push "--color" opts)))
+                             (push "--color=auto" opts)))
                            opts))
                 (excl . ,excl)
                 (dir . ,dir)
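
As an illustration (not part of the patch): one way to exercise the
`auto' setting is to change the option through `grep-apply-setting',
the setter named in the defcustom's :set above. This is a minimal
sketch and assumes only that grep.el is available.

```elisp
;; Sketch only: select the `auto' highlighting mode.  With the patch
;; above applied, this mode passes "--color=auto" instead of a bare
;; "--color", which is the form the message says rg accepts.
(require 'grep)
(grep-apply-setting 'grep-highlight-matches 'auto)
```

Going through the setter rather than a plain `setq' follows the
defcustom's own :set function.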

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Thu, 24 Dec 2020 23:40:02 GMT) Full text and rfc822 format available.

Message #119 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Fri, 25 Dec 2020 01:38:51 +0200

On 24.12.2020 22:33, Juri Linkov wrote:
> Here is another patch needed to support rg because currently rg fails
> when --color is used without a value.  OTOH, in grep --color is the same as
> --color=auto, so this is a win-win situation:

Makes sense.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Fri, 29 Apr 2022 11:40:02 GMT) Full text and rfc822 format available.

Message #122 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Fri, 29 Apr 2022 13:39:41 +0200

Juri Linkov <juri <at> linkov.net> writes:

> Maybe instead of using font-lock to hide long parts
> of grep lines, it would be better to do the same
> directly in compilation-filter/grep-filter?

I now have a rough patch that does this, but the problem is that even if
I splat a "..." display over the text, font-lock seems to insist on
going over the data anyway, so the display is still dog slow.

I thought I remembered there was a way to say to font-lock "ignore this
bit of the buffer", but I can't find it now.  Do I misremember?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Fri, 29 Apr 2022 12:23:02 GMT) Full text and rfc822 format available.

Message #125 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Fri, 29 Apr 2022 15:22:18 +0300

> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  44983 <at> debbugs.gnu.org,  dgutov <at> yandex.ru
> Date: Fri, 29 Apr 2022 13:39:41 +0200
> 
> I thought I remembered there was a way to say to font-lock "ignore this
> bit of the buffer", but I can't find it now.  Do I misremember?

Make the text invisible?

If that doesn't help either, I suggest to profile the code, because it
could be the slow display is due to something else.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Fri, 29 Apr 2022 12:42:02 GMT) Full text and rfc822 format available.

Message #128 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Fri, 29 Apr 2022 14:41:49 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

>> I thought I remembered there was a way to say to font-lock "ignore this
>> bit of the buffer", but I can't find it now.  Do I misremember?
>
> Make the text invisible?

The text is covered by a display property, which should be much the same
thing. 

> If that doesn't help either, I suggest to profile the code, because it
> could be the slow display is due to something else.

Hm, yes...  even if I disable font-lock-mode, it's still slow.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Fri, 29 Apr 2022 13:09:02 GMT) Full text and rfc822 format available.

Message #131 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Fri, 29 Apr 2022 16:08:08 +0300

> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: juri <at> linkov.net,  44983 <at> debbugs.gnu.org,  dgutov <at> yandex.ru
> Date: Fri, 29 Apr 2022 14:41:49 +0200
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> I thought I remembered there was a way to say to font-lock "ignore this
> >> bit of the buffer", but I can't find it now.  Do I misremember?
> >
> > Make the text invisible?
> 
> The text is covered by a display property, which should be much the same
> thing. 

Not really, it isn't.  The effect on the glass is the same, but the
effect on the display code is different.

> > If that doesn't help either, I suggest to profile the code, because it
> > could be the slow display is due to something else.
> 
> Hm, yes...  even if I disable font-lock-mode, it's still slow.

Then I think a profile should tell something interesting.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Fri, 29 Apr 2022 16:03:01 GMT) Full text and rfc822 format available.

Message #134 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Lars Ingebrigtsen <larsi <at> gnus.org>, Juri Linkov <juri <at> linkov.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Fri, 29 Apr 2022 19:02:19 +0300

On 29.04.2022 14:39, Lars Ingebrigtsen wrote:
> Juri Linkov<juri <at> linkov.net>  writes:
> 
>> Maybe instead of using font-lock to hide long parts
>> of grep lines, it would be better to do the same
>> directly in compilation-filter/grep-filter?
> I now have a rough patch that does this, but the problem is that even if
> I splat a "..." display over the text, font-lock seems to insist on
> going over the data anyway, so the display is still dog slow.
> 
> I thought I remembered there was a way to say to font-lock "ignore this
> bit of the buffer", but I can't find it now.  Do I misremember?

FWIW, this is more or less solved for Xref output buffers these days. 
And the solution is based on the 'invisible' property.

See bug#46859.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Fri, 29 Apr 2022 17:23:03 GMT) Full text and rfc822 format available.

Message #137 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Fri, 29 Apr 2022 20:15:00 +0300

>> Maybe instead of using font-lock to hide long parts
>> of grep lines, it would be better to do the same
>> directly in compilation-filter/grep-filter?
>
> I now have a rough patch that does this, but the problem is that even if
> I splat a "..." display over the text, font-lock seems to insist on
> going over the data anyway, so the display is still dog slow.
>
> I thought I remembered there was a way to say to font-lock "ignore this
> bit of the buffer", but I can't find it now.  Do I misremember?

I don't remember such font-lock text property, but now I have no problems
when long lines are hidden initially with:

```
(add-hook 'xref-after-update-hook
          (lambda ()
            (setq-local outline-regexp (if (eq xref-file-name-display 'abs)
                                           "/" "[^ 0-9]")
                        outline-default-state 1
                        outline-default-rules '(subtree-has-long-lines)
                        outline-default-long-line 1000)
            (outline-minor-mode +1)))
```

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 00:28:02 GMT) Full text and rfc822 format available.

Message #140 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>, Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 03:27:11 +0300

On 29.04.2022 20:15, Juri Linkov wrote:
> I don't remember such font-lock text property, but now I have no problems
> when long lines are hidden initially with:

When you apply this, do you disable the existing mechanism for dealing 
with long lines? By setting 'xref-truncation-width' to nil.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 09:25:01 GMT) Full text and rfc822 format available.

Message #143 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 11:24:29 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

>> > If that doesn't help either, I suggest to profile the code, because it
>> > could be the slow display is due to something else.
>> 
>> Hm, yes...  even if I disable font-lock-mode, it's still slow.
>
> Then I think a profile should tell something interesting.

Turns out to be font lock anyway:

        9152  88% - redisplay_internal (C function)
        9148  88%  - jit-lock-function
        9148  88%   - jit-lock-fontify-now
        9148  88%    - jit-lock--run-functions
        9144  87%     - run-hook-wrapped
        9144  87%      - #<compiled -0x1568eefe49e247c3>
        9144  87%       - font-lock-fontify-region
        9144  87%        - font-lock-default-fontify-region
        9144  87%           font-lock-fontify-keywords-region

Apparently disabling font-lock-mode in the *grep* buffer wasn't
sufficient to make it go away for some reason or other.  Disabling
global-font-lock-mode makes the problem go away.  And using invisible
text instead of a display property makes no difference -- font-lock
seems to really want to do font locking on ever-growing lines that are
inserted into the buffer by the process.
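
As an illustration (not from the thread): a call tree like the one
above can be collected with the built-in profiler. This is a minimal
sketch; the forms are meant to be evaluated one at a time, with the
slow grep run in between, and it is not necessarily how the profile
above was produced.

```elisp
;; Sketch: collect a CPU profile around a slow grep run.
(require 'profiler)

;; 1. Start sampling CPU usage.
(profiler-start 'cpu)

;; 2. Run the slow command (e.g. M-x grep) and wait for the slowdown.

;; 3. Show the call tree collected so far in a report buffer.
(profiler-report)

;; 4. Stop sampling when done.
(profiler-stop)
```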

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 09:37:01 GMT) Full text and rfc822 format available.

Message #146 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 11:36:37 +0200

I've instrumented some functions to try to see what's going on.

I've set things up so that grep lines that are longer than 200 chars are
invisible starting at the 200th character.  While the grep is running,
`jit-lock-fontify-now' is called repeatedly and takes longer time each
time, but with the same region:

Fontifying *grep* 392-1892
Fontifying *grep* 392-1892
Fontifying *grep* 392-1892

392 is the start of the line, and 1892 is in the invisible portion of
the line.  That's 1500 characters, so it should be fast -- but perhaps
it's extending it to the end of the line anyway?

But before I start trying to debug that, I'm wondering: Why is
`jit-lock-fontify-now' called at all here?  There have been no display
changes -- the text was inserted, but as invisible text, so no font
locking should be necessary.
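
As an illustration (not from the thread): output like the
"Fontifying *grep* 392-1892" lines above could be produced by advising
`jit-lock-fontify-now'. This is a sketch under that assumption; the
helper name `my/log-jit-lock-region' is hypothetical, and the actual
instrumentation used for the message above is not shown in the thread.

```elisp
;; Sketch: log each call to `jit-lock-fontify-now' with its region.
;; `my/log-jit-lock-region' is a made-up name for this example.
(defun my/log-jit-lock-region (&optional start end)
  "Log the current buffer and the START..END region to be fontified."
  (message "Fontifying %s %s-%s" (buffer-name) start end))

(advice-add 'jit-lock-fontify-now :before #'my/log-jit-lock-region)

;; Remove the instrumentation when done:
;; (advice-remove 'jit-lock-fontify-now #'my/log-jit-lock-region)
```

The log lines end up in the *Messages* buffer, so repeated calls with
an identical region are easy to spot.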

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 09:41:02 GMT) Full text and rfc822 format available.

Message #149 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 44983 <at> debbugs.gnu.org,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 11:40:38 +0200

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> FWIW, this is more or less solved for Xref output buffers these
> days. And the solution is based on the 'invisible' property.

Skimming the code there, it seems like xref just gets a list that it
inserts into the buffer, and then applies the invisibility spec to the
long lines?  That's a bit different from what compilation-mode/grep is
doing, where a process inserts text.  I.e., invisible text, in general,
works fine, but there's some bad interaction between
processes/invisible/font-lock somewhere.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 09:57:01 GMT) Full text and rfc822 format available.

Message #152 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 44983 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 11:56:11 +0200

This is the cause of the problem:

(defvar grep-mode-font-lock-keywords
   '(;; Command output lines.
     (": \\(.+\\): \\(?:Permission denied\\|No such \\(?:file or directory\\|device or address\\)\\)$"
      1 grep-error-face)

With that removed, everything's nice and fast.  Limiting that .+ to 200
characters also makes things fast:

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index 17905dec2e..7620536b4b 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -456,7 +456,7 @@ grep-find-abbreviate-properties
 
 (defvar grep-mode-font-lock-keywords
    '(;; Command output lines.
-     (": \\(.+\\): \\(?:Permission denied\\|No such \\(?:file or directory\\|device or address\\)\\)$"
+     (": \\(.\\{,200\\}\\): \\(?:Permission denied\\|No such \\(?:file or directory\\|device or address\\)\\)$"
       1 grep-error-face)
      ;; remove match from grep-regexp-alist before fontifying
      ("^Grep[/a-zA-Z]* started.*"

But I guess the real question here is still why we're font-locking
invisible text.
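
As an illustration (not from the thread): the difference between the
unbounded and the bounded pattern can be timed on a synthetic line.
This is a rough sketch; the test input is a hypothetical worst case
(many ": " candidates, no matching error suffix), and the timings
depend on the machine and Emacs version.

```elisp
(require 'benchmark)

(let* ((tail (concat ": \\(?:Permission denied\\|No such "
                     "\\(?:file or directory\\|device or address\\)\\)$"))
       ;; The original keyword regexp and the length-limited variant.
       (unbounded (concat ": \\(.+\\)" tail))
       (bounded   (concat ": \\(.\\{,200\\}\\)" tail))
       ;; Hypothetical input: "a: a: a: ... a" on a single line.
       (line (mapconcat #'identity (make-list 3000 "a") ": ")))
  ;; Each entry is (ELAPSED-SECONDS GC-COUNT GC-SECONDS).
  (list :unbounded (benchmark-run 5 (string-match-p unbounded line))
        :bounded   (benchmark-run 5 (string-match-p bounded line))))
```

Neither pattern matches this input; the point is only how long each
takes to fail, since the unbounded `.+' can scan to the end of the
line from every ": " it finds.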

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 10:10:01 GMT) Full text and rfc822 format available.

Message #155 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: juri <at> linkov.net, 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 13:09:12 +0300

> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Date: Sat, 30 Apr 2022 11:56:11 +0200
> Cc: 44983 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>
> 
> But I guess the real question here is still why we're font-locking
> invisible text.

We are not.  The display engine will never call jit-lock on a region
that starts in invisible text.  But a region that starts in visible
text can end in invisible text, and font-lock doesn't pay attention to
invisibility spec, AFAIR, it just looks at the buffer text
disregarding everything else.

For me, the more important question is: why the problem didn't
disappear when you turned off font-lock-mode in the offending buffer.
And I think I know why: you need to turn off jit-lock-mode as well.
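
As an illustration (not from the thread): the suggestion above could
be tried as follows. This is a sketch and assumes the results are in a
buffer named "*grep*".

```elisp
;; Sketch: turn off both font-lock and jit-lock in the grep buffer.
(when (get-buffer "*grep*")
  (with-current-buffer "*grep*"
    (font-lock-mode -1)     ; disable font-lock in this buffer
    (jit-lock-mode nil)))   ; a nil argument turns jit-lock off
```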

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 10:16:01 GMT) Full text and rfc822 format available.

Message #158 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 13:15:07 +0300

> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: juri <at> linkov.net,  44983 <at> debbugs.gnu.org,  dgutov <at> yandex.ru
> Date: Sat, 30 Apr 2022 11:36:37 +0200
> 
> But before I start trying to debug that, I'm wondering: Why is
> `jit-lock-fontify-now' called at all here?  There have been no display
> changes -- the text was inserted, but as invisible text, so no font
> locking should be necessary.

Are you saying that buffer position 392 was in invisible text?  If so,
jit-lock-fontify-now should not have been called.  But if position 392
is visible, then what you see is expected: the buffer text has
changed, and therefore redisplay will arrange to redisplay the buffer.
Part of redisplaying the buffer is making sure the text that might
wind up on display is fontified.  Which part will actually be on
display can only be known _after_ the text is fontified (because
fontification can change faces, and thus affect what's visible in the
window).  So we always fontify the 500-character chunk, per
jit-lock.el's defaults.
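
As an illustration (not from the thread): the chunk size is held in
the user option `jit-lock-chunk-size', so the value in effect can be
checked directly. It may differ from the 500 quoted above depending on
the Emacs version; the 392-1892 region logged earlier in this thread
spans 1500 characters.

```elisp
;; Sketch: show the jit-lock chunk size in the running Emacs.
(require 'jit-lock)
(message "jit-lock-chunk-size: %s" jit-lock-chunk-size)
```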

Did I answer your question?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 11:01:01 GMT) Full text and rfc822 format available.

Message #161 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: juri <at> linkov.net, 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 12:59:52 +0200

I've now implemented the line-hiding in Emacs 29.  Grepping for
"Grenadine" in the Emacs tree now takes approx two seconds, while it
takes about a minute in Emacs 28 (and Emacs is unusable while it's
running).
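
A minimal configuration sketch of the line-hiding mentioned above. The message does not name the option involved; the sketch assumes it is the Emacs 29.1 user option compilation-max-output-line-length (default 400, with nil disabling the hiding), which is worth checking against the NEWS file of the Emacs version in use.

```elisp
;; Assumption: Emacs 29.1 or later, where this option exists.
;; Output lines longer than this many characters have their tails
;; hidden in compilation-style buffers such as *grep*; setting the
;; option to nil disables the hiding.
(setq compilation-max-output-line-length 400)
```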

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

bug marked as fixed in version 29.1, send any further explanations to 44983 <at> debbugs.gnu.org and Juri Linkov <juri <at> linkov.net> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sat, 30 Apr 2022 11:01:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 11:04:02 GMT) Full text and rfc822 format available.

Message #166 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: juri <at> linkov.net, 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 13:02:59 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

> We are not.  The display engine will never call jit-lock on a region
> that starts in invisible text.  But a region that starts in visible
> text can end in invisible text, and font-lock doesn't pay attention to
> invisibility spec, AFAIR, it just looks at the buffer text
> disregarding everything else.

Yes, that's correct, I think.  But shouldn't it be smarter here?  That
is, the display engine does know that all the text it inserted was
invisible, so calling jit-lock again (with the same parameters as
previous time) is futile.

However, this is probably not something many modes do, so putting more
effort into optimising this is probably not worth it.

> For me, the more important question is: why the problem didn't
> disappear when you turned off font-lock-mode in the offending buffer.
> And I think I know why: you need to turn off jit-lock-mode as well.

Probably.
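
A minimal sketch of the experiment suggested in the quoted text, assuming it is evaluated in the offending buffer (for example with M-: in *grep*). It only shows how both modes can be switched off and does not confirm the diagnosis.

```elisp
;; Evaluate in the buffer that shows the slowdown.
(font-lock-mode -1)   ; a negative argument turns font-lock off here
(jit-lock-mode nil)   ; a nil argument deactivates jit-lock here
```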

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 11:05:01 GMT) Full text and rfc822 format available.

Message #169 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: dgutov <at> yandex.ru, 44983 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 13:04:46 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

> Part of redisplaying the buffer is making sure the text that might
> wind up on display is fontified.  Which part will actually be on
> display can only be known _after_ the text is fontified (because
> fontification can change faces, and thus affect what's visible in the
> window).

Yeah, that's true -- font-lock might end up making the text visible,
even, I guess?  But then we're being slightly inconsistent -- if the
entire region is invisible, then we don't let font-lock do anything, you
said.

But it probably doesn't really matter much.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sat, 30 Apr 2022 11:13:02 GMT) Full text and rfc822 format available.

Message #172 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: juri <at> linkov.net, 44983 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sat, 30 Apr 2022 14:12:45 +0300

> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: dgutov <at> yandex.ru,  44983 <at> debbugs.gnu.org,  juri <at> linkov.net
> Date: Sat, 30 Apr 2022 13:02:59 +0200
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > We are not.  The display engine will never call jit-lock on a region
> > that starts in invisible text.  But a region that starts in visible
> > text can end in invisible text, and font-lock doesn't pay attention to
> > invisibility spec, AFAIR, it just looks at the buffer text
> > disregarding everything else.
> 
> Yes, that's correct, I think.  But shouldn't it be smarter here?  That
> is, the display engine does know that all the text it inserted was
> invisible

No, it doesn't know that.  The display engine handles the 'fontified'
property first, and the invisible property only after that.  Even more
importantly, the display engine handles these properties only when it
gets to a character with that property, so it's enough that we have a
single character with no invisible property that needs to be
fontified, to have the display engine invoke jit-lock on a chunk of
text starting with that visible character.
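
The condition described above can be probed from Lisp. The following is a rough sketch only: my/first-visible-unfontified is a hypothetical helper, not part of Emacs, and it merely searches for the first character that is both visible and lacks a non-nil 'fontified' property. It does not reproduce what the display engine does internally.

```elisp
;; Hypothetical helper for illustration; not an Emacs function.
(defun my/first-visible-unfontified (&optional start)
  "Return the first position at or after START that is visible
and has no non-nil `fontified' property, or nil if there is none."
  (let ((pos (or start (point-min)))
        found)
    (while (and (not found) (< pos (point-max)))
      (if (and (not (get-text-property pos 'fontified))
               (not (invisible-p pos)))
          (setq found pos)
        (setq pos (1+ pos))))
    found))

;; Usage: one visible character between two invisible runs.
(with-temp-buffer
  (insert (propertize "aaaa" 'invisible t)
          "b"
          (propertize "cccc" 'invisible t))
  (my/first-visible-unfontified))   ; => 5
```

A single such position is enough, per the explanation above, for jit-lock to be invoked on a chunk starting there, even when the rest of the chunk is invisible.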

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44983; Package emacs. (Sun, 01 May 2022 17:49:03 GMT) Full text and rfc822 format available.

Message #175 received at 44983 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 44983 <at> debbugs.gnu.org
Subject: Re: bug#44983: Truncate long lines of grep output
Date: Sun, 01 May 2022 20:14:17 +0300

>> I don't remember such font-lock text property, but now I have no problems
>> when long lines are hidden initially with:
>
> When you apply this, do you disable the existing mechanism for dealing with
> long lines? By setting 'xref-truncation-width' to nil.

Oops, I forgot about xref-truncation-width.  Maybe it's actually
xref-truncation-width that fixed the problem.
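
A minimal sketch of the check asked about in the quoted text: setting xref-truncation-width to nil turns off xref's own truncation of long lines, so that another long-line mechanism can be tested on its own.

```elisp
;; Disable xref's built-in truncation before repeating the test.
(setq xref-truncation-width nil)
```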

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 30 May 2022 11:24:09 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 211 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
