GNU bug report logs - #75796
Very inefficient use of --color escape sequences

Package: grep

Reported by: Peter White <peter.white <at> posteo.net>

Date: Fri, 24 Jan 2025 04:19:02 UTC

Severity: normal

To reply to this bug, email your comments to 75796 AT debbugs.gnu.org.


Report forwarded to bug-grep <at> gnu.org:
bug#75796; Package grep. (Fri, 24 Jan 2025 04:19:02 GMT)

Acknowledgement sent to Peter White <peter.white <at> posteo.net>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 24 Jan 2025 04:19:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Peter White <peter.white <at> posteo.net>
To: bug-grep <at> gnu.org
Subject: Very inefficient use of --color escape sequences
Date: Fri, 24 Jan 2025 03:20:28 +0000
Hi there,

I just stumbled on this by accidentally redirecting the output of a
colored grep invocation to a file like so:
	
	$ grep --color=always . /sys/kernel/mm/transparent_hugepage/* >thp-status.txt 2>/dev/null
	# for convenience the 1st line of actual uncolored output
	/sys/kernel/mm/transparent_hugepage/defrag:always defer defer+madvise [madvise] never
	
Subsequently opening said file in vim I was overwhelmed with escape
sequences. From the looks of it every character in the match gets its
very own color escape sequence as opposed to the string indicating the
path to the file (everything before ':') which is actually kind of
readable b/c it starts with only one escape sequence which gets reset
just before ':'.

Now I know that this is not really how one is supposed to use the
program and its output, but it does show that the way grep works when
coloring the match is rather inefficient. As I said, I only discovered
this by accident; otherwise I haven't noticed any performance
issues or other ill effects. The interesting part is that this way the
color escapes outweigh the actual payload text quite substantially:

	$ du -b grep*
	2471    grep-color-always.txt
	359     grep-color-auto.txt

This is the same output as above; the file names should say it all. That's
just shy of a factor of 7! I think that is quite some overhead for the actual
purpose. Now I don't necessarily agree with "benchmarking" terminal
emulators but did read a little about it. And maybe I just don't have a
brutal enough use case to make this an actual performance issue on my
hardware. But I believe fixing this might just result in some
improvements in that area, FWIW.
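
As a quick check of the "factor of 7" figure, the ratio of the two file
sizes reported above can be computed directly. This is only arithmetic on
the byte counts from the `du -b` listing; the variable names are
illustrative and not part of the original report.

```shell
colored=2471   # bytes in grep-color-always.txt, from the du -b listing
plain=359      # bytes in grep-color-auto.txt, from the du -b listing

# Ratio of colored output size to uncolored output size, two decimals.
ratio=$(awk -v c="$colored" -v p="$plain" 'BEGIN { printf "%.2f", c / p }')
echo "colored/plain = $ratio"
```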

I am using GNU grep 3.11, which seems to be the current stable release,
on the current Ubuntu 24.04 LTS release. So maybe this has been
addressed in a later devel version? A cursory search for the issue
turned up empty though. That's why I would rather avoid compiling from
source for now, unless of course for testing a possible fix. Again, I
don't *need* a fix and would be happy to wait another year for the next
Ubuntu LTS, given the non-noticeable impact on my end.

So is this something that cannot be done any other way b/c of the way
matching works - as in: "if it can't be done elegantly use brute force"?
Or could this be classified as an actual bug, albeit a low priority one?


Peter White




Information forwarded to bug-grep <at> gnu.org:
bug#75796; Package grep. (Fri, 24 Jan 2025 04:40:01 GMT)

Message #8 received at 75796 <at> debbugs.gnu.org:

From: Peter White <peter.white <at> posteo.net>
To: 75796 <at> debbugs.gnu.org
Subject: Re: bug#75796: Very inefficient use of --color escape sequences
Date: Fri, 24 Jan 2025 04:39:23 +0000
On Fri, Jan 24, 2025 at 03:20:28AM +0000, Peter White wrote:
> Hi there,
> 
> I just stumbled on this by accidentally redirecting the output of a
> colored grep invocation to a file like so:
> 	
> 	$ grep --color=always . /sys/kernel/mm/transparent_hugepage/* >thp-status.txt 2>/dev/null
> 	# for convenience the 1st line of actual uncolored output
> 	/sys/kernel/mm/transparent_hugepage/defrag:always defer defer+madvise [madvise] never
> 	
> Subsequently opening said file in vim I was overwhelmed with escape
> sequences. From the looks of it every character in the match gets its
> very own color escape sequence as opposed to the string indicating the
> path to the file (everything before ':')

I did have a look at the code after all, and even tried a premature
"fix", which broke the foad1 test in the Debian source package. But that
made me realize that this is not a bug at all: grep behaved as designed
by highlighting every single match, and every single character happened
to be a match because the pattern was '.'.
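
The effect can be reproduced with a small sketch that compares a
one-character pattern against a pattern matching the whole line. It
assumes GNU grep is the `grep` on PATH; the sample input string and the
variable names are made up for illustration, and the exact byte counts
depend on the GREP_COLORS setting, so none are shown here.

```shell
# Pattern '.': every character is its own match, so each character is
# wrapped in its own pair of color escape sequences.
per_char=$(printf 'always defer\n' | grep --color=always '.' | wc -c)

# Pattern '.*': the whole line is one match, so it is wrapped only once.
whole_line=$(printf 'always defer\n' | grep --color=always '.*' | wc -c)

# Uncolored output, for comparison with the payload alone.
plain=$(printf 'always defer\n' | grep --color=never '.' | wc -c)

echo "per-character matches: $per_char bytes"
echo "single whole-line match: $whole_line bytes"
echo "uncolored: $plain bytes"
```

With a pattern such as '.*' the overhead is a fixed cost per matched line
rather than per character, which is consistent with the conclusion that
the large output was caused by the pattern and not by a defect in grep.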

I am sorry for wasting anyone's time and can only hope that this
retraction reaches them before they might go chasing ghosts.


Peter White




This bug report was last modified 75 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.