GNU bug report logs - #38322
GCC optimize levels makes huge impact on performance

Package: grep

Reported by: Balázs Vinarz <vinibali1 <at> gmail.com>

Date: Fri, 22 Nov 2019 17:05:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 38322 in the body.
You can then email your comments to 38322 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#38322; Package grep. (Fri, 22 Nov 2019 17:05:04 GMT)

Acknowledgement sent to Balázs Vinarz <vinibali1 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 22 Nov 2019 17:05:07 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Balázs Vinarz <vinibali1 <at> gmail.com>
To: bug-grep <at> gnu.org
Subject: GCC optimize levels makes huge impact on performance
Date: Fri, 22 Nov 2019 18:00:29 +0200
Hello there!

Today I was working with two fairly large, plain-text, CSV-like database files
(file1: ~175k lines and 15MB, file2: ~168k lines and 14MB). I simply
searched for lines using grep -f $file2 $file1. I was surprised
when I realized the search had already been running for minutes without a
single line on standard output. I decided to try
custom-compiled binaries, because in my mind size-optimized
binaries are the fastest.
In the end grep (3.1) ran for:
- 4m50s with the binary shipped by Ubuntu,
- 4m29s when recompiled with GCC 7.4 and CFLAGS="-O2", and
- 3m17s when recompiled with GCC 7.4 and CFLAGS="-Os".
I repeated the runs multiple times, so I would say the timings are accurate. The
files were located on tmpfs.
Binary sizes are: 215K for Ubuntu, 184K for -O2 and 150K for -Os.
CPU: Intel i5-8350U
OS: Ubuntu 18.04.3 LTS
Would you mind changing the default optimization level in the make
configuration? Has anybody ever measured the benefits of different
GCC optimization levels?
I know that this is a special use case, but the improvement is huge.
I'm looking forward to your feedback.

Best regards
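
The timings above come from repeated manual runs. A minimal sketch of how such a comparison could be scripted is shown here, assuming Python 3 is available. The helper names `time_command` and `summarize` are hypothetical and are not part of grep or of the original report.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal speed does not distort the measurement.
        # check=False because grep exits with status 1 when nothing matches.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to min, median and max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }
```

A call such as `summarize(time_command(["./grep-Os/src/grep", "-f", "file2", "file1"]))` would then report the spread for one build; the binary path is an assumed example. Comparing medians across builds is less sensitive to a single slow run than comparing one measurement per build.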




Information forwarded to bug-grep <at> gnu.org:
bug#38322; Package grep. (Sat, 23 Nov 2019 01:54:01 GMT)

Message #8 received at 38322 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Balázs Vinarz <vinibali1 <at> gmail.com>
Cc: 38322 <at> debbugs.gnu.org
Subject: Re: bug#38322: GCC optimize levels makes huge impact on performance
Date: Fri, 22 Nov 2019 17:52:53 -0800
On 11/22/19 8:00 AM, Balázs Vinarz wrote:
> Would you mind change the default optimize level on the make
> configuration? Did somebody ever measured the benefits using different
> GCC optimalization levels?

Lots of measurements have been done. They often disagree. Even if grep 
changed the default optimization level (which I'm not sure is a good 
idea), distros like Ubuntu often override the default and if so, changes 
to the default wouldn't help you.

> I know that this is a special use case, but the improvement is huge.
> I'm looking forward for your feedback.

It sounds like you're using grep to do set subtraction; is this a 
common-enough usage to be worth special-casing grep for? (One could 
argue that it's easy enough to do set subtraction with Awk or Python or 
whatever....) If we do want to tune grep for set-like operations, that 
suggests doing some surgery to its internals rather than merely fiddling 
with -O flags.
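
As a rough illustration of the Python alternative mentioned above, the following sketch performs set-like line operations with a hash set. It assumes whole-line, fixed-string matching, roughly what `grep -F -x -f file2 file1` (and the same with `-v`) would select; the thread does not say whether the reporter's patterns were meant as regular expressions or substrings. The function names are hypothetical.

```python
def read_lines(path):
    """Read a text file into a list of lines without trailing newlines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle]


def common_lines(haystack_lines, needle_lines):
    """Lines of haystack that appear verbatim in needle (order preserved)."""
    needles = set(needle_lines)  # average O(1) membership test per line
    return [line for line in haystack_lines if line in needles]


def missing_lines(haystack_lines, needle_lines):
    """Lines of haystack that do not appear in needle (set subtraction)."""
    needles = set(needle_lines)
    return [line for line in haystack_lines if line not in needles]
```

For example, `missing_lines(read_lines("file1"), read_lines("file2"))` lists the lines of file1 that are absent from file2. Each input is read once, so the cost grows roughly linearly with the total number of lines.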




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Thu, 02 Jan 2020 10:09:02 GMT)

Notification sent to Balázs Vinarz <vinibali1 <at> gmail.com>:
bug acknowledged by developer. (Thu, 02 Jan 2020 10:09:02 GMT)

Message #13 received at 38322-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Balázs Vinarz <vinibali1 <at> gmail.com>
Cc: 38322-done <at> debbugs.gnu.org
Subject: Re: bug#38322: GCC optimize levels makes huge impact on performance
Date: Thu, 2 Jan 2020 02:08:27 -0800
On 11/22/19 5:52 PM, Paul Eggert wrote:
> If we do want to tune grep for set-like operations, that suggests doing some
> surgery to its internals rather than merely fiddling with -O flags.

Since I last wrote, some of that surgery has been done by another grep
contributor, and a simple 'grep -f file1 file2' benchmark that I just now tried
sped up from 47 seconds (for grep 3.1) to 2.3 seconds (for the next version of
grep). So this algorithmic change should far outweigh any GCC optimization level
change.

Anyway, the topic seems to have died down so I'm closing the bug report.
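
To put the figures quoted in this thread side by side, here is a small arithmetic sketch. It uses only the timings reported above; the two benchmarks ran on different inputs and machines, so the ratios are indicative rather than directly comparable.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


def speedup(before, after):
    """Ratio of old runtime to new runtime."""
    return before / after


# Timings from the original report (grep 3.1, grep -f $file2 $file1).
ubuntu = to_seconds(4, 50)
o2 = to_seconds(4, 29)
os_ = to_seconds(3, 17)

print(f"-O2 -> -Os:       {speedup(o2, os_):.2f}x")      # 1.37x
print(f"Ubuntu -> -Os:    {speedup(ubuntu, os_):.2f}x")  # 1.47x
# Timings from the maintainer's own benchmark.
print(f"grep 3.1 -> next: {speedup(47, 2.3):.1f}x")      # 20.4x
```

The compiler-flag change yields well under a 2x gain, while the algorithmic change is on the order of 20x.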




Information forwarded to bug-grep <at> gnu.org:
bug#38322; Package grep. (Thu, 02 Jan 2020 15:49:02 GMT)

Message #16 received at 38322 <at> debbugs.gnu.org:

From: Balázs Vinarz <vinibali1 <at> gmail.com>
To: 38322 <at> debbugs.gnu.org
Date: Thu, 2 Jan 2020 16:48:18 +0100
Thank you Paul,
I wish you a happy new year!
Best regards

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 31 Jan 2020 12:24:04 GMT)

This bug report was last modified 4 years and 80 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.