GNU bug report logs - #38322
GCC optimize levels makes huge impact on performance

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: grep; Reported by: Balázs Vinarz <vinibali1@HIDDEN>; Done: Paul Eggert <eggert@HIDDEN>; Maintainer for grep is bug-grep@HIDDEN.

Message received at 38322-done <at> debbugs.gnu.org:


Subject: Re: bug#38322: GCC optimize levels makes huge impact on performance
From: Paul Eggert <eggert@HIDDEN>
To: Balázs Vinarz <vinibali1@HIDDEN>
References: <CAO=iczE0RVtth4uD6ggy6cTs7jtKu2rj7ptVN6KVEWeXrrDgkg@HIDDEN>
 <d4a71e10-5839-ea35-6dd7-ea09156ae181@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <418a248b-adfc-1800-3a8e-61afc82b3145@HIDDEN>
Date: Thu, 2 Jan 2020 02:08:27 -0800

On 11/22/19 5:52 PM, Paul Eggert wrote:
> If we do want to tune grep for set-like operations, that suggests doing some
> surgery to its internals rather than merely fiddling with -O flags.

Since I last wrote, some of that surgery has been done by another grep
contributor, and a simple 'grep -f file1 file2' benchmark that I just now tried
sped up from 47 seconds (for grep 3.1) to 2.3 seconds (for the next version of
grep). So this algorithmic change should far outweigh any GCC optimization level
change.

Anyway, the topic seems to have died down so I'm closing the bug report.
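
To put the figures in this thread side by side, here is a small arithmetic
sketch. It uses only the timings quoted in the messages (4m50s, 4m29s and 3m17s
from the original report; 47 seconds and 2.3 seconds from the benchmark above).
The helper names are illustrative, and the two sets of measurements come from
different machines and inputs, so only the ratios within each set are
meaningful.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYs' style timing to seconds."""
    return 60 * minutes + seconds


def speedup(before, after):
    """Return how many times faster `after` is than `before`."""
    return before / after


# Timings from the original report (grep 3.1, different builds).
ubuntu_build = to_seconds(4, 50)  # 290 s
o2_build = to_seconds(4, 29)      # 269 s
os_build = to_seconds(3, 17)      # 197 s

# Timings from the maintainer's own benchmark (grep 3.1 vs. the next version).
grep_3_1 = 47.0
grep_next = 2.3

print(f"-Os vs. Ubuntu build: {speedup(ubuntu_build, os_build):.2f}x")  # 1.47x
print(f"-Os vs. -O2 build:    {speedup(o2_build, os_build):.2f}x")      # 1.37x
print(f"algorithmic change:   {speedup(grep_3_1, grep_next):.1f}x")     # 20.4x
```

In other words, the compiler-flag change is worth roughly a factor of 1.5 in
the report, while the algorithmic change is worth roughly a factor of 20 in
the later benchmark.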




Notification sent to Balázs Vinarz <vinibali1@HIDDEN>:
bug acknowledged by developer. Full text available.
Reply sent to Paul Eggert <eggert@HIDDEN>:
You have taken responsibility. Full text available.

Message received at 38322 <at> debbugs.gnu.org:


Subject: Re: bug#38322: GCC optimize levels makes huge impact on performance
To: Balázs Vinarz <vinibali1@HIDDEN>
References: <CAO=iczE0RVtth4uD6ggy6cTs7jtKu2rj7ptVN6KVEWeXrrDgkg@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <d4a71e10-5839-ea35-6dd7-ea09156ae181@HIDDEN>
Date: Fri, 22 Nov 2019 17:52:53 -0800

On 11/22/19 8:00 AM, Balázs Vinarz wrote:
> Would you mind changing the default optimization level in the make
> configuration? Has anybody ever measured the benefits of different
> GCC optimization levels?

Lots of measurements have been done. They often disagree. Even if grep
changed the default optimization level (which I'm not sure is a good
idea), distros like Ubuntu often override the default and if so, changes
to the default wouldn't help you.

> I know that this is a special use case, but the improvement is huge.
> I'm looking forward to your feedback.

It sounds like you're using grep to do set subtraction; is this a
common-enough usage to be worth special-casing grep for? (One could
argue that it's easy enough to do set subtraction with Awk or Python or
whatever....) If we do want to tune grep for set-like operations, that
suggests doing some surgery to its internals rather than merely fiddling
with -O flags.
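
As a rough illustration of the "set subtraction with Python" alternative
mentioned above, here is a minimal sketch. It assumes whole-line, exact-string
comparison, which is closer to 'grep -F -x -f file2 file1' (and the same with
-v for subtraction) than to plain 'grep -f', where each pattern line is a
regular expression that may match anywhere in a line. The function names are
made up for this example and are not part of grep.

```python
from pathlib import Path


def read_lines(path):
    """Return the lines of a text file without their trailing newlines."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def lines_only_in_first(first, second):
    """Set subtraction: lines of `first` that do not occur in `second`."""
    exclude = set(read_lines(second))
    # Keep the original order of `first`; membership tests on a set are O(1).
    return [line for line in read_lines(first) if line not in exclude]


def lines_in_both(first, second):
    """Set intersection: lines of `first` that also occur in `second`."""
    keep = set(read_lines(second))
    return [line for line in read_lines(first) if line in keep]
```

Because the lines of the second file are stored in a hash set, each lookup
takes constant time on average, so the whole operation is roughly linear in
the combined size of the two files.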




Information forwarded to bug-grep@HIDDEN:
bug#38322; Package grep. Full text available.

Message received at submit <at> debbugs.gnu.org:


From: Balázs Vinarz <vinibali1@HIDDEN>
Date: Fri, 22 Nov 2019 18:00:29 +0200
Message-ID: <CAO=iczE0RVtth4uD6ggy6cTs7jtKu2rj7ptVN6KVEWeXrrDgkg@HIDDEN>
Subject: GCC optimize levels makes huge impact on performance
To: bug-grep@HIDDEN

Hello there!

Today I was working on two fairly large, plain-text, CSV-like database files
(file1: ~175k lines and 15 MB, file2: ~168k lines and 14 MB). I was simply
searching for lines using grep -f $file2 $file1. I was surprised to
find that the search had been running for minutes without printing a
single line to standard output. I decided to try custom-compiled
binaries, because, to my mind, size-optimized binaries are the
fastest.
In the end, grep (3.1) ran for:
- 4m50s with the binary shipped by Ubuntu,
- 4m29s when recompiled with GCC 7.4 and CFLAGS="-O2", and
- 3m17s when recompiled with GCC 7.4 and CFLAGS="-Os".
I repeated the runs multiple times, so I would say the timings are
accurate. The files were located on tmpfs.
Binary sizes: 215K for Ubuntu, 184K for -O2 and 150K for -Os.
CPU: Intel i5-8350U
OS: Ubuntu 18.04.3 LTS
Would you mind changing the default optimization level in the make
configuration? Has anybody ever measured the benefits of different
GCC optimization levels?
I know that this is a special use case, but the improvement is huge.
I'm looking forward to your feedback.

Best regards
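
The report says the runs were repeated several times. One way such repeated
timings could be collected is sketched below. This is not the script the
reporter used; the helper names are assumptions made for illustration.
Standard output is discarded so that terminal speed does not influence the
result, and a non-zero exit status is tolerated because grep exits with 1
when nothing matches.

```python
import statistics
import subprocess
import time


def time_command(cmd, repeats=5):
    """Run `cmd` `repeats` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=False: grep returns 1 on "no match", which is not a failure here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the minimum, median and maximum of a list of durations."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }
```

A call such as summarize(time_command(["grep", "-f", "file2", "file1"]))
would then give a spread for one binary; the file names here stand in for the
reporter's $file2 and $file1. Comparing medians across builds is less
sensitive to a single slow run than comparing individual timings.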




Acknowledgement sent to Balázs Vinarz <vinibali1@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-grep@HIDDEN. Full text available.
Report forwarded to bug-grep@HIDDEN:
bug#38322; Package grep. Full text available.
Last modified: Thu, 2 Jan 2020 10:15:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.