X-Loop: help-debbugs@HIDDEN Subject: bug#39832: [PATCH] Optimized the deflate in aarch64 Resent-From: Yikun Jiang <yikunkero@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-gzip@HIDDEN Resent-Date: Sat, 29 Feb 2020 10:10:02 +0000 Resent-Message-ID: <handler.39832.B.15829709937694 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: report 39832 X-GNU-PR-Package: gzip X-GNU-PR-Keywords: patch To: 39832 <at> debbugs.gnu.org X-Debbugs-Original-To: bug-gzip@HIDDEN Received: via spool by submit <at> debbugs.gnu.org id=B.15829709937694 (code B ref -1); Sat, 29 Feb 2020 10:10:02 +0000 Received: (at submit) by debbugs.gnu.org; 29 Feb 2020 10:09:53 +0000 Received: from localhost ([127.0.0.1]:34272 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1j7z4C-0001zx-W8 for submit <at> debbugs.gnu.org; Sat, 29 Feb 2020 05:09:53 -0500 Received: from lists.gnu.org ([209.51.188.17]:53632) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <yikunkero@HIDDEN>) id 1j7yYG-0007Hi-G1 for submit <at> debbugs.gnu.org; Sat, 29 Feb 2020 04:36:52 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:43883) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from <yikunkero@HIDDEN>) id 1j7yYF-0001S4-5h for bug-gzip@HIDDEN; Sat, 29 Feb 2020 04:36:52 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM, HTML_MESSAGE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <yikunkero@HIDDEN>) id 1j7yYD-0006cA-Ti for bug-gzip@HIDDEN; Sat, 29 Feb 2020 04:36:51 -0500 Received: from mail-lj1-x243.google.com ([2a00:1450:4864:20::243]:40041) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from <yikunkero@HIDDEN>) id 1j7yYD-0006Yd-LK 
for bug-gzip@HIDDEN; Sat, 29 Feb 2020 04:36:49 -0500 Received: by mail-lj1-x243.google.com with SMTP id 143so6101467ljj.7 for <bug-gzip@HIDDEN>; Sat, 29 Feb 2020 01:36:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=RjTYvL6S15GHdhO2UzO4B1a4/fbB7aVRofsxUa5IiXY=; b=l39wt1krDAfRRHcGE/cyNtVQlMwwlmcuIAOhyXWGko+3Ro9FUKTomXxau9PQLchHPk HXneW5mAUjhrHM/yPC94ygK9MBPddaVKzChjCYFJeuarNaOIlhle56SttdUG2lyLfNsK IJLpPHbIZf9QT2uHc6raA85SiD8ovTwIJ/cgeAC7zosNaG9l3A6n1J00KH5EvjnO/qnO 3lEphIN4OKbHK+nb5ZOXiv+NHUbdugqrq5E5NkZIRX2lYmmBvXXig0AC0I05IXow8Ifv rb003NWZqB0dGk+h+dJwSBXOYOn++4VvJ2tXiU54S9RYfN3cX+ZII9Y9UPO1/fk43iNc 3Hgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=RjTYvL6S15GHdhO2UzO4B1a4/fbB7aVRofsxUa5IiXY=; b=YlYPnZpQsuVs0XhcaMuU4qAfi5k63s1bqfWM0Tysd0H9VSBCwkJH4lbDpTeFAbTURx tkz/3jySiE21bt4JEMaUspzFLLyRPUJsxPDWOXufU9h0xPzEMbAQGwAqUJJoBDa7i+nR mUWWPJX9zEJ/LNlIjK+TaY/QssyOR0dETvqaJP7NQt/ReDMITsEvqptauTBxo0d9xzOp weuBm0aijo9LtxQ+gdxzRX9mCClOYZc846rVi0S5s33A6gToGv9KBb9pUeMFy8y3W/Km T6yYJCy4KwFxDAWaVqCwNFJ8cSMBwQDIwr4Fc9NmSLwyVM/fXwZu2eOEfUKDXg531Ij3 eTMg== X-Gm-Message-State: ANhLgQ3B/zRiM09omoSgNOvx6iqYTn19v78R+eIJ8OcQZZowlU0g+TgW N3NHpBg+/sHtJDOhjuKgpn3uFqCNLuR217CxHilKadD3ulU= X-Google-Smtp-Source: ADFU+vvx0Cn/UwxOAXk9ponjemoGfq1y9N7+eB8dm/jq+cjrzNhHdbgSpUi+JLVIQqmwZUsZx+lVawm9C5IXf1vQaEo= X-Received: by 2002:a2e:80cc:: with SMTP id r12mr5241620ljg.154.1582969008369; Sat, 29 Feb 2020 01:36:48 -0800 (PST) MIME-Version: 1.0 From: Yikun Jiang <yikunkero@HIDDEN> Date: Sat, 29 Feb 2020 17:36:37 +0800 Message-ID: <CAArz_dAeJ8FfE4ksbEHnb-W9Be-M1WsuRhLahFK0QQ35HF8V9g@HIDDEN> Content-Type: multipart/alternative; boundary="0000000000006f347c059fb3b106" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2a00:1450:4864:20::243 X-Spam-Score: 0.3 (/) X-Mailman-Approved-At: Sat, 29 Feb 2020 05:09:52 -0500 X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -0.7 (/)

--0000000000006f347c059fb3b106
Content-Type: text/plain; charset="UTF-8"

From: Yikun Jiang <yikunkero@HIDDEN>

This patch uses a prefetch instruction to preload the next_match
chain entry into cache to improve performance. It also unrolls the
string-insertion loop to reduce the number of branches.
---
 deflate.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/deflate.c b/deflate.c
index 5ed2a9b..008c032 100644
--- a/deflate.c
+++ b/deflate.c
@@ -378,6 +378,9 @@ longest_match(IPos cur_match)
     register int len;                           /* length of current match */
     int best_len = prev_length;                 /* best match length so far */
     IPos limit = strstart > (IPos)MAX_DIST ? strstart - (IPos)MAX_DIST : NIL;
+#ifdef __aarch64__
+    IPos next_match;
+#endif
     /* Stop when cur_match becomes <= limit. To simplify the code,
      * we prevent matches with the string of window index 0.
      */
@@ -411,6 +414,10 @@ longest_match(IPos cur_match)
     do {
         Assert(cur_match < strstart, "no future");
         match = window + cur_match;
+#ifdef __aarch64__
+        next_match = prev[cur_match & WMASK];
+        __asm__("PRFM   PLDL1STRM, [%0]"::"r"(&(prev[next_match & WMASK])));
+#endif
 
         /* Skip to next match if the match length cannot increase
          * or if the match length is less than 2:
@@ -488,8 +495,14 @@ longest_match(IPos cur_match)
             scan_end   = scan[best_len];
 #endif
         }
-    } while ((cur_match = prev[cur_match & WMASK]) > limit
-             && --chain_length != 0);
+    }
+#ifdef __aarch64__
+    while ((cur_match = next_match) > limit
+            && --chain_length != 0);
+#else
+    while ((cur_match = prev[cur_match & WMASK]) > limit
+            && --chain_length != 0);
+#endif
 
     return best_len;
 }
@@ -777,7 +790,20 @@ deflate (int pack_level)
             lookahead -= prev_length-1;
             prev_length -= 2;
             RSYNC_ROLL(strstart, prev_length+1);
+
+            while (prev_length >= 4) {
+                prev_length -= 4;
+                strstart++;
+                INSERT_STRING(strstart, hash_head);
+                strstart++;
+                INSERT_STRING(strstart, hash_head);
+                strstart++;
+                INSERT_STRING(strstart, hash_head);
+                strstart++;
+                INSERT_STRING(strstart, hash_head);
+            }
             do {
+                if (prev_length == 0) break;
                 strstart++;
                 INSERT_STRING(strstart, hash_head);
                 /* strstart never exceeds WSIZE-MAX_MATCH, so there are
-- 
2.17.1

--0000000000006f347c059fb3b106--
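The two techniques in the patch above can be illustrated outside gzip with a minimal, portable sketch. Everything in it is an assumption made for illustration and is not part of the submitted patch: the names walk_chain, insert_one, insert_simple, insert_unrolled, state_t and CHAIN_NIL are hypothetical, and the GCC/Clang builtin __builtin_prefetch stands in for the patch's inline PRFM instruction. The remainder of the unrolled loop is handled with a plain while loop rather than the patch's early break.

```c
#include <assert.h>

#define CHAIN_NIL 0u   /* hypothetical end-of-chain marker, similar in spirit to NIL */

/* Walk a hash chain stored in prev_tab[]. While the current link is being
 * processed, ask the cache for the entry that the *next* link will read.
 * Returns the number of links visited. */
static unsigned walk_chain(const unsigned short *prev_tab, unsigned mask,
                           unsigned cur, unsigned max_links)
{
    unsigned visited = 0;
    while (cur != CHAIN_NIL && visited < max_links) {
        unsigned next = prev_tab[cur & mask];
#if defined(__GNUC__) || defined(__clang__)
        /* Read prefetch with low temporal locality; this is only a hint
         * and does not change program results. */
        __builtin_prefetch(&prev_tab[next & mask], 0, 0);
#endif
        visited++;      /* real code would compare candidate strings here */
        cur = next;
    }
    return visited;
}

/* Toy state standing in for strstart and the hash table updates. */
typedef struct {
    unsigned pos;
    unsigned long checksum;
    unsigned count;
} state_t;

/* Stand-in for "strstart++; INSERT_STRING(strstart, hash_head);". */
static void insert_one(state_t *s)
{
    s->pos++;
    s->checksum = s->checksum * 31u + s->pos;
    s->count++;
}

/* Reference form: one loop branch per insertion. Requires n >= 1. */
static void insert_simple(state_t *s, unsigned n)
{
    assert(n >= 1);
    do {
        insert_one(s);
    } while (--n != 0);
}

/* Unrolled form: one loop branch per four insertions, then a remainder loop. */
static void insert_unrolled(state_t *s, unsigned n)
{
    while (n >= 4) {
        n -= 4;
        insert_one(s);
        insert_one(s);
        insert_one(s);
        insert_one(s);
    }
    while (n != 0) {
        insert_one(s);
        n--;
    }
}
```

Calling insert_simple and insert_unrolled with the same starting state and the same n >= 1 should leave both states identical; only the number of loop-condition checks differs. Whether either change yields a measurable speedup depends on the CPU and the input data, so it would need benchmarking on the target machine.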
Content-Disposition: inline MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) Content-Type: text/plain; charset=utf-8 X-Loop: help-debbugs@HIDDEN From: help-debbugs@HIDDEN (GNU bug Tracking System) To: Yikun Jiang <yikunkero@HIDDEN> Subject: bug#39832: Acknowledgement ([PATCH] Optimized the deflate in aarch64) Message-ID: <handler.39832.B.15829709937694.ack <at> debbugs.gnu.org> References: <CAArz_dAeJ8FfE4ksbEHnb-W9Be-M1WsuRhLahFK0QQ35HF8V9g@HIDDEN> X-Gnu-PR-Message: ack 39832 X-Gnu-PR-Package: gzip X-Gnu-PR-Keywords: patch Reply-To: 39832 <at> debbugs.gnu.org Date: Sat, 29 Feb 2020 10:10:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-gzip@HIDDEN

If you wish to submit further information on this problem, please
send it to 39832 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
39832: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=39832
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems
Received: (at control) by debbugs.gnu.org; 5 Apr 2022 01:36:35 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Apr 04 21:36:35 2022 Received: from localhost ([127.0.0.1]:53428 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1nbY7X-0004FO-0I for submit <at> debbugs.gnu.org; Mon, 04 Apr 2022 21:36:35 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:46330) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <eggert@HIDDEN>) id 1nbY7U-0004F4-VE for control <at> debbugs.gnu.org; Mon, 04 Apr 2022 21:36:33 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id CB30716009A for <control <at> debbugs.gnu.org>; Mon, 4 Apr 2022 18:36:26 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 5oYvPCcQLN8w for <control <at> debbugs.gnu.org>; Mon, 4 Apr 2022 18:36:26 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 3B52F160130 for <control <at> debbugs.gnu.org>; Mon, 4 Apr 2022 18:36:26 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id ZEoBnrxCVF8D for <control <at> debbugs.gnu.org>; Mon, 4 Apr 2022 18:36:26 -0700 (PDT) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 1C87816009A for <control <at> debbugs.gnu.org>; Mon, 4 Apr 2022 18:36:26 -0700 (PDT) Message-ID: <ddb4b521-92e0-48d0-2157-eb6ccb8ca9ac@HIDDEN> Date: Mon, 4 Apr 2022 18:36:25 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Content-Language: en-US To: GNU bug control <control <at> debbugs.gnu.org> From: Paul Eggert <eggert@HIDDEN> Subject: gzip bug report maintenance Organization: UCLA Computer Science Department Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---)

tags 41535 wontfix
tags 39832 wontfix
tags 39831 wontfix