GNU bug report logs - #39831
[PATCH] Using crc instructions instead of crc_32_tab in aarch64.

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: gzip; Reported by: Yikun Jiang <yikunkero@HIDDEN>; Keywords: wontfix patch; dated Sat, 29 Feb 2020 10:10:01 UTC; Maintainer for gzip is bug-gzip@HIDDEN.
Added tag(s) wontfix. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 29 Feb 2020 10:09:53 +0000
MIME-Version: 1.0
From: Yikun Jiang <yikunkero@HIDDEN>
Date: Sat, 29 Feb 2020 17:35:29 +0800
Message-ID: <CAArz_dDcZLs0xzLuuxQdWrMW+pX=iBWbw4wz4e=MudKM7-ZU8A@HIDDEN>
Subject: [PATCH] Using crc instructions instead of crc_32_tab in aarch64.
To: bug-gzip@HIDDEN
Content-Type: multipart/alternative; boundary="00000000000060ee50059fb3ad77"

--00000000000060ee50059fb3ad77
Content-Type: text/plain; charset="UTF-8"

From: Yikun Jiang <yikunkero@HIDDEN>

Implement the CRC function with inline assembly CRC32 instructions
instead of the crc_32_tab lookup table to improve performance on aarch64.
---
 util.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/util.c b/util.c
index 79fe505..d978c61 100644
--- a/util.c
+++ b/util.c
@@ -32,6 +32,17 @@
 #include <dirname.h>
 #include <xalloc.h>

+/* ========================================================================
+ * Implement CRC function using inline assembly instructions instead of
+ * crc_32_tab in aarch64.
+ */
+#ifdef __aarch64__
+#  define CRC32D(crc, value) __asm__("crc32x %w[c], %w[c], %x[v]":[c]"+r"(crc):[v]"r"(value))
+#  define CRC32W(crc, value) __asm__("crc32w %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+#  define CRC32H(crc, value) __asm__("crc32h %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+#  define CRC32B(crc, value) __asm__("crc32b %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+#endif
+
 #ifndef CHAR_BIT
 #  define CHAR_BIT 8
 #endif
@@ -41,6 +52,7 @@ static int write_buffer (int, voidp, unsigned int);
 /* ========================================================================
  * Table of CRC-32's of all single-byte values (made by makecrc.c)
  */
+#ifndef __aarch64__
 static const ulg crc_32_tab[] = {
   0x00000000L, 0x77073096L, 0xee0e612cL, 0x990951baL, 0x076dc419L,
   0x706af48fL, 0xe963a535L, 0x9e6495a3L, 0x0edb8832L, 0x79dcb8a4L,
@@ -95,6 +107,7 @@ static const ulg crc_32_tab[] = {
   0x5d681b02L, 0x2a6f2b94L, 0xb40bbe37L, 0xc30c8ea1L, 0x5a05df1bL,
   0x2d02ef8dL
 };
+#endif

 /* Shift register contents.  */
 static ulg crc = 0xffffffffL;
@@ -134,6 +147,42 @@ ulg updcrc(s, n)
 {
     register ulg c;         /* temporary variable */

+#ifdef __aarch64__
+    register const uint8_t  *buf1;
+    register const uint16_t *buf2;
+    register const uint32_t *buf4;
+    register const uint64_t *buf8;
+    int64_t length = (int64_t)n;
+    buf8 = (const  uint64_t *)(const void *)s;
+
+    if (s == NULL) {
+        c = 0xffffffffL;
+    } else {
+        c = crc;
+        while(length >= sizeof(uint64_t)) {
+            CRC32D(c, *buf8++);
+            length -= sizeof(uint64_t);
+        }
+
+        buf4 = (const uint32_t *)(const void *)buf8;
+        if (length >= sizeof(uint32_t)) {
+            CRC32W(c, *buf4++);
+            length -= sizeof(uint32_t);
+        }
+
+        buf2 = (const uint16_t *)(const void *)buf4;
+        if(length >= sizeof(uint16_t)) {
+            CRC32H(c, *buf2++);
+            length -= sizeof(uint16_t);
+        }
+
+        buf1 = (const uint8_t *)(const void *)buf2;
+        if (length >= sizeof(uint8_t)) {
+            CRC32B(c, *buf1);
+            length -= sizeof(uint8_t);
+        }
+    }
+#else
     if (s == NULL) {
         c = 0xffffffffL;
     } else {
@@ -142,6 +191,7 @@ ulg updcrc(s, n)
             c = crc_32_tab[((int)c ^ (*s++)) & 0xff] ^ (c >> 8);
         } while (--n);
     }
+#endif
     crc = c;
     return c ^ 0xffffffffL;       /* (instead of ~c for 64-bit machines) */
 }
--
2.17.1

--00000000000060ee50059fb3ad77--
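
The patch in this message replaces the crc_32_tab lookup with the AArch64
CRC32 instructions through inline assembly. As a rough, standalone
illustration of the same idea (not the submitted patch and not gzip's code),
the sketch below uses the ACLE intrinsics from <arm_acle.h> when the compiler
defines __ARM_FEATURE_CRC32, and otherwise falls back to a bit-at-a-time loop
over the same reflected polynomial 0xedb88320. The names crc32_update_raw,
crc32_buf and HAVE_HW_CRC32 are hypothetical. The sketch assumes a GCC- or
Clang-style compiler, enables the hardware path only on little-endian
targets, and uses memcpy for the 8-byte loads so the buffer need not be
aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/Clang predefined macros. The hardware path is used only
   when the CRC32 extension is enabled at compile time and the target is
   little-endian, so that an 8-byte load matches byte-at-a-time order. */
#if defined(__ARM_FEATURE_CRC32) && defined(__BYTE_ORDER__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#  include <arm_acle.h>
#  define HAVE_HW_CRC32 1
#else
#  define HAVE_HW_CRC32 0
#endif

/* Hypothetical helper (not gzip's updcrc): feeds n bytes into the raw
   shift-register value c and returns the new raw value (no final XOR). */
static uint32_t crc32_update_raw(uint32_t c, const void *buf, size_t n)
{
    const unsigned char *s = (const unsigned char *)buf;

#if HAVE_HW_CRC32
    /* Consume 8 bytes per instruction while enough input remains. */
    while (n >= sizeof(uint64_t)) {
        uint64_t v;
        memcpy(&v, s, sizeof v);   /* safe for unaligned input */
        c = __crc32d(c, v);
        s += sizeof v;
        n -= sizeof v;
    }
    /* Finish the remaining 0..7 bytes one at a time. */
    while (n > 0) {
        c = __crc32b(c, *s++);
        n--;
    }
#else
    /* Portable fallback: one bit at a time, reflected polynomial. */
    while (n > 0) {
        c ^= *s++;
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xedb88320u & (0u - (c & 1u)));
        n--;
    }
#endif
    return c;
}

/* Hypothetical one-shot wrapper: CRC-32 of a whole buffer, with the usual
   initial value and final XOR of 0xffffffff. */
uint32_t crc32_buf(const void *buf, size_t n)
{
    return crc32_update_raw(0xffffffffu, buf, n) ^ 0xffffffffu;
}
```

For example, crc32_buf("123456789", 9) yields 0xcbf43926, the standard
CRC-32 check value, on either path. With GCC or Clang on AArch64, the
hardware path is typically enabled by a flag such as -march=armv8-a+crc.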




Acknowledgement sent to Yikun Jiang <yikunkero@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gzip@HIDDEN.
Report forwarded to bug-gzip@HIDDEN:
bug#39831; Package gzip.
Last modified: Tue, 5 Apr 2022 01:45:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.