MIME-Version: 1.0
From: Yikun Jiang <yikunkero@HIDDEN>
Date: Sat, 29 Feb 2020 17:35:29 +0800
Message-ID: <CAArz_dDcZLs0xzLuuxQdWrMW+pX=iBWbw4wz4e=MudKM7-ZU8A@HIDDEN>
Subject: [PATCH] Using crc instructions instead of crc_32_tab in aarch64.
To: bug-gzip@HIDDEN
Content-Type: multipart/alternative; boundary="00000000000060ee50059fb3ad77"
--00000000000060ee50059fb3ad77
Content-Type: text/plain; charset="UTF-8"
From: Yikun Jiang <yikunkero@HIDDEN>
Implement the CRC function with inline-assembly CRC32 instructions
instead of the crc_32_tab lookup table to improve performance on aarch64.
---
util.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
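
For comparison, the same update can be sketched with the ACLE intrinsics
from <arm_acle.h> rather than hand-written asm. This is only an
illustration under stated assumptions, not part of the patch:
crc32_update_sketch is a hypothetical name, the __ARM_FEATURE_CRC32 guard
assumes the compiler targets a CPU with the optional CRC extension (for
example via -march=armv8-a+crc), and the 8-byte loads assume a
little-endian target. On any other host it falls back to a bit-at-a-time
loop so the block still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined __aarch64__ && defined __ARM_FEATURE_CRC32
# include <arm_acle.h>
# define HAVE_HW_CRC32 1
#else
# define HAVE_HW_CRC32 0
#endif

/* Hypothetical helper, not part of gzip: advance the raw CRC-32 shift
   register C over the N bytes at BUF and return the new register.  */
static uint32_t
crc32_update_sketch (uint32_t c, const unsigned char *buf, size_t n)
{
  assert (buf != NULL || n == 0);
#if HAVE_HW_CRC32
  /* memcpy avoids alignment and aliasing assumptions on the loads.
     Assumption: little-endian byte order.  */
  while (n >= sizeof (uint64_t))
    {
      uint64_t v;
      memcpy (&v, buf, sizeof v);
      c = __crc32d (c, v);
      buf += sizeof v;
      n -= sizeof v;
    }
  /* At most 7 bytes remain; feed them one at a time.  */
  while (n--)
    c = __crc32b (c, *buf++);
#else
  /* Portable fallback: reflected CRC-32 polynomial, one bit per step.  */
  while (n--)
    {
      c ^= *buf++;
      for (int k = 0; k < 8; k++)
        c = (c >> 1) ^ (0xedb88320u & -(c & 1u));
    }
#endif
  return c;
}
```

Starting from 0xffffffff and XORing the result with 0xffffffff, as updcrc
does, the string "123456789" should give the standard CRC-32 check value
0xcbf43926 on either path.
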
diff --git a/util.c b/util.c
index 79fe505..d978c61 100644
--- a/util.c
+++ b/util.c
@@ -32,6 +32,17 @@
#include <dirname.h>
#include <xalloc.h>
+/* ========================================================================
+ * Implement CRC function using inline assembly instructions instead of
+ * crc_32_tab in aarch64.
+ */
+#ifdef __aarch64__
+# define CRC32D(crc, value) __asm__("crc32x %w[c], %w[c], %x[v]":[c]"+r"(crc):[v]"r"(value))
+# define CRC32W(crc, value) __asm__("crc32w %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+# define CRC32H(crc, value) __asm__("crc32h %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+# define CRC32B(crc, value) __asm__("crc32b %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+#endif
+
#ifndef CHAR_BIT
# define CHAR_BIT 8
#endif
@@ -41,6 +52,7 @@ static int write_buffer (int, voidp, unsigned int);
/* ========================================================================
* Table of CRC-32's of all single-byte values (made by makecrc.c)
*/
+#ifndef __aarch64__
static const ulg crc_32_tab[] = {
0x00000000L, 0x77073096L, 0xee0e612cL, 0x990951baL, 0x076dc419L,
0x706af48fL, 0xe963a535L, 0x9e6495a3L, 0x0edb8832L, 0x79dcb8a4L,
@@ -95,6 +107,7 @@ static const ulg crc_32_tab[] = {
0x5d681b02L, 0x2a6f2b94L, 0xb40bbe37L, 0xc30c8ea1L, 0x5a05df1bL,
0x2d02ef8dL
};
+#endif
/* Shift register contents. */
static ulg crc = 0xffffffffL;
@@ -134,6 +147,42 @@ ulg updcrc(s, n)
{
register ulg c; /* temporary variable */
+#ifdef __aarch64__
+    register const uint8_t  *buf1;
+    register const uint16_t *buf2;
+    register const uint32_t *buf4;
+    register const uint64_t *buf8;
+    int64_t length = (int64_t)n;
+    buf8 = (const uint64_t *)(const void *)s;
+
+    if (s == NULL) {
+        c = 0xffffffffL;
+    } else {
+        c = crc;
+        while (length >= sizeof(uint64_t)) {
+            CRC32D(c, *buf8++);
+            length -= sizeof(uint64_t);
+        }
+
+        buf4 = (const uint32_t *)(const void *)buf8;
+        if (length >= sizeof(uint32_t)) {
+            CRC32W(c, *buf4++);
+            length -= sizeof(uint32_t);
+        }
+
+        buf2 = (const uint16_t *)(const void *)buf4;
+        if (length >= sizeof(uint16_t)) {
+            CRC32H(c, *buf2++);
+            length -= sizeof(uint16_t);
+        }
+
+        buf1 = (const uint8_t *)(const void *)buf2;
+        if (length >= sizeof(uint8_t)) {
+            CRC32B(c, *buf1);
+            length -= sizeof(uint8_t);
+        }
+    }
+#else
if (s == NULL) {
c = 0xffffffffL;
} else {
@@ -142,6 +191,7 @@ ulg updcrc(s, n)
c = crc_32_tab[((int)c ^ (*s++)) & 0xff] ^ (c >> 8);
} while (--n);
}
+#endif
crc = c;
return c ^ 0xffffffffL; /* (instead of ~c for 64-bit machines) */
}
--
2.17.1
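
To check a replacement for crc_32_tab against the existing behaviour, the
table can be regenerated and compared with the entries visible in the hunk
context above. A minimal sketch, assuming the standard reflected CRC-32
polynomial 0xedb88320; sketch_tab, make_sketch_tab and table_update are
hypothetical names, and this is not claimed to be how makecrc.c is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for crc_32_tab, filled at run time.  */
static uint32_t sketch_tab[256];

/* Each entry is the CRC-32 remainder of a single byte value.  */
static void
make_sketch_tab (void)
{
  for (uint32_t i = 0; i < 256; i++)
    {
      uint32_t c = i;
      for (int k = 0; k < 8; k++)
        c = (c & 1u) ? (c >> 1) ^ 0xedb88320u : c >> 1;
      sketch_tab[i] = c;
    }
}

/* Table-driven update using the same recurrence as the non-aarch64
   branch of updcrc: one lookup per input byte.  */
static uint32_t
table_update (uint32_t c, const unsigned char *s, size_t n)
{
  assert (s != NULL || n == 0);
  while (n--)
    c = sketch_tab[(c ^ *s++) & 0xff] ^ (c >> 8);
  return c;
}
```

After make_sketch_tab () the first entries should read 0x00000000,
0x77073096, 0xee0e612c, matching the table in util.c, and any hardware
path can be compared byte for byte against table_update on the same input.
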
--00000000000060ee50059fb3ad77--
bug#39831; Package gzip.
Yikun Jiang <yikunkero@HIDDEN> to bug-gzip@HIDDEN.