Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org.
Received: (at submit) by debbugs.gnu.org; 29 Feb 2020 10:09:53 +0000
From: Yikun Jiang <yikunkero@HIDDEN>
Date: Sat, 29 Feb 2020 17:35:29 +0800
Message-ID: <CAArz_dDcZLs0xzLuuxQdWrMW+pX=iBWbw4wz4e=MudKM7-ZU8A@HIDDEN>
Subject: [PATCH] Using crc instructions instead of crc_32_tab in aarch64.
To: bug-gzip@HIDDEN

From: Yikun Jiang <yikunkero@HIDDEN>

Implement CRC function using inline assembly instructions
instead of crc_32_tab to improve the performance in aarch64.
---
 util.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/util.c b/util.c
index 79fe505..d978c61 100644
--- a/util.c
+++ b/util.c
@@ -32,6 +32,17 @@
 #include <dirname.h>
 #include <xalloc.h>
 
+/* ========================================================================
+ * Implement CRC function using inline assembly instructions instead of
+ * crc_32_tab in aarch64.
+ */
+#ifdef __aarch64__
+#  define CRC32D(crc, value) __asm__("crc32x %w[c], %w[c], %x[v]":[c]"+r"(crc):[v]"r"(value))
+#  define CRC32W(crc, value) __asm__("crc32w %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+#  define CRC32H(crc, value) __asm__("crc32h %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+#  define CRC32B(crc, value) __asm__("crc32b %w[c], %w[c], %w[v]":[c]"+r"(crc):[v]"r"(value))
+#endif
+
 #ifndef CHAR_BIT
 #  define CHAR_BIT 8
 #endif
@@ -41,6 +52,7 @@ static int write_buffer (int, voidp, unsigned int);
 /* ========================================================================
  * Table of CRC-32's of all single-byte values (made by makecrc.c)
  */
+#ifndef __aarch64__
 static const ulg crc_32_tab[] = {
   0x00000000L, 0x77073096L, 0xee0e612cL, 0x990951baL, 0x076dc419L,
   0x706af48fL, 0xe963a535L, 0x9e6495a3L, 0x0edb8832L, 0x79dcb8a4L,
@@ -95,6 +107,7 @@ static const ulg crc_32_tab[] = {
   0x5d681b02L, 0x2a6f2b94L, 0xb40bbe37L, 0xc30c8ea1L, 0x5a05df1bL,
   0x2d02ef8dL
 };
+#endif
 
 /* Shift register contents.  */
 static ulg crc = 0xffffffffL;
@@ -134,6 +147,42 @@ ulg updcrc(s, n)
 {
     register ulg c;         /* temporary variable */
 
+#ifdef __aarch64__
+    register const uint8_t  *buf1;
+    register const uint16_t *buf2;
+    register const uint32_t *buf4;
+    register const uint64_t *buf8;
+    int64_t length = (int64_t)n;
+    buf8 = (const  uint64_t *)(const void *)s;
+
+    if (s == NULL) {
+        c = 0xffffffffL;
+    } else {
+        c = crc;
+        while(length >= sizeof(uint64_t)) {
+            CRC32D(c, *buf8++);
+            length -= sizeof(uint64_t);
+        }
+
+        buf4 = (const uint32_t *)(const void *)buf8;
+        if (length >= sizeof(uint32_t)) {
+            CRC32W(c, *buf4++);
+            length -= sizeof(uint32_t);
+        }
+
+        buf2 = (const uint16_t *)(const void *)buf4;
+        if(length >= sizeof(uint16_t)) {
+            CRC32H(c, *buf2++);
+            length -= sizeof(uint16_t);
+        }
+
+        buf1 = (const uint8_t *)(const void *)buf2;
+        if (length >= sizeof(uint8_t)) {
+            CRC32B(c, *buf1);
+            length -= sizeof(uint8_t);
+        }
+    }
+#else
     if (s == NULL) {
         c = 0xffffffffL;
     } else {
@@ -142,6 +191,7 @@ ulg updcrc(s, n)
             c = crc_32_tab[((int)c ^ (*s++)) & 0xff] ^ (c >> 8);
         } while (--n);
     }
+#endif
     crc = c;
     return c ^ 0xffffffffL;       /* (instead of ~c for 64-bit machines) */
 }
-- 
2.17.1
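For comparison, the same 8/4/2/1-byte stepping can be sketched with the ACLE intrinsics from <arm_acle.h> rather than inline assembly. This is only an illustration under stated assumptions, not the submitted patch and not gzip code: the names crc32_update_raw, crc32_buf and HAVE_HW_CRC32 are hypothetical, the hardware path is taken only when the compiler defines __ARM_FEATURE_CRC32 (for example with -march=armv8-a+crc) on a little-endian target, and a bitwise fallback is used everywhere else. The CRC32 instructions are an optional extension in ARMv8.0-A, so testing __aarch64__ alone may not be enough to guarantee they exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* HAVE_HW_CRC32 is a hypothetical macro for this sketch only. */
#if defined(__aarch64__) && defined(__ARM_FEATURE_CRC32) \
    && defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# include <arm_acle.h>
# define HAVE_HW_CRC32 1
#else
# define HAVE_HW_CRC32 0
#endif

/* Hypothetical helper: advance the raw (pre-inverted) CRC-32 shift
   register c over the n bytes at s and return the new register value. */
static uint32_t crc32_update_raw(uint32_t c, const unsigned char *s, size_t n)
{
    assert(s != NULL || n == 0);
#if HAVE_HW_CRC32
    /* memcpy keeps the loads free of alignment and aliasing assumptions;
       compilers normally lower it to a single load instruction. */
    while (n >= sizeof(uint64_t)) {
        uint64_t v;
        memcpy(&v, s, sizeof v);
        c = __crc32d(c, v);
        s += sizeof v;
        n -= sizeof v;
    }
    if (n >= sizeof(uint32_t)) {
        uint32_t v;
        memcpy(&v, s, sizeof v);
        c = __crc32w(c, v);
        s += sizeof v;
        n -= sizeof v;
    }
    if (n >= sizeof(uint16_t)) {
        uint16_t v;
        memcpy(&v, s, sizeof v);
        c = __crc32h(c, v);
        s += sizeof v;
        n -= sizeof v;
    }
    if (n >= 1)
        c = __crc32b(c, *s);
#else
    /* Portable bit-at-a-time fallback, reflected polynomial 0xedb88320. */
    while (n--) {
        c ^= *s++;
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xedb88320u & (0u - (c & 1u)));
    }
#endif
    return c;
}

/* Hypothetical one-shot wrapper: CRC-32 of a whole buffer. */
static uint32_t crc32_buf(const void *buf, size_t n)
{
    return crc32_update_raw(0xffffffffu, buf, n) ^ 0xffffffffu;
}
```

Because the register is passed in and returned instead of being kept in a static variable, the helper can be called repeatedly on consecutive chunks, much as updcrc is, with the final XOR applied once at the end.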
Yikun Jiang <yikunkero@HIDDEN>: bug-gzip@HIDDEN.
bug-gzip@HIDDEN: bug#39831; Package gzip.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.