GNU bug report logs - #67022
Gzip decompression can be 60% faster using zlib's CRC32

Previous Next

Package: gzip;

Reported by: Young Mo Kang <kym327 <at> gmail.com>

Date: Thu, 9 Nov 2023 17:42:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 67022 in the body.
You can then email your comments to 67022 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Thu, 09 Nov 2023 17:42:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Young Mo Kang <kym327 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gzip <at> gnu.org. (Thu, 09 Nov 2023 17:42:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Young Mo Kang <kym327 <at> gmail.com>
To: bug-gzip <at> gnu.org
Subject: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 9 Nov 2023 12:40:24 -0500
Hello,


I have noticed that GNU Gzip's CRC32 calculation is the main bottleneck 
in decompression, and it can run significantly faster >60% if we replace 
it with crc32 function from zlib.


I tested decompression speed of linux source code tar.gz file before and 
after replacing CRC32 computation. On an AMD 7735HS system, I get

GNU Gzip unmodified
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:05.11
GNU Gzip with CRC32 from zlib
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.16


And I saw even better performance improvement when tested on an Apple 
Silicon M1 system.

GNU Gzip unmodified
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.83
GNU Gzip with CRC32 from zlib
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.72


Since both GNU Gzip and zlib are written by the same authors, I was 
wondering if GNU Gzip can share zlib's CRC32 calculation and obtain this 
performance gain--I am not sure if there would be a license issue though.


The following bash script should reproduce the result

```

# download GNU Gzip and zlib
wget -O- https://ftp.gnu.org/gnu/gzip/gzip-1.13.tar.gz | tar xzf -
wget -O- https://zlib.net/zlib-1.3.tar.gz | tar xzf -

# download linux source code as a test file for decompression speed
wget -O- https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.1.tar.xz 
| xz -d | gzip > linux.tar.gz

# compile zlib
cd zlib-1.3
CFLAGS="-O2 -g" ./configure --static && make -j
cd ..

# compile GNU Gzip
cd gzip-1.13
CFLAGS="-O2 -g" ./configure && make -j

# measure decompression speed
/usr/bin/time -v ./gzip -d < ../linux.tar.gz > linux.tar 2> ../gzip1.time

# use crc32 from zlib
cat > util.diff << EOF
@@ -27,6 +27,7 @@
 #include <stdlib.h>
 #include <errno.h>

+#include "crc32.h"
 #include "tailor.h"
 #include "gzip.h"
 #include <dirname.h>
@@ -136,25 +137,14 @@ copy (int in, int out)
 ulg
 updcrc (uch const *s, unsigned n)
 {
-    register ulg c;         /* temporary variable */
-
-    if (s == NULL) {
-        c = 0xffffffffL;
-    } else {
-        c = crc;
-        if (n) do {
-            c = crc_32_tab[((int)c ^ (*s++)) & 0xff] ^ (c >> 8);
-        } while (--n);
-    }
-    crc = c;
-    return c ^ 0xffffffffL;       /* (instead of ~c for 64-bit machines) */
+    crc = crc32(crc, s, n);
 }

 /* Return a current CRC value.  */
 ulg
 getcrc ()
 {
-  return crc ^ 0xffffffffL;
+  return crc;
 }

 #ifdef IBM_Z_DFLTCC
EOF
patch < util.diff util.c

# create header file
cat > crc32.h << EOF
#pragma once

unsigned long  crc32(unsigned long crc, const unsigned char  *buf,
                            unsigned int len);
EOF

# copy crc32 object file from zlib
cp ../zlib-1.3/crc32.o .

# re-compile GNU Gzip
gcc -O2 -g -c util.c -Ilib
gcc -O2 -g *.o lib/libgzip.a -o gzip

# measure decompression speed
/usr/bin/time -v ./gzip -d < ../linux.tar.gz > linux.tar 2> ../gzip2.time

# print out time difference
cd ..
echo
echo "GNU Gzip unmodified"
grep Elapsed gzip1.time
echo "GNU Gzip with CRC32 from zlib"
grep Elapsed gzip2.time
```





Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Thu, 09 Nov 2023 18:34:02 GMT) Full text and rfc822 format available.

Message #8 received at 67022 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Young Mo Kang <kym327 <at> gmail.com>, 67022 <at> debbugs.gnu.org
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 9 Nov 2023 10:32:43 -0800
On 2023-11-09 09:40, Young Mo Kang wrote:
> Since both GNU Gzip and zlib are written by the same authors, I was 
> wondering if GNU Gzip can share zlib's CRC32 calculation and obtain this 
> performance gain--I am not sure if there would be a license issue though.

Shouldn't be a license issue. It's just a lack of time issue.




Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Thu, 14 Mar 2024 02:48:03 GMT) Full text and rfc822 format available.

Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: wrotycz <wrotycz <at> wir.pl>
To: bug-gzip <bug-gzip <at> gnu.org>
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 14 Mar 2024 02:14:54 +0100
[Message part 1 (text/plain, inline)]
I tried to implement this slice-by-8 CRC but couldn&#39;t do it the way I wanted - without bloated zlib tables and stuff. Maybe because I don&#39;t get what updcrc(), getcrc(), setcrc() are and what they actually do. The whole program is magical, there is no way to find do_de/compress(), nor anything like that, so it was difficult to actually find relevant code. But anyway, for a start I manged to make `makecrcs8.c&#39; to generate CRCs lookup table (attached).   Regards  wrotycz
[Message part 2 (text/html, inline)]
[makecrcs8.c (application/octet-stream, attachment)]

Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Thu, 14 Mar 2024 18:02:01 GMT) Full text and rfc822 format available.

Message #14 received at 67022 <at> debbugs.gnu.org (full text, mbox):

From: "Adler, Mark" <madler <at> alumni.caltech.edu>
To: wrotycz <wrotycz <at> wir.pl>
Cc: "67022 <at> debbugs.gnu.org" <67022 <at> debbugs.gnu.org>
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 14 Mar 2024 18:00:30 +0000
Mr. Wrotycz,

I’m not sure what you mean by “bloated”, but anyway, the current code in zlib for CRC-32 does not use slice-by-8. It uses braids, which is faster. If there were going to be a replacement for the software CRC in gzip, it should start there.

It could also finish there, simply by copying all the code. It is, after all, open source.

As for the “magical”, any sufficiently advanced technology is indistinguishable from magic.  :-)

Mark


> On Mar 13, 2024, at 6:14 PM, wrotycz <wrotycz <at> wir.pl> wrote:
> 
> I tried to implement this slice-by-8 CRC but couldn&#39;t do it the way I wanted - without bloated zlib tables and stuff. Maybe because I don&#39;t get what updcrc(), getcrc(), setcrc() are and what they actually do. The whole program is magical, there is no way to find do_de/compress(), nor anything like that, so it was difficult to actually find relevant code. But anyway, for a start I manged to make `makecrcs8.c&#39; to generate CRCs lookup table (attached).   Regards  wrotycz
> <makecrcs8.c>


Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Thu, 14 Mar 2024 19:49:01 GMT) Full text and rfc822 format available.

Message #17 received at 67022 <at> debbugs.gnu.org (full text, mbox):

From: "Adler, Mark" <madler <at> alumni.caltech.edu>
To: wrotycz <wrotycz <at> wir.pl>
Cc: "67022 <at> debbugs.gnu.org" <67022 <at> debbugs.gnu.org>
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 14 Mar 2024 19:48:00 +0000
> And, again, due to level of complexity in the code, I didn't check it. I have this quirk that I want to understand stuff I use, here I couldn't so I didn't follow.
> 
> I will check that braids thing when I find something about it.


The paper on braided CRCs is in the zlib distribution: doc/crc-doc.1.0.pdf .

I apologize for your nausea. I recommend Pepto Bismol.





Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Thu, 14 Mar 2024 22:59:02 GMT) Full text and rfc822 format available.

Message #20 received at 67022 <at> debbugs.gnu.org (full text, mbox):

From: "Adler, Mark" <madler <at> alumni.caltech.edu>
To: wrotycz <wrotycz <at> wir.pl>
Cc: "67022 <at> debbugs.gnu.org" <67022 <at> debbugs.gnu.org>
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 14 Mar 2024 22:57:23 +0000
On Mar 14, 2024, at 3:46 PM, wrotycz <wrotycz <at> not.really> wrote:
> 
> 
> Despite that the question is how do I use zlib crc32()? It doesn't give me correct result.
> 
> My 'rig' is this:
> 
> ~~~
> crc = -1
> while (buffer, length = read_data()):
>     {
>     crc = crcfunc(crc, buffer, length)
>     }
> crc = ~crc
> ~~~
> 
> This doesn't work with `crc32_z();’

You would need to read the documentation in zlib.h in order to be able to use its crc32() or crc32_z() correctly. From zlib.h:

   Usage example:

     uLong crc = crc32(0L, Z_NULL, 0);

     while (read_buffer(buffer, length) != EOF) {
       crc = crc32(crc, buffer, length);
     }
     if (crc != original_crc) error();

The initial value returned by crc32(0L, Z_NULL, 0) is 0, not -1. And there is no post inversion. To get your test code to work, it would need to be:

> crc = 0
> while (buffer, length = read_data()):
>     {
>     crc = crc32_z(crc, buffer, length)
>     }



Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Fri, 15 Mar 2024 05:10:01 GMT) Full text and rfc822 format available.

Message #23 received at submit <at> debbugs.gnu.org (full text, mbox):

From: wrotycz <wrotycz <at> wir.pl>
To: 67022 <at> debbugs.gnu.org <67022 <at> debbugs.gnu.org>,
 bug-gzip <bug-gzip <at> gnu.org>
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 14 Mar 2024 18:23:17 +0100
[Message part 1 (text/plain, inline)]
Re: bug#67022: Gzip decompression can be 60% faster using zlib&#39;s CRC32    I tried to make that CRC-slice-by-8 happen but, as I miss something, maybe you will get that right.  My guess the problem lies in all those negations of crc (`(return) crc = crc ^ 0xffffffff;&#39;).   I prepared diff that does that, in more, less simple way. `crc32_8b...()&#39; should be in util.c, and crc32.h should only contain lookup table generated by `sample/makecrc8.c&#39;, but until it works correctly it is in the header. Moving it to corret place is the least of the problem.   I put the diff in paste as it&#39;s bit big to attach to email.   ~~~ bash  wget -Ogzip-1.13-crc328-2.diff  paste.ee https://paste.ee/d/mkqTU/0  cd gzip-1.13  patch -p1 ../gzip-1.13-crc328-2.diff  ~~~   Hope that will help to make it happen.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Fri, 15 Mar 2024 05:10:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Fri, 15 Mar 2024 05:10:04 GMT) Full text and rfc822 format available.

Message #29 received at 67022 <at> debbugs.gnu.org (full text, mbox):

From: wrotycz <wrotycz <at> wir.pl>
To: Adler, Mark <madler <at> alumni.caltech.edu>
Cc: 67022 <at> debbugs.gnu.org <67022 <at> debbugs.gnu.org>
Subject: Re: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 14 Mar 2024 19:56:41 +0100
[Message part 1 (text/plain, inline)]
&gt; Mr. Wrotycz,   Mr Adler, let&#39;s spare the formalities. That&#39;s a nickname.   &gt; &gt; On Mar 13, 2024, at 6:14 PM, wrotycz &lt; wrotycz@#$%...   I know it&#39;s a mail list but I would appreciate to not broadcast my email to every (spam)bot on the web. Archive is publicly available and easily searchable by boys, so next time don&#39;t quote someone&#39;s email in the message.   &gt; I’m not sure what you mean by “bloated”   I mean 600[kB] of crc32.h. I know, there is a lot of conditionals (compilation)  that make it much smaller in  resulting object, but the sheer size of it makes me nausea. I guess, there is only one person that can understand flow of conditional directives in it. And crc32_z() is even more complicated in this regard.   To illustrate the difference between code of zlib and, what I consider comprehensible, compare code of xxhash and xxhash-clean.  I hope you understand.  &gt; for CRC-32 does not use slice-by-8. It uses braids   I noticed the lookup table is &#39;upside down&#39; but didn&#39;t know that thing.  And, again, due to level of complexity in the code, I didn&#39;t check it. I have this quirk that I want to understand stuff I use, here I couldn&#39;t so I didn&#39;t follow.   I will check that braids thing when I find something about it.   Regards
[Message part 2 (text/html, inline)]

Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Fri, 15 Mar 2024 05:10:04 GMT) Full text and rfc822 format available.

Message #32 received at 67022 <at> debbugs.gnu.org (full text, mbox):

From: wrotycz <wrotycz <at> wir.pl>
To: Adler, Mark <madler <at> alumni.caltech.edu>
Cc: 67022 <at> debbugs.gnu.org <67022 <at> debbugs.gnu.org>
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 14 Mar 2024 23:46:14 +0100
[Message part 1 (text/plain, inline)]
&gt; The paper on braided CRCs is in the zlib distribution: doc/crc-doc.1.0.pdf   Thanks for a tip, I read that and now it&#39;s bit clearer to me.    Maybe it is slightly faster, but I wouldn&#39;t bet dollars against nuts it&#39;s exact in my case as 7zip-crc also uses slice-by-8 algorithm and is actually faster.  Despite that the question is how do I use zlib crc32()? It doesn&#39;t give me correct result.   My &#39;rig&#39; is this:   ~~~  crc = -1  while (buffer, length = read_data()):      {      crc = crcfunc(crc, buffer, length)      }  crc = ~crc  ~~~   This doesn&#39;t work with `crc32_z();&#39;
[Message part 2 (text/html, inline)]

Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Fri, 15 Mar 2024 05:10:05 GMT) Full text and rfc822 format available.

Message #35 received at 67022 <at> debbugs.gnu.org (full text, mbox):

From: wrotycz <wrotycz <at> wir.pl>
To: Adler, Mark <madler <at> alumni.caltech.edu>
Cc: 67022 <at> debbugs.gnu.org <67022 <at> debbugs.gnu.org>
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Fri, 15 Mar 2024 00:26:28 +0100
[Message part 1 (text/plain, inline)]
&gt; The initial value returned by crc32(0L, Z_NULL, 0) is 0, not -1. And there is no post inversion   That was it, thanks.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gzip <at> gnu.org:
bug#67022; Package gzip. (Sat, 16 Mar 2024 13:33:02 GMT) Full text and rfc822 format available.

Message #38 received at 67022 <at> debbugs.gnu.org (full text, mbox):

From: wrotycz <wrotycz <at> wir.pl>
To: Adler <at> debbugs.gnu.org <Adler <at> debbugs.gnu.org>,
 Mark <madler <at> alumni.caltech.edu>
Cc: 67022 <at> debbugs.gnu.org <67022 <at> debbugs.gnu.org>
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Sat, 16 Mar 2024 14:31:39 +0100
[Message part 1 (text/plain, inline)]
&gt; current code in zlib for CRC-32 does not use slice-by-8. It uses braids, which is faster.    It turns out not to be that fancy in gzip&#39;s harness.  It seems to be tiny bit faster than slice-by-8 during decompression but slower during compression; to the point of being on par with current Sarwate algorithm, or tiny bit faster (  paste.ee paste.ee/p/QeBKL  ).    Benefits are not that unambiguous. Though I didn&#39;t make that slicing-by-8 (  paste.ee paste.ee/d/mkqTU/0  ) work with gzip properly, I would consider it in a context of application itself. If braids are faster with decompression, not much though, and slower with compression, then it would make sense to weight it somehow in this context and asses fitness for this particular application.    Either way, whichever algorithm would be chosen it still would make sense to generate crc table dynamically, when program starts. It takes, on my turtle computer, less than 10 us. Even rolled version. Unrolled isn&#39;t much faster either (10%-20%).  Using already generated &#39;Sarwate&#39;s&#39; table  as a basis, halves that time.  That&#39;s imperceptible differences. None is going to notice that, even with atomic clock.
[Message part 2 (text/html, inline)]

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Tue, 11 Feb 2025 07:47:02 GMT) Full text and rfc822 format available.

Notification sent to Young Mo Kang <kym327 <at> gmail.com>:
bug acknowledged by developer. (Tue, 11 Feb 2025 07:47:02 GMT) Full text and rfc822 format available.

Message #43 received at 67022-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Young Mo Kang <kym327 <at> gmail.com>
Cc: 67022-done <at> debbugs.gnu.org
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Mon, 10 Feb 2025 23:46:40 -0800
On 2023-11-09 10:32, Paul Eggert wrote:
> On 2023-11-09 09:40, Young Mo Kang wrote:
>> Since both GNU Gzip and zlib are written by the same authors, I was 
>> wondering if GNU Gzip can share zlib's CRC32 calculation and obtain 
>> this performance gain--I am not sure if there would be a license issue 
>> though.
> 
> Shouldn't be a license issue. It's just a lack of time issue.

Due to work by Sam Russell and others it looks like gzip now has faster 
CRC32 code on Savannah master, so closing this old bug report. See:

https://bugs.gnu.org/74927
https://bugs.gnu.org/74192




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 11 Mar 2025 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 2 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.