Paul Eggert <eggert@HIDDEN>
to control <at> debbugs.gnu.org
.
Full text available.Received: (at submit) by debbugs.gnu.org; 16 Jan 2014 23:51:25 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Thu Jan 16 18:51:25 2014 Received: from localhost ([127.0.0.1]:54196 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1W3wia-0003Qf-Ej for submit <at> debbugs.gnu.org; Thu, 16 Jan 2014 18:51:24 -0500 Received: from eggs.gnu.org ([208.118.235.92]:48647) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <dudaj@HIDDEN>) id 1W3wfB-0003GQ-Iy for submit <at> debbugs.gnu.org; Thu, 16 Jan 2014 18:47:54 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <dudaj@HIDDEN>) id 1W3wf6-0006hF-OK for submit <at> debbugs.gnu.org; Thu, 16 Jan 2014 18:47:53 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:58383) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <dudaj@HIDDEN>) id 1W3wf6-0006hB-Kv for submit <at> debbugs.gnu.org; Thu, 16 Jan 2014 18:47:48 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50339) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <dudaj@HIDDEN>) id 1W3wf2-0004dc-6w for bug-gzip@HIDDEN; Thu, 16 Jan 2014 18:47:48 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <dudaj@HIDDEN>) id 1W3wex-0006gJ-UF for bug-gzip@HIDDEN; Thu, 16 Jan 2014 18:47:44 -0500 Received: from mailhub129.itcs.purdue.edu ([128.210.5.129]:53790) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <dudaj@HIDDEN>) id 1W3wex-0006g5-OS for bug-gzip@HIDDEN; Thu, 16 Jan 2014 18:47:39 -0500 Received: from [10.184.172.48] (pal-nat184-172-048.itap.purdue.edu [10.184.172.48]) (authenticated bits=0) by mailhub129.itcs.purdue.edu (8.14.4/8.14.4/mta-auth.smtp.purdue.edu) with ESMTP id s0GNlcWW032218 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT) for <bug-gzip@HIDDEN>; Thu, 16 Jan 2014 18:47:38 -0500 Message-ID: <52D86F98.1030301@HIDDEN> Date: Thu, 16 Jan 2014 18:47:36 -0500 From: =?UTF-8?B?SmFyb3PFgmF3IER1ZGE=?= <dudaj@HIDDEN> User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: bug-gzip@HIDDEN Subject: New entropy coding: faster than Huffman, compression rate like arithmetic Content-Type: text/plain; charset=UTF-8; format=flowed X-PMX-Version: 6.0.2.2308539 X-PerlMx-Virus-Scanned: Yes Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by mailhub129.itcs.purdue.edu id s0GNlcWW032218 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Thu, 16 Jan 2014 18:51:22 -0500 X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <http://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <http://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <http://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -5.0 (-----) Hallo, Since I haven't found gzip development mailing list, I am sending it here. There is a new approach to entropy coding: asymmetric numeral systems.=20 It has some connection with arithmetic coding, but is simpler: uses a=20 single natural number as the state, instead of two to represent the=20 range. Thanks of it, we can put the entire behavior for given large=20 alphabet probability distribution into a relatively small table: a few=20 kilobytes for 256 size alphabet. This way it turns out about 50% faster than Huffman decoding, still=20 providing accuracy (compression rate) like arithmetic coding. Here is some implementation: https://github.com/Cyan4973/FiniteStateEntro= py Just replacing Huffman with it, we get improvement of both speed and=20 compression rate, like recently in Zhuff of Yann Collet:=20 http://fastcompression.blogspot.fr/2013/12/zhuff-v09-or-first-fse-applica= tion.html I thought it could also improve gzip DEFLATE in the same way? Best, Jarek -- dr Jaros=C5=82aw Duda Center for Science of Information, Purdue University, USA http://www.soihub.org/people.php?id=3D484
Jarosław Duda <dudaj@HIDDEN>
:bug-gzip@HIDDEN
.
Full text available.bug-gzip@HIDDEN
:bug#16471
; Package gzip
.
Full text available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.