GNU bug report logs - #30935
gzip -l reports wrong size for decompressed files larger than 4GB

Please note: This is a static page, with minimal formatting, updated once a day.

Package: gzip; Reported by: Wolfgang Formann <wformann@HIDDEN>; merged with #17804, #29089, #30936, #38766, #42965, #48424, #52227; dated Sun, 25 Mar 2018 13:31:03 UTC; Maintainer for gzip is bug-gzip@HIDDEN.
Merged 17804 29089 30935 30936 38766 42965 48424 52227. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 30935 <at> debbugs.gnu.org:


In-Reply-To: <8fce72e5-031b-669d-0321-558d80802a97@HIDDEN>
References: <f0998c66-c94b-855a-c97d-fd35b428488d@HIDDEN>
 <DA50D0E4-9080-407B-AE56-5EACB8FFE3B4@HIDDEN>
 <290b7a77-d236-61c5-aefe-63d5c2dcb4b5@HIDDEN>
 <8fce72e5-031b-669d-0321-558d80802a97@HIDDEN>
From: Jim Meyering <jim@HIDDEN>
Date: Sun, 25 Mar 2018 18:49:46 -0700
Message-ID: <CA+8g5KHxJ6xEoTYcme99_F68qLAtbt9H==cuTDKZB6Zbebc1gw@HIDDEN>
Subject: Re: bug#30935: gzip -l reports wrong size for decompressed files
 larger than 4GB
To: Paul Eggert <eggert@HIDDEN>
Cc: Wolfgang Formann <wformann@HIDDEN>, 30935 <at> debbugs.gnu.org

tags 30935 notabug
close 30935
stop

On Sun, Mar 25, 2018 at 6:36 PM, Paul Eggert <eggert@HIDDEN> wrote:
> Wolfgang Formann wrote:
>>
>> I accept that problem. I would be happy if a statement similar to yours
>> were in the man page of gzip.
>
> It already is in the gzip manual, which is the main source of detailed info
> like that.

Marking this "issue" as closed in our bug tracker.




Information forwarded to bug-gzip@HIDDEN:
bug#30935; Package gzip. Full text available.

Message received at 30935 <at> debbugs.gnu.org:


Subject: Re: bug#30935: gzip -l reports wrong size for decompressed files
 larger than 4GB
To: Wolfgang Formann <wformann@HIDDEN>, 30935 <at> debbugs.gnu.org
References: <f0998c66-c94b-855a-c97d-fd35b428488d@HIDDEN>
 <DA50D0E4-9080-407B-AE56-5EACB8FFE3B4@HIDDEN>
 <290b7a77-d236-61c5-aefe-63d5c2dcb4b5@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <8fce72e5-031b-669d-0321-558d80802a97@HIDDEN>
Date: Sun, 25 Mar 2018 18:36:34 -0700
In-Reply-To: <290b7a77-d236-61c5-aefe-63d5c2dcb4b5@HIDDEN>

Wolfgang Formann wrote:
> I accept that problem. I would be happy if a statement similar to yours
> were in the man page of gzip.

It already is in the gzip manual, which is the main source of detailed info like 
that.




Information forwarded to bug-gzip@HIDDEN:
bug#30935; Package gzip. Full text available.

Message received at submit <at> debbugs.gnu.org:


From: Wolfgang Formann <wformann@HIDDEN>
Subject: Re: bug#30935: gzip -l reports wrong size for decompressed files
 larger than 4GB
To: bug-gzip@HIDDEN
References: <f0998c66-c94b-855a-c97d-fd35b428488d@HIDDEN>
 <DA50D0E4-9080-407B-AE56-5EACB8FFE3B4@HIDDEN>
Message-ID: <290b7a77-d236-61c5-aefe-63d5c2dcb4b5@HIDDEN>
Date: Sun, 25 Mar 2018 23:25:42 +0200
In-Reply-To: <DA50D0E4-9080-407B-AE56-5EACB8FFE3B4@HIDDEN>

Mark,

I accept that problem. I would be happy if a statement similar to yours were in the man page of gzip.

Wolfgang

Mark Adler schrieb:
> Wolfgang,
>
> The gzip format stores only the low 32 bits of the uncompressed length as the last four bytes of the stream, so it is not possible to show the correct number. At least not without decompressing the whole thing.
>
> There are two other ways that the displayed uncompressed size can be incorrect, even for small files. Those are if a) there is more than one gzip member in the gzip stream, in which case only the uncompressed size of the last member will be shown, or b) if there are junk bytes after the end of the gzip stream, in which case the junk will be shown as the length.
>
> In short, the reported length is informational at best, and should not be trusted if the information is important. The purpose of the length modulo 2^32 being in the trailer is as an additional integrity check along with the CRC. However it was also used for gzip -l, which was perhaps a mistake.
>
> You can get the actual decompressed length only by decompressing, and discarding the uncompressed data if you only want the length. You can either:
>
>     gzip -dc file.gz | wc -c
>
> or:
>
>     pigz -lt file.gz
>
> The latter will report the members of the gzip stream separately.
>
> Mark
>
>
>> On Mar 25, 2018, at 1:42 AM, Wolfgang Formann <wformann@HIDDEN> wrote:
>>
>> Hello!
>>
>> I am using gzip 1.6 from openSUSE Leap 42.3 with latest patches
>>
>> $ file /usr/bin/gzip
>> /usr/bin/gzip: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.0.0, BuildID[sha1]=7103d56e17e6f81a52db927e393dce601c3af0e1, stripped
>>
>> There is a compressed file available at https://data.dnb.de/opendata/GND.rdf.gz which has a size of 1.232.465.678 bytes. Uncompressed it will have a size of 19.465.374.298
>>
>> The problem is:
>> $ gzip -l GND.rdf.gz
>>         compressed        uncompressed  ratio uncompressed_name
>>         1232465678          2285505114  46.1% GND.rdf
>>
>> This number 2285505114 is actually the lower 32 bits of the real size 19GB.
>> $ echo "19465374298-16*1024*1024*1024" | bc
>> 2285505114
>>
>> Such a behaviour is okay for 32-bit software, 64-bit should show correct numbers.
>>
>> Thanks
>> Wolfgang
>>
>>
>>
>>
>
>
>
>
>





Information forwarded to bug-gzip@HIDDEN:
bug#30935; Package gzip. Full text available.

Message received at 30935 <at> debbugs.gnu.org:


Subject: Re: bug#30935: gzip -l reports wrong size for decompressed files
 larger than 4GB
From: Mark Adler <madler@HIDDEN>
In-Reply-To: <f0998c66-c94b-855a-c97d-fd35b428488d@HIDDEN>
Date: Sun, 25 Mar 2018 14:05:52 -0700
Message-Id: <DA50D0E4-9080-407B-AE56-5EACB8FFE3B4@HIDDEN>
References: <f0998c66-c94b-855a-c97d-fd35b428488d@HIDDEN>
To: Wolfgang Formann <wformann@HIDDEN>
Cc: 30935 <at> debbugs.gnu.org

Wolfgang,

The gzip format stores only the low 32 bits of the uncompressed length as the last four bytes of the stream, so it is not possible to show the correct number. At least not without decompressing the whole thing.

There are two other ways that the displayed uncompressed size can be incorrect, even for small files. Those are if a) there is more than one gzip member in the gzip stream, in which case only the uncompressed size of the last member will be shown, or b) if there are junk bytes after the end of the gzip stream, in which case the junk will be shown as the length.

In short, the reported length is informational at best, and should not be trusted if the information is important. The purpose of the length modulo 2^32 being in the trailer is as an additional integrity check along with the CRC. However it was also used for gzip -l, which was perhaps a mistake.

You can get the actual decompressed length only by decompressing, and discarding the uncompressed data if you only want the length. You can either:

    gzip -dc file.gz | wc -c

or:

    pigz -lt file.gz

The latter will report the members of the gzip stream separately.

Mark
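
As an illustration of the message above (not part of the original email), here is a minimal Python sketch that compares the length field in the trailer with the length obtained by decompressing. It uses only the standard library. The helper names `trailer_isize`, `actual_size` and `demo`, and the small two-member test file, are hypothetical and chosen for this example; a two-member file is used so the mismatch shows up without needing a file larger than 4GB.

```python
import gzip
import os
import struct
import tempfile


def trailer_isize(path):
    """Return the little-endian 32-bit length stored in the file's last four bytes."""
    with open(path, "rb") as f:
        f.seek(-4, os.SEEK_END)
        return struct.unpack("<I", f.read(4))[0]


def actual_size(path, chunk_size=1 << 20):
    """Decompress the whole stream, discard the data, and count the bytes."""
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def demo():
    """Build a small two-member gzip file (hypothetical test data) and compare."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "two-members.gz")
        with open(path, "wb") as f:
            f.write(gzip.compress(b"a" * 1000))  # first member: 1000 bytes
            f.write(gzip.compress(b"b" * 10))    # second member: 10 bytes
        return trailer_isize(path), actual_size(path)


if __name__ == "__main__":
    stored, real = demo()
    print("trailer field:", stored)
    print("actual length:", real)
```

For this test file the trailer field holds 10, the length of the last member only, while decompressing yields 1010 bytes. This is case a) from the message; the same comparison would expose the modulo 2^32 truncation on a sufficiently large file.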


> On Mar 25, 2018, at 1:42 AM, Wolfgang Formann <wformann@HIDDEN> wrote:
>
> Hello!
>
> I am using gzip 1.6 from openSUSE Leap 42.3 with latest patches
>
> $ file /usr/bin/gzip
> /usr/bin/gzip: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.0.0, BuildID[sha1]=7103d56e17e6f81a52db927e393dce601c3af0e1, stripped
>
> There is a compressed file available at https://data.dnb.de/opendata/GND.rdf.gz which has a size of 1.232.465.678 bytes. Uncompressed it will have a size of 19.465.374.298
>
> The problem is:
> $ gzip -l GND.rdf.gz
>         compressed        uncompressed  ratio uncompressed_name
>         1232465678          2285505114  46.1% GND.rdf
>
> This number 2285505114 is actually the lower 32 bits of the real size 19GB.
> $ echo "19465374298-16*1024*1024*1024" | bc
> 2285505114
>
> Such a behaviour is okay for 32-bit software, 64-bit should show correct numbers.
>
> Thanks
> Wolfgang





Information forwarded to bug-gzip@HIDDEN:
bug#30935; Package gzip. Full text available.

Message received at submit <at> debbugs.gnu.org:


To: bug-gzip@HIDDEN
From: Wolfgang Formann <wformann@HIDDEN>
Subject: gzip -l reports wrong size for decompressed files larger than 4GB
Message-ID: <f0998c66-c94b-855a-c97d-fd35b428488d@HIDDEN>
Date: Sun, 25 Mar 2018 10:42:42 +0200

Hello!

I am using gzip 1.6 from openSUSE Leap 42.3 with latest patches

$ file /usr/bin/gzip
/usr/bin/gzip: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.0.0, BuildID[sha1]=7103d56e17e6f81a52db927e393dce601c3af0e1, stripped

There is a compressed file available at https://data.dnb.de/opendata/GND.rdf.gz which has a size of 1.232.465.678 bytes. Uncompressed it will have a size of 19.465.374.298

The problem is:
$ gzip -l GND.rdf.gz
          compressed        uncompressed  ratio uncompressed_name
          1232465678          2285505114  46.1% GND.rdf

This number 2285505114 is actually the lower 32 bits of the real size 19GB.
$ echo "19465374298-16*1024*1024*1024" | bc
2285505114
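
The `bc` check above amounts to reducing the real size modulo 2^32. As an illustration (not part of the original report), the same arithmetic can be sketched in Python; the variable names are chosen for this example only.

```python
# Sizes taken from the report above.
real_size = 19_465_374_298   # true uncompressed size in bytes
field_modulus = 2 ** 32      # a 32-bit length field wraps at this value

# divmod gives how many times the 32-bit counter wrapped and what remains.
wraps, stored = divmod(real_size, field_modulus)
print(wraps)   # number of times the counter wrapped
print(stored)  # value left in the 32-bit field
```

The counter wraps 4 times, and 4 * 2^32 equals the 16*1024*1024*1024 subtracted in the `bc` command, so the remainder is the 2285505114 that `gzip -l` printed.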

Such a behaviour is okay for 32-bit software, 64-bit should show correct numbers.

Thanks
Wolfgang





Acknowledgement sent to Wolfgang Formann <wformann@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gzip@HIDDEN. Full text available.
Report forwarded to bug-gzip@HIDDEN:
bug#30935; Package gzip. Full text available.
Last modified: Wed, 1 Dec 2021 23:45:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.