GNU bug report logs - #34110
du: add dual-column showing apparent-size and disk-size


Package: coreutils; Severity: wishlist; Reported by: René J.V. Bertin <rjvbertin@HIDDEN>; dated Wed, 16 Jan 2019 22:04:02 UTC; Maintainer for coreutils is bug-coreutils@HIDDEN.
Changed bug title to 'du: add dual-column showing apparent-size and disk-size' from 'feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)' Request was from Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 34110 <at> debbugs.gnu.org:


Received: (at 34110) by debbugs.gnu.org; 18 Jan 2019 06:43:50 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Jan 18 01:43:50 2019
Received: from localhost ([127.0.0.1]:35571 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gkNsc-00023P-85
	for submit <at> debbugs.gnu.org; Fri, 18 Jan 2019 01:43:50 -0500
Received: from mail-pl1-f173.google.com ([209.85.214.173]:33106)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <assafgordon@HIDDEN>)
 id 1gkNsa-000233-J7; Fri, 18 Jan 2019 01:43:48 -0500
Received: by mail-pl1-f173.google.com with SMTP id z23so5931457plo.0;
 Thu, 17 Jan 2019 22:43:48 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:cc:references:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-language:content-transfer-encoding;
 bh=/uktsIINYwdSMYx+/bLsDOJf/DysDI4tvUtq/KPwmpE=;
 b=rtVD2xmcVG9a7J+BOlr+5FbncStTQtmzPr70cwu9UEze8IGB3XO+RmwfJq/+j5nFFs
 agL2Bb/2VA6m3mFbHBh7ch1PuMTffUZZ9fRMvsVHgN/2+iNZK0bFe7nLi3wa6H+0k81H
 aNgxAsOt4XP9SAy2F7fwRWlzxuRr3fp6klow4pqa+2ydGXIoQsqfRvyGqUHVy3xXJhMb
 Esee3MVoxmn0uEt6lgyDQd3pdMQoyj1knbK0IvtvWXjVcJywA7uscENgWx2Iu34gfUQc
 Pn0cFSzkJJLw5k2EBboUdx+XfHGa/2EZ9sk1RHsaoFbK287z6AP8Mzhk+8fxNDftz1ES
 pyQQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:cc:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-language
 :content-transfer-encoding;
 bh=/uktsIINYwdSMYx+/bLsDOJf/DysDI4tvUtq/KPwmpE=;
 b=W6hKP0nTsgXehMhOLeUWTZ4t01KDIS3zjatIDehGlT5BhEdBOdhzzUQSwmPoBFA8aI
 7JoJ9BDfNdwsgbMIaPlNIXVMX6uGo1ft2qZVrKkmnqyGKKmjIzn9ZbAS5XIZCy6wViN/
 f5s6s+4A2ti9cRmI/zXlQgSqD5rhqRnqR/8UiTrf5kR1Ev+8m2fFwzPRpp1ZM5TeYmR0
 SbU8Rz94cobcLMnSRSeX7kZzsUUpe7k7/Ci918kipiWFoAYKVzpAIdIU/GjkiS7G5tMV
 LP057NplkPb+FB0pOeMU2A6zK7JEDjbczWcu9ucQZpwlZ4JUVX74aHiX2ysspR10XO8z
 NtCA==
X-Gm-Message-State: AJcUukff0wVa30Z0KBix5yA8b3gB+RHUd1s6TQdBpCvffM/asljhFU1X
 62or5yUZCJ1vUFONaIfMf0e8Izd6
X-Google-Smtp-Source: ALg8bN7Qp3TE2RLAt/ltNCByfWKg6HpixBRHm+7MXpm90UHDc8UY4OURlRjwPm+ZIH93bRZRJEMm7w==
X-Received: by 2002:a17:902:bb98:: with SMTP id
 m24mr17573949pls.71.1547793822269; 
 Thu, 17 Jan 2019 22:43:42 -0800 (PST)
Received: from tomato.housegordon.com (moose.housegordon.com. [184.68.105.38])
 by smtp.googlemail.com with ESMTPSA id
 z127sm7241060pfb.80.2019.01.17.22.43.40
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 17 Jan 2019 22:43:40 -0800 (PST)
Subject: Re: bug#34110: feature request: dual-column du output, showing "real"
 and "on-disk" sizes (and about that "apparent-size" concept)
To: =?UTF-8?B?UmVuw6kgSi5WLiBCZXJ0aW4=?= <rjvbertin@HIDDEN>
References: <1689402.NmkMBi2P6V@bola>
 <0f86a299-a979-95c0-3341-d7f8aec63351@HIDDEN> <2667700.xSuBLWJ7V7@bola>
From: Assaf Gordon <assafgordon@HIDDEN>
Message-ID: <1fa3067a-c864-8021-8585-6a1ebb2845c4@HIDDEN>
Date: Thu, 17 Jan 2019 23:43:39 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <2667700.xSuBLWJ7V7@bola>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 34110
Cc: 34110 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

severity 34110 wishlist
retitle 34110 du: add dual-column showing apparent-size and disk-size
stop

Hello,

On 2019-01-17 3:13 a.m., René J.V. Bertin wrote:
> On Wednesday January 16 2019 16:06:50 Assaf Gordon wrote:
> 
>> I hope this helps to clarify "apparent-size".
> 
> Yes and no :) I understand what "apparent-size" does [....] 
> My whole point is that there might be a better name. 

The parameter name "--apparent-size" is not likely to be changed.
It has been named so for about 16 years (since 'fileutils 4.5.8'
which is even before 'coreutils' was created as a unified package).

Changing it would break existing scripts and user expectations.

> I realise that you cannot really call the content size observable "real size" when reporting from a disk-usage viewpoint, but "content size" (--content-size, -C) should be clear enough?

Creating a second alias to "--apparent-size" is possible, but I'm not
sure it's warranted.

---

I think the discussion about "--apparent-size" is mostly concluded,
but the idea to have two-columns is an interesting feature request.

I'm marking this as a "wish list" item.
Concrete patches are welcome.

regards,
  - assaf








Information forwarded to bug-coreutils@HIDDEN:
bug#34110; Package coreutils. Full text available.

Message received at 34110 <at> debbugs.gnu.org:


Received: (at 34110) by debbugs.gnu.org; 17 Jan 2019 10:13:33 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Jan 17 05:13:33 2019
Received: from localhost ([127.0.0.1]:34175 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gk4g1-0007zF-AX
	for submit <at> debbugs.gnu.org; Thu, 17 Jan 2019 05:13:33 -0500
Received: from mail-wr1-f51.google.com ([209.85.221.51]:35601)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rjvbertin@HIDDEN>) id 1gk4g0-0007z3-50
 for 34110 <at> debbugs.gnu.org; Thu, 17 Jan 2019 05:13:32 -0500
Received: by mail-wr1-f51.google.com with SMTP id 96so10339913wrb.2
 for <34110 <at> debbugs.gnu.org>; Thu, 17 Jan 2019 02:13:32 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:user-agent:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=JVNokNSG4xqPdSt1kY3vdAvysOPuwkJonovCJipC5Hc=;
 b=V7WlOCwmSAmDVJqK/qzcqdxVMRg2KbnLlFwgheX5j81iBDDYZWipP6cr6T66ykWEki
 iZxr/tMmUVL58hdHv2K/7KHBLwcl9Wt2zMbDEvoB2MnkTaAGysd9jj6Z8A/HfClh4Edf
 Q7uaGZVfZFQUI0q8aZv5LAXj5FKUZZCXCG0IQlQ5LwQZwNGb+OxatLCVQtnGhHJVEVTz
 LQCAIPkujbbTxpe/xNwzrPlICHL3Vc0IJsxs3phn0AKywAlUDgpQKR+U1pBWA7Upb55f
 K8wmusnwSpjzxZywdj+8Y7kp9bATaCzr4rzBPtsjqonMpAh3qUMU/lmZatvhVBgzNORE
 S9gg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:user-agent
 :in-reply-to:references:mime-version:content-transfer-encoding;
 bh=JVNokNSG4xqPdSt1kY3vdAvysOPuwkJonovCJipC5Hc=;
 b=SjPi3W84uSngzZggfwpzAVnbjJIS3K0A8Qxh7Z32X6At90pjx84wAs8R4jMnfP4vhJ
 Lmd/Pm9w0t0IsjGGjCQo//LCyymMszNQC4tV13gIgV3HESrs2uVkyYLfb6/+x9myenlu
 6P/c495gwZK4oPljDZYoZKKmsONZWR8xGpc/kCgsXp2qGOeW3RLdgJ0N+EzQHV74Da2M
 K8c5Z/z6GOhN94YAmvnxhRVUV1KdxPUHefSHc64X0qr7uK5qTO5pVVuZ83aK25nwdINW
 pWuwWniVC9vUjWlLo0Le/M1uYfctVaHdFGCVHxmQyNQ/8nmjSZCSPgqEXNSpeJpzSi4k
 h14Q==
X-Gm-Message-State: AJcUukeRnSfn+fRZCM83Upot3gXqlKhsy5XZpmqetbfX7cbNtf/oMSLb
 IZWi9aJ19sjMp0RhiSuPF9g=
X-Google-Smtp-Source: ALg8bN7oCdPUUoxQNfRfYPKGZgvW1TXsjTKDWXwgCBQIS3XsiDPWKY9phKO4EJiozFD4JeLtsYLF9Q==
X-Received: by 2002:a5d:6889:: with SMTP id h9mr11164830wru.222.1547720006288; 
 Thu, 17 Jan 2019 02:13:26 -0800 (PST)
Received: from bola.localnet
 (2a01cb0c84dee6009bda76eb03bc33f7.ipv6.abo.wanadoo.fr.
 [2a01:cb0c:84de:e600:9bda:76eb:3bc:33f7])
 by smtp.googlemail.com with ESMTPSA id
 l20sm165792622wrb.93.2019.01.17.02.13.25
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 17 Jan 2019 02:13:25 -0800 (PST)
From: =?ISO-8859-1?Q?Ren=E9_J=2EV=2E?= Bertin <rjvbertin@HIDDEN>
To: Assaf Gordon <assafgordon@HIDDEN>
Subject: Re: bug#34110: feature request: dual-column du output,
 showing "real" and "on-disk" sizes (and about that "apparent-size"
 concept)
Date: Thu, 17 Jan 2019 11:13:11 +0100
Message-ID: <2667700.xSuBLWJ7V7@bola>
User-Agent: KMail/4.13.3 (Linux/4.14.23-ck1-mainline-core2-rjvb; KDE/4.14.38;
 x86_64; ; )
In-Reply-To: <0f86a299-a979-95c0-3341-d7f8aec63351@HIDDEN>
References: <1689402.NmkMBi2P6V@bola>
 <0f86a299-a979-95c0-3341-d7f8aec63351@HIDDEN>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 34110
Cc: 34110 <at> debbugs.gnu.org
X-Spam-Score: -1.0 (-)

On Wednesday January 16 2019 16:06:50 Assaf Gordon wrote:

Hello,

Yes, I used the exact same directory in all comparisons. It's a nodejs cache (or whatever) directory, as you may have guessed; I picked it because it's a good example of the sort of directory found these days that can create considerable overhead: small enough that it tends to get dismissed as insignificant, but containing a large number of files (almost 8000 in my case), most of them tiny.

>I hope this helps to clarify "apparent-size".

Yes and no :) I understand what "apparent-size" does (and have dug through the code looking for ideas how to do similar things in one of my own apps).

My whole point is that there might be a better name. I know one should distinguish between everyday language and technical terms, but if the latter start to appear (pun intended) like the former (and lack a shorthand) then they'd best be chosen such that they don't require thinking about their interpretation.

Paul's comment about not being able to know what happens underneath only makes this argument stronger IMHO. On the one hand, du can only report how big an item would appear to be on disk (based on what stat() reports). In addition, how would it handle knowledge about the number of disks that a given file is written to? On the other hand, the actual content size is a given that shouldn't change and that is not subject to any existential questions. (Though as my examples show, this isn't necessarily true when du'ing directories, and especially so for HFS+ with compression.)

I realise that you cannot really call the content size observable "real size" when reporting from a disk-usage viewpoint, but "content size" (--content-size, -C) should be clear enough? "Estimated on-disk size" would be good enough as a header for the other observable (an estimate can be 100% accurate after all).

Cheers,
R.




Information forwarded to bug-coreutils@HIDDEN:
bug#34110; Package coreutils. Full text available.

Message received at 34110 <at> debbugs.gnu.org:


Received: (at 34110) by debbugs.gnu.org; 17 Jan 2019 00:28:22 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Jan 16 19:28:22 2019
Received: from localhost ([127.0.0.1]:34024 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gjvXi-0002Nc-0d
	for submit <at> debbugs.gnu.org; Wed, 16 Jan 2019 19:28:22 -0500
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:38244)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eggert@HIDDEN>) id 1gjvXh-0002NP-0u
 for 34110 <at> debbugs.gnu.org; Wed, 16 Jan 2019 19:28:21 -0500
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 334B6161027;
 Wed, 16 Jan 2019 16:28:15 -0800 (PST)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id GVYX2rQdbbfv; Wed, 16 Jan 2019 16:28:14 -0800 (PST)
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 79C0D161045;
 Wed, 16 Jan 2019 16:28:14 -0800 (PST)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id qla9z_2m3lxT; Wed, 16 Jan 2019 16:28:14 -0800 (PST)
Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200])
 by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 6077F160FF4;
 Wed, 16 Jan 2019 16:28:14 -0800 (PST)
Subject: Re: bug#34110: feature request: dual-column du output, showing "real"
 and "on-disk" sizes (and about that "apparent-size" concept)
To: =?UTF-8?B?UmVuw6kgSi5WLiBCZXJ0aW4=?= <rjvbertin@HIDDEN>,
 34110 <at> debbugs.gnu.org
References: <1689402.NmkMBi2P6V@bola>
From: Paul Eggert <eggert@HIDDEN>
Openpgp: preference=signencrypt
Organization: UCLA Computer Science Department
Message-ID: <08ff12cb-42a1-949e-9a52-09865cf06774@HIDDEN>
Date: Wed, 16 Jan 2019 16:28:14 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <1689402.NmkMBi2P6V@bola>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 34110
X-Spam-Score: -3.3 (---)

I like the idea of two columns at once.

> with "--apparent-size", du returns the actual file size; without, it returns how large the file appears to be (judging from its disk footprint).

The "apparent" size is the size that "ls -l" outputs, and is the size 
that traditional I/O operations like 'read' and 'write' deal with, 
regardless of the underlying implementation (where the size might be 
smaller or larger than the "apparent" size). In contrast the "disk 
usage" size is whatever the filesystem tells us it is. I wouldn't call 
either size the "actual" size these days, as even the disk usage (or 
"disk footprint") might be virtual blocks stored in a lower-level 
compressed device, and there's no way "du" can find out how much of the 
lower-level device is being used.
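
The two sizes described above correspond to two fields of the stat()
result. A minimal sketch in Python, assuming a Unix-like system where
st_blocks is counted in 512-byte units (as on Linux). The function name
sizes() is hypothetical, and this is an illustration rather than how du
itself is implemented:

```python
import os

def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for one path.

    Hypothetical helper for illustration only. lstat() is used so that a
    symbolic link is measured itself rather than its target.
    """
    st = os.lstat(path)
    # st_size: the length that read()/write() and "ls -l" deal with.
    apparent = st.st_size
    # st_blocks: what the filesystem reports as allocated; assumed here to
    # be in 512-byte units, which holds on Linux.
    allocated = st.st_blocks * 512
    return apparent, allocated
```

Neither number says anything about what a lower-level compressed or
deduplicated device actually stores; both are only what the filesystem
reports.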





Information forwarded to bug-coreutils@HIDDEN:
bug#34110; Package coreutils. Full text available.

Message received at 34110 <at> debbugs.gnu.org:


Received: (at 34110) by debbugs.gnu.org; 16 Jan 2019 23:07:02 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Jan 16 18:07:01 2019
Received: from localhost ([127.0.0.1]:33982 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gjuGz-0000UI-JL
	for submit <at> debbugs.gnu.org; Wed, 16 Jan 2019 18:07:01 -0500
Received: from mail-pf1-f169.google.com ([209.85.210.169]:46401)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <assafgordon@HIDDEN>) id 1gjuGx-0000U3-Na
 for 34110 <at> debbugs.gnu.org; Wed, 16 Jan 2019 18:07:00 -0500
Received: by mail-pf1-f169.google.com with SMTP id c73so3775862pfe.13
 for <34110 <at> debbugs.gnu.org>; Wed, 16 Jan 2019 15:06:59 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:from:message-id:date:user-agent:mime-version
 :in-reply-to:content-language:content-transfer-encoding;
 bh=CF9ATFbk/YmSlm5ivj5GuQlWSc28Jbq/p0i3YflpLsY=;
 b=mGJOxYBCZxC+xsgfpe0ZOvQFM58+tdL0GHDPZuGT0JxGFFOoo6zo6LP57nP+/Lh0KW
 h7wzyKHay8fcM4i08uf+Pt6C82LHnuuLSgKGsn4OjAeilszpqBA54+kqxi9yOpi2EU2k
 lYoJdNl+p8/Js/iINV4IeVPRfcOAUA1gdqJQI4zPKAnTx6Tsv9/9VVBZv3PEv4Q1423H
 mQnnL7p9oD6sTFJQs2OBONfvp24H32lUNlvumwTfy6ZcpKzdDdn2AWNb9Ke2fGMffGPw
 THqhJfw/Ct8ZaEFYwW6fCMMIwEE01TWquh/dEZj9DZsJk0CrWoCAtq+jgGK4dsXjO2Gm
 LzLg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-language
 :content-transfer-encoding;
 bh=CF9ATFbk/YmSlm5ivj5GuQlWSc28Jbq/p0i3YflpLsY=;
 b=V40mZbs/SpNDNc4QQRrOWI1mjZ0MxD2nri+CEh/HZhLVbloVLQvyUwu5ZuCbVucDjV
 Z7iFO3T5w7HLXajjXEVJ1rIr0CNZ81bP441ebWT6C3nwqDAY8bQr1MbDuGJk2Jzqvjqj
 +lVUsjxcc9O687hEZCWq39TtdE1TR9EWOsYFGxylRtIXYF0eiTllcrr26sViCwH40lK2
 WkSoAx+sSMu34729FBewM1z6LcrSNZr+SR2DXstZSMghUYIrvjOYeRfU+wx8wWhu5WKH
 qA+Mh8CdzQuv2dIDtalYOGh30d13cyPGP/C+C5QeNwp68CfNikxCC6fzQkIFSxrv2GJx
 cYKw==
X-Gm-Message-State: AJcUukfgwE1ItPnxwB/oxpbaaYk0r6uGfKFqO/ZMQdGU4w1oA6b1LKdh
 2xrp5ZxFiivwPQzBzsbTZLYdMBQ4
X-Google-Smtp-Source: ALg8bN6BLUOuH0s2Ye6OZG4c3oVoIOxUZZ1+YDZkjtH26J6Xwv+yiTZkD5CVx4+XVRk9YO+41XudCQ==
X-Received: by 2002:a62:509b:: with SMTP id g27mr12444924pfj.48.1547680013239; 
 Wed, 16 Jan 2019 15:06:53 -0800 (PST)
Received: from tomato.housegordon.com (moose.housegordon.com. [184.68.105.38])
 by smtp.googlemail.com with ESMTPSA id
 p7sm10263923pfa.22.2019.01.16.15.06.51
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Wed, 16 Jan 2019 15:06:52 -0800 (PST)
Subject: Re: bug#34110: feature request: dual-column du output, showing "real"
 and "on-disk" sizes (and about that "apparent-size" concept)
To: =?UTF-8?B?UmVuw6kgSi5WLiBCZXJ0aW4=?= <rjvbertin@HIDDEN>,
 34110 <at> debbugs.gnu.org
References: <1689402.NmkMBi2P6V@bola>
From: Assaf Gordon <assafgordon@HIDDEN>
Message-ID: <0f86a299-a979-95c0-3341-d7f8aec63351@HIDDEN>
Date: Wed, 16 Jan 2019 16:06:50 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <1689402.NmkMBi2P6V@bola>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 34110
X-Spam-Score: -1.0 (-)

Hello,

I'll address only the "apparent-size" issue (not the two-columns, or 
compressed file-systems):

On 2019-01-16 1:13 p.m., René J.V. Bertin wrote:

> According to `du --help`, the apparent-size option reports a size that is not the actual disk usage. The numbers above seem to show the opposite.
> If anything, I find the concept of "apparent size" more appropriate to the size a file occupies on the storage medium because ultimately that storage device will not give you more than "struct stat : st_size" bytes for uncompressed filesystems.
> Another way to say it: with "--apparent-size", du returns the actual file size; without, it returns how large the file appears to be (judging from its disk footprint).

"apparent-size" shows how much content/data the file has.
Without "apparent-size", du shows the amount of storage consumed (or
"wasted"?) on the storage medium (accounting for sparse-file holes, though
I'm not sure about compression).

To illustrate, create three files with specific sizes:

   $ head --bytes=1700 /dev/zero > a
   $ head --bytes=4097 /dev/zero > b
   $ truncate --size=1050000 c        # will be a sparse file

These are their sizes, as in the amount of bytes they contain:

   $ ls -log
   total 12
   -rw-r--r-- 1    1700 Jan 16 15:36 a
   -rw-r--r-- 1    4097 Jan 16 15:36 b
   -rw-r--r-- 1 1050000 Jan 16 15:37 c


These are their "apparent-sizes", rounded up to the nearest
1K block:

   $ du --apparent-size a b c
   2     a
   5     b
   1026  c

e.g. file "a" is 1700 bytes, rounded-up to 2K, and "du --apparent-size"
shows "2".

Using "--apparent-size --block-size=1" (and its equivalent, "--bytes")
will show the exact sizes:

   $ du --apparent-size --block-size=1 a b c
   1700     a
   4097     b
   1050000  c

Without "--apparent-size", du shows how much storage space is actually 
used/wasted/consumed on the storage medium by the files:

   $ du a b c
   4    a
   8    b
   0    c

How are these numbers calculated?

The simplest case is file "c" - it is completely sparse - so despite
logically containing 1,050,000 zeros, on the actual storage medium it
consumes zero data blocks (ignoring inode blocks and suchlike).

File "a" has 1,700 bytes of data.
On my filesystem the basic block size is 4096, as shown by "stat -f":

   $ stat -f /
     File: "/"
       ID: 5a2cade519bada6a Namelen: 255     Type: ext2/ext3
->Block size: 4096       Fundamental block size: 4096    <-----
   Blocks: Total: 27559017   Free: 18845977   Available: 17435289
   Inodes: Total: 7036928    Free: 6496730

Therefore, any file from size 1 to size 4096 will consume exactly one
disk block. On most common filesystems, disk blocks cannot be shared
between files, meaning that this block is fully consumed.

That's why for file "a" du shows "4" - meaning 4K bytes (exactly one
block) is consumed on the storage medium by this file.

Similarly for file "b" - its size is 4097, which is 1 byte more than one
filesystem block. Hence, file "b" consumes 2 blocks, coming up to 8K.
du then shows "8" for file "b".
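
The rounding described above can be checked with a few lines of Python.
This is only a sketch of the arithmetic, assuming a 4096-byte block size
and files stored in whole blocks (no sparse holes, no compression, no
inline data); the function names are made up for the illustration:

```python
def apparent_kib(size_bytes):
    """Content size rounded up to whole 1K units, as "du --apparent-size" prints it."""
    return -(-size_bytes // 1024)  # ceiling division

def allocated_kib(size_bytes, block_size=4096):
    """KiB consumed when the data occupies whole filesystem blocks.

    Assumption: every started block is fully charged to the file.
    """
    blocks = -(-size_bytes // block_size)  # ceiling division
    return blocks * block_size // 1024

for name, size in (("a", 1700), ("b", 4097)):
    print(name, apparent_kib(size), allocated_kib(size))
# prints:
# a 2 4
# b 5 8
```

File "c" is left out of the loop on purpose: it is sparse, so its
allocated size is 0 regardless of this arithmetic.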


Now to your examples:

> %> du -hcs /Volumes/nif64/tmp/.npm/ ; du -hcs --apparent-size /Volumes/nif64/tmp/.npm/
> 340M    /Volumes/nif64/tmp/.npm/
> 180M    /Volumes/nif64/tmp/.npm/
> Same folder on btrfs (mounted with compress=lzo):
> %> du -hcs /mnt/.npm/ ; du -hcs --apparent-size  /mnt/.npm
> 198M    /mnt/.npm/
> 181M    /mnt/.npm

In both cases, "du --apparent-size" shows about 180MB of actual data 
(181MB in the second example). That is the amount of actual content
(number of total bytes in these files).

In the first case, these files consume 340MB of space on your disk.
In the second case, these files consume 198MB of space on your disk.
The reason they consume MORE than their actual data is explained above
with the file-system blocks.

This suggests to me that compression is not accounted for in these
values. If it were, then the consumed size (without "--apparent-size")
should've been less than the actual size (with "--apparent-size").

A quick on-line search shows that btrfs's default block size is 16K,
while ZFS's default record-size is 128KB. That might explain
why a similar amount of data (and, I assume, a similar number of files and
sizes) consumes more disk space on ZFS (I could be wrong, though; comments
are welcome).


I hope this helps to clarify "apparent-size".

I'll leave it to others to comment on how compressed file systems
come into play with du.

regards,
  - assaf







Information forwarded to bug-coreutils@HIDDEN:
bug#34110; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 16 Jan 2019 22:03:52 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Jan 16 17:03:52 2019
Received: from localhost ([127.0.0.1]:33957 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gjtHr-0007IC-O9
	for submit <at> debbugs.gnu.org; Wed, 16 Jan 2019 17:03:52 -0500
Received: from eggs.gnu.org ([209.51.188.92]:49646)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rjvbertin@HIDDEN>) id 1gjrZK-0004ej-Ff
 for submit <at> debbugs.gnu.org; Wed, 16 Jan 2019 15:13:46 -0500
Received: from lists.gnu.org ([209.51.188.17]:55387)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <rjvbertin@HIDDEN>) id 1gjrZF-0001Qk-Ca
 for submit <at> debbugs.gnu.org; Wed, 16 Jan 2019 15:13:41 -0500
Received: from eggs.gnu.org ([209.51.188.92]:59801)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <rjvbertin@HIDDEN>) id 1gjrZE-0005Ur-Av
 for bug-coreutils@HIDDEN; Wed, 16 Jan 2019 15:13:41 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM
 autolearn=disabled version=3.3.2
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <rjvbertin@HIDDEN>) id 1gjrZD-0001BJ-D7
 for bug-coreutils@HIDDEN; Wed, 16 Jan 2019 15:13:40 -0500
Received: from mail-wm1-x32f.google.com ([2a00:1450:4864:20::32f]:36108)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
 (Exim 4.71) (envelope-from <rjvbertin@HIDDEN>) id 1gjrZD-0000yn-0T
 for bug-coreutils@HIDDEN; Wed, 16 Jan 2019 15:13:39 -0500
From: René J.V. Bertin <rjvbertin@HIDDEN>
To: bug-coreutils@HIDDEN
Subject: feature request: dual-column du output,
 showing "real" and "on-disk" sizes (and about that "apparent-size"
 concept)
Date: Wed, 16 Jan 2019 21:13:15 +0100
Message-ID: <1689402.NmkMBi2P6V@bola>

Hi,

I hope feature requests are acceptable here.

Now that more and more filesystems support compression, it becomes more interesting to compare the actual file/directory (content) size with the corresponding on-disk size. Currently you have to call du twice to do that, which quickly becomes cumbersome in practice (command lines, parsing the output) and repeats the same I/O operations.

The code obtains both size values at the same time, so it would make sense to do both calculations together and provide an option to display the regular and "apparent-size" values in columns. My guess is that the cost of calculating both output values is negligible compared with the cost of the stat() call (so there is no need to complicate the code with "calculate this and/or that" conditionals).
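
To illustrate the point about a single stat() call yielding both values, here is a minimal Python sketch (not du's actual implementation). The function name both_sizes is hypothetical. It assumes a POSIX system where st_blocks is available and counted in 512-byte units, and its totals will not necessarily match du's, which has its own rules for what it counts.

```python
import os

def both_sizes(root):
    """Return (apparent_bytes, disk_bytes) for root and everything below it.

    Each entry is lstat()ed exactly once; both totals are taken from
    that single result (st_size and st_blocks).
    """
    apparent = 0
    disk = 0
    seen = set()  # (st_dev, st_ino) pairs, so hard links are counted once

    def add(path):
        nonlocal apparent, disk
        st = os.lstat(path)
        key = (st.st_dev, st.st_ino)
        if key in seen:
            return
        seen.add(key)
        apparent += st.st_size       # size of the file content
        disk += st.st_blocks * 512   # assumption: 512-byte block units

    add(root)
    # os.walk does not follow symlinks by default
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            add(os.path.join(dirpath, name))
    return apparent, disk
```

A two-column line could then be printed with something like `a, d = both_sizes(path); print(f"{d}\t{a}\t{path}")`.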

The option could be called --both, --columns (-C) or --two (-T).

I'd also reconsider the "apparent-size" term, as I think it is confusing and ambiguous. Consider this, taken from a ZFS dataset with gzip-9 compression (and copies=1; du v8.30):

%> du -hcs /Volumes/nif64/tmp/.npm/ ; du -hcs --apparent-size /Volumes/nif64/tmp/.npm/
340M    /Volumes/nif64/tmp/.npm/
180M    /Volumes/nif64/tmp/.npm/

Same folder on btrfs (mounted with compress=lzo):
%> du -hcs /mnt/.npm/ ; du -hcs --apparent-size  /mnt/.npm
198M    /mnt/.npm/
181M    /mnt/.npm

According to `du --help`, the --apparent-size option reports a size that is not the actual disk usage. The numbers above seem to show the opposite.
If anything, I find the term "apparent size" more appropriate for the size a file occupies on the storage medium, because ultimately that storage device will not give you more than "struct stat : st_size" bytes on uncompressed filesystems.
Put another way: with "--apparent-size", du returns the actual file size; without it, du returns how large the file appears to be (judging from its disk footprint).
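
For context, the `du --help` text motivates the term with sparse files, where st_size can exceed the space actually allocated. The following Python sketch shows the two numbers for such a file. The name sparse_demo is hypothetical, st_blocks is assumed to be in 512-byte units, and whether the hole is actually left unallocated depends on the filesystem.

```python
import os
import tempfile

def sparse_demo(size):
    """Create a temporary file with a hole of `size` bytes.

    Returns (st_size, allocated_bytes) as reported by one stat() call.
    """
    with tempfile.NamedTemporaryFile() as f:
        f.truncate(size)   # extends the file without writing any data
        f.flush()
        st = os.stat(f.name)
        # assumption: st_blocks counts 512-byte units
        return st.st_size, st.st_blocks * 512
```

On a filesystem that supports holes, the second value would typically be much smaller than the first; on a compressing filesystem the relationship between the two can differ again, which is the ambiguity discussed above.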

For comparison, the same folder on a Mac with HFS+:
%> du -hcs /Volumes/VMs/.npm ; du -hcs --apparent-size /Volumes/VMs/.npm
198M    /Volumes/VMs/.npm
181M    /Volumes/VMs/.npm

The same, with HFS+ compression (zip-9):
%> du -hcs /Volumes/VMs/.npm ; du -hcs --apparent-size /Volumes/VMs/.npm
115M    /Volumes/VMs/.npm
148M    /Volumes/VMs/.npm

Thoughts?

Thanks,
R.





Acknowledgement sent to René J.V. Bertin <rjvbertin@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN. Full text available.
Report forwarded to bug-coreutils@HIDDEN:
bug#34110; Package coreutils. Full text available.
Last modified: Fri, 18 Jan 2019 06:45:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.