GNU bug report logs - #35488
Feature du --files-only request

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: coreutils; Severity: wishlist; Reported by: David Ellenberger <davidmarioellenberger@HIDDEN>; dated Mon, 29 Apr 2019 15:06:02 UTC; Maintainer for coreutils is bug-coreutils@HIDDEN.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 2 May 2019 09:41:00 +0000
Message-ID: <5CCABAEB.6040700@HIDDEN>
Date: Thu, 02 May 2019 02:39:55 -0700
From: L A Walsh <coreutils@HIDDEN>
User-Agent: Thunderbird
MIME-Version: 1.0
To: David Ellenberger <davidmarioellenberger@HIDDEN>
Subject: Re: bug#35488: Feature du --files-only request
References: <CADH3eD1EqJMhxowYYZT9gYZ-cjgJnbHa-uqUJb_est3G_zAjEQ@HIDDEN>
In-Reply-To: <CADH3eD1EqJMhxowYYZT9gYZ-cjgJnbHa-uqUJb_est3G_zAjEQ@HIDDEN>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: Coreutils <bug-coreutils@HIDDEN>

On 4/29/2019 4:36 AM, David Ellenberger wrote:
> Dear maintainers
>
> I understand that admins have become accustomed to seeing 4096 for
> directories, as it's consistent with the ls command and the technicality
> behind it.
>   
---
    Except that it isn't 4096 on all file systems.  For empty
directories, I usually see 0 bytes allocated.  And for very large
directories, the directory itself may take megabytes of space.
A directory takes real space on Linux/Unix, and how much depends
on which file system you use.  On Windows with NTFS, I think
directories can be virtual, since the meta info and names are kept
in a file control block.  In that single case, a directory may really
take zero space in the file system, but that's a quirk of how NTFS
works on Windows.  If the real intent is to measure used disk space,
including directories seems advisable, as they can take real space on
most file systems, and that space is counted against a user quota if
one exists.

    Confusingly, the size also depends on the file-allocation block
size on the source and target (often 4k, but it doesn't have to be)
and on the amount of ***fragmentation*** in the free space of the
source and target.  If free space is heavily fragmented, a directory
may need to be spread out over several areas, making it larger than
it would be if free space weren't so fragmented.  That's why you
often hear people say you should keep about 15-25% of your disk space
free -- that's so large contiguous areas won't entirely disappear and
storage will be more efficient.

    Anyway, this is just my opinion: I'm not sure du should exclude
directories entirely, but I wouldn't be against separate subtotal
lines for directories and files -- that would make it even more helpful!
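
    A minimal sketch of what such subtotals could look like today,
assuming GNU find (its -printf directive, with %y for the file type and
%b for the allocation in 512-byte blocks).  The name du_subtotals is a
hypothetical helper, not an existing du feature, and the numbers will
differ between file systems:

```sh
# Hypothetical helper: print allocated bytes for regular files and for
# directories as two separate subtotals.
# Assumes GNU find: %y is the type letter, %b the size in 512-byte blocks.
du_subtotals() {
    find "${1:-.}" -printf '%y %b\n' |
        awk '$1 == "f" { files += $2 * 512 }
             $1 == "d" { dirs  += $2 * 512 }
             END { printf "files %.0f\ndirectories %.0f\n", files, dirs }'
}
```

Calling du_subtotals /some/tree prints one "files" line and one
"directories" line.  Unlike du, this sketch counts a hard-linked file
once per link.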






Information forwarded to bug-coreutils@HIDDEN:
bug#35488; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 29 Apr 2019 15:05:41 +0000
MIME-Version: 1.0
From: David Ellenberger <davidmarioellenberger@HIDDEN>
Date: Mon, 29 Apr 2019 13:36:31 +0200
Message-ID: <CADH3eD1EqJMhxowYYZT9gYZ-cjgJnbHa-uqUJb_est3G_zAjEQ@HIDDEN>
Subject: Feature du --files-only request
To: bug-coreutils@HIDDEN
Content-Type: multipart/alternative; boundary="000000000000d61e1e0587a9b251"

--000000000000d61e1e0587a9b251
Content-Type: text/plain; charset="UTF-8"

Dear maintainers

As you're probably aware, du --apparent-size calculates/reports 1 file
system block for an [empty] folder (typically 4096 bytes). I find that a bit
inconsistent with what the option suggests. The manual entry for
--apparent-size doesn't clarify this.
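
The behaviour can be checked with a small sketch like the one below. It
assumes GNU coreutils (du -b, which is short for --apparent-size
--block-size=1, and stat -c); empty_dir_report is a hypothetical helper
name. Depending on the file system and the coreutils version, the values
may be 4096, some other size, or 0:

```sh
# Hypothetical helper: show what du and stat report for a fresh, empty
# directory. Assumes GNU coreutils (du -b, du --block-size, stat -c).
# The body runs in a subshell so the variable d does not leak.
empty_dir_report() (
    d=$(mktemp -d) || exit 1
    printf 'du apparent: %s\n' "$(du -sb "$d" | cut -f1)"
    printf 'st_size:     %s\n' "$(stat -c %s "$d")"
    printf 'allocated:   %s\n' "$(du -s --block-size=1 "$d" | cut -f1)"
    rmdir "$d"
)
```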

From a practical point of view, when we admins copy a folder and file
structure from one file system to another with a different block size,
we cannot use du to compare the sizes and have to resort to something
like:
$ ls -anR | grep -v '^d' | awk '{total += $5} END {print total, "Bytes"}'

Windows Explorer shows zero bytes for an empty folder or a folder containing
multiple empty folders. This way, comparing the contents of two copied
folder trees by size works out well regardless of the file system and the
block size it uses.

I understand that admins have become accustomed to seeing 4096 for
directories, as it's consistent with the ls command and the technicality
behind it.

In my daily admin tasks I have never had to count the sizes of empty
folders. The overhead of provisioning the file system and enabling it to
work is something we typically accept and do not need to recalculate or
even understand in full detail. Anyway, FS provisioning and the logical
blocks perspective are a completely different matter, for which we have
the df command and other tools.

I'd therefore suggest a new option --files-only, which counts only the
size of files and skips everything else (directories, devices, symbolic
links, etc.).

That way we would finally be able to count file sizes consistently, in
line with the du manual entry, which says 'DESCRIPTION: Summarize disk
usage of each FILE, recursively for directories.'
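
Until such an option exists, the files-only total can be approximated
with find instead of parsing ls output. This is only a sketch, assuming
GNU find (its -printf '%s' directive prints the size in bytes);
files_only_bytes is a hypothetical helper name and --files-only itself
remains a proposal:

```sh
# Hypothetical helper: sum the apparent sizes (st_size) of regular files
# only, skipping directories, devices, symbolic links and so on.
# Assumes GNU find for -printf; printf "%.0f" keeps large totals out of
# exponent notation in any awk.
files_only_bytes() {
    find "${1:-.}" -type f -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Running files_only_bytes on the source and on the copy should give the
same number regardless of block size. Note that a hard-linked file is
counted once per link here, whereas du counts it once.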

Thanks for reading,

David

--000000000000d61e1e0587a9b251--




Acknowledgement sent to David Ellenberger <davidmarioellenberger@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN. Full text available.
Report forwarded to bug-coreutils@HIDDEN:
bug#35488; Package coreutils. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.