GNU bug report logs - #39850
"du" command can not count some files

Package: coreutils

Reported by: Hyunho Cho <mug896 <at> naver.com>

Date: Sun, 1 Mar 2020 08:06:02 UTC

Severity: normal

To reply to this bug, email your comments to 39850 AT debbugs.gnu.org.


Report forwarded to bug-coreutils <at> gnu.org:
bug#39850; Package coreutils. (Sun, 01 Mar 2020 08:06:02 GMT)

Acknowledgement sent to Hyunho Cho <mug896 <at> naver.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 01 Mar 2020 08:06:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Hyunho Cho <mug896 <at> naver.com>
To: <bug-coreutils <at> gnu.org>
Subject: "du" command can not count some files
Date: Sun, 01 Mar 2020 16:13:46 +0900

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 19.10
Release:        19.10
Codename:       eoan

$ uname -a
Linux EliteBook 5.3.0-40-generic #32-Ubuntu SMP Fri Jan 31 20:24:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ du --version
du (GNU coreutils) 8.30

====================================================================================


$ find /usr/bin -type f | wc -l
2234

$ find /usr/bin -type f -print0 | du -b --files0-from=- | wc -l
2222

$ du -b $( find /usr/bin -type f ) | wc -l
2222



$ find /usr/bin -type f -exec stat -c %s {} + | awk '{sum+=$1} END{ print sum}'
1296011570

$ find /usr/bin -type f -print0 | du -b --files0-from=- | awk '{sum+=$1} END{ print sum}'
1282350388



$ diff <( find /usr/bin -type f | sort ) <( find /usr/bin -type f -print0 | du --files0-from=-  | cut -f 2  | sort )
1231d1230
< /usr/bin/perl5.28.1
1233d1231
< /usr/bin/perlbug
1262d1259
< /usr/bin/pigz
1272d1268
< /usr/bin/pkg-config
1517,1518d1512
< /usr/bin/python3.6m
< /usr/bin/python3.7
1619d1612
< /usr/bin/rb
1697d1689
< /usr/bin/rz
1727d1718
< /usr/bin/sb
1893d1883
< /usr/bin/sz
1932d1921
< /usr/bin/tiptop
1990d1978
< /usr/bin/unzip





Information forwarded to bug-coreutils <at> gnu.org:
bug#39850; Package coreutils. (Sun, 01 Mar 2020 08:32:03 GMT)

Message #8 received at 39850 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Hyunho Cho <mug896 <at> naver.com>
Cc: 39850 <at> debbugs.gnu.org
Subject: Re: bug#39850: "du" command can not count some files
Date: Sun, 1 Mar 2020 00:31:18 -0800

I don't see a bug there, as the files you say "du" is not counting have counts
of zero.




Information forwarded to bug-coreutils <at> gnu.org:
bug#39850; Package coreutils. (Tue, 03 Mar 2020 06:02:02 GMT)

Message #11 received at 39850 <at> debbugs.gnu.org:

From: Bob Proulx <bob <at> proulx.com>
To: Hyunho Cho <mug896 <at> naver.com>
Cc: 39850 <at> debbugs.gnu.org
Subject: Re: bug#39850: "du" command can not count some files
Date: Mon, 2 Mar 2020 23:01:08 -0700

Hyunho Cho wrote:
> $ find /usr/bin -type f | wc -l
> 2234
> 
> $ find /usr/bin -type f -print0 | du -b --files0-from=- | wc -l
> 2222

Hard links.  Files that are hard linked are only counted once by du
since du is summing up the disk usage and hard linked files only use
disk on the first usage.

Add the du -l option if you want to count hard linked files multiple
times.

  find /usr/bin -type f -print0 | du -l -b --files0-from=- | wc -l

That will generate an incorrect total disk usage amount, however, as it
will report the same hard linked disk space once for each hard link.
But it all depends upon what you are trying to count.

> $ du -b $( find /usr/bin -type f ) | wc -l
> 2222

  du -l -b $( find /usr/bin -type f ) | wc -l

> $ find /usr/bin -type f -exec stat -c %s {} + | awk '{sum+=$1} END{ print sum}'
> 1296011570
> 
> $ find /usr/bin -type f -print0 | du -b --files0-from=- | awk '{sum+=$1} END{ print sum}'
> 1282350388

  find /usr/bin -type f -print0 | du -l -b --files0-from=- | awk '{sum+=$1} END{ print sum}'

> $ diff <( find /usr/bin -type f | sort ) <( find /usr/bin -type f -print0 | du --files0-from=-  | cut -f 2  | sort )

  diff <( find /usr/bin -type f | sort ) <( find /usr/bin -type f -print0 | du -l --files0-from=-  | cut -f 2  | sort )

I am surprised you didn't try du on each file in addition to stat -c %s
on each file when you were summing them up. :-)

  find /usr/bin -type f -exec du -b {} \; | awk '{sum+=$1} END{ print sum}'

Bob
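
The hard-link explanation above can be checked in isolation. In the
report, find lists 2234 names and du prints 2222 lines, a difference of
12, which matches the 12 names shown in the diff output. The following
is a minimal sketch, assuming GNU du and find and a filesystem that
supports hard links; the scratch directory and the file names a, b and
c are hypothetical and are not taken from the report.

```shell
#!/bin/sh
# Scratch-directory demo; every file name here is hypothetical.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/a"
ln "$tmp/a" "$tmp/b"           # b is a second name for the same inode as a
printf 'world\n' > "$tmp/c"    # ordinary file with a single link

# Default: du reports each inode once, so a and b share a single line.
default_lines=$(du -b "$tmp/a" "$tmp/b" "$tmp/c" | wc -l | tr -d ' ')

# With -l (--count-links), every name gets its own line.
counted_lines=$(du -l -b "$tmp/a" "$tmp/b" "$tmp/c" | wc -l | tr -d ' ')

# find -links +1 lists the names whose link count is greater than one.
multi_linked=$(find "$tmp" -type f -links +1 | wc -l | tr -d ' ')

echo "default=$default_lines count-links=$counted_lines multi-linked=$multi_linked"
rm -rf "$tmp"
```

Running find /usr/bin -type f -links +1 on the reporter's system would
presumably list the files involved, including the other name of each
entry that appears in the diff.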




This bug report was last modified 4 years and 26 days ago.
