GNU bug report logs - #55093
"split -n K/N <file>" BUG: Last Chunk incomplete if input file >= 262144 bytes

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: coreutils; Reported by: Adam Holt <holt@HIDDEN>; Keywords: moreinfo; dated Sun, 24 Apr 2022 16:00:02 UTC; Maintainer for coreutils is bug-coreutils@HIDDEN.
Added tag(s) moreinfo. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 55093 <at> debbugs.gnu.org:


Message-ID: <baf62d1c-5c82-bb74-388c-f75d38f47318@HIDDEN>
Date: Sun, 24 Apr 2022 16:14:24 -0700
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.7.0
Subject: Re: bug#55093: "split -n K/N <file>" BUG: Last Chunk incomplete if
 input file >= 262144 bytes
Content-Language: en-US
To: Adam Holt <holt@HIDDEN>
References: <CAHaBuGfAPmUDzXKbsej1CbQXOQ=28dpTKJzcguOm-j=aCdaLWA@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
In-Reply-To: <CAHaBuGfAPmUDzXKbsej1CbQXOQ=28dpTKJzcguOm-j=aCdaLWA@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: 55093 <at> debbugs.gnu.org

On 4/24/22 07:40, Adam Holt wrote:

> split (GNU coreutils) 8.32

That's an old version, dated 2020. Please try the current version,
coreutils 9.1, which has bug fixes in this area.

Also, there's no need to cc rms and tg; they're no longer working on
'split'.

Thanks.




Information forwarded to bug-coreutils@HIDDEN:
bug#55093; Package coreutils.

Message received at submit <at> debbugs.gnu.org:


MIME-Version: 1.0
From: Adam Holt <holt@HIDDEN>
Date: Sun, 24 Apr 2022 10:40:01 -0400
Message-ID: <CAHaBuGfAPmUDzXKbsej1CbQXOQ=28dpTKJzcguOm-j=aCdaLWA@HIDDEN>
Subject: "split -n K/N <file>" BUG: Last Chunk incomplete if input file >=
 262144 bytes
To: bug-coreutils@HIDDEN, =?UTF-8?Q?Torbj=C3=B6rn_Granlund?= <tg@HIDDEN>, 
 Richard Stallman <rms@HIDDEN>
Content-Type: multipart/alternative; boundary="000000000000cb36f605dd6770e0"

--000000000000cb36f605dd6770e0
Content-Type: text/plain; charset="UTF-8"

Hello!

Where do I report a serious data loss bug with GNU's split command?

Example:

$ dd if=/dev/random of=file bs=262144 count=1    # Create file containing 262144 bytes

$ split -n 1/2 file | wc -c
131072
$ split -n 2/2 file | wc -c
0    # SHOULD BE 131072

$ split -n 1/3 file | wc -c
87381
$ split -n 2/3 file | wc -c
87381
$ split -n 3/3 file | wc -c
0    # SHOULD BE 87382


The last chunk is completely missing in both of the examples above.
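
For reference, the expected byte counts quoted above follow from integer
division. The sketch below assumes, as the figures in this report imply,
that every chunk gets SIZE / N bytes and the last chunk also takes the
remainder; other coreutils versions may distribute the remainder
differently. expected_chunk_size is a hypothetical helper name, not part
of coreutils.

```shell
#!/bin/sh
# Print the expected byte count of chunk K when SIZE bytes are split
# into N chunks.
# Assumption (taken from the sizes quoted in this report): chunks 1..N-1
# get SIZE / N bytes each, and chunk N gets SIZE / N plus the remainder.
# expected_chunk_size is a hypothetical helper, not a coreutils command.
expected_chunk_size() {
    ecs_size=$1 ecs_k=$2 ecs_n=$3
    ecs_base=$((ecs_size / ecs_n))
    if [ "$ecs_k" -eq "$ecs_n" ]; then
        # Last chunk: base size plus whatever integer division left over.
        echo $((ecs_base + ecs_size % ecs_n))
    else
        echo "$ecs_base"
    fi
}
```

For example, "expected_chunk_size 262144 3 3" prints 87382, which matches
the value the third chunk should have had.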

Additionally, if the input file is larger than 2^18 = 262144 bytes, the
last chunk generated by "split -n K/N file" is truncated: many bytes are
missing from the beginning of that chunk.
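
One way to check whether a given split build loses data is to concatenate
every K/N chunk and compare the result with the original file. This is a
minimal sketch, not a definitive test: verify_chunks and its vc_*
variables are hypothetical names, and it relies only on the
"split -n K/N FILE" form used in this report plus the mktemp and cmp
utilities.

```shell
#!/bin/sh
# Concatenate chunks 1/N .. N/N as printed by "split -n K/N FILE" and
# compare the result byte-for-byte with FILE.
# Returns 0 if the concatenation is identical to FILE, non-zero otherwise.
# verify_chunks is a hypothetical helper written for illustration.
verify_chunks() {
    vc_file=$1 vc_n=$2
    vc_tmp=$(mktemp) || return 1
    vc_k=1
    while [ "$vc_k" -le "$vc_n" ]; do
        # Each K/N invocation writes only chunk K to standard output.
        split -n "$vc_k/$vc_n" "$vc_file" >>"$vc_tmp"
        vc_k=$((vc_k + 1))
    done
    # cmp -s is silent; its exit status says whether the files match.
    cmp -s "$vc_file" "$vc_tmp"
    vc_status=$?
    rm -f "$vc_tmp"
    return "$vc_status"
}
```

Running "verify_chunks file 3 && echo intact || echo 'data lost'" against
the 262144-byte file would be expected to report a loss on an affected
build.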

Here's the version number I'm running:

$ split --version
split (GNU coreutils) 8.32
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Torbjorn Granlund and Richard M. Stallman.


Thanks so much for your help in forwarding this to anybody who might be
able to confirm and, ideally, resolve this for everyone!

Regards,
Adam

--
https://internet-in-a-box.org
https://twitter.com/internet_in_box

--000000000000cb36f605dd6770e0--




Acknowledgement sent to Adam Holt <holt@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN.
Report forwarded to bug-coreutils@HIDDEN:
bug#55093; Package coreutils.
Last modified: Sun, 24 Apr 2022 23:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.