Paul Eggert <eggert@HIDDEN>
to control <at> debbugs.gnu.org
.
From: Paul Eggert <eggert@HIDDEN>
To: Victor Engmark <victor@HIDDEN>
Cc: 67593 <at> debbugs.gnu.org
Date: Sun, 3 Dec 2023 01:37:37 -0800
Subject: Re: bug#67593: `split --number=l/N` no longer splits evenly

That's not a bug, in that 'split' is behaving as documented. The first input line is one byte shorter than the second one. 'Split' divides the input into two regions, and because the first region happens to be one byte longer than the second region both input lines are sent to the first output file.

In older coreutils, 'split' used a different algorithm to compute region sizes, which worked better for your test case but considerably worse in others. For example, in older coreutils:

   seq 50 >in
   split -n l/71 in

created 43 files of size 0, 9 files of size 2, 18 files of size 3, and one file of size 69. Current coreutils splits much better: it creates 21 files of size 0, 9 files of size 2, and 41 files of size 3.
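
The region arithmetic described above can be checked with a rough shell/awk model. This is only a sketch, not the coreutils source: the function name `model_l_chunks` and its `old`/`new` mode argument are made up for illustration. The rules it encodes are assumptions inferred from the explanation and the file counts quoted above: every line is written whole to the region containing its first byte; in `new` mode the `size % N` leftover bytes are spread one each over the first regions; in `old` mode every region is `size / N` bytes and the last region absorbs the leftover. It also assumes single-byte characters and a trailing newline on the last line.

```shell
#!/bin/sh
# Hypothetical model of `split -n l/N`; NOT taken from coreutils.
# Usage: model_l_chunks N FILE [old|new]
# Prints "<region number> <bytes sent to that output file>" per region.
model_l_chunks() {
  LC_ALL=C awk -v n="$1" -v size="$(wc -c < "$2")" -v mode="${3:-new}" '
    BEGIN {
      base = int(size / n)       # bytes every region gets
      rem = size % n             # leftover bytes
      extra = (mode == "new")    # assumed: new mode spreads the leftover
      chunk = 1; off = 0
      end = base + (extra && rem >= 1)
    }
    {
      len = length($0) + 1       # line length including its newline
      # Move to the region that contains this line'"'"'s first byte.
      while (chunk < n && off >= end) {
        chunk++
        end += base + (extra && chunk <= rem)
      }
      bytes[chunk] += len        # the whole line goes to that region
      off += len
    }
    END { for (i = 1; i <= n; i++) print i, bytes[i] + 0 }
  ' "$2"
}

# The two-line input from the bug report.
tmp=$(mktemp)
printf 'first\nsecond\n' > "$tmp"
model_l_chunks 2 "$tmp" new
model_l_chunks 2 "$tmp" old
rm -f "$tmp"
```

Under these assumptions, `new` mode puts all 13 bytes in region 1 (both lines start inside the 7-byte first region), while `old` mode yields 6 and 7 bytes. Feeding the model the output of `seq 50` with N=71 reproduces the file counts quoted above for both modes, which suggests the model is at least consistent with the described behavior.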
bug-coreutils@HIDDEN
:bug#67593
; Package coreutils
.
From: Victor Engmark <victor@HIDDEN>
To: bug-coreutils@HIDDEN
Date: Sun, 03 Dec 2023 13:25:01 +1300
Subject: `split --number=l/N` no longer splits evenly

Hi all

Commit fb6fc7f3ce6b0b70a5df7f605e71c4f8541e256b (part of v9.2) introduced a regression in how `split --number=l/N` works.

Test script `tests/split/l-chunk2.sh`:

```
#!/bin/sh

. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
print_ver_ split

printf 'first\n' > exp1 || framework_failure_
printf 'second\n' > exp2 || framework_failure_
cat exp1 exp2 > in || framework_failure_

split -e -n l/2 in || framework_failure_
compare exp1 xaa || fail=1
compare exp2 xab || fail=1

Exit $fail
```

Relevant test output:

```
+ diff -u exp1 xaa
--- exp1 2023-12-03 12:42:50.511334991 +1300
+++ xaa 2023-12-03 12:42:50.513334908 +1300
@@ -1 +1,2 @@
 first
+second
```

and

```
+ diff -u exp2 xab
diff: xab: No such file or directory
```

In other words, it doesn't split the file at all, despite it containing two lines of content.

The bug is still present in current master (commit 73d119f4f8052a9fb6cef13cd9e75d5a4e23311a).

Bisected on NixOS 23.11 using the following script:

```
#!/bin/sh

set -e

export CFLAGS=-w # Avoid build failure

git submodule update
git clean -fdx --exclude=bisect.sh --exclude=tests/split/l-chunk2.sh
./bootstrap
autoconf
./configure
make
make check TESTS=tests/split/l-chunk2.sh SUBDIRS=.
```

and these commands:

```
git bisect start master v9.1
git bisect run ./bisect.sh
```

Cheers
Victor

Victor Engmark <victor@HIDDEN>
:bug-coreutils@HIDDEN
.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.