GNU bug report logs - #70231
Performance issue on sort with zero-sized pseudo files

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: coreutils; Reported by: Takashi Kusumi <tkusumi@HIDDEN>; dated Sat, 6 Apr 2024 06:39:02 UTC; Maintainer for coreutils is bug-coreutils@HIDDEN.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 6 Apr 2024 06:38:52 +0000
Content-Type: multipart/mixed; boundary="------------scVQ1eSWWXgHHmIO5FRVkQAM"
Message-ID: <5d9a5fa2-e716-4d68-b4fe-8ef58e4e4e0c@HIDDEN>
Date: Sat, 6 Apr 2024 11:52:17 +0900
User-Agent: Mozilla Thunderbird
Content-Language: en-US
From: Takashi Kusumi <tkusumi@HIDDEN>
Subject: Performance issue on sort with zero-sized pseudo files
To: bug-coreutils@HIDDEN
MIME-Version: 1.0

--------------scVQ1eSWWXgHHmIO5FRVkQAM
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

I have found a performance issue with the sort command when it is used
on pseudo files that report a size of zero. For instance, sorting
`/proc/kallsyms` directly, as demonstrated below, takes significantly
longer than piping it through `cat`, and it generates numerous temporary
files. I confirmed this issue on v8.32 as well as on commit 8f3989d in
the master branch.

   $ time cat /proc/kallsyms | sort > /dev/null
   real    0m0.954s
   user    0m0.873s
   sys     0m0.096s

   $ time sort /proc/kallsyms > /dev/null
   real    0m8.555s
   user    0m3.367s
   sys     0m5.064s

   $ strace -e trace=openat sort /proc/kallsyms 2>&1 > /dev/null \
     | grep /tmp/sort | head -100
   ...
   openat(AT_FDCWD, "/tmp/sortM6Y6Y1", ...
   openat(AT_FDCWD, "/tmp/sortPrHKMG", ...

   $ strace -e trace=openat -c sort /proc/kallsyms > /dev/null
   % time     seconds  usecs/call     calls    errors syscall
   ------ ----------- ----------- --------- --------- ----------------
   100.00    6.419777          19    333258         8 openat
   ------ ----------- ----------- --------- --------- ----------------
   100.00    6.419777          19    333258         8 total
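
The "zero size" in question is the `st_size` value reported by `stat`. The following is a minimal C sketch, not taken from coreutils, for checking whether a file reports a size of zero yet still yields data when read. The function names `bytes_readable` and `is_zero_sized_with_data` are hypothetical and exist only for this illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Count the bytes actually readable from PATH.  Returns -1 on error.
   (Hypothetical helper for illustration, not part of sort.c.)  */
long long
bytes_readable (const char *path)
{
  assert (path != NULL);
  FILE *fp = fopen (path, "rb");
  if (!fp)
    return -1;

  char buf[65536];
  long long total = 0;
  size_t n;
  while ((n = fread (buf, 1, sizeof buf, fp)) > 0)
    total += (long long) n;

  int failed = ferror (fp);
  fclose (fp);
  return failed ? -1 : total;
}

/* Return 1 if PATH is a regular file whose reported size is zero but
   which still yields data when read, 0 if not, and -1 on error.  */
int
is_zero_sized_with_data (const char *path)
{
  struct stat st;
  if (stat (path, &st) != 0)
    return -1;

  long long n = bytes_readable (path);
  if (n < 0)
    return -1;

  return S_ISREG (st.st_mode) && st.st_size == 0 && n > 0;
}
```

Calling `is_zero_sized_with_data ("/proc/kallsyms")` on a system like the one in this report would be expected to return 1, assuming the file is readable there; an ordinary non-empty file on disk returns 0.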

It appears that the buffer allocated for zero-sized pseudo files is too
small, likely because its size is derived from the reported file size,
which is zero. As seen in the attached patch, I think using
`INPUT_FILE_SIZE_GUESS` to calculate the buffer size when the file size
is zero would resolve this issue.
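
The proposed rule can be sketched in isolation as follows. This is an illustrative C fragment under stated assumptions, not the code in `src/sort.c`: `SIZE_GUESS_SKETCH` is a placeholder for `INPUT_FILE_SIZE_GUESS` (whose real value may differ), and the function names are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Placeholder standing in for sort.c's INPUT_FILE_SIZE_GUESS.
   The value here is an assumption made for this sketch only.  */
#define SIZE_GUESS_SKETCH (128 * 1024)

/* Return the input size to assume when sizing the sort buffer.
   A regular file with a nonzero reported size is trusted; anything
   else (a pipe, or a regular file reporting size 0 such as a /proc
   entry) falls back to the guess.  */
off_t
assumed_input_size (mode_t mode, off_t reported_size)
{
  assert (reported_size >= 0);
  if (S_ISREG (mode) && reported_size != 0)
    return reported_size;
  return SIZE_GUESS_SKETCH;
}

/* Convenience wrapper: stat PATH and apply the rule above.
   Returns -1 if stat fails.  */
off_t
assumed_input_size_of (const char *path)
{
  struct stat st;
  if (stat (path, &st) != 0)
    return -1;
  return assumed_input_size (st.st_mode, st.st_size);
}
```

With this rule, a regular file that reports 4096 bytes is sized as 4096, while a regular file that reports 0 bytes is treated like a pipe and sized with the guess instead of zero.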

Best regards,
Takashi Kusumi
--------------scVQ1eSWWXgHHmIO5FRVkQAM
Content-Type: text/plain; charset=UTF-8;
 name="0001-sort-fix-performance-issue-on-zero-sized-pseudo-file.patch"
Content-Disposition: attachment;
 filename*0="0001-sort-fix-performance-issue-on-zero-sized-pseudo-file.pa";
 filename*1="tch"
Content-Transfer-Encoding: 7bit

From 9f759cc72014ab66da0e14318d4aa0c72e9311d9 Mon Sep 17 00:00:00 2001
From: Takashi Kusumi <tkusumi@HIDDEN>
Date: Fri, 5 Apr 2024 12:03:42 +0900
Subject: [PATCH] sort: fix performance issue on zero-sized pseudo files

Previously, an insufficient buffer size was chosen for zero-sized pseudo
files (e.g., /proc/kallsyms). Now, the buffer size is calculated using
INPUT_FILE_SIZE_GUESS when the file size is zero.
---
 src/sort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/sort.c b/src/sort.c
index 329ed45dc..8d757da55 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -1538,7 +1538,7 @@ sort_buffer_size (FILE *const *fps, size_t nfps,
           != 0)
         sort_die (_("stat failed"), files[i]);
 
-      if (S_ISREG (st.st_mode))
+      if (S_ISREG (st.st_mode) && st.st_size != 0)
         file_size = st.st_size;
       else
         {
-- 
2.41.0

--------------scVQ1eSWWXgHHmIO5FRVkQAM--




Acknowledgement sent to Takashi Kusumi <tkusumi@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN. Full text available.
Report forwarded to bug-coreutils@HIDDEN:
bug#70231; Package coreutils. Full text available.
Last modified: Sat, 6 Apr 2024 06:45:04 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.