GNU bug report logs - #40540
Faster sort with locale

Package: coreutils

Reported by: Ole Tange <ole <at> tange.dk>

Date: Fri, 10 Apr 2020 13:20:01 UTC

Severity: normal

To reply to this bug, email your comments to 40540 AT debbugs.gnu.org.

Report forwarded to bug-coreutils <at> gnu.org:
bug#40540; Package coreutils. (Fri, 10 Apr 2020 13:20:01 GMT)

Acknowledgement sent to Ole Tange <ole <at> tange.dk>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 10 Apr 2020 13:20:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Ole Tange <ole <at> tange.dk>
To: bug-coreutils <at> gnu.org
Subject: Faster sort with locale
Date: Fri, 10 Apr 2020 15:19:19 +0200

I have noticed that when a locale is set, sort becomes much slower.

I imagine that it is because instead of doing

  simple_compare(string1,string2)

it does:

  localized_compare(string1,string2)

But would it be possible to convert the input string1 into a string in
a generalized format, which would sort the same way as the localized
sort, but using a simple compare? Like this:

  string1_general = localize(string1)
  string2_general = localize(string2)
  simple_compare(string1_general,string2_general)

If that is possible, then localize() could be run by other cores in
advance, thereby offloading the "primary" core.
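
The transform-then-compare idea can be sketched with the C library's
strxfrm, which Python exposes as locale.strxfrm. This is only a minimal
illustration, not how sort is implemented: the function names
sort_with_strcoll and sort_with_strxfrm are hypothetical, strcoll stands
in for localized_compare, strxfrm stands in for localize(), and the code
uses whatever collation locale is currently active. Spreading the
transform step across cores is not shown.

```python
import functools
import locale


def sort_with_strcoll(lines):
    """Sort by calling the locale-aware comparison on every pair."""
    return sorted(lines, key=functools.cmp_to_key(locale.strcoll))


def sort_with_strxfrm(lines):
    """Transform each string once, then compare the transformed keys plainly."""
    # Assumption: locale.strxfrm plays the role of localize() in the proposal.
    keyed = [(locale.strxfrm(line), line) for line in lines]
    # Plain comparison of the precomputed keys; no locale call happens here.
    keyed.sort(key=lambda pair: pair[0])
    return [line for _, line in keyed]


if __name__ == "__main__":
    words = ["pear", "Apple", "apple", "Zebra", "banana"]
    print(sort_with_strcoll(words))
    print(sort_with_strxfrm(words))
```

Both functions should return the same order in any locale, since strxfrm
is defined so that comparing two transformed strings agrees with strcoll
on the originals.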


/Ole




Information forwarded to bug-coreutils <at> gnu.org:
bug#40540; Package coreutils. (Fri, 10 Apr 2020 18:57:02 GMT)

Message #8 received at 40540 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ole Tange <ole <at> tange.dk>
Cc: 40540 <at> debbugs.gnu.org
Subject: Re: bug#40540: Faster sort with locale
Date: Fri, 10 Apr 2020 11:56:47 -0700

On 4/10/20 6:19 AM, Ole Tange wrote:
> But would it be possible to convert the input string1 into a string in
> a generalized format, which would sort the same way as the localized
> sort, but using a simple compare?

I tried doing that a long time ago by using strxfrm, but it made 'sort' 
significantly slower. You're welcome to try again; perhaps things have changed.
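
One rough way to check whether things have changed on a given system is
to time both approaches on the same input. The sketch below is an
assumption-laden illustration rather than a benchmark of 'sort' itself:
make_lines and time_both are hypothetical helpers, the input is random
ASCII text, and Python's interpreter overhead means the numbers say
little about the C implementation. It measures under whatever collation
locale is active when it runs.

```python
import functools
import locale
import random
import string
import timeit


def make_lines(count, length=20, seed=0):
    """Build reproducible random ASCII lines (hypothetical test input)."""
    rng = random.Random(seed)
    return ["".join(rng.choices(string.ascii_letters, k=length))
            for _ in range(count)]


def time_both(lines, repeat=3):
    """Return (strcoll_seconds, strxfrm_seconds), best of `repeat` runs each."""
    # Pairwise locale-aware comparison on every comparison the sort makes.
    by_strcoll = min(timeit.repeat(
        lambda: sorted(lines, key=functools.cmp_to_key(locale.strcoll)),
        number=1, repeat=repeat))
    # One transform per line, then plain comparison of the keys.
    by_strxfrm = min(timeit.repeat(
        lambda: sorted(lines, key=locale.strxfrm),
        number=1, repeat=repeat))
    return by_strcoll, by_strxfrm


if __name__ == "__main__":
    strcoll_s, strxfrm_s = time_both(make_lines(100_000))
    print(f"strcoll: {strcoll_s:.3f}s  strxfrm: {strxfrm_s:.3f}s")
```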




This bug report was last modified 4 years and 15 days ago.
