GNU bug report logs - #21844
`sort` behavior unstable based on neighboring elements ?

Package: coreutils;

Reported by: Mike Frysinger <vapier <at> gentoo.org>

Date: Fri, 6 Nov 2015 16:43:02 UTC

Severity: normal

Tags: notabug

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21844 in the body.
You can then email your comments to 21844 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#21844; Package coreutils. (Fri, 06 Nov 2015 16:43:02 GMT)

Acknowledgement sent to Mike Frysinger <vapier <at> gentoo.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 06 Nov 2015 16:43:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Mike Frysinger <vapier <at> gentoo.org>
To: bug-coreutils <at> gnu.org
Subject: `sort` behavior unstable based on neighboring elements ?
Date: Fri, 6 Nov 2015 11:41:41 -0500
i got this bug report today about sort mismatches.  the order of the
inputs changes the order of the outputs which surprised me.  but it
might be a nuance of unicode collation i'm not familiar with ?

$ printf '%s\n' aarch64 abc zed | LC_ALL=nb_NO.UTF-8 sort -u
aarch64
abc
zed
$ printf '%s\n' abc aarch64 zed | LC_ALL=nb_NO.UTF-8 sort -u
abc
zed
aarch64

why aren't the outputs here the same ?  a nordic user pointed out
that aa is an alternative for å which comes after z, which is fine,
but that doesn't explain why the output isn't the same here.
-mike

Information forwarded to bug-coreutils <at> gnu.org:
bug#21844; Package coreutils. (Fri, 06 Nov 2015 16:54:02 GMT)

Message #8 received at submit <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: bug-coreutils <at> gnu.org, Mike Frysinger <vapier <at> gentoo.org>
Subject: Re: bug#21844: `sort` behavior unstable based on neighboring elements ?
Date: Fri, 6 Nov 2015 16:53:02 +0000
tag 21844 notabug
close 21844
stop

On 06/11/15 16:41, Mike Frysinger wrote:
> i got this bug report today about sort mismatches.  the order of the
> inputs changes the order of the outputs which surprised me.  but it
> might be a nuance of unicode collation i'm not familiar with ?
> 
> $ printf '%s\n' aarch64 abc zed | LC_ALL=nb_NO.UTF-8 sort -u
> aarch64
> abc
> zed
> $ printf '%s\n' abc aarch64 zed | LC_ALL=nb_NO.UTF-8 sort -u
> abc
> zed
> aarch64
> 
> why aren't the outputs here the same ?  a nordic user pointed out
> that aa is an alternative for å which comes after z, which is fine,
> but that doesn't explain why the output isn't the same here.
> -mike

strcoll is giving inconsistent results (aarch64 compares after zed, yet before abc, which compares before zed):

$ printf '%s\n' abc aarch64 zed | LC_ALL=nb_NO.UTF-8 ltrace -e strcoll sort >/dev/null
sort->strcoll("aarch64", "zed")                                                = 3
sort->strcoll("abc", "zed")                                                    = -25

$ printf '%s\n' aarch64 abc zed | LC_ALL=nb_NO.UTF-8 ltrace -e strcoll sort >/dev/null
sort->strcoll("abc", "zed")                                                    = -25
sort->strcoll("aarch64", "abc")                                                = -1

I think this is due to:
https://sourceware.org/bugzilla/show_bug.cgi?id=18589

Fixed by:
https://sourceware.org/git/?p=glibc.git;a=commit;h=87701a58

cheers,
Pádraig.
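
The ltrace output above amounts to a comparison cycle, which is why the result depends on which pairs `sort` happens to compare. The following is a minimal sketch, not part of the original thread, that replays the three traced return values and searches for intransitive triples. The names `find_intransitive_triples`, `OBSERVED`, and `observed_cmp` are hypothetical, and the sketch assumes that swapping the arguments of a traced pair flips the sign of the result.

```python
from itertools import permutations


def find_intransitive_triples(words, cmp):
    """Return (a, b, c) triples where cmp says a < b and b < c, but not a < c."""
    bad = []
    for a, b, c in permutations(words, 3):
        if cmp(a, b) < 0 and cmp(b, c) < 0 and not cmp(a, c) < 0:
            bad.append((a, b, c))
    return bad


# Return values copied from the ltrace output in the message above.
OBSERVED = {
    ("aarch64", "zed"): 3,
    ("abc", "zed"): -25,
    ("aarch64", "abc"): -1,
}


def observed_cmp(x, y):
    """Replay the traced results.

    Assumption: the reversed pair yields the negated value, which the
    trace itself does not show.
    """
    if x == y:
        return 0
    if (x, y) in OBSERVED:
        return OBSERVED[(x, y)]
    return -OBSERVED[(y, x)]


if __name__ == "__main__":
    for triple in find_intransitive_triples(["aarch64", "abc", "zed"], observed_cmp):
        print(" < ".join(triple), "but the first does not compare before the last")
```

With the traced values, every rotation of (aarch64, abc, zed) is reported, so no total order is consistent with all three comparisons. To test a live system instead, `locale.strcoll` can be passed as `cmp` after calling `locale.setlocale(locale.LC_COLLATE, "nb_NO.UTF-8")`, assuming that locale is installed.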




Added tag(s) notabug. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 24 Oct 2018 21:17:02 GMT)

bug closed, send any further explanations to 21844 <at> debbugs.gnu.org and Mike Frysinger <vapier <at> gentoo.org> Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 24 Oct 2018 21:17:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 22 Nov 2018 12:24:08 GMT)
