GNU bug report logs - #55212
GNU Linux "sort -g" can hang indefinitely when run on standard input if NaNs are involved


Package: coreutils

Reported by: Giulio Genovese <giulio.genovese <at> gmail.com>

Date: Sun, 1 May 2022 20:54:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.





Report forwarded to bug-coreutils <at> gnu.org:
bug#55212; Package coreutils. (Sun, 01 May 2022 20:54:02 GMT)

Acknowledgement sent to Giulio Genovese <giulio.genovese <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 01 May 2022 20:54:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Giulio Genovese <giulio.genovese <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: GNU Linux "sort -g" can hang indefinitely when run on standard input
 if NaNs are involved
Date: Sun, 1 May 2022 13:16:10 -0400
As explained here
<https://unix.stackexchange.com/questions/700863/gnu-linux-sort-g-can-hang-indefinitely-when-run-on-standard-input-on-ubuntu>,
when running "sort -g" from standard input, if NaNs are involved, this can
cause "sort -g" to hang indefinitely while consuming 100% of the CPU. This
seems to be system dependent. I cannot reproduce the bug on a RHEL7
machine. However, multiple users seem to be able to reproduce the bug. The
following command can provoke the bug:

yes nan | head -n128095 | timeout 5 sort -g
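
A minimal sketch of how the command above can be wrapped to tell a hang from a normal exit. It relies on `timeout` exiting with status 124 when the limit expires; the helper name `probe` and the message strings are illustrative assumptions, not part of coreutils. A timeout only suggests an infinite loop, since a slow machine could also exceed the limit.

```shell
#!/bin/sh
# probe LIMIT CMD [ARGS...]
# Feed 128095 "nan" lines to CMD and report whether it finished within
# LIMIT seconds. "probe" is a hypothetical helper name for illustration.
probe() {
    limit=$1
    shift
    status=0
    # The pipeline's status is that of its last stage, i.e. timeout.
    yes nan | head -n 128095 | timeout "$limit" "$@" >/dev/null || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out"    # timeout exits with 124 when the limit expires
    elif [ "$status" -eq 0 ]; then
        echo "finished"
    else
        echo "failed with status $status"
    fi
}

# Same input and limit as the command in the report.
probe 5 sort -g
```

On an affected system this may print "timed out" on some runs and "finished" on others.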

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Mon, 02 May 2022 06:04:01 GMT)

Notification sent to Giulio Genovese <giulio.genovese <at> gmail.com>:
bug acknowledged by developer. (Mon, 02 May 2022 06:04:02 GMT)

Message #10 received at 55212-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Giulio Genovese <giulio.genovese <at> gmail.com>
Cc: 55212-done <at> debbugs.gnu.org
Subject: Re: bug#55212: GNU Linux "sort -g" can hang indefinitely when run on
 standard input if NaNs are involved
Date: Sun, 1 May 2022 23:03:38 -0700
Thanks for the bug report. This bug is entertaining, as it comes from 
GCC now being so smart that it optimizes away a memset that cleared 
padding bits. We added the memset in coreutils 8.14 (2011) to try to fix 
the sort -g infinite loop bug (introduced in 1999), but the memset isn't 
guaranteed to fix the bug because the memset can be optimized away.

If the padding bits happen to be clear already sort is OK, but if not 
the results can be inconsistent when you compare two NaNs to each other, 
and inconsistent results can make sort infloop.

The C standard allows this level of intelligence in the compiler, so 
it's a bug in GNU 'sort'.

I installed the attached patch; please give it a try. For now I'll 
boldly close the bug report; we can easily reopen it if this patch 
doesn't actually fix the problem.
[0001-sort-fix-sort-g-infloop-again.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#55212; Package coreutils. (Mon, 02 May 2022 13:32:02 GMT)

Message #13 received at 55212 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: 55212 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, giulio.genovese <at> gmail.com
Subject: Re: bug#55212: GNU Linux "sort -g" can hang indefinitely when run on
 standard input if NaNs are involved
Date: Mon, 2 May 2022 14:31:23 +0100
On 02/05/2022 07:03, Paul Eggert wrote:
> Thanks for the bug report. This bug is entertaining, as it comes from
> GCC now being so smart that it optimizes away a memset that cleared
> padding bits. We added the memset in coreutils 8.14 (2011) to try to fix
> the sort -g infinite loop bug (introduced in 1999), but the memset isn't
> guaranteed to fix the bug because the memset can be optimized away.
> 
> If the padding bits happen to be clear already sort is OK, but if not
> the results can be inconsistent when you compare two NaNs to each other,
> and inconsistent results can make sort infloop.
> 
> The C standard allows this level of intelligence in the compiler, so
> it's a bug in GNU 'sort'.
> 
> I installed the attached patch; please give it a try. For now I'll
> boldly close the bug report; we can easily reopen it if this patch
> doesn't actually fix the problem.

This is a bit slower of course, but since it's an edge case, it's not a big concern:

  $ time yes nan | head -n128095 | timeout 10 sort -g >/dev/null
  real	0m0.693s

  $ time yes nan | head -n128095 | timeout 10 src/sort -g >/dev/null
  real	0m0.924s

I think I'll add the attached test case,
since the existing one didn't trigger this.

thanks!
Pádraig
[sort-nan-inf-adjustments.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#55212; Package coreutils. (Mon, 02 May 2022 17:17:01 GMT)

Message #16 received at 55212 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 55212 <at> debbugs.gnu.org, giulio.genovese <at> gmail.com
Subject: Re: bug#55212: GNU Linux "sort -g" can hang indefinitely when run on
 standard input if NaNs are involved
Date: Mon, 2 May 2022 10:16:37 -0700
On 5/2/22 06:31, Pádraig Brady wrote:
> This is a bit slower of course, but since it's an edge case, it's not a big concern:

Yes, my thoughts too. There are ways to speed up common lots-o-NaN cases 
portably (I toyed with the idea of using ieee754.h), but I went with the 
simple approach for now.

A nit: 'time' needs to be at the end of the pipeline:

$ yes nan | head -n128095 | bash -c 'time src/sort -g' >/dev/null

real	0m0.552s
user	0m0.551s
sys	0m0.001s
$ yes nan | head -n128095 | bash -c 'time sort -g' >/dev/null

real	0m0.392s
user	0m0.382s
sys	0m0.009s
512-day $ yes nan | head -n128095 | bash -c 'time sort -g' >/dev/null
[Here I had to control-C since 'sort' inflooped.]




Information forwarded to bug-coreutils <at> gnu.org:
bug#55212; Package coreutils. (Tue, 10 May 2022 19:09:01 GMT)

Message #19 received at 55212 <at> debbugs.gnu.org:

From: coreutils <at> tlinx.org
To: Giulio Genovese <giulio.genovese <at> gmail.com>
Cc: 55212 <at> debbugs.gnu.org
Subject: Re: bug#55212: GNU Linux "sort -g" can hang indefinitely when run
 on standard input if NaNs are involved
Date: Tue, 10 May 2022 12:08:03 -0700
On 2022/05/01 10:16, Giulio Genovese wrote:
> As explained here
> <https://unix.stackexchange.com/questions/700863/gnu-linux-sort-g-can-hang-indefinitely-when-run-on-standard-input-on-ubuntu>,
> when running "sort -g" from standard input, if NaNs are involved, this can
> cause "sort -g" to hang indefinitely while consuming 100% of the CPU. This
> seems to be system dependent. I cannot reproduce the bug on a RHEL7
> machine. However, multiple users seem to be able to reproduce the bug. The
> following command can provoke the bug:
>
> yes nan | head -n128095 | timeout 5 sort -g
>   
----
   Unless there is some magic about -n1238095,
using a smaller number like -n1 or -n12 terminates after 1 or 12 lines.
I wasn't willing to wait for 1.2M lines to be output, though the -n128085
case did start producing output before I terminated it, which would seem to
indicate that the sort had finished and was dumping its output.  I.e.,
no indefinite hang.

sort from 8.26.18-5e871
from suse old tumbleweed (out of sync - no longer updates)






Information forwarded to bug-coreutils <at> gnu.org:
bug#55212; Package coreutils. (Tue, 10 May 2022 23:31:02 GMT)

Message #22 received at 55212 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: coreutils <at> tlinx.org
Cc: 55212 <at> debbugs.gnu.org, Giulio Genovese <giulio.genovese <at> gmail.com>
Subject: Re: bug#55212: GNU Linux "sort -g" can hang indefinitely when run on
 standard input if NaNs are involved
Date: Tue, 10 May 2022 16:30:31 -0700
On 5/10/22 12:08, coreutils <at> tlinx.org wrote:
>     Unless there is some magic about -n1238095,

The test is random and there's no magic, just luck. The larger the 
random test, the more likely you'll run into the unlucky situation where 
the unpatched 'sort' infloops.
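
Since a single run can pass by luck, one way to check a given 'sort' is to repeat the test and count how often it exceeds a time limit. The following is a rough sketch under stated assumptions: `count_timeouts` is a hypothetical helper name, the trial count and limit are arbitrary, and a timeout (exit status 124 from `timeout`) is treated as a likely infloop rather than proof of one.

```shell
#!/bin/sh
# count_timeouts TRIALS LINES LIMIT CMD [ARGS...]
# Run CMD on LINES lines of "nan" TRIALS times and count how many runs
# exceed LIMIT seconds. "count_timeouts" is a hypothetical helper name.
count_timeouts() {
    trials=$1 lines=$2 limit=$3
    shift 3
    hits=0
    i=0
    while [ "$i" -lt "$trials" ]; do
        status=0
        yes nan | head -n "$lines" | timeout "$limit" "$@" >/dev/null || status=$?
        # timeout exits with 124 when it had to stop the command.
        if [ "$status" -eq 124 ]; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
    done
    echo "$hits/$trials runs timed out at $lines lines"
}

# Three trials at the size used in the original report.
count_timeouts 3 128095 5 sort -g
```

Varying the LINES argument (for example 12 versus 128095) gives a feel for how the chance of hitting the unlucky case grows with input size on an unpatched build.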




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 08 Jun 2022 11:24:04 GMT)

This bug report was last modified 1 year and 315 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.