GNU bug report logs - #55212
GNU Linux "sort -g" can hang indefinitely when run on standard input if NaNs are involved
Report forwarded to bug-coreutils <at> gnu.org: bug#55212; Package coreutils. (Sun, 01 May 2022 20:54:02 GMT)
Acknowledgement sent to Giulio Genovese <giulio.genovese <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 01 May 2022 20:54:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
As explained here
<https://unix.stackexchange.com/questions/700863/gnu-linux-sort-g-can-hang-indefinitely-when-run-on-standard-input-on-ubuntu>,
when running "sort -g" from standard input, if NaNs are involved, this can
cause "sort -g" to hang indefinitely while consuming 100% of the CPU. This
seems to be system dependent. I cannot reproduce the bug on a RHEL7
machine. However, multiple users seem to be able to reproduce the bug. The
following command can provoke the bug:
yes nan | head -n128095 | timeout 5 sort -g
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Mon, 02 May 2022 06:04:01 GMT)
Notification sent to Giulio Genovese <giulio.genovese <at> gmail.com>: bug acknowledged by developer. (Mon, 02 May 2022 06:04:02 GMT)
Message #10 received at 55212-done <at> debbugs.gnu.org (full text, mbox):
Thanks for the bug report. This bug is entertaining, as it comes from
GCC now being so smart that it optimizes away a memset that cleared
padding bits. We added the memset in coreutils 8.14 (2011) to try to fix
the sort -g infinite loop bug (introduced in 1999), but the memset isn't
guaranteed to fix the bug because the memset can be optimized away.
If the padding bits happen to be clear already sort is OK, but if not
the results can be inconsistent when you compare two NaNs to each other,
and inconsistent results can make sort infloop.
The C standard allows this level of intelligence in the compiler, so
it's a bug in GNU 'sort'.
I installed the attached patch; please give it a try. For now I'll
boldly close the bug report; we can easily reopen it if this patch
doesn't actually fix the problem.
[0001-sort-fix-sort-g-infloop-again.patch (text/x-patch, attachment)]
Information forwarded to bug-coreutils <at> gnu.org: bug#55212; Package coreutils. (Mon, 02 May 2022 13:32:02 GMT)
Message #13 received at 55212 <at> debbugs.gnu.org (full text, mbox):
On 02/05/2022 07:03, Paul Eggert wrote:
> Thanks for the bug report. This bug is entertaining, as it comes from
> GCC now being so smart that it optimizes away a memset that cleared
> padding bits. We added the memset in coreutils 8.14 (2011) to try to fix
> the sort -g infinite loop bug (introduced in 1999), but the memset isn't
> guaranteed to fix the bug because the memset can be optimized away.
>
> If the padding bits happen to be clear already sort is OK, but if not
> the results can be inconsistent when you compare two NaNs to each other,
> and inconsistent results can make sort infloop.
>
> The C standard allows this level of intelligence in the compiler, so
> it's a bug in GNU 'sort'.
>
> I installed the attached patch; please give it a try. For now I'll
> boldly close the bug report; we can easily reopen it if this patch
> doesn't actually fix the problem.
This is a bit slower of course, but since it's an edge case it's not a big concern:
$ time yes nan | head -n128095 | timeout 10 sort -g >/dev/null
real 0m0.693s
$ time yes nan | head -n128095 | timeout 10 src/sort -g >/dev/null
real 0m0.924s
I'll add the test case I think (attached)
since the existing one didn't trigger this.
thanks!
Pádraig
[sort-nan-inf-adjustments.patch (text/x-patch, attachment)]
Information forwarded to bug-coreutils <at> gnu.org: bug#55212; Package coreutils. (Mon, 02 May 2022 17:17:01 GMT)
Message #16 received at 55212 <at> debbugs.gnu.org (full text, mbox):
On 5/2/22 06:31, Pádraig Brady wrote:
> This is a bit slower of course, but since it's an edge case it's not a big concern:
Yes, my thoughts too. There are ways to speed up common lots-o-NaN cases
portably (I toyed with the idea of using ieee754.h), but I went with the
simple approach for now.
A nit: 'time' needs to be at the end of the pipeline:
$ yes nan | head -n128095 | bash -c 'time src/sort -g' >/dev/null
real 0m0.552s
user 0m0.551s
sys 0m0.001s
$ yes nan | head -n128095 | bash -c 'time sort -g' >/dev/null
real 0m0.392s
user 0m0.382s
sys 0m0.009s
$ yes nan | head -n128095 | bash -c 'time sort -g' >/dev/null
[Here I had to control-C since 'sort' inflooped.]
Information forwarded to bug-coreutils <at> gnu.org: bug#55212; Package coreutils. (Tue, 10 May 2022 19:09:01 GMT)
Message #19 received at 55212 <at> debbugs.gnu.org (full text, mbox):
On 2022/05/01 10:16, Giulio Genovese wrote:
> As explained here
> <https://unix.stackexchange.com/questions/700863/gnu-linux-sort-g-can-hang-indefinitely-when-run-on-standard-input-on-ubuntu>,
> when running "sort -g" from standard input, if NaNs are involved, this can
> cause "sort -g" to hang indefinitely while consuming 100% of the CPU. This
> seems to be system dependent. I cannot reproduce the bug on a RHEL7
> machine. However, multiple users seem to be able to reproduce the bug. The
> following command can provoke the bug:
>
> yes nan | head -n128095 | timeout 5 sort -g
>
----
Unless there is some magic about -n1238095,
using a smaller number like -n1 or -n12 terminates after 1 or 12 lines, respectively.
I wasn't willing to wait for 1.2M lines being output, though the -n128085
case did start to output until I terminated it -- which would seem to
indicate that the sort had finished and was dumping its output. I.e.
no indefinite hang.
sort from 8.26.18-5e871
from suse old tumbleweed (out of sync - no longer updates)
Information forwarded to bug-coreutils <at> gnu.org: bug#55212; Package coreutils. (Tue, 10 May 2022 23:31:02 GMT)
Message #22 received at 55212 <at> debbugs.gnu.org (full text, mbox):
On 5/10/22 12:08, coreutils <at> tlinx.org wrote:
> Unless there is some magic about -n1238095,
The test is random and there's no magic, just luck. The larger the
random test, the more likely you'll run into the unlucky situation where
the unpatched 'sort' infloops.
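One way to see where the "luck" can come from is to count how many bytes of a long double are padding, since the earlier message says sort is OK when those bits happen to be clear. The following is a rough sketch, not code from coreutils: the function names are hypothetical, the mapping from LDBL_MANT_DIG to a byte count is an assumption covering a few common formats, and it assumes 8-bit bytes.

```c
#include <float.h>
#include <stddef.h>

/* Bytes that carry value bits in a long double, inferred from
   LDBL_MANT_DIG.  Returns 0 for formats this sketch does not know.  */
size_t
long_double_value_bytes (void)
{
  switch (LDBL_MANT_DIG)
    {
    case 53:  return 8;   /* same format as double (binary64) */
    case 64:  return 10;  /* x87 80-bit extended precision */
    case 106: return 16;  /* IBM double-double */
    case 113: return 16;  /* IEEE binary128 */
    default:  return 0;   /* unknown format */
    }
}

/* Padding bytes in a long double object.  Also returns 0 when the
   format is unknown, so 0 means "none, or cannot tell".  */
size_t
long_double_padding_bytes (void)
{
  size_t value_bytes = long_double_value_bytes ();
  if (value_bytes == 0 || value_bytes > sizeof (long double))
    return 0;
  return sizeof (long double) - value_bytes;
}
```

On a typical x86-64 Linux system, where sizeof (long double) is 16 and the format is x87 extended precision, this reports 6 padding bytes. Any byte-wise comparison of two such objects also compares those 6 bytes, whose contents are not determined by the value.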
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 08 Jun 2022 11:24:04 GMT)