GNU bug report logs - #28506
coreutils 8.28 test suite hangs on APFS filesystem
Report forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Mon, 18 Sep 2017 20:19:01 GMT)
Acknowledgement sent to Jack Howarth <howarth.mailing.lists <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 18 Sep 2017 20:19:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
The coreutils 8.28 release, when built on macOS 10.13 under the new APFS
filesystem, produces a hang during the test suite run. The hang appears to
occur in the execution of coreutils-8.28/tests/split/filter.sh at:
+ yes
+ head -n200K
+ split -b1G '--filter=head -c1 >/dev/null'
+ for mode in ''\'''\''' ''\''r/'\'''
+ FILE = -
according to the filter.log generated from executing the section of
split/filter.sh containing...
yes | head -n200K | split -b1G --filter='head -c1 >/dev/null' || fail=1
# Ensure that "endless" input is ignored when all filters finish
for mode in '' 'r/'; do
  FILE = '-'
  if test "$mode" = ''; then
    FILE = 'zero.in'
    truncate -s10T "$FILE" || continue
  fi
  for N in 1 2; do
    rm -f x??.n || framework_failure_
    timeout 10 sh -c \
      "yes | split --filter='head -c1 >\$FILE.n' -n $mode$N $FILE" || fail=1
    # Also ensure we get appropriate output from each filter
    seq 1 $N | tr '0-9' 1 > stat.exp
    stat -c%s x??.n > stat.out || framework_failure_
    compare stat.exp stat.out || fail=1
  done
done
I haven't opened a radar report yet, as the Apple engineers can't look
directly at the source code for coreutils due to the GPLv3 licensing, and
the test suite seems to be tangled up with the makefiles, making it
impossible to extract a stand-alone reproducer to attach to a radar bug
report.
Jack
ps Again, the hang seems to occur at the tail end of the log after it
emits...
+ FILE = -
Any suggestions on how to reduce this to a simpler test case? I would note
that the new APFS filesystem produces a failure in the python test suite...
https://bugs.python.org/issue31380
which is due to APFS not allowing files to be created with filenames that
contain unassigned codepoints in the Unicode 9.0 standard, whereas HFS+
does. So perhaps the coreutils hang might be a similar issue?
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Mon, 18 Sep 2017 21:09:02 GMT)
Message #8 received at 28506 <at> debbugs.gnu.org:
Thank you for the testing and for the report.
Is there any chance your failing test was via a python2 framework? I'm
asking (on Pádraig's behalf) because there is a known problem whereby
SIGPIPE is mishandled in that case, and that might explain this
failure, since the data-generation phase relies on SIGPIPE killing
this test's "yes" command.
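To illustrate the SIGPIPE dependence described above, here is a minimal sketch. It is not taken from the coreutils test suite, and the `status_file` and `yes_status` names are made up for this example. It records how `yes` ends once `head` closes the pipe: a status of 141 (128 + SIGPIPE) is typical when the signal has its default disposition, whereas a parent that ignores SIGPIPE would normally leave `yes` to exit with an ordinary write error instead.

```shell
#!/bin/sh
# Record the exit status of 'yes' once the reader closes the pipe.
status_file=$(mktemp) || exit 1

# The subshell outlives 'yes', so it can save the status to a file
# (not to the pipe, which 'head' has already closed by then).
( yes; echo $? > "$status_file" ) | head -n 1 > /dev/null

yes_status=$(cat "$status_file")
rm -f "$status_file"
echo "yes exited with status $yes_status"
```

If the pipeline never returns at all, `yes` is not being terminated when its reader goes away, which is the failure mode being asked about here.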
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Mon, 18 Sep 2017 23:27:01 GMT)
Message #11 received at 28506 <at> debbugs.gnu.org:
On Mon, Sep 18, 2017 at 5:08 PM, Jim Meyering <jim <at> meyering.net> wrote:
> Is there any chance your failing test was via a python2 framework? I'm
> asking (on Pádraig's behalf) because there is a known problem whereby
> SIGPIPE is mishandled in that case, and that might explain this
> failure, since the data-generation phase relies on SIGPIPE killing
> this test's "yes" command.
>
I doubt it as the hang doesn't happen under 10.13 when run on a JHFS
formatted volume.
Jack
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Mon, 18 Sep 2017 23:42:02 GMT)
Message #14 received at 28506 <at> debbugs.gnu.org:
On Mon, Sep 18, 2017 at 4:26 PM, Jack Howarth
<howarth.mailing.lists <at> gmail.com> wrote:
> On Mon, Sep 18, 2017 at 5:08 PM, Jim Meyering <jim <at> meyering.net> wrote:
...
>> Is there any chance your failing test was via a python2 framework? I'm
>> asking (on Pádraig's behalf) because there is a known problem whereby
>> SIGPIPE is mishandled in that case, and that might explain this
>> failure, since the data-generation phase relies on SIGPIPE killing
>> this test's "yes" command.
>
> I doubt it as the hang doesn't happen under 10.13 when run on a JHFS
> formatted volume.
How did you run the tests?
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Tue, 19 Sep 2017 01:08:01 GMT)
Message #17 received at 28506 <at> debbugs.gnu.org:
On Mon, Sep 18, 2017 at 7:40 PM, Jim Meyering <jim <at> meyering.net> wrote:
> On Mon, Sep 18, 2017 at 4:26 PM, Jack Howarth
> <howarth.mailing.lists <at> gmail.com> wrote:
> > On Mon, Sep 18, 2017 at 5:08 PM, Jim Meyering <jim <at> meyering.net> wrote:
> ...
> >> Is there any chance your failing test was via a python2 framework? I'm
> >> asking (on Pádraig's behalf) because there is a known problem whereby
> >> SIGPIPE is mishandled in that case, and that might explain this
> >> failure, since the data-generation phase relies on SIGPIPE killing
> >> this test's "yes" command.
> >
> > I doubt it as the hang doesn't happen under 10.13 when run on a JHFS
> > formatted volume.
>
> How did you run the tests?
>
Actually, I forgot to mention that the coreutils test suite hang only
occurred on the APFS volumes when coreutils was built against the gettext
and libiconv from fink. A build outside of fink which didn't build against
those packages didn't show the hang in the coreutils test suite. The fink
gettext and libiconv packages that I am using are those from...
https://sourceforge.net/p/fink/package-submissions/4955/
and
https://sourceforge.net/p/fink/package-submissions/5004/
which are both patched for the format string strictness in High Sierra. I
found that using --disable-nls in configuring coreutils was insufficient to
suppress the test suite hang which I assume is due to the presence of...
#define HAVE_LIBINTL_H 1
in the generated ./lib/config.h
despite the presence of...
/* #undef HAVE_DCGETTEXT */
/* #undef HAVE_GETTEXT */
when --disable-nls is used so it still could be a Unicode related change in
APFS, no?
Jack
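One way to double-check which of these macros a given build ended up with is to query the generated header directly. The helper below is only a sketch: the function name `config_macro_state` is invented here, and it assumes the usual autoconf layout in which a set macro appears as `#define NAME value` and an unset one as `/* #undef NAME */`.

```shell
#!/bin/sh
# Hypothetical helper (not part of coreutils): report how an autoconf macro
# appears in a generated config.h.
#   $1 = path to config.h, $2 = macro name
config_macro_state() {
  if grep -q "^#define $2 " "$1"; then
    echo defined
  elif grep -q "^/\* #undef $2 \*/" "$1"; then
    echo undefined
  else
    echo absent
  fi
}

# Example: run from the top of a configured coreutils build tree.
if test -f lib/config.h; then
  for macro in HAVE_LIBINTL_H HAVE_DCGETTEXT HAVE_GETTEXT; do
    echo "$macro: $(config_macro_state lib/config.h "$macro")"
  done
fi
```

Comparing this output between a build that hangs and one that does not would show whether the two really differ in these settings.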
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Thu, 21 Sep 2017 05:21:03 GMT)
Message #20 received at 28506 <at> debbugs.gnu.org:
The libintl bit reminded me of https://lists.gnu.org/archive/html/bug-gnulib/2014-10/msg00014.html
I.e., on OSX, enabling those libs creates implicit threads, I think.
Perhaps that's messing with SIGPIPE handling and only the implicit
thread gets it, thus not killing the main yes(1) thread.
However, the yes(1) is also protected with a timeout(1) call.
Perhaps timeout(1) is a silent no-op. We should support OSX through DYLD_INSERT_LIBRARIES,
but perhaps there is something preventing that on your system?
But then the timeout tests would fail. Could you check the timeout tests with:
make SUBDIRS=. TESTS=tests/misc/filter.sh check
In any case we should protect calls to timeout(1) to ensure it's supported.
The attached does that at least.
cheers,
Pádraig.
[require_timeout.patch (text/x-patch, attachment)]
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Thu, 21 Sep 2017 06:03:02 GMT)
Message #23 received at 28506 <at> debbugs.gnu.org:
Good idea.
Do you think there should be a syntax-check rule to ensure that any
timeout-using test first calls require_timeout_? This makes me wonder
if we should make timeout a function that does that job (the first
time only), and then exec's the real timeout command.
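A rough sketch of the wrapper idea, for illustration only: the variable `timeout_checked_` and the skip message are invented here, and this is not the content of the attached require_timeout.patch. It shadows `timeout` with a shell function that probes the real command once and then delegates to it through `command`, which bypasses function lookup. Exit status 77 is the value Automake's test harness treats as a skipped test.

```shell
#!/bin/sh
# Sketch: check on first use that timeout(1) works, then run the real command.
timeout_checked_=

timeout() {
  if test -z "$timeout_checked_"; then
    # 'command' skips this function and finds timeout(1) on PATH.
    if ! command timeout 10 true; then
      echo "timeout(1) is not usable here; skipping test" >&2
      exit 77
    fi
    timeout_checked_=1
  fi
  command timeout "$@"
}

# Usage: the first call performs the probe, later calls go straight through.
timeout 10 sh -c 'exit 0' && echo "first call ok"
timeout 10 sh -c 'exit 3' || echo "second call returned $?"
```

The sketch uses `command` rather than `exec` so that the calling shell keeps running after the wrapped command returns; a test that invokes `timeout` inside a subshell would repeat the probe there, since the flag is set per shell process.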
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Fri, 22 Sep 2017 00:24:01 GMT)
Message #26 received at 28506 <at> debbugs.gnu.org:
Pádraig,
The hang on APFS volumes doesn't seem to be related to CoreFoundation
threading. If I repeat the steps that I used to track down a similar issue
in make 4.0/4.1 by rebuilding libiconv with --disable-nls and coreutils
with the same --disable-nls so that neither are linked against
CoreFoundation, the test suite hang still occurs. Also, for the stock
build, adding your proposed timeout changes doesn't eliminate the hang in
the test suite either.
Jack
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Sat, 23 Sep 2017 03:05:02 GMT)
Message #29 received at 28506 <at> debbugs.gnu.org:
On 20/09/17 23:02, Jim Meyering wrote:
> Good idea.
> Do you think there should be a syntax-check rule to ensure that any
> timeout-using test first calls require_timeout_? This makes me wonder
> if we should make timeout a function that does that job (the first
> time only), and then exec's the real timeout command.
Yes, that would be better.
Also, functions for sleep, printf, etc. would be useful in
avoiding the need for explicit env and giving greater test coverage.
I'll work on that.
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Sat, 23 Sep 2017 03:08:01 GMT)
Message #32 received at 28506 <at> debbugs.gnu.org:
On 21/09/17 17:23, Jack Howarth wrote:
> Pádraig,
> The hang on APFS volumes doesn't seem to be related to CoreFoundation
> threading. If I repeat the steps that I used to track down a similar issue
> in make 4.0/4.1 by rebuilding libiconv with --disable-nls and coreutils
> with the same --disable-nls so that neither are linked against
> CoreFoundation, the test suite hang still occurs. Also, for the stock
> build, adding your proposed timeout changes doesn't eliminate the hang in
> the test suite either.
Is it a wait or a CPU spin?
Could you use the equivalent of strace on your platform to see what's happening?
thanks,
Pádraig
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Sun, 24 Sep 2017 02:48:02 GMT)
Message #35 received at 28506 <at> debbugs.gnu.org:
On 22/09/17 20:07, Pádraig Brady wrote:
> Is it a wait or a CPU spin?
> Could you use the equivalent of strace on your platform to see what's happening?
Off-list, Jack sent a profile showing /usr/bin/FILE was waiting on input.
That was the result of a silly typo in the script, which the attached
should fix. I don't know what that command does, nor why it's specifically
a problem on APFS, but hopefully this fixes things.
cheers,
Pádraig.
[filter-test-hang-macos.patch (text/x-patch, attachment)]
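The patch itself is not reproduced in this log, but the xtrace line `+ FILE = -` in the original report suggests the typo was an assignment written with spaces around `=`. Assuming that reading is right, the sketch below shows why such a line runs a command named FILE instead of setting a variable. The stand-in script and the temporary directory are hypothetical and exist only for this demonstration.

```shell
#!/bin/sh
# Sketch: "FILE = '-'" is a command invocation, not an assignment.
tmp=$(mktemp -d) || exit 1

# Hypothetical stand-in for whatever 'FILE' resolves to on PATH.
cat > "$tmp/FILE" <<'EOF'
#!/bin/sh
echo "FILE invoked with arguments: $*"
EOF
chmod +x "$tmp/FILE"
PATH="$tmp:$PATH"

unset FILE
out=$(FILE = '-')      # runs the command FILE with arguments "=" and "-"
echo "$out"
echo "after the spaced form, FILE is: ${FILE-unset}"

FILE='-'               # a real assignment: no spaces around "="
echo "after the unspaced form, FILE is: $FILE"
rm -rf "$tmp"
```

Where no command named FILE can be found, the spaced form merely fails with "command not found" and the script carries on with FILE unset, which would explain why the typo went unnoticed on other systems.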
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Sun, 24 Sep 2017 06:15:01 GMT)
Message #38 received at 28506 <at> debbugs.gnu.org:
On Sep 23 2017, Pádraig Brady <P <at> draigBrady.com> wrote:
> Off-list, Jack sent a profile showing /usr/bin/FILE was waiting on input.
> That was the result of a silly typo in the script, which the attached
> should fix. I don't know what that command does,
That's file(1) trying to analyze '-'.
> nor why it's specifically a problem on APFS,
Presumably APFS is case insensitive.
Andreas.
--
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
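Whether a given volume folds case can be probed directly. The following is a minimal sketch (the `fs_case` variable is a name chosen for this example) that creates a lowercase file in a temporary directory and then looks it up under an uppercase name. It tests only the filesystem holding the temporary directory, which may differ from the one holding /usr/bin or the build tree.

```shell
#!/bin/sh
# Probe: does the filesystem under the temp directory ignore case in lookups?
tmp=$(mktemp -d) || exit 1
: > "$tmp/file"

if test -e "$tmp/FILE"; then
  fs_case=insensitive   # "FILE" matched the on-disk name "file"
else
  fs_case=sensitive
fi
echo "filesystem under $tmp appears case-$fs_case"
rm -rf "$tmp"
```

On a case-insensitive volume, a PATH search for FILE can match an installed `file` binary in the same way, which fits the /usr/bin/FILE seen in the profile.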
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Sun, 24 Sep 2017 17:17:02 GMT)
Message #41 received at 28506 <at> debbugs.gnu.org:
On Sat, Sep 23, 2017 at 10:47 PM, Pádraig Brady <P <at> draigbrady.com> wrote:
> On 22/09/17 20:07, Pádraig Brady wrote:
> > Is it a wait or a CPU spin?
> > Could you use the equivalent of strace on your platform to see what's happening?
>
> Off-list, Jack sent a profile showing /usr/bin/FILE was waiting on input.
> That was the result of a silly typo in the script, which the attached
> should fix. I don't know what that command does, nor why it's specifically
> a problem on APFS, but hopefully this fixes things.
>
> cheers,
> Pádraig.
Pádraig.
Thanks. I can confirm that eliminates the test suite hang seen on 10.13 with
APFS volumes. FYI, the stock APFS is still case-insensitive on darwin17.
Jack
ps The only failure seen in the test suite is...
FAIL: tests/touch/trailing-slash.sh
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Sun, 24 Sep 2017 17:35:02 GMT)
Message #44 received at 28506 <at> debbugs.gnu.org:
Pádraig,
Attached are the tests/touch/trailing-slash.log and
tests/touch/trailing-slash.trs files generated from a build on an APFS
volume running 10.13 in case you can identify why that test is failing.
Jack
[trailing-slash.log (application/octet-stream, attachment)]
[trailing-slash.trs (application/octet-stream, attachment)]
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Mon, 25 Sep 2017 17:16:01 GMT)
Message #47 received at 28506 <at> debbugs.gnu.org:
On Sun, Sep 24, 2017 at 10:34 AM, Jack Howarth
<howarth.mailing.lists <at> gmail.com> wrote:
> On Sun, Sep 24, 2017 at 1:16 PM, Jack Howarth <
...
> Attached are the tests/touch/trailing-slash.log and
> tests/touch/trailing-slash.trs files generated from a build on an APFS
> volume running 10.13 in case you can identify why that test is failing.
That test is failing because your system allows "touch
symlink-to-file-specified-with-trailing-slash/" to succeed, e.g., here
is how it's supposed to work, but on your system touch (mistakenly)
succeeds:
$ : > k && ln -s k j && touch j/
touch: setting times of 'j/': Not a directory
When a non-directory name is specified with a trailing slash, many
interfaces are required by POSIX to fail with ENOTDIR. It looks like
one of those on your system goes ahead and performs the requested
operation as if that slash were not present.
We can probably teach gnulib to detect and work around this flaw.
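The check described above can be run on its own, outside the test suite. This is a small sketch under the assumption that `touch` and `ln -s` behave as usual in a scratch directory; the `trailing_slash` variable is a name made up for this example, and the script reports only whether the call succeeded, not which underlying interface accepted the name.

```shell
#!/bin/sh
# Probe: does touch(1) reject a symlink-to-file named with a trailing slash?
tmp=$(mktemp -d) || exit 1
: > "$tmp/k" && ln -s k "$tmp/j" || exit 1

if touch "$tmp/j/" 2>/dev/null; then
  trailing_slash=accepted   # the behaviour described above as mistaken
else
  trailing_slash=rejected   # the failure the test expects
fi
echo "touch on symlink-to-file with trailing slash: $trailing_slash"
rm -rf "$tmp"
```

A result of "accepted" would match the failing tests/touch/trailing-slash.sh run reported in this thread.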
Information forwarded to bug-coreutils <at> gnu.org: bug#28506; Package coreutils. (Tue, 30 Oct 2018 01:17:01 GMT)
Message #50 received at 28506 <at> debbugs.gnu.org:
tags 28506 fixed
close 28506
stop
(triaging old bugs)
On 2017-09-23 8:47 p.m., Pádraig Brady wrote:
> On 22/09/17 20:07, Pádraig Brady wrote:
>> Is it a wait or a CPU spin?
>> Could you use the equivalent of strace on your platform to see what's happening?
>
> Off-list, Jack sent a profile showing /usr/bin/FILE was waiting on input.
> That was the result of a silly typo in the script, which the attached
> should fix. I don't know what that command does, nor why it's specifically
> a problem on APFS, but hopefully this fixes things.
Pushed here:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=63d2f05f5283c88f6c60ebe6de7a26ce6b9e4ee8
so closing as "fixed".
-assaf
Added tag(s) fixed. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 30 Oct 2018 01:17:02 GMT)
bug closed, send any further explanations to 28506 <at> debbugs.gnu.org and Jack Howarth <howarth.mailing.lists <at> gmail.com>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 30 Oct 2018 01:17:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 27 Nov 2018 12:24:04 GMT)
This bug report was last modified 5 years and 122 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.