Package: dejagnu;
Reported by: Tom de Vries <tdevries <at> suse.de>
Date: Fri, 12 Jun 2020 08:36:01 UTC
Owned by: jcb62281 <at> gmail.com
Severity: normal
Done: Jacob Bachmeyer <jcb62281 <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 41824 in the body.
You can then email your comments to 41824 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Fri, 12 Jun 2020 08:36:01 GMT) Full text and rfc822 format available.Tom de Vries <tdevries <at> suse.de>
:bug-dejagnu <at> gnu.org
.
(Fri, 12 Jun 2020 08:36:01 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Tom de Vries <tdevries <at> suse.de> To: bug-dejagnu <at> gnu.org Subject: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Fri, 12 Jun 2020 10:35:43 +0200
Hi, the gdb testsuite uses the dejagnu framework. If I add a call to foobar at the end of a test-case in that testsuite, and I do a run of the testsuite, then the test-case aborts, and subsequently the testsuite run aborts, that is, no further tests are run. While it is as expected that the test-case aborts, it's not desirable that the testsuite run aborts. We've installed a hack in ${tool}_init/${tool}_finish to workaround this in the gdb testsuite ( https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=26783bce15adc0316f9167a216519cebcf1ccac3 ). It would preferable though if this problem was addressed in dejagnu, such that we can eventually remove the hack. A possible solution to this problem could be to make the exiting on error (which is done in dejagnu's version of proc unknown in framework.exp) optional. Thanks, - Tom
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Tue, 16 Jun 2020 04:06:01 GMT) Full text and rfc822 format available.Message #8 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Tom de Vries <tdevries <at> suse.de> Cc: 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Mon, 15 Jun 2020 23:05:33 -0500
Tom de Vries wrote: > Hi, > > the gdb testsuite uses the dejagnu framework. > Apologies for the delay in responding; I took a few days to think about this issue because it requires a careful balance, as silently continuing after aborting a testcase could cause errors to go unnoticed in large testsuites. > If I add a call to foobar at the end of a test-case in that testsuite, > and I do a run of the testsuite, then the test-case aborts, and > subsequently the testsuite run aborts, that is, no further tests are run. > > While it is as expected that the test-case aborts, it's not desirable > that the testsuite run aborts. > The main problem I have with this is that calling an unknown procedure is presumably a very serious error. Aborting the entire run is a good way to ensure that the problem is noticed, but perhaps with a testsuite as large (and long-running) as the GDB testsuite has become, continuing may also be reasonable to salvage at least some results. Where in the GDB testsuite is this a problem? When is it appropriate to continue after such a serious error? How to ensure that the error is not buried in the logs amidst the normal output from other tests and is actually noticed? > We've installed a hack in ${tool}_init/${tool}_finish to workaround this > in the gdb testsuite ( > https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=26783bce15adc0316f9167a216519cebcf1ccac3 > ). > > It would preferable though if this problem was addressed in dejagnu, > such that we can eventually remove the hack. > That hack (if not reverted entirely) needs to be made conditional on the actual existence of ::tcl_unknown as soon as possible -- the existence of ::tcl_unknown (kept to allow Tcl autoloading to work) is very much an internal implementation detail, and it *will* be moved out of the global namespace and eventually *will* cease to exist entirely in the interpreters that run testcases. Really, tcl_unknown is not supposed to be there and relying on it introduces a latent bug into the GDB testsuite. I am sorry, and I will seek to work with you to fix these problems with a minimum of breakage, but I have to put my foot down on this: I *cannot* treat every internal symbol as long-term stable API. That hack will *not* be supported long-term. Any GDB releases that include it *will* break with some as-yet-undetermined future DejaGnu release. I have to draw a line somewhere or I may as well declare DejaGnu unmaintained and frozen at 1.6.2 forever. I have no inclination to do the latter. > A possible solution to this problem could be to make the exiting on > error (which is done in dejagnu's version of proc unknown in > framework.exp) optional. > This is obviously a good solution, but leads to the obvious question of how (command-line option? (Make normally aborts on error, but has a --keep-going option.) configuration variable?) that should be determined, or if exit-on-unknown-procedure should ever be a default. Could simply reporting an UNRESOLVED test (indicating that the testcase script aborted due to calling an undefined command) be a better option? It would show up in the summary report at the end. At first, I was planning to schedule this for the 1.7 series, but I checked and runtest.exp:runtest already catches Tcl errors, so fixing this does not require a significant change to the normal control flow. Would simply reporting an "UNRESOLVED: testcase $file aborted on unknown command '$what'" result, then propagating the Tcl error in the ::unknown proc be a workable solution? -- Jacob
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Tue, 16 Jun 2020 15:55:02 GMT) Full text and rfc822 format available.Message #11 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Tom de Vries <tdevries <at> suse.de> To: jcb62281 <at> gmail.com Cc: 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Tue, 16 Jun 2020 17:53:54 +0200
[Message part 1 (text/plain, inline)]
On 6/16/20 6:05 AM, Jacob Bachmeyer wrote: > Tom de Vries wrote: >> Hi, >> >> the gdb testsuite uses the dejagnu framework. >> > > Apologies for the delay in responding; I took a few days to think about > this issue because it requires a careful balance, as silently continuing > after aborting a testcase could cause errors to go unnoticed in large > testsuites. > >> If I add a call to foobar at the end of a test-case in that testsuite, >> and I do a run of the testsuite, then the test-case aborts, and >> subsequently the testsuite run aborts, that is, no further tests are run. >> >> While it is as expected that the test-case aborts, it's not desirable >> that the testsuite run aborts. >> > > The main problem I have with this is that calling an unknown procedure > is presumably a very serious error. Aborting the entire run is a good > way to ensure that the problem is noticed, but perhaps with a testsuite > as large (and long-running) as the GDB testsuite has become, continuing > may also be reasonable to salvage at least some results. > > Where in the GDB testsuite is this a problem? When is it appropriate to > continue after such a serious error? It's not a problem in a specific test-case, but a generic problem. If somebody makes a typo in a test-case and checks it in, there's no need to abort the test run. > How to ensure that the error is > not buried in the logs amidst the normal output from other tests and is > actually noticed? > I monitor ERROR and WARNING in my test runs, so I didn't think of this, but indeed, those are not tracked in the summary, so that could be a problem, agreed. >> We've installed a hack in ${tool}_init/${tool}_finish to workaround this >> in the gdb testsuite ( >> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=26783bce15adc0316f9167a216519cebcf1ccac3 >> >> ). >> >> It would preferable though if this problem was addressed in dejagnu, >> such that we can eventually remove the hack. >> > > That hack (if not reverted entirely) needs to be made conditional on the > actual existence of ::tcl_unknown as soon as possible -- the existence > of ::tcl_unknown (kept to allow Tcl autoloading to work) is very much an > internal implementation detail, and it *will* be moved out of the global > namespace and eventually *will* cease to exist entirely in the > interpreters that run testcases. Really, tcl_unknown is not supposed to > be there and relying on it introduces a latent bug into the GDB testsuite. > > I am sorry, and I will seek to work with you to fix these problems with > a minimum of breakage, but I have to put my foot down on this: I > *cannot* treat every internal symbol as long-term stable API. That hack > will *not* be supported long-term. Any GDB releases that include it > *will* break with some as-yet-undetermined future DejaGnu release. I > have to draw a line somewhere or I may as well declare DejaGnu > unmaintained and frozen at 1.6.2 forever. I have no inclination to do > the latter. > Ack, makes sense. >> A possible solution to this problem could be to make the exiting on >> error (which is done in dejagnu's version of proc unknown in >> framework.exp) optional. >> > > This is obviously a good solution, but leads to the obvious question of > how (command-line option? (Make normally aborts on error, but has a > --keep-going option.) configuration variable?) that should be > determined, or if exit-on-unknown-procedure should ever be a default. > Could simply reporting an UNRESOLVED test (indicating that the testcase > script aborted due to calling an undefined command) be a better option? > It would show up in the summary report at the end. > > At first, I was planning to schedule this for the 1.7 series, but I > checked and runtest.exp:runtest already catches Tcl errors, so fixing > this does not require a significant change to the normal control flow. > Would simply reporting an "UNRESOLVED: testcase $file aborted on > unknown command '$what'" result, then propagating the Tcl error in the > ::unknown proc be a workable solution? I think so, yes. I've tried that out in attached gdb patch, which also solves the reliance on ::tcl_unknown. Then when running into the unknown command in the test-case, we have in gdb.sum: ... Running /home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp ... PASS: gdb.ada/O2_float_param.exp: compilation foo.adb PASS: gdb.ada/O2_float_param.exp: frame UNRESOLVED: gdb.ada/O2_float_param.exp: testcase aborted on unknown command 'foo' ERROR: tcl error sourcing /home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp. ERROR: invalid command name "foo" while executing "::gdb_unknown foo" ("uplevel" body line 1) invoked from within "uplevel 1 ::gdb_unknown $args" (procedure "::unknown" line 4) invoked from within "foo" (file "/home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp" line 33) invoked from within "source /home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp" ("uplevel" body line 1) invoked from within "uplevel #0 source /home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp" invoked from within "catch "uplevel #0 source $test_file_name"" === gdb Summary === # of expected passes 2 # of unresolved testcases 1 ... I get similar results if I disable the hack in the gdb testsuite, and use attached runtest.exp demonstrator patch. Note btw that in both cases I didn't put $file in the unresolved message, because that's already present in pf_prefix, but that might not always be true. Thanks, - Tom
[0001-fix.patch (text/x-patch, inline)]
fix --- gdb/testsuite/lib/gdb.exp | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp index f502eb157d..5cb9e7907c 100644 --- a/gdb/testsuite/lib/gdb.exp +++ b/gdb/testsuite/lib/gdb.exp @@ -5177,6 +5177,13 @@ proc gdb_cleanup_globals {} { } } +# Create gdb_unknown, a copy tcl's ::unknown. +set temp [interp create] +set old_args [interp eval $temp "info args ::unknown"] +set old_body [interp eval $temp "info body ::unknown"] +interp delete $temp +eval proc gdb_unknown {$old_args} {$old_body} + proc gdb_init { test_file_name } { # Reset the timeout value to the default. This way, any testcase # that changes the timeout value without resetting it cannot affect @@ -5285,12 +5292,13 @@ proc gdb_init { test_file_name } { # Dejagnu overrides proc unknown. The dejagnu version may trigger in a # test-case but abort the entire test run. To fix this, we install a - # local version here, which reverts dejagnu's override, and restore - # dejagnu's version in gdb_finish. + # local version here, which instead calls unresolved, and then propagates the + # error to tcl's unknown. In gdb_finish, we restore dejagnu's version. rename ::unknown ::dejagnu_unknown proc unknown { args } { - # Dejagnu saves the original version in ::tcl_unknown, use it. - return [uplevel 1 ::tcl_unknown $args] + # Restore ::unknown. + unresolved "testcase aborted on unknown command '$args'" + return [uplevel 1 ::gdb_unknown $args] } return $res
[runtest-exp.patch (text/x-patch, inline)]
--- runtest.exp.orig 2020-06-16 17:36:15.038208916 +0200 +++ runtest.exp 2020-06-16 17:38:19.189251033 +0200 @@ -1431,6 +1431,11 @@ return $found } +proc test_case_unknown { args } { + unresolved "testcase aborted on unknown command '$args'" + return [uplevel 1 ::tcl_unknown $args] +} + # # Source the testcase in TEST_FILE_NAME. # @@ -1457,6 +1462,9 @@ } } + rename ::unknown dejagnu_unknown + rename ::test_case_unknown ::unknown + if { [catch "uplevel #0 source $test_file_name"] == 1 } { # If we have a Tcl error, propagate the exit status so # that 'make' (if it invokes runtest) notices the error. @@ -1476,6 +1484,9 @@ } } + rename ::unknown ::test_case_unknown + rename dejagnu_unknown ::unknown + if {[info exists tool]} { if { [info procs "${tool}_finish"] != "" } { ${tool}_finish
Jacob Bachmeyer <jcb62281 <at> gmail.com>
to control <at> debbugs.gnu.org
.
(Wed, 17 Jun 2020 02:27:05 GMT) Full text and rfc822 format available.bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Wed, 17 Jun 2020 03:27:02 GMT) Full text and rfc822 format available.Message #16 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Tom de Vries <tdevries <at> suse.de> Cc: 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Tue, 16 Jun 2020 22:26:24 -0500
Tom de Vries wrote: > On 6/16/20 6:05 AM, Jacob Bachmeyer wrote: > >> Tom de Vries wrote: >> >>> If I add a call to foobar at the end of a test-case in that testsuite, >>> and I do a run of the testsuite, then the test-case aborts, and >>> subsequently the testsuite run aborts, that is, no further tests are run. >>> >>> While it is as expected that the test-case aborts, it's not desirable >>> that the testsuite run aborts. >>> >>> >> The main problem I have with this is that calling an unknown procedure >> is presumably a very serious error. Aborting the entire run is a good >> way to ensure that the problem is noticed, but perhaps with a testsuite >> as large (and long-running) as the GDB testsuite has become, continuing >> may also be reasonable to salvage at least some results. >> >> Where in the GDB testsuite is this a problem? When is it appropriate to >> continue after such a serious error? >> > > It's not a problem in a specific test-case, but a generic problem. If > somebody makes a typo in a test-case and checks it in, there's no need > to abort the test run. > So the root cause problem is people committing changes to the GDB testsuite without actually testing their own changes first, at all? >> How to ensure that the error is >> not buried in the logs amidst the normal output from other tests and is >> actually noticed? >> > > I monitor ERROR and WARNING in my test runs, so I didn't think of this, > but indeed, those are not tracked in the summary, so that could be a > problem, agreed. > Blowing up the entire test run is a good way to ensure that problems (like typos in testcases) are noticed, but admittedly, that does not help when people fail to even run the tests before checking in their typos and then other developers' test runs blow up. Murphy's Law says the offenders then carry on to check in more typos in other code without testing their own work there either... >>> A possible solution to this problem could be to make the exiting on >>> error (which is done in dejagnu's version of proc unknown in >>> framework.exp) optional. >>> >>> >> This is obviously a good solution, but leads to the obvious question of >> how (command-line option? (Make normally aborts on error, but has a >> --keep-going option.) configuration variable?) that should be >> determined, or if exit-on-unknown-procedure should ever be a default. >> Could simply reporting an UNRESOLVED test (indicating that the testcase >> script aborted due to calling an undefined command) be a better option? >> It would show up in the summary report at the end. >> >> At first, I was planning to schedule this for the 1.7 series, but I >> checked and runtest.exp:runtest already catches Tcl errors, so fixing >> this does not require a significant change to the normal control flow. >> Would simply reporting an "UNRESOLVED: testcase $file aborted on >> unknown command '$what'" result, then propagating the Tcl error in the >> ::unknown proc be a workable solution? >> > > I think so, yes. I've tried that out in attached gdb patch, which also > solves the reliance on ::tcl_unknown. > That patch is subtly wrong in other ways. First, it assumes that ::unknown is a procedure and not a command; there is no guarantee of this in the Tcl manual. Second, it will fail if run inside a safe interpreter where the default "unknown" does not exist. Third, it breaks Tcl's default autoloading because it adds an UNRESOLVED result regardless of the result of Tcl's autoload search. The best solution is to just revert the hacks and wait for this to be solved upstream, where it is expected to land for the 1.6.3 release. > [...] > Note btw that in both cases I didn't put $file in the unresolved > message, because that's already present in pf_prefix, but that might not > always be true. > The use of pf_prefix appears to be a feature of the GDB testsuite, so the upstream implementation of this feature will not be able to rely on it. I am hesitant to alter default behavior for 1.6.3, so this will probably be added as a "--keep-going" option to runtest. I should be able to land this for 1.6.3; it is simple enough to do and we are already getting some other new command-line options in 1.6.3 anyway that were needed to fix DejaGnu's own testsuite. As of yet, there is no (supported) way planned to set the --keep-going option from an init file, only on the actual command-line. This may change in the future; if it does, there will be some general API or just a special parse of ~/.dejagnurc looking for command-line options. These are planned for 1.7 at the earliest. -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Wed, 17 Jun 2020 14:14:02 GMT) Full text and rfc822 format available.Message #19 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Tom de Vries <tdevries <at> suse.de> To: jcb62281 <at> gmail.com Cc: 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Wed, 17 Jun 2020 16:13:17 +0200
On 6/17/20 5:26 AM, Jacob Bachmeyer wrote: > Tom de Vries wrote: >> On 6/16/20 6:05 AM, Jacob Bachmeyer wrote: >> >>> Tom de Vries wrote: >>> >>>> If I add a call to foobar at the end of a test-case in that testsuite, >>>> and I do a run of the testsuite, then the test-case aborts, and >>>> subsequently the testsuite run aborts, that is, no further tests are >>>> run. >>>> >>>> While it is as expected that the test-case aborts, it's not desirable >>>> that the testsuite run aborts. >>>> >>> The main problem I have with this is that calling an unknown procedure >>> is presumably a very serious error. Aborting the entire run is a good >>> way to ensure that the problem is noticed, but perhaps with a testsuite >>> as large (and long-running) as the GDB testsuite has become, continuing >>> may also be reasonable to salvage at least some results. >>> >>> Where in the GDB testsuite is this a problem? When is it appropriate to >>> continue after such a serious error? >>> >> >> It's not a problem in a specific test-case, but a generic problem. If >> somebody makes a typo in a test-case and checks it in, there's no need >> to abort the test run. >> > > So the root cause problem is people committing changes to the GDB > testsuite without actually testing their own changes first, at all? > >>> How to ensure that the error is >>> not buried in the logs amidst the normal output from other tests and is >>> actually noticed? >>> >> >> I monitor ERROR and WARNING in my test runs, so I didn't think of this, >> but indeed, those are not tracked in the summary, so that could be a >> problem, agreed. >> > > Blowing up the entire test run is a good way to ensure that problems > (like typos in testcases) are noticed, but admittedly, that does not > help when people fail to even run the tests before checking in their > typos and then other developers' test runs blow up. Murphy's Law says > the offenders then carry on to check in more typos in other code without > testing their own work there either... > >>>> A possible solution to this problem could be to make the exiting on >>>> error (which is done in dejagnu's version of proc unknown in >>>> framework.exp) optional. >>>> >>> This is obviously a good solution, but leads to the obvious question of >>> how (command-line option? (Make normally aborts on error, but has a >>> --keep-going option.) configuration variable?) that should be >>> determined, or if exit-on-unknown-procedure should ever be a default. >>> Could simply reporting an UNRESOLVED test (indicating that the testcase >>> script aborted due to calling an undefined command) be a better >>> option? It would show up in the summary report at the end. >>> >>> At first, I was planning to schedule this for the 1.7 series, but I >>> checked and runtest.exp:runtest already catches Tcl errors, so fixing >>> this does not require a significant change to the normal control >>> flow. Would simply reporting an "UNRESOLVED: testcase $file aborted on >>> unknown command '$what'" result, then propagating the Tcl error in the >>> ::unknown proc be a workable solution? >>> >> >> I think so, yes. I've tried that out in attached gdb patch, which also >> solves the reliance on ::tcl_unknown. >> > > That patch is subtly wrong in other ways. First, it assumes that > ::unknown is a procedure and not a command; there is no guarantee of > this in the Tcl manual. Second, it will fail if run inside a safe > interpreter where the default "unknown" does not exist. Third, it > breaks Tcl's default autoloading because it adds an UNRESOLVED result > regardless of the result of Tcl's autoload search. > Your last remark here got me thinking about dejagnu's unknown proc, and I've filed two bugs: - bug#41914: Propagate return value of auto-loaded command - bug#41918: Propagate error value of auto-loaded command > The best solution is to just revert the hacks and wait for this to be > solved upstream, where it is expected to land for the 1.6.3 release. > >> [...] >> Note btw that in both cases I didn't put $file in the unresolved >> message, because that's already present in pf_prefix, but that might not >> always be true. >> > The use of pf_prefix appears to be a feature of the GDB testsuite, so > the upstream implementation of this feature will not be able to rely on it. > > > I am hesitant to alter default behavior for 1.6.3, so this will probably > be added as a "--keep-going" option to runtest.> I should be able to > land this for 1.6.3; it is simple enough to do and we are already > getting some other new command-line options in 1.6.3 anyway that were > needed to fix DejaGnu's own testsuite. As of yet, there is no > (supported) way planned to set the --keep-going option from an init > file, only on the actual command-line. This may change in the future; > if it does, there will be some general API or just a special parse of > ~/.dejagnurc looking for command-line options. These are planned for > 1.7 at the earliest. I guess it would be convenient if we could switch this on somehow from lib/${tool}.exp. Thanks, - Tom
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Wed, 17 Jun 2020 23:20:02 GMT) Full text and rfc822 format available.Message #22 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Tom de Vries <tdevries <at> suse.de> Cc: 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Wed, 17 Jun 2020 18:18:52 -0500
[Message part 1 (text/plain, inline)]
Tom de Vries wrote: > On 6/17/20 5:26 AM, Jacob Bachmeyer wrote: > >> Tom de Vries wrote: >> >>> On 6/16/20 6:05 AM, Jacob Bachmeyer wrote: >>> >>> >>>> Tom de Vries wrote: >>>> >>>>> A possible solution to this problem could be to make the exiting on >>>>> error (which is done in dejagnu's version of proc unknown in >>>>> framework.exp) optional. >>>>> >>>>> >>>> This is obviously a good solution, but leads to the obvious question of >>>> how (command-line option? (Make normally aborts on error, but has a >>>> --keep-going option.) configuration variable?) that should be >>>> determined, or if exit-on-unknown-procedure should ever be a default. >>>> Could simply reporting an UNRESOLVED test (indicating that the testcase >>>> script aborted due to calling an undefined command) be a better >>>> option? It would show up in the summary report at the end. >>>> >>>> At first, I was planning to schedule this for the 1.7 series, but I >>>> checked and runtest.exp:runtest already catches Tcl errors, so fixing >>>> this does not require a significant change to the normal control >>>> flow. Would simply reporting an "UNRESOLVED: testcase $file aborted on >>>> unknown command '$what'" result, then propagating the Tcl error in the >>>> ::unknown proc be a workable solution? >>>> >>>> >>> I think so, yes. I've tried that out in attached gdb patch, which also >>> solves the reliance on ::tcl_unknown. >>> >>> >> That patch is subtly wrong in other ways. First, it assumes that >> ::unknown is a procedure and not a command; there is no guarantee of >> this in the Tcl manual. Second, it will fail if run inside a safe >> interpreter where the default "unknown" does not exist. Third, it >> breaks Tcl's default autoloading because it adds an UNRESOLVED result >> regardless of the result of Tcl's autoload search. >> > > Your last remark here got me thinking about dejagnu's unknown proc, and > I've filed two bugs: > - bug#41914: Propagate return value of auto-loaded command > - bug#41918: Propagate error value of auto-loaded command > I believe both of these were fixed while working on this issue in the attached patch. Apologies for the delay; I wanted tests for the new feature. >> The best solution is to just revert the hacks and wait for this to be >> solved upstream, where it is expected to land for the 1.6.3 release. >> >> >>> [...] >>> Note btw that in both cases I didn't put $file in the unresolved >>> message, because that's already present in pf_prefix, but that might not >>> always be true. >>> >>> >> The use of pf_prefix appears to be a feature of the GDB testsuite, so >> the upstream implementation of this feature will not be able to rely on it. >> >> >> I am hesitant to alter default behavior for 1.6.3, so this will probably >> be added as a "--keep-going" option to runtest.> I should be able to >> land this for 1.6.3; it is simple enough to do and we are already >> getting some other new command-line options in 1.6.3 anyway that were >> needed to fix DejaGnu's own testsuite. As of yet, there is no >> (supported) way planned to set the --keep-going option from an init >> file, only on the actual command-line. This may change in the future; >> if it does, there will be some general API or just a special parse of >> ~/.dejagnurc looking for command-line options. These are planned for >> 1.7 at the earliest. >> > > I guess it would be convenient if we could switch this on somehow from > lib/${tool}.exp. The problem here is that this detects an invalidity in the testsuite. Allowing testsuites to select --keep_going is asking (Murphy's Law again) for people to turn it on and then ignore the errors from broken test scripts "because it works" -- never mind that it is obviously broken, but an extra UNRESOLVED or three can still be easily ignored. The entire test run stopping cold unless a special command-line option is given that is specifically documented to suppress abort-on-error is much harder to explain away or even try to justify as acceptable. -- Jacob
[0001-Allow-testing-to-continue-after-an-undefined-command.patch (text/plain, inline)]
From c5b21f1f1cfaabf1431010c314aadcc0b7b708f0 Mon Sep 17 00:00:00 2001 From: Jacob Bachmeyer <jcb62281+dev <at> gmail.com> Date: Wed, 17 Jun 2020 18:08:57 -0500 Subject: [PATCH] Allow testing to continue after an undefined command is called --- ChangeLog | 26 +++++++ Makefile.am | 2 +- Makefile.in | 2 +- NEWS | 2 + doc/dejagnu.texi | 5 + doc/runtest.1 | 3 + lib/framework.exp | 24 +++++- runtest.exp | 12 +++ testsuite/runtest.main/abort.exp | 78 ++++++++++++++++++++ .../abort/testsuite/abort.test/abort-undef.exp | 25 ++++++ .../abort/testsuite/abort.test/simple.exp | 21 +++++ 11 files changed, 195 insertions(+), 5 deletions(-) create mode 100644 testsuite/runtest.main/abort.exp create mode 100644 testsuite/runtest.main/abort/testsuite/abort.test/abort-undef.exp create mode 100644 testsuite/runtest.main/abort/testsuite/abort.test/simple.exp diff --git a/ChangeLog b/ChangeLog index 69a80ef..2351101 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,29 @@ +2020-06-17 Jacob Bachmeyer <jcb62281+dev <at> gmail.com> + + PR 41824 + + * NEWS: Add item for --keep_going option. + + * Makefile.am (CLEANFILES): Add abort-init.exp to list. + + * doc/dejagnu.texi (Invoking runtest): Document new --keep_going + command line option. + * doc/runtest.1: Likewise. + + * lib/framework.exp (unknown): Report an UNRESOLVED result if an + unknown command is invoked. Avoid exiting and propagate the error + from Tcl's "unknown" procedure if --keep_going was + specified. Brace procedure argument list. + * runtest.exp (dejagnu::opt): New namespace. + Add option --keep_going to continue running tests after a test + script aborts due to calling an undefined command. + + * testsuite/runtest.main/abort.exp: New file. + * testsuite/runtest.main/abort/testsuite/abort.test/abort-undef.exp: + New file. + * testsuite/runtest.main/abort/testsuite/abort.test/simple.exp: + New file. + 2020-06-02 Jacob Bachmeyer <jcb62281+dev <at> gmail.com> PR 41647 diff --git a/Makefile.am b/Makefile.am index 6c48c20..9a2ba81 100644 --- a/Makefile.am +++ b/Makefile.am @@ -26,7 +26,7 @@ EXTRA_DIST = ChangeLog-1992 MAINTAINERS dejagnu runtest \ $(commands_DATA) $(TESTSUITE_FILES) $(TEXINFO_TEX)\ $(CONTRIB) -CLEANFILES = options-init.exp stats-init.exp +CLEANFILES = abort-init.exp options-init.exp stats-init.exp clean-local: clean-local-check .PHONY: clean-local-check diff --git a/Makefile.in b/Makefile.in index 1cd211c..bc9d1e5 100644 --- a/Makefile.in +++ b/Makefile.in @@ -381,7 +381,7 @@ EXTRA_DIST = ChangeLog-1992 MAINTAINERS dejagnu runtest \ $(commands_DATA) $(TESTSUITE_FILES) $(TEXINFO_TEX)\ $(CONTRIB) -CLEANFILES = options-init.exp stats-init.exp +CLEANFILES = abort-init.exp options-init.exp stats-init.exp bin_SCRIPTS = dejagnu runtest include_HEADERS = dejagnu.h pkgdata_DATA = \ diff --git a/NEWS b/NEWS index 209e9c4..4354422 100644 --- a/NEWS +++ b/NEWS @@ -7,6 +7,8 @@ Changes since 1.6.2: should use this proc. The 'is_remote' proc is deprecated. 2. runtest now accepts --local_init and --global_init options to override the default of reading "site.exp". See the manual for details. +X. runtest now accepts a --keep_going option to continue with other test + scripts after a test script invokes an undefined command. 3. A utility procedure relative_filename has been added. This procedure computes a relative file name to a given destination from a given base. 4. The utility procedure 'grep' now accepts a '-n' option that diff --git a/doc/dejagnu.texi b/doc/dejagnu.texi index bb386e8..cd8c4e9 100644 --- a/doc/dejagnu.texi +++ b/doc/dejagnu.texi @@ -644,6 +644,11 @@ The host board to use. @item @code{--ignore [tests(s)] } The name(s) of specific tests to ignore. +@item @code{--keep_going} +Continue testing when test scripts encounter recoverable fatal errors. +Such errors always cause the offending test script to abort +immediately, but DejaGnu may be able to continue with the next script. + @item @code{--local_init [name]} Use @emph{name} as the testsuite local init file instead of @file{site.exp} in the current directory and in @emph{objdir}. The diff --git a/doc/runtest.1 b/doc/runtest.1 index d043ee7..e8913b1 100644 --- a/doc/runtest.1 +++ b/doc/runtest.1 @@ -45,6 +45,9 @@ The host board definition to use. .BI --ignore \ test1.exp\ test2.exp\ ... Do not run the specified tests. .TP +.B --keep_going +Do not abort test run if a test script encounters a fatal error. +.TP .BI --local_init \ NAME The NAME to use for the testsuite local init file in both the current directory and objdir. diff --git a/lib/framework.exp b/lib/framework.exp index e6ce197..53333ad 100644 --- a/lib/framework.exp +++ b/lib/framework.exp @@ -257,21 +257,39 @@ proc isnative { } { # This allows Tcl package autoloading to work in the modern age. rename ::unknown ::tcl_unknown -proc unknown args { - if {[catch {uplevel 1 ::tcl_unknown $args} msg]} { +proc unknown { args } { + set code [catch {uplevel 1 ::tcl_unknown $args} msg] + if { $code != 0 } { global errorCode global errorInfo global exit_status + set ret_cmd [list return -code $code] + clone_output "ERROR: (DejaGnu) proc \"$args\" does not exist." if {[info exists errorCode]} { + lappend ret_cmd -errorcode $errorCode send_error "The error code is $errorCode\n" } if {[info exists errorInfo]} { + # omitting errorInfo from the propagated error makes this code + # invisible with the backtrace pointing directly to the problem send_error "The info on the error is:\n$errorInfo\n" } set exit_status 2 - log_and_exit + + set unresolved_msg "testcase '[uplevel info script]' aborted" + append unresolved_msg " at call to unknown command '$args'" + unresolved $unresolved_msg + + lappend ret_cmd $msg + if { $::dejagnu::opt::keep_going } { + eval $ret_cmd + } else { + log_and_exit + } + } else { + return $msg } } diff --git a/runtest.exp b/runtest.exp index ba49e73..028ad5b 100644 --- a/runtest.exp +++ b/runtest.exp @@ -97,6 +97,13 @@ set global_init_file site.exp ;# global init file name set testsuitedir "testsuite" ;# top-level testsuite source directory set testbuilddir "testsuite" ;# top-level testsuite object directory +# +# These are used for internal command-line flags. +# +namespace eval ::dejagnu::opt { + variable keep_going 0 ;# continue after a fatal error in testcase? +} + # Various ccache versions provide incorrect debug info such as ignoring # different current directory, breaking GDB testsuite. set env(CCACHE_DISABLE) 1 @@ -372,6 +379,7 @@ proc usage { } { send_user "\t--host \[triplet\]\tThe canonical triplet of the host machine\n" send_user "\t--host_board \[name\]\tThe host board to use\n" send_user "\t--ignore \[name(s)\]\tThe names of specific tests to ignore\n" + send_user "\t--keep_going\t\tContinue testing even if a script aborts\n" send_user "\t--local_init \[name\]\tThe file to load for local configuration\n" send_user "\t--log_dialog\t\t\Emit Expect output on stdout\n" send_user "\t--mail \[name(s)\]\tWhom to mail the results to\n" @@ -1201,6 +1209,10 @@ for { set i 0 } { $i < $argc } { incr i } { continue } + "--k*" { # (--keep_going) reduce fatal errors + set ::dejagnu::opt::keep_going 1 + } + "--m*" { # (--mail) mail the output set mailing_list $optarg set mail_logs 1 diff --git a/testsuite/runtest.main/abort.exp b/testsuite/runtest.main/abort.exp new file mode 100644 index 0000000..c5f7014 --- /dev/null +++ b/testsuite/runtest.main/abort.exp @@ -0,0 +1,78 @@ +# Copyright (C) 1995-2016, 2018, 2020 Free Software Foundation, Inc. +# +# This file is part of DejaGnu. +# +# DejaGnu is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# DejaGnu is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with DejaGnu; if not, write to the Free Software Foundation, +# Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA. + +# This file tests handling of fatal errors in testcases. +# The way we do this is to recursively invoke ourselves on a small testsuite +# and analyze the results. + +load_lib util-defs.exp + +if {![info exists tmpdir]} { + set tmpdir [testsuite file -object -top tmpdir] +} + +set fd [open abort-init.exp w] +puts $fd "set srcdir [testsuite file -source -test abort]" +puts $fd "set objdir [testsuite file -object -test abort]" +puts $fd "set tmpdir $tmpdir" +close $fd + +if {![file isdirectory $tmpdir]} { + catch "file mkdir $tmpdir" +} + +if {![file isdirectory [testsuite file -object -test abort]]} { + catch {file mkdir [testsuite file -object -test abort]} +} + +set tests { + { "run only simple test" + "simple.exp" + "PASS: simple test.*\ + *expected passes\[ \t\]+1\n" } + { "abort on undefined command" + "abort-undef.exp" + "PASS: running abort-undef.exp.*\ + *UNRESOLVED: .* aborted at call to unknown command.*\ + *expected passes\[ \t\]+1\n.*unresolved testcases\[ \t\]+1\n" } + { "stop at abort without --keep_going" + "abort-undef.exp simple.exp" + "PASS: running abort-undef.exp.*\ + *UNRESOLVED: .* aborted at call to unknown command.*\ + *expected passes\[ \t\]+1\n.*unresolved testcases\[ \t\]+1\n" } + { "continue after abort with --keep_going" + "--keep_going abort-undef.exp simple.exp" + "PASS: running abort-undef.exp.*\ + *UNRESOLVED: .* aborted at call to unknown command.*\ + *PASS: simple test.*\ + *expected passes\[ \t\]+2\n.*unresolved testcases\[ \t\]+1\n" } +} + +foreach t $tests { + if [util_test $RUNTEST \ + "--local_init abort-init.exp\ + --outdir $tmpdir -a [lindex $t 1]" \ + "" \ + [lindex $t 2]] { + fail [lindex $t 0] + } else { + pass [lindex $t 0] + } +} + +file delete -force $tmpdir diff --git a/testsuite/runtest.main/abort/testsuite/abort.test/abort-undef.exp b/testsuite/runtest.main/abort/testsuite/abort.test/abort-undef.exp new file mode 100644 index 0000000..e5f4803 --- /dev/null +++ b/testsuite/runtest.main/abort/testsuite/abort.test/abort-undef.exp @@ -0,0 +1,25 @@ +# Copyright (C) 2020 Free Software Foundation, Inc. +# +# This file is part of DejaGnu. +# +# DejaGnu is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# DejaGnu is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with DejaGnu; if not, write to the Free Software Foundation, +# Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA. + +# Invoke an undefined command, causing a fatal error. + +pass "running abort-undef.exp" + +bogus_command 1 2 3 4 + +fail "script did not abort" diff --git a/testsuite/runtest.main/abort/testsuite/abort.test/simple.exp b/testsuite/runtest.main/abort/testsuite/abort.test/simple.exp new file mode 100644 index 0000000..93a03e7 --- /dev/null +++ b/testsuite/runtest.main/abort/testsuite/abort.test/simple.exp @@ -0,0 +1,21 @@ +# Copyright (C) 2020 Free Software Foundation, Inc. +# +# This file is part of DejaGnu. +# +# DejaGnu is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# DejaGnu is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with DejaGnu; if not, write to the Free Software Foundation, +# Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA. + +# Return a passing result + +pass "simple test" -- 1.7.4.1
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Tue, 23 Jun 2020 18:46:02 GMT) Full text and rfc822 format available.Message #25 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com, Tom de Vries <tdevries <at> suse.de> Cc: 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Tue, 23 Jun 2020 19:45:04 +0100
On 6/17/20 4:26 AM, Jacob Bachmeyer wrote: >> >> I monitor ERROR and WARNING in my test runs, so I didn't think of this, >> but indeed, those are not tracked in the summary, so that could be a >> problem, agreed. >> > > Blowing up the entire test run is a good way to ensure that problems (like typos in testcases) are noticed, but admittedly, that does not help when people fail to even run the tests before checking in their typos and then other developers' test runs blow up. Murphy's Law says the offenders then carry on to check in more typos in other code without testing their own work there either... There's conditional execution in the gdb testsuite, where tests may run different code paths depending on the target. Typos can happen in code paths that the developer isn't able to exercise. You can imagine someone missing something when doing some large refactoring in the testsuite. I'd think that tracking ERROR and WARNING in the summary would sort out this "test run blew up but no one noticed" issue. Thanks, Pedro Alves
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Tue, 23 Jun 2020 23:38:01 GMT) Full text and rfc822 format available.Message #28 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Tue, 23 Jun 2020 18:36:50 -0500
Pedro Alves wrote: > On 6/17/20 4:26 AM, Jacob Bachmeyer wrote: > >>> I monitor ERROR and WARNING in my test runs, so I didn't think of this, >>> but indeed, those are not tracked in the summary, so that could be a >>> problem, agreed. >>> >>> >> Blowing up the entire test run is a good way to ensure that problems (like typos in testcases) are noticed, but admittedly, that does not help when people fail to even run the tests before checking in their typos and then other developers' test runs blow up. Murphy's Law says the offenders then carry on to check in more typos in other code without testing their own work there either... >> > > There's conditional execution in the gdb testsuite, where tests may > run different code paths depending on the target. Typos can happen in > code paths that the developer isn't able to exercise. You can > imagine someone missing something when doing some large refactoring > in the testsuite. > This is a good point and is far more reasonable than checking in code without testing it at all. That sounds like an architectural issue with the GDB testsuite, but could also be a legitimate problem if the per-target handling really is different for some targets, especially if simulators for those targets are unavailable or insufficient for the tests. > I'd think that tracking ERROR and WARNING in the summary would sort > out this "test run blew up but no one noticed" issue. The problem is that the DejaGnu internals often generate ERROR and WARNING "directly" without using the procedures that update counters. The solution I am using in the current PR41824 patch (which has been rolled into the PR41918 patch) generates an extra UNRESOLVED result when a Tcl error is caught in runtest, which both appears in the summary and clearly indicates that the test run did not produce valid results. I am currently considering also adding a '--no-keep-going'/'--no_keep_going' option and making '--keep-going' the default for the rest of the 1.6 series. Aborting on uncaught Tcl errors will be the default in 1.7, but I am trying to decide whether changing it in 1.6 counts more as fixing a bug (DejaGnu quietly sweeping Tcl errors under the proverbial rug) or introducing an environment change (needing '--keep-going' to reach the end of some testsuites that "worked" before). This is further complicated by the fact that any testsuite needing that option to reach the end means that, technically, that testsuite is invalid -- it *did* throw a Tcl error during testing. -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Wed, 24 Jun 2020 09:41:01 GMT) Full text and rfc822 format available.Message #31 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Wed, 24 Jun 2020 10:40:45 +0100
On 6/24/20 12:36 AM, Jacob Bachmeyer wrote: > Pedro Alves wrote: >> On 6/17/20 4:26 AM, Jacob Bachmeyer wrote: >> >>>> I monitor ERROR and WARNING in my test runs, so I didn't think of this, >>>> but indeed, those are not tracked in the summary, so that could be a >>>> problem, agreed. >>>> >>> Blowing up the entire test run is a good way to ensure that problems (like typos in testcases) are noticed, but admittedly, that does not help when people fail to even run the tests before checking in their typos and then other developers' test runs blow up. Murphy's Law says the offenders then carry on to check in more typos in other code without testing their own work there either... >>> >> >> There's conditional execution in the gdb testsuite, where tests may >> run different code paths depending on the target. Typos can happen in >> code paths that the developer isn't able to exercise. You can >> imagine someone missing something when doing some large refactoring >> in the testsuite. >> > > This is a good point and is far more reasonable than checking in code without testing it at all. That sounds like an architectural issue with the GDB testsuite, but could also be a legitimate problem if the per-target handling really is different for some targets, especially if simulators for those targets are unavailable or insufficient for the tests. It's not an architectural issue -- simply different targets support a different set of features, support the same features but differently, run different OSs or even no OS at all, are built with different compilers, support a different set of languages, etc., etc. > >> I'd think that tracking ERROR and WARNING in the summary would sort >> out this "test run blew up but no one noticed" issue. > > The problem is that the DejaGnu internals often generate ERROR and WARNING "directly" without using the procedures that update counters. I understand that that's what DejaGnu does currently, but couldn't it be changed? Would it not be desirable? The solution I am using in the current PR41824 patch (which has been rolled into the PR41918 patch) generates an extra UNRESOLVED result when a Tcl error is caught in runtest, which both appears in the summary and clearly indicates that the test run did not produce valid results. > > I am currently considering also adding a '--no-keep-going'/'--no_keep_going' option and making '--keep-going' the default for the rest of the 1.6 series. Aborting on uncaught Tcl errors will be the default in 1.7, but I am trying to decide whether changing it in 1.6 counts more as fixing a bug (DejaGnu quietly sweeping Tcl errors under the proverbial rug) or introducing an environment change (needing '--keep-going' to reach the end of some testsuites that "worked" before). This is further complicated by the fact that any testsuite needing that option to reach the end means that, technically, that testsuite is invalid -- it *did* throw a Tcl error during testing. It seems your designing this under the assumption that if a testsuite run "blows up", it's not a big deal to just fix it and rerun the testsuite. What that misses is that in some environments, particularly with slow embedded system boards, running the full testsuite can take _days_. This is all compounded by the fact that the testsuite must be run in a large set of different modes and boards to cover everything for a single target. Even on a modern GNU/Linux system, running all combinations of modes for full coverage can take hours. Having the testrun stop midway because of some specific testcase blowing up can waste a lot of time. The fact that the testsuite can take a while to fully run results in automated testing systems, like GDB's buildbot, struggling to keep up with testing all the commits that land in git. On some buildbot setups, it's just not possible to test all commits. So if some testcase blows up the whole test run, the next time you run the testsuite, it'll likely be that a large number of unrelated commits went into the sources. Losing a large chunk of the testsuite results because of some small typo in one testcase is thus undesirable, because noticing the typo in the first place can take a while, and also you may be stuck with a "last good run" for many of the testcases that gets older and older until the typo is finally fixed. Then there's the issue about testsuite order. DejaGnu runs testcases in alphabetical filename order, in sequence. If the testrun blows up in the first testcase run, say, a.exp, the whole testrun stops before b.exp, etc. are run. If the testcase that blows up is the last one, say named zzz.exp, most of the testsuite runs. This is not "fair". If I rename a.exp to zzzzz.exp, then suddenly that testcase gains more "importance" in the sense that it can no longer blow up the whole test run. A testcase's name should not be significant like that. On the same vein, consider parallel mode. This is not native to DejaGnu, but both GDB and GCC have parallel modes, where their makefiles set up a make check such that they spawn a number of runtest processes running in parallel, each exercising a subset the whole testsuite. Imagine a testsuite composed of exactly 32 testcases. And further imagine that you run it under "make check -j32". That is, you'll end up with 32 runtest instances, each running exactly one testcase. Now, if one of the testcases blows up, all the other 31 will still run. Again, this shows that testcase name, and testrun order should not be important. There is no real reason that a sequential run in that scenario ends up with different results (testcase a.exp blew up, so no result for the other tests) compared to a parallel run (all tests ran successfully but test a.exp). Instead, I advocate that the testresults summary should indicate that a problem happened, so that "blow up event" isn't lost. But do let the whole testsuite run. Thanks, Pedro Alves
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Thu, 25 Jun 2020 02:45:02 GMT) Full text and rfc822 format available.Message #34 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Wed, 24 Jun 2020 21:43:56 -0500
Pedro Alves wrote: > On 6/24/20 12:36 AM, Jacob Bachmeyer wrote: > >> Pedro Alves wrote: >> >>> On 6/17/20 4:26 AM, Jacob Bachmeyer wrote: >>> >>> I'd think that tracking ERROR and WARNING in the summary would sort >>> out this "test run blew up but no one noticed" issue. >>> >> The problem is that the DejaGnu internals often generate ERROR and WARNING "directly" without using the procedures that update counters. >> > > I understand that that's what DejaGnu does currently, but couldn't it be changed? > Would it not be desirable? > It could be changed, but those are much larger changes to the code and overall behavior than simply adding an UNRESOLVED result to report the failure, and if I understand correctly, the presence of UNRESOLVED results marks the test results as invalid or partially-valid at best, which is correct if a test script failed with a Tcl error. Also, the current warning and error counters are not global, but are reset after each test result. Exceeding thresholds for warnings or errors causes the next test result to be changed to UNRESOLVED before the counters are reset. >> I am currently considering also adding a '--no-keep-going'/'--no_keep_going' option and making '--keep-going' the default for the rest of the 1.6 series. Aborting on uncaught Tcl errors will be the default in 1.7, but I am trying to decide whether changing it in 1.6 counts more as fixing a bug (DejaGnu quietly sweeping Tcl errors under the proverbial rug) or introducing an environment change (needing '--keep-going' to reach the end of some testsuites that "worked" before). This is further complicated by the fact that any testsuite needing that option to reach the end means that, technically, that testsuite is invalid -- it *did* throw a Tcl error during testing. >> > It seems your designing this under the assumption that if a testsuite > run "blows up", it's not a big deal to just fix it and rerun the testsuite. > For most packages, this is correct. For GDB, perhaps it is a bigger problem. On the other hand, "blowing up" at least indicates clearly that something is wrong and prevents "whistling past the graveyard" where tests are not actually running, but everyone thinks the situation is fine. > What that misses is that in some environments, particularly with slow embedded > system boards, running the full testsuite can take _days_. This is all > compounded by the fact that the testsuite must be run in a large set of > different modes and boards to cover everything for a single target. Even on > a modern GNU/Linux system, running all combinations of modes for full coverage > can take hours. > > Having the testrun stop midway because of some specific testcase > blowing up can waste a lot of time. > > The fact that the testsuite can take a while to fully run results in automated > testing systems, like GDB's buildbot, struggling to keep up with testing > all the commits that land in git. On some buildbot setups, it's just not > possible to test all commits. So if some testcase blows up the whole > test run, the next time you run the testsuite, it'll likely be that a > large number of unrelated commits went into the sources. Losing a large > chunk of the testsuite results because of some small typo in one > testcase is thus undesirable, because noticing the typo in the first > place can take a while, and also you may be stuck with a "last good run" > for many of the testcases that gets older and older until > the typo is finally fixed. > I presume buildbot can be easily configured to pass a '--keep_going' option, perhaps as 'RUNTESTFLAGS=--keep_going' if needed? Still, needing a new option to get old behavior is changing the environment interface, so I will make --keep_going the default, at least for the rest of 1.6. That default is likely to change for 1.7. For automated systems like buildbot, passing '--keep_going' should be perfectly reasonable, which is why the option exists. (While the documented options will be changed to use hyphens to align with the rest of the GNU system in 1.7, the current option names will continue to be accepted, since the aliases are obvious.) This has been pushed on the temporary PR41918 branch because the fix for PR41824 has been rolled into the fix for PR41918, as the bugs are closely related. > Then there's the issue about testsuite order. DejaGnu runs testcases > in alphabetical filename order, in sequence. If the testrun blows up in > the first testcase run, say, a.exp, the whole testrun stops before b.exp, > etc. are run. If the testcase that blows up is the last one, say named > zzz.exp, most of the testsuite runs. This is not "fair". If I rename > a.exp to zzzzz.exp, then suddenly that testcase gains more "importance" in > the sense that it can no longer blow up the whole test run. A > testcase's name should not be significant like that. > > On the same vein, consider parallel mode. This is not native to DejaGnu, > but both GDB and GCC have parallel modes, where their makefiles set up a > make check such that they spawn a number of runtest processes running in > parallel, each exercising a subset the whole testsuite. Imagine a testsuite > composed of exactly 32 testcases. And further imagine that you run it > under "make check -j32". That is, you'll end up with 32 runtest instances, > each running exactly one testcase. Now, if one of the testcases blows up, all > the other 31 will still run. Again, this shows that testcase name, and > testrun order should not be important. There is no real reason that a > sequential run in that scenario ends up with different results (testcase a.exp > blew up, so no result for the other tests) compared to a parallel > run (all tests ran successfully but test a.exp). Instead, I advocate > that the testresults summary should indicate that a problem happened, > so that "blow up event" isn't lost. But do let the whole testsuite run. > There are plans to eventually implement a native parallel testing mode. You are correct that the order in which testcases are run should be insignificant, and for valid testsuites that do not crash, it is insignificant -- all tests are run. The problems start when a testcase *crashes* and aborts with a Tcl error. The entire reason for stopping the testsuite immediately when a Tcl error occurs is that it is undeniable that something is *very* *wrong* when the testsuite stops early. Software testing is hard enough when you know your testsuite is good, but it is far worse when you have bugs in the testsuite masking bugs in the actual program because a test script is crashing and part of the program that is supposed to be tested is not being exercised. -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Thu, 25 Jun 2020 16:02:01 GMT) Full text and rfc822 format available.Message #37 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Thu, 25 Jun 2020 17:01:10 +0100
On 6/25/20 3:43 AM, Jacob Bachmeyer wrote: > Pedro Alves wrote: >> On 6/24/20 12:36 AM, Jacob Bachmeyer wrote: >> >>> Pedro Alves wrote: >>> >>>> On 6/17/20 4:26 AM, Jacob Bachmeyer wrote: >>>> I'd think that tracking ERROR and WARNING in the summary would sort >>>> out this "test run blew up but no one noticed" issue. >>>> >>> The problem is that the DejaGnu internals often generate ERROR and WARNING "directly" without using the procedures that update counters. >> >> I understand that that's what DejaGnu does currently, but couldn't it be changed? >> Would it not be desirable? >> > > It could be changed, but those are much larger changes to the code and overall behavior than simply adding an UNRESOLVED result to report the failure, and if I understand correctly, the presence of UNRESOLVED results marks the test results as invalid or partially-valid at best, which is correct if a test script failed with a Tcl error. > > Also, the current warning and error counters are not global, but are reset after each test result. Exceeding thresholds for warnings or errors causes the next test result to be changed to UNRESOLVED before the counters are reset. > >>> I am currently considering also adding a '--no-keep-going'/'--no_keep_going' option and making '--keep-going' the default for the rest of the 1.6 series. Aborting on uncaught Tcl errors will be the default in 1.7, but I am trying to decide whether changing it in 1.6 counts more as fixing a bug (DejaGnu quietly sweeping Tcl errors under the proverbial rug) or introducing an environment change (needing '--keep-going' to reach the end of some testsuites that "worked" before). This is further complicated by the fact that any testsuite needing that option to reach the end means that, technically, that testsuite is invalid -- it *did* throw a Tcl error during testing. >>> >> It seems your designing this under the assumption that if a testsuite >> run "blows up", it's not a big deal to just fix it and rerun the testsuite. >> > > For most packages, this is correct. For GDB, perhaps it is a bigger problem. > > On the other hand, "blowing up" at least indicates clearly that something is wrong and prevents "whistling past the graveyard" where tests are not actually running, but everyone thinks the situation is fine. > >> What that misses is that in some environments, particularly with slow embedded >> system boards, running the full testsuite can take _days_. This is all >> compounded by the fact that the testsuite must be run in a large set of >> different modes and boards to cover everything for a single target. Even on >> a modern GNU/Linux system, running all combinations of modes for full coverage >> can take hours. >> >> Having the testrun stop midway because of some specific testcase >> blowing up can waste a lot of time. >> >> The fact that the testsuite can take a while to fully run results in automated >> testing systems, like GDB's buildbot, struggling to keep up with testing >> all the commits that land in git. On some buildbot setups, it's just not >> possible to test all commits. So if some testcase blows up the whole >> test run, the next time you run the testsuite, it'll likely be that a >> large number of unrelated commits went into the sources. Losing a large >> chunk of the testsuite results because of some small typo in one >> testcase is thus undesirable, because noticing the typo in the first >> place can take a while, and also you may be stuck with a "last good run" >> for many of the testcases that gets older and older until >> the typo is finally fixed. >> > > I presume buildbot can be easily configured to pass a '--keep_going' option, perhaps as 'RUNTESTFLAGS=--keep_going' if needed? > What's more likely is we'll tweak GDB's Makefile so that make check enables that option automatically. If GCC does the same, which I would encourage them to, since they also run the testsuite against systems that can take hours/days to complete, then I'll get to wonder who the default is for. :-) We'll need to somehow detect that the option is supported, though, so it's of course doable, just will require a configure check or some little bit of shell to probe the option or something of the sort. > Still, needing a new option to get old behavior is changing the environment interface, so I will make --keep_going the default, at least for the rest of 1.6. That default is likely to change for 1.7. For automated systems like buildbot, passing '--keep_going' should be perfectly reasonable, which is why the option exists. (While the documented options will be changed to use hyphens to align with the rest of the GNU system in 1.7, the current option names will continue to be accepted, since the aliases are obvious.) > Currently with DejaGnu only the case of calling an unknown function stops the test run AFAIK. Any other tcl error like calling a proc with the wrong number of arguments, or treating a non-array variable as an array, etc. is caught by the catch in the runtest proc in runtest.exp, and the testrun continues. That to me clearly indicates that the original intention was to catch errors and continue. That also gives the tool's $(tool)_finish proc a chance to tear down correctly. It is just that the "unknown" case wasn't thought of. So I argue that not making unknown proc calls abort the run is a bug fix, and making DejaGnu abort the run for other errors by default is a behavior change that no real testsuite built around DejaGnu is asking for. > This has been pushed on the temporary PR41918 branch because the fix for PR41824 has been rolled into the fix for PR41918, as the bugs are closely related. > >> Then there's the issue about testsuite order. DejaGnu runs testcases >> in alphabetical filename order, in sequence. If the testrun blows up in >> the first testcase run, say, a.exp, the whole testrun stops before b.exp, >> etc. are run. If the testcase that blows up is the last one, say named >> zzz.exp, most of the testsuite runs. This is not "fair". If I rename >> a.exp to zzzzz.exp, then suddenly that testcase gains more "importance" in >> the sense that it can no longer blow up the whole test run. A >> testcase's name should not be significant like that. >> >> On the same vein, consider parallel mode. This is not native to DejaGnu, >> but both GDB and GCC have parallel modes, where their makefiles set up a >> make check such that they spawn a number of runtest processes running in >> parallel, each exercising a subset the whole testsuite. Imagine a testsuite >> composed of exactly 32 testcases. And further imagine that you run it >> under "make check -j32". That is, you'll end up with 32 runtest instances, >> each running exactly one testcase. Now, if one of the testcases blows up, all >> the other 31 will still run. Again, this shows that testcase name, and >> testrun order should not be important. There is no real reason that a >> sequential run in that scenario ends up with different results (testcase a.exp >> blew up, so no result for the other tests) compared to a parallel >> run (all tests ran successfully but test a.exp). Instead, I advocate >> that the testresults summary should indicate that a problem happened, >> so that "blow up event" isn't lost. But do let the whole testsuite run. >> > > There are plans to eventually implement a native parallel testing mode. You are correct that the order in which testcases are run should be insignificant, and for valid testsuites that do not crash, it is insignificant -- all tests are run. > The problems start when a testcase *crashes* and aborts with a Tcl error. It's hopefully obvious that I'm aware of the latter. > > The entire reason for stopping the testsuite immediately when a Tcl error occurs is that it is undeniable that something is *very* *wrong* when the testsuite stops early. Software testing is hard enough when you know your testsuite is good, but it is far worse when you have bugs in the testsuite masking bugs in the actual program because a test script is crashing and part of the program that is supposed to be tested is not being exercised. I think it's safe to say that we all understand the general principle, but IMO that applies more to continuing the same testcase (if somehow that were possible, like "unknown" suppressing the error), rather than continuing with a different testcase. Continuing the testrun when one testcase errors out does not mask any bug in the program that is supposed to be tested. Why would it? The testcase aborts, it doesn't continue. The remaining testcases start afresh. Anyway, given the option, it won't be bad, just a little annoying. Thanks, Pedro Alves
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Thu, 25 Jun 2020 22:41:01 GMT) Full text and rfc822 format available.Message #40 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Thu, 25 Jun 2020 17:39:58 -0500
Pedro Alves wrote: > On 6/25/20 3:43 AM, Jacob Bachmeyer wrote: > >> Pedro Alves wrote: >> >>> What that misses is that in some environments, particularly with slow embedded >>> system boards, running the full testsuite can take _days_. This is all >>> compounded by the fact that the testsuite must be run in a large set of >>> different modes and boards to cover everything for a single target. Even on >>> a modern GNU/Linux system, running all combinations of modes for full coverage >>> can take hours. >>> >>> Having the testrun stop midway because of some specific testcase >>> blowing up can waste a lot of time. >>> >>> The fact that the testsuite can take a while to fully run results in automated >>> testing systems, like GDB's buildbot, struggling to keep up with testing >>> all the commits that land in git. On some buildbot setups, it's just not >>> possible to test all commits. So if some testcase blows up the whole >>> test run, the next time you run the testsuite, it'll likely be that a >>> large number of unrelated commits went into the sources. Losing a large >>> chunk of the testsuite results because of some small typo in one >>> testcase is thus undesirable, because noticing the typo in the first >>> place can take a while, and also you may be stuck with a "last good run" >>> for many of the testcases that gets older and older until >>> the typo is finally fixed. >>> >>> >> I presume buildbot can be easily configured to pass a '--keep_going' option, perhaps as 'RUNTESTFLAGS=--keep_going' if needed? >> > > What's more likely is we'll tweak GDB's Makefile so that make check > enables that option automatically. If GCC does the same, which I would > encourage them to, since they also run the testsuite against systems > that can take hours/days to complete, then I'll get to wonder who the > default is for. :-) > The default is for developers working on testsuites -- and for testing release candidates and releases because if your testcases crash in a release, you have problems. :-) Are you suggesting withdrawing the --keep-going/--no-keep-going options and always generating an UNRESOLVED result, resuming with the next test script, and possibly storing a list of aborted test scripts somewhere to repeat at the very end of a test run with a big warning about test cases that crashed? >> Still, needing a new option to get old behavior is changing the environment interface, so I will make --keep_going the default, at least for the rest of 1.6. That default is likely to change for 1.7. For automated systems like buildbot, passing '--keep_going' should be perfectly reasonable, which is why the option exists. (While the documented options will be changed to use hyphens to align with the rest of the GNU system in 1.7, the current option names will continue to be accepted, since the aliases are obvious.) >> > > Currently with DejaGnu only the case of calling an unknown function stops > the test run AFAIK. Any other tcl error like calling a proc with the wrong > number of arguments, or treating a non-array variable as an array, etc. is > caught by the catch in the runtest proc in runtest.exp, and the testrun > continues. That to me clearly indicates that the original intention was > to catch errors and continue. > > That also gives the tool's $(tool)_finish proc a chance to tear down > correctly. > > It is just that the "unknown" case wasn't thought of. So I argue that not > making unknown proc calls abort the run is a bug fix, and making DejaGnu > abort the run for other errors by default is a behavior change that no real > testsuite built around DejaGnu is asking for. > As it was explained to me, it is the other way around -- the catch in runtest was overlooked when ::unknown was added to catch bugs by aborting if an undefined procedure is called. Is there a ${tool}_finish or do you mean ${tool}_exit? Either way, you make a good point about running cleanup procedures when aborting a test run. >> The entire reason for stopping the testsuite immediately when a Tcl error occurs is that it is undeniable that something is *very* *wrong* when the testsuite stops early. Software testing is hard enough when you know your testsuite is good, but it is far worse when you have bugs in the testsuite masking bugs in the actual program because a test script is crashing and part of the program that is supposed to be tested is not being exercised. >> > > I think it's safe to say that we all understand the general principle, > but IMO that applies more to continuing the same testcase (if somehow that > were possible, like "unknown" suppressing the error), rather than continuing > with a different testcase. > > Continuing the testrun when one testcase errors out does not mask any > bug in the program that is supposed to be tested. Why would it? The > testcase aborts, it doesn't continue. The remaining testcases start > afresh. > Consider a hypothetical case where GDB has exec.exp (which tests starting a debugged process) and attach.exp (which tests attaching to a process). Unknown to the GDB developers, the "attach" command does not work and causes GDB to dump core (oops!) but attach.exp fails with a Tcl error before reaching the point where "attach" is issued to the inferior GDB. If DejaGnu stops immediately, it is obvious that the testsuite is broken, the bug in the testsuite is fixed, and the bug in GDB is found. If DejaGnu sweeps the Tcl error under the proverbial rug (as previous versions did), the bug in the testsuite effectively hides the bug in GDB , because the developers think the "attach" command is being exercised, but those tests are being (almost) silently skipped. Now a bug that obvious will be caught very quickly by users (likely the GDB developers themselves) but obscure regressions are for more insidious, especially in large testsuites such as those for GCC or GDB. If a regression test is being skipped with an easily-missed error, that regression can slip in unnoticed. Would simply turning aborted test scripts into UNRESOLVED results be enough to get them fixed quickly? -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Thu, 25 Jun 2020 23:56:02 GMT) Full text and rfc822 format available.Message #43 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Rob Savoye <rob <at> senecass.com> To: Pedro Alves <palves <at> redhat.com>, jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Thu, 25 Jun 2020 17:53:57 -0600
On 6/25/20 10:01 AM, Pedro Alves wrote: >> It could be changed, but those are much larger changes to the code and overall behavior than simply adding an UNRESOLVED result to report the failure, and if I understand correctly, the presence of UNRESOLVED results marks the test results as invalid or partially-valid at best, which is correct if a test script failed with a Tcl error. Adding an additional call makes it harder to diff the .sum files. Not a huge big deal, but still... UNRESOLVED says a human has to look at this test case output and decide, the actual error should be in the .log file. UNTESTED means something went wrong with the test, and the test is invalid, which is probably what you'd want as the test state for a Tcl error. > Currently with DejaGnu only the case of calling an unknown function stops > the test run AFAIK. Any other tcl error like calling a proc with the wrong > number of arguments, or treating a non-array variable as an array, etc. is > caught by the catch in the runtest proc in runtest.exp, and the testrun > continues. That to me clearly indicates that the original intention was > to catch errors and continue. The original intent was to catch some errors, nd ignore others. :-) Catch is used for errors that can be ignored or worked around. Unknown was basically to catch programming errors, since as noted, there are many paths though the code as you change options or environments. They'd get lost in the output unless it aborted. > It is just that the "unknown" case wasn't thought of. So I argue that not > making unknown proc calls abort the run is a bug fix, and making DejaGnu > abort the run for other errors by default is a behavior change that no real > testsuite built around DejaGnu is asking for. I think the default should be to abort on a unknown Tcl error, and have an option to ignore them and continue. As a person that has run many long test runs, I agree I don't want some errors to abort the test run. But to be honest, a dependable test run should mean no bugs in the test framework. A buggy test framework means you can trust the test results. Aborting was for use during debugging of test framework code itself, so supposedly you should should never have an Tcl error during a test run. There are times when even during debugging you'd want it to continue, and then collect all the bugs later to fix in batch mode. - rob -
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Fri, 26 Jun 2020 01:54:02 GMT) Full text and rfc822 format available.Message #46 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Rob Savoye <rob <at> senecass.com> Cc: Pedro Alves <palves <at> redhat.com>, Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Thu, 25 Jun 2020 20:53:46 -0500
Rob Savoye wrote: > On 6/25/20 10:01 AM, Pedro Alves wrote: > > Adding an additional call makes it harder to diff the .sum files. Not > a huge big deal, but still... UNRESOLVED says a human has to look at > this test case output and decide, the actual error should be in the .log > file. UNTESTED means something went wrong with the test, and the test is > invalid, which is probably what you'd want as the test state for a Tcl > error. > The original rationale for using UNRESOLVED here was that DejaGnu converts a test result to UNRESOLVED if too many errors or warnings were produced. The DejaGnu manual indicates (section "A POSIX Compliant Test Framework") that UNRESOLVED is correct for a test where execution was interrupted or was set up incorrectly. UNTESTED is specifically listed as a placeholder for an as-yet-unwritten testcase. A "typical" GDB testsuite run has almost a hundred UNTESTED results but (without this patch) zero UNRESOLVED results. To me, this seems that UNRESOLVED is correct here, or that the manual has an error. >> Currently with DejaGnu only the case of calling an unknown function stops >> the test run AFAIK. Any other tcl error like calling a proc with the wrong >> number of arguments, or treating a non-array variable as an array, etc. is >> caught by the catch in the runtest proc in runtest.exp, and the testrun >> continues. That to me clearly indicates that the original intention was >> to catch errors and continue. >> > > The original intent was to catch some errors, nd ignore others. :-) > Catch is used for errors that can be ignored or worked around. Unknown > was basically to catch programming errors, since as noted, there are > many paths though the code as you change options or environments. They'd > get lost in the output unless it aborted. > So continuing with the next test script could be reasonable if the error (or at least an indication that the error occurred) is "replayed" at the end of the test run? >> It is just that the "unknown" case wasn't thought of. So I argue that not >> making unknown proc calls abort the run is a bug fix, and making DejaGnu >> abort the run for other errors by default is a behavior change that no real >> testsuite built around DejaGnu is asking for. >> > > I think the default should be to abort on a unknown Tcl error, and > have an option to ignore them and continue. As a person that has run > many long test runs, I agree I don't want some errors to abort the test > run. But to be honest, a dependable test run should mean no bugs in the > test framework. A buggy test framework means you can trust the test results. > > Aborting was for use during debugging of test framework code itself, > so supposedly you should should never have an Tcl error during a test > run. There are times when even during debugging you'd want it to > continue, and then collect all the bugs later to fix in batch mode. > This succinctly states the original motivation for having the --keep-going option, but having "make check" always pass --keep-going defeats the purpose. -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Fri, 26 Jun 2020 09:54:02 GMT) Full text and rfc822 format available.Message #49 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Fri, 26 Jun 2020 10:53:34 +0100
On 6/25/20 11:39 PM, Jacob Bachmeyer wrote: > Pedro Alves wrote: >> On 6/25/20 3:43 AM, Jacob Bachmeyer wrote: >> >>> Pedro Alves wrote: >>> >>>> What that misses is that in some environments, particularly with slow embedded >>>> system boards, running the full testsuite can take _days_. This is all >>>> compounded by the fact that the testsuite must be run in a large set of >>>> different modes and boards to cover everything for a single target. Even on >>>> a modern GNU/Linux system, running all combinations of modes for full coverage >>>> can take hours. >>>> >>>> Having the testrun stop midway because of some specific testcase >>>> blowing up can waste a lot of time. >>>> >>>> The fact that the testsuite can take a while to fully run results in automated >>>> testing systems, like GDB's buildbot, struggling to keep up with testing >>>> all the commits that land in git. On some buildbot setups, it's just not >>>> possible to test all commits. So if some testcase blows up the whole >>>> test run, the next time you run the testsuite, it'll likely be that a >>>> large number of unrelated commits went into the sources. Losing a large >>>> chunk of the testsuite results because of some small typo in one >>>> testcase is thus undesirable, because noticing the typo in the first >>>> place can take a while, and also you may be stuck with a "last good run" >>>> for many of the testcases that gets older and older until >>>> the typo is finally fixed. >>>> >>> I presume buildbot can be easily configured to pass a '--keep_going' option, perhaps as 'RUNTESTFLAGS=--keep_going' if needed? >>> >> >> What's more likely is we'll tweak GDB's Makefile so that make check >> enables that option automatically. If GCC does the same, which I would >> encourage them to, since they also run the testsuite against systems >> that can take hours/days to complete, then I'll get to wonder who the >> default is for. :-) >> > > The default is for developers working on testsuites -- and for testing release candidates and releases because if your testcases crash in a release, you have problems. :-) > > Are you suggesting withdrawing the --keep-going/--no-keep-going options Yes, but if you would like to add it, it's OK with me, though then you can say that I'm more discussing the default. But really I'm more worried about making sure that the people maintaining the GDB testsuite (me included), and the DejaGnu people understand each other's needs better. > and always generating an UNRESOLVED result, > resuming with the next test script, and possibly storing a list of aborted test scripts somewhere Yes (with --keep-going if you insist on having the option), but note that I suggested storing the list of ERRORs and WARNINGs. > to repeat at the very end of a test run with a big warning about test cases that crashed? I did not suggest to repeat anything. > >>> Still, needing a new option to get old behavior is changing the environment interface, so I will make --keep_going the default, at least for the rest of 1.6. That default is likely to change for 1.7. For automated systems like buildbot, passing '--keep_going' should be perfectly reasonable, which is why the option exists. (While the documented options will be changed to use hyphens to align with the rest of the GNU system in 1.7, the current option names will continue to be accepted, since the aliases are obvious.) >>> >> >> Currently with DejaGnu only the case of calling an unknown function stops >> the test run AFAIK. Any other tcl error like calling a proc with the wrong >> number of arguments, or treating a non-array variable as an array, etc. is >> caught by the catch in the runtest proc in runtest.exp, and the testrun >> continues. That to me clearly indicates that the original intention was >> to catch errors and continue. >> >> That also gives the tool's $(tool)_finish proc a chance to tear down >> correctly. >> >> It is just that the "unknown" case wasn't thought of. So I argue that not >> making unknown proc calls abort the run is a bug fix, and making DejaGnu >> abort the run for other errors by default is a behavior change that no real >> testsuite built around DejaGnu is asking for. >> > > As it was explained to me, it is the other way around -- the catch in runtest was overlooked when ::unknown was added to catch bugs by aborting if an undefined procedure is called. > > Is there a ${tool}_finish or do you mean ${tool}_exit? Either way, you make a good point about running cleanup procedures when aborting a test run. Look at proc runtest in runtest.exp, it calls ${tool}_init, sources the testcase under catch (using ERROR:, so here you could run a tally of such caught errors), and then calls ${tool}_finish. > >>> The entire reason for stopping the testsuite immediately when a Tcl error occurs is that it is undeniable that something is *very* *wrong* when the testsuite stops early. Software testing is hard enough when you know your testsuite is good, but it is far worse when you have bugs in the testsuite masking bugs in the actual program because a test script is crashing and part of the program that is supposed to be tested is not being exercised. >>> >> >> I think it's safe to say that we all understand the general principle, >> but IMO that applies more to continuing the same testcase (if somehow that >> were possible, like "unknown" suppressing the error), rather than continuing >> with a different testcase. >> >> Continuing the testrun when one testcase errors out does not mask any >> bug in the program that is supposed to be tested. Why would it? The >> testcase aborts, it doesn't continue. The remaining testcases start >> afresh. >> > > Consider a hypothetical case where GDB has exec.exp (which tests starting a debugged process) and attach.exp (which tests attaching to a process). Unknown to the GDB developers, the "attach" command does not work and causes GDB to dump core (oops!) but attach.exp fails with a Tcl error before reaching the point where "attach" is issued to the inferior GDB. If DejaGnu stops immediately, it is obvious that the testsuite is broken, the bug in the testsuite is fixed, and the bug in GDB is found. If DejaGnu sweeps the Tcl error under the proverbial rug (as previous versions did), the bug in the testsuite effectively hides the bug in GDB , because the developers think the "attach" command is being exercised, but those tests are being (almost) silently skipped. First, that scenario doesn't normally cause a Tcl error. If GDB trips an internal error, the GDB testsuite catches it with a FAIL. If GDB just dies (whether it dumps core or not) the GDB testsuite catches that with UNRESOLVED. Also, even if those weren't already considered, the developer won't think the testcase is just passing correctly, because being a good GDB developer, he knows he needs to diff the new gdb.sum file against a previous run's, and sees that some PASSes are missing, and some ERROR: tcl error showed up in their stead. Nobody should rely on just looking at the summary. Everyone _must_ diff between a previous run and the current run. Now some people may use test result diffing scripts that forget to look at ERROR: lines in .sum. Those would be buggy, but then again, if those ERRORs were accounted for in the DejaGnu summary, they would be harder to miss. And again, people run the GDB testsuite in parallel mode, so your assumption that the testsuite would stop immediately anyway doesn't hold. One of the parallel runtest invocations bombs out, while the others keep running, so by the time you get back the prompt, other testcases already executed, and you don't see the ERROR in the current terminal anymore. You'll need to look at the testsuite .sum .log files anyway. Only, you will find out that one testcase bombed out, 50000 tests ran, but 1000 others didn't run because of that one testcase. And now you're missing those 1000 results, which maybe included those which you were actually interested in, because you were doing a builbot try run across a number of systems. Now, when running the testsuite against slow embedded boards, then you won't normally be using the parallel mode, the board just can't support multiple parallel connections. In that case, you set the testsuite running, and go on your merry way for a couple hours or 20. When you come back, maybe the next day, you notice that the testsuite actually stopped running 30 mins in, because someone changed the testsuite upstream just before you rebased in way that caused a tcl error for you in some testcase. Oops. You just lost a day. Now you wish you had remembered to type --keep-going, but it's not the default, and nobody ever writes Tcl bugs, so you didn't think of it before. That's why I think --keep-going should be the default. When you're developing a new testcase, you'll run that testcase in isolation a number of times. It's very hard to miss any silly error then -- there's no need to optimize for this use case by default, IMO. > > Now a bug that obvious will be caught very quickly by users (likely the GDB developers themselves) but obscure regressions are for more insidious, especially in large testsuites such as those for GCC or GDB. If a regression test is being skipped with an easily-missed error, that regression can slip in unnoticed. > > Would simply turning aborted test scripts into UNRESOLVED results be enough to get them fixed quickly? Yes. (Though I still think a separate count for ERROR (and WARNING) would be better.) Thanks, Pedro Alves
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Fri, 26 Jun 2020 13:42:02 GMT) Full text and rfc822 format available.Message #52 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <pedro <at> palves.net> To: Pedro Alves <palves <at> redhat.com>, jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Fri, 26 Jun 2020 11:08:11 +0100
On 6/26/20 10:53 AM, Pedro Alves wrote: > >> to repeat at the very end of a test run with a big warning about test cases that crashed? > I did not suggest to repeat anything. > I misunderstood you here -- I somehow thought that that by "repeat" you meant re-running the testcase. Still, I would not suggest to repeat the error. Let me clarify -- with this: ~~~~ diff --git c/gdb/testsuite/gdb.base/break.exp w/gdb/testsuite/gdb.base/break.exp index 8c7ce42d805..056252c0677 100644 --- c/gdb/testsuite/gdb.base/break.exp +++ w/gdb/testsuite/gdb.base/break.exp @@ -15,6 +15,9 @@ # This file was written by Rob Savoye. (rob <at> cygnus.com) +set a 1 +set a(1) 1 + ~~~~ You get, if you run two testcases, and the other one manages to run first: $ make check TESTS="*/break.exp */schedlock.exp" make[1]: Entering directory '/home/pedro/brno/pedro/gdb/mygit/build/gdb/testsuite' Nothing to be done for all... make check-single make[2]: Entering directory '/home/pedro/brno/pedro/gdb/mygit/build/gdb/testsuite' rootme=`pwd`; export rootme; srcdir=/home/pedro/gdb/mygit/src/gdb/testsuite ; export srcdir ; EXPECT=`if [ "${READ1}" != "" ] ; then echo ${rootme}/expect-read1; elif [ -f ${rootme}/../../expect/expect ] ; then echo ${rootme}/../../expect/expect ; else echo expect ; fi` ; export EXPECT ; EXEEXT= ; export EXEEXT ; LD_LIBRARY_PATH=$rootme/../../expect:$rootme/../../libstdc++:$rootme/../../tk/unix:$rootme/../../tcl/unix:$rootme/../../bfd:$rootme/../../opcodes:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; if [ -f ${rootme}/../../expect/expect ] ; then TCL_LIBRARY=${srcdir}/../../tcl/library ; export TCL_LIBRARY ; fi ; runtest --status gdb.base/break.exp gdb.threads/schedlock.exp NOTE: Dejagnu's default_target_compile is missing support for Go, using local override NOTE: Dejagnu's default_target_compile is missing support for Rust, using local override Test run by pedro on Fri Jun 26 11:03:38 2020 Native configuration is x86_64-pc-linux-gnu === gdb tests === Schedule of variations: unix Running target unix Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target. Using /usr/share/dejagnu/config/unix.exp as generic interface file for target. Using /home/pedro/gdb/mygit/src/gdb/testsuite/config/unix.exp as tool-and-target-specific interface file. Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/schedlock.exp ... Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/break.exp ... ERROR: tcl error sourcing /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/break.exp. ERROR: can't set "a(1)": variable isn't array while executing "set a(1) 1" (file "/home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/break.exp" line 19) invoked from within "source /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/break.exp" ("uplevel" body line 1) invoked from within "uplevel #0 source /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/break.exp" invoked from within "catch "uplevel #0 source $test_file_name"" === gdb Summary === # of expected passes 208 /home/pedro/brno/pedro/gdb/mygit/build/gdb/gdb version 10.0.50.20200625-git -nw -nx -data-directory /home/pedro/brno/pedro/gdb/mygit/build/gdb/testsuite/../data-directory make[2]: *** [Makefile:206: check-single] Error 2 make[2]: Leaving directory '/home/pedro/brno/pedro/gdb/mygit/build/gdb/testsuite' make[1]: *** [Makefile:159: check] Error 2 make[1]: Leaving directory '/home/pedro/brno/pedro/gdb/mygit/build/gdb/testsuite' make: *** [Makefile:1628: check] Error 2 ~~~~ Now, you already have an indication that something went wrong because make existed with error. But people may miss that. The issue for me is that the "gdb Summary" tally did not mention the aborted testcase. I would like to see something like this instead: # of expected passes 208 # of nasty OMG FIX! tcl errors 1 :-) Your UNRESOLVED is of course better than the status quo. Thanks, Pedro Alves
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Fri, 26 Jun 2020 15:32:01 GMT) Full text and rfc822 format available.Message #55 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Rob Savoye <rob <at> senecass.com> To: Pedro Alves <palves <at> redhat.com>, jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Fri, 26 Jun 2020 09:30:22 -0600
On 6/26/20 3:53 AM, Pedro Alves wrote: > you can say that I'm more discussing the default. But really I'm more > worried about making sure that the people maintaining the GDB > testsuite (me included), and the DejaGnu people understand each > other's needs better. As the author of the GDB testsuite, I understand quite well how it needs to work, or at least used to. Testing GDB was the #1 requirement that lead to creating DejaGnu. Both have evolved considerably over the years, so working together on improving the current implementation is a good idea. What I get out of this discussion, which is a good one, is how DejaGnu should handle errors. Unknown is one case, but there others. Whether something is a warning, error, UNRESOLVED, UNTESTED, UNSUPPORTED, etc... is often up to the developer. Maybe what needs to be improved is better tracking of issues in a test run. > at ERROR: lines in .sum. Those would be buggy, but then again, if > those ERRORs were accounted for in the DejaGnu summary, they would be > harder to miss. Right, one reason I started on database support with better analysis tools. Diffing text files has limited functionality. > That's why I think --keep-going should be the default. When > you're developing a new testcase, you'll run that testcase in > isolation a number of times. It's very hard to miss any > silly error then -- there's no need to optimize for this use > case by default, IMO. I personally don't consider development or debugging done till I have done a full testrun native and cross to catch the errors running a test in isolation misses. Yes, I know this can take days, but I prefer that to assuming a validation testrun is not going to have problems. I'll note that there is a $HOME/.dejagnurc, where you can set the default to whatever you want. :-) - rob-
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Fri, 26 Jun 2020 22:38:02 GMT) Full text and rfc822 format available.Message #58 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Fri, 26 Jun 2020 17:37:23 -0500
Pedro Alves wrote: > On 6/25/20 11:39 PM, Jacob Bachmeyer wrote: > >> Pedro Alves wrote: >> >>> On 6/25/20 3:43 AM, Jacob Bachmeyer wrote: >>> >>> >>>> Pedro Alves wrote: >>>> >>>> >>>>> [...] >>>>> >>>> I presume buildbot can be easily configured to pass a '--keep_going' option, perhaps as 'RUNTESTFLAGS=--keep_going' if needed? >>>> >>>> >>> What's more likely is we'll tweak GDB's Makefile so that make check >>> enables that option automatically. If GCC does the same, which I would >>> encourage them to, since they also run the testsuite against systems >>> that can take hours/days to complete, then I'll get to wonder who the >>> default is for. :-) >>> >>> >> The default is for developers working on testsuites -- and for testing release candidates and releases because if your testcases crash in a release, you have problems. :-) >> >> Are you suggesting withdrawing the --keep-going/--no-keep-going options >> > > Yes, but if you would like to add it, it's OK with me, though then > you can say that I'm more discussing the default. But really I'm more > worried about making sure that the people maintaining the GDB > testsuite (me included), and the DejaGnu people understand each > other's needs better. > My position here stems from the need for reliable testing, and part of that is that crashes in the testsuite scripts need to be made very obvious, where DejaGnu had previously swept them under the proverbial rug. Sure, there would be a message in the log, but that is easily missed in a testsuite as large as GDB's and nowhere was a need to grep the log for those messages documented. >> and always generating an UNRESOLVED result, >> resuming with the next test script, and possibly storing a list of aborted test scripts somewhere >> > > Yes (with --keep-going if you insist on having the option), but note that > I suggested storing the list of ERRORs and WARNINGs. > Arguably, a crashed test script is a distinct category from the ERRORs and WARNINGs that test scripts can report when something does not go as expected. The former is a problem with the testsuite *itself*, while the latter indicate problems detected *by* the testsuite. >>>> Still, needing a new option to get old behavior is changing the environment interface, so I will make --keep_going the default, at least for the rest of 1.6. That default is likely to change for 1.7. For automated systems like buildbot, passing '--keep_going' should be perfectly reasonable, which is why the option exists. (While the documented options will be changed to use hyphens to align with the rest of the GNU system in 1.7, the current option names will continue to be accepted, since the aliases are obvious.) >>>> >>>> >>> Currently with DejaGnu only the case of calling an unknown function stops >>> the test run AFAIK. Any other tcl error like calling a proc with the wrong >>> number of arguments, or treating a non-array variable as an array, etc. is >>> caught by the catch in the runtest proc in runtest.exp, and the testrun >>> continues. That to me clearly indicates that the original intention was >>> to catch errors and continue. >>> >>> That also gives the tool's $(tool)_finish proc a chance to tear down >>> correctly. >>> >>> It is just that the "unknown" case wasn't thought of. So I argue that not >>> making unknown proc calls abort the run is a bug fix, and making DejaGnu >>> abort the run for other errors by default is a behavior change that no real >>> testsuite built around DejaGnu is asking for. >>> >>> >> As it was explained to me, it is the other way around -- the catch in runtest was overlooked when ::unknown was added to catch bugs by aborting if an undefined procedure is called. >> >> Is there a ${tool}_finish or do you mean ${tool}_exit? Either way, you make a good point about running cleanup procedures when aborting a test run. >> > > Look at proc runtest in runtest.exp, it calls ${tool}_init, > sources the testcase under catch (using ERROR:, so here you could > run a tally of such caught errors), and then calls ${tool}_finish. > Well, sure enough, there are ${tool}_init and ${tool}_finish procedures called "around" each script if they exist... and no documentation... another item added to my local TODO list for improving the manual. >>>> The entire reason for stopping the testsuite immediately when a Tcl error occurs is that it is undeniable that something is *very* *wrong* when the testsuite stops early. Software testing is hard enough when you know your testsuite is good, but it is far worse when you have bugs in the testsuite masking bugs in the actual program because a test script is crashing and part of the program that is supposed to be tested is not being exercised. >>>> >>>> >>> I think it's safe to say that we all understand the general principle, >>> but IMO that applies more to continuing the same testcase (if somehow that >>> were possible, like "unknown" suppressing the error), rather than continuing >>> with a different testcase. >>> >>> Continuing the testrun when one testcase errors out does not mask any >>> bug in the program that is supposed to be tested. Why would it? The >>> testcase aborts, it doesn't continue. The remaining testcases start >>> afresh. >>> >>> >> Consider a hypothetical case where GDB has exec.exp (which tests starting a debugged process) and attach.exp (which tests attaching to a process). Unknown to the GDB developers, the "attach" command does not work and causes GDB to dump core (oops!) but attach.exp fails with a Tcl error before reaching the point where "attach" is issued to the inferior GDB. If DejaGnu stops immediately, it is obvious that the testsuite is broken, the bug in the testsuite is fixed, and the bug in GDB is found. If DejaGnu sweeps the Tcl error under the proverbial rug (as previous versions did), the bug in the testsuite effectively hides the bug in GDB , because the developers think the "attach" command is being exercised, but those tests are being (almost) silently skipped. >> > > First, that scenario doesn't normally cause a Tcl error. If GDB trips > an internal error, the GDB testsuite catches it with a FAIL. If GDB just dies > (whether it dumps core or not) the GDB testsuite catches that > with UNRESOLVED. > In the hypothetical scenario, attach.exp crashes with a Tcl error *before* issuing the "attach" command to the GDB-under-test. Previous releases of DejaGnu mention a Tcl error in the log and carry on. With this patch, DejaGnu catches that attach.exp crashed and itself inserts an UNRESOLVED result. I also take this as more support for using the UNRESOLVED status here: the test script crashing is little different from GDB exiting unexpectedly. > Also, even if those weren't already considered, the developer won't think the > testcase is just passing correctly, because being a good GDB developer, > he knows he needs to diff the new gdb.sum file against a previous run's, and > sees that some PASSes are missing, and some ERROR: tcl error showed up > in their stead. Nobody should rely on just looking at the summary. > Everyone _must_ diff between a previous run and the current run. Now some > people may use test result diffing scripts that forget to look > at ERROR: lines in .sum. Those would be buggy, but then again, if > those ERRORs were accounted for in the DejaGnu summary, they would be > harder to miss. > If this is the case, then what is the problem with aborting the test run upon error? "Good GDB developers" will always diff the .sum files, notice that the expected new PASS results are not there, and fix their code before checking it in, so they will never have a problem with DejaGnu aborting upon encountering a Tcl error. :-) > And again, people run the GDB testsuite in parallel mode, so your > assumption that the testsuite would stop immediately anyway doesn't > hold. One of the parallel runtest invocations bombs > out, while the others keep running, so by the time you get > back the prompt, other testcases already executed, and you don't > see the ERROR in the current terminal anymore. This is a fundamental problem with slicing up the testsuite external to DejaGnu and running multiple instances of runtest -- output that is supposed to be "last at the end of the run" can be hidden away by other ongoing subset runs. > You'll need to > look at the testsuite .sum .log files anyway. Only, you will > find out that one testcase bombed out, 50000 tests ran, but 1000 others > didn't run because of that one testcase. And now you're missing > those 1000 results, which maybe included those which you were > actually interested in, because you were doing a builbot try run > across a number of systems. > This also applies on a smaller scale for which we have no workaround: gdb.foo/bar.exp bombs out and the remaining /N/ tests in that file did not run. The only real solution is for the test scripts to avoid causing Tcl errors in the first place; anything else is papering over the problem with duct tape. > Now, when running the testsuite against slow embedded boards, then > you won't normally be using the parallel mode, the board just can't > support multiple parallel connections. In that case, you set the > testsuite running, and go on your merry way for a couple hours or 20. > When you come back, maybe the next day, you notice that the testsuite > actually stopped running 30 mins in, because someone changed > the testsuite upstream just before you rebased in way that caused a > tcl error for you in some testcase. Oops. You just lost a day. This is more of an annoyance, and yes, I have come back the next day to find that a long-running command failed shortly after being started, so I know this irritation. > Now > you wish you had remembered to type --keep-going, but it's not > the default, and nobody ever writes Tcl bugs, so you didn't think > of it before. > Note that the snapshot of GDB I have been using for testing this patch does have such a Tcl bug in its testsuite. I did not make any particular effort to choose this snapshot, it was just the current state of GDB master when I started. I have not even looked at *which* commit it is. Thinking "nobody ever writes Tcl bugs" is what causes these problems in the first place. I would expect that, had DejaGnu always aborted immediately upon catching Tcl errors from testsuites, that issue would have been found and fixed quickly. > That's why I think --keep-going should be the default. When > you're developing a new testcase, you'll run that testcase in > isolation a number of times. It's very hard to miss any > silly error then -- there's no need to optimize for this use > case by default, IMO. > > >> Now a bug that obvious will be caught very quickly by users (likely the GDB developers themselves) but obscure regressions are for more insidious, especially in large testsuites such as those for GCC or GDB. If a regression test is being skipped with an easily-missed error, that regression can slip in unnoticed. >> >> Would simply turning aborted test scripts into UNRESOLVED results be enough to get them fixed quickly? >> > > Yes. (Though I still think a separate count for ERROR (and WARNING) > would be better.) I think we need the UNRESOLVED results for at least a credible best-effort at claiming POSIX compliance, but additionally tracking how many UNRESOLVED results are due to script crashes would not be difficult. -- Jacob
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Fri, 26 Jun 2020 22:59:02 GMT) Full text and rfc822 format available.Message #61 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <pedro <at> palves.net> Cc: Pedro Alves <palves <at> redhat.com>, Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Fri, 26 Jun 2020 17:58:09 -0500
Pedro Alves wrote: > On 6/26/20 10:53 AM, Pedro Alves wrote: > >>> to repeat at the very end of a test run with a big warning about test cases that crashed? >>> >> I did not suggest to repeat anything. >> > > I misunderstood you here -- I somehow thought that that by "repeat" you meant > re-running the testcase. > No, repeating output of at least the name of the file, and possibly the entire error dump. > Still, I would not suggest to repeat the error. Let me clarify -- with this: > > [...] > > Now, you already have an indication that something went wrong > because make existed with error. But people may miss that. > They will easily miss that -- DejaGnu exits with a failure status if any tests produced "surprising" results: FAIL, XPASS, or UNRESOLVED. The GDB testsuite typically has multiple FAIL results, so exiting with an error code is routine. > The issue for me is that the "gdb Summary" tally did not > mention the aborted testcase. I would like to see something > like this instead: > > # of expected passes 208 > # of nasty OMG FIX! tcl errors 1 > > :-) > > Your UNRESOLVED is of course better than the status quo. > UNRESOLVED is a step in the right direction, but I agree that more is needed. We are somewhat limited by POSIX, which does not define a separate result type for "test case crashed" and seems to roll that into UNRESOLVED, although POSIX appears to require a testsuite to have a strictly-defined set of tests and that a run must produce results for exactly all of them, which requires information DejaGnu does not normally have about the running testsuite. So, while UNRESOLVED is needed, we could also maintain a separate count of UNRESOLVED-due-to-crash, or simply store away the error message/errorCode/errorInfo tuples and emit them a second time at the end of the run, also easy in Tcl 8. I am leaning towards this latter solution precisely because it would be loud and obnoxious, while still maintaining a professional tone. (Listing a count of "nasty OMG FIX! >:-( tcl errors", while certainly expressing the proper intent, probably should not find its way into a DejaGnu release, nor should an ASCII-art C'thulu reminding the user that broken testsuites drive programmers to madness, complete with randomly-chosen "your remaining sanity" score... :-) ) Repeating error dumps at the end of the run also fits well into an XML log model, where the collected errors could go into their own structure at the end of the file, without complicating the tag model used for the main test results. Simple importers would still get the inserted UNRESOLVED result, while more thorough or specialized tools could analyze the errors. -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 11:41:02 GMT) Full text and rfc822 format available.Message #64 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com, Pedro Alves <pedro <at> palves.net> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 12:40:10 +0100
On 6/26/20 11:58 PM, Jacob Bachmeyer wrote: >> The issue for me is that the "gdb Summary" tally did not >> mention the aborted testcase. I would like to see something >> like this instead: >> >> # of expected passes 208 >> # of nasty OMG FIX! tcl errors 1 >> >> :-) >> >> Your UNRESOLVED is of course better than the status quo. >> > > UNRESOLVED is a step in the right direction, but I agree that more is needed. We are somewhat limited by POSIX, which does not define a separate result type for "test case crashed" and seems to roll that into UNRESOLVED, although POSIX appears to require a testsuite to have a strictly-defined set of tests and that a run must produce results for exactly all of them, which requires information DejaGnu does not normally have about the running testsuite. So, while UNRESOLVED is needed, we could also maintain a separate count of UNRESOLVED-due-to-crash, or simply store away the error message/errorCode/errorInfo tuples and emit them a second time at the end of the run, also easy in Tcl 8. I'm not sure how much is POSIX compliance important these days. I would hope that that would not limit DejaGnu's development and limit its potential to be more usable. Even if POSIX compliance is important, I would think that that could be put behind a --posixly_correct option, and/or POSIXLY_CORRECT environment variable, like other GNU tools do: https://www.gnu.org/prep/standards/html_node/Non_002dGNU-Standards.html Or conversely, we could have a --gnu mode, though I would say that a GNU tool should behave in GNU mode by default. I really don't buy that adding different result types is an issue, as hinted earlier. We've actually done it in GDB, for a loooong while. We have distinct XFAIL vs KFAIL output messages show up in the .sum file (though those aren't distinguished in the summary file, doing so would be convenient, hence I see that as a GDB testsuite bug). And just very recently, we've added PATH and DUPLICATE output messages, and those do show up in the summary: # of expected passes 89593 # of unexpected failures 547 # of expected failures 146 # of known failures 104 # of untested testcases 20 # of unresolved testcases 5 # of unsupported tests 97 # of paths in test names 1 <<<<< # of duplicate test names 370 <<<<< I would think that even these two would be useful to have in DejaGnu proper for all testsuites. v1/rationale - https://sourceware.org/pipermail/gdb-patches/2020-April/167847.html v3/final version - https://sourceware.org/pipermail/gdb-patches/2020-April/168105.html Thanks, Pedro Alves
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 11:41:02 GMT) Full text and rfc822 format available.Message #67 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 12:40:20 +0100
On 6/26/20 11:37 PM, Jacob Bachmeyer wrote: > Pedro Alves wrote: >> On 6/25/20 11:39 PM, Jacob Bachmeyer wrote: >> >>> Pedro Alves wrote: >>> >>>> On 6/25/20 3:43 AM, Jacob Bachmeyer wrote: >>>> >>>> >>>>> Pedro Alves wrote: >>>>> >>>>>> [...] >>>>>> >>>>> I presume buildbot can be easily configured to pass a '--keep_going' option, perhaps as 'RUNTESTFLAGS=--keep_going' if needed? >>>>> >>>> What's more likely is we'll tweak GDB's Makefile so that make check >>>> enables that option automatically. If GCC does the same, which I would >>>> encourage them to, since they also run the testsuite against systems >>>> that can take hours/days to complete, then I'll get to wonder who the >>>> default is for. :-) >>>> >>> The default is for developers working on testsuites -- and for testing release candidates and releases because if your testcases crash in a release, you have problems. :-) >>> >>> Are you suggesting withdrawing the --keep-going/--no-keep-going options >>> >> >> Yes, but if you would like to add it, it's OK with me, though then >> you can say that I'm more discussing the default. But really I'm more >> worried about making sure that the people maintaining the GDB >> testsuite (me included), and the DejaGnu people understand each >> other's needs better. >> > > My position here stems from the need for reliable testing, and part of that is that crashes in the testsuite scripts need to be made very obvious, where DejaGnu had previously swept them under the proverbial rug. Sure, there would be a message in the log, but that is easily missed in a testsuite as large as GDB's and nowhere was a need to grep the log for those messages documented. Sure, but we don't need to drastically change DejaGnu's behavior to address that. > >>>>> Still, needing a new option to get old behavior is changing the environment interface, so I will make --keep_going the default, at least for the rest of 1.6. That default is likely to change for 1.7. For automated systems like buildbot, passing '--keep_going' should be perfectly reasonable, which is why the option exists. (While the documented options will be changed to use hyphens to align with the rest of the GNU system in 1.7, the current option names will continue to be accepted, since the aliases are obvious.) >>>>> >>>> Currently with DejaGnu only the case of calling an unknown function stops >>>> the test run AFAIK. Any other tcl error like calling a proc with the wrong >>>> number of arguments, or treating a non-array variable as an array, etc. is >>>> caught by the catch in the runtest proc in runtest.exp, and the testrun >>>> continues. That to me clearly indicates that the original intention was >>>> to catch errors and continue. >>>> >>>> That also gives the tool's $(tool)_finish proc a chance to tear down >>>> correctly. >>>> >>>> It is just that the "unknown" case wasn't thought of. So I argue that not >>>> making unknown proc calls abort the run is a bug fix, and making DejaGnu >>>> abort the run for other errors by default is a behavior change that no real >>>> testsuite built around DejaGnu is asking for. >>>> >>> As it was explained to me, it is the other way around -- the catch in runtest was overlooked when ::unknown was added to catch bugs by aborting if an undefined procedure is called. >>> >>> Is there a ${tool}_finish or do you mean ${tool}_exit? Either way, you make a good point about running cleanup procedures when aborting a test run. >>> >> >> Look at proc runtest in runtest.exp, it calls ${tool}_init, sources the testcase under catch (using ERROR:, so here you could >> run a tally of such caught errors), and then calls ${tool}_finish. >> > > Well, sure enough, there are ${tool}_init and ${tool}_finish procedures called "around" each script if they exist... and no documentation... another item added to my local TODO list for improving the manual. > >>>>> The entire reason for stopping the testsuite immediately when a Tcl error occurs is that it is undeniable that something is *very* *wrong* when the testsuite stops early. Software testing is hard enough when you know your testsuite is good, but it is far worse when you have bugs in the testsuite masking bugs in the actual program because a test script is crashing and part of the program that is supposed to be tested is not being exercised. It's actually deniable that there's something *very* *wrong*. For one, all tests run under the same runtest process, and are sourced in the global namespace. So it is easy for a global variable to leak from one testcase to another. That's usually harmless, except when you have set foo 1 in testcase a.exp, and then b.exp does: set foo(1) 1 Whoops, TCL doesn't like that, so it aborts the testcase. Could this be solved by putting each testcase in its own address space? Yes. Do people actually do that? No -- it's a lot of make work and the problem is rare. Is that reuse global as array really revealing such a MAJOR issue with the GDB testcase that it warrants stopping the whole run? No... I would argue that instead it is the DejaGnu framework that should make it so that each testcase is run under as much isolation as possible. So I call that a design issue in DejaGnu itself. Whether that could be fixed by something like sourcing each testcase under its own Tcl slave interpreter, I'm not sure. Another example of a Tcl bug that sometimes we trip on. Say we have a testcase that needs to extract some address from the inferior in its setup phase, something like this: set test "get address" gdb_test_multiple "x /i \$pc" $test { -re "($hex).*$gdb_prompt $" { set addr $expect_out(1,string) pass $test } } if {$addr == 0} { # The above failed, so no use continuing. return } Now, gdb_test_multiple is a wrapper around expect, that internally catches GDB crashes, internal GDB errors, timeouts and other things, so the bug above is often missed. The bug is of course that when e.g., a timeout occurs, $addr is left uninitialized, so you get a Tcl error referencing a nonexisting variable. But the thing is, we already issued a FAIL, and we want to bail out of the testcase anyway. Crashing the whole testrun for a bug that typically happens in already-failing code is too strict. And these are the code paths that are not exercised in a normal run, so are easy to miss, and are just impossible to have full coverage for. >>>>> >>>> I think it's safe to say that we all understand the general principle, >>>> but IMO that applies more to continuing the same testcase (if somehow that >>>> were possible, like "unknown" suppressing the error), rather than continuing >>>> with a different testcase. >>>> >>>> Continuing the testrun when one testcase errors out does not mask any >>>> bug in the program that is supposed to be tested. Why would it? The >>>> testcase aborts, it doesn't continue. The remaining testcases start >>>> afresh. >>>> >>> Consider a hypothetical case where GDB has exec.exp (which tests starting a debugged process) and attach.exp (which tests attaching to a process). Unknown to the GDB developers, the "attach" command does not work and causes GDB to dump core (oops!) but attach.exp fails with a Tcl error before reaching the point where "attach" is issued to the inferior GDB. If DejaGnu stops immediately, it is obvious that the testsuite is broken, the bug in the testsuite is fixed, and the bug in GDB is found. If DejaGnu sweeps the Tcl error under the proverbial rug (as previous versions did), the bug in the testsuite effectively hides the bug in GDB , because the developers think the "attach" command is being exercised, but those tests are being (almost) silently skipped. >>> >> >> First, that scenario doesn't normally cause a Tcl error. If GDB trips >> an internal error, the GDB testsuite catches it with a FAIL. If GDB just dies >> (whether it dumps core or not) the GDB testsuite catches that >> with UNRESOLVED. >> > > In the hypothetical scenario, attach.exp crashes with a Tcl error *before* issuing the "attach" command to the GDB-under-test. Previous releases of DejaGnu mention a Tcl error in the log and carry on. With this patch, DejaGnu catches that attach.exp crashed and itself inserts an UNRESOLVED result. > > I also take this as more support for using the UNRESOLVED status here: the test script crashing is little different from GDB exiting unexpectedly. Well, I'm actually not sure why doesn't GDB FAIL in the crash case instead of UNRESOLVED. To me a GDB internal error or the GDB process crashing doesn't look very different. Both require pretty much the same level of attention from a developer. But that decision predates me. Who knows, maybe GDB internal error support was only added after the testsuite was originally written, and thus the crash handling predates the internal error handling. > >> Also, even if those weren't already considered, the developer won't think the >> testcase is just passing correctly, because being a good GDB developer, >> he knows he needs to diff the new gdb.sum file against a previous run's, and >> sees that some PASSes are missing, and some ERROR: tcl error showed up >> in their stead. Nobody should rely on just looking at the summary. >> Everyone _must_ diff between a previous run and the current run. Now some >> people may use test result diffing scripts that forget to look >> at ERROR: lines in .sum. Those would be buggy, but then again, if >> those ERRORs were accounted for in the DejaGnu summary, they would be >> harder to miss. >> > > If this is the case, then what is the problem with aborting the test run upon error? "Good GDB developers" will always diff the .sum files, notice that the expected new PASS results are not there, and fix their code before checking it in, so they will never have a problem with DejaGnu aborting upon encountering a Tcl error. :-) As I've explained, the problem is that one testcase prevented a large number of others from running, potentially substantially all of the testsuite. So you've now wasted a lot of time and maybe prevented other people from using the target system meanwhile. All because of needless interference between testcases. > >> And again, people run the GDB testsuite in parallel mode, so your >> assumption that the testsuite would stop immediately anyway doesn't >> hold. One of the parallel runtest invocations bombs >> out, while the others keep running, so by the time you get >> back the prompt, other testcases already executed, and you don't >> see the ERROR in the current terminal anymore. > > This is a fundamental problem with slicing up the testsuite external to DejaGnu and running multiple instances of runtest -- output that is supposed to be "last at the end of the run" can be hidden away by other ongoing subset runs. That's like arguing against parallelization, and saying that everyone should run the testsuite in serial mode, and be OK with it taking 2 hours instead of 30 mins to complete a run... The parallel mode aggregates the summaries from the different runtest invocations as a single summary at the end, https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=contrib/dg-extract-results.py;h=30aa68771d491451f72d6dbd18f6d5ac339451b5;hb=HEAD so I don't see what problem here is unsolvable. Again, the issue is that Tcl ERRORs aren't tallied up on the DejaGnu summary, so the aggregated summary doesn't contain those either. > >> You'll need to >> look at the testsuite .sum .log files anyway. Only, you will >> find out that one testcase bombed out, 50000 tests ran, but 1000 others >> didn't run because of that one testcase. And now you're missing >> those 1000 results, which maybe included those which you were >> actually interested in, because you were doing a builbot try run >> across a number of systems. >> > > This also applies on a smaller scale for which we have no workaround: gdb.foo/bar.exp bombs out and the remaining /N/ tests in that file did not run. The only real solution is for the test scripts to avoid causing Tcl errors in the first place; anything else is papering over the problem with duct tape. That's not the same thing. The remaining /N/ tests in the file depend on the previous /M/ tests. E.g., gdb_test "step" gdb_test "print foo" If the "step" fails, the "print foo" test is meaningless. However, there should be absolutely no dependency between foo.exp and bar.exp. It should be possible to run those two different testcases in any order. Thus, it should be irrelevant to foo.exp whether bar.exp bombs out of not. Consequently, there's no real reason to not let foo.exp run regardless of what happens to bar.exp. > >> Now, when running the testsuite against slow embedded boards, then >> you won't normally be using the parallel mode, the board just can't >> support multiple parallel connections. In that case, you set the >> testsuite running, and go on your merry way for a couple hours or 20. >> When you come back, maybe the next day, you notice that the testsuite >> actually stopped running 30 mins in, because someone changed >> the testsuite upstream just before you rebased in way that caused a >> tcl error for you in some testcase. Oops. You just lost a day. > > This is more of an annoyance, and yes, I have come back the next day to find that a long-running command failed shortly after being started, so I know this irritation. There you go. > >> Now >> you wish you had remembered to type --keep-going, but it's not >> the default, and nobody ever writes Tcl bugs, so you didn't think >> of it before. >> > > Note that the snapshot of GDB I have been using for testing this patch does have such a Tcl bug in its testsuite. I did not make any particular effort to choose this snapshot, it was just the current state of GDB master when I started. I have not even looked at *which* commit it is. Can you tell me what the error looks like? > Thinking "nobody ever writes Tcl bugs" is what causes these problems in the first place. I don't know what you mean by this. What I meant is that people likely won't remember to use --keep-going when invoking runtest interactively. Why force people to go through the pain of learning the option? > > I would expect that, had DejaGnu always aborted immediately upon catching Tcl errors from testsuites, that issue would have been found and fixed quickly. As I've explained, that "aborted immediately" is just impossible by design with parallel mode. You could approximate it perhaps, by making make abort all the parallel runtests when one of the runtests bombs out. It won't be immediately. And I'd of course think that adding such feature wouldn't help anyone in reality. And with your "quickly", you're assuming that a Tcl errors typically happen in the normal code paths. If it happens on some timeout path as I explained, then it may be that it's not that easy to trigger the error, unless you happen to run the testsuite on a "lucky" system with a "wrong" combination of environment (compiler, glibc, kernel, etc.). Thanks, Pedro Alves
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 11:45:02 GMT) Full text and rfc822 format available.Message #70 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 12:44:22 +0100
On 6/27/20 12:40 PM, Pedro Alves wrote: > > Could this be solved by putting each testcase in its own address space? I meant to say namespace instead. Thanks, Pedro Alves
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 12:52:01 GMT) Full text and rfc822 format available.Message #73 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Rob Savoye <rob <at> senecass.com> To: jcb62281 <at> gmail.com Cc: Pedro Alves <palves <at> redhat.com>, Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 06:49:30 -0600
On 6/25/20 7:53 PM, Jacob Bachmeyer wrote: > The original rationale for using UNRESOLVED here was that DejaGnu > converts a test result to UNRESOLVED if too many errors or warnings were > produced. The DejaGnu manual indicates (section "A POSIX Compliant Test > Framework") that UNRESOLVED is correct for a test where execution was > interrupted or was set up incorrectly. UNTESTED is specifically listed > as a placeholder for an as-yet-unwritten testcase. A "typical" GDB > testsuite run has almost a hundred UNTESTED results but (without this > patch) zero UNRESOLVED results. > > To me, this seems that UNRESOLVED is correct here, or that the manual > has an error. So having written both, there is always the chance I've interpreted things differently. If the test is interrupted, or has errors/warnings, like timing problems for example, then it's UNRESOLVED. But a bug in Tcl code enough to trigger unknown is UNTESTED, as the test never really ran. I'd go with which definition the toolchain teams prefer, as this effects validation testruns. I could go either way. I don't know if POSIX 1003.3 is still considered a standard, that was a long time ago. The testsuites were primarily focused on XOpen and POSIX conformance tests, not toolchains. I was on the standards committee, so used them as a model cause they seemed a good standard. But the details of the interpretation can now be whatever we want it to be. We're probably the only ones left running TET compliant testsuites anyway. :-) Anyway, whether a test run aborts on an error or not is not defined, so it's up to us to decide. - rob -
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 13:16:02 GMT) Full text and rfc822 format available.Message #76 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com, Pedro Alves <pedro <at> palves.net> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 14:15:20 +0100
On 6/27/20 12:40 PM, Pedro Alves wrote: > We > have distinct XFAIL vs KFAIL output messages show up in the .sum file (though > those aren't distinguished in the summary file, doing so would be convenient, > hence I see that as a GDB testsuite bug). > Sorry, somehow I got confused and thought KFAIL was GDB specific because AFAIK GCC doesn't use it. DejaGnu does show the KFAILs separated in the summary. I don't why I got confused, I've actually read the kfail code in DejaGnu before. Oh well. The point still stands with DUPLICATE and PATH output results. And KFAIL isn't POSIX either, I think? Thanks, Pedro Alves
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 17:22:02 GMT) Full text and rfc822 format available.Message #79 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 12:21:34 -0500
Pedro Alves wrote: > On 6/26/20 11:37 PM, Jacob Bachmeyer wrote: > >> Pedro Alves wrote: >> >>> On 6/25/20 11:39 PM, Jacob Bachmeyer wrote: >>> >>> >>>> Pedro Alves wrote: >>>> >>>> > > [...] > > Could this be solved by putting each testcase in its own address space? > Yes. Do people actually do that? No -- it's a lot of make work and the > problem is rare. Is that reuse global as array really revealing such a MAJOR > issue with the GDB testcase that it warrants stopping the whole run? No... > > I would argue that instead it is the DejaGnu framework that should make > it so that each testcase is run under as much isolation as possible. > So I call that a design issue in DejaGnu itself. Whether that could > be fixed by something like sourcing each testcase under its own > Tcl slave interpreter, I'm not sure. > This is the current plan -- to eventually run test scripts in independent slave interpreters, which enforces the isolation necessary at the Tcl level for future native parallel testing and also prevents test scripts from accidentally clobbering globals that are important to DejaGnu, some of which have very generic names. > Another example of a Tcl bug that sometimes we trip on. Say we have a > testcase that needs to extract some address from the inferior > in its setup phase, something like this: > > set test "get address" > gdb_test_multiple "x /i \$pc" $test { > -re "($hex).*$gdb_prompt $" { > set addr $expect_out(1,string) > pass $test > } > } > > if {$addr == 0} { > # The above failed, so no use continuing. > return > } > > Now, gdb_test_multiple is a wrapper around expect, that internally catches > GDB crashes, internal GDB errors, timeouts and other things, so the bug above > is often missed. The bug is of course that when e.g., a timeout occurs, $addr > is left uninitialized, so you get a Tcl error referencing a nonexisting variable. > > But the thing is, we already issued a FAIL, and we want to bail out of > the testcase anyway. Crashing the whole testrun for a bug that typically > happens in already-failing code is too strict. And these are the > code paths that are not exercised in a normal run, so are > easy to miss, and are just impossible to have full coverage for. > Fine; I will loosen that, but DejaGnu has no way to know that it is bailing out of a failing testcase and that there are no more tests from that script, so an UNRESOLVED result complaining about the error will still be inserted. >>>>> I think it's safe to say that we all understand the general principle, >>>>> but IMO that applies more to continuing the same testcase (if somehow that >>>>> were possible, like "unknown" suppressing the error), rather than continuing >>>>> with a different testcase. >>>>> >>>>> Continuing the testrun when one testcase errors out does not mask any >>>>> bug in the program that is supposed to be tested. Why would it? The >>>>> testcase aborts, it doesn't continue. The remaining testcases start >>>>> afresh. >>>>> >>>>> >>>> Consider a hypothetical case where GDB has exec.exp (which tests starting a debugged process) and attach.exp (which tests attaching to a process). Unknown to the GDB developers, the "attach" command does not work and causes GDB to dump core (oops!) but attach.exp fails with a Tcl error before reaching the point where "attach" is issued to the inferior GDB. If DejaGnu stops immediately, it is obvious that the testsuite is broken, the bug in the testsuite is fixed, and the bug in GDB is found. If DejaGnu sweeps the Tcl error under the proverbial rug (as previous versions did), the bug in the testsuite effectively hides the bug in GDB , because the developers think the "attach" command is being exercised, but those tests are being (almost) silently skipped. >>>> >>>> >>> First, that scenario doesn't normally cause a Tcl error. If GDB trips >>> an internal error, the GDB testsuite catches it with a FAIL. If GDB just dies >>> (whether it dumps core or not) the GDB testsuite catches that >>> with UNRESOLVED. >>> >>> >> In the hypothetical scenario, attach.exp crashes with a Tcl error *before* issuing the "attach" command to the GDB-under-test. Previous releases of DejaGnu mention a Tcl error in the log and carry on. With this patch, DejaGnu catches that attach.exp crashed and itself inserts an UNRESOLVED result. >> >> I also take this as more support for using the UNRESOLVED status here: the test script crashing is little different from GDB exiting unexpectedly. >> > > Well, I'm actually not sure why doesn't GDB FAIL in the crash case instead of > UNRESOLVED. To me a GDB internal error or the GDB process crashing doesn't > look very different. Both require pretty much the same level of attention > from a developer. But that decision predates me. Who knows, maybe GDB > internal error support was only added after the testsuite was originally > written, and thus the crash handling predates the internal error handling. > Considering that testing GDB was one of the motivations for writing DejaGnu and how old DejaGnu is, I suspect that the testsuite had to handle GDB crashes before GDB had support for reporting internal errors. In some sense, UNRESOLVED is more serious than FAIL; the GDB testsuite has about a hundred FAIL results but typically no UNRESOLVED results. >>> And again, people run the GDB testsuite in parallel mode, so your >>> assumption that the testsuite would stop immediately anyway doesn't >>> hold. One of the parallel runtest invocations bombs >>> out, while the others keep running, so by the time you get >>> back the prompt, other testcases already executed, and you don't >>> see the ERROR in the current terminal anymore. >>> >> This is a fundamental problem with slicing up the testsuite external to DejaGnu and running multiple instances of runtest -- output that is supposed to be "last at the end of the run" can be hidden away by other ongoing subset runs. >> > > That's like arguing against parallelization, and saying that everyone > should run the testsuite in serial mode, and be OK with it taking 2 hours > instead of 30 mins to complete a run... > It is a problem with parallelization, just like aborting a test run is a problem when running parallel tests. > The parallel mode aggregates the summaries from the different runtest > invocations as a single summary at the end, > > https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=contrib/dg-extract-results.py;h=30aa68771d491451f72d6dbd18f6d5ac339451b5;hb=HEAD > > so I don't see what problem here is unsolvable. Again, the issue is that Tcl > ERRORs aren't tallied up on the DejaGnu summary, so the aggregated summary > doesn't contain those either. > The ERRORs will be printed a second time, so all the script needs to do is collect text that appears after the summary and starts with "ERROR:". There seems to also be some confusion between Tcl errors and the ERROR messages produce by DejaGnu's perror procedure (or its internal equivalents). Would it be better to report Tcl errors with "CRASH" instead of "ERROR"? >>> You'll need to >>> look at the testsuite .sum .log files anyway. Only, you will >>> find out that one testcase bombed out, 50000 tests ran, but 1000 others >>> didn't run because of that one testcase. And now you're missing >>> those 1000 results, which maybe included those which you were >>> actually interested in, because you were doing a builbot try run >>> across a number of systems. >>> >>> >> This also applies on a smaller scale for which we have no workaround: gdb.foo/bar.exp bombs out and the remaining /N/ tests in that file did not run. The only real solution is for the test scripts to avoid causing Tcl errors in the first place; anything else is papering over the problem with duct tape. >> > > That's not the same thing. The remaining /N/ tests in the file depend > on the previous /M/ tests. E.g., > > gdb_test "step" > gdb_test "print foo" > > If the "step" fails, the "print foo" test is meaningless. > > However, there should be absolutely no dependency between foo.exp and > bar.exp. It should be possible to run those two different testcases > in any order. Thus, it should be irrelevant to foo.exp whether bar.exp > bombs out of not. Consequently, there's no real reason to not let > foo.exp run regardless of what happens to bar.exp. > This will need to be documented... another item for the local TODO list. >>> Now >>> you wish you had remembered to type --keep-going, but it's not >>> the default, and nobody ever writes Tcl bugs, so you didn't think >>> of it before. >>> >>> >> Note that the snapshot of GDB I have been using for testing this patch does have such a Tcl bug in its testsuite. I did not make any particular effort to choose this snapshot, it was just the current state of GDB master when I started. I have not even looked at *which* commit it is. >> > > Can you tell me what the error looks like? > With commit 7b958a48e1322880f23cdb0a1c35643dd27d3ddb and GDB built without Python support: ERROR: tcl error sourcing /home/jacob/arena/gdb-check/build-queue-test/gdb/testsuite/../../../src/gdb/testsuite/gdb.python/py-format-string.exp. ERROR: invoked "continue" outside of a loop -- Jacob
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 17:30:02 GMT) Full text and rfc822 format available.Message #82 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Rob Savoye <rob <at> senecass.com> Cc: Pedro Alves <palves <at> redhat.com>, Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 12:29:45 -0500
Rob Savoye wrote: > On 6/25/20 7:53 PM, Jacob Bachmeyer wrote: > >> The original rationale for using UNRESOLVED here was that DejaGnu >> converts a test result to UNRESOLVED if too many errors or warnings were >> produced. The DejaGnu manual indicates (section "A POSIX Compliant Test >> Framework") that UNRESOLVED is correct for a test where execution was >> interrupted or was set up incorrectly. UNTESTED is specifically listed >> as a placeholder for an as-yet-unwritten testcase. A "typical" GDB >> testsuite run has almost a hundred UNTESTED results but (without this >> patch) zero UNRESOLVED results. >> >> To me, this seems that UNRESOLVED is correct here, or that the manual >> has an error. >> > > So having written both, there is always the chance I've interpreted > things differently. If the test is interrupted, or has errors/warnings, > like timing problems for example, then it's UNRESOLVED. But a bug in Tcl > code enough to trigger unknown is UNTESTED, as the test never really > ran. I'd go with which definition the toolchain teams prefer, as this > effects validation testruns. I could go either way. > I see UNTESTED as a placeholder for a yet-to-be-written test; DejaGnu does not produce an error exit status after reporting UNTESTED. Once Tcl has begun to execute "source $test_file_name", I see the test as running from the framework's perspective, so an error exit should be UNRESOLVED: we do not know the result that would have been produced had the Tcl error not occurred. > I don't know if POSIX 1003.3 is still considered a standard, that was > a long time ago. The testsuites were primarily focused on XOpen and > POSIX conformance tests, not toolchains. I was > on the standards committee, so used them as a model cause they seemed a > good standard. But the details of the interpretation can now be whatever > we want it to be. We're probably the only ones left running TET > compliant testsuites anyway. :-) > POSIX 1003.3 appears to have been withdrawn. > Anyway, whether a test run aborts on an error or not is not defined, > so it's up to us to decide. Apparently, GDB has baked an assumption that the test run will not abort, ever, into their testsuite, and we had never documented circumstances that cause DejaGnu to abort, so I cannot really answer that with "your assumption is wrong". -- Jacob
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 17:43:02 GMT) Full text and rfc822 format available.Message #85 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: 41824 <at> debbugs.gnu.org, Tom de Vries <tdevries <at> suse.de>, Pedro Alves <pedro <at> palves.net> Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 12:42:05 -0500
Pedro Alves wrote: > On 6/26/20 11:58 PM, Jacob Bachmeyer wrote: > > >>> The issue for me is that the "gdb Summary" tally did not >>> mention the aborted testcase. I would like to see something >>> like this instead: >>> >>> # of expected passes 208 >>> # of nasty OMG FIX! tcl errors 1 >>> >>> :-) >>> >>> Your UNRESOLVED is of course better than the status quo. >>> >>> >> UNRESOLVED is a step in the right direction, but I agree that more is needed. We are somewhat limited by POSIX, which does not define a separate result type for "test case crashed" and seems to roll that into UNRESOLVED, although POSIX appears to require a testsuite to have a strictly-defined set of tests and that a run must produce results for exactly all of them, which requires information DejaGnu does not normally have about the running testsuite. So, while UNRESOLVED is needed, we could also maintain a separate count of UNRESOLVED-due-to-crash, or simply store away the error message/errorCode/errorInfo tuples and emit them a second time at the end of the run, also easy in Tcl 8. >> > > I'm not sure how much is POSIX compliance important these days. I would hope that that > would not limit DejaGnu's development and limit its potential to be more usable. We already have extensions; POSIX only allowed PASS/FAIL/UNTESTED/UNRESOLVED/UNSUPPORTED. [KX]{PASS,FAIL} are GNU extensions. > Even > if POSIX compliance is important, I would think that that could be put behind > a --posixly_correct option, and/or POSIXLY_CORRECT environment variable, like other > GNU tools do: > > https://www.gnu.org/prep/standards/html_node/Non_002dGNU-Standards.html > > Or conversely, we could have a --gnu mode, though I would say that a GNU tool > should behave in GNU mode by default. > POSIX compliance in DejaGnu depended on the testsuite being run. Any results generated by the framework itself need to be in the POSIX set, though testsuites can use extensions, as GDB has done. Actually adding more test result codes is dubious, as that will interact badly with XML output and efficient database processing, both of which can reasonably enumerate the nine result codes. ({,K,X}{PASS,FAIL}, UN{TEST,RESOLV,SUPPORT}ED) > I really don't buy that adding different result types is an issue, as > hinted earlier. We've actually done it in GDB, for a loooong while. We > have distinct XFAIL vs KFAIL output messages show up in the .sum file (though > those aren't distinguished in the summary file, doing so would be convenient, > hence I see that as a GDB testsuite bug). > > And just very recently, we've added PATH and DUPLICATE output messages, > and those do show up in the summary: > > # of expected passes 89593 > # of unexpected failures 547 > # of expected failures 146 > # of known failures 104 > # of untested testcases 20 > # of unresolved testcases 5 > # of unsupported tests 97 > # of paths in test names 1 <<<<< > # of duplicate test names 370 <<<<< > > I would think that even these two would be useful to have in > DejaGnu proper for all testsuites. > > v1/rationale - https://sourceware.org/pipermail/gdb-patches/2020-April/167847.html > v3/final version - https://sourceware.org/pipermail/gdb-patches/2020-April/168105.html > I will look at the exact details later, but this sounds like it is probably a monkeypatch that will stop working in some future version when the procedures you are overriding disappear into an internal namespace, so we need to be thinking about what would be needed to provide extensible summary line sets, just as we need some way to extend language support in default_target_compile without overriding the entire procedure. -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Sat, 27 Jun 2020 17:54:01 GMT) Full text and rfc822 format available.Message #88 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Sat, 27 Jun 2020 18:53:19 +0100
On 6/27/20 6:21 PM, Jacob Bachmeyer wrote: > Pedro Alves wrote: >> On 6/26/20 11:37 PM, Jacob Bachmeyer wrote: >>> Note that the snapshot of GDB I have been using for testing this patch does have such a Tcl bug in its testsuite. I did not make any particular effort to choose this snapshot, it was just the current state of GDB master when I started. I have not even looked at *which* commit it is. >> >> Can you tell me what the error looks like? >> > > With commit 7b958a48e1322880f23cdb0a1c35643dd27d3ddb and GDB built without Python support: > ERROR: tcl error sourcing /home/jacob/arena/gdb-check/build-queue-test/gdb/testsuite/../../../src/gdb/testsuite/gdb.python/py-format-string.exp. > ERROR: invoked "continue" outside of a loop Thanks. It's already been fixed: commit 05e682e3be7e3d9d63ec358dcf8943fd200545cb Author: Sandra Loosemore <sandra <at> codesourcery.com> AuthorDate: Wed Jun 17 13:43:32 2020 -0700 Commit: Sandra Loosemore <sandra <at> codesourcery.com> CommitDate: Wed Jun 17 13:43:32 2020 -0700 Fix TCL error in gdb.python/py-format-string.exp. 2020-06-17 Sandra Loosemore <sandra <at> codesourcery.com> gdb/testsuite/ * gdb.python/py-format-string.exp: Move test for python support earlier, out of function body. Thanks, Pedro Alves
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Tue, 30 Jun 2020 04:41:01 GMT) Full text and rfc822 format available.Message #91 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Mon, 29 Jun 2020 23:40:43 -0500
Pedro Alves wrote: > On 6/26/20 11:37 PM, Jacob Bachmeyer wrote: > >> Pedro Alves wrote: >> >>> And again, people run the GDB testsuite in parallel mode, so your >>> assumption that the testsuite would stop immediately anyway doesn't >>> hold. One of the parallel runtest invocations bombs >>> out, while the others keep running, so by the time you get >>> back the prompt, other testcases already executed, and you don't >>> see the ERROR in the current terminal anymore. >>> >> This is a fundamental problem with slicing up the testsuite external to DejaGnu and running multiple instances of runtest -- output that is supposed to be "last at the end of the run" can be hidden away by other ongoing subset runs. >> > > That's like arguing against parallelization, and saying that everyone > should run the testsuite in serial mode, and be OK with it taking 2 hours > instead of 30 mins to complete a run... > > The parallel mode aggregates the summaries from the different runtest > invocations as a single summary at the end, > > https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=contrib/dg-extract-results.py;h=30aa68771d491451f72d6dbd18f6d5ac339451b5;hb=HEAD > > so I don't see what problem here is unsolvable. Again, the issue is that Tcl > ERRORs aren't tallied up on the DejaGnu summary, so the aggregated summary > doesn't contain those either. > The current version of the fix for this issue (commit 61dc0cafad8845b3c668940ed2e574bd503d410f on temporary branch PR41918) generates separator lines containing at least 10 hyphens around each error when repeating the errors after the summary. The leading separator line currently begins with "ERROR: " due to limitations of the internal interfaces, but plans are to eventually fix that and collapse the separator lines between adjacent error reports if multiple Tcl errors occur in a run. Each error report begins with "ERROR: in testcase " and ends with a Tcl backtrace and a line of at least 10 hyphens. Is this something the GDB results aggregation tool can be extended to handle? -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Tue, 30 Jun 2020 12:27:02 GMT) Full text and rfc822 format available.Message #94 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Tue, 30 Jun 2020 13:26:25 +0100
On 6/30/20 5:40 AM, Jacob Bachmeyer wrote: > Pedro Alves wrote: >> On 6/26/20 11:37 PM, Jacob Bachmeyer wrote: >> >>> Pedro Alves wrote: >>> >>>> And again, people run the GDB testsuite in parallel mode, so your >>>> assumption that the testsuite would stop immediately anyway doesn't >>>> hold. One of the parallel runtest invocations bombs >>>> out, while the others keep running, so by the time you get >>>> back the prompt, other testcases already executed, and you don't >>>> see the ERROR in the current terminal anymore. >>>> >>> This is a fundamental problem with slicing up the testsuite external to DejaGnu and running multiple instances of runtest -- output that is supposed to be "last at the end of the run" can be hidden away by other ongoing subset runs. >>> >> >> That's like arguing against parallelization, and saying that everyone >> should run the testsuite in serial mode, and be OK with it taking 2 hours >> instead of 30 mins to complete a run... >> >> The parallel mode aggregates the summaries from the different runtest >> invocations as a single summary at the end, >> >> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=contrib/dg-extract-results.py;h=30aa68771d491451f72d6dbd18f6d5ac339451b5;hb=HEAD >> >> so I don't see what problem here is unsolvable. Again, the issue is that Tcl >> ERRORs aren't tallied up on the DejaGnu summary, so the aggregated summary >> doesn't contain those either. >> > > The current version of the fix for this issue (commit 61dc0cafad8845b3c668940ed2e574bd503d410f on temporary branch PR41918) generates separator lines containing at least 10 hyphens around each error when repeating the errors after the summary. The leading separator line currently begins with "ERROR: " due to limitations of the internal interfaces, but plans are to eventually fix that and collapse the separator lines between adjacent error reports if multiple Tcl errors occur in a run. Each error report begins with "ERROR: in testcase " and ends with a Tcl backtrace and a line of at least 10 hyphens. Is this something the GDB results aggregation tool can be extended to handle? Your guess is as good as mine, since I never touched that code. Thanks, Pedro Alves
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Tue, 30 Jun 2020 23:50:02 GMT) Full text and rfc822 format available.Message #97 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: Tom de Vries <tdevries <at> suse.de>, 41824 <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Tue, 30 Jun 2020 18:49:11 -0500
Pedro Alves wrote: > On 6/30/20 5:40 AM, Jacob Bachmeyer wrote: > >> Pedro Alves wrote: >> >>> On 6/26/20 11:37 PM, Jacob Bachmeyer wrote: >>> >>> >>>> Pedro Alves wrote: >>>> >>>> >>>>> And again, people run the GDB testsuite in parallel mode, so your >>>>> assumption that the testsuite would stop immediately anyway doesn't >>>>> hold. One of the parallel runtest invocations bombs >>>>> out, while the others keep running, so by the time you get >>>>> back the prompt, other testcases already executed, and you don't >>>>> see the ERROR in the current terminal anymore. >>>>> >>>>> >>>> This is a fundamental problem with slicing up the testsuite external to DejaGnu and running multiple instances of runtest -- output that is supposed to be "last at the end of the run" can be hidden away by other ongoing subset runs. >>>> >>>> >>> That's like arguing against parallelization, and saying that everyone >>> should run the testsuite in serial mode, and be OK with it taking 2 hours >>> instead of 30 mins to complete a run... >>> >>> The parallel mode aggregates the summaries from the different runtest >>> invocations as a single summary at the end, >>> >>> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=contrib/dg-extract-results.py;h=30aa68771d491451f72d6dbd18f6d5ac339451b5;hb=HEAD >>> >>> so I don't see what problem here is unsolvable. Again, the issue is that Tcl >>> ERRORs aren't tallied up on the DejaGnu summary, so the aggregated summary >>> doesn't contain those either. >>> >>> >> The current version of the fix for this issue (commit 61dc0cafad8845b3c668940ed2e574bd503d410f on temporary branch PR41918) generates separator lines containing at least 10 hyphens around each error when repeating the errors after the summary. The leading separator line currently begins with "ERROR: " due to limitations of the internal interfaces, but plans are to eventually fix that and collapse the separator lines between adjacent error reports if multiple Tcl errors occur in a run. Each error report begins with "ERROR: in testcase " and ends with a Tcl backtrace and a line of at least 10 hyphens. Is this something the GDB results aggregation tool can be extended to handle? >> > > Your guess is as good as mine, since I never touched that code. > After briefly looking through that script, I expect that it will not be particularly difficult, although I also note that the script counts the error messages generated by DejaGnu's ::unknown procedure, so lib/framework.exp:unknown cannot be withdrawn as useless until 1.7. -- Jacob
bug-dejagnu <at> gnu.org
:bug#41824
; Package dejagnu
.
(Tue, 30 Jun 2020 23:56:01 GMT) Full text and rfc822 format available.Message #100 received at 41824 <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: 41824 <at> debbugs.gnu.org, 41918 <at> debbugs.gnu.org Subject: Branch PR41918 containing fixes for these issues is into final testing Date: Tue, 30 Jun 2020 18:54:51 -0500
The temporary branch PR41918 (which earlier absorbed the PR41824 temporary branch) is now merged onto the testing queue ahead of master. After these patches pass full testing, master will be fast-forwarded to include them. -- Jacob
jcb62281 <at> gmail.com
:Tom de Vries <tdevries <at> suse.de>
:Message #105 received at 41824-done <at> debbugs.gnu.org (full text, mbox):
From: Jacob Bachmeyer <jcb62281 <at> gmail.com> To: Pedro Alves <palves <at> redhat.com> Cc: Tom de Vries <tdevries <at> suse.de>, 41824-done <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Wed, 08 Jul 2020 18:52:56 -0500
The bugfix branch has landed on master and will be included in the upcoming 1.6.3 release. -- Jacob
bug-dejagnu <at> gnu.org, jcb62281 <at> gmail.com
:bug#41824
; Package dejagnu
.
(Thu, 09 Jul 2020 12:18:01 GMT) Full text and rfc822 format available.Message #108 received at 41824-done <at> debbugs.gnu.org (full text, mbox):
From: Pedro Alves <palves <at> redhat.com> To: jcb62281 <at> gmail.com Cc: Tom de Vries <tdevries <at> suse.de>, 41824-done <at> debbugs.gnu.org Subject: Re: bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case Date: Thu, 9 Jul 2020 13:17:06 +0100
On 7/9/20 12:52 AM, Jacob Bachmeyer wrote: > The bugfix branch has landed on master and will be included in the upcoming 1.6.3 release. Thanks for all the work, and for the patience. Pedro Alves
Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Fri, 07 Aug 2020 11:24:04 GMT) Full text and rfc822 format available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.