GNU bug report logs - #51941
Detect duplication of ERT tests

Previous Next

Package: emacs;

Reported by: Mattias Engdegård <mattiase <at> acm.org>

Date: Thu, 18 Nov 2021 10:17:01 UTC

Severity: normal

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 51941 in the body.
You can then email your comments to 51941 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 10:17:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Mattias Engdegård <mattiase <at> acm.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 18 Nov 2021 10:17:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: bug-gnu-emacs <at> gnu.org
Subject: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 11:16:09 +0100
[Message part 1 (text/plain, inline)]
Duplicated `ert-deftest` forms (ie, two or more definitions with the same test name) is a serious problem: the older test is silently ignored, which has the consequence of the test covering less than intended.

A quick experiment in the Emacs tree immediately reveals 51 instances. There's no telling how many lie waiting in external code.

Having explored various options I've come to the conclusion that it needs to be a hard error, at run-time, when running non-interactively (see attached diff). Rationale:

* We can't just warn at run time because tests tend to spam a lot when run and users only care about whether the tests passed and won't even look at the logs otherwise.
* We can't just warn at compile time because many tests aren't compiled at all, and where they are (as in the Emacs tree) nobody cares much about the warnings, because they are "just tests".
* We can't issue a complaint when running interactively because it's natural to keep redefining tests during development.

Since `ert-deftest` forms are often generated as a result of macro-expansion, passive static textual linting won't do.

The effect of this change will be a visible and easily remedied failure in broken test code. I volunteer to fix the first-order errors in Emacs (by renaming the clashes); domain specialists may need to help fixing secondary failures (failures in previously ignored tests).

[ert-deftest-redefine-error.diff (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 10:36:02 GMT) Full text and rfc822 format available.

Message #8 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 51941 <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 11:35:44 +0100
Mattias Engdegård <mattiase <at> acm.org> writes:

> Having explored various options I've come to the conclusion that it
> needs to be a hard error, at run-time, when running non-interactively
> (see attached diff). Rationale:

Makes sense to me.

> The effect of this change will be a visible and easily remedied
> failure in broken test code. I volunteer to fix the first-order errors
> in Emacs (by renaming the clashes); domain specialists may need to
> help fixing secondary failures (failures in previously ignored tests).

Sounds good.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Reply sent to Mattias Engdegård <mattiase <at> acm.org>:
You have taken responsibility. (Thu, 18 Nov 2021 11:55:01 GMT) Full text and rfc822 format available.

Notification sent to Mattias Engdegård <mattiase <at> acm.org>:
bug acknowledged by developer. (Thu, 18 Nov 2021 11:55:01 GMT) Full text and rfc822 format available.

Message #13 received at 51941-done <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 51941-done <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 12:54:18 +0100
18 nov. 2021 kl. 11.35 skrev Lars Ingebrigtsen <larsi <at> gnus.org>:

> Sounds good.

Thank you, all now done. The tests seem to pass after clashes were resolved but do tell if I missed something.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 11:59:02 GMT) Full text and rfc822 format available.

Message #16 received at 51941-done <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 51941-done <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 12:58:28 +0100
Mattias Engdegård <mattiase <at> acm.org> writes:

> Thank you, all now done. The tests seem to pass after clashes were
> resolved but do tell if I missed something.

All the tests pass for me, too.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 17:45:01 GMT) Full text and rfc822 format available.

Message #19 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 51941 <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 18:44:15 +0100
By the way, this reminds me of a different error I often make in ert
files -- I type "defun" instead of "ert-deftest".  Would it be
productive to check whether a -tests.el file contains a defun that's
never called?  In which case it's probably supposed to be en
ert-deftest.

But perhaps it's such a marginal problem that it's not worth checking
for.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 17:54:02 GMT) Full text and rfc822 format available.

Message #22 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Mattias Engdegård <mattiase <at> acm.org>,
 51941 <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 18:53:10 +0100
Am Do., 18. Nov. 2021 um 18:45 Uhr schrieb Lars Ingebrigtsen <larsi <at> gnus.org>:
>
> By the way, this reminds me of a different error I often make in ert
> files -- I type "defun" instead of "ert-deftest".  Would it be
> productive to check whether a -tests.el file contains a defun that's
> never called?  In which case it's probably supposed to be en
> ert-deftest.
>
> But perhaps it's such a marginal problem that it's not worth checking
> for.

This also happens to me rather frequently, so I'd welcome such an
addition. The byte compiler can already generate call graphs (see
byte-compile-generate-call-tree), so maybe that functionality could be
reused.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 19:15:02 GMT) Full text and rfc822 format available.

Message #25 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 51941 <at> debbugs.gnu.org, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 20:13:59 +0100
18 nov. 2021 kl. 18.44 skrev Lars Ingebrigtsen <larsi <at> gnus.org>:

> By the way, this reminds me of a different error I often make in ert
> files -- I type "defun" instead of "ert-deftest".  Would it be
> productive to check whether a -tests.el file contains a defun that's
> never called?

It would certainly be good to detect such mistakes, but I'm not sure how to go about finding them. I wrote a quick-and-dirty scanner that found about 400 uncalled functions in the test tree of which 1 or 2, maybe, should have been `ert-deftest` (will fix those right away).

Clearly something with much better signal/noise ratio is called for. One would be to only consider functions using `should` etc.

> But perhaps it's such a marginal problem that it's not worth checking
> for.

It's unclear how frequent this mistake is but at least now we have concrete evidence of its existence. On the other hand, it will be found by a test author who has adopted the standard "test first without then with the fix" ritual as habit.

(I'm not accusing you of bad discipline; I'm probably no better myself, and in any case berating people for being careless is not how to solve problems caused by human mistakes.)

I may or may not have the time to try to find something better so feel free to take a shot if you like.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 19:35:02 GMT) Full text and rfc822 format available.

Message #28 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 51941 <at> debbugs.gnu.org, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 20:34:44 +0100
Mattias Engdegård <mattiase <at> acm.org> writes:

> It's unclear how frequent this mistake is but at least now we have
> concrete evidence of its existence. On the other hand, it will be
> found by a test author who has adopted the standard "test first
> without then with the fix" ritual as habit.
>
> (I'm not accusing you of bad discipline; I'm probably no better
> myself, and in any case berating people for being careless is not how
> to solve problems caused by human mistakes.)

I usually eval the `should' statements by hand in the running Emacs when
developing stuff, so I frequently don't notice that they don't actually
run when saying "make check".

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 19:44:02 GMT) Full text and rfc822 format available.

Message #31 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 51941 <at> debbugs.gnu.org, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 20:43:22 +0100
18 nov. 2021 kl. 20.34 skrev Lars Ingebrigtsen <larsi <at> gnus.org>:

> I usually eval the `should' statements by hand in the running Emacs when
> developing stuff, so I frequently don't notice that they don't actually
> run when saying "make check".

Yes, that's a little dangerous because there are so many things that can go wrong. On the other hand, I often just run the added or changed test interactively with `ert` instead of the whole file in batch mode. (And I have many times fallen into the .elc vs .el trap of TEST_LOAD_EL!)

We should have a modern test runner in Emacs that runs tests in batch mode but displays the results interactively, points out exact failure points etc.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Thu, 18 Nov 2021 22:52:02 GMT) Full text and rfc822 format available.

Message #34 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Glenn Morris <rgm <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 51941 <at> debbugs.gnu.org, Lars Ingebrigtsen <larsi <at> gnus.org>,
 Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Thu, 18 Nov 2021 17:51:07 -0500
Mattias Engdegård wrote:

> It would certainly be good to detect such mistakes, but I'm not sure
> how to go about finding them. I wrote a quick-and-dirty scanner that
> found about 400 uncalled functions in the test tree of which 1 or 2,
> maybe, should have been `ert-deftest` (will fix those right away).
>
> Clearly something with much better signal/noise ratio is called for.
> One would be to only consider functions using `should` etc.

Seems to me that making "should" only be defined within tests would have
flagged these (https://debbugs.gnu.org/28385).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Fri, 19 Nov 2021 05:29:02 GMT) Full text and rfc822 format available.

Message #37 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Glenn Morris <rgm <at> gnu.org>
Cc: Mattias Engdegård <mattiase <at> acm.org>,
 Philipp <p.stephani2 <at> gmail.com>, 51941 <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Fri, 19 Nov 2021 06:28:23 +0100
Glenn Morris <rgm <at> gnu.org> writes:

> Seems to me that making "should" only be defined within tests would have
> flagged these (https://debbugs.gnu.org/28385).

But like I said there, using `should' outside tests is really convenient
when developing.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Fri, 19 Nov 2021 09:34:02 GMT) Full text and rfc822 format available.

Message #40 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Glenn Morris <rgm <at> gnu.org>, Philipp <p.stephani2 <at> gmail.com>,
 51941 <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Fri, 19 Nov 2021 10:33:42 +0100
19 nov. 2021 kl. 06.28 skrev Lars Ingebrigtsen <larsi <at> gnus.org>:

>> Seems to me that making "should" only be defined within tests would have
>> flagged these (https://debbugs.gnu.org/28385).
> 
> But like I said there, using `should' outside tests is really convenient
> when developing.

You are both right -- apart from the development convenience, functions form an important abstraction mechanism in Lisp and they can be valuable for decomposing test code (although I don't think I've ever done so myself). On the other hand, disallowing them outside `ert-deftest` might help eliminate the mistakes that we are talking about.

A quick scan finds 365 occurrences of `should`, `should-not` and `should-error` inside defuns in the test/ subtree.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Sat, 20 Nov 2021 08:39:02 GMT) Full text and rfc822 format available.

Message #43 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Glenn Morris <rgm <at> gnu.org>, Philipp <p.stephani2 <at> gmail.com>,
 51941 <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Sat, 20 Nov 2021 09:38:31 +0100
Mattias Engdegård <mattiase <at> acm.org> writes:

> A quick scan finds 365 occurrences of `should`, `should-not` and
> `should-error` inside defuns in the test/ subtree.

Most of those are probably called from an ert-deftest, I hope.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Sat, 20 Nov 2021 08:50:02 GMT) Full text and rfc822 format available.

Message #46 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Glenn Morris <rgm <at> gnu.org>, Philipp <p.stephani2 <at> gmail.com>,
 51941 <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Sat, 20 Nov 2021 09:49:16 +0100
20 nov. 2021 kl. 09.38 skrev Lars Ingebrigtsen <larsi <at> gnus.org>:

>> A quick scan finds 365 occurrences of `should`, `should-not` and
>> `should-error` inside defuns in the test/ subtree.
> 
> Most of those are probably called from an ert-deftest, I hope.

I didn't do any call-tree analysis but some tests use ERT in rather unorthodox ways.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Wed, 24 Nov 2021 07:30:02 GMT) Full text and rfc822 format available.

Message #49 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> gmail.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Mattias Engdegård <mattiase <at> acm.org>,
 51941 <at> debbugs.gnu.org
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Wed, 24 Nov 2021 15:30:38 +0800
Lars Ingebrigtsen <larsi <at> gnus.org> writes:

>> The effect of this change will be a visible and easily remedied
>> failure in broken test code. I volunteer to fix the first-order errors
>> in Emacs (by renaming the clashes); domain specialists may need to
>> help fixing secondary failures (failures in previously ignored tests).
>
> Sounds good.

This change breaks test code when a test file is "required" multiple
times. In particular, Org mode tests cannot run with latest Emacs.

Best,
Ihor




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Wed, 24 Nov 2021 08:53:02 GMT) Full text and rfc822 format available.

Message #52 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Ihor Radchenko <yantar92 <at> gmail.com>
Cc: 51941 <at> debbugs.gnu.org, Lars Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Wed, 24 Nov 2021 09:52:39 +0100
24 nov. 2021 kl. 08.30 skrev Ihor Radchenko <yantar92 <at> gmail.com>:

> This change breaks test code when a test file is "required" multiple
> times. In particular, Org mode tests cannot run with latest Emacs.

We'll have to do something about that then. Would you tell me more precisely what those Org tests do and why?
('require' is supposed to be idempotent; maybe you are missing a 'provide' somewhere?)





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Wed, 24 Nov 2021 09:28:02 GMT) Full text and rfc822 format available.

Message #55 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> gmail.com>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 51941 <at> debbugs.gnu.org, Lars Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Wed, 24 Nov 2021 17:29:02 +0800
Mattias Engdegård <mattiase <at> acm.org> writes:

> 24 nov. 2021 kl. 08.30 skrev Ihor Radchenko <yantar92 <at> gmail.com>:
>
>> This change breaks test code when a test file is "required" multiple
>> times. In particular, Org mode tests cannot run with latest Emacs.
>
> We'll have to do something about that then. Would you tell me more precisely what those Org tests do and why?
> ('require' is supposed to be idempotent; maybe you are missing a 'provide' somewhere?)

In Org 9.5, we have test-oc.el. That file requires a macro defined in
test-ox.el:

In test-oc.el:
;; We need `org-test-with-parsed-data' macro.
(require 'test-ox "../testing/lisp/test-ox.el")

The load sequence is:
1. load-file test-oc
   1.1. test-oc requires test-ox
2. load file test-ox triggers error:
   Debugger entered--Lisp error: (error "Test ‘test-org-export/bind-keyword’ redefined")

Hope it helps.

Best,
Ihor




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Wed, 24 Nov 2021 10:01:01 GMT) Full text and rfc822 format available.

Message #58 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Ihor Radchenko <yantar92 <at> gmail.com>
Cc: 51941 <at> debbugs.gnu.org, Lars Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Wed, 24 Nov 2021 11:00:30 +0100
24 nov. 2021 kl. 10.29 skrev Ihor Radchenko <yantar92 <at> gmail.com>:

> 1. load-file test-oc
>   1.1. test-oc requires test-ox
> 2. load file test-ox triggers error:
>   Debugger entered--Lisp error: (error "Test ‘test-org-export/bind-keyword’ redefined")

Thank you, that was very clear.

Best practice is to run each ERT test file in a separate Emacs process. This ensures isolation between the tests, which can be quite important, but also permits them to be run in parallel. Easiest is to do this from a Makefile but you could of course do it from a controlling Emacs process if you prefer that.

Now if you for some reason want to stick with running them all in a single process, which I would advice against, then at least avoid the load-file if the file has already been loaded. Maybe use `require` with a file argument instead of load-file.

As a last resort you could insert something like (ert-delete-all-tests) between the test runs, but then you would no longer get the benefit of the global ERT test duplication check that is the one advantage of running them all in the same process.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#51941; Package emacs. (Wed, 24 Nov 2021 12:16:02 GMT) Full text and rfc822 format available.

Message #61 received at 51941 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> gmail.com>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 51941 <at> debbugs.gnu.org, Lars Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#51941: Detect duplication of ERT tests
Date: Wed, 24 Nov 2021 20:16:10 +0800
Mattias Engdegård <mattiase <at> acm.org> writes:

> Best practice is to run each ERT test file in a separate Emacs process. This ensures isolation between the tests, which can be quite important, but also permits them to be run in parallel. Easiest is to do this from a Makefile but you could of course do it from a controlling Emacs process if you prefer that.

Makes sense, though it is not trivial to change the existing code.

> Now if you for some reason want to stick with running them all in a single process, which I would advice against, then at least avoid the load-file if the file has already been loaded. Maybe use `require` with a file argument instead of load-file.

Using require indeed solved the issue.

Best,
Ihor




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 22 Dec 2021 12:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 87 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.