How test frameworks work: Cleaning up after a failure, the birth of TearDown and SetUp
This article is part of the series Building a Mental Model of test frameworks.
This is part 3 of the creation of our mental model of the internals of test frameworks.
So far, we’ve seen that test frameworks can be seen as loops over functions, and that a test signals that something bad occurred by throwing an exception. To prevent the exception from stopping the loop, the loop wraps each test call in a try/catch block. Those are the basics.
In the second part, we looked at how frameworks allow us to test that an exception is thrown by the system under test. This is not as easy as it seems.
In this (long-awaited) third part, we will see why the tearDown method is far more important than the setUp method, even if we actually rarely use it.
Let’s refresh our memory and see what our framework currently looks like:
tests = [
test1() {},
test2() {},
test3() {},
test4() {
expectedException = MySuperException
throws MySuperException
}
];
failedTestExceptions = []
// Test loop
foreach(tests as test) {
expectedException = null
try {
test()
} catch(exception) {
if(exception != expectedException) {
failedTestExceptions[] = exception
}
continue;
}
// Did we get our expected exception?
if(expectedException) {
failedTestExceptions[] = NeverReceivedExpectedException(expectedException)
}
}
// Error display
if(count(failedTestExceptions) === 0) {
displayGreenBar()
}
else {
foreach(failedTestExceptions as exception) {
displayErrorFor(exception)
}
}
Why do we need a TearDown method?
Let’s see two examples to understand why a TearDown method is needed.
Example 1: The unclosed connection
Imagine that we have a test that requires a connection to a database.
See how we thought about everything: because we don’t want to keep connections open, we close the connection at the end of the test.
testWithConnection() {
connection = openConnection()
// Do stuff
assertSomething()
closeConnection()
}
Now, what happens when the test fails? This can be either because the assertion reports an error or because the code under test is just plain failing and throws an exception.
Because of the exception, the test never reaches the end of the function, and the connection is never closed. If multiple tests fail that way, you can exhaust your database connection pool, which leads to even more tests failing. That’s bad for test independence: the failure of one test can cause the failure of multiple others.
Apart from database connections, another example is open files. While that issue might not lead to the failure of multiple tests, it can make your test suite run slower.
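To make the pool exhaustion concrete, here is a minimal Python sketch. Everything in it is a hypothetical stand-in: FakePool, its size of 2, and the test names are invented for illustration, and the loop is a reduced version of the one from our framework.

```python
class FakePool:
    """Hypothetical stand-in for a database connection pool."""

    def __init__(self, size):
        self.size = size
        self.open_count = 0

    def open_connection(self):
        if self.open_count >= self.size:
            raise RuntimeError("connection pool exhausted")
        self.open_count += 1

    def close_connection(self):
        self.open_count -= 1


pool = FakePool(size=2)


def assert_equal(expected, actual):
    if expected != actual:
        raise AssertionError(f"expected {expected!r}, got {actual!r}")


def leaky_failing_test():
    pool.open_connection()
    assert_equal(3, 1 + 1)  # fails, so the next line never runs
    pool.close_connection()


def innocent_test():
    pool.open_connection()
    assert_equal(2, 1 + 1)
    pool.close_connection()


def run(tests):
    # Reduced test loop: call each test and collect the failures.
    failures = []
    for test in tests:
        try:
            test()
        except Exception as exception:
            failures.append((test.__name__, exception))
    return failures


failures = run([leaky_failing_test, leaky_failing_test, innocent_test])
for name, exception in failures:
    print(f"{name}: {type(exception).__name__}: {exception}")
```

The two leaky tests each leave a connection open, so innocent_test, which has nothing wrong with it, fails with "connection pool exhausted".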
Example 2: The interdependence via a shared resource
In this example, let’s talk about 2 tests.
The first test adds a burrito to the collection, does something with it, and, because it’s a nice citizen, cleans the mess at the end and removes the added burrito from the collection. Here, imagine that the collection is something we share across multiple tests, like a database.
test1() {
addBurritoToCollection('Swiss Burrito')
// Do something with the burrito and make an assertion
removeBurritoFromCollection('Swiss Burrito')
}
The second test also deals with burritos.
Maybe it’s a test about counting the number of burritos.
test2() {
addBurritoToCollection('French Burrito')
addBurritoToCollection('Mexican Burrito')
assertEqual(2, countOfBurritos())
removeBurritoFromCollection('French Burrito')
removeBurritoFromCollection('Mexican Burrito')
}
Both tests clean up after themselves.
Now, what if test1 starts failing for some reason? The Swiss Burrito stays in the collection because the test never reaches the last line where it was supposed to remove it.
Test2 starts, adds two more burritos, and makes its assertion. The collection now contains 3 burritos (Swiss, French, and Mexican) instead of the expected 2, so test2 fails too.
We clearly see how the failure of one test impacts another. Our tests aren’t completely independent. This might not seem like a big problem, but now imagine that you have more than 2 tests, and that you don’t run them that often.[1] You are in trouble: you need to find the cause of the issue, so you look in the code, possibly starting from a test that would still pass if the other test weren’t failing. You look around, and everything seems OK. You might not even have touched that part of the code. WTF?
In both examples, we’ve seen that even with good intentions and cleanup inside the test function, a failing test can still cause problems for other tests.
Introducing the tearDown method
The tearDown mechanism is here to solve that problem. It provides a way to act after the test, even in case of failure.
To do so, we need to register what we want to do after the test.
In our simplified test framework, let’s add a variable that can be overridden by every test. This is where the simplified test framework starts to diverge a lot from the ones you are probably used to. In your usual test framework, tearDown (and setUp) are shared by a group of tests. Adding that to the model would require creating some sort of collection of tests, which is not that interesting for the mental model that I want to share.
// Test loop
foreach(tests as test) {
+ tearDown = null
expectedException = null
try {
test()
} catch(exception) {
if(exception != expectedException) {
failedTestExceptions[] = exception
}
continue;
}
+ finally {
+ if(tearDown) {
+ tearDown()
+ }
+ }
// Did we get our expected exception?
if(expectedException) {
failedTestExceptions[] = NeverReceivedExpectedException(expectedException)
}
}
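For readers who want to run the idea, here is one possible Python translation of that loop. It is only a sketch and it takes a liberty: instead of plain variables, each test receives a hypothetical TestContext object on which it registers its tear_down and its expected_exception. The names are mine, not those of a real framework.

```python
class TestContext:
    """Per-test state that the pseudocode keeps in plain variables."""

    def __init__(self):
        self.tear_down = None
        self.expected_exception = None


class NeverReceivedExpectedException(Exception):
    pass


def run_tests(tests):
    failed_test_exceptions = []
    for test in tests:
        context = TestContext()
        try:
            test(context)
        except Exception as exception:
            expected = context.expected_exception
            if expected is None or not isinstance(exception, expected):
                failed_test_exceptions.append(exception)
        else:
            # No exception at all: did the test expect one?
            if context.expected_exception is not None:
                failed_test_exceptions.append(
                    NeverReceivedExpectedException(
                        context.expected_exception.__name__
                    )
                )
        finally:
            # Runs whether the test passed or failed.
            if context.tear_down is not None:
                context.tear_down()
    return failed_test_exceptions


log = []


def passing_test(context):
    context.tear_down = lambda: log.append("cleaned after passing_test")


def failing_test(context):
    context.tear_down = lambda: log.append("cleaned after failing_test")
    raise AssertionError("something went wrong")


failed = run_tests([passing_test, failing_test])
print(len(failed), log)
```

The finally clause is the whole trick: the cleanup registered by failing_test still runs, and the failure is still recorded.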
And we can rewrite our two examples to use the tearDown. For the connection:
testWithConnection() {
tearDown = () => {
closeConnection()
}
connection = openConnection()
// Do stuff
assertSomething()
}
And for the burrito collection:
test1() {
tearDown = () => {
removeBurritoFromCollection('Swiss Burrito')
}
addBurritoToCollection('Swiss Burrito')
// Do something with the burrito and make an assertion
}
test2() {
tearDown = () => {
removeBurritoFromCollection('French Burrito')
removeBurritoFromCollection('Mexican Burrito')
}
addBurritoToCollection('French Burrito')
addBurritoToCollection('Mexican Burrito')
assertEqual(2, countOfBurritos())
}
Now, in case of failure, the connection is properly closed, and the burritos added to the collection are removed. Thank you, tearDown. Our tests are way more independent than before.
My theory about tearDown and setUp methods
This is why the tearDown method is actually more important than the setUp one. The tearDown is needed, while the setUp is just a convenient way to factor out common test setup into one place and avoid repetition. Which is nice too, but not as crucial.
My theory is that the tearDown was discovered as something we needed for test independence, to make sure we properly clean up after each test, and that adding setUp was just a question of symmetry. After all, if I have a tearDown shared by all tests, why not have something that helps me set up all my tests? Also, if we clean up the connection after the test, wouldn’t it make sense to be able to create it before the test?[2]
Since I started this series of articles long ago, I have been sharing carousels on LinkedIn about tests, including one about what this series talks about. Someone commented that Kent Beck demonstrates how to create JUnit in TDD in his book TDD By Example. This is a fun example because it shows how to create a test framework in TDD when you don’t have a test framework yet, and how to create something using itself. Anyway, I had totally forgotten that Kent did that in the book, so I grabbed my copy to see what he was saying. With the steps he takes, the setUp method is introduced before the tearDown one. This contradicts my theory, but it might only be because Kent already knew which form the test framework would have in the end. And to lend some credit to my theory, the chapter is called "Cleaning up after" and its introduction is:
"Sometimes tests need to allocate external resources in setUp(). If we want the tests to remain independent, then a test that allocates external resources needs to release them before it is done, perhaps in a tearDown() method."
I find the formulation strange, because I don’t think that tests need to allocate external resources in setUp(), they could really do it inside the test, but for sure, they do need to release external resources no matter what.
So, here it is, the end of part 3. We’ve seen how and why the tearDown method came to be and how important it is. Always remember to clean up after your tests if you use something that can mess with other tests or that can cause issues in the system.
[1] Taking smaller steps would reduce the difficulty of dealing with that problem. You see that the tests are failing, you revert to the previous good state, and you start again.
[2] My examples are probably itching some of you. How can the closeConnection function have access to the connection variable that seems to be declared in the test? You are right, that code probably doesn’t work, but is good enough for building a mental model. Sorry, not sorry :)