About a month ago I wrote that I thought the two most important things tomorrow’s testing tools need to help with are infrastructure and maintenance. I’m about a month late on my promised follow-up, but I suppose it’s better late than never.
What I wrote was:
By infrastructure, I mean challenges like setting up the test environment, keeping it in a consistent state, ensuring you have all the tools/browsers/databases/whatever always available for automation, running your tests in a timely manner, etc. All of these are much more challenging than anything JUnit faced, simply because JUnit concerned itself only with in-memory state and fixtures that could be recreated in milliseconds, not minutes or hours.
As applications become more complex, the work required to automate functional testing also becomes more complex – often more complex than the application itself. For example: it’s trivial these days to do Web 2.0 “mash-ups” in your app and embed a Google Map into your page (it’s a simple <script> block).
But what about testing that functionality? What about checking its performance? For starters, it requires a real, compatible browser (no more HttpUnit or HtmlUnit; now you need something like Watir or Selenium). And what if you want to test how your app behaves when the Google Maps service is down or running slowly? How do you do that?
The fact is that as we continue to build layer upon layer of abstraction and extensible services, the speed at which we can develop rich, complex applications grows, while testing those applications becomes a much harder challenge.
Next gen tools must help
This is why I believe that the most fundamental thing that any testing tool (or process) will do in the coming years is to help manage these complexities. It’s not an easy problem to solve, but fortunately some ancillary technologies such as virtualization help immensely.
The testing tools of tomorrow can no longer simply assume that automation of the test case (click there, type this, click here, type that, …) is enough. We are in an age where that type of technology is a commodity (Selenium, Watir, etc). They must take developers to the next level of automation and help automate the process of preparing to run these scripts.
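That commodity layer looks roughly like this today, using the Selenium RC Java client (the server location, page, locators, and values below are all invented for illustration):

```java
import com.thoughtworks.selenium.DefaultSelenium;
import com.thoughtworks.selenium.Selenium;

public class LoginScript {
    public static void main(String[] args) {
        // assumes a Selenium RC server is already running on localhost:4444
        Selenium selenium = new DefaultSelenium("localhost", 4444, "*firefox",
                "http://localhost:8080/");
        selenium.start();
        selenium.open("/login");              // click here...
        selenium.type("username", "bob");     // ...type this...
        selenium.click("loginButton");        // ...click there
        selenium.waitForPageToLoad("30000");  // wait up to 30 seconds
        selenium.stop();
    }
}
```

Driving the browser is the easy part; nothing in that script provisions the server, the database, or the browser it runs against.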
Many QA teams spend a majority of their time not writing automated tests or performing required manual testing, but in fact setting up the application in a QA/testing environment. This is a huge waste of time and is something that is begging to be automated.
Continuous integration is key
Let’s pretend tomorrow’s tools do this. They now automatically set up your database to a known state. They help you manage snapshots of data that can be used for various automated tests. They reset the state as needed when running automated test cases. They deploy your application properly. They set up a web browser with the cache cleared out. They do it all.
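None of this exists as a single tool today, but to make it concrete, here is a purely hypothetical sketch of what such a tool’s API might look like (every type and method below is invented):

```java
// Entirely hypothetical: none of these types exist today. This is just one
// way the "does it all" tool described above might present itself.
interface Browser {
    void open(String url);
    void close();
}

interface TestEnvironment {
    void restoreSnapshot(String name);        // reset data to a known state
    void deployApplication(String warPath);   // deploy the app properly
    Browser newBrowser();                     // fresh profile, empty cache
}

class ExampleFunctionalTestRun {
    void run(TestEnvironment env) {
        env.restoreSnapshot("baseline-data");      // known database state
        env.deployApplication("target/myapp.war"); // fresh deployment
        Browser browser = env.newBrowser();        // no stale cache
        try {
            browser.open("/login");
            // ... drive the actual functional test here ...
        } finally {
            browser.close();
        }
    }
}
```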
Suddenly there is a huge incentive for getting developers to take part in the process of writing functional tests in addition to the unit tests they write. That’s because they are continually reminded of those tests: every time they check in code, they are notified if one of those tests fails.
Contrast this to today, where most teams barely even write functional tests, and those that do usually relegate that job to the QA team to run near the end of the release cycle. The difference is like night and day.
Some are doing it today…
The good news is that some people are already doing this. Projects like HSQL (an in-memory database) and Cargo (an automatic J2EE deployment tool) help considerably. I recently blogged about how Atlassian is using these tools to get close to tomorrow’s development and testing process today.
HSQL and other in-memory databases help here because they allow for your functional tests to have clean fixtures. That is: they provide a simple and quick way to reset the data in your application so you can run the next functional test. And because they are in-memory, you can reset the state for every test, ensuring you get zero conflicts between test X and test Y.
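As a minimal sketch of that pattern (JUnit 3 style, with an invented users table), each test can spin up and tear down its own private in-memory database:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import junit.framework.TestCase;

public class UserFixtureTest extends TestCase {
    private Connection conn;

    protected void setUp() throws Exception {
        // each test gets a brand new, private in-memory database
        Class.forName("org.hsqldb.jdbcDriver");
        conn = DriverManager.getConnection("jdbc:hsqldb:mem:testdb", "sa", "");
        Statement stmt = conn.createStatement();
        stmt.execute("CREATE TABLE users (id INT PRIMARY KEY, username VARCHAR(50))");
        stmt.execute("INSERT INTO users VALUES (1, 'bob')");
        stmt.close();
    }

    protected void tearDown() throws Exception {
        // SHUTDOWN discards the in-memory database entirely, so the next
        // test starts from a clean slate: no conflicts between test X and Y
        Statement stmt = conn.createStatement();
        stmt.execute("SHUTDOWN");
        stmt.close();
        conn.close();
    }

    public void testBobExists() throws Exception {
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT username FROM users WHERE id = 1");
        assertTrue(rs.next());
        assertEquals("bob", rs.getString(1));
        rs.close();
        stmt.close();
    }
}
```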
Cargo is another great tool because it does all the heavy lifting required to deploy a J2EE application to a specific app server (Tomcat, Resin, JBoss, etc) and even allows you to programmatically configure it, such as defining database connection pools. In fact, Cargo recently began to add support for deployment and configuration of databases as well, so you could use it to set up your entire application deployment just before you run a test.
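For the curious, Cargo’s programmatic API looks roughly like the sketch below, adapted from its documented usage; the container ID, Tomcat home, and WAR path are placeholders, and the exact signatures may differ between Cargo versions:

```java
import org.codehaus.cargo.container.ContainerType;
import org.codehaus.cargo.container.InstalledLocalContainer;
import org.codehaus.cargo.container.configuration.ConfigurationType;
import org.codehaus.cargo.container.configuration.LocalConfiguration;
import org.codehaus.cargo.container.deployable.WAR;
import org.codehaus.cargo.generic.DefaultContainerFactory;
import org.codehaus.cargo.generic.configuration.DefaultConfigurationFactory;

public class DeployForTests {
    public static void main(String[] args) {
        // create a standalone configuration for the target container...
        LocalConfiguration configuration = (LocalConfiguration)
                new DefaultConfigurationFactory().createConfiguration(
                        "tomcat5x", ContainerType.INSTALLED, ConfigurationType.STANDALONE);

        // ...tell it what to deploy...
        configuration.addDeployable(new WAR("target/myapp.war"));

        // ...then point Cargo at an installed Tomcat and start it up
        InstalledLocalContainer container = (InstalledLocalContainer)
                new DefaultContainerFactory().createContainer(
                        "tomcat5x", ContainerType.INSTALLED, configuration);
        container.setHome("/opt/apache-tomcat-5.5");
        container.start();

        // ... run functional tests against the live server here ...

        container.stop();
    }
}
```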
… but soon they will need more
While HSQL and Cargo are nice, they still are merely emulations of the real thing. That is, you likely won’t use HSQL in production and you likely won’t deploy on the same configuration of Tomcat that Cargo provides. Just as jWebUnit served as an acceptable browser emulator until AJAX arrived, these projects too do their job well for the time being.
However, as deployments become more complicated, we will again need to test on the real thing. Just as Selenium did for jWebUnit, tomorrow’s testing tools will need to offer the real-environment counterparts of emulators like HSQL and Cargo.
For example, just last week I had first-hand experience with this challenge. Our functional tests use HSQL, but our production environment runs on Postgres. I had checked in some new code whose query passed just fine in our continuous integration process, but when it came time to push to staging, it broke.
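I won’t reproduce the actual query, but here is a hypothetical illustration of the kind of dialect drift that can bite you: HSQL accepts the non-standard TOP syntax, while Postgres insists on LIMIT.

```java
public class DialectDrift {
    // Hypothetical stand-in for the real query. HSQL runs the non-standard
    // TOP syntax without complaint, so continuous integration stays green...
    static final String HSQL_ONLY =
            "SELECT TOP 10 * FROM events ORDER BY created DESC";

    // ...but Postgres rejects TOP, so the same code breaks in staging.
    // The portable form uses LIMIT instead:
    static final String PORTABLE =
            "SELECT * FROM events ORDER BY created DESC LIMIT 10";
}
```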
The truth is that in a perfect world, we should be testing on the same environment as our production one, and things like HSQL just don’t cut the mustard. The reason we use them, however, is pure pragmatism: it costs too much and takes too long to develop a testing environment/functional testing fixture that works with a clone of your production environment.
Rising infrastructure costs
In fact, even those that are doing this today run into challenges: Atlassian’s full integration test suite takes hours to run. Some teams have reported to me that their suites take over a day to complete. And this is all while using shortcuts in their fixtures such as HSQL.
This is where parallelization comes in. The reason these test suites take so long is that they almost always run in sequence: test X runs, then test Y runs, and so on. To save time, some teams arrange for the heavy lifting in their fixtures (deploying the application, setting up the DB, etc) to happen only once at the start of the suite rather than before each test.
This almost always leads to tests that conflict with one another (i.e. test X and test Y both try to create a user with the unique username “bob”). On top of that, it also begins to introduce dependencies between tests and a preference towards test ordering.
That’s because, after a while, developers begin to take advantage of this sequential order and take even more nefarious shortcuts. No longer does test Y need to create a new “Widget ABC”, because test X already did. Of course, if test X ever changes, test Y suffers for it.
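Here is a self-contained sketch of that first failure mode; the UserDao below is a hypothetical stand-in for the app’s persistence layer, with a shared static set playing the role of the shared database:

```java
import java.util.HashSet;
import java.util.Set;

import junit.framework.TestCase;

public class SharedFixtureConflictTest extends TestCase {
    // Hypothetical stand-in for the persistence layer; in a real suite both
    // tests would hit the one shared database, where username is unique.
    static class UserDao {
        private static final Set usernames = new HashSet();  // shared "database"
        void create(String username) {
            if (!usernames.add(username)) {
                throw new IllegalStateException("duplicate username: " + username);
            }
        }
    }

    private final UserDao userDao = new UserDao();

    public void testXCreatesBob() {
        userDao.create("bob");   // passes: "bob" doesn't exist yet
    }

    public void testYAlsoCreatesBob() {
        userDao.create("bob");   // fails when run after testXCreatesBob
    }
}
```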
So why do even the best teams do this? Because there really isn’t another option today: the costs, in both hardware/infrastructure and developer time, are orders of magnitude larger.
If you currently have 500 functional tests and each takes, on average, 10 seconds, plus a one-time 5-minute fixture setup, then you can run all your tests in about 90 minutes. But if you reset your fixture for each test, it would take 43 hours. Pragmatism usually takes hold right when people realize this.
Even at the much smaller 90 minutes, it’s often too long to fit into a continuous integration process that runs on every commit. The solution is parallelization. Now imagine those 500 tests running in parallel. The entire suite could run in 5 minutes and 10 seconds, a short enough time that you could introduce a complete, reliable, real functional test suite that runs as often as you need.
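(For the curious, the arithmetic: one shared fixture run sequentially is 5 min + 500 × 10 s, about 88 minutes; a per-test fixture reset run sequentially is 500 × (5 min + 10 s) = 155,000 s, about 43 hours; and 500-way parallel with per-test resets costs just one fixture setup plus one test: 5 min + 10 s.)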
The limitation today is hardware cost: 500 build machines would cost a team $1,000,000, plus several times that again to maintain them, power them, and so on. It’s just not possible… today.
The next gen testing tool isn’t a tool at all
Assuming that a “next gen testing tool” must be something that radically improves the software development process, it is my belief that tomorrow’s next gen testing tool will in fact not just be a tool, but an entire infrastructure, most likely offered as a service that you can rent on an as-needed basis.
It would let you run your 500 functional tests in as many as 500 parallel tasks, or as few as one. It would let you ramp the parallelism up or down as your needs and budget evolve. And it would finally allow you to run all your tests after every commit, giving you the maximum level of confidence in each change.
… and it might even be called HostedQA