The classic problems with scripted testing
On a recent project, we created (against my will) hundreds of scripted test cases that would be executed in the first cycle of testing. We spent months developing the test scripts and planning for their execution (getting all the files ready, setting them up in our test management tools, etc…). Here's what happened on the first day of testing:
Problem One: None of the test scripts were correct. The expected results did not match the actual requirements once the day of testing came around. It was impossible, given the number of test cases and the number of changing requirements, to keep them up to date. I won't say all the time was wasted, because it got us looking at the requirements and gave us something close to an expected result, but it didn't do what it was intended to do: it didn't reduce the need for brain-engaged testing.
Problem Two: When executing the tests, the testers did not look for any errors other than the ones specified in the test case. For example, in one case a tester passed a test case. It was the first day of testing, so I opened the test case results to see what was happening in the system. I scanned the actual results looking for the expected value, and I found it. I then re-scanned the actual results looking for other information about the product. I found some funny stuff.
All said and done, we logged six defects on this "passed" test case. The only way I can explain this is that the tester was not brain engaged. Because they had a script, all they did was follow it, turning off their powers of critical thinking and problem solving. This was one example of many similar cases we found.
Problem Three: The final problem was a perception problem. Since we had scripted test cases, progress was measured by the number of test cases executed. I don't want to suggest that this information isn't valuable. It is. But I don't think it's the only information that's valuable - and that's how it was being used. The fact that the number of scripts executed was the driving metric added to the urgency many testers felt to pass test cases and move on to the next one as quickly as possible. It builds the mentality: "We can't spend time looking for bugs if we are measured on how many test cases we execute."