Posts in Software Testing
The classic problems with scripted testing
On a recent project, we created (against my will) hundreds of scripted test cases that would be executed in the first cycle of testing. We spent months developing the test scripts and planning for their execution (getting all the files ready, setting them up in our test management tools, etc…). Here's what happened on the first day of testing:

Problem One: None of the test scripts were correct. By the time testing began, the expected results no longer matched the actual requirements. Given the number of test cases and the number of changing requirements, it was impossible to keep them up to date. I won't say all the time was wasted, because writing the scripts did get us looking at the requirements, and it did give us something resembling expected results, but it didn't do what it was intended to do: it didn't reduce the need for brain-engaged testing.

Problem Two: When executing the tests, the testers did not look for any errors other than those specified in the test case. For example, in one case a tester passed a test case. It was the first day of testing, so I opened the test case results to see what was happening in the system. I scanned the actual results looking for the expected value and found it. I then re-scanned the actual results looking for other information about the product, and I found some funny stuff.

When all was said and done, we logged six defects on this "passed" test case. The only way I can explain this is that the tester was not brain engaged. Because they had a script, they simply followed it and turned off their powers of critical thinking and problem solving. This was one example of many similar cases we found.

Problem Three: The final problem was a perception problem. Since we had scripted test cases, progress was measured by the number of test cases executed. I don't want to suggest that this information isn't valuable. It is. But I don't think it's the only information that's valuable - and that's how it was being used. The fact that the number of scripts executed was the driving metric added to the urgency many testers felt to pass test cases and move on to the next one as quickly as possible. It builds the mentality: "We can't spend time looking for bugs if we are measured by how many test cases we actually execute."
Unit Testing
Last weekend we held the March session of the Indianapolis Workshops on Software Testing. The attendees were:


  • Joshua Rafferty

  • Cheryl Westrick

  • Chad Campbell

  • Jason Halla

  • Allen Stoker

  • Dana Spears

  • Michael Kelly



The topic we focused on for the five-hour workshop was unit testing. It is important to note that half the attendees were developers and the other half testers, all working in the Indianapolis area. The following is a summary of presentations and ideas shared.

We started the workshop off with Allen Stoker providing a high-level overview of unit testing using a test framework (NUnit, JUnit, etc...) so everyone was on the same page. Nothing really eventful was discussed here, but several issues cropped up around unit test complexity and when unit testing should take place. We parked those issues and moved ahead.
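For anyone who hasn't seen one of these frameworks, here is a minimal sketch of the kind of thing the overview covered. QuantityParser is a made-up example, and the JUnit-style annotations are my choice for illustration, not necessarily what Allen showed:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    // A made-up class under test: parses a quantity string into an int.
    class QuantityParser {
        public int parse(String raw) {
            return Integer.parseInt(raw.trim());
        }
    }

    public class QuantityParserTest {

        @Test
        public void parsesAPlainNumber() {
            assertEquals(42, new QuantityParser().parse("42"));
        }

        @Test
        public void ignoresSurroundingWhitespace() {
            assertEquals(7, new QuantityParser().parse("  7  "));
        }

        // A negative case: the framework passes the test only if the
        // expected exception is actually thrown.
        @Test(expected = NumberFormatException.class)
        public void rejectsNonNumericInput() {
            new QuantityParser().parse("seven");
        }
    }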

Once everyone was working from the same understanding, we broke up into pairs (one developer and one tester) and actually did some unit testing. Each developer wrote some sort of widget code, and then the developers and testers took turns writing unit tests for the widget. The developers guided the testers through the process, creating both negative and positive test cases, identifying issues, and answering questions along the way. It was interesting to see what each developer considered a widget: the simplest belonged to Allen Stoker, whose component simply returned the same value passed into it, while the most complex went to Jason Halla, whose widget included a random number generator.
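The widget code itself isn't reproduced here, but a rough reconstruction gives the flavor of the exercise. EchoWidget and DiceWidget are invented stand-ins for the two widgets described above, and the seeded-generator trick is one common way, not necessarily the one used that day, to pin down the random widget:

    import java.util.Random;
    import org.junit.Test;
    import static org.junit.Assert.*;

    // Stand-in for the simplest widget: returns whatever value it is given.
    class EchoWidget {
        public Object echo(Object value) {
            return value;
        }
    }

    // Stand-in for the random-number widget: rolls a number from 1 to sides.
    class DiceWidget {
        private final Random random;

        public DiceWidget(Random random) {
            this.random = random;
        }

        public int roll(int sides) {
            return random.nextInt(sides) + 1;
        }
    }

    public class WidgetTests {

        @Test
        public void echoReturnsExactlyWhatWasPassedIn() {
            EchoWidget widget = new EchoWidget();
            assertSame("hello", widget.echo("hello"));
            assertNull(widget.echo(null)); // one of the "negative" cases
        }

        @Test
        public void rollAlwaysStaysInRange() {
            DiceWidget widget = new DiceWidget(new Random());
            for (int i = 0; i < 1000; i++) {
                int result = widget.roll(6);
                assertTrue("roll out of range: " + result, result >= 1 && result <= 6);
            }
        }

        @Test
        public void rollIsRepeatableWithASeededGenerator() {
            // Seeding the generator is one way to make "random" behavior testable.
            DiceWidget first = new DiceWidget(new Random(42L));
            DiceWidget second = new DiceWidget(new Random(42L));
            for (int i = 0; i < 100; i++) {
                assertEquals(first.roll(6), second.roll(6));
            }
        }
    }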

There were some issues identified after the exercise of writing the unit tests. Those issues were:


  • How do we bridge the gap between testers and developers and actually get this level of collaboration on real world software projects?

  • Developers suffer from the same oracle problem testers suffer from.

  • It is difficult for developers (even though they do it every day) to explain unit testing to testers. All three developers struggled with explaining their thought process and their tacit knowledge around unit testing.

  • Unit testing is done at such a low level that it is difficult to test real business functionality (unit tests test the code, not that the requirement is correct).

  • Why isn't there more unit testing done?



We did not come up with many answers. We instead parked those issues with the others and moved ahead with the presentations. Some of the experience reports addressed some of those issues.

Cheryl Westrick then shared her experience helping developers with unit testing at her company. She is currently involved in a project where the testers were given the code as it was developed and logged defects as they found them. The unit testing was manual, and the team identified and logged thousands of issues. Unfortunately, the testing still took place with an "over the wall" mentality: the testers would get builds, log defects, and then the process would repeat. No one worked side by side to resolve issues or establish continuous communication. Overall, Cheryl did not seem overly excited about the results of the effort, but during discussion it seemed that her team was able to gain insight into the strengths and weaknesses of the developers, something previously difficult to do. We also talked about how the early insights into the problem areas in the software would affect their system testing going forward. Would they concentrate testing on areas where defects had clustered during unit testing? Would they scrutinize certain developers' code more closely?

Next, Chad Campbell shared his experience testing private versus public methods while unit testing. Some believe in testing only publicly visible methods; others believe unit testing should cover all executed pieces of code. Testing private methods can be difficult if you are not an experienced developer, and Chad shared his experience using reflection in .NET to test them. Chad sparked some lively debate on the topic, with the Java developers curious about the specific details around his use of reflection. This took us into the very fascinating topic of unit test theory. Chad and I maintain that from a TDD standpoint, unit tests are there to simplify design and add confidence in your ability to refactor, so there is value in testing private methods. Allen and Jason sat on the other side of the fence, offering the view that unit testing at that level is really not valuable: all you are really interested in is the correctness of the functionality exposed by the public interface, and as long as that interface is thoroughly tested, you have all the coverage you need. We agreed to disagree, concluding that different development contexts will necessitate different unit testing methods (so really not much of a conclusion).
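Chad's demo was in .NET and isn't reproduced here, but for the Java folks in the room the equivalent idea with java.lang.reflect looks something like the following. PriceCalculator and its tax rate are invented for illustration:

    import java.lang.reflect.Method;
    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    // An invented class whose interesting logic hides in a private method.
    class PriceCalculator {
        public double total(double subtotal) {
            return subtotal + tax(subtotal);
        }

        private double tax(double subtotal) {
            return subtotal * 0.07; // a made-up tax rate
        }
    }

    public class PrivateMethodTest {

        @Test
        public void taxIsSevenPercent() throws Exception {
            PriceCalculator calculator = new PriceCalculator();

            // Look the private method up by name and signature, then make it callable.
            Method tax = PriceCalculator.class.getDeclaredMethod("tax", double.class);
            tax.setAccessible(true);

            double result = (Double) tax.invoke(calculator, 100.0);
            assertEquals(7.0, result, 0.0001);
        }
    }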

That debate gave us a good transition into Allen Stoker's experience report on effective unit testing. Allen maintains that validating a build should be a convergence of a couple of pieces of information. You want to see a successful test run, but that's not enough. A successful run means that all of your test cases passed, but you still have no real idea of what you tested. Allen made the case that this is typically where coverage analysis comes in.

The classic argument is that a detailed coverage report will give you a strong indication of test case quality by identifying what code was exercised. Unfortunately, it will not tell you what was tested. The term "tested" implies far more than the execution of code. A proper test case requires that the results of the code execution are known ahead of time, and that the scenario actually verifies those results. A good test case will typically exercise a block of code multiple times, changing the conditions of the execution on each pass. From this, you can see that a good secondary indicator is the number of times each line of code is executed during the unit test run.
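As a rough illustration of what "exercising a block multiple times under changing conditions" means, here is a sketch using an invented DiscountRule; the class and the numbers are mine, not Allen's:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    // An invented rule: 10% off orders of 100 or more, otherwise no discount.
    class DiscountRule {
        public double apply(double orderTotal) {
            return orderTotal >= 100.0 ? orderTotal * 0.9 : orderTotal;
        }
    }

    public class DiscountRuleTest {

        @Test
        public void exercisesTheSameCodeUnderSeveralConditions() {
            DiscountRule rule = new DiscountRule();
            // Each row drives the same lines of code under a different condition:
            // just below the boundary, on the boundary, and well above it.
            double[][] cases = {
                { 99.99, 99.99 },  // below the threshold: no discount
                { 100.0, 90.0 },   // on the threshold: discount applies
                { 250.0, 225.0 }   // above the threshold: discount applies
            };
            for (double[] c : cases) {
                assertEquals(c[1], rule.apply(c[0]), 0.0001);
            }
        }
    }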


High coverage % + High line execution count = success ?


According to Allen, no.

The problem is that one can write an entire suite of test cases that exercises the application at a high coverage level, and even executes the code multiple times (meeting the noted formula), yet doesn't really verify anything (a sketch of such a hollow test follows the list below). Building effective test cases takes a significant amount of time and thought to ensure that the right things are being tested. In the end, according to Allen, there is only one real solution, and that is human intervention. Allen recommended each use case include the following quality controls:


  • Detailed design reviews before coding (if the design is bad, it may be plagued with issues forever).

  • Initial code reviews (early identification of problem areas and bad practices)

  • Final code reviews including:

    • Full source code of component

    • Full JUnit source code of component

    • Coverage analysis results
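As promised above, here is a sketch of the kind of hollow test Allen warned about. It reuses the invented DiscountRule from the earlier sketch, drives every line of it many times, and still verifies nothing:

    import org.junit.Test;

    // Reuses the invented DiscountRule from the earlier sketch.
    public class DiscountRuleSmokeTest {

        @Test
        public void looksBusyButProvesNothing() {
            DiscountRule rule = new DiscountRule();
            // Every line of DiscountRule is executed, many times over...
            for (double total = 0; total <= 500; total += 50) {
                rule.apply(total);
            }
            // ...but there is not a single assertion, so the test can only fail
            // if the code throws. Coverage and execution counts look great anyway.
        }
    }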




Such a review process may require dedicating a senior team member to this activity alone for the entire project. It's a significant investment. Unit testing is no different from any other coding effort: it requires learning about the API and learning about effective testing. In addition, Allen advocated developers having their unit tests reviewed by an experienced tester as a mentoring process. There are probably many ways you could automate analysis of the unit test cases, but the only way to truly know that you have an effective test suite is to review the test cases manually.

Our discussion of Allen's experience focused mostly on the final recommendation of having unit tests reviewed by an experienced tester as a mentoring process. This generated discussion between the testers and developers at the workshop about the different sets of knowledge we own and cultivate and how that knowledge can best be leveraged. We tried to imagine how to effectively implement such an environment. Code quite often serves as a wall between tester and developer collaboration. As a group, we felt that testers would almost need to have a developer background to get involved in the code at this level.

We also identified a second wall, one created by developers. Typically, if you go to a tester and engage them in this type of activity ("Hey, can you help me with my unit testing? I don't know if I'm testing the right stuff..."), the tester will be excited to be engaged. On the other hand, if you attempt to engage a developer, our shared experience has been that the developer will respond, "Just let me code and leave me alone...." We talked about how to tear down this wall and decided that it could only be done on a one-on-one basis. Individual testers and developers will need to decide to collaborate and to learn from one another. We see very little that corporate or project leadership can do to facilitate this behavior.

We then looked at a problem-solving opportunity presented by Dana Spears. Dana asked the group how an organization might develop standards and metrics for unit testing. Was there a process flexible enough to be implemented at the organizational level to address effective unit testing? The short answer, sadly, was no. It was the conclusion of the workshop attendees that unit testing, like any testing activity, tends to be qualitative rather than quantitative. We generated a lot of discussion on the topic, but we could only come up with more questions than answers.

Finally, we finished with a tool demo. We looked at component testing with Rational Application Developer. This tool is an excellent example of the types of tools needed to get developers and testers working together. Using this tool, developers have the ability to create complex component-test scenarios, select sets and ranges of data for input, and have ready access to the static metrics for coverage analysis. In addition, testers can easily provide input in the form of data selection techniques like boundary-value analysis, domain partitioning, and selecting data sets and ranges for optimal test case coverage.
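We didn't capture the tool itself here, but the division of labor it encourages can be sketched in plain JUnit. FareClassifier is an invented component, and the data table is the kind of thing a tester might contribute from boundary-value analysis and domain partitioning:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    // An invented component: classifies an age into a fare category.
    class FareClassifier {
        public String classify(int age) {
            if (age < 0) {
                throw new IllegalArgumentException("age cannot be negative");
            }
            if (age < 13) {
                return "child";
            }
            if (age < 65) {
                return "adult";
            }
            return "senior";
        }
    }

    public class FareClassifierComponentTest {

        // The developer owns the test logic; the tester owns and extends this table.
        // The rows come from boundary-value analysis of the valid partitions.
        private static final Object[][] DATA = {
            { 0, "child" },     // lowest valid value
            { 12, "child" },    // just below the child/adult boundary
            { 13, "adult" },    // on the boundary
            { 64, "adult" },    // just below the adult/senior boundary
            { 65, "senior" },   // on the boundary
            { 120, "senior" }   // an agreed upper bound for the domain
        };

        @Test
        public void classifiesEveryBoundaryCaseCorrectly() {
            FareClassifier classifier = new FareClassifier();
            for (Object[] row : DATA) {
                assertEquals((String) row[1], classifier.classify((Integer) row[0]));
            }
        }
    }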

We find tools like this one valuable in that developers and testers don't necessarily have to work side by side (although there wouldn't be anything wrong with doing that). Instead, we envision a scenario where developers build component tests on their own and then work with testers to define the test data to be used in the tests. Data can be entered directly by the tester, thus extending the component test, or the tester can send comments on data selection to the developer and the developer can enter the data.

Working this way can result in the following advantages:


  1. Increased appreciation and understanding of skill sets

  2. Cross-training and knowledge dissemination

  3. Better and more meaningful component tests

  4. Possible reuse of component test data for functional testing downstream in the project lifecycle

  5. Better coordination of all testing efforts based on coverage statistics and other static metrics

  6. Better software



Next month's topic is testing in an Agile environment. We already have a couple of experience reports lined up from the good folks at CTI Group, and I am hoping we can capture some good lessons learned on the topic and identify some of the issues around shifting the testing culture here in the Indianapolis community. If you would like to participate in an IWST workshop, please feel free to contact me.