Unit Testing
Last weekend we held the March session of the Indianapolis Workshops on Software Testing. The attendees were:


  • Joshua Rafferty

  • Cheryl Westrick

  • Chad Campbell

  • Jason Halla

  • Allen Stoker

  • Dana Spears

  • Michael Kelly



The topic we focused on for the five-hour workshop was unit testing. It is important to note that half the attendees were developers and the other half testers, all working in the Indianapolis area. The following is a summary of presentations and ideas shared.

We started the workshop off with Allen Stoker providing a high-level overview of unit testing using a test framework (NUnit, JUnit, etc.) so everyone was on the same page. Nothing particularly eventful came up here, but several issues did surface around unit test complexity and when unit testing should take place. We parked those issues and moved ahead.
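
For anyone who has not seen an xUnit-style framework in action, a minimal sketch of the kind of test Allen walked through might look like the following (JUnit 4 syntax; the Calculator class is invented for illustration):

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class CalculatorTest {

        // A positive case: a known input paired with a known expected output.
        @Test
        public void addsTwoNumbers() {
            Calculator calc = new Calculator();  // hypothetical class under test
            assertEquals(5, calc.add(2, 3));
        }

        // A negative case: the framework fails the test unless the expected
        // exception is actually thrown.
        @Test(expected = ArithmeticException.class)
        public void refusesToDivideByZero() {
            new Calculator().divide(10, 0);
        }
    }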

Once everyone was working from the same understanding, we broke into pairs (one developer and one tester) and actually did some unit testing. Each developer wrote some sort of widget code, and then the developers and testers took turns writing unit tests for the widget. The developers guided the testers through the process, creating both positive and negative test cases, identifying issues, and answering questions along the way. It was interesting to see what each developer considered a widget: the simplest belonged to Allen Stoker, whose component simply returned the value passed into it, and the most complex to Jason Halla, whose widget included a random number generator.
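
To give a flavor of the exercise, here is a hedged reconstruction of the simplest widget and the kind of tests the pairs wrote against it (all names invented; this is the shape of the work, not anyone's actual code):

    // EchoWidget.java - an identity widget: it returns whatever value it is given.
    public class EchoWidget {
        public String echo(String value) {
            if (value == null) {
                throw new IllegalArgumentException("value must not be null");
            }
            return value;
        }
    }

    // EchoWidgetTest.java
    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class EchoWidgetTest {

        @Test
        public void returnsTheValueItWasGiven() {         // positive case
            assertEquals("hello", new EchoWidget().echo("hello"));
        }

        @Test(expected = IllegalArgumentException.class)  // negative case
        public void rejectsNullInput() {
            new EchoWidget().echo(null);
        }
    }

A widget with a random number generator, like Jason's, is harder precisely because the test has to control or constrain the randomness before it can assert anything.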

There were some issues identified after the exercise of writing the unit tests. Those issues were:


  • How do we bridge the gap between testers and developers and actually get this level of collaboration on real-world software projects?

  • Developers suffer from the same oracle problem that testers do: how do we know the expected result is correct?

  • It is difficult for developers (even though they do it every day) to explain unit testing to testers. All three developers struggled with explaining their thought process and their tacit knowledge around unit testing.

  • Unit testing is done at such a low level that it is difficult to test real business functionality (unit tests test the code, not that the requirement is correct).

  • Why isn't there more unit testing done?



We did not come up with many answers. We instead parked those issues with the others and moved ahead with the presentations. Some of the experience reports addressed some of those issues.

Cheryl Westrick then shared her experience helping developers with unit testing at her company. She is currently involved in a project where the testers were given the code as it was developed and logged defects as they found them. The unit testing was manual, and the team identified and logged thousands of issues. Unfortunately, the testing still took place with an "over the wall" mentality: the testers would get builds, log defects, and then the process would repeat. No one worked side by side to resolve issues or establish continuous communication. Overall, Cheryl did not seem overly excited about the results of the effort, but during discussion it emerged that her team was able to gain insight into the strengths and weaknesses of the developers, something previously difficult to do. We also talked about how these early insights into the problem areas in the software would affect their system testing going forward. Would they concentrate testing on areas where defects had clustered during unit testing? Would they scrutinize certain developers' code more closely?

Next, Chad Campbell shared his experience testing private versus public methods while unit testing. Some believe in testing only publicly visible methods; others believe unit testing should cover every executable piece of code. Testing private methods can be difficult if you are not an experienced developer, and Chad shared his experience using reflection in .NET to do it. Chad sparked some lively debate on the topic, with the Java developers curious about the specific details of his use of reflection. This took us into the fascinating topic of unit test theory. Chad and I maintain that, from a TDD standpoint, unit tests are there to simplify design and add confidence in your ability to refactor, so there is value in testing private methods. Allen and Jason sat on the other side of the fence, offering the view that unit testing at that level is not really valuable: all you are ultimately interested in is the correctness of the functionality exposed by the public interface, and as long as that interface is thoroughly tested you have all the coverage you need. We agreed to disagree, concluding that different development contexts will necessitate different unit testing methods (so really not much of a conclusion).
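
Chad's demo was in .NET, but the technique the Java developers were asking about exists in java.lang.reflect as well. A rough Java analogue might look like this (the PriceCalculator class and its private roundToCents method are invented for illustration):

    import java.lang.reflect.Method;
    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class PrivateMethodTest {

        @Test
        public void exercisesAPrivateHelperDirectly() throws Exception {
            PriceCalculator calc = new PriceCalculator();  // hypothetical class under test

            // Look up the private method by name and parameter types...
            Method round = PriceCalculator.class
                    .getDeclaredMethod("roundToCents", double.class);

            // ...suppress the access check, then invoke it directly.
            round.setAccessible(true);
            double result = (Double) round.invoke(calc, 19.995);

            assertEquals(20.00, result, 0.0001);
        }
    }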

This gave us a good transition into Allen Stoker's experience report on effective unit testing. Allen maintains that validating a build should be a convergence of a couple of pieces of information. You want to see a successful test run, but that's not enough. A successful run means that all of your test cases passed, but you still have no real idea of what you tested. Allen made the case that this is typically where coverage analysis comes in.

The classic argument is that a detailed coverage report will give you a strong indication of test case quality by identifying what code was exercised. Unfortunately, it will not tell you what was tested. The term "tested" implies far more than the execution of code. A proper test case requires that the results of the code execution are known ahead of time, and that the scenario actually verifies those results. A good test case will typically exercise a block of code multiple times, changing the conditions of execution on each pass. From this, you can see that a good secondary indicator is the number of times each line of code is executed during the unit test run.


High coverage % + High line execution count = success ?


According to Allen, no.

The problem is that one can write an entire suite of test cases that exercises the application at a high coverage level, and even executes the code multiple times (meeting the noted formula), but doesn't really verify anything (a sketch of such a test follows below). Building effective test cases takes a significant amount of time and thought to ensure that the right things are being tested. In the end, according to Allen, there is only one real solution, and that is human intervention. Allen recommended each use case include the following quality controls:


  • Detailed design reviews before coding (if the design is bad, it may be plagued with issues forever).

  • Initial code reviews (early identification of problem areas and bad practices)

  • Final code reviews including:

    • Full source code of component

    • Full JUnit source code of component

    • Coverage analysis results




This may require dedicating a senior team member to this activity alone for the entire project. It's a significant investment. Unit testing is no different from any other coding effort: it requires both learning the API and learning about effective testing. In addition, Allen advocated developers having their unit tests reviewed by an experienced tester as a mentoring process. There are probably many ways you could automate analysis of the unit test cases, but the only way to truly know that you have an effective test suite is to review the test cases manually.
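
Allen's argument is easy to demonstrate in code. The sketch below (an invented example; Parser is a hypothetical class under test) earns high coverage and a high line execution count, yet it can never fail - which is exactly the kind of thing only a human reviewer will catch:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class CoverageTheaterTest {

        // Exercises every line of parse() a hundred times over, satisfying
        // the "high coverage % + high line execution count" formula, but it
        // asserts nothing, so it can never fail.
        @Test
        public void looksThoroughButVerifiesNothing() {
            for (int i = 0; i < 100; i++) {
                Parser.parse("input-" + i);
            }
        }

        // What a reviewer wants to see instead: known input, expected result.
        @Test
        public void actuallyVerifiesAResult() {
            assertEquals(42, Parser.parse("42"));
        }
    }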

Our discussion of Allen's experience focused mostly on his final recommendation of having unit tests reviewed by an experienced tester as a mentoring process. This generated discussion between the testers and developers at the workshop on the different sets of knowledge we own and cultivate and how that knowledge can best be leveraged. We tried to imagine how to effectively implement such an environment. Code quite often serves as a wall between tester and developer collaboration. As a group, we felt that testers would almost need to have a developer background to get involved in the code at this level.

We also identified a second wall, this one created by developers. Typically, if you engage a tester in this type of activity, "Hey, can you help me with my unit testing? I don't know if I'm testing the right stuff..." the tester will be excited to be engaged. On the other hand, if you attempt to engage a developer, our shared experience has been that the developer will respond, "Just let me code and leave me alone...." We talked about how to tear down this wall and decided that it can only be done on a one-on-one basis. Individual testers and developers will need to decide to collaborate and to learn from one another. We see very little that corporate or project leadership can do to facilitate this behavior.

We then looked at a problem-solving opportunity presented by Dana Spears. Dana asked the group how an organization might develop standards and metrics for unit testing. Is there a process flexible enough to be implemented at the organizational level to address effective unit testing? The short answer, sadly, is no. It was the conclusion of the attendees that unit testing, like any testing activity, tends to be qualitative rather than quantitative. We generated a lot of discussion on the topic, but we could only come up with more questions than answers.

Finally, we finished with a tool demo. We looked at component testing with Rational Application Developer. This tool is an excellent example of the types of tools needed to get developers and testers working together. Using this tool, developers have the ability to create complex component-test scenarios, select sets and ranges of data for input, and have ready access to the static metrics for coverage analysis. In addition, testers can easily provide input in the form of data selection techniques like boundary-value analysis, domain partitioning, and selecting data sets and ranges for optimal test case coverage.

We find tools like this one valuable in that developers and testers don't necessarily have to work side by side (although there wouldn't be anything wrong with doing that), but instead we envision a scenario where developers develop component tests on their own and then work with testers to define the test data to be used in the test. Data can be entered directly by the tester, thus extending the component test, or the tester can send comments on data selection to the developer and the developer can enter the data.
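
Even without the tool, the division of labor we have in mind is easy to picture in plain JUnit. In the sketch below (the widget and the values are invented), the developer owns the test skeleton and the tester owns the data rows, chosen here using boundary-value analysis around a discount threshold:

    import java.util.Arrays;
    import java.util.Collection;
    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.junit.runners.Parameterized;
    import org.junit.runners.Parameterized.Parameters;
    import static org.junit.Assert.assertEquals;

    @RunWith(Parameterized.class)
    public class DiscountWidgetTest {

        // The tester's contribution: data rows picked with boundary-value
        // analysis. Extending the test means adding a row, not writing code.
        @Parameters
        public static Collection<Object[]> data() {
            return Arrays.asList(new Object[][] {
                { 0,   0 },   // lower boundary
                { 99,  0 },   // just below the discount threshold
                { 100, 5 },   // on the threshold
                { 101, 5 },   // just above it
            });
        }

        private final int orderTotal;
        private final int expectedDiscount;

        public DiscountWidgetTest(int orderTotal, int expectedDiscount) {
            this.orderTotal = orderTotal;
            this.expectedDiscount = expectedDiscount;
        }

        // The developer's contribution: the test skeleton itself.
        @Test
        public void appliesTheExpectedDiscount() {
            assertEquals(expectedDiscount,
                    DiscountWidget.discountFor(orderTotal));  // hypothetical widget
        }
    }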

Working this way can result in the following advantages:


  1. Increased appreciation and understanding of skill sets

  2. Cross-training and knowledge dissemination

  3. Better and more meaningful component tests

  4. Possible reuse of component test data for functional testing downstream in the project lifecycle

  5. Better coordination of all testing efforts based on coverage statistics and other static metrics

  6. Better software



Next month's topic is on testing in an Agile environment. We already have a couple of experience reports lined up from the good folks at CTI Group, and I am hoping we can get some good lessons learned on the topic and identify some of the issues around shifting the testing culture here in the Indianapolis community. If you would like to participate in an IWST workshop, please feel free to contact me.
Why do we create unit tests?
A friend of mine asked me the question: "Why do we create, validate and peer-review unit test scripts?"

I was not sure how to answer this, so I took a look at some of the work and thoughts of people smarter than me (James Bach, Brian Marick, Cem Kaner, Jonathan Kohl, Bret Pettichord, Rex Black, Dave Liebreich) and added my own thoughts and experiences. The following is what I sent her - along with some links to some good books and websites on the topic.



There are two main contexts from which I can answer this question. The first is the context of the software development lifecycle (SDLC) and the second is the context of the unit test itself, as an individual unit of work in the overall testing effort for the project. That is, when looking at the SDLC, unit testing can sometimes involve a methodology - as in test-driven development (TDD), typically used in extreme or agile programming - or a unit test can sometimes serve as a closing artifact for a phase - as in a regulated waterfall methodology. When looking at a unit test as an individual unit of work within the overall testing process, it can be easier to see its value apart from the implications it has for the SDLC in which it is being used. I will answer the question from the latter context, looking at a unit test as an individual work effort within the overall software testing process.

Unit tests are typically created for the following reasons:


  1. Unit tests provide quick feedback to the developer.

    The sooner a developer receives feedback on their code, the sooner they can implement a change based on that feedback. The unit test provides feedback before other test systems exist for the code under development, and provides it in the code environment closest to the developer. It is less expensive to fix errors found in a development environment than errors promoted to test environments. It is less expensive to fix errors found by the developer writing the code than errors found and recorded by a tester, where the time to record, process, and track the error becomes overhead on the project.

  2. Unit tests provide a measure of developer progress.

    There is an increasing belief that a properly structured system is easy to unit test. A unit test that is desirable but hard to implement is seen as a sign that the system needs improving. Often, developing unit tests will help focus and clarify thoughts on the code being developed by forcing issues and ambiguities to the surface before implementation and release.

  3. Unit tests mitigate concerns about the effects of refactoring.

    Unit tests are typically written at the time of the highest flux and least reliability in the code they are testing. By providing quick feedback and providing a measure of progress, developers can work confidently during times of flux and make changes with confidence, knowing that any omissions or new errors introduced should be caught by their suite of unit tests.

  4. Unit tests validate code integration.

    Unit tests can be reused during code integration to ensure that changes in other developers' code do not adversely affect the code tested by the unit test. "If it worked before, it should work now." While typically not a complete test of integration, a complete test of all the parts of a whole is the logical first step in testing the whole.

  5. Unit tests result in more testable code.

    In the process of developing unit tests, developers are typically required to build test harnesses, logging and debug utilities, and APIs to exercise functionality in isolation. Almost always, this functionality can be used downstream in the testing process during integration testing, system testing, regression testing, performance testing, and user acceptance testing.

  6. Unit tests document the code they test.

    Unit tests can be used as a form of executable documentation for the code they test. These tests are of substantial value to programmers doing maintenance, whether they are trying to understand their own code at a later date or looking at someone else's. The simplicity of unit tests clarifies the intent and the expectations of the code (see the sketch following this list).

  7. Unit tests are inexpensive to run and maintain.

    Relative to all the other types of testing, unit tests are inexpensive to create, maintain and run. Tools are typically free or included in the enterprise IDE being utilized, resulting in no additional tool investment. Unit tests are typically coded in the same language as the code they are testing, resulting in no additional cost associated with maintaining a specific language skill set. Unit tests are typically simple enough that no extra documentation is necessary for their longevity, unlike all other types of tests (with the exception of exploratory testing).

  8. Unit tests report serious problems.

    For a good many unit tests, failure would indicate a very serious problem if it were to occur. Unlike system tests, which can involve subjectivity and ambiguity, unit tests typically focus on technology issues and coding errors.
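
Point 6 is the easiest to see in code. In the invented example below, the test names and assertions spell out the contract of a hypothetical slugify() helper more precisely than prose documentation would - and unlike prose, they fail when they go stale:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class SlugifyTest {

        @Test
        public void lowercasesEverything() {
            assertEquals("hello", TextUtil.slugify("HeLLo"));  // hypothetical helper
        }

        @Test
        public void replacesSpacesWithHyphens() {
            assertEquals("unit-testing-workshop",
                    TextUtil.slugify("Unit Testing Workshop"));
        }

        @Test
        public void trimsLeadingAndTrailingWhitespace() {
            assertEquals("march-session", TextUtil.slugify("  March Session  "));
        }
    }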


Aside from their immediate value to the developers, as shown above, unit tests provide value to the overall testing process:


  1. Unit tests create a test harness that can be leveraged for other types of testing.

    As described above under "Unit tests result in more testable code," unit tests and unit test harnesses can be leveraged elsewhere. Unit tests can be repurposed to address risks that may not have been envisioned when they were written. They can be used as a starting point for APIs used in test automation, or to seed a suite of automated tests. Logs developed for unit testing can be leveraged throughout the test lifecycle.

  2. Unit tests reduce the overall scope (coverage analysis and risk analysis) of other types of testing.

    System tests can be designed by reviewing the existing unit tests - in some cases tapping into interfaces the developers wrote for their own tests - and can concentrate on areas the developers didn't cover. System testers should take advantage of unit tests and design their own tests to mitigate risks not already addressed at the unit level.

  3. Good unit tests remove the necessity of in-depth domain testing.

    With the exception of a quick sampling to verify that the right testing was done at the unit level, domain testing can be omitted entirely.

  4. Good unit tests remove the necessity of in-depth boundary value analysis.

    With the exception of a quick sampling to verify that the right testing was done at the unit level, boundary value analysis can be omitted entirely.