Posts in Heuristics
The Hoffman Heuristic: Why not try them all?
How would you test the following?

[Embedded sample application: two time drop-down lists with simple error messaging]

It's a simple time-clock application. You give it two times, it records them. It has some simple error messaging for invalid value combinations. I was once given a problem similar to this by James Bach.

The following is my memory of how the conversation went (it's an old memory, so my apologies to James if I misrepresent him):
Mike: "Who uses this product?"

James: "I do. I use it to train testers."

I know it's a sample application. This tells me that there is no specific user I should be keeping in mind as I'm testing. This affirms in my mind that this is an analysis exercise. I think of James' SFDPO heuristic (Structure, Function, Data, Platform, Operations) and ask more questions.

Mike: "Tell me about the structure."

James: "HTML and JavaScript."

Not very helpful. I do a view source and look at the code. Very simple. I do notice that the display values for time are converted to integers for the error checking.

Mike: "All it does is take the two inputs and performs error checking?"

James: "Correct."

Man of few words.

Mike: "Where is the data stored?"

James: "The data doesn't get stored anywhere."

There is no real user, so I skip over operations.

Mike: "Are you interested in platform based defects?"

James: "I'm interested in seeing how you test it and what problems you find."

I start with some basic quick tests. I change resolution and browser size, and I notice that at low resolution the time dropdowns fall off the screen and I can't select all the values. I tell James.

James: "Ok. What else?"

He's looking for something deeper. With the easy stuff over, I do my analysis of the functionality. There are two equivalence classes, AM and PM. There are also some interesting values to try: noon, midnight, and at least one half-hour value for each class. Error handling appears to fire when the start time is later in the day than the end time.

I run a couple of tests. After 30 minutes, I don't want to spend any more time on this problem. I tell James my results.

Mike: "Well, it doesn't like it when I select midnight to end my work day. It thinks that's the start of the day. Other then that, everything appears to be fine. When I select a start time before my end time, it works. Oh, the error dialog doesn't have a caption. I might report that."

James: "Are you sure everything appears to work? What have you tried?"

I show him how I modeled the problem. I tell James my thoughts during my analysis and some of the tests I had executed based on my equivalence classes.

James: "Are there any other tests you want to try? Why not try more?"

Oh! James is looking for me to talk about the value of a test! He wants me to tell him that I no longer see value in testing and that I should now move on to the next application.

Mike: "I suppose I can, but I'm not sure there's value in it. At this point I'm relatively confident that the next test won't reveal new information. I should move on to the next assignment."

James: "Really? Why?"

Uncertainty...

Mike: "Well, I've spent 30 minutes on it. I've completed the testing based on my analysis. I can keep testing, but why?"

James: "Why didn't you just test all the values?"

Surprise...

Mike: "Why didn't I test all the values? There are two fields. Each field has 48 values. That's 48 x 48 tests! I'm not testing all the values, it's a waste of time."

James: "Why didn't you just script testing all the values?"

I murmur...

Mike: "I didn't think about it."

James: "Why?"

Mike: "I don't know. I guess it seemed like too simple a problem. Based on everything we've covered up to this point I figured my analysis would be enough. Why waste time writing a script when I can test it manually faster."

James: "Take a look at this..."

At this point, James writes a quick Perl script. He copies the selection values from the HTML and uses a regular expression to read them in (so he doesn't need to waste time formatting the data), and he runs the test. Total time to write and execute the script: about 10 minutes. It turns out there is a defect I missed around a specific conversion in the JavaScript that shows up with a value I never tried because of my equivalence classes.
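
Something in the spirit of that script might look like the sketch below. To be clear, this is my reconstruction for illustration, not James' actual code: the filename timeclock.html is hypothetical, the to_minutes oracle is my own, and page_says_valid is just a stand-in that a real run would replace with a faithful Perl port of the page's JavaScript conversion.

#!/usr/bin/perl
use strict;
use warnings;

# A sketch in the spirit of James' script, not a reconstruction of it.
# The filename timeclock.html is hypothetical: save the page source
# there first. The regex pulls the display values out of the <option>
# tags so there's no need to format the data by hand.
open my $fh, '<', 'timeclock.html' or die "can't open page source: $!";
my @times;
while (my $line = <$fh>) {
    push @times, $1 while $line =~ /<option[^>]*>\s*([^<]+?)\s*<\/option>/g;
}
close $fh;
die "no option values found\n" unless @times;

# Independent oracle: convert an "h:mm AM/PM" display value to minutes
# past midnight.
sub to_minutes {
    my ($t) = @_;
    my ($h, $m, $ap) = $t =~ /(\d{1,2}):(\d{2})\s*([AP]M)/i
        or die "unrecognized time: $t";
    $h = 0 if $h == 12;             # 12:xx AM is 00:xx; PM adds 12 below
    $h += 12 if uc($ap) eq 'PM';
    return $h * 60 + $m;
}

# Hypothetical stand-in for the page's validation; replace with a
# faithful Perl port of the JavaScript conversion under test.
sub page_says_valid {
    my ($start, $end) = @_;
    return to_minutes($start) < to_minutes($end) ? 1 : 0;
}

# Try all 48 x 48 combinations and flag any disagreement with the oracle.
for my $start (@times) {
    for my $end (@times) {
        my $expected = to_minutes($start) < to_minutes($end) ? 1 : 0;
        print "MISMATCH: '$start' to '$end'\n"
            if page_says_valid($start, $end) != $expected;
    }
}
printf "done: %d combinations checked\n", scalar(@times) ** 2;

The structure is the whole point: pull the values straight from the page, loop over every pair, and let a cheap, independent oracle judge each result.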

This is the Hoffman Heuristic: Why not try them all?

This heuristic is based on Doug Hoffman's testing of a massively parallel system. He talks about that experience here.

In Doug's own words:


  • Failures can lurk in incredibly obscure places

  • You could need 4 billion computations to find 2 errors caused by a bug

  • Even covering all of the input and result values isn't enough to be certain



Sometimes automation is faster than thinking about the problem. Instead of taking the time to analyze which values I wanted to test with, I could have executed all possible tests. While that's not always possible, it's something we need to be constantly looking for.

---------------------------------

Previous Comments

---------------------------------

>>>>Thanks



Hi Mike,


"I'll do my best to answer them."

You have done your best to answer my questions, no doubt about that. Thanks!

Sorry about question (4); I realize the same thing myself, as your answer says, but I was interested in getting another dimension beyond what I claim to know. I did get it, and thanks again.

Regards,

Pradeep Soundararajan
www.testertested.blogspot.com

>>>>Re: Could that be caught manually too?



Hi Pradeep,


Thank you for the questions. I'll do my best to answer them.

1. I am interested to know from you: could the same bug be caught by manual testing?

Of course. James used automation because, as Scott pointed out below, it was the fastest and easiest way to test all the single values in the lists.

2. What thought process would be required to catch the same bug?

An all-singles approach would find it. If I had chosen different values for my existing equivalence classes, I would have found it. If I had a theory of failure in mind around conversion errors, I might have made a different equivalence class or set of test cases that would have found it.

3. Assuming you are allowed to write and run just one script, and that script does not catch a bug, what would you do?

Why would you assume you can only write and run just one script? How is that useful?

4. If a tester were to catch the same bug manually in 9 minutes, does it mean "Humans can do what automation does"?

Humans can't do the same things as automation; we are too smart. The test James ran was incredibly simple. It would have missed the few defects I did notice. It would have missed a JavaScript error at the bottom of the screen. It would have missed the implications of invalid time entry in other parts of the application. The list of what it would have missed goes on and on.

In addition, automation is not something that happens on its own. Humans write the automation code and decide when it's useful to use. James found the bug in the example above; the automation didn't find anything. For the automation to find it, it would have had to write its own code, select its own oracle, and know when to execute its own test. Incidentally, that's not what happened. James did all those things.

You might find it useful to read James' blog post titled Manual Tests Cannot Be Automated and my article Software Testing Automation on the Fly: Against the Frameworks.

5. Had you caught the bug before James demonstrated to you the power of scripting, would there be a different lesson you might have learnt?

The purpose of the lesson was around modeling. The scripting lesson I walked away with was a bonus.

Thanks,
Mike

>>>>Could that be caught manually too?



Hi Mike,


1. I am interested to know from you: could the same bug be caught by manual testing?

2. What thought process would be required to catch the same bug?

3. Assuming you are allowed to write and run just one script, and that script does not catch a bug, what would you do?

4. If a tester were to catch the same bug manually in 9 minutes, does it mean "Humans can do what automation does"?

5. Had you caught the bug before James demonstrated to you the power of scripting, would there be a different lesson you might have learnt?

Many thanks for sharing your valuable learnings from James Bach!

Regards,

Pradeep Soundararajan
http://www.testertested.blogspot.com

>>>>"I guess it seemed like too s



"I guess it seemed like too simple a problem." Wow. Good heuristic!


Why not try them all? Because back in "the olden days" we did not have automated test tools, so we had no choice but to narrow the scope of the problem, i.e., equivalence partitioning.

Now that we do have automated test tools, we forget we have them?? I admit my thoughts were going down the same road as yours... "Ah, equivalence partitioning, smart!" Uh, not really...

>>>>I don't think I can...



It's been well over a year since I saw the problem, so I don't remember the details that well. Perhaps Scott does... Or, if James sees this he may comment.


-Mike

>>>>Curious about the details



"It turns out there is a defect I missed around a specific conversion in the JavaScript that shows up with a value I never tried because of my equivalence classes."


Can you satisfy my curiosity and reveal the details of this error? Which conversion, for which values?

>>>>No Fair!!!!



I think I did a previous version of this exercise with James in a class we did prior to the last WTST. In the 5 minutes he gave us to come up with our 10 best test cases, I came up with 7 (of which only 6 were actually unique conditions, because I didn't check my own answers for duplicates) before deciding "good enough". Jim asked me why I thought those were good enough. My answer was:


"Because it was taking me longer to think about and write down the test cases than it was taking me to test them, and if this was an application worth spending more than 15 minutes on, I'd just automate a quick script to loop through all the combinations."

My thought process was that in this case it would take me less time to automate every possible combination than it would take me to do a really good job of reducing the number of combinations that had a reasonable likelihood of generating errors.

So, no fair that *I* don't get the named heuristic! :P

--
Scott Barber
Chief Technologist, PerfTestPlus
Executive Director, Association for Software Testing
sbarber@perftestplus.com

---------------------------------

Imagine yourself sitting at a computer...
Recently, while working on a problem with James Bach, I was challenged to come up with a series of tests for a system with at least two events that could potentially overlap each other. I came up with the following set of tests:

(Note: The names "process one" and "two" are arbitrary; the processes could in fact be the same process.)


  1. process one, then two

  2. process one and two at the same time

  3. process one, with process two beginning before it's finished, with process one ending before process two

  4. process one, with process two beginning before it's finished, with process two ending before process one





For full disclosure, this isn't the actual set of test cases I first came up with; there were some duplicates in my initial analysis because I didn't understand the problem and failed to ask clarifying questions. In addition, I had two test cases that didn't really make sense once evaluated.

During the debrief for this exercise, James zeroed in on my test for two processes starting at the same time. His question was, "How would you actually run this test?"

Well, I hadn't really thought of that at the time I was thinking up the tests. When I was thinking of the tests, I was thinking of my model of the problem, not how I would execute them. I responded that I might run that test on a multiprocessor system, or on an emulator, or using some tool that I didn't know about that facilitates this kind of testing.

James didn't let up, "Ok, you have a multiprocessor system, now how will you actually run this test?"

I've tried to block the next fifteen minutes of our conversation from my memory, which conveniently means I don't have to blog to the world about how dim I can actually be at times. But I do remember the lesson James taught me. (How convenient!)

When you are thinking about your tests, don't just think about coverage and risk, imagine yourself sitting in front of the computer and actually running the tests.

The problem I faced with James' earlier challenge was that I couldn't actually articulate the details of how I would run that test. How would I actually start both processes at the same time? How would I know they really started at the same time? What does the same time mean (millisecond, clock cycle, other...)? How would I actually monitor the CPUs to see if they were both working at the same time? How would I do this without interfering with the test itself? Grrr... Details!
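
For what it's worth, here is one way I might approach the "same time" question today. It's a minimal sketch (mine, not James'), assuming a POSIX system with Perl's Time::HiRes module: fork both processes and have each wait for a shared wall-clock deadline.

#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hedged sketch: approximate a simultaneous start by giving both
# children the same wall-clock instant to aim at. "Same time" still
# means "within whatever slack the clock and the scheduler introduce".
my $deadline = time() + 1.0;

for my $name ('process one', 'process two') {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                              # child
        sleep 0.0005 while time() < $deadline;    # wait out the barrier
        printf "%s started at %.6f\n", $name, time();
        # ... invoke the actual work under test here (hypothetical) ...
        exit 0;
    }
}
wait() for 1 .. 2;    # reap both children

Even this begs the questions above: the barrier only gets the two starts close, and observing the start times (the printf) perturbs the very timing being tested, which is exactly the kind of detail that stays invisible until you imagine actually running the test.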

But the details are important when we design tests. On a recent project I ran into this again. While developing a test strategy for a system with multiple interfaces, some third-party web services, and a feature set that at times can't really be tested through the GUI, I made the same mistake: I focused too much on the strategy and not enough on the tactics.

I knew what I wanted to test, but I waited too long to figure out how I would test it. It wasn't until prompting from the project manager that I realized this. She asked how I would actually run my tests.

"Uhh... well, we'll just run them. You know, using the system... err... Is that my cell phone?" Needless to say, I wasn't very happy with my answer. Neither was she.

(Note to project managers in the crowd, that's a great question to ask.)

It was at that time we started to develop detailed test methods for our testing (special thanks to fellow blogger John McConda). These were just high-level processes for running a basic test through the system: looking at test data, spelling out the execution steps, and identifying potential oracles. Nothing super heavy or really strict, but useful in helping us think about actually sitting in front of the computer and running the tests.

After you think of a test (assuming you don't just execute it right then and there), take the next step and think about how you will execute it. What will you need? What will the testing look like? How long will it take? How will you know when you are done?

Imagine yourself sitting at the computer...