The Hoffman Heuristic: Why not try them all?

How would you test the following?



It's a simple time-clock application. You give it two times; it records them. It has some simple error messaging for invalid value combinations. I was once given a problem similar to this by James Bach.

The following is my memory of how the conversation went (it's an old memory, so my apologies to James if I misrepresent him):
Mike: "Who uses this product?"

James: "I do. I use it to train testers."

I know it's a sample application. This tells me that there is no specific user I should be keeping in mind as I'm testing. This affirms in my mind that this is an analysis exercise. I think of James' SFDPO heuristic (Structure, Function, Data, Platform, Operations) and ask more questions.

Mike: "Tell me about the structure."

James: "HTML and JavaScript."

Not very helpful. I do a view source and look at the code. Very simple. I do notice that the display values for time are converted to integers for the error checking.

Mike: "All it does is take the two inputs and performs error checking?"

James: "Correct."

Man of few words.

Mike: "Where is the data stored?"

James: "The data doesn't get stored anywhere."

There is no real user, so I skip over operations.

Mike: "Are you interested in platform based defects?"

James: "I'm interested in seeing how you test it and what problems you find."

I start with some basic quick tests. I change resolution and browser size, and I notice that on low resolution the time dropdowns fall off the screen and I can't select all the values. I tell James.

James: "Ok. What else?"

He's looking for something deeper. With the easy stuff over, I do my analysis of the functionality. There are two equivalence classes, AM and PM. There are also some interesting values to try: noon, midnight, and at least one half-hour value for each class. Error handling appears to fire when the start time is later in the day than the end time.
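
In hindsight, the model amounted to a short list of interesting values for each field, something like the sketch below. The classes and boundaries come from the analysis above; the specific representative times are illustrative, not my actual notes.

    # Candidate values per dropdown, following the analysis above. The classes
    # and boundaries (AM, PM, noon, midnight, a half-hour value in each class)
    # come from the analysis; the specific representatives are illustrative.
    my @interesting_times = (
        '12:00 AM',   # midnight: the boundary between days
        '12:00 PM',   # noon: the AM/PM boundary
        '8:30 AM',    # a half-hour value from the AM class
        '5:30 PM',    # a half-hour value from the PM class
        '9:00 AM',    # an on-the-hour AM representative
        '5:00 PM',    # an on-the-hour PM representative
    );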

I run a couple of tests. After about 30 minutes, I don't want to spend any more time on this problem. I tell James my results.

Mike: "Well, it doesn't like it when I select midnight to end my work day. It thinks that's the start of the day. Other then that, everything appears to be fine. When I select a start time before my end time, it works. Oh, the error dialog doesn't have a caption. I might report that."

James: "Are you sure everything appears to work? What have you tried?"

I show him how I modeled the problem. I tell James my thoughts during my analysis and some of the tests I had executed based on my equivalence classes.

James: "Are there any other tests you want to try? Why not try more?"

Oh! James is looking for me to talk about the value of a test! He wants me to tell him that I no longer see value in testing and that I should now move on to the next application.

Mike: "I suppose I can, but I'm not sure there's value in it. At this point I'm relatively confident that the next test won't reveal new information. I should move on to the next assignment."

James: "Really? Why?"

Uncertainty...

Mike: "Well, I've spent 30 minutes on it. I've completed the testing based on my analysis. I can keep testing, but why?"

James: "Why didn't you just test all the values?"

Surprise...

Mike: "Why didn't I test all the values? There are two fields. Each field has 48 values. That's 48 x 48 tests! I'm not testing all the values, it's a waste of time."

James: "Why didn't you just script testing all the values?"

I murmur...

Mike: "I didn't think about it."

James: "Why?"

Mike: "I don't know. I guess it seemed like too simple a problem. Based on everything we've covered up to this point I figured my analysis would be enough. Why waste time writing a script when I can test it manually faster."

James: "Take a look at this..."

At this point, James writes a quick Perl script. He copies the selection values from the HTML and uses a regular expression to read them in (so he doesn't need to waste time formatting the data), and he runs the test. Total time to write and execute the script: about 10 minutes. It turns out there is a defect I missed around a specific conversion in the JavaScript, one that shows up with a value I never tried because of my equivalence classes.
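
To give the flavor, that kind of script might look like the sketch below. This is my reconstruction, not James' actual code; the saved page name (timeclock.html), the <option> markup being parsed, and the to_minutes() oracle are all assumptions.

    #!/usr/bin/perl
    # A sketch of the kind of script described above -- a reconstruction, not
    # James' actual code. The saved page name, the <option> markup, and the
    # to_minutes() oracle are assumptions.
    use strict;
    use warnings;

    # Slurp the saved page source and pull the dropdown values out with a
    # regular expression, so there's no need to retype or reformat the data.
    open my $fh, '<', 'timeclock.html' or die "can't open page source: $!";
    my $html = do { local $/; <$fh> };
    close $fh;

    # Both dropdowns list the same 48 times, so keep each value once.
    my %seen;
    my @times = grep { !$seen{$_}++ } $html =~ /<option[^>]*value="([^"]+)"/g;

    # Independent oracle: convert a displayed time to minutes past midnight.
    sub to_minutes {
        my ($time) = @_;
        my ($h, $m, $ampm) = $time =~ /(\d+):(\d+)\s*([AP]M)/i
            or die "unparsed time value: $time";
        $h = 0   if $h == 12;             # 12:xx AM is just past midnight
        $h += 12 if uc($ampm) eq 'PM';
        return $h * 60 + $m;
    }

    # Enumerate every start/end pair -- 48 x 48 = 2,304 combinations -- and
    # print whether the "start must come before end" error should fire.
    for my $start (@times) {
        for my $end (@times) {
            my $expect_error = to_minutes($start) > to_minutes($end) ? 'yes' : 'no';
            print "$start -> $end : expect error = $expect_error\n";
        }
    }

The output is just the full matrix of expected results; comparing it against what the page actually does is what surfaces a conversion mistake hiding in the JavaScript.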

This is the Hoffman Heuristic: Why not try them all?

This heuristic is based on Doug Hoffman's testing of a massively parallel system. He talks about that experience here.

In Doug's own words:


  • Failures can lurk in incredibly obscure places

  • You could need 4 billion computations to find 2 errors caused by a bug

  • Even covering all of the input and result values isn't enough to be certain



Sometimes automation is faster than thinking about the problem. Instead of taking the time to analyze which values I wanted to test with, I could have executed all possible tests. While that's not always possible, it's something we need to be constantly looking for.

---------------------------------

Previous Comments

---------------------------------

 



>>>>Thanks



Hi Mike,


"I'll do my best to answer them."

You have done your best to answer my questions, no doubt about that. Thanks!

Sorry for question (4); I myself do realize the same as per your answer, but I was interested to get another dimension than what I claim to know. I did get it, and thanks again.

Regards,

Pradeep Soundararajan
www.testertested.blogspot.com

 



>>>>Re: Could that be caught manually too?



Hi Pradeep,


Thank you for the questions. I'll do my best to answer them.

1. I am interested to know from you: could the same bug be caught by manual testing?

Of course. James used automation because, as Scott pointed out below, it was the fastest and easiest way to test all the single values in the lists.

2. What thought process would be required to catch the same bug?

An all-singles approach (testing every value in each list at least once) would find it. If I had chosen a different value for my existing equivalence class, I would have found it. If I had a theory of failure in mind around conversion errors, I might have made a different equivalence class or set of test cases that would have found it.

3. Assuming you are allowed to write and run just one script and if that script does not catch a bug, what would you do?

Why would you assume you can only write and run just one script? How is that useful?

4. If a tester were to catch the same bug manually in 9 minutes, does it mean "Humans can do what automation does"?

Humans can't do the same things as automation; we are too smart. The test James ran was incredibly simple. It would have missed the few defects I did notice. It would have missed a JavaScript exception at the bottom of the screen. It would have missed the implications of invalid time entry in other parts of the application. The list of what it would have missed can go on and on.

In addition, automation is not something that happens on its own. Humans write the automation code and decide when it's useful to use it. James found the bug in the example above. The automation didn't find anything. For the automation to find it, it would have to write its own code, select its own oracle, and know when to execute its own test. Incidentally, that's not what happened. James did all those things.

You might find it useful to read James' blog post titled Manual Tests Cannot Be Automated and my article Software Testing Automation on the Fly: Against the Frameworks.

5. Had you caught the bug before James demonstrated to you the power of scripting, would there be a different lesson you might have learnt?

The purpose of the lesson was around modeling. The scripting lesson I walked away with was a bonus.

Thanks,
Mike

 



>>>>Could that be caught manually too?



Hi Mike,


1. I am interested to know from you: could the same bug be caught by manual testing?

2. What thought process would be required to catch the same bug?

3. Assuming you are allowed to write and run just one script and if that script does not catch a bug, what would you do?

4. If a tester were to catch the same bug manually in 9 minutes, does it mean "Humans can do what automation does"?

5. Had you caught the bug before James demonstrated to you the power of scripting, would there be a different lesson you might have learnt?

Many thanks for sharing your valuable learnings from James Bach!

Regards,

Pradeep Soundararajan
http://www.testertested.blogspot.com

 



>>>>"I guess it seemed like too s



"I guess it seemed like too simple a problem." Wow. Good heuristic!


Why not try them all? Because back in "the olden days" we did not have automated test tools, so we had no choice but to narrow the scope of the problem, i.e., equivalence partitioning.

Now that we do have automated test tools, we forget we have them?? I admit my thoughts were going down the same road yours were... "Ah, equivalence partitioning, smart!" Uh, not really...

 



>>>>I don't think I can...



It's been well over a year since I saw the problem, so I don't remember the details that well. Perhaps Scott does... Or, if James sees this he may comment.


-Mike

 



>>>>Curious about the details



"It turns out there is a defect I missed around a specific conversion in the JavaScript that shows up with a value I never tried because of my equivalence classes."


Can you satisfy my curiosity and reveal the details of this error? Which conversion, for which values?

>>>>No Fair!!!!



I think I did a previous version of this exercise with James in a class we did prior to the last WTST. In the 5 minutes he gave us to come up with our 10 best test cases, I came up with 7 test cases (of which only 6 were actually unique conditions, because I didn't check my own answers for duplicates) before deciding "good enough". Jim asked me why I thought those were good enough. My answer was:


"Because it was taking me longer to think about and write down the test cases than it was taking me to test them, and if this was an application worth spending more than 15 minutes on, I'd just automate a quick script to loop through all the combinations."

My thought process was that in this case it would take me less time to automate every possible combination than it would take me to do a really good job of reducing the number of combinations that had a reasonable likelihood of generating errors.

So, no fair that *I* don't get the named heuristic! :P

--
Scott Barber
Chief Technologist, PerfTestPlus
Executive Director, Association for Software Testing
sbarber@perftestplus.com