Fast and frugal tree for test triage Rikard Edgren

There are situations when you have to choose whether or not to run a test.
Some organizations quantify properties like time and risk, and get a prioritized list.
Most probably just use their intuition, but if that’s not enough, or if you want to explain and share the reasoning, you can try a fast and frugal tree. (Yes, I got this idea while reading Gigerenzer’s Gut Feelings.)

[Figure: Fast Frugal Test Triage decision tree]

If the answer is Don’t Know to any of the questions, keep the test on the list until you know more.

Feel free to exchange these for questions that suit you better, or to adapt the tree to other areas of concern, e.g. which bugs to fix.
This is not a waterproof method, but it might yield the very best results, and it is more straightforward than the notion of testworthiness.
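Since the actual questions live in the figure, here is a minimal sketch (in Python) of how such a tree can be walked; the three questions are placeholders of my own, not the ones from the figure, so swap in whichever suit your context:

# A fast and frugal tree as straight-line code. The questions are
# illustrative assumptions. Answers are "yes", "no" or "dont_know".
def triage(answers):
    # Each node: (question, decision if yes, decision if no);
    # None means no decision here, fall through to the next question.
    tree = [
        ("can_it_reveal_an_important_problem", None, "skip"),
        ("is_the_information_already_known", "skip", None),
        ("is_it_cheap_enough_to_run_now", "run", "skip"),
    ]
    for question, if_yes, if_no in tree:
        answer = answers.get(question, "dont_know")
        if answer == "dont_know":
            return "keep on the list until you know more"
        decision = if_yes if answer == "yes" else if_no
        if decision is not None:
            return decision

print(triage({"can_it_reveal_an_important_problem": "yes",
              "is_the_information_already_known": "dont_know"}))
# -> keep on the list until you know more

The charm of the fast and frugal format is that a single answer can settle the matter on its own; you never weigh factors against each other.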

Roleplaying your test scenarios Martin Jansson

Many of you have played roleplaying games/storytelling games, or at least heard of them. In those games there is a gamemaster/storyteller, who arranges scenes or scenarios that the players act in. The gamemaster does not control what each player does, nor how they interact with others.

Each player usually plays a role/character that has a certain background, certain skills, certain attributes and, most importantly, comes from a specific context with specific information. The player then acts based on the premises of the environment set up by the gamemaster.

I think you can develop the skill of acting on the context and information that another character or persona has. You can train this through roleplaying games or regular acting.

Imagine if the personas in product development were created with the consideration that they can be “played” by testers. The personas should then contain enough information that a test scenario can be set up with several testers who interact based on their personas.

You should be able to find a new set of bugs, issues or risks that might be missed when not roleplaying personas as a group. Using the different personas in your everyday testing will make it more fun and creative. If you are able to act and test based on the information that each persona has, it will lead you down new paths. Reporting bugs and writing status reports with these personas in mind will also let you talk about risks from a different perspective, which can give some extra power.
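As a rough illustration, a playable persona could be captured like this; the fields and the example persona are my own assumptions, just to show the kind of information a tester would need to act on:

from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    background: str   # where the character comes from
    skills: list      # what they can and cannot do
    context: str      # the situation they act from
    knowledge: list   # the information only this persona has

# Hypothetical example for a tester to "play" during a session:
anna = Persona(
    name="Anna",
    background="Accountant, 20 years with the legacy system",
    skills=["keyboard-only navigation", "bulk data entry"],
    context="End of fiscal year, severe time pressure",
    knowledge=["old shortcuts", "workarounds from the legacy system"],
)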

Lateral Tester Exercise II – Everyday Analogies Rikard Edgren

Analogies are powerful when they help us understand something (they shouldn’t be used to argue.)
And virtually any analogy can be good; you don’t know until you have tried.
So this exercise is to take an analogy from your daily life and compare it to testing in general, or to your current area of concern.
Follow the thinking, work in several rounds to see what happens.

My little example is software testing vs. boiling potatoes:

Often potatoes are peeled, but not always. You can also use tools to speed up the preparation phase.
Testing conclusion: you might not need to spend time preparing; and if you do, can you use tools?

Potatoes grown by yourself, or nearby, taste better, are often cheaper, and are more nutritious.
Testing conclusion: Outsourcing or a commercial tool package might not be good for you.
Your own solutions, the strategies and ideas grown in your environment can be the best.

When are potatoes ready? Easy, you just feel if they are soft, but not mushy.
Testing conclusion: Difficult, but if tests aren’t revealing new information, is it over-cooked?
However, it would be nice with a stick to check the product’s temperature…

Potatoes aren’t dined on alone; they are part of a meal, with a mix of tastes.
Testing conclusion: You need to adapt to the rest of the dinner: who is it for, in which situation, for how many? What kind of testing is performed by others?

But I have not found a solution to this interesting question:
What is the salt of software testing?

Developers, let the testers assist with the technical debt Martin Jansson

Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite. Objects make the cost of this transaction tolerable. The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise.

The quote is taken from Ward Cunningham in “The WyCash Portfolio Management System” [1] from 1992. Martin Fowler further elaborates on the subject of “Technical Debt” [2] and identifies the “Technical Debt Quadrant” [3], where he makes a distinction between prudent and reckless debt as well as deliberate and inadvertent debt. These four combinations create the quadrant, as Martin sees it. Uncle Bob points out in his article “A mess is not a technical debt” [4] that some deliberate shortcuts should not be considered technical debt, but are instead just a mess. I think “a mess” could be seen as Broken Windows (another metaphor) that in the end turns into a debt you must pay, thus I agree with the comment that Daniel Broman made. Steve McConnell discusses intentional and unintentional technical debt in an article [5]. He also elaborates on short-term and long-term debt. One of many important suggestions he brings up is to maintain a debt list connected with the work packages needed to pay off each specific debt.

If the shortcut you are considering taking is too minor to add to the debt-service defect list/product backlog, then it’s too minor to make a difference; don’t take that shortcut. We only want to take shortcuts that we can track and repair later.

The quote above from Steve McConnell implies an openness about the debt that I like. It also implies that we should focus on debt that we make a strategic decision on. The comments to this article also shed more light on the debt metaphor. Andy Lester has a talk, “Get out of Technical Debt Now!” [6], where he identifies many issues that are not strictly technical but rather in the social domain. There are many aspects that I agree with, but I would consider them another kind of debt.

Ted Theodoropoulos has a series of articles where he defines [7] Technical Debt, then goes into how to identify [8] it, how to quantify [9] it, how to plan a remediation [10] and finally how to govern [11] it. Ted goes deeper than the previous authors have gone and is perhaps a bit more structured in his definitions, as I see it. Still, the sum of all views from all authors is important. All of these developers use the debt metaphor to make it easier to communicate with different stakeholders.

By now you should have a good understanding of what Technical Debt is, and perhaps some ideas on how we, as testers, can help our fellow developers with it. None of the authors above mention the use of testing or QA in this. If we know there is a technical debt that is ever increasing, why is it so seldom that we talk about it openly in the project, between tester and developer? It is a fact that developers cannot always leave the code in as good shape as they want, and that they must take shortcuts. Some of the authors above suggest keeping a list of these shortcuts, open issues and bugs beforehand. This would give us testers lots of valuable information that would make our testing even better. If the technical debt is a continuous struggle for the developers, we as testers are there to help!

Jonathan Kohl has touched on the subject before in his article about “Testing Debt” [12]. Johanna Rothman has written two articles on “What Testers Can Do about Technical Debt” [13] [14]. I will try to build on their ideas.

  1. Work closer to the developers and gain their trust. Assist them so that you understand when certain shortcuts are not sloppiness, but deliberate with known risks. Bugs found in these areas might not be a big surprise to them.
  2. Improve communication within the project to such a degree that we can talk about the technical debt (and possibly testing debt) openly, and start pinning down the debt lists, assuming we have both testing debt and technical debt (developer debt). With testing debt I am referring to my own definition in the article “Turning the tide of bad testing” [15]. Still, many aspects of the debts are social factors that are not so easy to talk about, but does that mean we should not add them to the list? It might be too hard to mix issues that are directly related to code with social factors. In that case we might be talking about different kinds of debt for different situations and decisions.
  3. I would assume that the developers know which items in the debt list they will consider fixing soon and which will be postponed to the far, far future. We would be able to talk about which areas should have higher priority, so that we can put focus where it belongs.
  4. If we know an area is broken (the dead horse heuristic) and that it will be fixed in the near future, there are probably other areas that could get higher attention.
  5. For areas the developers consider “safe” to postpone into the far future, we as testers could verify their claims of certainty, or identify new risks that make them reconsider.
  6. We could see the debt as a risk area in itself: test it, and add the bugs found as an additional cost of leaving the debt behind, and as an aid to pinpoint where a payback gives the most value (see the sketch after this list).
  7. A tester could be involved in the decision to take on strategic debt, perhaps to evaluate the decision and to look for issues that threaten the expected outcome of increasing the technical debt.
  8. Debt that is related to testability should perhaps be reconsidered. As testers, we would not be able to help out as much if testability was threatened.
  9. Try to talk openly with developers about non-strategic debt (as Ted defines it), that is, areas which have poorly written code or something similar. Being open about the situation will enable you as a tester to help out more.
  10. When development starts to pay back the debt, you can work closely with the developers to decrease the introduction of new bugs and of new areas that increase the debt.
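To make point 6 concrete, here is a minimal sketch of what an entry in a shared debt list could look like; the field names are my own assumptions, not taken from any of the authors above:

from dataclasses import dataclass, field

@dataclass
class DebtItem:
    description: str   # what shortcut was taken, and where
    deliberate: bool   # strategic decision, or inadvertent mess?
    risk_notes: str    # what testers should probe while the debt remains
    payoff_work: str   # the work package needed to repay this debt
    bugs_found: list = field(default_factory=list)  # interest paid so far

debt_list = [
    DebtItem(
        description="Hard-coded retry limit in the sync module",
        deliberate=True,
        risk_notes="May fail on slow networks; worth targeted testing",
        payoff_work="Make the retry policy configurable",
    ),
]

# A tester appends every bug traced back to the shortcut, so the list
# shows the interest being paid and where a payback gives most value.
debt_list[0].bugs_found.append("Sync aborts silently at high latency")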

What would you consider adding to this list that I have left out?

References:

[1] The WyCash Portfolio Management System – http://c2.com/doc/oopsla92.html

[2] TechnicalDebt – http://martinfowler.com/bliki/TechnicalDebt.html

[3] Technical Debt Quadrant – http://martinfowler.com/bliki/TechnicalDebtQuadrant.html

[4] A mess is not a technical debt – http://blog.objectmentor.com/articles/2009/09/22/a-mess-is-not-a-technical-debt

[5] Technical Debt – http://blogs.construx.com/blogs/stevemcc/archive/2007/11/01/technical-debt-2.aspx

[6] Get out of Technical Debt Now! – http://www.media-landscape.com/yapc/2006-06-26.AndyLester/

[7] Technical Debt – Definition – http://blog.acrowire.com/technical-debt/technical-debt-part-1-definition/

[8] Technical Debt – Identification – http://blog.acrowire.com/technical-debt/technical-debt-part-2-identification/

[9] Technical Debt – Quantifying – http://blog.acrowire.com/technical-debt/technical-debt-part-3-quantifying/

[10] Technical Debt – Remediation – http://blog.acrowire.com/technical-debt/technical-debt-part-4-remediation/

[11] Technical Debt – Governance – http://blog.acrowire.com/technical-debt/technical-debt-part-5-governance/

[12] Testing Debt – http://www.kohl.ca/blog/archives/000148.html

[13] What Testers Can Do about Technical Debt – Part 1 – http://www.stickyminds.com/s.asp?F=S3629_COL_2

[14] What Testers Can Do about Technical Debt – Part 2 – http://www.stickyminds.com/s.asp?F=S3643_COL_2

[15] Turning the tide of bad testing – http://thetesteye.com/blog/2010/11/turning-the-tide-of-bad-testing/

Testing Clichés Part V: Testing needs a test coverage model Rikard Edgren

I believe there is too much focus on test coverage; there is even an axiom about the need for it.
My reason is that no coverage model captures what is important.
Cem Kaner lists 101 possible coverage models (Software Negligence and Testing Coverage), and none of them are super-good to me (my favorite is an expansion of no. 89: Potential Usage, which is impossible to measure.)
A dangerous example is coverage measured by the number of planned tests performed, which easily leads to too little exploration, and to less ambitious testing efforts.

Test coverage is about planning, precision, measuring and control, which isn’t the best match for something that can be used in a variety of ways, with different data and environments, and different needs.
Sure, you can make use of coverage models, but if you rely too much on them, you will have problems in an industry of uncertainty like software development.

The over-emphasis can be shown in the following ISTQB quote:
“Experience-based tests utilize testers’ skill and intuition, along with their experience with similar applications or technologies. These tests are effective at finding defects, but not as appropriate as other techniques to achieve specific test coverage levels or producing reusable test procedures.”
The implication: you can’t really rely on these methods; they merely find defects (and important information.)

I understand that coverage models can give confidence to the decision makers, but how often are these used in reality?
Aren’t release decisions rather made based on how you feel about the facts you are presented with? Isn’t it specific bugs that stop a release, and external factors that push a release?

If so, isn’t the focus on a coverage model sort of wasted?
And if it brings slower testing with fewer results, isn’t it something to try to get rid of?

As an alternative, I present my 95% Table:

The measurement used is anything you want it to be, and of course practically unusable.

– SO HOW ARE WE GONNA REPORT STATUS? I hear you shout.

In a different, and better way.
I’m not sure how, but I want to be close to what’s important, and far away from John von Neumann’s quote:
“There’s no sense in being precise when you don’t even know what you’re talking about.”

Tester’s Pedal Rikard Edgren

The tool you’ve been waiting for!

Function: When you push the pedal, a random input is sent to the machine, and thereby to your application.
By default, a sample of error-prone inputs is available (e.g. ASCII 30, double-click, Unicode, beep).
The nifty thing is to be able to do this rapidly, On-Demand, in unexpected situations, or when you are bored.

System Requirements: any machine with USB (Wi-Fi in 2.0)
Modes: ASCII, all characters, everything
No of inputs sent: 1 to 4,294,967,296
Logging: Temp folder on applicable systems
Limitations: no API, no warranty
Price: TBD

If you plug it in to someone else’s computer, you can take your practical jokes to a new level…
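For the curious, here is a playful sketch of what the pedal could do when pressed; everything in it is as hypothetical as the product itself:

import random

# A sample of error-prone inputs, loosely matching the default set above.
ERROR_PRONE_INPUTS = [
    "\x1e",           # ASCII 30, the record separator
    "<double-click>",
    "ファイル",        # non-ASCII Unicode text
    "\a",             # the beep (BEL) character
]

def pedal_pressed(send_input, n=1):
    # Send n random inputs through whatever hook talks to the machine.
    for _ in range(n):
        send_input(random.choice(ERROR_PRONE_INPUTS))

# With print() standing in for the real USB input hook:
pedal_pressed(print, n=3)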

The Helpful Model Henrik Emilsson

Here is a neat story I told Michael Bolton, Martin Jansson and Markus Gärtner when we were exploring the Metro in Copenhagen, during the coldest days the city had experienced since they started measuring the temperature. I promised to blog about it…

Let me begin with some background information.

A couple of years ago I worked as a tester in a project where we developed an application that did automated processing of an electronic version of a paper form.

In many ways this was an interesting system – no GUI, no user input; no visible output (if everything went OK). And the paper form was something most people in Sweden would use at least once; so it was very important that the system did things right. In fact, I used it myself just days after my assignment was over.

In short, this is what the system did:
My organization sent out a paper form regarding a matter to people, with some hard-coded text in it (e.g. names and personal identity numbers), and people were to fill out the rest of the form.
They would then send the paper form to a third-party company that used OCR to convert it into an electronic form (an xml-file). The file was then sent to my organization and went into our system, where the processing of the form content took place.
First, many checks were done on the accuracy and format of the manually entered text. If those passed, the system checked against all the laws that had to be met for the matter to be processed correctly; if those passed, a formal decision could be made, which included notifying several organizations and the people who had sent in the paper form in the first place.
If any format check or law was violated, it went to a manual handling of the matter.

The test data we used consisted of non-real people, but they still had to be unique in the national database system (so if our database had included all existing persons, our non-existing persons would have been unique persons in that sense). We only got two test persons a week, because that was what could be reserved for testing purposes. This meant that we needed to be very careful with these persons and design them with utmost precision, in order for them not to violate any rule in an unintended way and thereby be caught by the system. And vice versa: those who should be caught by a certain rule needed to be caught by just that rule and nothing else, even if an earlier check might have been introduced later in the project.
So you might understand that we put a lot of effort into the test data, making sure that it would be valid.

Anyway, this story is about what happened at the end of the project.
After we had developed all tests and they ran through the system successfully – meaning that many should be caught by the system, and many should run through all the way without being caught – it was time for us to take the tests one step outside our system. This meant printing out all the paper forms; manually writing the text that should be included; and then sending them to the third-party company, which would scan the paper forms and convert them into electronic forms that they then sent into our system. The intention was that the result would be the same as when we ran the electronic forms. Partly, we did this because the third-party company had tuned their OCR-machine to understand this new paper form, so we wanted to know how good a job they had done. All their reports said that the results were OK; and we had sent them a couple of forms and were also satisfied with the result. But we wondered if the OCR system could handle some tricky data (which obviously some of our tests were designed to be).
So we began filling out the forms according to our test cases. E.g. one format check could be to make sure that only one word was written in a field, so the test included a word with a space inside – which we then wrote as obviously as possible, so that the OCR-machine would interpret it as two words and thereby the form would be caught.
There were plenty of these format checks that we carefully tried to violate. Another example was to write in a handwriting style so hard to read that the machine would flag the field as “unreadable”. Etc.
At the end of the week, we sent all these papers with a box delivery firm to the third-party company and waited for the forms to drop into our system on Monday morning. We were all excited because this really felt like a proper production test with test data as close to the real stuff as possible.

On Monday morning, the first forms were dropping in and being processed by our system. We monitored the process by following them through the database and all the states that they ended up in. To our surprise, most of them passed all the format checks and went further into the system… We started investigating the xml-files and the scanned tif-images. The tif-images showed our paper forms and were correct; but the xml-files didn’t get the flags that we would have expected. Hmmm, strange…
Our reaction was: “What an amazing OCR-machine! How can it interpret so well!?”

We reported to the third-party company that we weren’t satisfied with the tests; and we told them that we would send them a new batch as soon as possible. We didn’t say why we were unsatisfied, because we thought that we had screwed up and not exercised the OCR-machine to the limits.

So we began the tedious work of filling out the forms again; but now even more evil than before. 🙂
Now, the OCR-machine shouldn’t stand a chance against our cruel intentions.
As we had done the previous week, all papers were shipped to them at the end of the week, and on Monday we were back, rubbing our hands, waiting for the forms to enter the system.

But the same thing happened the second time!

Now I took a tif-image and the corresponding xml-file and went to one of the business analysts. “How the hell can the machine interpret this garbage text into something as useful as what it says in the xml-file?”. We looked at it and shook our heads. We couldn’t believe that this was happening.
I went back to the system and analyzed the data some more.

Suddenly I discovered a typo in a name and instantly got suspicious. I recognized the name since I had created it when creating the test person. Something was very wrong here…
Then it hit me! The name had been pre-printed on the paper form and shouldn’t have caused any trouble for the OCR-machine given the results for the handwritten stuff. I thought to myself “It’s a human behind this!”.

I went to the business analyst again and told him about this. He said, “Damn, that’s it!”
He called the third-party company and asked to speak with the person responsible for the OCR-machine. Then he asked: “Do you know if someone has interpreted some of our paper forms manually?”
The answer dropped as a bomb:
“Well, yes. All of them. We’ve had some trouble with the machine so we had to do it manually. And I want to say that it was really hard for us; you had written in such bad handwriting that it took us so much time to process them that we thought that this was torture. And just as we thought that it couldn’t get worse, the second batch came in that was way worse than the first one. We had to sit in pairs and process them carefully and with utmost respect. But we did a hell of a job, don’t you think?”

I came to think about this when I read Jerry Weinberg’s “The Secrets of Consulting” and saw The Helpful Model:

No matter how it looks, everyone is trying to be helpful.

Test design technique name competition Rikard Edgren

When I read about the “classic” test design techniques, I don’t recognize the way I come up with test ideas.
Sure, the implicit equivalence partitioning is used pretty often, and I get happy the few times a state model is appropriate, but the testing I perform seldom has the unit/component focus that these techniques have.
Rather, I start with a What If question I believe is important, e.g. what if the Client-Server connection is lost here.
And then I use various sources to find out how this can occur, and write test ideas in an easy-to-read format:

– investigate behavior if Server goes down
– check what happens if network cable is unplugged
– can the function be performed on a really slow network (use NetLimiter)
– check behavior in our load test environment
– what if the connection is interrupted before/in beginning/at the end?
– verify ability to Cancel operation
– are error messages adequately informative? (remember Security, Secrecy…)
– how will a user feel during interruption?
– look at information in the log files

The focus is on the analysis part, to find out what we want to test.
There is a mix of “what” and “how”, non-functional is not a limiting factor, and the details are left for the test executor to work out.
I think this way of designing tests is extremely common, and yet we don’t have a name for it! Or do we?
We should have a name competition.
Vote for one of these; come up with your own suggestion; or refute the whole idea…

* Straightforward Test Design
* Common Sense Testing
* plain English Test Design
* Test Idea Generation
* One-Liner Tests
* Test Design Un-Technique
* Error Guessing
* Test Design
* Risk-based testing
* Analysis-Focused Test Design
* Test Charters
* Zxcvbnm
* Simple Test Design

– So are you promoting a test design technique that anyone can perform?
Yes, I think anyone should be able to write, and understand, these test ideas.
To do it well, you just need to understand what is important, see the whole, and realize how it is best achievable in tests.

And if you can write all test ideas with a granularity such that all interested parties can get a grip on the whole testing effort, you will get good feedback, and have a great start to your testing story.

Book Review: Exploring Requirements Rikard Edgren

Exploring Requirements: Quality Before Design is an excellent book written by Donald C. Gause and Gerald M. Weinberg.
It is primarily about requirements, but it is an excellent read for everyone involved in doing something that hasn’t been done before.

As a software tester, it highlights, and helps, my own problems with understanding all important aspects of what the product is good for.
I wish all requirements were written according to this book; it would make testing a lot easier, especially when trying to understand attributes, preferences, expectations and motivations.

They describe a human-centered requirements process, regarding both the people collaborating and the product, which will be used by people, who have feelings.
At one point I felt that there was too much measurement, but on the next page I read:
However, keep in mind what von Neumann said, “There’s no sense being precise about something if you don’t know what you’re talking about”, and don’t get bogged down in metrics.

Of specific interest to testers is the notion of Frill/”Get It If You Can”: functions (or attributes) that are desirable, but shouldn’t cost anything.
My testing interest is two-fold: first, even if these are ignored, you should watch out for violations of them, because they are (implicit) bugs.
Second, manual system testers are often good at finding out how nifty small additions would be, so testers can help raise priority of frills, or create new and better ones.

I also want to mention the User Satisfaction Test, which looks very useful, maybe even for testing status reporting??

So, with a title like this, does the book contain any hints on documenting tests in advance or not?
Yes, it does: “To be most effective, black box test construction must be done before you start designing solutions.”
1-0 to documenting test ideas early on.
Page 258 (original 1989 edition) also describes the very first test design technique:
asking ‘What if’ questions.
Pages 94-103 deal with the very first test analysis technique:
the “Mary had a little lamb” heuristic.

There’s a lot more that can be used in any situation, e.g. Context-Free Questions, and the Rule of Three: “If you can’t think of three things that might cause your great idea to fail, all that means is that you haven’t thought enough about it yet.”

The book is fluently written, and contains a lot of entertaining and illuminating stories, and a beautiful ending.
Endings like that are not necessary when merely writing a blog post.

(Un)Common Testing Insights? Rikard Edgren

Over the years, one reads quite a lot of text about software testing.
Some things have (from various sources and experiences) become clear to me, and I’m surprised when I see articles or presentations that don’t acknowledge these insights.
These “truths” are now and then implicitly disregarded:

* Requirements don’t include all important information
* Testing includes more than verifying explicit requirements
* Testing is only partly about measuring and verification
* Any tester can perform non-functional testing (to some degree)
* Non-functional testing is efficient to perform at the same time as integration/system testing
* Metrics (measurement + value) are dangerous
* Quantitative numbers omit important information from the qualitative results
* Subjectivity is something good
* Code coverage has little meaning
* Test automation is primarily for regression testing, which often isn’t the biggest part of the testing challenge
* Software development & testing is complex, and analogies from manufacturing are not good
* Testers should collaborate with developers and others
* We don’t have complete knowledge, so exploring is necessary

Correct me if I’m wrong!