More thoughts on checks
Martin Jansson

Scripted testing vs exploratory testing approach

I agree with the idea of a polarization between the scripted test approach and the exploratory test approach. These approaches include how you perceive testing and a tester. Almost in the same sentence, some say that you do a bit of both scripted and exploratory testing. How you perceive testing and how you conduct testing are two different things. I believe that discussing them together, especially with the polarization example, more often than not confuses the listener.

Instead of saying that you do both scripted and exploratory testing, I think it is more fruitful to talk about testing and checking. M. Bolton and James Bach state that “A check is a component of a confirmatory approach to testing.” in the blog article Elements of Testing and Checking [1]. We might even be so bold as to say that it is the main component, but that would imply that the other parts are smaller, which might not really be true.

The Checklist

One aspect of checks could be something similar to what the co-pilot does when assisting the pilot. Think of the pilot as the explorer, reacting to input and acting based on the current context. In certain situations the co-pilot brings up checklists to help go through a situation that is too complex to remember in full detail. It is up to the pilot and co-pilot to know when the checks are relevant or not. The same situation can be found at the operating table, where several types of assisting personnel help perform a complex surgery. The use of a checklist in those cases helps avoid the easiest mistakes or misunderstandings. See The Checklist Manifesto [2] for more ideas for testing.

How does this apply to testing then? When we do pair testing there is usually one driver while the other is a bit more passive and documents. What if the driver is the pilot/explorer and the other is the co-tester handling documentation, checklists and checks? When doing collaborative test planning, group exploratory testing or other collaborative test activities, do we then split up into roles such as tester/explorer, co-tester, checker etc. to help us define what we do and focus on? Does it provide value to do so?

Types of Checks

Here is a definition of a check [1]:

a check itself has three elements:
1) It involves an observation.
2) The observation is linked to a decision rule.
3) Both the observation and the decision rule can be performed without sapience (that is, without a human brain).
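
To make the three elements concrete, here is a minimal sketch of a check expressed in code. The `Check` class and the `login_page_title` function are hypothetical names made up for this illustration; they are not taken from the article in [1] or from any tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    """A check as quoted above: an observation linked to a decision rule."""
    name: str
    observe: Callable[[], object]    # element 1: makes an observation
    rule: Callable[[object], bool]   # element 2: decision rule applied to it

    def run(self) -> bool:
        # Element 3: the answer is produced without any human judgment.
        return self.rule(self.observe())


def login_page_title() -> str:
    # Hypothetical stand-in for observing the real product.
    return "Sign in"


title_check = Check(
    name="login page title is 'Sign in'",
    observe=login_page_title,
    rule=lambda observed: observed == "Sign in",
)

print(title_check.name, "->", "ok" if title_check.run() else "nok")
```

The sketch only answers the question it was given. Whether the title is the right thing to check, and what a failure would mean, is still left to the tester.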

But could we split checks into different parts? What types of checks are there? M. Bolton describes questions that have to do with confirmation, verification, etc. Could we also break down checks into other categories such as checklists, test/quality patterns, guided checks and one-liner checks?

A checklist is a list of items that help you through a complex situation. The items in the checklist could be checks that need to be confirmed or verified to a certain extent.
A test/quality pattern is something that repeatedly gives suspicion that it might be broken so that you need to check it.
A guided check is what we previously called a scripted test. A set of steps that end with something that you wish to verify or confirm, something with a binary answer.
A one-liner check or a check idea is a brief statement of something that should be checked in a certain context that would give a binary answer, such as true/false, ok/nok or yes/no.

Checking clarifies an aspect of what testing is or is not. Do these sub-categories help us clarify what we do with the checks? Could we find more types of checks?

Test Management Tools

Based on the assumption that there are at least two types of questions and at least two different types of answers, how do we structure and manage these today? The traditional management tools for testing can often handle all the above categories for checks, but they are unable to handle the answers from testing.

But what do we mean when we say handle in this case? Well, you can structure the checks, plan when you do them and report their results. You can then make reports on the overall progress and the overall quality of the system. You can also state what build, system or setup you used in the test. But since the tools cannot really handle the testing part, the progress and the picture of quality really become obsolete.

If we look at tools and techniques that are more aligned with testing, such as SBTM, they are good at handling the testing story. They are not as good at handling checks, or at least the tools I have seen so far are not.

Can we, and do we want to, handle the results from our testing and checking in the same system? When working in small teams where there are fewer stakeholders to work for, the need for information sharing in a big system can be less important. But if you are working in an organization where you have teams in several countries and where the overall development organization can be thousands of people, it can be more important to share information in a bigger system. Still, are you actually able to share information with that many people in an efficient way just because you use an advanced management system? When a lot of people are involved in creating and sharing information, there is also a bigger chance that the actual meaning is lost. Assuming that information sharing is easy oversimplifies the situation. If the organisation is big, there is a big chance that the context changes across the organisation and that the information changes meaning along with it.

I’ve seen so many test management tools that try to solve the whole test process. By focusing on all aspects they also become quite crappy at all aspects. Picking the tools that are excellent at solving one problem can then be a lot more efficient even if you get to work with several tools. When working with test coverage, communicating your test ideas, reporting status, reporting progress and showing details on what you tested there is a big chance that one tool really cannot solve this for you. We know that we find new techniques from other disciplines that can solve a problem for us when testing. So why do many limit themselves to test case management systems?

Summary

Michael Bolton and many other great minds have explored and delved into this concept, and I think we can find more pieces that we can shed some light on. I’ve identified a few areas and there are more to come.

References

[1] Elements of Testing and Checking – http://www.developsense.com/blog/2009/09/elements-of-testing-and-checking/
[2] The Checklist Manifesto – How to Get Things Right – http://www.amazon.com/The-Checklist-Manifesto-Things-Right/dp/0805091742

Lightweight Performance Testing
Rikard Edgren

If performance is crucial for product success, you probably need pretty advanced tools to measure various aspects of your product, to find all bottlenecks and time thieves. For all other software, performance is just very important, and you might get by with lightweight test methods. You may or may not have quantified performance requirements, but you should test performance to some degree anyway; for the whole, but also for each detail (when appropriate.)

In TheTestEye’s classification, performance consists of:

Performance: Is the product fast enough?

Capacity: the many limits of the product, for different circumstances (e.g. slow network.)
Resource Utilization: appropriate usage of memory, storage and other resources.
Responsiveness: the speed at which an action is (perceived as) performed.
Availability: the system is available for use when it should be.
Throughput: the product’s ability to process many, many things.
Endurance: can the product handle load for a long time?
Feedback: is the feedback from the system on user actions appropriate?
Scalability: how well does the product scale up, out or down?

Be aware of different definitions of performance testing, e.g. some include reliability, stress handling, robustness, and what stakeholders believe is most important might differ (even when using the same words…)

Ongoing Violation Awareness

The number one lightweight method starts by finding out which of these characteristics are relevant for your product. Then keep them in the back of your head, and whenever you see something fishy, investigate further and communicate. Often the OK zone is easy to reach, but testers should notice when violations occur. When appropriate, apply the destructive principle: Increase the amount of everything that can be increased.

No Tools

Perceived performance is what matters for end users (but maybe not for a product comparison checklist), so think about how it feels, and try using a stopwatch. You might get pretty far by load testing together with colleagues, each running several instances.

Tools

There exist limiters for CPU, RAM, bandwidth etc. and many of them are free (and some of them become obsolete.) A task manager/resource utilization tool can give you hints on memory, CPU, disk, network etc. Scripting your product to run over a weekend is good for endurance and stability testing. JMeter is free and often quick to get running.
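
A scripted stopwatch can be about as lightweight as it gets. The following is a minimal sketch, assuming the action you care about can be called from Python; `time_action` and `sample_action` are hypothetical names used only for illustration.

```python
import time
from typing import Callable, List


def time_action(action: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run the action several times and return each elapsed time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return timings


def sample_action() -> int:
    # Hypothetical stand-in for a real product action.
    return sum(i * i for i in range(100_000))


timings = time_action(sample_action)
for run_number, seconds in enumerate(timings, start=1):
    print(f"run {run_number}: {seconds * 1000:.1f} ms")
```

Keeping every individual timing, rather than only an average, makes it easier to notice the one slow run that a user would actually feel.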

Summarizing

Summarizing performance test results is difficult. Aggregations of measurements don’t tell the full story, and the whole story takes a long time to tell. Communicate what is important, which is easier if you have asked stakeholders beforehand.
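
As a small illustration of why aggregations don’t tell the full story, the sketch below summarizes ten response times in three ways. The numbers are made up for the example and are not measurements from any real product.

```python
import statistics

# Made-up response times in milliseconds, with one very slow outlier.
response_times_ms = [120, 130, 125, 118, 122, 127, 119, 124, 121, 2400]

mean_ms = statistics.mean(response_times_ms)
median_ms = statistics.median(response_times_ms)
worst_ms = max(response_times_ms)

print(f"mean:   {mean_ms:.1f} ms")
print(f"median: {median_ms:.1f} ms")
print(f"worst:  {worst_ms} ms")
```

The mean suggests that everything is a bit slow, the median suggests that everything is fine, and only the worst case reveals that one user waited more than two seconds. Which of these to report depends on what the stakeholders said was important.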

Warning: For some products, users aren’t as interested in Performance as the developers…

Don’t hustle my flag!
Henrik Andersson

I’m sure you have heard it before: everyone can test, or everyone does testing. Is that so? Is that really the case? Do you test just because you use a product? Do you test just because you stumble upon a bug? Do you test just because you can write some detailed steps into a test management tool?

What meaning do we put in the word *testing*?
I know that some separate testing into unskilled testing and skilled testing. But is unskilled testing really testing? I will claim not.
To me, this is just like saying that I do surgery just because I slice up a stomach and poke around. This is not surgery, this is just what it is: “slicing up a stomach and poking around”. Still, to an untrained eye it might look like surgery, and so does a lot of the unskilled, so-called *testing*.
I do think it is hurtful for us when we, who by reputation are considered to be good testers, recognize unskilled poking around as testing. Even if we call it unskilled testing, most people will only hear and remember the word testing.
Let me draw another parallel to this. Everyone can drive, right?
Most likely everyone without any disability that hinders them can figure out how to open the door, start the engine, put in the gear, push the throttle and turn the steering wheel. So if you do all of this are you then driving?
You might be driving and you might not be driving. I think it depends on the level of awareness and consciousness. Suppose you do not know what you are doing: you push the brake, you turn the radio on at full volume, and you put the gear in park as you push the throttle to the floor. Or you turn the wheel clockwise while signaling to turn left, hoping to go straight ahead. What I’m trying to say is that it looks to me as if you are just doing random things with little awareness, or at most you are trying to figure out how to drive, but you are not driving, even if this by luck takes you to the fast food stop around the corner that you wanted to go to. Some of the testing that occurs is much like this, and I would not like to call it testing. It is something else; it is pesting. It looks like testing and can fool many, but it is just like the pest or plague to testing.
To put a non-tester in front of a program to evaluate the outcome can have value, but that is very different from putting a person who doesn’t know how to test there and then expecting testing to be done. The first is a conscious decision with a purpose. The other is ignorant and degrading.
When you use the product and find a bug, you are not testing just because you found a bug. Everyone can find a bug, since a bug is a relationship between you and the product; it is something that disturbs you.
I do not mean to say that testing always has to be done consciously; we testers treat serendipity with the highest respect and acknowledge its power. But we are aware of this, and that is the difference.
You had better be able to describe why you do this test, how you are doing it, why it is valuable and what you learned from it. I believe that test framing is crucial in testing, as is being able to tell a story of your testing. I think then we are getting closer to saying that we are testing.
So the million dollar question is then, of course: what values could we put in the word testing to give it some depth and meaning?
I do not believe we are at a point where we should define that you must know this and that and have skill #1, skill #2 and skill #3 to be testing or this is the best way to do testing.
But maybe we can attach values or emotions to the word: something that demonstrates to others that what you are doing is thoughtful, sapient action, and that others can use to value your work. I might not agree with or like your flavor of testing, but consensus on how to test is not what I’m looking for. I’m looking for values that we can use to say that if you do not embrace and apply them, we will not call it testing; it is simply not credible. I have mentioned a few, like awareness, consciousness, framing, serendipity, valuable to stakeholders, learning, evaluating. I’m not sure that these actually are relevant for this, or maybe they are?
What I’m afraid of is that someone steals the word testing from us, just like when the freaking racists in Sweden during the 90’s stole our Swedish flag and claimed it belonged to them.
Or that the word testing becomes like milk and water.
As you probably have noticed by now, I’m not giving many answers here. I’m not that far along in my own process of this, and my purpose is to open the door for your thoughts on this matter.
However, I do believe we need to raise the bar!
So when you are testing it is much like hitting the keys on a piano. Everyone can make a sound, but when does it become music?

Critique of Test Design Axioms in The Tester’s Pocketbook
Rikard Edgren

The Tester’s Pocketbook by Paul Gerrard is not a great book, but it is very good.
It covers fundamentals of software testing, and contains a ton of good ideas that will help you in your testing effort.
I also like it because it is one of few books on testing theory that focus on the human element of software, and its creation.

However, I disagree with some things, and will focus on the test design part.
These are Gerrard’s test design axioms with my verdict:

Test Model – Test Design is based on models
Yes, use many, and also look outside them. Build your own quality characteristics model.

Test Basis – Testers need sources of knowledge to select things to test
Yes, use multiple information sources to understand what is important, the list is bigger though (Purpose, Capabilities, Failure Modes, Quality Characteristics, Usage Scenarios, Creative Ideas, Models, Data, Surroundings, White-box, Public Collections, Internal Collections, Business Objectives, Information Objectives, Product Image, Product Fears, Project Risks, Rumors, Product History, Project Background, Test Artifacts, Debt, Business Knowledge, Field Information, Users, Conversations, Actual Software, Technologies, Standards, References, Competitors, Tools, Context Analysis, Legal Aspects, Many Deliverables, Searching, You.)

Oracle – Testers need sources of knowledge to evaluate actual outcomes or behaviors
Very useful, but not necessary; we can sometimes suspend judgment and communicate noteworthy information.

Coverage – Testing needs a test coverage model or models
Not needed, unless as a tool to get more test ideas, or a way to report what has been tested. Gerrard’s question “how many tests remain?” is not good. A better one is “how much test time do you guess remains?”

Prioritisation – Testing needs a mechanism for ordering tests by value
Don’t waste too much time on this; when necessary use a fast, frugal test triage.
Page 38 states “we must invite stakeholders to take an utilitarian view.” This is not true. We could equally well use a value-based system, e.g. “bought software should not crash”, or just go by feelings about what is right.

Fallibility – Our sources of knowledge are fallible and incomplete
True, but the text gets a bit too negative towards the human mind. An engaged mind can make mistakes, but also discover what is important. We can separate right from wrong, we can handle the unknown, we can make up for mistakes made.

So which axiom is missing?
Sampling & Serendipity – testing is inevitably a sampling activity, where serendipity comes to our aid.

There is also an overall problem when Gerrard states that these axioms are needed to do testing. It should be: needed to do good testing.
Nonetheless, one of the better testing books out there!

Bug Title Crash Course
Rikard Edgren

If you want to seriously improve your bug reporting skills, read up on, or take, the BBST Bug Advocacy course.
If you want to start by improving bug report title/subject/summary, read Lessons Learned in Software Testing, no. 83, or this blog post.

Many people will only read the title, so it is important to make it possible to
* understand how the problem appears
* understand limitations and dependencies
* understand the consequences of the bug

Some people will make their first decision on fix/don’t fix, based solely on the title.
(And for those that look carefully at the report, the title will guide their thinking.)
The goal is to make it possible to understand the bug, and how important it is, just by reading the title.

A few tips

* As short as possible, but no shorter than that.
* Start the sentence with the most important part, to capture the reader’s interest.
* Don’t overdo “externalization” to capture interest; rather describe the dry facts, and let the readers draw conclusions.
* If it’s difficult, try many times, or write the title after everything else is done (you might find the right words on the way.)
* Include a brief description of what happens.
* Include where the problem happens.
* Describe observations rather than (presumed) facts.
* Use a fair description; don’t exaggerate or understate the consequence of the problem.

I also have a tiny, controversial one; start the title with lower case for the first word.
Most testers think they are writing a sentence in a story, and start with upper case (and end with a redundant full stop.) The problem I see with this is that you lose the ability to use upper case for names and terminology. “Exit” refers to an element in the software, “exit” refers to the action, which can be done in several ways. This is nitpicking, yeah, but it’s what I think.

You can’t include everything in the title, so use what’s most important.
The best way to learn this comes as no surprise: practice a lot.

Some Good ISTQB Definitions
Rikard Edgren

While sifting and sorting the ISTQB Glossary 2.1 I finally found a couple of terms whose definitions were both correct and useful:

1. deliverable: Any (work) product that must be delivered to someone other than the (work) product’s author.
Good, because it puts focus on the fact that you are creating the deliverable so it can be useful for someone else.

2. user-based quality: A view of quality, wherein quality is the capacity to satisfy needs, wants and desires of the user(s). A product or service that does not fulfill user needs is unlikely to find any users. This is a context dependent, contingent approach to quality since different business characteristics require different qualities of a product.
This is one of five ISTQB definitions of quality that are worth knowing about. This one is also correctly described.

3. walkthrough: A step-by-step presentation by the author of a document in order to gather information and to establish a common understanding of its content.
A sometimes useful practice, and a definition including the magic “common understanding”.

So if someone says that all ISTQB definitions are wrong, I can confidently say I disagree.

Imaginary Dead horse heuristic
Martin Jansson

A while ago I was going to a customer meeting to hold a workshop in SBTM, showing that testing could be managed in a different way. I feel fairly experienced in the way I work as a tester, but at a parking lot my humility and confidence turned over.

I was standing next in line to pay for a parking ticket. The middle-aged man in front of me was having trouble getting a ticket. He thought there might be something wrong with his credit card, so I offered to try my card to test whether it was the card or whether the machine was in fact broken. There were several people behind me. After noticing that my card, which I knew had worked earlier the same morning, didn’t work either, we concluded as a group that the parking machine was broken. There was another machine a bit further away that seemed to work, judging by the people using it, who were paying with their cards and getting parking tickets, so all of us went there.

On the way back to my car I walked past what we had called the broken parking machine, and two very old ladies were trying to use it. I told the ladies that the machine was broken and they responded, “Yes, but how broken is it?” I realized my blindness and my narrow definition of broken. Then the ladies said, “We are going to test to see if coins work.” I rarely have any coins, or paper money for that matter, so I had narrowed the test scope based on my own context of use: what I could do at that time.

I thought I had found a dead horse, but it was in fact an imaginary one.

Announcing 37 Sources for Test Ideas
the test eye

Download!

It is often stated, and rightly so, that you should use many different information sources in order to come up with good test ideas.
Rob Sabourin uses 10 categories, HICCUPPS(F) can be used not only as oracles, and Cem Kaner has many examples in various presentations and tutorials.

We decided to make our own list with sources we find useful, and describe them so they can be useful for others.
We finally ended up with 37 Sources for Test Ideas; have a look and see if they can inspire your quest for what is important.

Comments, additions, experiences are more than welcome!

/Rikard, Martin and Henrik

Many Models – Better Test Ideas
Rikard Edgren

Henrik Emilsson has convinced me that skilled software testing is based on invisible mental models that help us see what can be tested.
If we can make these visible, we can sharpen our skills, and also teach testing more effectively.
Here follows a simple example I used in class, that shows that by switching between several models you will get different test ideas.
If you want to learn by experiencing, download Perfect Age Calculator application, and test for ten minutes or so.

Example 1 – The User Interface

OK, you’re casually reading a testing blog, that’s fine.
This is the User Interface, and the way you see it is your very first model, giving you (at least) “click these” test ideas.

Example 2 – Timeline

I guess you immediately tested some familiar dates, e.g. your own birthday, to see if it handles normal data (Benevolent Start Heuristic).
Many of you checked today’s date and a date in the future.
These tests are probably based on your model of time, which can take many forms, but is probably something like a timeline (which can be very different for different people!)

Example 3 – Detailed Timeline

The time model can be more elaborate, e.g. including mental notes of leap years and other special dates. This model, the previous one and the next one are visualizations of your equivalence partitioning.
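
A detailed timeline model can be written down as a handful of candidate dates, one or two per partition. The sketch below is only an illustration of that idea; the chosen dates and the `is_valid_date` helper are assumptions for the example and have nothing to do with how Perfect Age Calculator is implemented.

```python
import calendar
from datetime import date


def is_valid_date(year: int, month: int, day: int) -> bool:
    """Return True if the three parts form a real calendar date."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


# Hypothetical candidate birth dates, one or two per partition.
candidates = [
    (2012, 2, 29),   # leap day in a leap year
    (2011, 2, 29),   # leap day in a common year
    (2000, 2, 29),   # century year divisible by 400: leap year
    (1900, 2, 29),   # century year not divisible by 400: common year
    (2011, 12, 31),  # last day of a year
    (2012, 1, 1),    # first day of a year
]

for year, month, day in candidates:
    kind = "leap year" if calendar.isleap(year) else "common year"
    validity = "valid" if is_valid_date(year, month, day) else "invalid"
    print(f"{year:04d}-{month:02d}-{day:02d}: {kind}, {validity}")
```

The list is a sample, not a complete set; making it visible mostly helps you see which partitions you have not thought about yet.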

Example 4 – Year, Month, Day

Some switched to test ideas based on the date being three parts: Year, Month, Day.
This (more easily) rendered test ideas like same year, month, date; and invalid entries for each of these three. You were also more alert to problems with calculation of each of these three.

Example 5 – Technical Flow

Some of you used a technical flow model, reversing what the code is doing. Depending on your knowledge of data types this generated test ideas challenging the boundaries, perhaps by using “den 5 februari 2012” on a Swedish machine.

Example 6 – Quality Characteristics

Some testers always apply a characteristics model. Personally I use an instinctive sub-set of thetesteye’s list, which in this case gave test ideas for intuitiveness, professionalism, memory consumption etc.

Using models

There could be variations of these, and many more as well (keyboard/mouse interactions, surroundings, competitors, familiar problems…); and I encourage you to find these in yourself, so you can make even better use of them.
(For a complex product, there will be hundreds of possible models, and a good way to find some of them is to combine your knowledge with SFDPOT from James Bach’s Heuristic Test Strategy Model.)

The results from these tests on Perfect Age Calculator show a lot of problems, seemingly too many for such a simple program.
When this happens, you should question your models (I Might Be Wrong Heuristic); I started to wonder if there might be some other kind of calculation they try to make.
But Sourceforge project information reads: “This tool is very useful for official use like office, schools, institutes etc.”
So my conclusion is that the piece of software doesn’t live up to its claims.

You don’t need to explicitly visualize your many mental models while testing, as a matter of fact, you shouldn’t, because it would take too much time.
Most of you reading this blog probably do it instinctively, and in different ways (hence the diversity testing embraces.)
But you might need to make the models explicit a couple of times, to train your testing brain to think in many different models, to get a richer set of test ideas. And by sharing our invisible models, I think we can get even better at testing.

The even more important question, about which of these tests are important, well that’s another story…

Some Nifty Windows Tools
Rikard Edgren

Here are some small, free, nifty tools I use now and then:

FreeMind – to model and communicate
WinMerge – to diff or merge files or folders
Process Hacker – to monitor resource usage
Process Monitor – to monitor registry and disk activities
InCtrl5 – for installation testing (what happened to Install Analyzer??)
Fiddler/Wireshark – to see network traffic details
Firebug – to see and edit details in Firefox
Xenu’s Link Sleuth – to find broken links
Zed Attack Proxy – lightweight security testing
NetLimiter – to simulate low bandwidth
Om/Perlclip – put testdata in Clipboard
Rapid Reporter – to document your testing
Color Oracle – to simulate color blindness
Console – a better command prompt

Additions are welcome!