Exploratory Testing vs. Scripted Testing – rich terminology (Rikard Edgren)

Exploratory Testing in its purest form is an approach that focuses on learning, evolution and freedom.
Cem Kaner’s definition is to the point: “Exploratory software testing is a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the value of her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.”

ET in real life is a collection of ways of testing, sometimes used as example implementations of the approach, sometimes used for testing that doesn't follow a script, and sometimes as a synonym for ad hoc testing (because that term has become increasingly under-rated).

Scripted testing in its purest form is an approach that focuses on precision and control. It has yet to be defined by its proponents, but a benevolent attempt could be:
“With a scripted testing approach the testing effort is controlled by designing and reviewing all test scripts in advance.
In this way the right tests are executed, they are well documented, progress towards 100% execution can be controlled, and it is easy to repeat tests when necessary.
The scripted approach is not dependent on many tester heroes, and can take advantage of many types of resources for test execution, since the intelligence in the test scripts is created by test design experts.”

Scripted testing in real life mostly means designing test scripts early and executing them later; these scripts typically have quite detailed steps and a clear expected result.

The terminology is rich, complex and sometimes confusing, since the terms can mean at least approach, style, method, activity and technique; and in reality these are so connected and intertwined that distinctions aren't always necessary or helpful.

 

So are the distinctions important?
I think they can be, especially if the words are used without details, e.g. in statements like “Exploratory Testing is the opposite of Scripted Testing” or “combining exploratory and scripted testing”.
Both statements can be true, because the first talks about the approach, and the second about methods.
By understanding the different meanings of the words it is possible to get a more nuanced debate, and to see other combinations, e.g. test scripts with an exploratory approach, or scripted approaches with elements of ad hoc testing.

The intention of the used method shows which approach you are using (inferred from Cem Kaner’s Value of Checklists… p.94):
If test scripts are used to control the testing, it is a scripted testing approach.
If test scripts are used as a baseline for further testing, it is an exploratory testing approach.

 

It would be nice to have a solution to this semantic mess, but I don't think it is feasible to always attach approach or method to Exploratory Testing and Scripted Testing (or to distinguish between upper case Exploratory and lower case exploratory).
It is extremely difficult to give life to new words, but I do have some hope in the clarification offered by testing vs. checking, and less hope for a renaissance of ad hoc testing.
A start would be if more people were aware of the different meanings and were more precise when necessary; eventually the problem will dissolve, perhaps 25 years from now.

The Quality Status Reporting Fallacy (Henrik Emilsson)

A couple of weeks ago I had a discussion with someone who claimed that testers should (and could) report on quality. In particular, he promoted the GQM approach and how it could be designed to report the quality status. When I asked how he defined quality, he pointed to ISO 9000:2000, which defines quality as “Degree to which a set of inherent (existing) characteristics fulfils requirements”.

But wait a minute!

If testers can report the current quality status based on the definition above, it means that test cases correspond to the requirements, and that bugs found are violations where the product characteristics do not satisfy the requirements. If so, then you must have requirements for which a couple of things hold true:

  • Each requirement exhibits the attributes: Correct, Feasible, Necessary, Prioritized, Unambiguous and Verifiable.
  • The set of requirements covers all aspects of people's needs.
  • The set of requirements captures all of people's expectations.
  • The set of requirements corresponds to the different values that people have.
  • The set of requirements contains all the different properties that people value.
  • The set of requirements is consistent.

(The word “people” above includes users, customers, persons, stakeholders, hidden stakeholders, etc.)
At the same time, we know that it is impossible to test everything; you cannot test exhaustively.

But assume, for the sake of argument, that all requirements were true according to the list above; and the testing was really, really extensive; and the test effort was prioritized so that all testing done was necessary and related to the values that the important stakeholders and customers cared about.
If this would be the case, then how can you compare one test case to another? How can you compare two bugs? Is it possible to compare two bugs even if you have 20 grades of severity?

We, as testers, should be subjective; we should do our best to try to put ourselves in other people’s situation; we should find out who the stakeholders are and what they value; we should try to find all problems that matter.
But we should also be careful when we try to report on these matters. Not because we haven't got any clue about the quality of the product, but because many times we report on the things we do that can be quantified, and take these as strong indicators of the quality of the product: number of bugs found, number of test cases run, bugs found per test case, severe bugs found, severe bugs found per test case per week, etc. You know the drill…

If you are using quantitative measurements, you need to figure out what they really mean and how they connect to what really should (or could) be reported.
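
To make that concrete, here is a purely hypothetical illustration of the kind of aggregation I mean; every number and weight below is invented, which is exactly the problem:

```python
# Hypothetical metrics collected during a test period (all values made up).
test_cases_run = 480
test_cases_total = 520
bugs_found = 37
severe_bugs_open = 4

# An arbitrary formula that squeezes the activity counts into one figure.
execution_progress = test_cases_run / test_cases_total   # ~0.92
bugs_per_test_case = bugs_found / test_cases_run          # ~0.08
severity_penalty = 0.05 * severe_bugs_open                 # 0.20

quality_index = 10 * execution_progress - 10 * bugs_per_test_case - severity_penalty
print(f"Quality index: {quality_index:.1f} / 10")          # -> about 8.3
```

The single figure looks precise, but it says nothing about which bugs matter, to whom, or why; the weights could just as well be anything else.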

If you think that “non-technical” people are pleased by getting a couple of digits (hidden in a graph) presented to them, it is like saying: “Since you aren't a technical person we have translated the words: Done, Not quite done, Competent, Many, Problems, Requirements, Newly divorced, Few, Fixed, Careless, Test cases, Dyslexic, Needs, Workaholic, Lines of code, Overly complex code, Special configuration, Technical debt, Demands, etc, to some numbers and concealed it all in one graph that shows an aggregate value of the quality”.

[Image: Quality is a number]

I think that it is a bit unfair to the so-called non-technical…

Instead, we should use Jerry Weinberg’s definition “Quality is value to some person” in order to realize that quality is not something easy to quantify. Quality is subjective. Quality is value. Quality relates to some person. Quality is something complex, yet it is intuitive in the eyes of the beholder.

When do you feel productive? (Rikard Edgren)

I believe that it is impossible to objectively capture important things about a software tester’s productivity.
On the other hand I don’t believe there is a big difference between feeling productive and being productive.

I feel productive when I

* test a feature that is good, but not perfect
* review specifications
* do pair testing
* am happy
* am motivated
* find interesting things in the product
* find very important defects
* report bugs
* help developers
* don’t think much

When do you feel productive?
How do you make sure you spend as much time as possible being/feeling max-productive?

Seven Categories of Requirements (Rikard Edgren)

I like to use categorizations to structure my understanding of a subject; after the simplifications are made and I think I understand it well, the structures can be ripped apart, and you end up a bit less confused by the complexity of reality.

There are many forms of requirements; these are some that a tester should look out for:

Explicit Requirements
These are the requirements found in the requirement documents. You are probably using them in your testing, making sure that they match the functionality.
This is quite a small part of software testing, as I see it.

Implicit Requirements
These are requirements that can be found by combining different requirements that are intertwined.
They could originate from general statements like “the program should never crash” or “the program should be easy to use”, which have implications for many other requirements.
They could also become very large, e.g. “support all possible input” or “support Python scripting”.
They are an effect of vague requirements, and they are a natural part of software development; it would be insane to document everything in advance. Testers can deal with this and understand what is important.

Unspoken Requirements
These are things that many users expect from a program, but they are seldom listed in the requirements document.
Typical examples are behaving in the same way as other applications on the platform, not leaving any garbage files behind after running, or being appealing to most users.

Incorrect Requirements
The writers of the requirement documents don't know everything in the world; sometimes they are wrong.
There can be small errors, e.g. inconsistencies between requirements, and huge mistakes, e.g. when they didn't understand the users' true needs.

Changing Requirements
Sometimes requirements need to be changed, which is something that testers shouldn't object to (too much). The requirements are most likely changed in order to make a better product, and that is what we are all working for. But when they are changed, or added at a late stage, it can be difficult to challenge them and test them really well, simply because you are under time pressure.
We can’t do more than our best, but that’s often enough.

Vague Requirements
I used to dislike vague requirements that were very difficult, almost impossible, to test, but now I think they are good to have.
Not that you should be vague on purpose, but quality attributes like usability, performance etc. can never be specified in detail and still capture the important thing: that customers will be more than satisfied.
It gives you a challenge as a tester; you need to use your feelings and imagination to come up with test ideas, and with results that give a positive or negative indication. You can't hide behind some numbers, and must stand up for whether you think the requirement is met or not.

Hype Requirements
These can be difficult to handle. Often they come in the shape of specifying too much detail, e.g. save settings in an XML file, just because XML is hype (this was 10 years ago, so replace with SOA or the cloud or whatever hype your company believes in right now).
They might be out-of-place, put there in order to be allowed to start the project, but they can also be important, exploiting the hype, or just being a perfect match for this specific application.
As a tester, there is often not much more to do than accept the hyped requirements, especially if they are accepted (or initiated) by the developers.
But you probably need to learn more about the hype; often there are (at least some) good things inside it.

And regardless of how well all these categories of requirements are implemented and tested, will the application be really, really, super good?

The power of a sound (Martin Jansson)

In my local food store they have this system where you scan the price tags on the food you buy, and most often you are able to pay and exit smoothly without having to stand in any long queues.

[Image: Shop Express scanner]

Some time back they must have changed the software in these scanners, because their behavior changed and the bugginess increased. The funniest bug, or feature as they themselves would most certainly call it, appears when you are finished. You then scan a finish code which sends a signal to the scanner, and then you are able to pay. When you perform this last scan you now hear a loud beep from the device; previously this beep was used when there was an error of some kind. So everyone (at least everyone I have seen do this) performs the last scan and, upon hearing the beep, leaves the queue and goes to the cashier. The cashier then tells you it is supposed to sound like that and that it is perfectly normal. This happens every time for everyone, at least once for us technocrats… but probably each time for those who are still a bit scared of technology. One of the main ideas with using this device is to minimize the effort for the cashier by letting customers check out on their own. This sound defeats that feature.

Another funny bug that has appeared quite recently is that it takes a bit longer to scan wares, I mean from less than a second to close to ten seconds per item. Pulling up a box of tomato sauce where you must scan each of the 12 cans will now take about two minutes… you just stand there continuously pressing scan… waiting and building up that hysterical laughter. The idea of Shop Express has lost a bit of its flavour; still, when it works as expected it is indeed a lot better than using the normal queue.

Are we ashamed of software testing? (And who is willing to pay for it?) (Henrik Emilsson)

Imagine that you run a software consulting shop where you take on projects for customers. The projects cover areas such as new software development, implementation of IT systems, and web site development.

Let's say that you are about to create an offer for a new project for a customer.

Do you dare to specify the proper amount of hours dedicated to software testing? Or do you feel ashamed of having to test the software before letting the customer lay their hands on it?
Do you just add a couple of hours as a separate line item, so that it doesn't look bad if someone asks about “any software testing planned”?
Do you include all the software testing hours needed in the total estimate? Or include them in the total per function?

I think that we should treat software testing as any other task that is needed in order to develop functionality, so that the hours specified per function/requirement/area cover all actions and tasks necessary to deliver finished functionality.
If you include tasks such as Design, Interaction Design, Specification, Requirement Analysis, Architecture, Coding, etc., you should also include Software Testing amongst them. And you should be proud of doing Software Testing!

By including software testing in your time estimates, you give yourself a competitive advantage. When your customer compares several offers and sees that you have included software testing and some of the competitors haven't, it is a signal that makes them wonder why the others haven't got any software testing (or why they haven't specified any). Your offer might come out as more expensive, but since you have specified the difference, it becomes obvious that they cannot just compare the price tags.

What are your thoughts on this?

Automate configuration checks while testing (Martin Jansson)

I assume you are familiar with the discussion around checks vs testing brought to you by Michael Bolton, which I agree with.

By configuration I mean settings on a unit, i.e. the settings of whatever you are testing. This can be configuration-heavy devices such as a switch, a router or similar using SNMP, applications using the registry, or applications using a database for settings storage. Those of you who are familiar with these settings are probably also familiar with setting them using a CLI.

So what do I mean by… automated configuration checks while testing?

In a configuration-heavy environment there are lots of settings in the system that you know about and that are stored in a configuration. While working with the system, the configuration sometimes changes slightly. Some tests might be to perform a certain task and, when finished, check whether some things have changed or not. You are testing in one area, but you are continuously interested in whether something has changed in specific configurations.

For instance, in an Ethernet switch you want to test around changing the MDIX and speed settings of a LAN port. At the same time you are generating traffic through the system. While doing this you wish to monitor that no other settings are changed, so you might wish to check the configurations for alarms. There might be thousands of these settings whose state you know and that should not change, or at least you know which states they could enter.

My idea is that you create automated configuration checks that poll the system for information, either continually or when triggered. The checks are context dependent, but it is fairly easy to know what context is valid in each situation when it comes to these settings. If it is too complex, you should perhaps not automate it.
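
As a rough sketch of the polling idea, assuming a hypothetical get_config() helper that returns the settings you care about as a dict (gathered over SNMP, or by parsing CLI output), a watcher could compare the live settings against a baseline snapshot:

```python
import time


def get_config() -> dict:
    """Hypothetical: fetch the settings you care about as {name: value}."""
    raise NotImplementedError  # device/SNMP/CLI specific


def watch_config(baseline: dict, interval_s: float = 5.0) -> None:
    """Poll continually and report any setting that drifts from the baseline."""
    while True:
        current = get_config()
        for name, expected in baseline.items():
            actual = current.get(name)
            if actual != expected:
                print(f"CHANGED: {name}: {expected!r} -> {actual!r}")
        time.sleep(interval_s)


# Usage: take a snapshot before you start testing, then let the watcher run
# in the background while you explore some other area.
# baseline = get_config()
# watch_config(baseline)
```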

As I see it, a check would be best suited as a unit test (or unit check, as Michael Bolton calls it). I am fond of using Python in combination with a unit test framework and Pexpect. Pexpect enables you to create wrappers around whatever you are trying to do with the system, whether it is using the CLI or doing SNMP. Pexpect then lets you check the result using regular expressions, which enables you to build in some context-dependent checks.
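
As a rough illustration, here is a minimal sketch of what such a check could look like with Python's unittest and Pexpect. The device address, prompt, login sequence, CLI command and setting names are all assumptions; a real check would use whatever your system actually exposes:

```python
import unittest

import pexpect


class AlarmConfigCheck(unittest.TestCase):
    HOST = "192.0.2.10"        # hypothetical switch address
    PROMPT = r"switch[>#] "    # assumed CLI prompt

    def setUp(self):
        # Open a CLI session; the login sequence is device specific.
        self.cli = pexpect.spawn(f"ssh admin@{self.HOST}", timeout=10, encoding="utf-8")
        self.cli.expect("assword:")
        self.cli.sendline("secret")  # assumed credentials, for illustration only
        self.cli.expect(self.PROMPT)

    def tearDown(self):
        self.cli.close()

    def test_alarm_settings_unchanged(self):
        # "show alarms config" is an assumed command; replace it with whatever
        # your device exposes for the settings you care about.
        self.cli.sendline("show alarms config")
        self.cli.expect(self.PROMPT)
        output = self.cli.before
        # Context-dependent checks expressed as regular expressions.
        self.assertRegex(output, r"link-down-alarm\s+enabled")
        self.assertRegex(output, r"temperature-threshold\s+75")
        # Nothing should have drifted into a raised alarm state.
        self.assertNotRegex(output, r"state\s+raised")


if __name__ == "__main__":
    unittest.main()
```

You could run a suite like this on demand while you explore, or trigger it from a script whenever you finish a task; the regular expressions carry the context-dependent knowledge of what the settings should look like.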

This would enable you to create tools for yourself while doing the real testing.

Growing test teams: Progress (Martin Jansson)

A lot of these ideas come from Peopleware by Tom DeMarco and Timothy Lister. As I see it, they realised that it is easier to point at things that will stop the growth than to list things that will actually create the team. Jelled teams are created when many of the factors that stop us from growing have been eliminated.

What stops the growth of a test team? I identify new things almost every day that in one way or another disrupt the team or stop it from growing.

The hunt for test progress!

When we talk about progress it is directly linked to a goal, thus progress towards a certain goal. If the goal is unclear or has been lost, the progress estimation can sometimes shift toward things that were never intended.

How do you determine progress then? When are we done testing? If our plan is fixed from the start, we might have a defined set of tests that must be run in order to say we are complete. That is, complete with respect to what we thought from the beginning. But the plan changes, no? If that is the case, the progress report is ever changing, up or down. Is it perhaps not really that interesting to focus our time on bulletproof progress estimations? Do we stop testing when time runs out or when someone says stop?

I think the test team has a harder time growing when …

  • the ability to show test progress becomes more important than the actual testing done or the information produced from it.
  • we think it is more important to get green than red in the pie chart or bar chart.
  • we avoid testing areas that might result in bugs because that might disrupt the expected weekly progress.
  • it is more important doing a test that shows progress than doing a test that might actually find bugs.
  • we avoid helping developers fix the bugs found because we need to show test progress.

Too much focus on progress will generate bad energy in the test group and therefore slow us down, as I see it.

YouTube Premiere! (Rikard Edgren)

At EuroStar 2008 I presented Testing is an Island – A Software Testing Dystopia.
Fritz shot the pictures, Henrik wrote the music, and I uploaded it on YouTube:

The accompanying paper can be found at http://www.thetesteye.com/papers/redgren_testingisanisland.doc

Exploratory test plans? (Martin Jansson)

How would a test plan for exploratory testing be constructed? I assume it would be different from a traditional test plan?

Would we use concepts such as entry/exit criteria for testing? I would never say no to a build to test, so skip the entry/exit criteria. I guess it also has to do with the role of testers. Do we act as police or are we a service?

Resources needed? Do we ever know how many testers we need? We can give a vague number for how many we would want in order to be fully comfortable, but can this ever be fully accurate? If we aim to test as much as we can in the defined period of time, we will do so with the amount of resources that we have been allocated. I guess it also has to do with how you are organized as a team and what your mission is. If the team is running several projects and tracks at the same time, it is even harder to determine how many resources are needed. Do you really want to allocate testers to a certain percentage in different projects?

What is to be tested and how? Well, do we ever know that in advance? We should be able to list tons of test ideas, but isn't that just our initial idea of what is to be done, and won't that change as soon as we sink our teeth into the first build?

Do we get the test plan approved and then use it in the project? A plan is just temporary; it will change many times for sure. Planning is better than the actual plan. No matter what project I work in, I am able to do planning incrementally, using Scrum or whatever tool is available.

I vote for treating the traditional test plan as an unnecessary artifact. Perhaps one would not want a generic plan around exploratory testing either?

How do you plan your exploratory testing? What do you focus on? What resistance do you meet from management when presenting your exploratory plans?