Set All Testers Free! Rikard Edgren

I have entered EuroSTAR’s VideoSTAR competition for 2011; the main reason might be that I want more people to see the excellent introduction movie Mårten and Henrik made for me a couple of years ago.

My title is “Set All Testers Free!”, and I can’t say I have the details set for a talk, but it will be about external freedom (allowed to test anything, your own control of deviations and environments) and internal freedom (liberate the minds of the testers, think bigger!)
We shouldn’t think or act like machines, since software is made for humans, by humans.

There are a lot of other interesting testing videos (especially the one on serendipity), so go there, have a look, and cast your vote, if you want to.

EuroSTAR VideoSTAR

Lightweight Reliability Testing Rikard Edgren

The big drawback and big advantage with reliability testing is that it is easiest and most effective to perform together with other testing. A separate automated reliability regression test suite could cost an awful lot to implement, but reliability in your spine when performing any type of manual test, together with deviations, is cheap, interesting, and powerful.

If you look at Reliability from a standards perspective, you will see a lot of measurement methods like Mean Time Between Failures. You don’t need to use these. You can test and find important information anyway.
The most lightweight method is to ask heavy users of the product these questions:

Reliability. Does the product work well all the time?

Stability. Are you experiencing (un)reproducible crashes?
Robustness. Are there any parts of the product that are fragile and have problems with mis-configurations or corner cases?
Recoverability. Is it possible/easy to recover after (provoked) fatal errors?
Resource Usage. What does the CPU, RAM, disk drive, etc. usage look like?
Data Integrity. Are all sorts of data kept intact in the system?
Safety. Is it possible to destroy something by (mis)usage of the system?
Disaster Recovery. What if something really, really bad happens?
Trustworthiness. Do you feel you can trust the system?

You may also want to perform some specific tests aiming at the different sub-categories.

Stability. Run the product for a long time, without restarts.
Automate a simplistic scenario and run it thousands of times in a sequence.
Count the number of non-reproducible crashes per day that happen to your team.
Try really hard to reproduce the non-reproducible issues.
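The “automate a simplistic scenario and run it thousands of times in a sequence” idea above can be sketched as a small harness. This is a minimal sketch, not a full suite; `run_scenario` is a hypothetical placeholder for whatever actually drives your product:

```python
import time

def run_scenario():
    """Stand-in for one pass through a simplistic product scenario.
    In a real harness this would drive the product under test
    (launch it, perform the steps, check it is still responsive)."""
    return "hello".upper() == "HELLO"  # trivially-true placeholder step

def stability_run(iterations):
    """Run the scenario many times in sequence and record failures,
    a lightweight stand-in for a long-running stability test."""
    failures = []
    start = time.time()
    for i in range(iterations):
        if not run_scenario():
            # record which iteration failed and how far into the run
            failures.append((i, time.time() - start))
    return failures

failures = stability_run(1000)
print(f"{len(failures)} failures in 1000 iterations")
```

The interesting output is not pass/fail but the pattern: failures that only appear after many iterations, or late in the run, point at leaks and accumulating state.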

Robustness. Provoking error messages is fun; don’t forget to check the spelling and whether the error messages are helpful.
When this is important: hit hard, hit many times.

Recoverability. Turn off the power for machines performing important things, restart and look at behavior.
Whenever an error occurs, try to recover, and consider if it is easy and intuitive.

Resource Usage. Look at system resources now and then.
Stress the system in various ways (but only spend time on this if the project is interested in results.)
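Looking at system resources “now and then” can be as simple as snapshotting your test process before and after a session and comparing. A minimal sketch, assuming a Unix-like system and using only the standard-library `resource` module (on Linux `ru_maxrss` is in kilobytes; on macOS it is bytes):

```python
import resource

def sample_resources(label=""):
    """Snapshot this process's resource usage for later comparison.
    Peak RSS that keeps climbing across otherwise identical test
    passes is a classic hint of a memory leak."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "label": label,
        "peak_rss_kb": usage.ru_maxrss,   # peak resident set size
        "user_cpu_s": usage.ru_utime,     # CPU seconds spent in user mode
    }

baseline = sample_resources("before")
work = [bytes(1000) for _ in range(1000)]  # stand-in for real product work
after = sample_resources("after")
print("peak grew or held:", after["peak_rss_kb"] >= baseline["peak_rss_kb"])
```

For a product running in a separate process you would sample the same numbers from the outside (e.g. via `ps` or `/proc`), but the before/after comparison idea is identical.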

Data Integrity. Use all types of data (numeric, strings, out-of-range, invalid, empty, Unicode), in different sizes (small, medium, large) on different systems (localized OS, regional settings, different fonts) through all parts of the system.
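The data-integrity list above lends itself to a reusable catalogue of values that you push through every part of the system and check that they come out intact. A sketch, with the specific values as illustrative picks, not an exhaustive set:

```python
def integrity_test_values():
    """A small catalogue of data to push through the whole system:
    numeric, strings, out-of-range, invalid, empty, Unicode,
    in different sizes. Store each value, read it back, compare."""
    return {
        "empty": "",
        "small_string": "a",
        "large_string": "x" * 100_000,
        "unicode": "åäö 日本語 Ünïcødé",
        "numeric": 42,
        "out_of_range": 2**63,          # just past a signed 64-bit integer
        "invalid_number": "12,34.56",   # ambiguous locale formatting
        "negative_zero": -0.0,
    }

for name, value in integrity_test_values().items():
    print(name, repr(value)[:40])  # round-trip each through your system instead
```

The payoff comes from combining this catalogue with the environment axis in the same sentence: the same values on a localized OS, with different regional settings, through every storage and transfer step.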

Safety. Do thorough brainstorming around scenarios where people can get hurt. Be aware that ambiguous or missing information can be very dangerous if it affects important decisions.

Disaster Recovery. You probably don’t want to test this for real. But you can ask developers or others if there are possibilities of continuing using the software after a crucial machine has disappeared. This is one of those characteristics that either is irrelevant, or very important.

Trustworthiness. Note down all inconsistencies in behavior, or moments when you are unsure what the product is up to.
Tell the project how you feel about the product’s reliability.

The best start to get reliable software is to have really solid code, and properly customized code checker tools can help you with this.

I guess the list could be a bit longer and still be lightweight; feel free to help me out!

Background Complexity and Do One More Thing Heuristics Rikard Edgren

I spend a lot of time testing new features for the next release.
I actively try not to test the features in isolation, and not to use the easiest data and environment.
One example of this is that I often use “documents” that are more complex than necessary, that include elements and strange things that aren’t obviously related to the functionality in focus.
E.g., if I were to test text coloring in Word, I would do it on a fully-fledged book with different types of footnotes, a lot of formatting, images, and more.
This doesn’t cost much time, but now and then it exposes weaknesses, either in the new functionality or in the old product.
I guess many do this, and I recommend it to anyone who has the freedom or guts to control their test environment.
Maybe the name Background Complexity Heuristic can help you remember it.

When a test is “completed”, I like to do something more with the product, preferably something error-prone, popular or the next thing a user might do, not necessarily related to the new feature.
I don’t think too much, rather just press F1 if I can’t think of anything better, since I don’t want this extra-testing to take too much time.
I call this the Do One More Thing Heuristic.
It helps you learn, and find problems.

For both of these tricks, it might take some time to pinpoint problems, but the alternative might be not knowing about them at all.

A symptomatic ISTQB definition Rikard Edgren

There are some discussions about current certification schemes, but there are not so many attacks on, or defenses of, the actual content.
This is from ISTQB Glossary 2.1:

black box test design technique: Procedure to derive and/or select test cases based on an analysis of the specification, either functional or non-functional, of a component or system without reference to its internal structure.

A first example of the narrowness I dislike is “test cases”.
Why must test design have a set of instructions and expected results?
I think test design can have many forms: detailed, visual, one-liners, tables, un-documented, charters, and it is not good to steer testers towards a limiting format that in my opinion is seldom appropriate (because it stifles serendipity, promotes confirmation bias, is cumbersome to review, and is time-consuming to write and maintain.)

“Analysis” might be the most under-estimated and un-elaborated area in software testing.
In the ISTQB Foundation its allotted time is about 2 minutes, and I haven’t seen anything interesting on this in the Advanced or Expert syllabi either.

Maybe it is because the test basis only consists of “specification”. I know it happens that tests only stem from specifications, but I can’t understand why.
Don’t we want to find out how the system really behaves?
Do we genuinely believe that the writers captured everything that might be important?
Are we consciously neglecting everything we learn throughout the development project?
Requirements are a good start, but there is a lot more to look at.

Have they written “derive and/or select” to make sure that no creativity and new ideas appear in test design?

That the definition reads “the specification” is a symptom of the un-holistic world view that each function/feature should be tested in isolation.

At least there is mention of “non-functional”, but I don’t want to detail my critique on their view on this (I think it should be done together with other testing, that it doesn’t have to be done by experts, that it is OK that it isn’t measurable in quantitative format, that testers should have a broader view and knowledge.)

I haven’t taken the ISTQB training; with the right teacher it might be great, especially for newcomers who want a glimpse of the many aspects of what software testing is about.
But it is a pity that the content is so meek, bleak, weak.

Observation and interpretation by proxies Henrik Emilsson

If you haven’t done it before, have a look at the Software Quality Characteristics that we published last year: TheTestEye – Software Quality Characteristics

You can probably imagine ways of testing for all of these quality characteristics yourself, and you might even come up with good oracles that can assist you in the interpretation of the test results. However, the Charisma characteristics might be the most subjective ones; and it might also be a case where it is really important to find out, for each quality characteristic respectively, which stakeholder’s values matter the most.

So how do you test for a product’s Charisma?

Charisma. Does the product have “it”?

  • Satisfaction: how does it feel after using the product?
  • Professionalism: does the product have the appropriate flair of professionalism and feel fit for purpose?
  • Attractiveness: are all types of aspects of the product “good-looking”?
  • Curiosity: will users get interested and try out what they can do with the product?
  • Entrancement: do users get hooked, have fun, in a flow, and fully engaged when using the product?
  • Hype: does the product use too much or too little of the latest and greatest technologies/ideas?
  • Expectancy: the product exceeds expectations and meets the needs you didn’t know you had.
  • Attitude: do the product and its information have the right attitude and speak to you with the right language and style?
  • Directness: are (first) impressions impressive?
  • Story: are there compelling stories about the product’s inception, construction or usage?

All of these are somewhat intuitive, and I guess that you, consciously or unconsciously, test for these every day in your project. You can also focus on testing these in the same fashion as when testing for other quality characteristics.
However, if (some of) these software quality characteristics really matter in your project and are core values for your product, you had better question your own capability to test them.
By questioning yourself, you might realize that your oracles aren’t powerful enough to really answer the question “Is there a problem here?”.
You might be too biased; or you might not be able to really understand what the stakeholders need or value. Or you might simply be too unimportant to even “have opinions”…

One way of addressing this is by letting a proxy user act as observer and interpreter of the test result. This can be a good approach when testing for software quality characteristics where you suspect that your own oracles won’t be enough to understand what is important. Well, this isn’t news for many of you, since user testing has been used successfully for years. But my point is that you can spend some extra thought on whom you select as your proxy oracle – with a mission to test for those selected Charisma values.

You can do this kind of testing by letting a proxy sit next to you while you test, or by letting users test the program themselves and interviewing them afterwards. Another way can be to test or demo in front of an audience of proxies that observe and interpret the results as you go.

The tests could be designed as open questions that you would otherwise have asked the program and watched it respond to; instead, you ask the questions of the proxies and let them answer according to how they interpret the result.

There are no testers that are the best Martin Jansson

I recently had a discussion with Henrik Andersson on Twitter regarding some consultancies being, or claiming to be, the best at testing. Here is the initial conversation:

Henrik: Dear consultant companies why are you calling yourself consultant when all you talk about are resources and invoiced hours. Shame on you!

Henrik: Many “consultant” companies claim to be “best at test”. You only train your testers in ISTQB and maybe a TMap book. That can’t be the best!

Henrik: All you companies that claim to be the best. I suggest you to have a show down to actually demonstrate if any of you are any good at all.

Martin: I take you on that challenge!

Henrik: are you one of those companies that claim to be “the best”? I have never heard you say that. But I’m always up for a game 🙂

Martin: No, to claim that is ignorant. But we are good at test related tasks. How do we measure ourselves then?

Henrik: you measure yourself by sharing your knowledge, demonstrate your skills, discussions with peers, trying new stuff, failing.

Martin: Yes, I know. But can you really compare?

Henrik: you can sort out those who sucks at it. but more important is not comparing but gaining respect and recognition from peers

Henrik: for me it is not about having one winner. it is about contributing to do my part in developing our community & learn from it

Martin: If you are good it is harder to distinguish who is the better as tester. I agree with you regarding the community etc.

Martin: I mean, would you compare which information was the most valuable between two who competed? I think it is hard.

Martin: Still, I see some testers who I think are really great while others are just good. What criteria do I have then?

There are many testers of great renown who talk a lot about testing. I’ve seen a handful of them show their skill. So they fit in with the tester Henrik mentions:

– by sharing your knowledge

– demonstrate your skills

– discussions with peers

– trying new stuff

– failing

A great tester might be an introvert who does not discuss with his/her peers or share his/her knowledge. Is it even required to demonstrate your skills? Are we talking about a renowned tester here or an unknown tester who is just great in secret? Still, in order to be hired based on your reputation or renown you might need to fulfill the criteria that Henrik lists above. A great tester is only great in certain contexts; finding someone who is great in all contexts would be very rare. Being a generalist and general systems follower would mean that you could adapt to most situations, but would you be “great” in all of them? A specialist in one domain with great testing skills might be better than the generalist? Highly likely.

One criterion that I think is important is that a tester should be different from the next tester. This does not really apply to the tester who goes solo or prefers working alone. If we instead consider having a tester in a group who is really great at some aspect of testing: he/she is not great at everything and will not be able to handle every situation, but as a group they are well equipped to handle many, many contexts.

In my test team I have many great testers, as I see it:

– All have a high focus on what is valuable to the stakeholders.

– One loves everything that is complex and hard to understand, digging in to see what it is made of.

– One is totally unafraid, young and goes paths senior people would not go.

– One is extremely creative and comes up with tests that break everything in exciting new ways.

– One has an affinity for making everyone else feel better, thus making everyone work a bit better as a group.

– Some I’ve worked with a long time, and they probably know things that I would miss, thus covering for me and I for them.

– Some have the expertise in the current domain.

– Some have similar background and vocabulary with each other.

– Some are new to the group.

– Some are men and some are women.

– Some have excellent programmer skills.

– Some have experience from other roles in product development.

– Some are used to talking to the customer.

– Some are used to leading the group.

– Some are natural leaders.

– Some have worked together for nearly 15 years and still like it.

– Some have known each other for more than 30 years and are still best friends.

If you look at each one as an individual you would see one side, but if you look at the team and what they can accomplish you see greatness. Alone they would not be able to handle every situation in testing, but as a team they have a better chance.

This view is supported by the one that Cem Kaner writes about in his “Recruiting software testers” [1]. For a specific company he identifies a team that he would think ideal. He lists it like this:

  • Senior tester or test manager with experience in business operations or human resources. This person has worn the shoes of the customer for this system. For a vertical application, I think this is essential.
  • Senior tester or test manager with strong test planning skills. If this is the test manager, she needs excellent mentoring skills, because she won’t have time to write the test documentation unless she is an individual contributor.
  • Test automation hotshot, willing to serve as the group’s tool builder.
  • Talented exploratory / intuitive tester, someone who is really good at finding bugs by playing with the product.
  • Network administrator. This person has the dual role of helping the other testers set up and deal with the ever-changing configurations that they have to test under, and designing configuration tests to determine whether the product will run on most of the systems in use by the product’s customers.
  • Attorney who is willing to wander through the various statutes and regulations looking for rules that the program must cover.

When you assemble a test team in product development, do you then consider the above traits and properties? So, back to the question that Henrik asked initially about consultancies stating they are the best at testing. I am sure they have good testers as individuals, but can they match a team that is handpicked to complement each other? I hardly think so.

When project managers need more resources for the project, especially testers, do they then ask “I want a tester” and “he/she must be ISTQB certified”? Being certified in testing is a story of its own, which I think is disturbing.

First, we really need to consider how they came to the conclusion that they need X number of testers. Was that the budget talking, or did someone estimate how many you “exactly” need? Ok, let’s assume it was budget, to simplify the discussion somewhat. How do you then go about assembling the test group? Do you consider the things that I listed above, or do you just want a tester, with no interest in what they know and how they would fit in with the rest of the group? Is it not a priority whether they complement each other?

– What if you placed a great tester (who speaks only English) in a group of other testers who do not know English at all?

– What if you placed a collection of great testers together, where each came from a competing consultancy?

– What if you only had introverts as testers in the group?

– What if you only had extroverts as testers in the group?

I often see requirements in tester ads for domain knowledge rather than for knowing anything about testing. I think domain knowledge is important, but you probably need at least one person with it in the test group. There are so many other characteristics and backgrounds you want from people in the group.

So, my conclusion is that being a great tester is very context-dependent and you cannot say that you are the best.

References:

[1] Recruiting software testers – http://www.kaner.com/pdfs/JobsRev6.pdf

My First Ambitious Test Project Rikard Edgren

We all test as children; we are curious and want to find things out, before we are one year old we want to break things, and after three we ask “What If” questions.
My first ambitious testing journey came many years later.

Still a naive teenager, I started studying Philosophy at the university. It was an easy choice; philosophy was the subject where you could find the most essential truths.
It took me one and a half years of testing philosophical theories to find out I was wrong.

The classes went through the history of philosophy, and each theory was heavily criticized by me (and a classmate), not because I was a stubborn skeptic, but because I was an objectivist looking for real truth.
I met some very interesting philosophers (Heraclitus, Spinoza and Wittgenstein being favorites), but the theories did not stand the test; there were flawed arguments and logic holes alongside incorrect or unjustified assumptions.
I was disappointed, and finally realized everything is grounded in assumptions you can choose to agree with or not. (I’m OK with this now, I even see it as the essence of the charm of life.)
It was not wasted time, this massive falsification was a hard school in logical and artificial thinking, which are key skills for software testers.

A couple of years later I found the best philosophical theory in Kierkegaard (don’t stumble on his Christian dilemmas), who said that “truth is subjectivity”.
He did not state that the objective science was wrong, just that they didn’t grasp what is important.
To be human is to be subjective.

So whenever I rant about traditional testing theory, or coverage, or metrics, or expected results, I don’t say that they are wrong; they just don’t capture what is important.
If software is made for humans, testers should use their subjectivity.

fast and frugal tree for product importance Rikard Edgren

Software testing is difficult because there are so many possibilities; not only all functions and their interactions and attributes, but also possible users’ data, needs, environments and feelings.

Good software testing needs to deliberately sample, and understand what is important.
This understanding, which evolves over time, gives testers an intuition about which tests to run, which spurious behavior to investigate, and which bugs need emphasis.

Gerd Gigerenzer describes in Adaptive Thinking his theory (that originates from Herbert Simon) that intuition is based on heuristics, that we are using fast and frugal decision trees in order to make good judgments in a complex world, with limited information.

This is my decision tree for judging software importance:

Fast Frugal Importance Judgment

Use an expansive definition of user if you want to capture issues like Supportability, Testability, Maintainability.
Or even better, write your own tree, and notice the differences in your project, differences you might benefit from.
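The structure of such a tree is easy to sketch in code: a fixed sequence of cheap yes/no cues, each with a possible early exit. The cues below are illustrative stand-ins, not the questions from the image above; the point is the shape, which you can fill with your own project’s cues:

```python
def importance_judgment(bug):
    """Illustrative fast-and-frugal tree in Gigerenzer's sense:
    cues are checked in a fixed order, and each level can trigger
    an immediate decision instead of weighing all factors.
    The cue names below are hypothetical examples."""
    if bug.get("blocks_common_task", False):
        return "important"            # early exit: no further cues needed
    if not bug.get("visible_to_users", False):
        return "not important"        # users never see it
    if bug.get("workaround_exists", False):
        return "not important"        # visible, but easily sidestepped
    return "important"                # visible and no workaround

verdict = importance_judgment({
    "blocks_common_task": False,
    "visible_to_users": True,
    "workaround_exists": False,
})
print(verdict)
```

Note what makes it frugal: with limited information (missing cues default conservatively) the tree still produces a judgment, which is exactly why it works as a model of tester intuition.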

Multiple Information Sources Rikard Edgren

When I wrote the blog post The Complete List of Testing Inspiration, I didn’t think so much about many testing efforts being totally based on requirements and specifications.
I took for granted that we know that requirements are incomplete and wrong, and that we should learn from many places.
But when reading a good book such as Lee Copeland’s A Practitioner’s Guide to Software Test Design you see that the classic black box testing procedure is to base your test design solely on requirements and specifications.
It’s the same as with code coverage models, the most interesting stuff is what isn’t there.

Maybe we need to sharpen the argumentation for this, so here are my first attempts:

1. What problem is testing trying to solve?
I think there are more situations like this:
“We know that there will be defects in the product we are building. We need your help to make sure that the released product can be used by customers without problems, and satisfactorily help them with their tasks.”
than like this:
“We want to make sure that the actual product meets the statements in the requirements document.”

2. Requirements checklist of one
Did an omnipotent write the requirements?
(We know more when testing than when requirements are written; testers look at other details; testers also test the requirements, how things turned out.)

3. The intuitive argument
A visual representation, like the software potato: an image showing that requirements cover less than what’s important, which is less than everything.

4. Requirement realism
Requirements aren’t explored, but if all were written according to Exploring Requirements by Gause and Weinberg, we would probably be in a better position. We would not only know what needs to be in the product, we would also know about Preferences, Constraints, and “Want it without Cost”.

One argument should suffice, but which one, and how can it be polished?

Emerging Topics at CAST 2011 Henrik Emilsson

Even if most of the program is set, there is still a chance for you to talk at CAST 2011. Matt Heusser and Pete Walen are running the Emerging Topics session:

If you would like to speak at CAST 2011, you still have the option of proposing a twenty-minute emerging topics session. (Emerging topics are anything you feel like talking about after the more formal conference program was decided.) Talks should generally be twenty minutes in length, with at least five minutes planned for discussion. If you want more time, please make your case and reasons clear, as that means other people don’t get to speak and we would need to make a special exception.

Read more at: http://xndev.blogspot.com/2011/02/casting-wide-net.html

Be there or be square! 🙂