<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>thoughts from the test eye &#187; metrics</title>
	<atom:link href="http://thetesteye.com/blog/tag/metrics/feed/" rel="self" type="application/rss+xml" />
	<link>http://thetesteye.com/blog</link>
	<description>by rikard edgren, henrik emilsson and martin jansson - with torbjörn ryber and henrik andersson</description>
	<lastBuildDate>Sun, 13 May 2012 17:27:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>The Metrics Tumour</title>
		<link>http://thetesteye.com/blog/2011/08/the-metrics-tumour/</link>
		<comments>http://thetesteye.com/blog/2011/08/the-metrics-tumour/#comments</comments>
		<pubDate>Wed, 03 Aug 2011 06:28:50 +0000</pubDate>
		<dc:creator>Rikard Edgren</dc:creator>
				<category><![CDATA[Ideas]]></category>
		<category><![CDATA[binary disease]]></category>
		<category><![CDATA[metrics]]></category>

		<guid isPermaLink="false">http://thetesteye.com/blog/?p=2173</guid>
		<description><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/ideas.png" width="48" height="48" alt="" title="Ideas" /><br/>quantitative numbers in a world of qualitative feelings I am not against measurements in general, they can surely be useful. I use length when building things, weight for baking, time for appointments etc. I often use numbers for various things in my bug reports. But metrics are something different; metrics are measurement plus value. &#8220;Should [...]]]></description>
			<content:encoded><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/ideas.png" width="48" height="48" alt="" title="Ideas" /><br/><blockquote><p><em>quantitative numbers in a world of qualitative feelings</em></p></blockquote>
<p>I am not against measurements in general; they can surely be useful. I use length when building things, weight for baking, time for appointments, etc.<br />
I often use numbers for various things in my bug reports.</p>
<p>But metrics are something different; metrics are measurement plus value.<br />
&#8220;<em>Should have at least 80% code coverage on unit tests</em>&#8221; is a typical example, where &#8220;<em>peer reviewed and accepted</em>&#8221; would give better results.<br />
&#8220;<em>2% better defect detection percentage since last release!</em>&#8221; says less than conversations with the support department.<br />
&#8220;<em>95% Pass on test cases</em>&#8221; means nothing at all.</p>
<p>The measurements with value cannot judge what is important;<br />
reality is impossible to aggregate;<br />
metrics are dangerous.</p>
<p>There are many good software testers who advocate metrics. On a couple of occasions I have had the chance to have &#8220;the talk&#8221; with some I respect.<br />
It boils down to the same thinking: metrics must have a lot of context in order to be useful, so much detail that I believe you could throw away the numbers, or only have them as a footnote.<br />
This is not done, because management demands numbers; that&#8217;s what they are used to basing decisions on.<br />
But that might not apply well to software and testing.<br />
If kids want candy, it doesn&#8217;t mean you have to give it to them.<br />
You give them better things, out of respect and care.</p>
<p>Testing provides information, so you should find out what is important to different people, and give them what they need, not necessarily what they want.<br />
Metrics will hide importance, and also skew the development efforts.<br />
Try to take them out, with surgical precision.</p>
<p>Then you are left with one final catch:<br />
Qualitative information is hard to aggregate; trust is needed.</p>
]]></content:encoded>
			<wfw:commentRss>http://thetesteye.com/blog/2011/08/the-metrics-tumour/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ignoratio elenchi</title>
		<link>http://thetesteye.com/blog/2010/08/ignoratio-elenchi/</link>
		<comments>http://thetesteye.com/blog/2010/08/ignoratio-elenchi/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 20:37:22 +0000</pubDate>
		<dc:creator>Henrik Emilsson</dc:creator>
				<category><![CDATA[Ideas]]></category>
		<category><![CDATA[People]]></category>
		<category><![CDATA[broken window theory]]></category>
		<category><![CDATA[metrics]]></category>
		<category><![CDATA[quality]]></category>
		<category><![CDATA[quality attributes]]></category>
		<category><![CDATA[subjectivity]]></category>

		<guid isPermaLink="false">http://thetesteye.com/blog/?p=1309</guid>
		<description><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/ideas.png" width="48" height="48" alt="" title="Ideas" /><img src="http://thetesteye.com/blog/wp-content/uploads/people.png" width="48" height="48" alt="" title="People" /><br/>&#8220;Wouldn&#8217;t it be cool if we could come up with a Quality Value for our products?&#8221; said a colleague of mine. &#8220;Yes! That would be super!&#8221; a couple of colleagues and I answered. We had a lot of categories and data in our bug system; and perhaps most important was that we had a good data [...]]]></description>
			<content:encoded><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/ideas.png" width="48" height="48" alt="" title="Ideas" /><img src="http://thetesteye.com/blog/wp-content/uploads/people.png" width="48" height="48" alt="" title="People" /><br/><p>&#8220;Wouldn&#8217;t it be cool if we could come up with a Quality Value for our products?&#8221; said a colleague of mine. &#8220;Yes! That would be super!&#8221; a couple of colleagues and I answered.</p>
<p>We had a lot of categories and data in our bug system; and perhaps most important was that we had a good data mining tool that enabled us to take the data and transform it by performing calculations and making aggregations of it.</p>
<p style="text-align: left;">We started out by analyzing the bug data and coming up with reasonable, weighted factors that would enable us to quantify the categories: Severity, Priority, Time to fix, Resolution, Bug Type, etc. Then we constructed an algorithm that would go through all essential data and produce a numeric value, i.e. the Quality Value. We decided that the Quality Value should be somewhere in the span 0-100, with 100 being the top Quality Value.<br />
When we discovered anomalies in the result, we tuned the algorithm and the quantifiers so that the result would make more sense; and a lot of discussions were about the quantifiers and their impact on the result. After several iterations we started to get some reasonable numbers.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-621" title="Quality_is_a_number" src="http://thetesteye.com/blog/wp-content/uploads/Quality.PNG" alt="" width="238" height="92" /></p>
<p>And by now you might have started wondering how we noticed the anomalies and how we could tell that the numbers were reasonable. This happened because we already had a perceived value of the products, so we were biased by subjectivity (huh!).</p>
<p>Anyway, when we were satisfied with the numbers we had a marvelous Quality Value for each and every one of our products! And I dare say that we strongly believed in this Quality Value. Of course there was a constant debate on this subject, and we fought over how much impact certain categories should have on the overall value.<br />
&#8220;My product&#8221; scored in the top interval so I was very pleased.<br />
But pretty soon, somewhere inside of me, I heard a little voice whisper: &#8220;Ignoratio elenchi, ignoratio elenchi, &#8230;&#8221;</p>
<p>We were of course very naïve in that we believed that this metric would represent the quality of the product. Of course it didn&#8217;t!<br />
A couple of observations:</p>
<ul>
<li>Bug data only deals with reported bugs</li>
<li>Bugs are subjective</li>
<li>Bug reporting is subjective</li>
<li>Bug handling (management) is subjective</li>
<li>Bug fixing is subjective</li>
<li>All other quality criteria not caught in bug reports are not included in bug data</li>
<li>Quality is value to many people that haven&#8217;t reported anything</li>
<li>Bug data is only data about bugs (+ subjectivity)</li>
<li>All of the above means that we really cannot compare bugs with each other</li>
</ul>
<p>On the other hand, one conclusion we came up with that might be true was that the ability to care about the product and its bugs was reflected in the Quality Value. And in some way, this meant that a high score indicated that the product was taken care of (see <a rel="bookmark" href="http://thetesteye.com/blog/2009/08/broken-window-theory-and-quality/">Broken window theory and quality</a>). While this might have been true, we were ever so wrong with the idea of capturing the Product Quality in a single Value&#8230;</p>
<p>Read about <a href="http://en.wikipedia.org/wiki/Ignoratio_elenchi" target="_blank">Ignoratio elenchi</a></p>
<p>Also see <a rel="bookmark" href="http://thetesteye.com/blog/2009/11/the-quality-status-reporting-fallacy/" target="_blank">The Quality Status Reporting Fallacy</a></p>
]]></content:encoded>
			<wfw:commentRss>http://thetesteye.com/blog/2010/08/ignoratio-elenchi/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>The Quality Status Reporting Fallacy</title>
		<link>http://thetesteye.com/blog/2009/11/the-quality-status-reporting-fallacy/</link>
		<comments>http://thetesteye.com/blog/2009/11/the-quality-status-reporting-fallacy/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 16:16:11 +0000</pubDate>
		<dc:creator>Henrik Emilsson</dc:creator>
				<category><![CDATA[Ideas]]></category>
		<category><![CDATA[context-driven]]></category>
		<category><![CDATA[metrics]]></category>
		<category><![CDATA[quality attributes]]></category>
		<category><![CDATA[subjectivity]]></category>

		<guid isPermaLink="false">http://thetesteye.com/blog/?p=619</guid>
		<description><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/ideas.png" width="48" height="48" alt="" title="Ideas" /><br/>A couple of weeks ago I had a discussion with someone that claimed that testers should (and could) report on quality. And especially he promoted the GQM-approach and how this could be designed to report the quality status. When I asked how that person defined quality, he pointed to ISO 9000:2000 which define quality as [...]]]></description>
			<content:encoded><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/ideas.png" width="48" height="48" alt="" title="Ideas" /><br/><div class="mceTemp mceIEcenter" style="text-align: left;">A couple of weeks ago I had a discussion with someone who claimed that testers should (and could) report on quality. In particular, he promoted the GQM approach and how it could be designed to report the quality status. When I asked how that person defined quality, he pointed to ISO 9000:2000, which defines quality as &#8220;Degree to which a set of inherent (existing) characteristics fulfils requirements”.</div>
<p>But wait a minute!</p>
<p>If testers can report the current quality status based on the definition above, it means that test cases correspond to the requirements, and bugs found are violations where the product characteristics do not satisfy the requirements. If so, then you must have requirements that follow a couple of truths:</p>
<ul>
<li>Each requirement should exhibit the properties: Correct, Feasible, Necessary, Prioritized, Unambiguous and Verifiable.</li>
<li>The set of requirements covers all aspects of people&#8217;s needs.</li>
<li>The set of requirements captures all of people&#8217;s expectations.</li>
<li>The set of requirements corresponds to the different values that people have.</li>
<li>The set of requirements contains all the different properties that people value.</li>
<li>The set of requirements is consistent.</li>
</ul>
<p>(The word People above includes: Users, customers, persons, stakeholders, hidden stakeholders, etc.)<br />
At the same time, we know that it is impossible to test everything; you cannot test exhaustively.</p>
<p>But assume, for the sake of argument, that all requirements were true according to the list above; and the testing was really, really extensive; and the test effort was prioritized so that all testing done was necessary and related to the values that the important stakeholders and customers cared about.<br />
If this were the case, then how could you compare one test case to another? How could you compare two bugs? Is it possible to compare two bugs even if you have 20 grades of severity?</p>
<p>We, as testers, should be subjective; we should do our best to try to put ourselves in other people’s situation; we should find out who the stakeholders are and what they value; we should try to find all problems that matter.<br />
But we should also be careful when we try to report on these matters. Not because we have no clue about the quality of the product, but because many times we report on the things we do that can be quantified and take these as strong indicators of the quality of the product. E.g., number of bugs found, number of test cases run, bugs found per test case, severe bugs found, severe bugs found per test case per week, etc. You know the drill…</p>
<p>If you are using quantitative measurements, you need to figure out what they really mean and how they connect to what really should (or could) be reported.</p>
<p>If you think that &#8220;non-technical&#8221; people are pleased by getting a couple of digits (hidden in a graph) presented to them, it is like saying: &#8220;Since you aren’t a technical person we have translated the words: Done, Not quite done, Competent, Many, Problems, Requirements, Newly divorced, Few, Fixed, Careless, Test cases, Dyslexic, Needs, Workaholic, Lines of code, Overly complex code, Special configuration, Technical debt, Demands, etc., to some numbers and concealed it all in one graph that shows an aggregate value of the quality&#8221;.</p>
<p><img title="Quality reduced to a number" src="http://thetesteye.com/blog/wp-content/uploads/Quality.PNG" alt="Quality_is_a_number" width="396" height="153" /></p>
<p>I think that it is a bit unfair to the so-called non-technical&#8230;</p>
<p>Instead, we should use Jerry Weinberg’s definition “Quality is value to some person” in order to realize that quality is not something easy to quantify. Quality is subjective. Quality is value. Quality relates to some person. Quality is something complex, yet it is intuitive in the eyes of the beholder.</p>
]]></content:encoded>
			<wfw:commentRss>http://thetesteye.com/blog/2009/11/the-quality-status-reporting-fallacy/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Tricks with Metrics</title>
		<link>http://thetesteye.com/blog/2009/05/tricks-with-metrics/</link>
		<comments>http://thetesteye.com/blog/2009/05/tricks-with-metrics/#comments</comments>
		<pubDate>Thu, 14 May 2009 09:43:51 +0000</pubDate>
		<dc:creator>Henrik Emilsson</dc:creator>
				<category><![CDATA[People]]></category>
		<category><![CDATA[context-driven]]></category>
		<category><![CDATA[general systems]]></category>
		<category><![CDATA[metrics]]></category>

		<guid isPermaLink="false">http://thetesteye.com/blog/?p=288</guid>
		<description><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/people.png" width="48" height="48" alt="" title="People" /><br/>Recently in Sweden there was the tragic death of a young child who could have been saved if only the child had come to a hospital in time for a full exam. The one blamed for this death was the medical care hotline company, which did not understand the severity of the illness [...]]]></description>
			<content:encoded><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/people.png" width="48" height="48" alt="" title="People" /><br/><p>Recently in Sweden there was the tragic death of a young child who could have been saved if only the child had come to a hospital in time for a full exam. The one blamed for this death was the medical care hotline company, which did not understand the severity of the illness and did not send the child to the hospital. (Read more, in Swedish: <a href="http://www.dn.se/sthlm/pojke-dog-efter-rad-att-inte-aka-till-sjukhuset-1.859096">http://www.dn.se/sthlm/pojke-dog-efter-rad-att-inte-aka-till-sjukhuset-1.859096</a>)</p>
<p>After this tragic accident, it was discovered that this private medical care hotline company pays out a monthly bonus to those nurses that keep their phone calls short. (Read more, in Swedish: <a href="http://www.dn.se/sthlm/skoterskor-far-bonus-for-snabba-rad-1.864534">http://www.dn.se/sthlm/skoterskor-far-bonus-for-snabba-rad-1.864534</a> )<br />
That is, if they keep the call below 3.48 minutes and complete the medical record during that time, they receive a bonus of 1000 Swedish kronor (approx. € 100). There are some quality goals attached to the bonus as well: e.g., you don’t get the bonus if you unnecessarily send someone to the emergency ward, or if you give faulty medical advice.<br />
Do I need to tell you that the county council paid the private company by the number of calls they handled?</p>
<p>This is what happens when you use simplified and dangerous metrics as a foundation for incentive pay… And these metrics are easy to abuse because they are based on simplified models of what the real world looks like.<br />
When dealing with people, you are dealing with “complex systems” (read more in <a href="http://www.geraldmweinberg.com/Site/General_Systems.html">An Introduction to General Systems Thinking</a>, by Gerald M. Weinberg) and you cannot treat every person as if they were the same. I.e., the people calling in (and indeed children who cannot speak for themselves) are treated as a neutral “* 1” or “+ 0” in the metrics equation.<br />
This happens if you include simplified metrics to measure your efficiency when dealing with people; metrics that leave out the most important and complex parts of the equation: humans and human interaction.<br />
Nurses know how to work with people; they know that people are unique; they know that their job is hard and requires skill and years of experience. They know that some patients require 20 minutes before they are calm, or need that much time to explain everything important; they also know that some people just need 25 seconds before they are satisfied.<br />
It is a shame that nurses are measured by how fast they finish a phone call. </p>
<p>It is the same thing that happens again and again in the software industry; or rather the <a href="http://en.wikipedia.org/wiki/Peopleware">peopleware</a> industry. People who work with developing software are measured by metrics that are dangerous and wrong; and in many cases it can have the same tragic outcome as with the young boy who did not reach the hospital in time…</p>
<p>Read more about (dangerous) metrics in the Software Industry:<br />
<a href="http://www.kaner.com/pdfs/metrics2004.pdf">Software Engineering Metrics: What Do They Measure and How Do We Know?</a>, by Cem Kaner.<br />
<a href="http://xndev.blogspot.com/2009/05/metrics-schmetrics.html">Metrics, Schmetrics</a>, by Matthew Heusser.<br />
<a href="http://www.developsense.com/2009/01/meaningful-metrics.html">Meaningful Metrics</a>, by Michael Bolton<br />
<a href="http://thetesteye.com/blog/2008/06/measurementsmetricsanalysisjudgment/">Measurements/Metrics/Analysis/Judgment</a>, by Rikard Edgren</p>
]]></content:encoded>
			<wfw:commentRss>http://thetesteye.com/blog/2009/05/tricks-with-metrics/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Measurements/Metrics/Analysis/Judgment</title>
		<link>http://thetesteye.com/blog/2008/06/measurementsmetricsanalysisjudgment/</link>
		<comments>http://thetesteye.com/blog/2008/06/measurementsmetricsanalysisjudgment/#comments</comments>
		<pubDate>Fri, 13 Jun 2008 13:14:00 +0000</pubDate>
		<dc:creator>Rikard Edgren</dc:creator>
				<category><![CDATA[People]]></category>
		<category><![CDATA[judgment]]></category>
		<category><![CDATA[metrics]]></category>

		<guid isPermaLink="false">http://thetesteye.com/wordpress/?p=61</guid>
		<description><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/people.png" width="48" height="48" alt="" title="People" /><br/>At www.context-driven-testing.com you can read &#8220;Metrics that are not valid are dangerous.&#8221; I believe this is true, but I would rather prefer &#8220;Metrics are dangerous.&#8221; Uninterpreted measurements are not bad by themselves, but when value is added to them, they become metrics, and dangerous because they state specific things without considering a lot of other things, that [...]]]></description>
			<content:encoded><![CDATA[<img src="http://thetesteye.com/blog/wp-content/uploads/people.png" width="48" height="48" alt="" title="People" /><br/><p>At <a href="http://www.context-driven-testing.com/" target="_blank">www.context-driven-testing.com</a> you can read &#8220;Metrics that are not valid are dangerous.&#8221;<br />
I believe this is true, but I would prefer &#8220;Metrics are dangerous.&#8221;</p>
<p>Uninterpreted measurements are not bad by themselves, but when value is added to them, they become metrics, and dangerous, because they state specific things without considering a lot of other things that might actually be much more important.<br />
If measurements are used together with knowledge of details, you might have an analysis that is fruitful.<br />
But many times, sound judgment is not only enough; it is better.</p>
<p>Measurements/Metrics are probably useful for dead things, like manufacturing objects, but m/m are not good for complex things involving people, e.g. software testing. Metrics use numbers to reduce the complexity, and thereby the &#8220;truth&#8221; disappears.</p>
]]></content:encoded>
			<wfw:commentRss>http://thetesteye.com/blog/2008/06/measurementsmetricsanalysisjudgment/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

