Tuesday, October 23, 2012

My Lessons Learned from the STPCon 2012 Test Competition


I attended STPCon Fall 2012 in Miami, FL.  I was there both as a track speaker and a first-time conference attendee.  One interesting aspect of the conference (there were others I'll cover in another blog post) was the testing competition.  Matt Heusser, a principal consultant at Excelon Development, arranged and helped judge the competition.  A blog post of his observations can be found at . 
I participated in the competition and have my own thoughts on it.

The rules were fairly simple.  We had to work in teams of two to five.  There were four websites we could choose to test, and we had a bug logging system for reporting our bugs.  We also had access to stand-in product owners.  We had two hours to test, log bugs, and put together a test status report.

My first observation is that it was a competition, but it wasn't.  The activity was billed as a "There can be only one" style of competition.  However, and more importantly, it was about sharing more than competing.  There were competitive aspects to the activity, but the real value was in sharing approaches, insights, and techniques with testers we had never met before.  Not enough can be said about the value of peering.  Through this exercise, I was able to share a tool (qTrace by QASymphony, for capturing the steps to recreate defects during exploratory testing sessions) as well as my approach to basic website security testing.  Although we didn't do pure peering, it is obvious how valuable the peering approach is.

Secondly, a simple planning discussion about testing approach, along with feedback during testing, is immensely valuable: it not only spawns brainstorming, it helps reduce redundant testing.  Through this exercise, my cohort, Brian Gerhardt, and I sat next to each other and showed each other the defects we found.  We also questioned each other on things we had not tried but that were within our realm of coverage.  For side-by-side pseudo peering, this approach worked quite well for us and led to several bugs that we may not have looked for otherwise.

Lastly, I reflected on the competition and made several observations, as well as noting one startling curiosity that I think is the most important of all.  Every single team in the competition failed to do one simple task that would have focused the effort, ensured we provided useful information, and removed any assumptions about what was important.  We failed to ask the stakeholders any questions of importance regarding what they wanted us to test.  We did not ask if certain functions were more important than others, we did not ask about expected process flows, and we did not even ask what the business objective of the application was.  Suffice it to say, we testers have a bad habit of just testing, without direction and often on assumption.  I will post more on this topic in a future blog post.

What I did notice is that testers, when put under pressure, such as in a competition or when time-bound, will fall back on habits.  We will apply the oracles that have served us well in the past and work with heuristics that make sense to us.  Oftentimes this produces results that appear to be great but, in the end, really lead to mediocre testing.  If we had taken the time to ask questions, to understand the application and the business behind it, our focus would have been sharper on areas of higher priority, AND we would have had context for what we were doing and attempting to do.

I will keep this blog post short: the moral of the exercise is simply to ask questions, to seek understanding, and to gain context before you begin.

Tuesday, March 6, 2012

Two Testing Giants Part Ways

Wow. What can I say? Scott Barber started a firestorm. Two great minds in the testing community, James Bach and Cem Kaner, are parting ways on an idea, context-driven testing (CDT), which they helped create and foster. The reaction on each of their blogs has been like watching two parents get divorced, with testers in the field expressing disbelief, pain, and sadness.

Part of what is driving this is a changing viewpoint for Cem. That is not uncommon, since two other founding members have already broken away from CDT. To compound things, there is personal animosity between Cem and James. While I have a great deal of respect for James and Cem, and they both have great ideas, I, for one, do not really care about their parting ways. And I do not think you should care about their parting ways either, because the CDT principles are an undeniable truth, not dependent on any one person.

I do agree with Cem's general statement that there is more than one CDT school – more than one camp, each with its own values and ontology (to quote James). The problem is, I disagree with using the word "school". While it may be semantics, my issue is that "school" conjures up images of an institution with a predefined curriculum from which you cannot graduate unless you pass the courses. That sounds like certification, with which I disagree. CDT is a paradigm. I don't like the implied argument, raised by James, that the ability to NOT follow something makes it an approach versus a school, because the implication is that you must always follow something and that something must be your identity. That makes CDT sound like dogma and religion. For the sake of this blog post, I will still reference "schools" as schools to maintain some discussion continuity.

To me, CDT is more fundamental to the testing community than anyone I have found has said. My belief is based on the premise that CDT is not about a new way of doing things so much as it is an acknowledgement of reality. The CDT principles are more akin to truths than principles. Even if you do not actively use the principles to gain synergies, that does not negate the principles; it does not render them false. CDT is ingrained in the fabric of testing, regardless of which "school" a tester is following. The principles are an acknowledgement of "what is", not of "what can be". To see what I mean, just paraphrase a few enlightened guys: "I hold these truths to be self-evident…"  Even if someone followed the "Factory School" of testing, the CDT principles would still hold true. For example, Factory testers believe testing measures progress. While that is an information point, it does not negate CDT principles such as the facts that projects unfold in unpredictable ways and that people are still the most important asset. The principles of CDT can be found in these realities. Therefore, CDT fundamentally permeates all "schools" of testing. For these reasons, I do not see any significance in this parting of ways.


Ultimately, there is an underlying "thing" that needs to be acknowledged, and that is the purpose of software testing and, consequently, of software testers. Software testing's purpose is to identify data, consolidate it into useful information, and provide it to stakeholders so that informed decisions can be made. That purpose exists regardless of any "school". When I think about it, I see that the "schools" are really aligned to tools and to types of information for specific decisions. The schools are not aligned to the real purpose of testing. I believe this is one of the reasons there are so many issues with every "school's" beliefs and approaches.


Our job, as testers, is to extract data and synthesize it into information so that someone, a stakeholder, can make a decision. To do this we need tools. Those tools depend on several factors:
  • I can only use tools I recognize – it is difficult to use something as a tool if you cannot see it in front of you
  • Everyone has different tools – our tools are experience and knowledge based, accelerated at times, but never replaced, by technology
  • Not every tool we know of is at our disposal – there may be great tools out there, ones we know about, that we simply cannot afford or cannot easily learn
  • Every tool has a function – no matter the tool, it has a purpose, a way of being used, and an expectation of what using it will do
  • Every tool can be used for any job, with varying degrees of success – you can use a butter knife as a regular screwdriver, sometimes
  • Using tools, in both traditional and non-traditional ways, will create new tools for us – it is about learning. With all tools being knowledge based, any learning leads to new tools


Our challenge is to realize that, as testers, our job is to extract data about a creative process and synthesize it into useful information so that stakeholders can make informed decisions, and to use the tools we have available to do the best job we can. By the way, that translation of data into useful information is, and should be, influenced by the creative process (the development process), our tools, our knowledge, our experience, and our understanding. We are researchers, inspectors, philosophers, teachers, learners, and synthesizers, but most of all, we are information brokers.

Friday, July 17, 2009

Counter to "No Spec=Waste of Time"

"What's a QA team without a spec? A goddamned nuisance and a waste of time, that's what."

In this article, the author rants about defect issues in Elder Games' Asheron's Call 2 (an MMO) more than anything else. The author's opening sentence points to the emotional bias behind his succeeding thoughts. Pretty powerful emotional stuff there. Distilling his gripes, I find he takes issue with:

1) Low Priority Bugs in the Bug Tracking System (noise)
2) Too many severe bugs released to production
3) Not enough time for the volume of work

The rest of his post is simply an attempt at assigning causation for those issues. It is his causation analysis that I take issue with. A lack of a formal written specification does not result in a higher volume of "noise" in the bug tracking system. Testers innately have other references, or oracles, by which to evaluate software. Those can be prior experience, technology experience, genre experience, etc. Of course, any material, even email, can serve as a reference point. In systems where the oracles used by the testing team are nearly universal to both the team and the software (more common in simple systems), very little documentation is needed to have a successful test effort with normal noise levels. In systems where oracles are not universal (more common in complex systems), you will see more noise in the bug tracking system. It isn't the lack of documentation that is a problem; it is a lack of universally acceptable oracles. There are ways to achieve that without volumes of documents, such as collaboration or pair testing with a developer in the group. In the end, these are just ways to communicate and agree on a common oracle; after all, documentation is just a proxy for, or the remembered result of, a discussion. The failure to recognize the gap in common oracles may result in increased noise in the bug tracking system and a reduction in severe bugs being caught.

Another aspect of "noise" in the bug tracking system that wasn't addressed is the all too common problem of not monitoring what is being logged. All too often, testers log bugs, and those bugs don't get reviewed by anyone until some coordinated meeting. The span between the meetings represents a window in which bugs are logged but not monitored (this does not happen everywhere, and it is not equally severe everywhere). There are two aspects I would like to address:

1) Approval signals importance - Testers log bugs they find important, again based on their own oracles. If it wasn't important to them, they would never see it as a bug. Importance must be defined in terms of a value judgment and not severity. Approval is defined in terms of consensus acceptance and not a formal status assignment. Agreement by any stakeholder that a logged item is a bug signals to the team, "Hey, go ahead and find more of those because we like them." I believe this because it is societal behavior to fear rejection and pursue acceptance. So, approved bugs breed more similar bugs. If a strong severity/priority system is NOT used, then testers can be led to believe some types of bugs are more significant/valuable than others. Without that correction, and in combination with approval, it is easy to establish conditions that magnify noise in the system.

2) Number of bugs - Bugs logged into a bug tracking system form another oracle for a tester (see #1). Because of that, the value of those bugs tends to decrease as more bugs are added to the system. In the face of numerous bugs, especially ones with uncategorized and unchanged statuses, testers tend to skim the list for cursory information instead of diving deep to understand what has and has not been covered and uncovered. Therefore, a growing volume of unmanaged bugs degrades the value the repository was meant to transfer to the testers in the first place.

Finally, the issue of too much work and not enough time is simply a reality of software development. There are numerous estimation models, development methodologies, tools, etc. that all center on this specific issue of how to get more done with less money, less time, and fewer resources. Just because this condition exists doesn't mean there aren't ways to still achieve a solid development and testing effort. At the end of the day, if you don't communicate and agree on common oracles, you will always incur more work to overcome this obstacle, because software development is always collaborative no matter how hard you try to fight it.

Monday, July 6, 2009

Automation Pitfalls Snarl an Automation Vendor

I like test automation tools. I like the challenge, I like the hobbyist programmer they bring out in me, and I like to use them, but only when they are useful. Automated tests are not sentient. They are not capable of subjective evaluations, only binary ones. Not only that, they can only monitor what they are told to monitor. These facts are often overlooked in the name of faster, better, higher-quality testing. See the advertising quote below:

Benefits include ease of training (any member of your team can become proficient in test automation within a few days), staffing flexibility, rapid generation of automated test suites, quick maintenance of test suites, automated generation of test case documentation, faster time to market, lower total cost of quality, greater test coverage, and higher software quality.

This is the advertising for a tool I ran across today called SmarteScript, by SmarteSoft. I saw their ad and downloaded a demo. The first thing I did was attempt to run through the tutorial to become acquainted with how the tool operates. Like most commercial for-profit tools, they have a rudimentary demo site that is designed to show off their product's bells and whistles. The basic tutorial was to learn the web-based objects (textboxes, buttons, etc.) and then, with their nifty Excel-like grid, add in some values for the appropriate objects. Pretty simple. I used their tool to *learn a dropdown box* on their demo site per their instructions. (Emphasis added because it is possible to learn a table cell around the dropdown, the dropdown arrow graphic, or the text-only portion of the dropdown list without learning the dropdown list itself, as required. They had specific instructions on how to do this as well as how to tell if you screwed up.) However, when going through the tutorial with IE8, I noticed that the tool appeared to learn the dropdown only as static text (i.e., a label). I tried and tried to get it to recognize the dropdown as a dropdown, but it would not. I tried playing the script back just to see if the tool would overcome its own learning, but alas, it failed. I sent my observation to the support team. To their credit, I received an email back the same day stating they were able to reproduce the problem and their development team would look into it further.
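
For contrast, here is a minimal sketch, in Python with Selenium WebDriver (a different tool entirely), of the kind of binary check I expected: confirm that the learned element really is a dropdown and that an option can be selected from it. The URL, element name, and option text are hypothetical placeholders, not SmarteSoft's actual demo site.

    # Minimal sketch with Selenium WebDriver (Python); the URL, element name,
    # and option value are hypothetical placeholders.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import Select

    driver = webdriver.Ie()  # Internet Explorer, to mirror the IE8 scenario
    try:
        driver.get("http://example.com/demo")            # placeholder page
        element = driver.find_element(By.NAME, "state")  # placeholder name

        # The binary evaluation: either the element is a real dropdown or it
        # is not. A tool that "learned" it as static text would fail here.
        assert element.tag_name.lower() == "select", (
            "Expected a dropdown (<select>), got <%s>" % element.tag_name
        )

        # If it really is a dropdown, selecting an option should just work.
        Select(element).select_by_visible_text("Florida")
    finally:
        driver.quit()

The point is not the tool; it is that an automated check only answers the yes/no questions it has been explicitly told to ask.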

I find it ironic that a vendor selling this as a way to achieve "faster time to market, lower total cost of quality, greater test coverage, and higher software quality" ended up releasing a tool that is the contrary of that. I would be remiss if I didn't point out that I doubt I can become proficient with a tool in a few days when it doesn't work on their own demo site, so strike down another claim. Most importantly, I think this points out that test automation is not a silver bullet... nor a cheap one.

Thursday, July 2, 2009

Redefining Defects

I recently did a guest blog post about writing defect reports that include a value proposition, which is just another way of stating the impact of the defect. While writing the post, one thing occurred to me: the term defect is not exactly correct. Some are defects, some are misunderstandings, and some are suggestions. To add to the issue, Cem Kaner recognized the legal implications of using the term defect (slide 23).

So, what exactly should defects be called: bugs, defects, issues, problems, errata, glitches, noncompliance items, software performance reports (SPRs), events, etc.? To understand what they should be called, we need to understand what "they" are.

The primary purpose of defects is to point out what we, as testers, believe to be notable information about the application. Typically, a defect report is about something that doesn't work the way we think it should, but it can also be a suggested improvement. As always, there is an explicit or implicit reference against which the defect is judged and evaluated. For software, that can be a requirements document, UI standards, historical experience, etc.

What is an observation? There are many dictionary definitions, but for our discussion, let us use Merriam-Webster's third entry, which is "an act of recognizing and noting a fact or occurrence often involving measurement with instruments". The Free Dictionary Online has a similar definition of observation (its second) as being "a detailed examination of something for analysis, diagnosis, or interpretation." Isn't that what we do as testers? We perform a series of actions to elicit a response and compare that response against a set of expectations. We then log anything we consider out of line with those expectations. I propose that is exactly what we do in the testing field. Therefore, defects are really observations, and perhaps we should start calling them that. It eliminates negative-sounding words, eliminates legal concerns, and, most of all, better matches our actions. Thoughts?

Thursday, June 18, 2009

Certification - A money maker, but not for you





I received this advertisement via email. It is funny in its misconceptions, but it is sadder that so many people buy into it (and by buy, I mean spend $$$). Okay, so let's start with the second line of this advertisement:
"If your team is conducting ad hoc, informal tests with little guidance or planning, the quality of the end product can be severely jeopardized—negatively affecting your bottom line"

This nearly represents everything that is wrong with the ISTQB certification. The quality of the end product is not jeopardized by informal testing, a lack of test planning, or a lack of guidance. In reality, quality is a relationship, the simplest form of which is the value of the product to the stakeholders (those who matter).
But this next statement in their advertisement does represent everything that is wrong with the ISTQB certification:

"The best way to be certain that you are providing customers with quality software is to make sure your team of testers is certified."
Really? I thought it was by providing something they value, usually something built to meet their wants and/or needs. If all I need to do is put a gold-embossed sticker on it, then so be it. Here you go.









All your software is now of high quality. Oh, by the way, ISTQB, that will be $1,995 + $250 ($1,995 for training so that you know how to use the sticker, and $250 for the right to use the sticker; let's throw in $9.95 for shipping and handling as well). So, they don't have a clue about the testing industry, nor about quality. But hey, let's see what their mission statement says. Maybe that will shed some light on what they are trying to do.

It is the ISTQB's role to support a single, universally accepted, international qualification scheme, aimed at software and system testing professionals, by providing the core syllabi and by setting guidelines for accreditation and examination for national boards

So, their mission is to create a certification scheme (their word, not mine) and provide the materials and exams for that certification. It is nothing less than a money-making scheme, and it is right there in their mission statement. They do not care about quality or testing.

I do not accept the ISTQB as a part of any community of software testers. The ISTQB is a business pursuing its own agenda. I know that may sound a bit harsh, but consider this: as Michael Bolton pointed out, in October 2008 the ISTQB announced 100,000 certified testers. Each of those testers had to pay a fee to take the exam. In the U.S. this fee is $250 (entry level), and I think it is around $100 in India. That means they have made between $10 million and $25 million in revenue on certifications alone in the past 5 years. So far, they are succeeding at their mission statement.
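
To make the arithmetic explicit, here is the back-of-the-envelope calculation (a rough sketch in Python using only the fee figures quoted above; the actual mix of countries and fee levels is unknown):

    # Rough revenue bounds using the figures quoted above.
    certified_testers = 100000
    fee_low = 100    # approximate exam fee in India (USD)
    fee_high = 250   # U.S. entry-level exam fee (USD)

    low_estimate = certified_testers * fee_low    # $10,000,000
    high_estimate = certified_testers * fee_high  # $25,000,000
    print(f"Exam revenue: ${low_estimate:,} to ${high_estimate:,}")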

Friday, June 12, 2009

Interviewing Testing Individuals

Finding good testers is like prospecting for gold: it takes patience and skill. Many times during an interview I find that most testers have memorized the basic definitions of the field, usually from Google. So, in addition to standard interview questions, I ask a scenario question in order to gauge their knowledge and understanding. This is akin to a tester testing a tester, if you will. To do this, I usually set up a simple scenario, such as: what kinds of tests would you run if you were asked to test a toaster? Sometimes I give them requirements, such as the ones below, and sometimes I don't.

  1. It is an electric two-slice toaster
  2. It has a single push-down lever that controls both sides
  3. It automatically pops up and shuts off when the desired darkness is achieved
  4. It has 3 darkness settings (light, medium, dark) that are triggered based on heat buildup in the toaster

I like this scenario because it has proven, for me, to be a pretty good indicator of the type of testing a candidate has been exposed to, as well as an indicator of their ability to decompose requirements (direct and/or inferred).


Since I do not inform them of this question prior to an interview, I can surmise, based on their responses, what type of testing they favor. Below are some sample responses I tend to get and how I would categorize that response.

Category 1: Performance based

  1. Testing two slices on each setting to see how long it takes for each setting
  2. Testing multiple slices per setting to arrive at an average time for each setting
  3. Repeating 1 and 2 for a single slice
  4. Testing the coloration consistency over multiple toasts on the same setting
  5. Testing that the toast time remains within tolerance over lots of toast cycles

Category 2: Aesthetics and User Interface

  1. Testing to ensure the settings and the labels marking the settings are in alignment
  2. Testing to ensure the lever works when pushed down
  3. Testing to ensure the lever does not catch either going up or down
  4. Testing that the finish is not tarnished or changed as a result of the heat produced when using the toaster
  5. Testing that the slots are big enough for standard-size slices of bread

Category 3: Functional - General

  1. Testing to ensure toast darkness matches darkness selected
  2. Testing a variety of bread: wheat, white, bagels, pop tarts
  3. Testing that the lever and toast pops up at the end of the test cycle
  4. Testing that the toast cannot be re-toasted until the toaster cools down
  5. Testing that the toaster still shuts off if the lever is stuck in the down position

Category 4: Functional - Boundary

  1. Testing to ensure the toaster is not damaged with no slices (or does not allow the lever to be pushed down)
  2. Testing with a single slice (tested twice, once for each slice)
  3. Testing to see that either single-slice slot is consistent with the other, as well as with two slices
  4. Testing to see that it works with both thin and thick bread types
  5. Testing to see that it works with bread that is oversized for the slots

Category 5: Exploratory

  1. Testing to see what happens if you push the lever while it is unplugged
  2. Testing to see what happens to a variety of bread: wheat, white, bagels, pop tarts
  3. Testing to see what happens if you toast nothing.
  4. Testing to see what happens if you manually hold the lever down.
  5. Testing two slices in one slot with cheese in between (a sandwich)

This is not an end-all list of possible tests, but rather a compilation of answers I have received over the years. I have found that people usually cling to a couple of categories, which is usually indicative of their experience. I usually ask the candidate which tests are more important, or which tests can be combined into a single execution. To make things interesting, and to see if people really understand regression testing, I often expand the toaster to a four-slice toaster that has two setting knobs. This is where you start to see people's understanding of testing. Often they will mention, "I'd repeat the same tests." What I look for are testers who understand the new combinations, the expansion of requirements, and which re-tests are important versus non-important, such as the following (a rough code sketch after the list shows how a couple of these ideas could be expressed as checks):

  1. Tests for independent settings between the slot sets
  2. Tests for a single slice in one of the two slots in each slot set
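
As a rough illustration only, here is how a couple of these ideas might be expressed as automated checks in Python with pytest. The Toaster class and its behavior are hypothetical assumptions made purely to show the test structure; a real toaster would obviously be tested by hand.

    # Hypothetical pytest sketch; the Toaster model below is a stand-in
    # invented for illustration, not a real specification.
    import pytest

    class Toaster:
        """Toy model of the four-slice toaster with two independent knobs."""
        SETTINGS = {"light": 1, "medium": 2, "dark": 3}

        def __init__(self):
            self.knobs = {"left": "medium", "right": "medium"}

        def set_darkness(self, slot_set, setting):
            self.knobs[slot_set] = setting

        def toast(self, slot_set, slices):
            # Pretend darkness depends only on this slot set's own knob.
            return [self.SETTINGS[self.knobs[slot_set]]] * slices

    @pytest.mark.parametrize("setting,expected", [("light", 1), ("medium", 2), ("dark", 3)])
    def test_darkness_matches_setting(setting, expected):
        # Category 3, test 1: toast darkness matches the darkness selected.
        toaster = Toaster()
        toaster.set_darkness("left", setting)
        assert toaster.toast("left", slices=2) == [expected, expected]

    @pytest.mark.parametrize("slot_set,other", [("left", "right"), ("right", "left")])
    def test_slot_sets_have_independent_settings(slot_set, other):
        # Regression test 1 above: each slot set honors its own knob.
        toaster = Toaster()
        toaster.set_darkness(slot_set, "dark")
        toaster.set_darkness(other, "light")
        assert toaster.toast(slot_set, slices=1) == [3]
        assert toaster.toast(other, slices=1) == [1]

The value in the interview is the candidate's thinking, not the code; the sketch just shows how quickly the categories turn into concrete, repeatable checks once the requirements are pinned down.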

Again, this is not a definitive evaluative technique for testers, but I have found that it is quite beneficial and accurate in categorizing a tester’s knowledge and experience.


One thing I haven't tried yet is bringing a physical object to an interview and asking a candidate to test it. The Easy Button from Staples might be a great option. Then I could observe their behavior instead of analyzing their thoughts.