Friday, July 17, 2009

Counter to "No Spec=Waste of Time"

What’s a QA team without a spec? A goddamned nuisance and a waste of time, that’s what.

In this article the author rants about defect issues in Elder Games' Asheron’s Call 2 (an MMO) more than anything else. The author's opening sentence reveals the emotional bias that colors the thoughts that follow. Pretty powerful emotional stuff there. Distilling his gripes, I find he takes issue with:

1) Low Priority Bugs in the Bug Tracking System (noise)
2) Too many severe bugs released to production
3) Not enough time for the volume of work

The rest of his post is simply an attempt at assigning causation for those issues, and it is his causation analysis that I take issue with. A lack of formal written specification does not result in a higher volume of "noise" in the bug tracking system. Testers innately have other references, or oracles, by which to evaluate software: prior experience, technology experience, genre experience, and so on. Of course any material, even email, can serve as a reference point. In systems where the oracles used by the testing team are nearly universal to both the team and the software (more common in simple systems), very little documentation is needed for a successful test effort with normal noise levels. In systems where oracles are not universal (more common in complex systems), you will see more noise in the bug tracking system. It isn't the lack of documentation that is the problem; it is the lack of universally accepted oracles. There are ways to achieve that without volumes of documents, such as collaboration or pair testing with a developer in the group. In the end these are just ways to communicate and agree on a common oracle; after all, documentation is just a proxy for, or the remembered result of, a discussion. Failing to recognize a gap in common oracles may result in increased noise in the bug tracking system and a reduction in severe bugs being caught.

Another aspect of "noise" in the bug tracking system that wasn't addressed is the all too common problem of not monitoring what is being logged. All too often, testers log bugs that nobody reviews until some coordinated meeting. The span between those meetings is a window in which bugs are logged without any monitoring (this does not happen everywhere, nor is it equally severe everywhere). There are two aspects I would like to address:

1) Approval signals importance - Testers log bugs they find important, again based on their own oracles; if a bug wasn't important to them, they would never see it as a bug. Importance must be defined in terms of value judgment, not severity, and approval in terms of consensus acceptance, not a formal status assignment. Agreement by any stakeholder that a logged item is a bug signals to the team, "Hey, go ahead and find more of those because we like them." I believe this because it is societal behavior to fear rejection and pursue acceptance. So, approved bugs breed more similar bugs. If a strong severity/priority system is NOT used, testers can be led to believe some types of bugs are more significant or valuable than others. Without that correction, and in combination with approval, it is easy to establish conditions that magnify noise in the system.

2) Number of bugs - Bugs logged into a bug tracking system form another oracle for a tester (see #1). Because of that, the value of those bugs tends to decrease as more bugs are added to the system. Faced with numerous bugs, especially uncategorized ones whose status never changes, testers tend to skim the list for cursory information instead of diving deep to understand what has and has not been covered. Therefore, a growing volume of unmanaged bugs degrades the very value the repository was meant to provide.
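To make the "unmanaged bug volume" point concrete, here is a minimal sketch of the kind of housekeeping that keeps the repository useful as an oracle. The field names (`status_changed`, `id`) and the 30-day threshold are hypothetical, not taken from any particular tracker:

```python
from datetime import date, timedelta

def stale_bugs(bugs, max_age_days=30, today=None):
    """Return bugs whose status hasn't changed within max_age_days.

    These are the entries most likely to degrade into noise if
    nobody reviews them between coordinated meetings.
    """
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    return [b for b in bugs if b["status_changed"] < cutoff]

# Hypothetical repository contents for illustration.
bugs = [
    {"id": 1, "status": "new",  "status_changed": date(2009, 5, 1)},
    {"id": 2, "status": "open", "status_changed": date(2009, 7, 1)},
]
print([b["id"] for b in stale_bugs(bugs, today=date(2009, 7, 6))])  # [1]
```

A periodic sweep like this is just one way to force the review that the span between meetings otherwise skips.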

Finally, the issue of too much work and not enough time is simply a reality of software development. There are numerous estimation models, development methodologies, tools, and so on that all center on this specific issue of how to get more done with less money, time, and staff. Just because this condition exists doesn't mean there aren't ways to achieve a solid development and testing effort. At the end of the day, if you don't communicate and agree on common oracles, you will always incur more work to overcome this obstacle, because software development is collaborative no matter how hard you try to fight it.

Monday, July 6, 2009

Automation Pitfalls Snarl an Automation Vendor

I like automated testing tools. I like the challenge, I like the hobbyist programmer they bring out in me, and I like to use them, but only when they are useful. Automated tests are not sentient. They are not capable of subjective evaluations, only binary ones. Moreover, they can only monitor what they are told to monitor. These facts are often overlooked in the name of faster, better, higher-quality testing. See the advertising quote below:

Benefits include ease of training (any member of your team can become proficient in test automation within a few days), staffing flexibility, rapid generation of automated test suites, quick maintenance of test suites, automated generation of test case documentation, faster time to market, lower total cost of quality, greater test coverage, and higher software quality.

This is the advertising for a tool I ran across today called SmarteScript, by SmarteSoft. I saw their ad and downloaded a demo. The first thing I did was attempt to run through the tutorial to become acquainted with how the tool operates. Like most commercial for-profit tools, they have a rudimentary demo site that is designed to show off their product's bells and whistles. The basic tutorial was to learn the web-based objects (textboxes, buttons, etc.) and then, with their nifty Excel-like grid, add in some values for the appropriate objects. Pretty simple. I used their tool to *learn* a dropdown box on their demo site per their instructions. (Emphasis added because it is possible to learn a table cell around the dropdown, the dropdown arrow graphic, or the text-only portion of the dropdown list without learning the dropdown list itself, as required. They had specific instructions on how to do this as well as how to tell if you screwed up.) However, when going through the tutorial with IE8, I noticed that the tool appeared to learn the dropdown only as static text (i.e., a label). I tried and tried to get it to recognize the dropdown as a dropdown, but it would not. I tried playing the script back just to see if the tool would overcome its own learning, but alas, it failed. I sent my observation to the support team. To their credit, I received an email back the same day stating they were able to reproduce the problem and their development team would look into it further.
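The underlying classification problem is easy to illustrate. This is not SmarteScript's actual internals; it is a minimal sketch, using Python's stdlib HTML parser, of what a record/playback tool has to get right: a `<select>` tag is a dropdown, while loose text around it is merely a static label. All class and field names here are invented:

```python
from html.parser import HTMLParser

class ControlLearner(HTMLParser):
    """Classify page elements the way a record/playback tool must:
    a <select> is a dropdown; text outside it is a static label."""
    def __init__(self):
        super().__init__()
        self.learned = []
        self._in_select = False

    def handle_starttag(self, tag, attrs):
        if tag == "select":
            self._in_select = True
            self.learned.append(("dropdown", dict(attrs).get("name")))

    def handle_endtag(self, tag):
        if tag == "select":
            self._in_select = False

    def handle_data(self, data):
        text = data.strip()
        # Text inside the <select> belongs to its options; text outside
        # it is static and should be learned as a label.
        if text and not self._in_select:
            self.learned.append(("static_text", text))

html = '<label>Country</label><select name="country"><option>US</option></select>'
learner = ControlLearner()
learner.feed(html)
print(learner.learned)  # [('static_text', 'Country'), ('dropdown', 'country')]
```

A tool that collapses the `<select>` into the surrounding text would learn only a static label, which is exactly the failure I observed.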

I find it ironic that a vendor selling this as a way to achieve faster time to market, lower total cost of quality, greater test coverage, and higher software quality ended up releasing a tool that is the contrary of those claims. I would be remiss if I didn't point out that I doubt I can become proficient with a tool in a few days when it doesn't work on the vendor's own demo site; so strike down another claim. Most importantly, I think this points out that test automation is not a silver bullet...nor a cheap one.
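The "automated tests only monitor what they are told to monitor" point deserves a concrete picture. Here is a minimal, hypothetical sketch (the page, the check, and all names are invented for illustration) of a binary check that passes even though an obvious problem sits right next to the thing it verifies:

```python
def render_login_page():
    # Hypothetical page under test: the title is correct,
    # but the button label is garbled.
    return {"title": "Login", "button": "Lgoin"}

def automated_check(page):
    # The script was only told to monitor the title, so it is blind
    # to the misspelled button a human tester would spot at once.
    return page["title"] == "Login"

page = render_login_page()
print(automated_check(page))  # True: passes despite the garbled button
```

The check is honest about exactly one thing and silent about everything else, which is why "pass" from an automated suite is a much weaker statement than "looks right" from a person.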

Thursday, July 2, 2009

Redefining Defects

I recently did a guest blog post about writing defect reports that include a value proposition, which is just another way of stating the impact of the defect. When writing the post, one thing occurred to me: the term defect is not exactly correct. Some are defects, some are misunderstandings, and some are suggestions. To add to the issue, Cem Kaner recognized the legal implications of using the term defect (slide 23).

So, what exactly, should defects be called: bugs, defects, issues, problems, errata, glitches, noncompliance items, software performance reports (SPRs), events, etc.? To understand what they should be called, we need to understand what “they” are.

The primary purpose of defects is to point out what we, as testers, believe to be notable information about the application. Typically, a defect report is about something that doesn’t work the way we think it should, but it can also be a suggested improvement. As always, there is an explicit or implicit reference against which the defect is judged. For software, it can be a requirements document, UI standards, historical experience, etc.
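The "explicit or implicit reference" idea can be sketched in a few lines. This is a hypothetical illustration, not anyone's real validation code: the oracle is a table of expected field formats (the field names and patterns are invented), and a "defect" only exists where a value disagrees with that oracle:

```python
import re

# Hypothetical oracle: expected formats a tester judges against.
ORACLE = {
    "zip_code": r"^\d{5}$",        # e.g. from a requirements document
    "email": r"^[^@\s]+@[^@\s]+$", # e.g. from general UI convention
}

def evaluate(field, value):
    """Return an observation when the value disagrees with the oracle."""
    pattern = ORACLE.get(field)
    if pattern is None:
        return None  # no oracle for this field: nothing to judge against
    if re.match(pattern, value):
        return None  # matches expectation: no report
    return {"field": field, "value": value, "oracle": pattern}

print(evaluate("zip_code", "1234"))   # disagrees with the oracle: reported
print(evaluate("zip_code", "12345"))  # None: matches expectation
```

Note that with no entry in `ORACLE`, nothing can be reported at all, which is the whole argument above: without an agreed reference, there is no defect, only a disagreement waiting to happen.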

What is an observation? There are many dictionary definitions, but for our discussion, let us use Merriam-Webster’s third entry: “an act of recognizing and noting a fact or occurrence often involving measurement with instruments”. The Free Dictionary Online has a similar definition of observation (second entry) as “a detailed examination of something for analysis, diagnosis, or interpretation.” Isn’t that what we do as testers? We perform a series of actions to elicit a response and compare that response against a set of expectations. We then log anything we consider out of line with those expectations. I propose that is exactly what we do in the testing field. Therefore, defects are really observations. Calling them that eliminates negative-sounding words, eliminates legal concerns and, most of all, better matches our actions. I propose we call them observations. Thoughts?