Long live testing!
There’s an apparent conflict between the idea of a cross-functional team collaborating on their work and specialist testers who only do testing. There’s an awful lot wrong with the “them and us” attitude between developers and testers (in both directions) in some organisations. One of the biggest problems with isolating testing from development is that it takes responsibility for quality away from the developers and creates a poacher-and-gamekeeper culture that drives divisions between people. There’s a well established cliché of development chucking rubbish over the wall, and an equally well established cliché of testers being empire-building bureaucrats who only ever say no.
The biggest problem from my point of view in having separate developers and testers is that it creates two teams (an example of Conway’s Law), a team of teams, and that causes all sorts of problems with communication, identity, culture, decision making, definition of done etc. If you’ve got a small team and you split it in two by having devs and testers, you’re taking something that is potentially high performing and deliberately interfering to make it less likely to succeed. A typical management activity.
Testing as part of the team
In my teams we’ve always considered that we have to build quality into every activity if we have a hope of achieving quality in the product. That means our lowest levels of done aren’t that we wrote some code, or did a build but that we have done some testing. We have tested the basic and alternate flows, or the obvious paths through the stories. We don’t consider it stable/promoted/out of development until this has happened.
Ideally I like to do this two ways. The first is using automated tests as part of my continuous integration to test the obvious paths and the various parameters that impact the system (checking different data sets against expected functionality etc.). The other is for the team to test each other’s stuff – not with a test script but with the requirement/acceptance criteria. This allows us to check what isn’t “normal”, some of the fringe cases, but mostly to check that our view of “normal” is consistent across the team and improve the common understanding of the product.
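As a rough illustration of “checking different data sets against expected functionality”, here is a minimal sketch of a table-driven check that a continuous integration job could run. The function under test (`bulk_price`), its discount rule and the data sets are all hypothetical, made up purely for the example.

```python
def bulk_price(quantity, unit_price_pence):
    """Hypothetical function under test: 10% off orders of 12 or more."""
    if quantity < 0:
        raise ValueError("quantity cannot be negative")
    total = quantity * unit_price_pence
    if quantity >= 12:
        total -= total // 10  # integer pence, so no float rounding issues
    return total


# Each row is one data set: (quantity, unit price in pence, expected total).
# The rows cover the obvious paths either side of the discount threshold.
CASES = [
    (0, 100, 0),
    (1, 100, 100),
    (11, 100, 1100),
    (12, 100, 1080),
]


def run_cases():
    """Run every data set and return a list of human-readable failures."""
    failures = []
    for quantity, unit_price, expected in CASES:
        actual = bulk_price(quantity, unit_price)
        if actual != expected:
            failures.append(
                f"bulk_price({quantity}, {unit_price}) = {actual}, expected {expected}"
            )
    return failures


if __name__ == "__main__":
    problems = run_cases()
    for line in problems:
        print(line)
    # A non-zero exit code is what makes a CI job fail.
    raise SystemExit(1 if problems else 0)
```

Adding a new data set is then a one-line change to the table, which keeps the cheap, obvious checks out of the hands of human testers.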
Testing isn’t done by a separate team using separate tools, it’s done by the team for the team, to deliver value to the customer.
So what about Testers?
So where do professional testers live in this world? There is still a role for professional testers, but it’s not at the lowest level of development-testing. It’s a little higher up, where having creative, intelligent humans with a quality focus can provide the most value to the software value stream. Human testing isn’t especially useful for testing the normal flows; it’s useful for testing the “fringe cases” and doing exploratory testing.
Focus testing professionals on testing the end-to-end integration scenarios where many of the real problems tend to lie. Testers have a wealth of experience and knowledge of different test techniques that can be applied to address different quality risks, wasting that on simple stuff means they don’t have time to do the important stuff.
Fringe testing is testing the unusual cases, the things that weren’t written down, weren’t predictable and weren’t obvious. In short it’s the kind of thing inventive, creative humans are good at.
When we’ve got requirements they’re often written in a form that states the normal way something will go, like a basic flow in a use case (or similar). Testing this might be useful but if it doesn’t work then the developer who wrote it clearly owes everyone a doughnut because after writing it, or finishing writing it, the very first thing that should be tried is going through the basic flow. In fact it’s a good candidate for automation and testing it should be considered part of the development team’s lowest levels of their definition of done.
Use Cases are normally elaborated in terms of more than just a basic flow, with a set of alternative or exceptional cases or scenarios. You might consider each of these scenarios as User Stories – I quite like this way of working, with Use Cases to make a high level scope diagram and User Stories for actually elaborating the requirements and acceptance criteria (see Use Case vs. User Story). The thing is, if it’s been written down in the requirements then the team should be testing it as a matter of course, so the probability of finding defects in these flows is again pretty low. If they’re written as stories and implemented incrementally with acceptance criteria then it’s even more unlikely, as they’re not much different from the basic flow. Just another story that should be developed and tested as part of the lowest levels of getting things done.
The fringe cases are the weird things that weren’t written down, that are odd ways through the requirements and are the places where the quality risks probably lie. Coming up with these is what real professional testers are good at and what proper test techniques are good for. Testing the basic paths of everything is less useful.
I’ve often thought that one way to simulate a user using my app is to take a scatter gun approach to clicking and pressing buttons in an app, because I think algorithmically, users often don’t. Many “normal” usages of your software may actually be fringe cases because things aren’t necessarily always used as designed. Of course that makes for a good argument for simpler interfaces (both GUIs and APIs).
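A minimal sketch of that scatter gun idea, assuming a toy stand-in for the app: `ToyApp` and its three “buttons” are hypothetical, and the only check is a single invariant (the basket count never goes negative). The seed makes a failing sequence repeatable.

```python
import random


class ToyApp:
    """Hypothetical stand-in for a real app: a basket with three buttons."""

    def __init__(self):
        self.items = 0

    def add(self):
        self.items += 1

    def remove(self):
        if self.items > 0:
            self.items -= 1

    def clear(self):
        self.items = 0


def scatter_gun(app, presses, seed):
    """Press random buttons and check an invariant after every press.

    Returns the list of button names pressed, so a failure can be replayed.
    """
    rng = random.Random(seed)  # seeded so any failing run can be reproduced
    buttons = [app.add, app.remove, app.clear]
    history = []
    for _ in range(presses):
        button = rng.choice(buttons)
        button()
        history.append(button.__name__)
        if app.items < 0:
            raise AssertionError(f"negative basket after: {history}")
    return history


if __name__ == "__main__":
    pressed = scatter_gun(ToyApp(), presses=1000, seed=1)
    print(f"{len(pressed)} random presses, invariant held")
```

This only finds the crashes and broken invariants that random input happens to hit. It doesn’t replace a human tester deciding which odd paths are worth exploring.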
So “Fringe Testing” is simply testing the unusual paths through your software, the places that most likely are the highest quality risks. Of course the most “fringey” cases are often the cross requirement paths, the end to end scenarios, that users take through your software or set of integrating components. As for traditional testing… I think it’s dead.
I’ve previously blogged on the Quality Confidence metric being a useful lead indicator for software quality. One common question I’ve had on that is “Are you expecting us to fully test every requirement?”. The simple answer is “no”, testing should be focussed on the fringe cases to put an expensive resource where it’s most valuable. The Quality Confidence measure requires you to assert when you have enough coverage, for me that’s when you feel you’ve got a handle on the quality risks by testing a mix of important flows, risky stories and weird fringe cases – the rest I cover with simple professional development which always covers verification as part of development.
For a long time I’ve wanted to be able to express the quality of my current software release in a simple intuitive way. I don’t want a page full of graphs and charts I just want a simple visualisation that works at every level of requirements to verification (up and down a decomposition/recomposition stack if you’ve got such a beasty). My answer to that is Quality Confidence.
What it is
QC combines a number of standard questions together in a single simple measure of the current quality of the release, so instead of going to each team/test manager/project manager and asking them the same questions and trying to balance the answers in my head I can get a simple measure that I can use to quantitatively determine whether my teams are meeting our required “level of done”. The QC measure combines:
- how much test coverage have we got?
- what’s the current pass rate like?
- how stable are the test results?
We can represent QC as a single value or show it changing over time as shown below.
How to calculate it
Quality Confidence is 100% if all of the in scope requirements have got test coverage and all of those tests have passed for the last few test runs.
To calculate QC we track the requirements in a project and when they’re planned for/delivered. This is to limit the QC to only take into account delivered requirements. There’s no point in saying we’ve only got 10% quality of the current release because it only passes some of the tests when the rest of the requirements haven’t been delivered yet.
We also track all of the tests related to each requirement, and their results for each test run. We need to assert when a requirement has “enough coverage” so we know whether to include it or not. The reason for this is that if I say a requirement has been delivered but doesn’t yet have enough test coverage, then even if all of its testing has passed and been stable I don’t want it adding to the 100% of potential quality confidence. The assertion that coverage isn’t enough means that we aren’t confident in the quality of that requirement.
So 100% quality for a single requirement that’s in scope for the current release is when all the tests for that requirement have been run and passed (not just this time but for the last few runs) and that the requirement has enough coverage. For multiple requirements we simply average (or maybe weighted average) the results across the requirements set.
If we don’t run all the tests during each test run then we can interpolate the quality for each requirement, but I suggest decreasing the confidence for a requirement (maybe by 0.8) for each missing run. After all, just because a test passed previously doesn’t mean it’s going to still pass now. We also decrease the influence of each test run on the QC of a requirement based on its age, so that if the test failed 5 runs ago it has less impact on the QC than the most recent test run. Past 5 or so (depending on test cycle frequency) test runs we decrease the influence to zero. More info on calculation here.
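The description above can be sketched in a few lines of Python. This is not the actual maths behind QC, just one plausible reading of it, and several things are assumptions: the linear age weighting (5, 4, 3, 2, 1), treating “decrease by 0.8” as multiplying by 0.8 for each missing run, scoring a requirement without enough coverage as zero, and a plain unweighted average across requirements. All names here are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WINDOW = 5             # runs older than this have zero influence (assumed)
MISSING_PENALTY = 0.8  # multiplier applied per missing run (assumed reading)


@dataclass
class Requirement:
    name: str
    delivered: bool        # only delivered requirements are in scope
    enough_coverage: bool  # the team's assertion, not a computed value
    # Pass rate (0.0 to 1.0) per test run, most recent first.
    # None means the requirement's tests were not run in that cycle.
    runs: List[Optional[float]] = field(default_factory=list)


def requirement_confidence(req: Requirement) -> float:
    """Age-weighted pass rate over the last WINDOW runs, 0.0 to 1.0."""
    if not req.enough_coverage:
        return 0.0  # delivered but under-tested adds nothing to the total
    weighted = 0.0
    total_weight = 0.0
    penalty = 1.0
    for age, pass_rate in enumerate(req.runs[:WINDOW]):
        if pass_rate is None:
            penalty *= MISSING_PENALTY  # an old pass may no longer hold
            continue
        weight = WINDOW - age  # linear decay: newest run counts most
        weighted += weight * pass_rate
        total_weight += weight
    if total_weight == 0:
        return 0.0  # never run in the window, so no confidence
    return penalty * weighted / total_weight


def quality_confidence(requirements: List[Requirement]) -> float:
    """Plain average of requirement confidence across delivered requirements."""
    in_scope = [r for r in requirements if r.delivered]
    if not in_scope:
        return 0.0
    return sum(requirement_confidence(r) for r in in_scope) / len(in_scope)
```

With this weighting, a failure in the most recent run drags a requirement down far more than the same failure five runs ago, and a requirement flagged as not having enough coverage still counts in the denominator, so it pulls the overall QC down until the coverage is there.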
So… how much coverage is enough?
Enough coverage for a requirement is an interesting question… For some it’s when they’ve covered enough lines of code, for others the cyclomatic complexity has an impact, or the number of paths through the requirements/scenarios/stories/use cases etc. For me, a requirement has enough test coverage when we feel we’ve covered the quality risks. I focus my automated testing on basic and normal flows and my human testing on the fringe cases. Either way, you need to make the decision of when enough is enough for your project.
To help calibrate this you can correlate the QC with the lag measure of escaped defects.
Measurement driven behaviour
The QC measure is quite intuitive and tends to be congruent with people’s gut feel of how the project/release is going, especially when shown over time, however there’s simply no substitute for going and talking to people. QC is a useful indicator but not something that you can use in favour of real communication and interaction.
The measurement driven behaviour for QC is interesting as you can only calculate it if you track requirements (of some form) related to tests and their results. You can push it up by adding more tests and testing more often 🙂 Or by asserting you have enough coverage when you don’t 😦 However correlation to escaped defects would highlight that false assertion.
If you’ve got a requirements stack ranging from some form of high level requirements with acceptance tests to low level stories and system tests you can implement the QC measure at each level and even use it as a quality gate prior to allowing entry to a team of teams release train.
Unfortunately, because me and my mate Ray came up with this in a pub there aren’t any tools (other than an Excel spreadsheet) that automatically calculate QC yet. If you’re a tool writer and would like to build one then please do, I’ll send you the maths!
Use Cases are too big to fit into a sprint/iteration! User Stories are so fine grained there’s too many to keep track of! Where’s the big picture? How do we define releases? Argh!!! I don’t know which to use!
Personally I tend to use both. I don’t think there’s any conflict between Use Cases and User Stories, in fact they’re rather complementary. Here’s how and why….
A few years ago I delivered a presentation at an agile software development conference entitled “Do requirements still matter?”. The short answer was “yes”. Most descriptions of agile iteration start with having a backlog, prioritising etc. etc. But where does the backlog come from? My answer: the Backlog Fairy; obviously.
Of course requirements still matter, they’re how we get a common understanding between customers and development teams of what we need to do. They’re the units by which we can incrementally build systems. In the rush to adopt agile processes a lot of teams have forgotten that they still need to start up with some lightweight analysis of scope and risk.
One of the things Use Cases are great at is capturing the big picture, showing a simple diagram of stick people and blobs that people who’ve never heard of UML can happily discuss in terms of what big bits of functionality are in and out of the system and who’s going to do them. The humble Use Case diagram. Not so easy to do that with a massive list of small stories.
However Use Cases can often be a bit too big to fit into a single development cycle but since they’re made up of a lot of different scenarios they’re easy to slice up into smaller bits. This is where I often use stories.
The advantage of Use Cases defining the scope is they’re nice chunky things to estimate and prioritise in a first pass. Of course as we get going we’ll estimate and prioritise stories for sprints/iterations but Use Cases help us prioritise requirements into Releases.
So I use a Use Case diagram to describe the high level scope of the thing to be developed. This normally takes about 5 minutes to sketch. Then I use the Use Cases identified by the diagram (not documents, just the ovals on the diagram) to focus discussion on deriving stories (or use case slices) for development and testing within a sprint/iteration.
Sometimes stories turn up that don’t really fit into the early Use Case model. This is a rather good thing as it lets us challenge the understanding of scope. Does the story not fit in because it’s not really in line with the customer’s priorities and needs, because it’s a cross-cutting or architectural concern, or because we’ve missed some important part of the scope? All are important things to understand.
Pictures that are worth 500 billion words!
Google Ngram Viewer shows graphs of how many times words or phrases have occurred in a set of 5 million books over the years. They’re a really interesting way of seeing trends in information and relative importance between words. It’s free and easy so check it out.
Here’s some I recently ran that I found interesting. I ran most of them from 1950 onwards and the info only goes up to 2008.
Comparison of programming languages
Ngram link – When looking at this you’ve got to mentally remove the baseline Java and Pascal references from the 1950s as they’re about coffee, islands and mathematicians. Interesting to see Java so dominant.
Ngram link – I found this one really interesting. Compared to the others in my query “structured programming” had a lot more books written about it. I wonder how much this is a reflection of the rise of the internet… these days although there are lots of programming books the primary source for learning a language is online material?
Ngram link – I was a little surprised to see RUP so much more prevalent than agile but then I did have to add “software development” to the term to avoid including the bendy and stretchy. Also as with the previous one I suspect that there’s a difference here between a vendor driven process with supporting books and a more open source philosophy on agile as a generic umbrella for methodologies, and therefore more online sources. As Ivar Jacobson says: “No one reads process books”
Shareware, Freeware and OSS
Ngram link – This one speaks for itself 🙂 I wish I could have worked out how to add “expensive vendor products” to the query!
User Stories vs. Use Cases
Ngram link – Ah yes, this argument again. Interestingly this dominance of use case over user story in written books correlates with query stats between user stories and use cases on my blog and the ivarjacobson.com site. Personally I think they’re both great and complementary, I often use them together on software projects.
Windows vs. Linux
For more fun with Ngrams watch this very funny video explaining this stuff
Note this is from 2008, for a review of RRC in 2011 see CLM 2011 review
So I downloaded and installed IBM Rational Requirements Composer (RRC) today. I’m not very good at reading instructions so typically I didn’t read them but I still managed to set up RRC server and connect a client within an hour 😀 Excellent job yet again Jazz people, in the past with the “classic” tools this sort of thing wouldn’t have been possible in such a short time. It even co-exists (but isn’t integrated) with my Rational Team Concert installation. At the moment I’ve got two Jazz server instances which is a shame, but this is only a Beta.
Anyway, I used the configuration utility and, referring to the instructions only once or twice, I quickly got RRC set up and working. Having said that, the config utility uses an embedded IE instance to access the Jazz Admin console and for me that wasn’t working, so I gave up on the config utility and just used trusty Firefox.
The client is Eclipse based but isn’t shell sharing with my other Eclipse shell at the moment. I’ve created a Test Project and thought about creating some artifacts to go along with it. I can’t see where to edit templates but since this is Jazz based I’m sure that everything is customisable. I’ve got a bunch of errors showing in my logs and in the Jazz admin web UI so I’m not sure if I’m seeing everything anyway. Perhaps reading the instructions is a good idea!
It’s clear that it’s an early Beta as there’s still a lot of simple UI bugs but the point of these releases is not to provide a finished product but to give people that are interested a chance to get to grips with the functionality and look and feel. So here’s some of my thoughts and screenshots (clicky piccies):
I set about creating a process diagram
Then a glossary that supported some of the terms that I identified in thinking about the business process
I also played around with creating a Use Case diagram
Because I’ve got a software development background I immediately decided to mock up a UI and screen flow rather than consider any of those pesky requirement things 😛
At this point it seemed like a good idea to think about writing an initial Use Case specification, this was cool because I could integrate the various things I’ve done already such as embed the UI mockup, link to the business process and have glossary management done for me too 🙂
All of which left me with some cross linked integrated stuff to do with capturing my requirements as regards eating doughnuts
It’s quite easy to start setting up a set of integrated stuff including storyboards, process diagrams, use cases, UI mockups etc. and it’s very non-technical to use. Personally I found the UI mockup functionality to be limited; I’d prefer more free form drawing capability when I create a “sketch”. It would take me longer to mock up a UI here than it would for me to build it in Visual Studio – but then again this is aimed at analysts who may not be able to use IDEs. The UI seems very Windows based as well, what about trusty web widgets?
I’d like to get at the project template and see what can be done in terms of the elements and structure of the project, not to mention document templates for things like Use Cases.
Does this replace RequisitePro? No. Although it’s got requirements authoring, marking and linking, RRC doesn’t yet provide full traceability management and (at least at the moment) I can’t see where I’d go about attributing and managing requirements attributes. That’s why RRC has integration into ReqPro to provide these things.
Personally I’d like to see versionable requirements artifacts, more flexible UI sketching, traceability management, attribute management and more integration into other Rational tools such as Rational Software Architect and Team Concert.
This is a good start in terms of providing a single tool to support requirements elicitation and elaboration, all the diagrams and docs in one place, easily distributed and collaborated on. I’ll look forward to seeing more of it as time goes by 🙂