Fringe testing is testing the unusual cases, the things that weren’t written down, weren’t predictable and weren’t obvious. In short it’s the kind of thing inventive, creative humans are good at.
When we’ve got requirements they’re often written in a form that states the normal way something will go, like a basic flow in a use case (or similar). Testing this might be useful but if it doesn’t work then the developer who wrote it clearly owes everyone a doughnut because after writing it, or finishing writing it, the very first thing that should be tried is going through the basic flow. In fact it’s a good candidate for automation and testing it should be considered part of the development team’s lowest levels of their definition of done.
Use Cases are normally elaborated in terms of more than just a basic flow with a set of alternative or exceptional cases or scenarios. You might consider each of these scenarios as Use Stories – I quite like this way of working with both Use Cases to make a high level scope diagram and User Stories for actually elaborating the requirements and acceptance criteria (see Use Case vs. User Story). The thing is, if it’s been written down in the requirements then the team should be testing it as a matter of course, the probability of finding defects in these flows is again pretty low. If they’re written as stories and implemented incrementally with acceptance criteria then it’s even more unlikely as they’re not much different from the basic flow. Just another story that should be developed and tested as part of the lowest levels of getting things done.
The fringe cases are the weird things that weren’t written down, that are odd ways through the requirements and are the places where the quality risks probably lie. Coming up with these is what real professional testers are good at and what proper test techniques are good for. Testing the basic paths of everything is less useful.
I’ve often thought that one way to simulate a user using my app is to take a scatter gun approach to clicking and pressing buttons in an app, because I think algorithmically, users often don’t. Many “normal” usages of your software may actually be fringe cases because things aren’t necessarily always used as designed. Of course that makes for a good argument for simpler interfaces (both GUIs and APIs).
So “Fringe Testing” is simply testing the unusual paths through your software, the places that most likely are the highest quality risks. Of course the most “fringey” cases are often the cross requirement paths, the end to end scenarios, that users take through your software or set of integrating components. As for traditional testing… I think it’s dead.
I’ve previously blogged on the Quality Confidence metric being a useful lead indicator for software quality. One common question I’ve had on that is “Are you expecting us to fully test every requirement?”. The simple answer is “no”, testing should be focussed on the fringe cases to put an expensive resource where it’s most valuable. The Quality Confidence measure requires you to assert when you have enough coverage, for me that’s when you feel you’ve got a handle on the quality risks by testing a mix of important flows, risky stories and weird fringe cases – the rest I cover with simple professional development which always covers verification as part of development.
For a long time I’ve wanted to be able to express the quality of my current software release in a simple intuitive way. I don’t want a page full of graphs and charts I just want a simple visualisation that works at every level of requirements to verification (up and down a decomposition/recomposition stack if you’ve got such a beasty). My answer to that is Quality Confidence.
What it is
QC combines a number of standard questions together in a single simple measure of the current quality of the release, so instead of going to each team/test manager/project manager and asking them the same questions and trying to balance the answers in my head I can get a simple measure that I can use to quantitively determine whether my teams are meeting our required “level of done“. The QC measure combines:
- how much test coverage have we got?
- what’s the current pass rate like?
- how stable are the test results?
We can represent QC as a single value or show it changing over time as shown below.
How to calculate it
Quality Confidence is 100% if all of the in scope requirements have got test coverage and all of those tests have passed for the last few test runs.
To calculate QC we track the requirements in a project and when they’re planned for/delivered. This is to limit the QC to only take into account delivered requirements. There’s no point in saying we’ve only got 10% quality of the current release because it only passes some of the tests because the rest having been delivered yet.
We also track all of the tests related to each requirement, and their results for each test run. We need to assert when a requirement has “enough coverage” so we know whether to include it or not – the reason for this is that if I say a requirement has been delivered but doesn’t yet have enough test coverage then even if all of it’s testing has passed and been stable then I don’t want it adding to the 100% of potential quality confidence. The assertion that coverage isn’t enough means that we aren’t confident in the quality of that requirement.
So 100% quality for a single requirement that’s in scope for the current release is when all the tests for that requirement have been run and passed (not just this time but for the last few runs) and that the requirement has enough coverage. For multiple requirements we simply average (or maybe weighted average) the results across the requirements set.
If we don’t run all the tests each during each test run then we can interpolate the quality for each requirement but I suggest decreasing the confidence for a requirement (maybe by 0.8) for each missing run. After all just because a test passed previously doesn’t mean it’s going to still pass now. We also decrease the influence of each test run on the QC of a requirement based on it’s age so that if 5 tests ago the test failed it has less impact on the QC that the most recent test run. Past 5 or so (depending on test cycle frequency) test runs we decrease the influence to zero. More info on calculation here.
So… how much coverage is enough?
Enough coverage for a requirement is an interesting question… For some it’s when they’ve covered enough lines of code, for others the cyclomatic complexity has an impact, or the number of paths through the requirements/scenarios/stories/use cases etc. For me, a requirement has enough test coverage when we feel we’ve covered the quality risks. I focus my automated testing on basic and normal flows and my human testing on the fringe cases. Either way, you need to make the decision of when enough is enough for your project.
To help calibrate this you can correlate the QC with the lag measure of escaped defects.
Measurement driven behaviour
The QC measure is quite intuitive and tends to be congruent with people’s gut feel of how the project/release is going, especially when shown over time, however there’s simply no substitute for going and talking to people. QC is a useful indicator but not something that you can use in favour of real communication and interaction.
The measurement driven behaviour for QC is interesting as you can only calculate it if you track requirements (of some form) related to tests and their results. You can push it up by adding more tests and testing more often 🙂 Or by asserting you have enough coverage when you don’t 😦 However correlation to escaped defects would highlight that false assertion.
If you’ve got a requirements stack ranging from some form of high level requirements with acceptance tests to low level stories and system tests you can implement the QC measure at each level and even use it as a quality gate prior to allowing entry to a team of teams release train.
Unfortunately, because me and my mate Ray came up with this in a pub there aren’t any tools (other than an Excel spreadsheet) that automatically calculate QC yet. If you’re a tool writer and would like then please do, I’ll send you the maths!
I’m not a big fan of metrics, measures, charts, reporting and data collection. I’m not terribly impressed by dashboards with 20 little graphs on showing loads of detailed information. When I’m involved in projects I want to know 3 simple things:
- How quick are we doing stuff?
- Are we on track or not?
- Is the stuff good enough quality?
There can be some deep science behind the answers to those questions but at the surface that’s all I want to see.
Organisations need to know that teams are delivering quality products at the right pace to fit the business need. To achieve this goal teams need to be able to demonstrate that their product is of sufficient quality and that they can commit to delivering the required scope within the business time scales. If the project goal may not be achieved then the business or the team need to change something (such as scope, resources or time scales). This feedback mechanism and the open transparent communication of this knowledge is key to the success of agile delivery.
The goal of delivering quality products at the right pace can be measured in many complex ways however, when designing the Project Forum agile at scale practice we looked at just 3 measures. In fact I should probably call them 2.5 measures as the throughput/release burnup can be considered mutually exclusive (if you’re continuous flow or iterative). The most important measure is people’s opinions when you go and talk to your team.
Note: in the measures section I often refer to “requirements” as a simple number, this could be a count, a normalised count, magnitude, points, etc. it doesn’t matter what’s used so long as it’s consistent.