In my previous post I talked about using a sketch to describe architectural structure, but the other part of a useful architectural description is its dynamics, best expressed as architectural mechanisms.
Mechanisms are little snippets of the architecture that address an important problem, provide a common way of doing something or are good examples of how the architecture hangs together.
Mechanisms exist within the context of an architecture, which provides overall structure for the mechanisms. I tend to use a simple architectural overview sketch to do that and then further refine the architecture, if necessary, in terms of mechanisms according to the architectural profile and (during development) the needs of my team.
Sometimes during a project the team will comment on the need to have a common way of doing something, or that they’ve uncovered something tricky that we need to consider as a more significant part of the system than our early analysis showed. In these cases it’s time to create a mechanism.
Mechanisms are great, but you don’t want too many of them, nor do you want to document and detail them too much: just enough to communicate the architecture and support maintenance efforts. Indeed, writing too much actually makes it harder to communicate.
Mechanisms are best expressed in terms of their structure and behaviour. I tend to use a simple class diagram for the first and whatever seems appropriate for the second. This might be a UML sequence diagram, but I don’t really like those, so instead I might use a good old fashioned activity diagram, or a flowchart with GUI mockups in the nodes. Either way I recommend limiting the documentation and description: just because one flow is worth writing down to explain it, the others might not be. In this way I do architectural specification by example. Once I’ve written enough about a mechanism that the rest can be inferred, I stop.
The words aren’t important in this example but you can see that I try to fit the description into a fairly small concise area – that helps me focus on just the really important stuff. In the top left there’s a list titled “Appropriate for stories like:” which is an indicative list of a few things to which the mechanism is appropriate. Next to it is some blurb that says what it’s for and the main scenarios it covers, so in the case of persistency it’s the normal query, create, edit, save & delete. There might be some notes around important constraints or whatever else is important.
I’ll then describe each important scenario in terms of its behaviour in whatever language or visual form makes sense. Sometimes this is a photo of a whiteboard 🙂 Sometimes it’s text, sometimes it’s a combination of those things.
The flip side
Just as stories have a flip side which contains their acceptance tests, I also like to put acceptance tests on the flip side of my mechanisms. Although many are easy to frame in terms of customer acceptance tests (e.g. a Search Mechanism will have performance, consistency and accuracy acceptance criteria), some are a little harder to frame. Technical mechanisms formed to provide a common way of doing something in an architecture, or to express the shape and aesthetics of an architecture, may feel like they only make sense in terms of the development team’s acceptance criteria. However, I always make sure they relate back to a story if this is the case, otherwise I could be needlessly gold plating.
Mechanisms are best found by understanding the architectural profile initially and then by actually building the system. If the customer doesn’t have a story that will be satisfied in part by a mechanism then it probably shouldn’t be there. Even if it is shiny.
Long live testing!
There’s an apparent conflict between the idea of a cross-functional team collaborating together on their work and specialist testers that only do testing. There’s an awful lot wrong with the “them and us” attitude between developers and testers (in both directions) in some organisations. One of the biggest problems in isolating testing from development is that it takes responsibility for quality away from the developers and creates a poacher-and-gamekeeper culture that drives divisions between people. There’s a well established cliché of development chucking rubbish over the wall, and there’s also a well established cliché of testers being empire building bureaucrats who only ever say no.
The biggest problem from my point of view in having separate developers and testers is that it creates two teams (an example of Conway’s Law), a team of teams and that causes all sorts of problems with communication, identity, culture, decision making, definition of done etc. If you’ve got a small team and you split it into two by having devs and testers you’re taking something that is potentially high performing and deliberately interfering to make it less likely to succeed. A typical management activity.
Testing as part of the team
In my teams we’ve always considered that we have to build quality into every activity if we have a hope of achieving quality in the product. That means our lowest levels of done aren’t that we wrote some code, or did a build but that we have done some testing. We have tested the basic and alternate flows, or the obvious paths through the stories. We don’t consider it stable/promoted/out of development until this has happened.
Ideally I like to do this two ways. The first is using automated tests as part of my continuous integration to test the obvious paths and the various parameters that impact the system (checking different data sets against expected functionality etc.). The other is for the team to test each other’s stuff, not with a test script but with the requirement/acceptance criteria. This allows us to check what isn’t “normal”, some of the fringe cases, but mostly to check that our view of “normal” is consistent across the team and improve the common understanding of the product.
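As a rough illustration of the automated half, here is a minimal sketch of a table-driven test that checks several data sets against expected behaviour. The delivery_charge function, its free-delivery threshold and the sample values are all hypothetical, invented purely for the example; it uses Python’s standard unittest module and could be run from a continuous integration job with python -m unittest.

```python
import unittest


def delivery_charge(order_value):
    """Hypothetical function under test: delivery is free from 20 upwards."""
    if order_value < 0:
        raise ValueError("order value cannot be negative")
    return 0.0 if order_value >= 20 else 2.5


# Each row is one data set: (input order value, expected delivery charge).
CASES = [
    (0, 2.5),
    (19.99, 2.5),
    (20, 0.0),
    (100, 0.0),
]


class DeliveryChargeTest(unittest.TestCase):
    def test_expected_charges(self):
        # subTest reports each failing data set separately instead of
        # stopping at the first one.
        for order_value, expected in CASES:
            with self.subTest(order_value=order_value):
                self.assertEqual(delivery_charge(order_value), expected)

    def test_negative_value_rejected(self):
        with self.assertRaises(ValueError):
            delivery_charge(-1)
```

Adding a new data set is then a one-line change to the table rather than a new test method.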
Testing isn’t done by a separate team using separate tools, it’s done by the team for the team, to deliver value to the customer.
So what about Testers?
So where do professional testers live in this world? There is still a role for professional testers, but it’s not at the lowest level of development-testing, it’s a little higher up where having creative intelligent humans with a quality focus can provide the most value to the software value stream. Human testing isn’t especially useful for testing the normal flows, it’s useful for testing the “fringe cases” and doing exploratory testing.
Focus testing professionals on testing the end-to-end integration scenarios where many of the real problems tend to lie. Testers have a wealth of experience and knowledge of different test techniques that can be applied to address different quality risks, wasting that on simple stuff means they don’t have time to do the important stuff.
Fringe testing is testing the unusual cases, the things that weren’t written down, weren’t predictable and weren’t obvious. In short it’s the kind of thing inventive, creative humans are good at.
When we’ve got requirements they’re often written in a form that states the normal way something will go, like a basic flow in a use case (or similar). Testing this might be useful but if it doesn’t work then the developer who wrote it clearly owes everyone a doughnut because after writing it, or finishing writing it, the very first thing that should be tried is going through the basic flow. In fact it’s a good candidate for automation and testing it should be considered part of the development team’s lowest levels of their definition of done.
Use Cases are normally elaborated in terms of more than just a basic flow, with a set of alternative or exceptional cases or scenarios. You might consider each of these scenarios as User Stories. I quite like this way of working, with both Use Cases to make a high level scope diagram and User Stories for actually elaborating the requirements and acceptance criteria (see Use Case vs. User Story). The thing is, if it’s been written down in the requirements then the team should be testing it as a matter of course, so the probability of finding defects in these flows is again pretty low. If they’re written as stories and implemented incrementally with acceptance criteria then it’s even more unlikely as they’re not much different from the basic flow. Just another story that should be developed and tested as part of the lowest levels of getting things done.
The fringe cases are the weird things that weren’t written down, that are odd ways through the requirements and are the places where the quality risks probably lie. Coming up with these is what real professional testers are good at and what proper test techniques are good for. Testing the basic paths of everything is less useful.
I’ve often thought that one way to simulate a user using my app is to take a scatter gun approach to clicking and pressing buttons in an app, because I think algorithmically, users often don’t. Many “normal” usages of your software may actually be fringe cases because things aren’t necessarily always used as designed. Of course that makes for a good argument for simpler interfaces (both GUIs and APIs).
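To make the scatter gun idea a bit more concrete, here is a minimal sketch of random “monkey” testing against a toy model. The ShoppingBasket class, its actions and the invariant are all hypothetical stand-ins for a real app; a real version would drive the GUI or API instead. A seeded random generator is used so that any failing sequence can be replayed.

```python
import random


class ShoppingBasket:
    """Hypothetical stand-in for the app under test."""

    def __init__(self):
        self.items = 0
        self.checked_out = False

    def add(self):
        if not self.checked_out:
            self.items += 1

    def remove(self):
        if not self.checked_out and self.items > 0:
            self.items -= 1

    def checkout(self):
        if self.items > 0:
            self.checked_out = True

    def reset(self):
        self.items = 0
        self.checked_out = False


def invariant_holds(basket):
    # Never a negative item count, and never a checked out empty basket.
    return basket.items >= 0 and (not basket.checked_out or basket.items > 0)


def scatter_gun(steps, seed):
    """Fire random actions at the basket.

    Returns the list of action names that broke the invariant,
    or None if every step left the basket in a valid state.
    """
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    basket = ShoppingBasket()
    actions = [basket.add, basket.remove, basket.checkout, basket.reset]
    history = []
    for _ in range(steps):
        action = rng.choice(actions)
        history.append(action.__name__)
        action()
        if not invariant_holds(basket):
            return history
    return None
```

Calling scatter_gun(1000, seed=1) walks a thousand random presses; the returned history, if there is one, is exactly the odd path through the software that nobody wrote down.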
So “Fringe Testing” is simply testing the unusual paths through your software, the places that most likely are the highest quality risks. Of course the most “fringey” cases are often the cross requirement paths, the end to end scenarios, that users take through your software or set of integrating components. As for traditional testing… I think it’s dead.
I’ve previously blogged on the Quality Confidence metric being a useful lead indicator for software quality. One common question I’ve had on that is “Are you expecting us to fully test every requirement?”. The simple answer is “no”, testing should be focussed on the fringe cases to put an expensive resource where it’s most valuable. The Quality Confidence measure requires you to assert when you have enough coverage, for me that’s when you feel you’ve got a handle on the quality risks by testing a mix of important flows, risky stories and weird fringe cases – the rest I cover with simple professional development which always covers verification as part of development.
For a long time I’ve wanted to be able to express the quality of my current software release in a simple intuitive way. I don’t want a page full of graphs and charts I just want a simple visualisation that works at every level of requirements to verification (up and down a decomposition/recomposition stack if you’ve got such a beasty). My answer to that is Quality Confidence.
What it is
QC combines a number of standard questions together in a single simple measure of the current quality of the release, so instead of going to each team/test manager/project manager and asking them the same questions and trying to balance the answers in my head I can get a simple measure that I can use to quantitatively determine whether my teams are meeting our required “level of done”. The QC measure combines:
- how much test coverage have we got?
- what’s the current pass rate like?
- how stable are the test results?
We can represent QC as a single value or show it changing over time as shown below.
How to calculate it
Quality Confidence is 100% if all of the in scope requirements have got test coverage and all of those tests have passed for the last few test runs.
To calculate QC we track the requirements in a project and when they’re planned for/delivered. This is to limit the QC to only take into account delivered requirements. There’s no point in saying we’ve only got 10% quality of the current release because it only passes some of the tests when the rest of the requirements haven’t been delivered yet.
We also track all of the tests related to each requirement, and their results for each test run. We need to assert when a requirement has “enough coverage” so we know whether to include it or not. The reason for this is that if I say a requirement has been delivered but doesn’t yet have enough test coverage, then even if all of its testing has passed and been stable I don’t want it adding to the 100% of potential quality confidence. The assertion that coverage isn’t enough means that we aren’t confident in the quality of that requirement.
So 100% quality for a single requirement that’s in scope for the current release is when all the tests for that requirement have been run and passed (not just this time but for the last few runs) and that the requirement has enough coverage. For multiple requirements we simply average (or maybe weighted average) the results across the requirements set.
If we don’t run all the tests during each test run then we can interpolate the quality for each requirement, but I suggest decreasing the confidence for a requirement (maybe by 0.8) for each missing run. After all, just because a test passed previously doesn’t mean it’s going to still pass now. We also decrease the influence of each test run on the QC of a requirement based on its age, so that if the test failed 5 test runs ago it has less impact on the QC than the most recent test run. Past 5 or so (depending on test cycle frequency) test runs we decrease the influence to zero. More info on calculation here.
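To give a feel for how this hangs together, here is a minimal sketch of one way the calculation could be coded. It is not the actual maths behind QC: the linear age weighting, the 5 run window, reading “by 0.8” as a multiplier applied per missed run, scoring undelivered-only scope as zero, and all of the names (Requirement, requirement_confidence, quality_confidence) are my assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WINDOW = 5           # assumption: runs older than this carry no weight
MISSING_DECAY = 0.8  # assumption: multiplier applied per skipped run


@dataclass
class Requirement:
    name: str
    delivered: bool        # in scope for the current release?
    enough_coverage: bool  # the team's assertion, not a computed value
    # Pass rate (0.0 to 1.0) for each test run, oldest first.
    # None means this requirement's tests were not run that time.
    runs: List[Optional[float]] = field(default_factory=list)


def requirement_confidence(req: Requirement) -> float:
    """Age-weighted confidence for one requirement, between 0.0 and 1.0."""
    if not req.enough_coverage:
        return 0.0
    weighted = total_weight = 0.0
    last_known: Optional[float] = None
    decay = 1.0
    for index, result in enumerate(req.runs):
        age = len(req.runs) - 1 - index  # 0 is the most recent run
        weight = max(0.0, (WINDOW - age) / WINDOW)
        if result is None:
            # Interpolate from the last real result, trusting it less
            # for every consecutive run that was skipped.
            decay *= MISSING_DECAY
            value = 0.0 if last_known is None else last_known * decay
        else:
            last_known, decay, value = result, 1.0, result
        weighted += weight * value
        total_weight += weight
    return weighted / total_weight if total_weight else 0.0


def quality_confidence(requirements: List[Requirement]) -> float:
    """Simple (unweighted) average across delivered requirements."""
    in_scope = [r for r in requirements if r.delivered]
    if not in_scope:
        return 0.0  # assumption: nothing delivered, nothing to be confident in
    return sum(requirement_confidence(r) for r in in_scope) / len(in_scope)
```

With this sketch a requirement that has passed everything for the last five runs scores 1.0, a requirement without enough coverage scores 0.0 however green its tests are, and a failure five runs ago hurts much less than a failure in the latest run.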
So… how much coverage is enough?
Enough coverage for a requirement is an interesting question… For some it’s when they’ve covered enough lines of code, for others the cyclomatic complexity has an impact, or the number of paths through the requirements/scenarios/stories/use cases etc. For me, a requirement has enough test coverage when we feel we’ve covered the quality risks. I focus my automated testing on basic and normal flows and my human testing on the fringe cases. Either way, you need to make the decision of when enough is enough for your project.
To help calibrate this you can correlate the QC with the lag measure of escaped defects.
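One simple way to do that correlation, sketched here as an assumption rather than a prescribed method, is a plain Pearson coefficient between the QC recorded at each release and the number of defects that later escaped from it. The pearson function is written out by hand so it has no dependencies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, 2 or more points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a series with no variation has no correlation")
    return covariance / sqrt(spread_x * spread_y)
```

You would call it with one QC value and one escaped defect count per release, for example pearson(qc_per_release, escaped_per_release). If your “enough coverage” assertions are honest you would expect a clearly negative result (higher QC, fewer escapes); a value near zero suggests the threshold for “enough” needs revisiting.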
Measurement driven behaviour
The QC measure is quite intuitive and tends to be congruent with people’s gut feel of how the project/release is going, especially when shown over time, however there’s simply no substitute for going and talking to people. QC is a useful indicator but not something that you can use in favour of real communication and interaction.
The measurement driven behaviour for QC is interesting as you can only calculate it if you track requirements (of some form) related to tests and their results. You can push it up by adding more tests and testing more often 🙂 Or by asserting you have enough coverage when you don’t 😦 However correlation to escaped defects would highlight that false assertion.
If you’ve got a requirements stack ranging from some form of high level requirements with acceptance tests to low level stories and system tests you can implement the QC measure at each level and even use it as a quality gate prior to allowing entry to a team of teams release train.
Unfortunately, because me and my mate Ray came up with this in a pub there aren’t any tools (other than an Excel spreadsheet) that automatically calculate QC yet. If you’re a tool writer and would like to build one then please do, I’ll send you the maths!
I’m not a big fan of metrics, measures, charts, reporting and data collection. I’m not terribly impressed by dashboards with 20 little graphs on showing loads of detailed information. When I’m involved in projects I want to know 3 simple things:
- How quick are we doing stuff?
- Are we on track or not?
- Is the stuff good enough quality?
There can be some deep science behind the answers to those questions but at the surface that’s all I want to see.
Organisations need to know that teams are delivering quality products at the right pace to fit the business need. To achieve this goal teams need to be able to demonstrate that their product is of sufficient quality and that they can commit to delivering the required scope within the business time scales. If the project goal may not be achieved then the business or the team need to change something (such as scope, resources or time scales). This feedback mechanism and the open transparent communication of this knowledge is key to the success of agile delivery.
The goal of delivering quality products at the right pace can be measured in many complex ways; however, when designing the Project Forum agile at scale practice we looked at just 3 measures. In fact I should probably call them 2.5 measures as the throughput/release burnup can be considered mutually exclusive (depending on whether you’re continuous flow or iterative). The most important measure is people’s opinions when you go and talk to your team.
Note: in the measures section I often refer to “requirements” as a simple number, this could be a count, a normalised count, magnitude, points, etc. it doesn’t matter what’s used so long as it’s consistent.
What is it?
CLM 2011 is a suite of tools making up IBM Rational Team Concert, Rational Requirements Composer and Rational Quality Manager properly integrated for what is arguably the first time. In my opinion this release is the closest to my interpretation of the original Jazz vision that IBM have delivered yet. It’s not perfect but I like the direction it’s heading in.
It’s taken me a while to write this, mainly because I’ve been very busy but also because it seemed to take some deep magic to get all of this stuff installed and running properly. But I’ve done it a couple of times now so I thought I’d post some screenshots and comments. Of course if you look back on my posts of the original RTC review you’ll see that life is harder these days already. With RTC 1 and 2 you could just download a zip, extract, run a script and you were off and away. Installation isn’t too bad… it’s not great and simple but this is a big complex suite of tools so I don’t really expect it to be trivial.
Lifecycle Project Administration
To start with, the management of projects across these separate tools has been significantly improved by the introduction of Lifecycle Project Administration, which allows you to create an integrated cross tool project in one step and simply manage members and roles across them. This is a big step forward although there are still some problems in that it’s not easy to do these kinds of things at scale. For example if I want to add 3 people with a set of roles to 40 projects I’m going to be doing a lot of manual clicking. In fact generally it’s not so easy to deal with project areas at scale in the Jazz toolset and that hasn’t significantly improved yet, although Lifecycle Project Administration is an important step in that direction.
I’m not a big fan of the navigation between project areas (linked or unlinked) as the user needs to understand the architectural relationship between personal dashboards, Change and Configuration Management (which has the RTC bits like plans, work items, source control, builds etc.), Quality Management (which has the RQM bits like test plans, cases, scripts etc.) and Requirements Management (which has the RRC bits like diagrams, UI sketches, docs, requirements etc.) to navigate the stuff in their single project. I think it’s a mistake to build the UI navigation around the architecture, I would prefer to see a unified project interface and navigation structure with the extra products adding to the project mass like aspects on a foundation. As before this becomes even more of an issue when you scale up to hundreds of projects. Incidentally, aspect orientation is how we apply practices to process kernels while still providing a seamless user experience of composed practices.
So although the three products are closer together than ever before, sharing a UI and a common architecture they still are separate products and that’s clear in terms of navigation, UI differences and linking items between them. This is a shame for many reasons but one of the most important is that it’s still providing niche tools for separate roles, building walls between people in development teams and making cross-functional teams harder to create as individuals have to learn specific skills. These differences are not a strength, they make the whole game of software development harder.
To be fair though I’m yearning for an ideal perfect solution, CLM 2011 isn’t that idealised perfect solution but it’s a lot closer to it than anything else I’ve used!
Let’s start with IBM Rational Requirements Composer
RRC 3.0.1 is a totally different beast than RRC 1 (see my original 2008 review here) and shouldn’t be thought of in the same light as RRC1. The old version was eclipse only, badly integrated, dependent on IE and full of bugs – this version is entirely web based, deeply integrated into the Jazz platform and not dependent on IE! As for bugs, I don’t know yet but it’s actually usable which v1 wasn’t!
What it does:
- Tracks requirements of different types with different attributes and links
- Web based diagram editing (via browser plugin for Firefox or MSIE), although someone used Microsoft CAB technology inside the Firefox xpi so don’t expect to do any diagram editing from linux or a mac 😦
- Ability to produce use case diagrams, UI prototypes, UI storyboards
For me this is the first time RRC has lived up to some of my early hopes for it. I see this as a replacement for ReqPro, and indeed a replacement for DOORS in time.
Unfortunately I only use linux at home so I couldn’t take screenshots of the RRC web editor; even in a virtual machine running windows on an explicitly supported browser I can’t get the plugin to work. When I’ve used it in my office environment I’ve really quite liked it though, although it’s got quite a lot of basic usability problems. I’m looking forward to when it’s more mature.
There is also a dependency/traceability tree viewer but that’s a separate download at the moment.
Next Implement something in IBM Rational Team Concert
RTC hasn’t changed that much from a user perspective, it’s still a great tool for tracking work items, making atomic multi-file change sets in parallel with continuous integration and build management. I’ve been using RTC in anger for 3 years now and still like it and think it can help drive agile benefits in development teams. Granted, like any tool it’s got some issues: enterprise scalability problems and annoyances, the biggest of which is the lack of cross-project queries, which is practically unforgivable from my perspective. See this work item on jazz.net and demand cross-project queries from IBM!
With that said, working in Eclipse from workitems, fully using Jazz SCM and continuous build is awesome.
Then Test via IBM Rational Quality Manager
I’ll admit it, I’m not a cross-functional ideal. I’m a typical developer, I’m not a tester, nor am I a test manager so I’ll leave the finer points of testing theory to others. However I do have some things to say about RQM from a cohesive logical process perspective.
In this example I’ve created a Test Case based on the pizza requirement I entered in RRC (in exactly the same way that I created the implementation story). At this point frankly I’m a little disappointed because my Test Case has been created and is linked to the requirement (good) but it has no knowledge of the implementation story that I also created from RRC.
The Missing Link
For me this is the missing link. I realise it’s not as simple as there always being a related triangle of 1-1 associations between these different types of artifacts, but the link between the test and the development items should at least be suggested, as the odds are fairly strong that if a requirement is developed by a bit of code the test for the requirement is likely to need to test the aforementioned bit of code. Obviously I can manually create this link but that’s not the point.
In fact this is symptomatic of the fact that these CLM tools are still separate and then integrated. There is a separation between requirements in RRC, development plan items and tasks in RTC and test plan items and tests in RQM. I have to create work items/artifacts in each tool to represent these different things and link them all together. Which is not really what I want to do.
I don’t want to spend a lot of my time creating at least 2 items for every requirement in my product (1 dev story+tasks and 1 test case+scripts).
What I want to do is elaborate a set of requirements in a simplistic user friendly UI with nice diagrams (RRC can do this) then view that candidate backlog immediately in my planning tool and further refine it – not copy it (even in bulk) to RTC but to actually manipulate the same work items, albeit a different aspect of them, in RTC. I want testing to be an inherent part of development with quality management and assurance being just another aspect of the system. Every requirement should have quality dimensions and be a test case although I might want to add more of course.
I want to have requirements with dev tasks and tests hanging off them.
Basically I want to define something that needs to be done, plan it, do it and test it all as one high level thing. I might drop off loads of tasks, different types of tests, supporting sketches etc. but I want a holistic understanding of a development item with the ability to project different views to different types of stakeholders (customer requirements views, test professional views, management views, development views).
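To show what I mean by one item with different views projected from it, here is a minimal sketch of the shape I have in mind. It has nothing to do with how Jazz actually stores things; every class, field and view name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    summary: str
    done: bool = False


@dataclass
class AcceptanceTest:
    name: str
    passed: bool = False


@dataclass
class DevelopmentItem:
    """One thing to be done, with its tasks and tests hanging off it."""
    title: str
    description: str
    tasks: List[Task] = field(default_factory=list)
    tests: List[AcceptanceTest] = field(default_factory=list)

    def customer_view(self):
        # Customers care about what it is and whether it is accepted.
        accepted = bool(self.tests) and all(t.passed for t in self.tests)
        return {"title": self.title,
                "description": self.description,
                "accepted": accepted}

    def development_view(self):
        # Developers care about the work still outstanding.
        return {"title": self.title,
                "open_tasks": [t.summary for t in self.tasks if not t.done]}

    def test_view(self):
        # Testers care about what has not passed yet.
        return {"title": self.title,
                "not_passing": [t.name for t in self.tests if not t.passed]}
```

The point of the sketch is that there is only ever one DevelopmentItem per requirement: the customer, development and test views are projections of the same object rather than three separately created and manually linked artifacts.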
CLM2011 is a serious step in the right direction from IBM. In my opinion Jazz always has been but it’s been suffering from too many silly little problems and a lack of meaningful deep integration (ironic considering the original mission). If I had a magic wand I would do the following feats of greatness to the Jazz platform which I believe are all necessary to turn what is a good suite of tools into a true killer software development environment.
- all jazz tools to apply new aspects to projects rather than creating separate project areas which then need linking and administration via LPA
- all artifacts and workitems to be viewable as native items in any tool interface
- all artifacts and workitems created by all three tools (and other future ones) be instantiated and linked with truly consistent UI, all taggable and commentable
- all artifacts, scm/build elements and work items queryable via a single query engine (across linked and un-linked project areas)
- the ability to self serve projects (without granting massive impractical rights), finer grained security and permissions control
- parent-child change sets
- smoother flow of work items between tool aspects rather than creating new ones with links between them
- make RRC diagram editing work on linux – things that only work on Windows are not enterprise deployable, and if they’re browser based and only work on windows someone needs shouting at. Even a much maligned school boy shouldn’t be making this error
- a decent reporting solution and technology
Finally, being able to capture and elaborate requirements, plan iteratively, develop code, continuously integrate with unit tests, system test and manage test plans (amongst other things!) in what feels close to a single tool is extremely powerful. If only navigation wasn’t architecturally focused this strength would be much stronger and be a killer feature for IBM and Jazz.
If they sort out cross-project querying.