By Brad Cross - 7 August 2008
The previous article in this series introduced the concept of a technical balance sheet. This isn't just an interesting metaphor: it is relatively easy to construct a technical balance sheet for any project.
Discovering Technical Liabilities
Recall the basic accounting formula: assets - liabilities = equity. A technical liability is code that incurs a higher relative cost of carry. This can mean sloppy code, untested code, a quick-and-dirty temporary solution, etc. We measure technical liabilities by selecting metrics that make sense for a particular project's concerns.
The key words are "that make sense." Metrics pragmatically selected by experienced practitioners tend to coincide with intuition about the quality and flexibility of code. Metrics selected and applied naively can lead to an utterly useless exercise in math-turbation.
Metrics have limited use. They can tell us a lot about technical liabilities, but they don't reveal much about our assets. It is important to recognize the difference, because managing to increase “positive” metrics can lead to trouble. The goal is not to manage toward “positive” metrics so much as to manage away from “negative” metrics. In financial terms, think about metrics from the standpoint of minimizing risk, not maximizing return.
Consider test coverage. 0% test coverage is a liability, but 100% test coverage is not necessarily an asset: it's possible to have 100% test coverage over a binary that crashes when it's deployed. Viewing test coverage as an asset and managing toward maximum test coverage can result in asinine practices that distract attention from creating business assets. Software has asset value only in so far as it is useful. The objective of any development team isn't to achieve high test coverage, it is to maximize equity by producing business assets without degrading return through technical liabilities.
Not only are metrics limited to assessing technical liabilities, we must also keep in mind that not every negative indicator reported by our metrics is in fact a technical liability. The extent to which a particular metric exposes a true liability is a function of the risk that it will undermine the business asset we’re delivering. Some risks are benign: portions of the system which have little business impact or are to be retired soon don’t require urgent mitigation. Others are malignant: core system components with opaque and high-maintenance structures will be of very high priority. We must keep this in mind when discovering our liabilities.
There are many tools for generating metrics (most of them open source) that we can use to discover our liabilities. But tools are only valuable if they generate actionable metrics: they must draw attention to specific things in our code that appear to be problems. Fancy charts tell us less than raw scores accompanied by a list of top offenders.
The abundance of available analysis tools requires that we use them judiciously. We must select only a handful of metrics that are directly relevant to our objectives, and define thresholds, objectives, and tolerances before we apply them.
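As a minimal sketch of what "define thresholds before applying metrics" can look like in practice, here is a hypothetical example (metric names, limits, and measured values are all made up for illustration; a real project would feed in numbers from its own tooling):

```python
# Illustrative sketch: declare thresholds up front, then flag only the
# metrics that breach them. All names and numbers are hypothetical.
THRESHOLDS = {
    "test_coverage_pct": ("min", 60),   # below this minimum is a liability
    "duplication_pct":   ("max", 10),   # above this maximum is a liability
    "avg_complexity":    ("max", 15),
}

def breaches(measured: dict) -> list:
    """Return (metric, value, limit) for each threshold violation."""
    out = []
    for metric, (kind, limit) in THRESHOLDS.items():
        value = measured.get(metric)
        if value is None:
            continue  # metric not collected for this code base
        if (kind == "min" and value < limit) or (kind == "max" and value > limit):
            out.append((metric, value, limit))
    return out

measured = {"test_coverage_pct": 1, "duplication_pct": 80, "avg_complexity": 12}
for metric, value, limit in breaches(measured):
    print(f"{metric}: {value} (limit {limit})")
```

The point of the up-front threshold table is that it records our tolerances as a decision made before the numbers come in, rather than a rationalization made after.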
Things like test coverage (Emma, NCover) and duplication analysis (Simian, CPD) are well established and don’t require extensive explanation. Static and dynamic analysis are far more sophisticated. Before attempting to put metrics on these, we must have a clear understanding of what “code quality” means in terms of flexibility, extensibility, testability and any other “-ilities” that strike our fancy. Some tools attempt to find potential bugs and programming errors (FindBugs for Java, FxCop for .NET, Valgrind with its Helgrind and Cachegrind tools), while others can be configured to find crufty code under different definitions of cruftiness (NDepend and JDepend, Checkstyle, PMD, PC-lint, etc.). Still others, such as TestabilityExplorer for Java, measure the testability of code. With these tools we can measure testability, depth of inheritance, cyclomatic complexity, coupling, dependency graphs, lines of code per class/method, and so forth.
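To make one of these measurements concrete, here is a toy sketch of what tools in this family compute for cyclomatic complexity: one plus the number of independent branch points. It is written in Python purely for illustration (the tools named above target Java and .NET); it is a simplification, not a substitute for a real analyzer:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Rough cyclomatic complexity: 1 + branch points in the source."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        # Each if/for/while/except introduces an alternative path.
        if isinstance(node, (ast.If, ast.For, ast.While, ast.ExceptHandler)):
            complexity += 1
        # Each and/or in a boolean expression adds a short-circuit path.
        elif isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1
    return complexity

snippet = """
def classify(x):
    if x > 0 and x < 10:
        return "small"
    elif x >= 10:
        return "large"
    return "non-positive"
"""
print(cyclomatic_complexity(snippet))  # 1 + if + elif + `and` = 4
```

Scores like this become actionable when reported per method with the top offenders listed first, as argued above.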
Non-functional requirements such as load, scalability, performance, and so forth should be given equal footing with functional requirements. For example, a large scale performance intensive system needs to treat performance bottlenecks as technical liabilities. To identify and isolate these, create tests that simulate load. Similarly, projects with user facing web front-ends will place a very high priority on maintaining consistent user interface style. Treat inconsistent style as a liability: write tests to make sure that all user interfaces gather style settings (colors, etc.) from the same properties files. Finally, bear in mind that other developers are users of the services we create. Design APIs with the same care, and write tests from the perspective of the user of the API.
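A style-consistency test of the kind described above might look like the following sketch. Everything here is hypothetical (the shared properties file path, the file layout, and the rule that only that file may contain literal colors); the point is that an inconsistency becomes a failing test rather than a matter of opinion:

```python
import re

# Assumed convention for this sketch: all colors live in one shared
# properties file, and no other UI source file may hard-code a hex color.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}\b")
SHARED_STYLE_FILE = "ui/style.properties"

def hardcoded_colors(source_files: dict) -> dict:
    """Map of path -> contents; return the files that hard-code colors."""
    return {path: HEX_COLOR.findall(text)
            for path, text in source_files.items()
            if path != SHARED_STYLE_FILE and HEX_COLOR.search(text)}

files = {
    "ui/style.properties": "header.color=#336699\n",
    "ui/header.py": "color = load_style('header.color')\n",
    "ui/footer.py": "color = '#ff0000'  # hard-coded: a style liability\n",
}
print(hardcoded_colors(files))  # flags ui/footer.py
```

A test suite would assert that this map is empty over the real source tree, turning the style liability into something the build can refuse to carry.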
Statement of Technical Liability
Once we have a collection of metrics we can bring them into a summary statement that frames our liabilities without overloading us with data.
Here is an example statement of liabilities for a C# project in the investments field.
The first thing that should stand out is that we are not summarizing across the project, only observing at the level of functional area. Put another way, we are interested in our liabilities in the specific areas where the business derives asset value. This is because a technical balance sheet is an analysis of how the technical characteristics of an asset might be undermining its value. This is easier to do if the top level directory structure is close to a 1-to-1 mapping with functional areas. The more convoluted the structure of the code relative to the structure of the functionality, the harder it becomes to map each technical liability to the asset value of the functionality it undermines.
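One way to produce such a per-area statement, assuming the top-level directory does map to functional areas, is to roll per-file metrics up by directory. This is a hypothetical sketch (the paths and numbers are invented, not taken from the example project):

```python
from collections import defaultdict

# Invented per-file metrics; a real statement would be fed by the
# coverage and analysis tools discussed earlier.
file_metrics = [
    ("fix/engine.cs",     {"loc": 4000, "coverage_pct": 1}),
    ("fix/wrapper.cs",    {"loc": 3500, "coverage_pct": 1}),
    ("pricing/curves.cs", {"loc": 1200, "coverage_pct": 85}),
]

def by_functional_area(metrics):
    """Roll file-level metrics up to the top-level directory (= area)."""
    areas = defaultdict(lambda: {"loc": 0, "covered_loc": 0.0})
    for path, m in metrics:
        area = path.split("/", 1)[0]
        areas[area]["loc"] += m["loc"]
        areas[area]["covered_loc"] += m["loc"] * m["coverage_pct"] / 100
    return {area: {"loc": a["loc"],
                   "coverage_pct": round(100 * a["covered_loc"] / a["loc"], 1)}
            for area, a in areas.items()}

for area, summary in sorted(by_functional_area(file_metrics).items()):
    print(area, summary)
```

Each row of the resulting summary corresponds to a line on the statement of liabilities: an area of business functionality, with its size and its exposure side by side.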
Take a look at the FIX project. FIX is a protocol for financial messaging. This library appears to be an ill-conceived wrapper around the QuickFIX open source FIX engine. It accounts for half the code base, has 1% test coverage, and 80% of its code is duplicated. Seems like it should be a top priority, right? It isn't. The reason is that it carries benign risk: the code is used in a few places in a straightforward way, and there are tests around those usages, so its impact is limited. While it would be nice to be rid of the entire library, the risk has already been mitigated and there are costs associated with removing the library altogether, so there are better uses of time right now.
Servicing Technical Debt
Servicing technical debt is not conceptually different from servicing financial debt. If we are executing well on our business model, but our equity goes negative and we can no longer service our debt, we are bankrupt. The same holds true for execution risk: if technical liabilities get out of hand, the cost of carry on the technical debt will bankrupt the project despite an otherwise healthy operational model. We can get visibility into this if we place statements of our technical liabilities and risks side-by-side with our project management reporting ordered by business function. This gives us a more complete assessment of our overall execution risk, and lets us know if we’re inflating our assets.
Now that we have a statement of liabilities, our next step will be to examine the asset side of the technical balance sheet. From there, we can create a balance sheet that will allow us to make trade off decisions, engineer turnarounds, or declare a write-off.
About Brad Cross: Brad is a programmer.