
Thursday, April 23, 2009

Cross - Bridging the Gap

By Brad Cross - 23 April 2009

From the feedback I've received on my technical balance sheet series, I've identified two gaps in how people understand the concept. The first is the perception that this approach requires somebody who is simultaneously knowledgeable in finance and software. The second is that it is unclear how to adopt some of these ideas with a very small initial investment.

The technical balance sheet ideas are simple and cheap to try out. They can help you make cross-disciplinary trade-offs about software, finance, and operations. This can involve technical people who know almost nothing about finance, finance people who know almost nothing about technical work, and business operations and project management people who may not be strong in either finance or software.

So, bridging the first gap is easy: you don't need to have a double PhD in finance and computer science in order to understand these ideas. Building software costs money, going slower costs more money, and technical debt makes you go slower. If you have a lot of assets, but those assets are over-leveraged with debt, you can end up cash flow negative with negative assets. On the other hand, if you have no debt and no assets, you also have nothing. So the other side of the equation is to build software that has high asset value. Focus on the aspects of your systems that have the highest return on investment, and do so without borrowing through shortcuts and sloppiness. The technical balance sheet is just a way to give you a mental model for thinking about the trade-offs.

This leads into the second point: this is cheap to adopt. You don't have to spend a lot of time and money to try it out. The first article on gathering metrics for technical liabilities may have been a bit intimidating, because it was not clear how to create a quick balance sheet for a project without a significant up-front investment. Hopefully that was cleared up in the explanation of the approach in the field guide article.

You can quickly assemble a prototype balance sheet with just a few metrics that are really simple to round up. I did this on one project in about an hour by harvesting the metrics that were already available via the coverage bundle and PMD, both of which were already running in their build.

Bootstrapping a balance sheet is not about spending a lot of time putting together a bunch of overly complex tables and metrics. You can do a few simple exercises, look at the numbers, and see if it helps you think about your trade-offs and plan of action.
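As a sketch of what such a bootstrapped prototype might look like, a few lines of Python are enough to fold harvested metrics into a readable table. The component names and figures here are invented for illustration, not taken from any real build:

```python
# A throwaway balance-sheet prototype: fold per-component numbers you can
# already harvest (test coverage, static-analysis warnings, duplication)
# into one table. All component names and figures below are hypothetical.
metrics = [
    # (component, coverage %, warnings, duplication %)
    ("Brokers", 1, 139, 8),
    ("Data", 31, 52, 20),
    ("Trading", 50, 54, 16),
]

def render(rows):
    """Format the raw metrics as an aligned plain-text table."""
    lines = [f"{'Component':<10} {'Cov%':>5} {'Warn':>5} {'Dup%':>5}"]
    for name, cov, warn, dup in rows:
        lines.append(f"{name:<10} {cov:>5} {warn:>5} {dup:>5}")
    return "\n".join(lines)

print(render(metrics))
```

From here you can eyeball the numbers and decide whether the exercise is worth taking further.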

As an example, on a project I was working on last year, I saw that we spent 40-50% of story points on a couple of areas of plumbing. After some investigation, it turned out that there was some heavyweight architecture and design in place that I was able to eliminate fairly quickly. On top of that, a lot of the code could be replaced by open source components. I also saw that the components most valuable to the business were in terrible shape (bad design, lots of bugs, low test coverage). Right away, I could see that we were over-investing in maintaining plumbing and under-investing in the parts that generate the real cash flows.

This scenario is common: an over-investment in low quality infrastructure coupled with an under-investment in the parts of the system that support the business in generating actual cash flows. The solution is to figure out how to reduce your ongoing investment in plumbing while simultaneously increasing your investment in the cash flow generating parts of your systems. With minimal effort, the technical balance sheet can expose where these over- and under-investments are. It communicates in terms everybody can understand (e.g., we get value from this area of the code, we do not get value from that area). Applying a technical balance sheet to your project can make it clear to each member of the team where their attention should not be, as well as where it needs to be, to maximize the business impact of your project.




About Brad Cross: Brad is a programmer.

Wednesday, April 1, 2009

Cross - A Field Guide for Applying Technical Financial Statements, Part II

By Brad Cross - 1 April 2009

This article is the second in a series on putting the technical balance sheet to work. If you haven't done so already, you will want to read the first in the series.

Taking Action to Increase Equity

So, we've bootstrapped our balance sheet. Now, how do we get some of these ideas into our decision-making process? If we notice something that is harming our owner's equity, how do we actually put that knowledge into practice?

It is critical to have code metrics that are actionable. A lot of tools will show you fancy charts, tables and diagrams, but few of these visual representations are actionable from a technical perspective. You need to be able to identify prioritized lists of actions. For example, Simian or CPD will sort the list of code duplication by worst offenders. Clearly, your first action is to tackle the worst offenders. If you see that your highest bug counts and lowest test coverage are in one of your most valuable components, then working on the robustness of that component is a clear action item. Often you can find a few monolithic classes where many of the issues occur; refactoring these while bringing them under test can be a simple way to achieve a radical change in your equity for that component.
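A hedged sketch of what "actionable" means in practice: sort by worst offender, and flag high-value components with weak robustness. The components, scores, and ranking keys below are hypothetical, not the output of any real tool:

```python
# Turning metrics into prioritized action lists. 'value' is the 1-4
# asset score used throughout this series; the other numbers are
# invented for illustration.
components = {
    "FIX":     {"value": 1, "coverage": 1,  "bugs": 27,  "dup_pct": 81},
    "Trading": {"value": 3, "coverage": 50, "bugs": 54,  "dup_pct": 16},
    "Brokers": {"value": 2, "coverage": 1,  "bugs": 139, "dup_pct": 8},
}

# Worst duplication first: the obvious first refactoring targets.
by_duplication = sorted(
    components, key=lambda c: components[c]["dup_pct"], reverse=True
)

# High-value components with high bug counts and low coverage: clear
# robustness action items, ranked by (value, bugs, low coverage).
robustness = sorted(
    components,
    key=lambda c: (components[c]["value"], components[c]["bugs"],
                   -components[c]["coverage"]),
    reverse=True,
)

print(by_duplication[0], robustness[0])
```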

Once you have defined a list of actions prioritized by impact on equity in your most valuable components, you are ready to start increasing your equity. There are other practical technical considerations, however. As we discussed in the article on cost of carry vs. cost of switching, you should be mindful of the impact of encapsulation and the dependencies among components. Start at the least dependent, but most depended upon, parts of the system: the leaf nodes of the dependency graph.

There are a couple of ways to tackle your list of actions. One is through big-bang refactoring or "clean up" efforts, and the other is through a more incremental "as you touch it" approach.

I typically prefer the incremental approach. I continue working as normal, and make the assumption that when I do a new piece of work in a target area, I am going to invest more time because I will be implementing actions from my prioritized list for increasing equity.

This incremental approach works very nicely most of the time and avoids big-bang "let's stop and refactor the world" efforts, which tend to suspend new development and rarely seem to work out well. That said, there are circumstances when it is appropriate to invest in testing and other infrastructure, because these investments can make your incremental efforts more effective. You will often run into structural issues that cause trouble. For example, you may have some monolithic piece of code in the system that everything is tightly coupled to. If this is holding up incremental progress, breaking this code apart in order to restructure the dependencies can be a sound investment. At the moment, I don't have any way to quantify this - you just need to have some people around who have significant experience with large systems and restructuring efforts, a pragmatic view, and a good instinct for when this sort of thing is required.

Refactor, Rewrite or Replace?

In the article on the cost of carry vs. cost of switching, we made a passing mention of migration strategy.

Cost of switching is the cost associated with your mitigation strategy. When indebted code is an impediment to sustaining and evolving the functionality of your project, your mitigation strategy is how you go about transitioning away from indebted code in order to reduce or eliminate the impediments. This can entail completely replacing a component with another component, partially replacing a component, or simply small incremental refactorings that accumulate over time. Switching costs are impacted by the size and scope of the component you are replacing, the time you have to spend to find and evaluate commercial and open source alternatives, time spent on custom modifications to the replacement, and time spent on migrating infrastructure - for instance, migrating or mapping user data after switching to a different persistence technology.

Let's look at migration strategy in more concrete detail.

When we talk about migration strategy, it is usually related to refactoring, rewriting, or replacing.

First, your system needs to be structured in such a way that you can make trade-offs at the component level and not just system-wide. I discussed this at the very beginning of this series. It doesn't make sense to look at metrics and valuations at the component level unless you can actually make trade-offs at that level.

Components with a low asset value (easy to substitute) and high liabilities are good candidates for replacing or rewriting. Often you can combine replacing and rewriting by finding a way to write thin custom layers around open source components that can replace these parts of the system. This typically requires a bit of refactoring as well, in order to allow other components to use the new component, so you often end up using a bit of each approach.

Components with a high asset value (hard to substitute) and high liabilities are candidates for refactoring. Sometimes you can pull parts of these components out into new components that can be replaced or rewritten, but often this won't get you far. Typically these high-value components contain the core domain logic. The logic is often more sophisticated, and if you introduce new bugs while refactoring, they may be difficult to track down, as they can result in subtle incorrectness rather than blatant crashes and exceptions. Careful, incremental refactoring is usually the way to go.
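The decision rule in the last two paragraphs can be sketched roughly as follows. The numeric thresholds are assumptions layered on the 1-to-4 scales used elsewhere in this series, not a prescription:

```python
# A sketch of the migration-strategy rule: low-asset/high-liability
# components get replaced or rewritten; high-asset/high-liability
# components get refactored incrementally. Thresholds are assumptions.
def migration_strategy(asset_value, liabilities,
                       high_asset=3, high_liability=3):
    if liabilities < high_liability:
        return "leave alone"             # debt is manageable for now
    if asset_value >= high_asset:
        return "refactor incrementally"  # core domain logic, hard to substitute
    return "replace or rewrite"          # easy to substitute, heavily indebted

assert migration_strategy(1, 4) == "replace or rewrite"     # e.g. a FIX layer
assert migration_strategy(4, 3) == "refactor incrementally" # core logic
assert migration_strategy(3, 1) == "leave alone"
```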

Finally, sometimes there are good reasons to rewrite an entire application stack. This might be appropriate, for example, if you are switching to an entirely new runtime and technology stack. Sometimes this is called re-platforming. Don't rewrite in the same language and technology stack just because you have written a crappy code base that is too fragile to change anymore. When you abandon ship in that case, you lose the value of all the lessons you would have learned from refactoring it. Sometimes, the best approach is to refactor your way to a rewrite.

Bridging the Gap

In the next piece, we will bridge the gap between finance and engineering. A big part of bridging this gap is noticing how each side tends to gravitate toward the asset and liability side of the balance sheet, respectively. This is where the technical balance sheet shines: allowing you to reduce ongoing investment into the plumbing aspects of your system and increase investment into your "special sauce."





Wednesday, March 25, 2009

Cross - A Field Guide for Applying Technical Financial Statements, Part I

By Brad Cross - 25 March 2009

So far, we have discussed a number of ways of applying financial models to software projects. We explored liabilities as an estimate of a project's technical debt. We explored assets as an estimate of the business value of software components. We explored techniques for computing equity as a function of assets and liabilities. We explored cost of carry and cost of switching as proxies for cash flows. We concluded by exploring techniques for framing tradeoff decisions based on equity and cash flow analysis.

The purpose of weighing assets against liabilities and cost of carry against cost of switching is to manage components of software projects like financial portfolios - assessing each component on its risk vs. return. The objective is to have a functional model for framing technical and business decisions about software. These models are not prescriptive. The purpose of the series so far has been to introduce the technical, financial, and process concepts.

Now that we understand these concepts, how do we put this into practice?

First of all, if your project is a complete mess, then collecting a bunch of metrics and rigorously applying financial concepts may not be appropriate at all. If you have only a handful of tests, then there really isn't much point in a careful analysis of test coverage and the distinction between unit test coverage and system test coverage. If your system is in really bad shape, your problems are often painfully obvious. That said, when it seems difficult to know where to start, the balance sheet model can help you pinpoint the worst problems and target the components that need the most attention.

I have applied this balance sheet approach in a number of different ways on my projects. I’ve used it to decide whether I should rewrite, refactor or replace un-maintainable components that carried too much technical debt. I’ve used it to inform estimates of effort: if an enhancement required modifying a component with high technical liabilities, I knew it was probably going to take longer. I’ve used it to prioritize work to maximize owner's equity by increasing priority of work in high-asset components and de-prioritizing or seeking open-source replacements for low-asset components.

I'm sure there are other ways to apply this model to software projects. For instance, if you have many different projects, programs, or business areas, you could apply this approach at different levels of granularity in the business. You could also use this way of thinking to frame decisions about green-field projects, even at startups. In fact, a classic startup mistake is to pile on too much technical debt in an effort to go faster, which results in going slower and ultimately leads to hitting a brick wall.

Step 1: Choose the metrics that define your debt

I recommend sticking to a pragmatic, lightweight spirit with this technical balance sheet approach. Use what you can get on the cheap - without much time investment. On one project, we already had access to Clover, so we were able to mine its metrics "for free". We used these metrics to build the initial prototype of our balance sheet in an hour.

On my last project, we built a mind map (see below) of what we considered to be interesting metrics. We assembled the entire team to discuss what we thought were the most important aspects of the technical quality of the system (i.e. the most costly liabilities) and then identified which liabilities were measurable with current open source tools. It is interesting to note here that some aspects I typically evaluate are missing - such as code duplication. The team never mentioned this as one of the top concerns during this session.

Whatever metrics you choose, it is critical that they are actionable. A lot of tools will show you fancy charts, tables and diagrams, but few of these visual representations are actionable. You need to be able to identify prioritized lists of actions. For example, Simian or CPD will sort the list of code duplication by worst offenders; your obvious first action is to tackle the worst offenders. If you see that your highest bug counts and lowest test coverage are in one of your most valuable components, then working on the robustness of that component is a clear action item. Often you can find a few monolithic classes where many of the issues occur; refactoring these while bringing them under test can be a simple way to achieve a radical change in your equity for that component.

Step 2: Compute your financials by functional area

Here's a recap of our journey through the examples used in our discussions:

In the article on measuring technical liabilities, we constructed a table using a few simple and common metrics as indicators of technical indebtedness.

Functional Area | FX Cop | Coverage | Lines | Duplication | % Duplication
--------------- | ------ | -------- | ----- | ----------- | -------------
Brokers         | 139    | 1%       | 2984  | 234         | 8%
Data            | 52     | 31%      | 1450  | 297         | 20%
DataProviders   | 59     | 1%       | 1210  | 78          | 6%
DataServer      | 27     | 48%      | 1489  | 40          | 3%
Execution       | 7      | 48%      | 618   | 0           | 0%
FIX             | 27     | 1%       | 48484 | 39337       | 81%
Instruments     | 133    | 55%      | 12896 | 714         | 6%
Mathematics     | 77     | 56%      | 2551  | 205         | 8%
Optimization    | 25     | 60%      | 305   | 26          | 9%
Performance     | 2      | 73%      | 134   | 0           | 0%
Providers       | 36     | 47%      | 707   | 42          | 6%
Simulation      | 20     | 77%      | 241   | 0           | 0%
Trading         | 54     | 50%      | 2955  | 472         | 16%
TradingLibrary  | 66     | 29%      | 7035  | 1674        | 24%

Next, in the article on measuring assets and intangibles, we talked about using substitutability as a proxy for asset valuation in order to consolidate a basket of concrete metrics into an abstract relative metric.

Module        | Substitutability | Module       | Substitutability
------------- | ---------------- | ------------ | ----------------
Brokers       | 2                | Mathematics  | 3
Data          | 2                | Optimization | 3
DataProviders | 2                | Performance  | 3
DataServer    | 2                | Providers    | 1
Execution     | 3                | Simulation   | 3
FIX           | 1                | Trading      | 3
Instruments   | 3                | TradingLogic | 4

In the piece on technical equity, we showed how to transform our table of metrics for technical liabilities into a number that can reasonably be compared with the asset value of each component, in order to derive a rough figure for technical equity and for how leveraged each component is.

Component     | Assets | Liabilities | Equity | Leverage
------------- | ------ | ----------- | ------ | --------
Brokers       | 2      | 3           | -1     | Infinity
Data          | 2      | 3           | -1     | Infinity
DataProviders | 2      | 3           | -1     | Infinity
DataServer    | 2      | 2           | 0      | Infinity
Execution     | 3      | 2           | 1      | 3
FIX           | 1      | 4           | -3     | Infinity
Instruments   | 3      | 3           | 0      | Infinity
Mathematics   | 3      | 1           | 2      | 3/2
Optimization  | 3      | 1           | 2      | 3/2
Performance   | 3      | 1           | 2      | 3/2
Providers     | 1      | 1           | 0      | Infinity
Simulation    | 3      | 1           | 2      | 3/2
Trading       | 3      | 3           | 0      | Infinity
TradingLogic  | 4      | 3           | 1      | 4

Finally, in the article on cost of carry vs. cost of switching, we discussed thinking about tradeoff decisions in terms of cash flows when considering paying down more or less principal.

  • When you take on technical liabilities, you incur cash flow penalties in the form of ongoing interest payments, i.e. going slow.
  • When you pay down principal on your technical liabilities, you incur cash flow penalties in the form of payments against principal. However, these cash flow penalties are of a different nature: they come in the form of paying down principal in the short term (going slower right now) in exchange for paying less interest (going faster as the principal is paid down).
  • The notion of going faster or slower shows the connection between cash flows and time. The cost of ongoing interest payments is an incremental reduction in speed, whereas the cost of payments against principal is an investment of time in exchange for an increase in speed. Restated, there is a trade off between cash flow penalties now (paying the cost of switching) for decreased cash flow penalties in the future (reducing the cost of carry).

Based on my experience building software, I do not think that the relationship between cash flows, time, and speed is well understood. Much of the problem stems from confusion between the short- and long-term impact on cash flows of making certain tradeoffs. People cut corners in the name of short-term speed. Often, this corner-cutting has the reverse of the intended effect, and can even destroy the chances of delivering. I have seen this thinking lead to the destruction of entire projects within one to two quarters.

Almost everyone will agree that a decade is long term and that taking on a lot of technical debt can be a significant risk to longevity. Fewer will agree that a year or more is long term. Very few will agree that a quarter is long term. Nevertheless, the more projects I work on, the shorter my definition of "long term" becomes with respect to technical debt. If you really look at the cash flow trade-offs that result in the relationship between time, speed, and technical debt, and you consider the compounding effect of negative cash flows that result from the debt, it becomes much less attractive to mindlessly acquire technical debt in the name of speed. It often results in going slower, even across the time horizon of a quarter or less.
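A toy model can make the compounding argument concrete. Everything here is illustrative: the capacity, interest rate, and debt figures are invented to show the shape of the trade-off, not measured from any project:

```python
# Toy model: carrying debt taxes every iteration (interest drag), while
# paying down principal costs capacity now in exchange for speed later.
# Capacity, rates, and debt levels are illustrative assumptions.
def delivered(iterations, debt, interest_rate, paydown=0.0, capacity=10.0):
    total = 0.0
    for _ in range(iterations):
        available = capacity * (1 - interest_rate * debt)  # interest drag
        principal_payment = min(available, paydown, debt)
        total += available - principal_payment             # features shipped
        debt -= principal_payment                          # principal paid down
    return total

carry_only = delivered(8, debt=4, interest_rate=0.1)              # never pay down
pay_it_down = delivered(8, debt=4, interest_rate=0.1, paydown=2)  # pay early

print(carry_only, pay_it_down)
```

Even over this short horizon, the paydown scenario ships more, which is the point of the argument above: the break-even can arrive within a quarter.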

So now we have some crude numbers, and we understand how to think about cash flow trade-offs. In part II, we'll present how we formulate and execute a plan to increase technical equity.





Thursday, December 4, 2008

Cross - Cost of Carry versus Cost of Switching

By Brad Cross - 4 December 2008

In the technical balance sheet, the cost of carry and cost of switching are proxies for cash flow.

When you take on technical liabilities, you incur cash flow penalties in the form of ongoing interest payments, i.e. going slow.

When you pay down principal on your technical liabilities, you incur cash flow penalties in the form of payments against principal. However, these cash flow penalties are of a different nature: they come in the form of paying down principal in the short term (going slower right now) in exchange for paying less interest (going faster as the principal is paid down).

The notion of going faster or slower shows the connection between cash flows and time. The cost of ongoing interest payments is an incremental reduction in speed, whereas the cost of payments against principal is an investment of time in exchange for an increase in speed. Restated, there is a trade off between cash flow penalties now (paying the cost of switching) for decreased cash flow penalties in the future (reducing the cost of carry).

Before we look further at these terms, we need to first introduce another concept: volatility. Volatility is the extent of change of a component, including maintenance and new development. The more work there is to do in an area of the code that has high technical liability, the greater the cost to complete that work. You can determine the volatility of a component by looking at previous project trackers, defects logs, the raw size of the component, source control history, and backlog work items.
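One rough way to pull volatility out of source control history, assuming a git repository and that top-level directories correspond to components (a simplification that will not hold everywhere):

```python
# Estimate component volatility by counting how many changed files in
# the history fall under each top-level component directory. The
# directory-to-component mapping is a simplifying assumption.
import subprocess
from collections import Counter

def component_touches(changed_paths):
    """Tally changed file paths by their top-level directory."""
    touches = Counter()
    for path in changed_paths:
        if "/" in path:
            touches[path.split("/", 1)[0]] += 1
    return touches

def volatility(repo_path="."):
    """Changed-file counts per component across the whole git history."""
    log = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    ).stdout
    return component_touches(log.splitlines())
```

The same tally could be fed from defect logs or tracker exports instead of git history.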

Cost of carry is the sustained cost of servicing your liabilities. It represents the decrease in speed and corresponding increase in time that it takes for developers to complete tasks due to technical debt. Volatility is a major component of cost of carry: the liabilities in a high volatility component are the liabilities with the highest cost of carry across your application.

Within a component, your cost of carry is the time spent on production issues, maintenance issues, adding new code to change or extend the system, and testing and debugging. The cost of carry for an indebted component also includes costs from its effects on dependent components. In other words, some dependent components are more difficult to productionize, maintain, change, extend, test and debug because of their dependency on an indebted component.

Cost of switching is the cost associated with your mitigation strategy. When indebted code is an impediment to sustaining and evolving the functionality of your project, your mitigation strategy is how you go about transitioning away from indebted code in order to reduce or eliminate the impediments. This can entail completely replacing a component with another component, partially replacing a component, or simply small incremental refactorings that accumulate over time. Switching costs are impacted by the size and scope of the component you are replacing, the time you have to spend to find and evaluate commercial and open source alternatives, time spent on custom modifications to the replacement, and time spent on migrating infrastructure - for instance, migrating or mapping user data after switching to a different persistence technology.

The effect of dependencies and encapsulation

Both switching costs and cost of carry are driven by encapsulation and dependencies: the poorer the encapsulation and the denser the dependencies, the higher the costs.

If a technical liability is poorly encapsulated, other components that depend on the indebted component also incur higher switching and carry costs. This is an extremely important concept to understand, and one of the reasons why encapsulation is one of the most important ideas in the history of computing. (If you do not understand or track the dependence structure of your project, you can use a tool such as jDepend or nDepend.)

It is also important to understand that the impact of dependencies on switching costs and cost of carry can be transitive. This means that poor encapsulation of one high-liability component can have profound impact across not only its direct dependents, but also transitive dependencies that are several steps from the indebted component in the dependency graph.
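The transitive effect can be made concrete with a small graph traversal: everything that directly or indirectly depends on an indebted component shares in its carry cost. The dependency graph below is hypothetical:

```python
# Find every component that transitively depends on an indebted one.
# The dependency graph here is invented for illustration.
deps = {  # component -> components it depends on
    "Trading": ["Execution", "Instruments"],
    "Execution": ["FIX"],
    "Simulation": ["Trading"],
    "Instruments": [],
    "FIX": [],
}

def transitive_dependents(target, deps):
    """All components that reach `target` through the dependency graph."""
    dependents = set()
    changed = True
    while changed:
        changed = False
        for comp, uses in deps.items():
            if comp not in dependents and (
                target in uses or dependents & set(uses)
            ):
                dependents.add(comp)
                changed = True
    return dependents

print(transitive_dependents("FIX", deps))
```

Here the indebted FIX component drags down Execution directly, and Trading and Simulation transitively, even though neither references FIX itself.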

If a high liability component is not highly volatile, and the costs of switching are high, then making the effort to substitute the component is a low priority. However, dependencies can overrule this judgment because indebted components can impact your cash flows by trickling down into other components that are highly volatile.

A component may be replaceable with an open source counterpart, but if the use of the component to be replaced is not properly encapsulated, then the switching cost can be quite high. It is important to note that in this case the cost of switching is higher because poor encapsulation itself is a technical liability. Often you have to spend a lot of time paying down that debt before you can consider replacing a component.

In the Technical Liabilities article, we talked about benign versus malignant risk. Benign risks are well encapsulated risks in low volatility code. Malignant risks are poorly encapsulated risks, risks in high volatility code, or both. Clearly, the cost of carry on benign risks is far lower than the cost of carry on malignant risks.

Making trade-off decisions

You can evaluate different cost of carry vs. cost of switching scenarios by taking the present value of each cash flow stream for each scenario. You can then compare the net present value of the different scenarios and determine the highest-value course of action. Switching is attractive when the present value of the stream of future cash flows from cost of carry is higher than the present value of the stream of future cash flows from cost of switching.
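A minimal sketch of that comparison, with an invented discount rate and cost streams; the point is the shape of the calculation, not the numbers:

```python
# Compare the present value of two cost streams (in hours per period):
# keep carrying the debt, or pay the switching cost up front. The
# discount rate and cash flows are illustrative assumptions.
def present_value(cash_flows, rate=0.02):
    """Discounted value of a per-period cost stream."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, 1))

# Carrying the debt: a steady drag, every iteration, indefinitely.
carry = present_value([8] * 12)

# Switching: a large up-front payment, then near-zero drag.
switch = present_value([30, 20] + [1] * 10)

if switch < carry:
    print("switching is the higher-value course of action")
```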

The next article in this series will be a field guide for execution, dedicated to details and examples of using this mental model to make trade-off decisions.





Thursday, November 13, 2008

Cross - Technical Equity

By Brad Cross - 13 November 2008

Recall that previously, we tallied code metrics by component to produce a table of technical liabilities. Then we scored the degree of substitutability of each component on a scale of 1 to 4 to assign each an asset value. To determine owner's equity, we need to compare asset and liability valuations, which means we need to transform the collection of metrics in the table of technical liabilities into an index comparable to our asset valuation.

Are you Equity Rich or Over-Leveraged?

There are two ways we can make this transformation. One is to devise a purely mathematical transformation that weighs all the technical scores into a single metric. To do this, convert each metric to a 100% scale (where 100% is the bad side of the scale), sum the scaled values and divide by 100. This gives you a number from 0 to n, with n being the number of metrics in your table. Then multiply by 4/n, which transforms your aggregated metric to a scale with a maximum of 4. For example, if you have 60% test coverage, 20% duplication and 150 bug warnings from static analysis (where 200 is the maximum number of bug warnings in any of your components), you can compute: (1 - 60%) + 20% + 150/200 = 135%. Since we have 3 metrics and we want the final result on a 4-point scale, we multiply by 4/3. This gives a score of 1.8 out of 4.
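The worked example above, in code. The 200-warning maximum comes from the text; everything else follows the formula as stated:

```python
# Aggregate a basket of metrics into one liability score on a 4-point
# scale: convert each metric to a 0-100% "badness" scale, sum, and
# rescale by 4/n where n is the number of metrics.
def liability_score(coverage, duplication, warnings, max_warnings=200):
    badness = (1 - coverage) + duplication + warnings / max_warnings
    n_metrics = 3
    return badness * 4 / n_metrics

# 60% coverage, 20% duplication, 150 warnings -> 1.35 * 4/3 = 1.8 of 4.
score = liability_score(coverage=0.60, duplication=0.20, warnings=150)
print(round(score, 2))
```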

Alternatively, we can devise a 1-to-4 scale similar to our asset valuation scale. This allows us to combine quantitative metrics with the qualitative experience we have from working with a code base.

As in finance, excessive liabilities can lead to poor credit ratings, which leads to increasing interest rates on borrowing. In software, we can think of the burden of interest rate payments on technical debt as our cost of change, something that will reduce the speed of development.

Technical debt, like any form of debt, can be rated. Bond credit ratings are cool, so we will steal the idea. Consider four credit rating scores according to sustainability of debt burden, and how these apply to code:

  1. AAA - Credit risk almost zero.
  2. BBB - Medium safe investment.
  3. CCC - High likelihood of default or other business interruption.
  4. WTF - Bankruptcy or lasting inability to make payments.

A AAA-rated component is in pretty good shape. Things may not be perfect, but there is nothing to worry about. There is a reasonable level of test coverage, the design is clean and the component's data is well encapsulated. There is a low risk of failure to service debt payments. Interest rates are low. The cost of change in this component is very low. Development can proceed at a fast pace.

A BBB component needs work, but is not scary. It is bad enough to warrant observation. Problems may arise that weaken the ability to service debt payments. Interest rates are a bit higher. This component is more costly to change, but not unmanageable. The pace of development is moderate.

A CCC component is pretty scary code. Sloppy and convoluted design, duplication, low test coverage, poor flexibility and testability, high bug concentration, and poor encapsulation are hallmarks of CCC-rated code. The situation is expected to deteriorate, the risk of interruption of debt payments is high, and bankruptcy is a possibility. Interest rates are high. Changes in this component are lengthy, painful, and expensive.

A WTF component is the kind of code that makes you consider a new line of work. Bankruptcy-insolvency is the most likely scenario. Interest rates are astronomically high. An attempt to make a change in this component is sure to be a miserable, slow and expensive experience.

Expanding on the example we've been using, let's fill out the rest of the balance sheet and see what owner's equity looks like.

Component     | Assets | Liabilities | Equity | Leverage
------------- | ------ | ----------- | ------ | --------
Brokers       | 2      | 3           | -1     | Infinity
Data          | 2      | 3           | -1     | Infinity
DataProviders | 2      | 3           | -1     | Infinity
DataServer    | 2      | 2           | 0      | Infinity
Execution     | 3      | 2           | 1      | 3
FIX           | 1      | 4           | -3     | Infinity
Instruments   | 3      | 3           | 0      | Infinity
Mathematics   | 3      | 1           | 2      | 3/2
Optimization  | 3      | 1           | 2      | 3/2
Performance   | 3      | 1           | 2      | 3/2
Providers     | 1      | 1           | 0      | Infinity
Simulation    | 3      | 1           | 2      | 3/2
Trading       | 3      | 3           | 0      | Infinity
TradingLogic  | 4      | 3           | 1      | 4

This table naturally leads to a discussion of trade-offs such as rewriting versus refactoring. Components with negative equity and low asset value are candidates for replacement. Components with positive equity and middling asset value are not of much interest: owning something of little value is neither exciting nor worrying, whereas owning something of little value that carries a heavy debt burden is actually of negative utility. Components with high asset value but low equity are a big concern; these are the components we need to invest in.

In addition to thinking about how much equity we have in each component, we can also think about how leveraged each component is, i.e. how much of a given component's asset value is backed by equity, and how much is backed by liability. This measure of leverage is called the debt to equity ratio. An asset value of 3 with a liability of 2 leaves you with an equity value of 1, and you are leveraged 3-to-1, i.e. your asset value of 3 is backed by only 1 point of equity. Any asset with negative equity has infinite leverage, which indicates a severe debt burden.
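These definitions are simple enough to state in a few lines. The sketch below mirrors the 1-to-4 scales from the tables, treating non-positive equity as infinite leverage:

```python
# Equity and leverage as defined above: equity = assets - liabilities,
# leverage = assets / equity (the debt to equity framing from the text),
# with non-positive equity treated as infinitely leveraged.
import math

def equity_and_leverage(assets, liabilities):
    equity = assets - liabilities
    leverage = math.inf if equity <= 0 else assets / equity
    return equity, leverage

assert equity_and_leverage(3, 2) == (1, 3.0)        # leveraged 3-to-1
assert equity_and_leverage(3, 1) == (2, 1.5)        # i.e. 3/2
assert equity_and_leverage(2, 3) == (-1, math.inf)  # severe debt burden
```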

The Technical Balance Sheet Applied

This exercise makes the decisions we face easier. In the example above, there are a number of components with an asset value of 2 and liabilities of 2 or 3. This led me to replace all the custom persistence code with a thin layer of my own design on top of db4o (an embeddable object database). I deleted the Brokers and DataProviders components, developed my own replacements from scratch, and extracted new interfaces.

The FIX component, with an asset value of 1 and liabilities of 4, obviously needed to go. However, although the component has high negative equity, some experimentation showed that the cost of removing it was actually quite high due to the proliferation of trivial references to it. I have gradually replaced references to the FIX component and chipped away at dependencies on it, and it will soon be deleted entirely.

There are a number of components with an asset value of 3 or 4 but with liabilities of 2 or 3. These are the most valuable parts of the product, containing the core business code. However, some have 20% or less test coverage, loads of duplication, sloppy design, and many worrisome potential bugs. Due to their high asset value, these components warrant investment. I considered rewriting them, but in these cases the best bet is most often to pay down the technical debt by incrementally refactoring the code. A subtle bonus of refactoring over rewriting is that each mistake in the code reveals something that doesn't work well, which is valuable information for future development. When code is rewritten, these lessons are typically lost and the mistakes are repeated.

We now have a sense of the debt, asset value, and technical ownership we've accumulated in our code base. Our next step is to more fully understand trade-off decisions by weighing discounted cash flows: the cost of carry versus the cost of switching; stated differently, the cost of supporting the technical debt versus the cost of eliminating it.

Thursday, November 6, 2008

Cross - Discovering Assets and Valuing Intangibles

By Brad Cross - 6 November 2008

In the last article in this series, we identified our technical liabilities by component, with each component representing a functional area. Now we will do the same with the asset side of the balance sheet.

The way a code base is structured determines the way it can be valued as an asset. If a code base maps well to a business domain, chunks of code can be tied to the revenue streams they support. This is easier with software products than with supporting infrastructure. Sometimes it is extremely difficult to construct a straightforward chain of logic from revenue to chunks of the code base. In situations like this, where the mapping is not so obvious, you have to be a bit more creative. One possibility is to value the code as an intangible asset, following the practices of goodwill accounting.

An even more complicated scenario is a monolithic design with excessive interdependence. In this case, it is very difficult even to consider an asset valuation breakdown, since you cannot reliably map from functional value to isolated software components. This situation is exemplified by monolithic projects that become unwieldy and undergo a rewrite. It is a case where bad design and lack of encapsulation hurt flexibility. Without a well-encapsulated, component-based design, you can only analyze value at the level of the entire project.

Substitutability as a Proxy for Asset Valuation

One way to think about asset valuation is substitutability. The notion of code substitutability is influenced by Michael Porter's Five Forces model and is similar to the economic concept of a substitute good. Think of code components as lying on a spectrum from cheap to expensive to replace. "Cheap to replace" components are those replaceable by open source or commercial components with minimal customization. "Expensive to replace" components are those only replaceable through custom development, i.e. refactoring or rewriting.

Substitutability of code gives us four levels of asset valuation that can be rank ordered:

  1. Deletable (lowest valuation / cheapest to replace)
  2. Replaceable
  3. Valuable
  4. Critical (highest valuation / most expensive to replace)

Consider a collection of software modules and their corresponding valuation:

| Module        | Substitutability | Module       | Substitutability |
|---------------|------------------|--------------|------------------|
| Brokers       | 2                | Mathematics  | 3                |
| Data          | 2                | Optimization | 3                |
| DataProviders | 2                | Performance  | 3                |
| DataServer    | 2                | Providers    | 1                |
| Execution     | 3                | Simulation   | 3                |
| FIX           | 1                | Trading      | 3                |
| Instruments   | 3                | TradingLogic | 4                |

A critical component is top-level domain logic. In the example table, only the TradingLogic component is critical. This component represents the investment strategies themselves; supporting the research, development, and execution of trading strategies is the purpose of the software.

A valuable component exists to support the top-level domain logic. Here, there is a trading component that supports the trading logic by providing building blocks for developing investment strategies. There is also a simulation component that is the engine for simulating the investment strategies using historical data or Monte Carlo methods. You can also see a handful of other components that provide specialized support to the mission of research, development, and execution of investment strategies.

A replaceable component is useful, but it is infrastructure that can be replaced with an open source or commercial alternative. Components in this category may do more than off-the-shelf alternatives, but an off-the-shelf replacement can easily be modified to meet requirements. In the example above there are four components in this category, related to broker APIs or the persistence layer of the software; both broker APIs and persistence are replaceable by a wide variety of alternatives.

A deletable component is one from which little or no functionality is used, allowing us to either delete the entire thing or extract a very small part and delete the rest. This includes the case where you "might need something similar" but the current implementation is useless. In the example, the “Providers” component is entirely useless.

Accepting Substitution

It is important to consider psychological factors when estimating value. Emotionally clinging to work we have done in the past will derail this process. For example, perhaps I have written a number of custom components of which I am very proud. Perhaps I think that work is a technical masterpiece, and over time I may have convinced others that it provides a genuine business edge. If we can't put these biases aside, we talk ourselves into investing time in code of low value, and we miss the opportunity to leverage off-the-shelf or open source alternatives that solve the same problem in a fraction of the time.

In this article we've defined a way to identify our assets by business function and to value them either by their relationship to revenue streams or, by proxy, through their degree of substitutability. In the next installment, we'll take a look at switching costs and volatility to set the stage for thinking in terms of cash flows when making trade-off decisions in our code.




About Brad Cross: Brad is a programmer.

Thursday, August 7, 2008

Cross - Technical Liabilities, Malignant Risks

By Brad Cross - 7 August 2008

The previous article in this series introduced the concept of a technical balance sheet. This isn't just an interesting metaphor: it is relatively easy to construct a technical balance sheet for any project.

Discovering Technical Liabilities

Recall the basic accounting formula: assets - liabilities = equity. A technical liability is code that incurs a higher relative cost of carry. This can mean sloppy code, untested code, a quick-and-dirty temporary solution, and so on. We measure technical liabilities by selecting metrics that make sense for a particular project's concerns.

The key words are "that make sense." Metrics pragmatically selected by experienced practitioners tend to coincide with intuition about the quality and flexibility of code. Metrics selected and applied naively can lead to an utterly useless exercise in math-turbation.

Metrics have limited use. They can tell us a lot about technical liabilities, but they don't reveal much about our assets. It is important to recognize the difference, because managing to increase "positive" metrics can lead to trouble. The goal is not to manage toward "positive" metrics so much as to manage away from "negative" metrics. In financial terms, think about metrics from the standpoint of minimizing risk, not maximizing return.

Consider test coverage. 0% test coverage is a liability, but 100% test coverage is not necessarily an asset: it's possible to have 100% test coverage over a binary that crashes when it's deployed. Viewing test coverage as an asset and managing toward maximum coverage can result in asinine practices that distract attention from creating business assets. Software has asset value only insofar as it is useful. The objective of a development team isn't to achieve high test coverage; it is to maximize equity by producing business assets without eroding returns through technical liabilities.

Metrics are limited to assessing technical liabilities, and even then, not every negative indicator they report is in fact a technical liability. The extent to which a particular metric exposes a true liability is a function of the risk that it will undermine the business asset we're delivering. Some risks are benign: portions of the system that have little business impact, or that will soon be retired, don't require urgent mitigation. Others are malignant: core system components with opaque, high-maintenance structures are a very high priority. We must keep this in mind when discovering our liabilities.

Quantifying Liabilities

There are many tools for generating metrics (most of them open source) that we can use to discover our liabilities. But tools are only valuable if they generate actionable metrics, meaning they must draw attention to specific things in our code that appear to be problems. Fancy charts don't tell us as much as raw scores with a list of top offenders do.

The abundance of available analysis tools requires that we use them judiciously. We must select only a handful of metrics that are directly relevant to our objectives, and define thresholds, objectives, and tolerances before we apply them.
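Defining thresholds and tolerances up front can be as simple as a small per-metric rule table checked against each component's scores. This is a hedged sketch of that idea; the metric names and limit values are invented for illustration, not taken from the article.

```python
# Hypothetical tolerances, agreed before any metrics are gathered
THRESHOLDS = {
    "coverage_pct":    {"min": 50},   # below this, flag as a liability
    "duplication_pct": {"max": 10},
    "fxcop_issues":    {"max": 50},
}

def breaches(component, metrics, thresholds=THRESHOLDS):
    """Return (component, metric, value) for every threshold breach."""
    flagged = []
    for name, value in metrics.items():
        rule = thresholds.get(name, {})
        if "min" in rule and value < rule["min"]:
            flagged.append((component, name, value))
        if "max" in rule and value > rule["max"]:
            flagged.append((component, name, value))
    return flagged

# The FIX numbers from the example statement trip the coverage
# and duplication rules, but not the FxCop rule
hits = breaches("FIX", {"coverage_pct": 1, "duplication_pct": 81,
                        "fxcop_issues": 27})
assert len(hits) == 2
```

Keeping the rules in one table makes the "define before you apply" step explicit and reviewable.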

Things like test coverage (Emma, NCover) and duplication analysis (Simian, CPD) are well established and don't require extensive explanation. Static and dynamic analysis are far more sophisticated. Before attempting to put metrics on these, we must have a clear understanding of what "code quality" means in terms of flexibility, extensibility, testability, and any other "-ilities" that strike our fancy. Some tools attempt to find potential bugs and programming errors (FindBugs for Java; FxCop for .NET; Valgrind, Helgrind, and Cachegrind), while others are configurable to let us find crufty code under different definitions of cruftiness (NDepend and JDepend, Checkstyle, PMD, PC-lint, etc.). Still others, such as Testability Explorer for Java, measure the testability of code. With these tools we can measure testability, depth of inheritance, cyclomatic complexity, coupling, dependency graphs, lines of code per class or method, and so forth.

Non-functional requirements such as load, scalability, and performance should be given equal footing with functional requirements. For example, a large-scale, performance-intensive system needs to treat performance bottlenecks as technical liabilities. To identify and isolate these, create tests that simulate load. Similarly, projects with user-facing web front-ends will place a very high priority on maintaining a consistent user interface style. Treat inconsistent style as a liability: write tests to make sure that all user interfaces gather style settings (colors, etc.) from the same properties files. Finally, bear in mind that other developers are users of the services we create: apply the same care to API design, and write tests from the perspective of the API's user.

Statement of Technical Liability

Once we have a collection of metrics we can bring them into a summary statement that frames our liabilities without overloading us with data.

Here is an example statement of liabilities for a C# project in the investments field.

| Functional Area | FxCop Issues | Coverage | Lines | Duplicated Lines | % Duplication |
|-----------------|--------------|----------|-------|------------------|---------------|
| Brokers         | 139          | 1%       | 2984  | 234              | 8%            |
| Data            | 52           | 31%      | 1450  | 297              | 20%           |
| DataProviders   | 59           | 1%       | 1210  | 78               | 6%            |
| DataServer      | 27           | 48%      | 1489  | 40               | 3%            |
| Execution       | 7            | 48%      | 618   | 0                | 0%            |
| FIX             | 27           | 1%       | 48484 | 39337            | 81%           |
| Instruments     | 133          | 55%      | 12896 | 714              | 6%            |
| Mathematics     | 77           | 56%      | 2551  | 205              | 8%            |
| Optimization    | 25           | 60%      | 305   | 26               | 9%            |
| Performance     | 2            | 73%      | 134   | 0                | 0%            |
| Providers       | 36           | 47%      | 707   | 42               | 6%            |
| Simulation      | 20           | 77%      | 241   | 0                | 0%            |
| Trading         | 54           | 50%      | 2955  | 472              | 16%           |
| TradingLibrary  | 66           | 29%      | 7035  | 1674             | 24%           |

The first thing that should stand out is that we are not summarizing across the project, only observing at the level of functional area. Put another way, we are interested in our liabilities in the specific areas where the business derives asset value, because a technical balance sheet is an analysis of how the technical characteristics of an asset might be undermining its value. This is easier when the top-level directory structure is close to a 1-to-1 mapping with functional areas. The more convoluted the structure of the code relative to the structure of the functionality, the more convoluted the mapping between technical liabilities and the asset value that functionality provides.

Take a look at the FIX project. FIX is a protocol for financial messaging. This library appears to be an ill-conceived wrapper around the QuickFIX open source FIX engine. It accounts for half the code base, has 1% test coverage, and 81% of its code is duplicated. Seems like it should be a top priority, right? It isn't. The reason is that it carries benign risk: the code is used in a few places in a straightforward way, and there are tests around those usages, so its impact is limited. While it would be nice to be rid of the entire library, the risk has already been mitigated and there are costs associated with removing the library altogether, so there are better uses of time right now.
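The benign-versus-malignant distinction amounts to weighting raw liability by business impact, which is why FIX falls down the priority list despite its numbers. The following is an illustrative sketch of that weighting; the impact values are hypothetical, not from the article.

```python
def priority(raw_liability, business_impact):
    """Malignancy score: raw liability (1-4) scaled by impact (0..1)."""
    return raw_liability * business_impact

# FIX is the worst code in the statement, but it is isolated behind a
# few tested call sites, so its assumed business impact is low.
fix = priority(raw_liability=4, business_impact=0.1)

# Trading is core business code, so even moderate liabilities there
# carry high weighted risk.
trading = priority(raw_liability=3, business_impact=0.9)

assert trading > fix  # the core component outranks the messy wrapper
```

The point is not the specific numbers but the ordering: a risk-weighted view can invert the ranking a raw metric report would suggest.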

Servicing Technical Debt

Servicing technical debt is not conceptually different from servicing financial debt. If we are executing well on our business model but our equity goes negative and we can no longer service our debt, we are bankrupt. The same holds true for execution risk: if technical liabilities get out of hand, the cost of carry on the technical debt will bankrupt the project despite an otherwise healthy operational model. We can get visibility into this by placing statements of our technical liabilities and risks side-by-side with our project management reporting, ordered by business function. This gives us a more complete assessment of our overall execution risk and lets us know whether we're inflating our assets.

Now that we have a statement of liabilities, our next step will be to examine the asset side of the technical balance sheet. From there, we can create a balance sheet that will allow us to make trade-off decisions, engineer turnarounds, or declare a write-off.



