alphaITjournal

Wednesday, September 17, 2008

Pettit - Is Your Project Team "Investment Grade?"

by Ross Pettit - 17 September 2008

One of the most important indicators of risk in debt markets is the grade (or collateralized debt obligations. Despite the controversy, the rating agencies remain the authority on assessing credit quality. Their impact on AIG's efforts to raise capital this week indicates how much market influence the rating agencies have.

There are several independent companies that assess the credit quality of bonds. The bond rating gives an indication of the probability of default. Although the bond is what is rated, the rating is really a forecast of the ability of the entity behind the bond – e.g., a corporation or sovereign nation – to meet its debt service obligation.

Each rating firm uses a different and proprietary approach to assess credit quality, involving both quantitative and qualitative factors. For example, bond ratings by Moody’s Investors Service reflect long-term risk consideration, predictability of cash flow, multiple negative scenarios, and interpretation of local accounting practices. In practical terms, this means that things such as macro and micro economic factors, competitive environment, management team, and financial statements are all factors in determining the credit worthiness of a firm.

Rating agencies are subsequently able to characterise the risk of debt investments. An investment grade bond will have lower yield but offer higher safety (that is, lower probability of default). A junk bond will have higher yield but lower safety. Between these extremes are intermediate levels of quality: a bond that is rated AA will have very high credit quality, but lower safety than a AAA bond, while a bond rated at A or BBB, while still investment grade, indicates lower credit quality than a AA bond.

This concept is portable to IT. Just as the entity behind a bond is rated, a team behind an IT asset under development can be rated for its “delivery worthiness.” The difference is that we look to the rating not as an indicator of the risk premium we demand, but as a threat (and therefore a discount) to the yield we should expect from the investment.

To rate an IT team, we can look at quantitative factors, such as the raw capacity of hours to complete an estimated workload, variance in the work estimates, and so forth. But we also need to look to qualitative factors. Consider the following:

Are we working on clear, actionable statements of business need? Are requirements independent statements of business functionality that can be acted upon, or are they descriptions of system behaviour laden with dependencies and hand-offs?
Are we creating technical debt? Is code quality good, characterised by a high degree of technical hygiene (is code written in a manner that it can be tested?) and an absence of code toxicity (e.g., code duplication and cyclomatic complexity)?
Are we working transparently? Just as local accounting practices may need to be interpreted when rating debt, we must truly understand how project status is reported. Are we managing and measuring delivery of complete business functionality (marking projects to market), or are we measuring and reporting the completion of technical tasks (marking to model) with activities that complete business functionality such as integration and functional test deferred until late in the development cycle?
Are we delivering frequently, and consistently translating speculative value into real asset value? In the context of rating an IT team, delivered code can be thought of synonymously with cash flow. The more consistent the cash flow, the more likely a firm will be able to service its debt.
Are we resilient to staff turnover? Is there a high degree of turnover in the team? Is this a “destination project” for IT staff? Is there a significant amount of situational complexity that makes the project team vulnerable to staff changes?

At first glance, this may simply look like a risk inventory, but it’s more than that. It’s an assessment of the effectiveness of decisions made to match a team with a set of circumstances to produce an asset.

There are few, if any, absolute rules to achieving a high delivery rating. For example, assigning the top talent to the most important initiative may appear to be an obvious insurance policy for guaranteeing results. But what happens if that top talent is bored to tears because the project isn't a challenge? Such a project – no matter how much assurance is given to each person that they are performing a critical job – may very well increase flight risk. If that materialises, the expectation for returns of that project will crater instantly. If it's not expected, a team can appear to change from investment grade to junk very quickly.

While the rules aren’t absolute, the principles are. An IT team developing an asset expected to yield alpha returns will generally be characterised as a destination opportunity offering competitive compensation, operating transparently with actionable requirements, maintaining capability "liquidity" and a healthy “lifestyle,” and delivering functionally complete assets frequently to reduce operational exposure. All of these are characteristics that separate a team that is investment grade from one that is junk.

While these factors are portable across projects, they may not be identically weighted for every team. This doesn’t undermine the value of the rating as much as it means we need to be acutely aware of the circumstances that any team faces. This also means that assessing the delivery worthiness of a team is born of experience, and not a formulaic or deterministic exercise. While the polar opposites of investment-grade and junk may be clear, it takes a deft hand to recognise the subtle differences between a team that is worthy of a triple-A rating and one that is worthy of a single A, and even why that distinction matters. It also requires a high degree of situational awareness – employment market dynamics, direct inspection of artifacts (review of requirements, code, software), and certification of intermediate deliverables – so that the rating factors are less conjecture and more fact. Finally, it is an exercise to be repeated constantly, as the “market factors” in which a team operates – people, requirements, technology, suppliers and so forth – change constantly. This is consistent with how the rating agencies bring benefit to the market: they are not formulaic, they spend significant effort to interpret data, and they are updated with changing market conditions.

FDIC Chairman Sheila Bair commented recently that we have to look at the people behind the mortgages to really understand the risk of mortgage-backed securities. With IT projects, we have to look at the people and the situations behind the staffing spreadsheets and project plans. IT is a people business. We can measure effectiveness based on asset yield, but we are only going to be as effective as the capability we bring to bear on the unique situation – technological, geographical, economic, and even social-political – that we face. Rating is one means by which we can do that.

Investors in financial instruments have a consistent means by which to assess the degree of risk among different credit instruments. IT has no such mechanism to offer. Just as debt investors want to know the credit worthiness of a firm, so should IT investors know the delivery worthiness of their project teams.

Especially when alpha returns are on the line.

About Ross Pettit: Ross has over 15 years' experience as a developer, project manager, and program manager working on enterprise applications. A former COO, Managing Director, and CTO, he also brings extensive experience managing distributed development operations and global consulting companies. His industry background includes investment and retail banking, insurance, manufacturing, distribution, media, utilities, market research and government. He has most recently consulted to global financial services and media companies on transformation programs, with an emphasis on metrics and measurement. Ross is a frequent speaker and active blogger on topics of IT management, governance and innovation. He is also the editor of alphaITjournal.com.

Wednesday, September 10, 2008

Hevery - Changing Developer Behaviour, Part II

By Miško Hevery - 10 September 2008

In Part I of this series, we took a realistic look at what usually happens when we initiate change. We also took a look at the initial steps of effective change: defining a metric and getting people to accept it as a goal. In this second and final part, we'll introduce two additional steps and also highlight the point at which it becomes clear that change has taken effect.

Step 3: Make Progress Visible

When progress toward achieving a goal is highly visible, it can't be ignored:

It keeps the goal fresh in everyone's mind and helps to prevent regressive behaviors.
It communicates progress to all stakeholders, especially those who may not understand the details of the work.
It provides an opportunity for many small celebrations for a team on the way to achieving its goal.
There's a direct relationship between what developers do and changes to the visible "progress meter," providing a source of pride and "bragging rights" for individuals.
Regression can be easily identified, right down to the work done (the commit) and the "guilty" (the committer).

We can make progress visible by publishing both the raw metric and a burn down chart which computes the estimated date of completion based on the rate of progress.

This makes it easier to answer the obvious question, "When is it going to be done?" It also provides a fact-based response should there be a need to push back on demands, e.g., "you need to get this done by X date."
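The projection behind such a burn down chart can be sketched in a few lines. This is only an illustration, assuming the metric is sampled once per build and that a straight-line extrapolation of the average rate of progress is good enough; the function name, the sample dates and the metric values are all hypothetical.

```python
import math
from datetime import date, timedelta


def projected_completion(history, goal):
    """Extrapolate the date on which a lower-is-better metric reaches `goal`.

    `history` is a list of (date, value) pairs, oldest first. The estimate
    uses the average rate of improvement between the first and last samples.
    Returns None when the metric is flat or getting worse.
    """
    (first_day, first_value), (last_day, last_value) = history[0], history[-1]
    if last_value <= goal:
        return last_day  # goal already reached
    elapsed_days = (last_day - first_day).days
    improvement = first_value - last_value
    if elapsed_days <= 0 or improvement <= 0:
        return None  # no progress, so no meaningful projection
    remaining = last_value - goal
    days_left = math.ceil(remaining * elapsed_days / improvement)
    return last_day + timedelta(days=days_left)


# Hypothetical samples: the metric fell from 500 to 455 in two weeks.
history = [
    (date(2008, 9, 1), 500),
    (date(2008, 9, 8), 480),
    (date(2008, 9, 15), 455),
]
estimate = projected_completion(history, goal=50)
print(estimate)
```

Because the estimate is recomputed from past progress on every build, the projected date moves out as soon as the rate of improvement slows, which is what makes the chart hard to ignore.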

The continuous build status page is very well suited to this. With every build, the change metric can be computed and published automatically. The continuous build status page will then show a current chart of progress over time.

Going back to our example, we can use Testability Explorer to compute an overall cost number. Let's say that our project scores a cost of 500, and we make it our goal to lower the cost to 50 or below. As part of our continuous build we can produce a graph of the testability cost over time. We can project this graph in a highly visible location (on the wall, on the ceiling) so that it is on everyone's mind all the time. We can also easily identify people who are improving the situation with the code they commit. By calling out their contribution to the team, we get additional buy-in from the team.

Before long, there will be an anxious manager running around asking, "why is the graph not falling fast enough?" When that happens, the change process is self-running and will complete itself. By keeping the graph current and visible even after the goal is reached, the change will be durable.

Step 4: Make it Required

By and large, developers care only about checking-in code. They are content to continue bad habits if those habits enable them to check in work sooner. When the priority is to get code checked-in, all those fancy graphs will simply show the team getting further away from its stated goal.

If this happens, no amount of meetings and discussion will make a difference, but one thing will: create a unit test that makes sure that any code change can only move the metric in a positive direction. This way any changes to the source code that take the team further from the goal fail the build. Once the build goes into a "red" or broken state, it needs to be fixed. This means that every developer will have to deal with the test to get the build to pass. Fixing the code to pass the build brings the team closer to its goal. A specific code change made to resolve a broken build makes the "what has to change" very clear to every person on the team.
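One way to picture such a test is as a "ratchet": the build remembers the best value of the metric seen so far and refuses anything worse. The sketch below is an assumption about how this could be wired up, not a description of any particular tool; the function name, the JSON baseline file and the sample values are hypothetical, and it assumes a lower-is-better metric.

```python
import json
import tempfile
from pathlib import Path


def ratchet_allows(current, baseline_file):
    """Return True if `current` is no worse than the recorded baseline.

    On success the baseline is tightened to `current`, so the metric can
    only stay level or improve. On failure the baseline is left untouched.
    """
    path = Path(baseline_file)
    baseline = json.loads(path.read_text())["cost"] if path.exists() else None
    if baseline is not None and current > baseline:
        return False
    path.write_text(json.dumps({"cost": current}))
    return True


# Demonstration with a throwaway baseline file.
with tempfile.TemporaryDirectory() as tmp:
    baseline_file = Path(tmp) / "metric_baseline.json"
    results = [
        ratchet_allows(500, baseline_file),  # first run records the baseline
        ratchet_allows(480, baseline_file),  # improvement passes and tightens it
        ratchet_allows(490, baseline_file),  # worse than 480, so the build fails
    ]
print(results)  # [True, True, False]
```

In a real build, a unit test would compute the current metric and assert that the check returns True, so that a regression turns the build red.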

Expect a lot of uproar, and that each developer will require individual attention to help them re-factor their code so that it passes the unit test. The screaming will stop as soon as each person has received individual attention to learn how to code correctly. This won't move the needle on the chart, but the numbers won't get any worse.

This suggests that early on, it will need to be someone's job to re-factor code. Only then will the chart move in the desired direction. Very likely, this is the only way the team will make any progress initially. This is because developers will know how to write new code, but they won't know how (or won't be motivated) to re-factor the old code to conform to the new standard. Additionally, they will not yet have experienced any benefits from this new way of doing things. This typically takes a few months. While this is taking root, you will need to be patient to make slow but steady progress refactoring the existing code until developers catch up.

The Tipping Point

At some point, developers themselves will start refactoring old code and the rate of progress will accelerate. The interesting thing is that even if you - the agent of change - leave a project at this point, the goal will eventually be reached. The visibility of the charts clearly shows where the team is and how much remains to achieve the goal, and this becomes ingrained into the everyday life of both developers and management. People will see the estimated date of completion and start counting the days to it. Each person will also keep the team focused on the goal. Because the estimated time of completion is based on past progress, the projected date of completion will slip if the rate of refactoring slows down. When this happens, people will ask why the date slipped. This brings attention back to the effort and reignites it. This feedback loop will exist even without a full-time change agent.

Every change has its opponents. At some point the most vociferous opponents become the most vocal champions because they experience the benefits of it first-hand. At the same time, all "passengers" in a project team find they're doing something for so long that they can't imagine working any other way. The team achieves its goal either by strong desire or because it has developed new "muscle memory."

Behaviour Changes when Goals are Visible and Reinforced

To successfully make change, it isn't enough to get consensus to try something new. You need a means by which to quantify a goal, objectively measure progress, and constantly remind people of how well they're doing toward achieving that goal. If these things aren't done, developers will find reasons and excuses not to change. This creates a downward spiral: by carrying on with business as usual, skills stagnate and technical debt accumulates. The irony is that precisely when a team most needs to make change, its own behaviors work against it. A high degree of visibility, and immediate, constant feedback, can overcome even the most difficult team situations.

About Miško Hevery: As an Agile Coach, Miško is responsible for teaching his co-workers to maintain the highest level of automated testing culture, allowing frequent releases of applications with high quality. He is very involved in the Open Source community and is the author of several open source projects. Recently his interest in Test Driven Development turned into Testability Explorer, which he hopes will change the testing culture of the open source community.

Wednesday, September 3, 2008

Breidenbach - David and Goliath

by Kevin E. Breidenbach - 3 September 2008

What does the story of David and Goliath have to do with a technology journal? That will become obvious in short order, so bear with me.

I work in capital markets technology, specifically trading. Many years ago, a team I was on moved from a “waterfall” methodology to Agile. Waterfall is in quotes because, like any team that claims to follow waterfall, they really had no methodology. People will try to pass off a requirements document and a few design documents as “waterfall” despite the fact that the finished software (if there is finished software) rarely looks like the documents that were used to define it.

Anyway, without getting into the history of why the team chose Agile, suffice it to say it was a success. So much so that it led to the adoption of Agile company-wide, with this team the reference implementation for how it should be done.

Large organizations find it hard to change the way they behave. They employ a lot of people, and making sure those people are all moving in the same direction is very difficult. In other words, large organizations are not very agile. Case-in-point: consider that despite demonstrable benefits, a successful and high-profile implementation and a mass roll-out effort, Agile has still not made it to every development team after all these years. That is decidedly un-agile.

Which leads us to David and Goliath. You would have thought that by now big business would have learned at least something from that story. It’s not just a story from the Christian Bible, it’s also in the Torah; so it’s not really clear why it’s taking business so long to get it. The military did. They started using small agile teams with light but technically advanced weaponry to great success. So if they get it, why can’t big business?

Now, not every big business needs to have agile software development teams. Many firms' computer systems do not need to change for many years, so waterfall works well for them. But financial trading changes daily. Whether it’s new regulation, new products, new algorithms, the trading systems are constantly in need of update.

So while it’s good that trading teams are starting to become agile, there’s more to the problem than just using Agile as a development methodology: the organization needs to think agile. Development teams only produce software, they don’t put it in production. It needs to be deployed onto the infrastructure and administered. The larger the organization, the larger the infrastructure that needs to be maintained. Making changes to infrastructure often becomes an obstacle for development teams to get things done. There are a number of reasons for this.

Infrastructure teams are rarely exposed to Agile as a methodology. This is probably due to the fact that it has, until now, been directed mainly towards development teams. Infrastructure is often subject to rigid methodologies like CMM. Mixing CMM (or any document driven methodology) with Agile reminds me of the movie Die Hard 3, where Bruce Willis and Samuel L. Jackson watched a green and red liquid mix together just before the concoction blew half a New York block away. It isn’t going to work.

The Agile development team has consistent, time-bound iterations of, say, one or two weeks, while the infrastructure team has service level agreements that invariably are longer than one iteration length. This puts a spanner in the works. It means that the development team has to start thinking long in advance. And what if new requirements come along that change the infrastructure requirements of the application? This could mean the infrastructure team will need to go back to the requirements phase of their process, resetting the SLA clock (don’t think this doesn’t happen!) and delaying development.

This isn’t all bad for the development team, though. Because Agile is so transparent, the business will have been involved all along and will see how infrastructure impacts delivery. The result could be (and I pray it is) that the business may actually convince the infrastructure teams that they aren’t really deities, obliging them to start acting like the service they were intended to be. This doesn’t diminish the value of good infrastructure people. It simply means that infrastructure teams need to realize that some businesses, and thereby development teams, need to operate in a different fashion to others and that they should be able to accommodate this without impacting timelines or quality!

The infrastructure team is also afraid of another word: complexity. The more complex the infrastructure, the more difficult and expensive it is to maintain, so fear of the “c” word is understandable. Unfortunately, senior management becomes afraid of complexity and produces draconian rules that restrict development teams in their work. Ironically, the pursuit of controlling complexity can render your organization incapable of being agile.

For example, it’s fair to say that it’s good policy to keep the number of enterprise-wide tools to a minimum. For instance, why support both Microsoft Word and Corel WordPerfect? The same can also be said for operating systems (why have Linux, AIX, Solaris, HP-UX and Windows?) and even server hardware (one manufacturer).

But – and it's a big but – once the task of reducing complexity starts it takes on a life of its own and infiltrates every area of the technology organization. There are edicts that developers shall only use one type of IDE, or that certain open-source libraries or software languages are no longer allowed because we have other libraries or languages that do something similar and, with some workarounds, can do what you need.

This actually increases complexity of the software being produced, and reduces agility of the development team. This is because the developers now have to build those workarounds – they have to fit a square peg into a round hole in order to force some functionality out – when all they really needed was to use a different development language or different library.

Organizations justify this over-zealous complexity policy by saying that it adds cost to support these new libraries, when the fact is that no other support is needed as the libraries are embedded in an archive, or the language produces a simple executable, or the developer supports their own IDEs and tools. Indeed, the cost of the development teams building the workarounds often far exceeds any costs of supporting the tools that they intended to use. Not only that, but the workarounds may actually impact application performance costing the business money during trading!

Which brings us back to the title. David was smaller and more agile than the large and bulky Goliath, who relied on his size and strength. Not only that, but David realized, even if subconsciously, that standardizing on equipment doesn’t work when faced with an enemy like Goliath. Had he fought with just a spear or sword against Goliath then he may not have emerged victorious and the Bible and Torah would be short of a good story! By picking up some new tools in the form of rocks and a slingshot he outsmarted the clear favorite and started the trend of people cheering for the underdog. So increasing the complexity actually had a positive effect on the task at hand!

The military’s small Special Forces teams recognize this too. They have different equipment to the regular large force, their own logistics and even command structure. It works.

Another group that has recognized this is the companies that compete against large financial trading firms. Despite the economic climate, small to medium trading firms have been hiring furiously over the last 18 months while the large banks have had hiring freeze after hiring freeze and layoffs on top of layoffs.

Bigger is better? I’ll let you decide. All I know is that I’d rather be in David’s shoes than Goliath’s!

About Kevin Breidenbach: Kevin has a BSc in computer science and over 15 years of development experience. He has worked primarily in finance but has taken a few brief expeditions into .com and product development. Having professionally worked with assembly languages, C++, Java and .Net he's now concentrating on dynamic languages such as Ruby and functional languages like Erlang and F#. His agile experience began about 4 years ago. Since that time, he has a serious allergic reaction to waterfall and CMM.

Thursday, August 28, 2008

From the Editor | The First Two Months of alphaITjournal.com

By Ross Pettit - 28 August 2008

As summer winds down in the northern hemisphere and people sneak in a few final days away from work, we’ll take a break at alphaITjournal.com to reflect on our first two months of publication.

Any doubts we may have had that alphaITjournal.com would be a consistent source of quality content have been put to rest.

The first indicator of this is relevancy. John Kehoe wrote about end-to-end versus silo-by-silo performance monitoring of an underperforming airline check-in system that had scored the trifecta: high operating costs, long queues and angry customers. Five days before his piece ran in alphaITjournal.com, the Wall Street Journal ran a story about a UK based air carrier that had initiated an urgent and high-profile ad campaign to make clear to its customers that lengthy check-in times (amongst other things) were now a thing of the past. Michael Martin wrote about IT innovation being threatened by misguided decision making during a period of economic uncertainty. His piece ran two weeks before Nassim Taleb (of Black Swan fame) wrote about the same threats to R&D in the healthcare industry in the Financial Times. By tapping into the problems, challenges and opportunities facing businesses today, the content in alphaITjournal.com is proving to be highly relevant, even prescient.

The second indicator is variety. Ten different people have contributed articles covering a wide swath of high-performance IT. Mark Rickmeier, Jane Robarts and John Kehoe raised questions about panaceas in sourcing and infrastructure. Kent Spillner pointed out a key anti-pattern in project execution: the false optimism created by time between deliverables. Miško Hevery and Carl Ververs both address people aspects of IT, specifically patterns of behaviour change critical to continuous improvement and contending with a generation gap in the IT workforce. Brad Cross and I make clear that IT investments must be framed, managed and executed explicitly as use of a firm's capital.

The third indicator is sustainability. alphaITjournal.com has featured at least one article each week since its inception, meaning we've sourced, composed, edited and published at a sustainable rate for two months. As a community we've overcome the challenges we face as practitioners - client and project deadlines, open source obligations, conference and writing demands, as well as personal commitments - to consistently meet our goal.

Readers have raised a number of questions, and in time we do intend to address them all. We're going to improve the user interface so that visited links aren’t next to impossible to see (and no doubt very difficult to read on this link-heavy page). We're going to add reader comments to each article. We’re going to include e-mail addresses so you can contact our writers directly. And we're going to enable registration so that we’re better able to stay in touch with our community. Getting them done is subject to those same deadlines, obligations, demands and commitments that we have to overcome to produce content. I appreciate your patience while we work through the fit-and-finish.

Looking forward, we have reason to be optimistic. Our statistics indicate a growing community of repeat readers. Our current roster of writers has a full backlog of topics and ideas, and we continue to attract new writers. We have some exciting opportunities to further distribute our content. And above all else, we have some great pieces in the pipeline: as I write this, our writers are composing some great series and upcoming pieces.

As Editor of alphaITjournal.com, and on behalf of the people who make this publication possible, I thank you, our readers, for being part of the alphaITjournal.com community. I hope you share in our excitement and enthusiasm for what lies ahead.

Best regards,

Ross Pettit
Editor
alphaITjournal.com

Tuesday, August 19, 2008

Hevery - Changing Developer Behaviour, Part I

by Miško Hevery - 19 August 2008

So you've figured out a better way of doing things, but how do you get everyone to change the way they work and start writing code in this better way? This is something I face daily in my line of work as a best practices coach. Here are my battle tested tricks which work even after I leave the project.

What usually happens

Most change initiatives fail because people don't appreciate the reasons for making a change, or because they get demotivated by the progress of change.

Think about what usually happens. Suppose that you have recently discovered the benefits of dependency injection. The dependency injection principle requires that object creation code be kept separate from application logic. Dependencies should be found through the constructor, not by creating them or looking them up in global state. This makes code more maintainable and easier to test, which results in a higher quality codebase. Armed with all of these benefits you decide to implement this on your project. As the lead of the project you give a tech talk to the team on the benefits of dependency injection. Developers come to your tech talk, you all take a vote and everyone agrees that this is what needs to be done. Everybody goes back to their workstations full of enthusiasm. People attempt dependency injection in the first few changes they make and you are excited, but a month later everyone is back to their old routine and nothing has changed. How can we break this cycle and achieve lasting change?
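For readers new to the principle described above, here is a minimal sketch of the difference it makes. It is written in Python for brevity, and every class name in it is hypothetical; the point is only that a dependency passed in through the constructor can be replaced in a test, while one created inside the class cannot.

```python
class SmtpMailer:
    """Stand-in for a real collaborator that would talk to a mail server."""

    def send(self, to, body):
        raise RuntimeError("no mail server available in this sketch")


class HardWiredSignup:
    """Creates its own dependency, so a test cannot substitute it."""

    def __init__(self):
        self.mailer = SmtpMailer()  # object creation mixed into application logic

    def register(self, email):
        self.mailer.send(email, "Welcome!")


class InjectedSignup:
    """Receives its dependency through the constructor."""

    def __init__(self, mailer):
        self.mailer = mailer

    def register(self, email):
        self.mailer.send(email, "Welcome!")


class RecordingMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


# The injected version can be exercised without any mail server.
fake = RecordingMailer()
InjectedSignup(fake).register("dev@example.com")
print(fake.sent)  # [('dev@example.com', 'Welcome!')]
```

Calling `HardWiredSignup().register(...)` in the same test would hit the real mailer, which is exactly the kind of code that is hard to test.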

Step 1: Get Buy-In

The first step is to make sure that everyone is willing to try something new. Although buy-in meetings don't achieve anything tangible, they help people understand the reasons why a particular goal is important, and they give the entire team the opportunity to collectively decide to go after it.

But buy-in isn't enough. We must have some way of knowing that we are implementing the change. To do that, we can use a metric as a proxy for the goal. This makes progress measurable, keeps the goal front-and-center for developers, and lets us know when we're done and can declare victory. Without a metric, we risk losing direction (how do we know our steps are making things better?), losing motivation (individual steps are too small to be seen against the big goal) and losing any sense of accomplishment (because there is always something that can be improved, we don't know when we are done with this round of improvements).

Step 2: Define Your Goal in Terms of an Objective Metric

The next step is to have a measurable number that gives us an indication of where we are and where we want to be. Although it doesn't matter whether this number is an exact measure or just a proxy, it must be repeatable and calculable in an objective way at each stage of adoption.

In Java, byte-code analysis tools are a great way to get to a metric. These can measure very specific things, such as the total number of calls to deprecated APIs or classes/methods which we want to remove. They can also be used to apply heuristic rules to code, such as determining whether it breaks the Law of Demeter, contains excessively deep inheritance, or has too many lines of code in methods or classes. There are a lot of open source tools available that can measure attributes of code, including ASM to write your own metrics, JDepend / Japan for dependency enforcement, Panopticode for code-quality metrics, and Testability Explorer for identifying code that is hard to test.
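The same idea applies outside Java. As a rough illustration of a repeatable, objectively computed metric, the sketch below counts over-long functions in Python source using the standard `ast` module (Python 3.8 or later, for `end_lineno`). It is a home-grown example and not how any of the tools named above work; the function name, the threshold and the sample source are assumptions.

```python
import ast


def count_long_functions(source, max_lines=20):
    """Count function definitions that span more than `max_lines` lines."""
    tree = ast.parse(source)
    count = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                count += 1
    return count


# Hypothetical input: one two-line function and one seven-line function.
sample = (
    "def brief():\n"
    "    return 1\n"
    "\n"
    "def lengthy():\n"
    + "    x = 0\n" * 5
    + "    return x\n"
)
long_count = count_long_functions(sample, max_lines=3)
print(long_count)  # 1
```

Run against the whole codebase on every build, a number like this gives the same answer for the same code, which is the property the metric needs.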

There are some interesting side-effects to using metrics. One is that people may initially debate the value of making a change, but the moment there's a number - even if it is a home-grown formula - the arguments stop. Another is that management is more likely to support the change: anything that can be measured conveys a sense of visibility and tangible benefit that an unmeasurable change cannot. Again, these are side-effects, not risks: having already established team buy-in in the previous step, we're not "managing to a number" and missing the point of making the change. We're simply bringing attention to the progress we're making toward that change.

In the example above, we want to get developers to do dependency injection. I developed a metric and corresponding open-source tool called Testability Explorer which measures the smallest number of conditions which cannot be mocked out in a test. The idea is that code where branches can be isolated will be easier to test than code where the logic cannot be isolated. In the latter case, one test will exercise a lot of code and will most likely require complex setup, assertions and tear down. The best metrics are those where gaming the metric produces the desirable outcome in the code. In Testability Explorer, using dependency injection is a good way to improve the score, which is exactly what we are trying to achieve.
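The difference between logic that can and cannot be isolated is easiest to see side by side. The sketch below is illustrative only: the class names are invented, and it shows the dependency injection idea rather than how Testability Explorer computes its score.

```python
class TaxService:
    """Stand-in for a slow, external collaborator (hypothetical)."""

    def rate_for(self, region: str) -> float:
        raise RuntimeError("would call a remote system")


class HardWiredInvoice:
    def __init__(self) -> None:
        # The collaborator is constructed here, so a test cannot replace it.
        self.taxes = TaxService()

    def total(self, net: float, region: str) -> float:
        return net * (1 + self.taxes.rate_for(region))


class InjectedInvoice:
    def __init__(self, taxes) -> None:
        # The collaborator is passed in, so a test can supply a fake.
        self.taxes = taxes

    def total(self, net: float, region: str) -> float:
        return net * (1 + self.taxes.rate_for(region))


class FixedTaxes:
    """Test double that returns a known rate without any I/O."""

    def rate_for(self, region: str) -> float:
        return 0.25


print(InjectedInvoice(FixedTaxes()).total(100.0, "anywhere"))  # 125.0
```

The two `total` methods are identical; only the constructor differs. The injected version can be tested with a one-line fake, while the hard-wired version drags the real collaborator into every test.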

Checkpoint: Socially Acceptable, but Not Yet Sustainable

Buy-in reinforced with metrics makes our proposed change scientific as well as socially acceptable. But this isn't enough: it must be constantly reinforced. Part II of this series will present ways to make progress toward change highly visible yet subtly encouraged with each and every commit.

Tuesday, August 12, 2008

Kehoe - So I Get This Call...

by John Kehoe - 12 August 2008

It’s Friday, the day before my 40th birthday (well, in fine Irish tradition, my birthday “wake”). I get a call from a customer, a major US-based air carrier. They’ve spent the last two months troubleshooting an online check-in system that powers their departure kiosks. They ask me to look at the problem.

The new check-in system was designed to complete the passenger ticketing process in 30 seconds, cut queue time and reduce counter staffing. Unfortunately, it wasn’t working out that well in production: the check-in process was taking five minutes, ten times longer than expected and longer than it would take to have an agent check in a passenger.

As a result, queues are long, customers are angry, and the customer has to increase the counter staff. Meanwhile, the airline the next counter over is fat, dumb and happy, successfully executing the business plan my customer was trying to implement. How dare they!

Each morning, my customer brings together a meeting of twenty people representing every vendor, owner and tier. Each presents the latest findings. All report acceptable metrics. Nobody can solve the end-to-end problem.

Before going any further, let’s do some math on the cost of these meetings. Sixty-three days, times twenty people, times (purely for round number’s sake) $100 equals $126,000 lost to just this meeting. This doesn’t include troubleshooting time, opportunity cost and proposed expenses to fix the problem (not to mention the meetings to implement that fix). So much for the returns the customer is trying to achieve with the new system.

This is a multi-million dollar problem. It isn't a seven digit problem, it’s an eight digit problem. The customer has already sunk millions in developing the software and acquiring the hardware and staff to deploy the application. They are past their planned deployment date and are paying dearly for FTEs they want to shift. On top of it, they're losing business passengers who are the target of the system (the frequent flyer miles simply aren't worth the hassle).

To be fair, this application is a bear. There are four databases, three application tiers, a data conversion tier, an application server and a remote data provider (that is, a third party, external vendor). There is no possible way to understand what is going on by looking at the problem atomically.

Now, back to the situation at hand. I join call number sixty-three. (Did I mention it’s my day off and I’m missing my birthday party?) There are four current fronts of the attack: the network load balancer is not cutting the mustard; the web servers are misconfigured; the Java guys think there might be an issue with the application server configuration; and the server pool is being tripled in size. I ask for seventy-two hours. My first – and only – act is to get two fellows from the customer to install some performance management software they bought a year earlier for a different project.

I sit back and wait.

It turns out that the team was off the mark. The Java guy was right, but for the wrong reasons.

Here is how the wait analysis breaks down. Authentication was responsible for 3% of wait, remote vendor response another 2%. One application component was responsible for 95% of the delay. The issue boiled down to async MDB calls.
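A wait breakdown of this kind is just each component's share of total elapsed time. The sketch below shows the calculation. The timings are not measurements from the engagement: they are illustrative values back-derived from the percentages above and the five-minute check-in, and the function name is an assumption.

```python
def wait_shares(wait_seconds: dict[str, float]) -> dict[str, float]:
    """Return each component's share of total wait, in percent, largest first."""
    total = sum(wait_seconds.values())
    if total <= 0:
        raise ValueError("total wait must be positive")
    ranked = sorted(wait_seconds.items(), key=lambda item: item[1], reverse=True)
    return {name: round(100 * seconds / total, 1) for name, seconds in ranked}


# Illustrative timings only, chosen to sum to a 300-second check-in.
SAMPLE = {
    "authentication": 9.0,
    "remote vendor": 6.0,
    "application component": 285.0,
}

print(wait_shares(SAMPLE))
```

Sorting by share puts the dominant component first, which is exactly the view the separate per-tier reports could not provide.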

Let’s consider the actual effort of what it took to isolate the problem.

First, we eliminated 90% of the people from the equation in two days. The network and systems were good. There was no issue with the web servers or system capacity. We could gain some single digit improvements by tweaking the authentication process (fix a couple of queries) and enforcing the SLA for our third party data provider. This left only the middleware team. This reduced the meeting of 20 people down to three: a customer Java guy, a rep from the Java App Server vendor and me.

Second, we eliminated a $1MM hardware “solution” that was being given serious consideration. The web team genuinely believed they were the bottleneck and that if they scaled out and tripled their footprint, all would be better. Management (perhaps in a panic) was about to give them the money. It would have made no difference.

Third, we turned around a fix within seventy-two hours.

So, let's do the math again. One performance guy, times seventy-two hours (I really wasn’t working the whole time; I found the Scotch the family set aside for my birthday wake), times $100 (we didn’t charge, so this is a bit inflated), comes to $7,200. Compare that to the (conservatively estimated) $126,000 spent for the daily firedrill meetings.
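For anyone who wants to check the arithmetic, both figures reduce to a few multiplications. The variable names are mine, and the $100 is treated as the cost per person per meeting (or per hour of analysis), as in the round numbers above.

```python
RATE = 100  # dollars per person per meeting, or per hour of analysis

# Sixty-three daily meetings, twenty attendees each.
meeting_cost = 63 * 20 * RATE

# One person for seventy-two hours.
analysis_cost = 1 * 72 * RATE

print(meeting_cost, analysis_cost, meeting_cost / analysis_cost)
# 126000 7200 17.5
```

On these numbers, the meetings cost 17.5 times what the seventy-two hours of analysis did.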

We eliminated waste by closing up the time-wasting, money-draining, soul-sucking morning meetings; avoiding a $1MM hardware upgrade that wouldn’t fix the problem; enabling the underlying system to achieve the business operations goals (reduction of counter staff and queue time) so that it could come close to the business impact originally forecast; and providing a standard measurement system across all applications and tiers.

Consider this last point very carefully. We have to have a systematic approach to measuring the performance of applications. The approach must be holistic, i.e., capture transaction data from the desktop to the backend storage and all the tiers in between. We have to see and understand the relationships and interactions in the technology stack for our transactions. We cannot rely on free vendor-supplied tools and a "toss the spaghetti at the wall to see what sticks" approach. This gives us only isolated, uncorrelated data points that show no problems or just symptoms, but not root cause.

From an IT perspective, the cost of the path that led to the solution was negligible: the time and tools over the three days spent actually solving the problem wasn’t much different from the cost of the morning meetings (except for the pounding the IT group was getting from the business owners while the application’s wings were clipped). From a business perspective, the cost of the path that led to the solution was nothing compared with the business impact: reduction of counter staff, faster check-ins, and happy customers. (Well, perhaps not "happy:" this is an airline we’re talking about... perhaps customers becoming disgruntled at a later point in the air travel experience.)

For all the panic and worry that it causes, a situation like this doesn’t need to be an exercise in “not my problem” and it can bring the business and vendors into alignment. But this is true only if vendors bear in mind that a holistic performance approach has real value associated with it, and if customers bear in mind that a holistic performance measurement system will set them back little more than the cost of futile execution.

Holistic performance management is an essential piece of successful business application deployment. Though viewed as an afterthought, performance management is the least expensive part of application deployment. When used, it releases untapped value in applications. At the very least, it’s a cheap insurance policy for the business when the fire alarm rings.

About John Kehoe: John is a performance technologist plying his dark craft since the early nineties. John has a penchant for parenthetical editorializing, puns and mixed metaphors (sorry).

Thursday, August 7, 2008

Cross - Technical Liabilities, Malignant Risks

By Brad Cross - 7 August 2008

The previous article in this series introduced the concept of a technical balance sheet. This isn't just an interesting metaphor: it is relatively easy to construct a technical balance sheet for any project.

Discovering Technical Liabilities

Recall the basic accounting formula of assets - liabilities = equity. A technical liability is code that incurs a higher relative cost of carry. This can mean sloppy code, untested code, a quick and dirty temporary solution, etc. We measure technical liabilities by selecting metrics that make sense for a particular project's concerns.

The key words are, "that make sense." Metrics pragmatically selected by experienced practitioners tend to coincide with intuition about the quality and flexibility of code. Metrics selected and applied naively can lead to an utterly useless exercise in math-turbation.

Metrics have limited use. They can tell us a lot about technical liabilities, but they don't reveal much about our assets. It is important to recognize the difference, because managing to increase “positive” metrics can lead to trouble. The goal is not to manage toward “positive” metrics, as much as it is to manage away from “negative” metrics. In financial terms, think about metrics from the standpoint of minimizing risk, not maximizing return.

Consider test coverage. 0% test coverage is a liability, but 100% test coverage is not necessarily an asset: it's possible to have 100% test coverage over a binary that crashes when it's deployed. Viewing test coverage as an asset and managing toward maximum test coverage can result in asinine practices that distract attention from creating business assets. Software has asset value only insofar as it is useful. The objective of any development team isn't to achieve high test coverage; it is to maximize equity by producing business assets without eroding return through technical liabilities.

Not only are metrics limited to assessing technical liabilities, we must also keep in mind that not every negative indicator reported by our metrics is in fact a technical liability. The extent to which a particular metric exposes a true liability is a function of the risk that it will undermine the business asset we’re delivering. Some risks are benign: portions of the system which have little business impact or are to be retired soon don’t require urgent mitigation. Others are malignant: core system components with opaque and high-maintenance structures will be of very high priority. We must keep this in mind when discovering our liabilities.

Quantifying Liabilities

There are many tools for generating metrics (most of them open source) that we can use to discover our liabilities. But tools are only valuable if they generate actionable metrics. This means they must draw attention to specific things in our code that appear to be problems. Fancy charts don't tell us as much as do raw scores with a list of top offenders.

The abundance of available analysis tools requires that we use them judiciously. We must select only a handful of metrics that are directly relevant to our objectives, and define thresholds, objectives, and tolerances before we apply them.
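One way to keep ourselves honest is to write the thresholds and tolerances down as data before looking at any results. The sketch below is one possible shape for that; the metric names, limits and tolerances are hypothetical and would be chosen per project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    limit: float
    higher_is_better: bool
    tolerance: float = 0.0


def violations(measured: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return the names of metrics that miss their limit by more than the tolerance."""
    failed = []
    for t in thresholds:
        value = measured[t.metric]
        if t.higher_is_better:
            missed = value < t.limit - t.tolerance
        else:
            missed = value > t.limit + t.tolerance
        if missed:
            failed.append(t.metric)
    return failed


# Hypothetical rules: at least 50% coverage, at most 10% duplication.
RULES = [
    Threshold("coverage_pct", limit=50, higher_is_better=True, tolerance=5),
    Threshold("duplication_pct", limit=10, higher_is_better=False, tolerance=2),
]

print(violations({"coverage_pct": 47, "duplication_pct": 20}, RULES))
```

With these rules, 47% coverage passes because it is within tolerance, while 20% duplication is reported.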

Things like test coverage (emma, nCover) and duplication analysis (simian, cpd) are well established and don’t require extensive explanation. Static and dynamic analysis are far more sophisticated. Before attempting to put metrics on these, we must be sure to have a clear understanding of what “code quality” means in terms of flexibility, extensibility, testability and any other “-ilities” that strike our fancy. There are tools that can attempt to find potential bugs and programming errors (such as FindBugs for Java, FxCop for .NET, valgrind, helgrind and cachegrind), while others are configurable to allow us to find crufty code for different definitions of cruftiness (NDepend and JDepend, checkstyle, PMD, pc-lint, etc.). Still others, such as TestabilityExplorer for Java, let us measure the testability of code. With these tools we can measure testability, depth of inheritance, cyclomatic complexity, coupling metrics, dependency graphs, lines of code per class/method, and so forth.

Non-functional requirements such as load, scalability, performance, and so forth should be given equal footing with functional requirements. For example, a large-scale, performance-intensive system needs to treat performance bottlenecks as technical liabilities. To identify and isolate these, create tests that simulate load. Similarly, projects with user-facing web front-ends will place a very high priority on maintaining consistent user interface style. Treat inconsistent style as a liability: write tests to make sure that all user interfaces gather style settings (colors, etc.) from the same properties files. Finally, bear in mind that other developers are users of the services we create. Create APIs in style, and write tests from the perspective of the user of the API.
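A style-consistency test can be as simple as failing the build when a view hard-codes a color instead of looking it up. The sketch below is one rough way to do that, assuming colors appear as six-digit hex literals; the file names and contents are invented for illustration.

```python
import re

# Matches literals such as #FF8800; other color notations are not covered.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}\b")


def hard_coded_colors(sources: dict[str, str]) -> dict[str, list[str]]:
    """Map each file name to the color literals found in its text."""
    found = {}
    for name, text in sources.items():
        hits = HEX_COLOR.findall(text)
        if hits:
            found[name] = hits
    return found


# Hypothetical view sources: one reads from shared settings, one does not.
VIEWS = {
    "login.view": 'background = style["brand.primary"]',
    "report.view": 'background = "#FF8800"',
}

print(hard_coded_colors(VIEWS))
```

A test would assert that the result is empty for every view outside the shared properties file.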

Statement of Technical Liability

Once we have a collection of metrics we can bring them into a summary statement that frames our liabilities without overloading us with data.

Here is an example statement of liabilities for a C# project in the investments field.

Functional Area	FX Cop	Coverage	Lines	Duplication	% Duplication
Brokers	139	1%	2984	234	8%
Data	52	31%	1450	297	20%
DataProviders	59	1%	1210	78	6%
DataServer	27	48%	1489	40	3%
Execution	7	48%	618	0	0%
FIX	27	1%	48484	39337	81%
Instruments	133	55%	12896	714	6%
Mathematics	77	56%	2551	205	8%
Optimization	25	60%	305	26	9%
Performance	2	73%	134	0	0%
Providers	36	47%	707	42	6%
Simulation	20	77%	241	0	0%
Trading	54	50%	2955	472	16%
TradingLibrary	66	29%	7035	1674	24%

The first thing that should stand out is that we are not summarizing across the project, only observing at the level of functional area. Put another way, we are interested in our liabilities in the specific areas where the business derives asset value. This is because a technical balance sheet is an analysis of how the technical characteristics of an asset might be undermining its value. This is easier to do if the top level directory structure is close to a 1-to-1 mapping with functional areas. The more convoluted the structure of the code relative to the structure of functionality, the more convoluted the mapping will be between assessing technical liability and mapping it to the asset value provided by its functionality.
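The duplication percentages in the statement can be recomputed from the line counts, and the same data gives a list of top offenders. The sketch below copies the rows from the table; treating the "FX Cop" column as a count of reported issues is my assumption, and the function names are illustrative.

```python
# (functional area, FxCop count, coverage %, lines, duplicated lines)
ROWS = [
    ("Brokers", 139, 1, 2984, 234),
    ("Data", 52, 31, 1450, 297),
    ("DataProviders", 59, 1, 1210, 78),
    ("DataServer", 27, 48, 1489, 40),
    ("Execution", 7, 48, 618, 0),
    ("FIX", 27, 1, 48484, 39337),
    ("Instruments", 133, 55, 12896, 714),
    ("Mathematics", 77, 56, 2551, 205),
    ("Optimization", 25, 60, 305, 26),
    ("Performance", 2, 73, 134, 0),
    ("Providers", 36, 47, 707, 42),
    ("Simulation", 20, 77, 241, 0),
    ("Trading", 54, 50, 2955, 472),
    ("TradingLibrary", 66, 29, 7035, 1674),
]


def duplication_pct(lines: int, duplicated: int) -> int:
    """Duplicated lines as a whole-number percentage of all lines."""
    return round(100 * duplicated / lines)


def top_offenders(rows, n: int = 3) -> list[str]:
    """Functional areas with the highest share of duplicated lines."""
    ranked = sorted(rows, key=lambda r: r[4] / r[3], reverse=True)
    return [r[0] for r in ranked[:n]]


print(top_offenders(ROWS))  # ['FIX', 'TradingLibrary', 'Data']
```

A ranking like this only says where the raw scores are worst. Whether a top offender deserves attention still depends on the risk it poses to the business asset.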

Take a look at the FIX project. FIX is a protocol for financial messaging. This library appears to be an ill-conceived wrapper around the QuickFIX open source FIX engine. It accounts for more than half the code base, has 1% test coverage, and about 80% of its code is duplicated. Seems like it should be a top priority, right? It isn't. The reason is that it carries benign risk. The code's risk has a limited impact: it is used in a few places in a straightforward way, and there are tests around those usages. While it would be nice to be rid of the entire library, the risk has already been mitigated and there are costs associated with removing the library altogether, so there are better uses of time right now.

Servicing Technical Debt

Servicing technical debt is not conceptually different from servicing financial debt. If we are executing well on our business model, but our equity goes negative and we can no longer service our debt, we are bankrupt. The same holds true for execution risk: if technical liabilities get out of hand, the cost of carry on the technical debt will bankrupt the project despite an otherwise healthy operational model. We can get visibility into this if we place statements of our technical liabilities and risks side-by-side with our project management reporting ordered by business function. This gives us a more complete assessment of our overall execution risk, and lets us know if we’re inflating our assets.

Now that we have a statement of liabilities, our next step will be to examine the asset side of the technical balance sheet. From there, we can create a balance sheet that will allow us to make trade-off decisions, engineer turnarounds, or declare a write-off.

About Brad Cross: Brad is a programmer.