Thursday, August 28, 2008

From the Editor | Our First Two Months

By Ross Pettit - 28 August 2008

As summer winds down in the northern hemisphere and people sneak in a few final days away from work, we'll take a break to reflect on our first two months of publication.

Any doubts we may have had that this publication would be a consistent source of quality content have been put to rest.

The first indicator of this is relevance. John Kehoe wrote about end-to-end versus silo-by-silo performance monitoring of an underperforming airline check-in system that had scored the trifecta: high operating costs, long queues and angry customers. Five days before his piece ran, the Wall Street Journal ran a story about a UK-based air carrier that had initiated an urgent and high-profile ad campaign to make clear to its customers that lengthy check-in times (amongst other things) were now a thing of the past. Michael Martin wrote about IT innovation being threatened by misguided decision making during a period of economic uncertainty. His piece ran two weeks before Nassim Taleb (of Black Swan fame) wrote about the same threats to R&D in the healthcare industry in the Financial Times. By tapping into the problems, challenges and opportunities facing businesses today, our content is proving to be highly relevant, even prescient.

The second indicator is variety. Ten different people have contributed articles covering a wide swath of high-performance IT. Mark Rickmeier, Jane Robarts and John Kehoe raised questions about panaceas in sourcing and infrastructure. Kent Spillner pointed out a key anti-pattern in project execution: the false optimism created by time between deliverables. Miško Hevery and Carl Ververs both addressed the people aspects of IT, specifically patterns of behaviour change critical to continuous improvement and contending with a generation gap in the IT workforce. Brad Cross and I made clear that IT investments must be framed, managed and executed explicitly as uses of a firm's capital.

The third indicator is sustainability. We have featured at least one article each week since our inception, meaning we've sourced, composed, edited and published at a sustainable rate for two months. As a community we've overcome the challenges we face as practitioners - client and project deadlines, open source obligations, conference and writing demands, as well as personal commitments - to consistently meet our goal.

Readers have raised a number of questions, and in time we intend to address them all. We're going to improve the user interface so that visited links are no longer next to impossible to see (and no doubt very difficult to read on this link-heavy page). We're going to add reader comments to each article. We're going to include e-mail addresses so you can contact our writers directly. And we're going to enable registration so that we're better able to stay in touch with our community. Getting these done is subject to the same deadlines, obligations, demands and commitments that we overcome to produce content. I appreciate your patience while we work through the fit-and-finish.

Looking forward, we have reason to be optimistic. Our statistics indicate a growing community of repeat readers. Our current roster of writers has a full backlog of topics and ideas, and we continue to attract new writers. We have some exciting opportunities to further distribute our content. And above all else, we have great pieces in the pipeline: as I write this, our writers are composing several multi-part series and standalone articles.

As Editor, and on behalf of the people who make this publication possible, I thank you, our readers, for being part of the community. I hope you share in our excitement and enthusiasm for what lies ahead.

Best regards,

Ross Pettit

Tuesday, August 19, 2008

Hevery - Changing Developer Behaviour, Part I

by Miško Hevery - 19 August 2008

So you've figured out a better way of doing things, but how do you get everyone to change the way they work and start writing code in this better way? This is something I face daily in my work as a best-practices coach. Here are my battle-tested tricks, which work even after I leave the project.

What usually happens

Most change initiatives fail because people don't appreciate the reasons for making a change, or because they get demotivated by the progress of change.

Think about what usually happens. Suppose that you have recently discovered the benefits of dependency injection. The dependency injection principle requires that object creation code be kept separate from application logic. Dependencies should be found through the constructor, not by creating them or looking them up in global state. This makes code more maintainable and easier to test, which results in a higher quality codebase. Armed with all of these benefits you decide to implement this on your project. As the lead of the project you give a tech talk to the team on the benefits of dependency injection. Developers come to your tech talk, you all take a vote and everyone agrees that this is what needs to be done. Everybody goes back to their workstations full of enthusiasm. People attempt dependency injection in the first few changes they make and you are excited, but a month later everyone is back to their old routine and nothing has changed. How can we break this cycle and achieve lasting change?
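The dependency-injection principle described above can be made concrete with a minimal sketch. All of the names here (Printer, ReportService, PrinterRegistry) are invented for illustration; they do not come from the article.

```java
// A minimal sketch of the dependency-injection principle: find
// dependencies through the constructor, not via global lookups.

interface Printer { void send(String report); }

// Before: the dependency is looked up in global state, so a test cannot
// substitute a fake printer without touching that global state.
class PrinterRegistry {
    static Printer current = r -> System.out.println(r);
}

class HardWiredReportService {
    void print(String report) {
        PrinterRegistry.current.send(report);  // hidden dependency
    }
}

// After: the dependency arrives through the constructor, so any Printer
// implementation (including a test fake) can be passed in.
class ReportService {
    private final Printer printer;
    ReportService(Printer printer) { this.printer = printer; }
    void print(String report) { printer.send(report); }
}

public class DiSketch {
    public static void main(String[] args) {
        new HardWiredReportService().print("q2-summary"); // goes wherever the registry points

        StringBuilder captured = new StringBuilder();
        new ReportService(captured::append).print("q2-summary"); // fake printer
        System.out.println(captured);
    }
}
```

The second version is what makes the code "easier to test": the test below exercises ReportService with nothing more than a StringBuilder standing in for the real printer.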

Step 1: Get Buy-In

The first step is to make sure that everyone is willing to try something new. Although buy-in meetings don't achieve anything tangible, they help people understand the reasons why a particular goal is important, and they give the entire team the opportunity to collectively decide to go after it.

But buy-in isn't enough. We must have some way of knowing that we are implementing the change. To do that, we can use a metric as a proxy for the goal. This makes progress measurable, keeps the goal front-and-center for developers, and lets us know when we're done and can declare victory. Without a metric, we risk losing direction (how do we know our steps are making things better?), losing motivation (individual steps are too small to be seen against the big goal) and losing any sense of accomplishment (because there is always something that can be improved, we don't know when we are done with this round of improvements).

Step 2: Define Your Goal in Terms of an Objective Metric

The next step is to have a measurable number that gives us an indication of where we are and where we want to be. Although it doesn't matter whether this number is an exact measure or just a proxy, it must be repeatable and calculable in an objective way at each stage of adoption.

In Java, byte-code analysis tools are a great way to get to a metric. These can measure very specific things, such as the total number of calls to deprecated APIs or to classes/methods we want to remove. They can also be used to apply heuristic rules to code, such as determining whether it breaks the Law of Demeter, contains excessively deep inheritance, or has too many lines of code in its methods or classes. There are a lot of open source tools available that can measure attributes of code, including ASM for writing your own metrics, JDepend / Japan for dependency enforcement, Panopticode for code-quality metrics, and Testability Explorer for identifying code that is hard to test.
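The shape of such a metric is simple, whatever the tool. A real implementation would walk byte-code with something like ASM; as a toy stand-in (not from the article), the sketch below counts occurrences of calls we want to eliminate in lines of source. The method names scanned for are arbitrary examples.

```java
public class DeprecatedCallCounter {
    // Toy metric: how many lines mention a call we're trying to remove.
    // A byte-code tool would do this robustly; the counting idea is the same.
    static long count(String[] lines, String bannedCall) {
        return java.util.Arrays.stream(lines)
                .filter(l -> l.contains(bannedCall))
                .count();
    }

    public static void main(String[] args) {
        String[] src = {
            "Date d = new Date();",
            "int y = d.getYear();      // deprecated",
            "int m = d.getMonth();     // deprecated"
        };
        // The repeatable, objective number: total deprecated calls remaining.
        System.out.println(count(src, "getYear") + count(src, "getMonth"));
    }
}
```

Run over the whole codebase at each stage of adoption, a number like this is exactly the kind of repeatable, objective measure the step calls for: it only goes down as the change takes hold.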

There are some interesting side-effects to using metrics. One is that people may initially debate the value of making a change, but the moment there's a number - even if it is a home-grown formula - the arguments stop. Another is that management is more likely to support the change: anything that can be measured conveys a sense of visibility and tangible benefit that an unmeasurable change cannot. Again, these are side-effects, not risks: having already established team buy-in in the previous step, we're not "managing to a number" and missing the point of making the change. We're simply bringing attention to the progress we're making toward that change.

In the example above, we want to get developers to do dependency injection. I developed a metric and corresponding open-source tool called Testability Explorer, which measures the smallest number of conditions that cannot be mocked out in a test. The idea is that code where branches can be isolated will be easier to test than code where the logic cannot be isolated. In the latter case, one test will exercise a lot of code and will most likely require complex setup, assertions and tear down. The best metrics are those where gaming the metric produces the desirable outcome in the code. In Testability Explorer, using dependency injection is a good way to improve the score, which is exactly what we are trying to achieve.

Checkpoint: Socially Acceptable, but Not Yet Sustainable

Buy-in reinforced with metrics makes our proposed change scientific as well as socially acceptable. But this isn't enough: the change must be constantly reinforced. Part II of this series will present ways to make progress toward change highly visible yet subtly encouraged with each and every commit.

Tuesday, August 12, 2008

Kehoe - So I Get This Call...

by John Kehoe - 12 August 2008

It’s Friday, the day before my 40th birthday (well, in fine Irish tradition, my birthday “wake”). I get a call from a customer, a major US-based air carrier. They’ve spent the last two months troubleshooting an online check-in system that powers their departure kiosks. They ask me to look at the problem.

The new check-in system was designed to complete the passenger ticketing process in 30 seconds, cut queue time and reduce counter staffing. Unfortunately, it wasn’t working out that well in production: the check-in process was taking five minutes, ten times longer than expected and longer than it would take to have an agent check in a passenger.

As a result, queues are long, customers are angry, and the customer has to increase the counter staff. Meanwhile, the airline the next counter over is fat, dumb and happy, successfully executing the business plan my customer was trying to implement. How dare they!

Each morning, my customer convenes a meeting of twenty people representing every vendor, owner and tier. Each presents the latest findings. All report acceptable metrics. Nobody can solve the end-to-end problem.

Before going any further, let’s do some math on the cost of these meetings. Sixty-three days, times twenty people, times (purely for round number’s sake) $100 equals $126,000 lost to just this meeting. This doesn’t include troubleshooting time, opportunity cost and proposed expenses to fix the problem (not to mention the meetings to implement that fix). So much for the returns the customer is trying to achieve with the new system.

This is a multi-million dollar problem. It isn't a seven-digit problem, it's an eight-digit problem. The customer has already sunk millions into developing the software and acquiring the hardware and staff to deploy the application. They are past their planned deployment date and are paying dearly for FTEs they want to shift. On top of it, they're losing the business passengers who are the target of the system (the frequent-flyer miles simply aren't worth the hassle).

To be fair, this application is a bear. There are four databases, three application tiers, a data conversion tier, an application server and a remote data provider (that is, a third-party, external vendor). There is no possible way to understand what is going on by looking at the problem atomically.

Now, back to the situation at hand. I join call number sixty-three. (Did I mention it’s my day off and I’m missing my birthday party?) There are four current fronts of the attack: the network load balancer is not cutting the mustard; the web servers are misconfigured; the Java guys think there might be an issue with the application server configuration; and the server pool is being tripled in size. I ask for seventy-two hours. My first – and only – act is to get two fellows from the customer to install some performance management software they bought a year earlier for a different project.

I sit back and wait.

It turns out that the team was off the mark. The Java guy was right, but for the wrong reasons.

Here is how the wait analysis breaks down. Authentication was responsible for 3% of the wait, remote vendor response for another 2%. One application component was responsible for 95% of the delay. The issue boiled down to asynchronous message-driven bean (MDB) calls.
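A wait analysis like this is just attribution arithmetic over per-component timings. The sketch below uses hypothetical millisecond figures chosen to reproduce the 3% / 2% / 95% split in the article; the real numbers came from the performance management software, not from anything published here.

```java
public class WaitBreakdown {
    // Share of total wait attributable to one component, as a rounded percent.
    static long percentOfTotal(double componentMs, double totalMs) {
        return Math.round(100.0 * componentMs / totalMs);
    }

    public static void main(String[] args) {
        // Assumed per-transaction wait times (ms), invented for illustration.
        String[] names = {"authentication", "remote vendor", "MDB component"};
        double[] waits = {9.0, 6.0, 285.0};

        double total = 0;
        for (double w : waits) total += w;

        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + percentOfTotal(waits[i], total) + "%");
        }
    }
}
```

The point of the exercise is not the arithmetic but the data: only end-to-end transaction capture produces component timings that sum to the user-visible delay, which is what lets one component be pinned with 95% of it.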

Let’s consider what it actually took to isolate the problem.

First, we eliminated 90% of the people from the equation in two days. The network and systems were good. There was no issue with the web servers or system capacity. We could gain some single-digit improvements by tweaking the authentication process (fixing a couple of queries) and enforcing the SLA for our third-party data provider. That left only the middleware team, reducing the meeting of twenty people down to three: a customer Java guy, a rep from the Java app server vendor, and me.

Second, we eliminated a $1MM hardware “solution” that was being given serious consideration. The web team genuinely believed they were the bottleneck and that if they scaled out and tripled their footprint, all would be better. Management (perhaps in a panic) was about to give them the money. It would have made no difference.

Third, we turned around a fix within seventy-two hours.

So, let’s do the math again. One performance guy, times seventy-two hours (I really wasn’t working the whole time; I found the Scotch the family set aside for my birthday wake), times $100 (we didn’t charge, so this is a bit inflated), comes to $7,200. Compare that to the (conservatively estimated) $126,000 spent on the daily firedrill meetings.

We eliminated waste by closing down the time-wasting, money-drawing, soul-sucking morning meetings; avoiding a $1MM hardware upgrade that wouldn’t fix the problem; enabling the underlying system to achieve its business operations goals (reduction of counter staff and queue time) so that it could come close to the business impact originally forecast; and providing a standard measurement system across all applications and tiers.

Consider this last point very carefully. We have to have a systematic approach to measuring the performance of applications. The approach must be holistic, i.e., it must capture transaction data from the desktop to the back-end storage and all the tiers in between. We have to see and understand the relationships and interactions in the technology stack for our transactions. We cannot rely on free, vendor-supplied tools and a "toss the spaghetti at the wall to see what sticks" approach. That gives us only isolated, uncorrelated data points that show no problems, or symptoms but not root cause.

From an IT perspective, the cost of the path that led to the solution was negligible: the time and tools over the three days spent actually solving the problem didn’t cost much more than the morning meetings (except for the pounding the IT group was getting from the business owners while the application’s wings were clipped). From a business perspective, the cost of the path that led to the solution was nothing compared with the business impact: reduction of counter staff, faster check-ins, and happy customers. (Well, perhaps not "happy": this is an airline we’re talking about... perhaps customers becoming disgruntled at a later point in the air travel experience.)

For all the panic and worry it causes, a situation like this doesn’t need to be an exercise in “not my problem,” and it can bring the business and vendors into alignment. But this is true only if vendors bear in mind that a holistic performance approach has real value associated with it, and if customers bear in mind that a holistic performance measurement system will cost them little compared with the cost of futile execution.

Holistic performance management is an essential piece of successful business application deployment. Though often viewed as an afterthought, performance management is the least expensive part of application deployment. When used, it releases untapped value in applications. At the very least, it’s a cheap insurance policy for the business when the fire alarm rings.

About John Kehoe: John is a performance technologist plying his dark craft since the early nineties. John has a penchant for parenthetical editorializing, puns and mixed metaphors (sorry).

Thursday, August 7, 2008

Cross - Technical Liabilities, Malignant Risks

By Brad Cross - 7 August 2008

The previous article in this series introduced the concept of a technical balance sheet. This isn't just an interesting metaphor: it is relatively easy to construct a technical balance sheet for any project.

Discovering Technical Liabilities

Recall the basic accounting formula: assets - liabilities = equity. A technical liability is code that incurs a higher relative cost of carry. This can mean sloppy code, untested code, a quick-and-dirty temporary solution, and so on. We measure technical liabilities by selecting metrics that make sense for a particular project's concerns.

The key words are "that make sense." Metrics pragmatically selected by experienced practitioners tend to coincide with intuition about the quality and flexibility of code. Metrics selected and applied naively can lead to an utterly useless exercise in math-turbation.

Metrics have limited use. They can tell us a lot about technical liabilities, but they don't reveal much about our assets. It is important to recognize the difference, because managing to increase “positive” metrics can lead to trouble. The goal is not to manage toward “positive” metrics so much as it is to manage away from “negative” metrics. In financial terms, think about metrics from the standpoint of minimizing risk, not maximizing return.

Consider test coverage. 0% test coverage is a liability, but 100% test coverage is not necessarily an asset: it's possible to have 100% test coverage over a binary that crashes when it's deployed. Viewing test coverage as an asset and managing toward maximum test coverage can result in asinine practices that distract attention from creating business assets. Software has asset value only in so far as it is useful. The objective of any development team isn't to achieve high test coverage; it is to maximize equity by producing business assets without eroding returns through technical liabilities.
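The coverage point deserves a concrete illustration. The example below is invented (the method and its bug are hypothetical, not from the article): one test executes every line of the method, giving 100% line coverage, while the input that crashes in production is never exercised.

```java
public class CoverageSketch {
    // Hypothetical helper: average order size.
    // Every line below is covered by the single call in main, yet the
    // orders == 0 case still throws in production.
    static int averageOrderSize(int totalUnits, int orders) {
        return totalUnits / orders;  // ArithmeticException when orders == 0
    }

    public static void main(String[] args) {
        // This "test" achieves 100% line coverage of averageOrderSize...
        System.out.println(averageOrderSize(100, 4));
        // ...but averageOrderSize(100, 0) would still crash a deployed binary.
    }
}
```

This is why coverage belongs on the liability side of the analysis: its absence signals risk, but its presence guarantees nothing about asset value.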

Not only are metrics limited to assessing technical liabilities; we must also keep in mind that not every negative indicator reported by our metrics is in fact a technical liability. The extent to which a particular metric exposes a true liability is a function of the risk that it will undermine the business asset we’re delivering. Some risks are benign: portions of the system which have little business impact or are soon to be retired don’t require urgent mitigation. Others are malignant: core system components with opaque and high-maintenance structures will be of very high priority. We must keep this in mind when discovering our liabilities.

Quantifying Liabilities

There are many tools for generating metrics (most of them open source) that we can use to discover our liabilities. But tools are only valuable if they generate actionable metrics. This means they must draw attention to specific things in our code that appear to be problems. Fancy charts don't tell us as much as raw scores with a list of top offenders.

The abundance of available analysis tools requires that we use them judiciously. We must select only a handful of metrics that are directly relevant to our objectives, and define thresholds, objectives, and tolerances before we apply them.

Things like test coverage (emma, nCover) and duplication analysis (simian, cpd) are well established and don’t require extensive explanation. Static and dynamic analysis are far more sophisticated. Before attempting to put metrics on these, we must be sure to have a clear understanding of what “code quality” means in terms of flexibility, extensibility, testability and any other “-ilities” that strike our fancy. There are tools that can attempt to find potential bugs and programming errors (such as FindBugs for Java, FXcop for .Net, valgrind, helgrind and cachegrind) while others are configurable to allow us to find crufty code for different definitions of cruftiness (nDepend and jDepend, checkstyle, PMD, pc-lint, etc.) Still others, such as TestabilityExplorer for Java, let us measure the testability of code. With these tools we can measure testability, depth of inheritance, cyclomatic complexity, coupling metrics, dependency graphs, lines of code per class/method, and so forth.

Non-functional requirements such as load, scalability, performance, and so forth should be given equal footing with functional requirements. For example, a large-scale, performance-intensive system needs to treat performance bottlenecks as technical liabilities. To identify and isolate these, create tests that simulate load. Similarly, projects with user-facing web front-ends will place a very high priority on maintaining a consistent user interface style. Treat inconsistent style as a liability: write tests to make sure that all user interfaces gather style settings (colors, etc.) from the same properties files. Finally, bear in mind that other developers are users of the services we create. Give APIs the same treatment, and write tests from the perspective of the user of the API.
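The style-consistency test described above can be approximated very cheaply. The sketch below is a toy, not the article's implementation: it flags UI source lines that hard-code a hex colour instead of pulling it from the shared properties, turning "inconsistent style" into a countable liability.

```java
public class StyleLintSketch {
    // Toy check: count UI source lines containing a hard-coded hex colour
    // (e.g. "#ff0000") rather than a lookup against shared style properties.
    static long hardCodedColours(String[] uiSource) {
        return java.util.Arrays.stream(uiSource)
                .filter(l -> l.matches(".*#[0-9a-fA-F]{6}.*"))
                .count();
    }

    public static void main(String[] args) {
        // Invented example lines; Styles.get is a hypothetical lookup helper.
        String[] src = {
            "header.setColour(Styles.get(\"header.colour\"));",
            "footer.setColour(\"#ff0000\");  // hard-coded: a liability"
        };
        System.out.println(hardCodedColours(src));
    }
}
```

A check like this can run in the build, so the liability count is produced on every commit rather than during an occasional audit.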

Statement of Technical Liability

Once we have a collection of metrics we can bring them into a summary statement that frames our liabilities without overloading us with data.

Here is an example statement of liabilities for a C# project in the investments field.

[Table: statement of technical liabilities by functional area, with columns including FX Cop violations and % duplication. The table data did not survive extraction.]
The first thing that should stand out is that we are not summarizing across the project, only observing at the level of functional area. Put another way, we are interested in our liabilities in the specific areas where the business derives asset value. This is because a technical balance sheet is an analysis of how the technical characteristics of an asset might be undermining its value. This is easier to do if the top-level directory structure is close to a 1-to-1 mapping with functional areas. The more convoluted the structure of the code relative to the structure of the functionality, the harder it becomes to map technical liabilities to the asset value provided by that functionality.

Take a look at the FIX project. FIX is a protocol for financial messaging. This library appears to be an ill-conceived wrapper around the QuickFIX open-source FIX engine. It accounts for half the code base, has 1% test coverage, and 80% of its code is duplicated. Seems like it should be a top priority, right? It isn't. The reason is that it carries benign risk. The code's risk has a limited impact; it is used in a few places in a straightforward way and there are tests around those usages. While it would be nice to be rid of the entire library, the risk has already been mitigated and there are costs associated with removing the library altogether, so there are better uses of time right now.

Servicing Technical Debt

Servicing technical debt is not conceptually different from servicing financial debt. If we are executing well on our business model, but our equity goes negative and we can no longer service our debt, we are bankrupt. The same holds true for execution risk: if technical liabilities get out of hand, the cost of carry on the technical debt will bankrupt the project despite an otherwise healthy operational model. We can get visibility into this if we place statements of our technical liabilities and risks side-by-side with our project management reporting, ordered by business function. This gives us a more complete assessment of our overall execution risk, and lets us know if we’re inflating our assets.

Now that we have a statement of liabilities, our next step will be to examine the asset side of the technical balance sheet. From there, we can create a balance sheet that will allow us to make trade off decisions, engineer turnarounds, or declare a write-off.

About Brad Cross: Brad is a programmer.