Stephen Freeman Rotating Header Image

Agile Programming

An example of an unhedged software call option

At a client, we’ve been reworking some particularly hairy calculation code. For better or worse, the convention is that we call a FooFetcher to get hold of a Foo when we need one. Here’s an example that returns Transfers, which are payments to and from an account. In this case, we’re mostly getting hold of Transfers directly because we can identify them[1].

public interface TransferFetcher {
  Transfer      fetchFor(TransferId id);
  Transfer      fetchOffsetFor(Transfer transfer);
  Set<Transfer> fetchOutstandingFor(Client client, CustomerReference reference);
  Transfer      fetchFor(CustomerReference reference);
}

This looks like a reasonable design—all the methods are to do with retrieving Transfers—but it’s odd that only one of them returns a collection of Transfers. That’s a clue.

When we looked at the class, we discovered that the fetchOutstandingFor() method has a different implementation from the other methods and pulls in several dependencies that only it needs. In addition, unlike the other methods, it has only one caller (apart from its tests, of course). It doesn’t really fit in the Fetcher implementation, which is now inconsistent.

It’s easy to imagine how this method got added. The programmers needed to get a feature written, and the code already had a dependency that was concerned with Transfers. It was quicker to add a method to the existing Fetcher, even if that meant making it much more complicated, than to introduce a new collaborator. They sold a Call Option—they cashed in the immediate benefit at the cost of weakening the model. The team would be ahead so long as no-one needed to change that code.

The option got called on us. As part of our reworking, we needed to change how Transfer objects were constructed so we could handle a new kind of transaction. The structure we planned meant changing another object, say Accounts, to depend on a TransferFetcher, but the current implementation of TransferFetcher depended on Accounts to implement fetchOutstandingFor(). We had a dependency loop. We should have taken a diversion and moved the behaviour of fetchOutstandingFor() into an appropriate object, but then we had our own delivery pressures. In the end, we found a workaround that allowed us to finish the task we were in the middle of, with a note to come back and fix the Fetcher.
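The diversion we should have taken might have looked something like this. To be clear, this is my illustrative sketch, not the client’s code: the name OutstandingTransfers and the stub domain types are inventions.

```java
import java.util.Collections;
import java.util.Set;

// Stub domain types, just enough to make the sketch compile.
interface Transfer {}
interface TransferId {}
interface Client {}
interface CustomerReference {}

// The fetcher keeps only the methods that share an implementation.
interface TransferFetcher {
  Transfer fetchFor(TransferId id);
  Transfer fetchOffsetFor(Transfer transfer);
  Transfer fetchFor(CustomerReference reference);
}

// The odd method out moves to its own collaborator, which is free to
// depend on Accounts (or anything else) without dragging those
// dependencies into TransferFetcher — which breaks the loop.
interface OutstandingTransfers {
  Set<Transfer> outstandingFor(Client client, CustomerReference reference);
}

// A trivial implementation, only to show the shape of the collaborator.
class NoOutstandingTransfers implements OutstandingTransfers {
  public Set<Transfer> outstandingFor(Client client, CustomerReference reference) {
    return Collections.emptySet();
  }
}
```

With the method gone, Accounts can safely depend on TransferFetcher, and whatever needs outstanding transfers depends on the new collaborator instead.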

The cost of recovery includes not just the effort of investigating and applying a solution (which would have been less when the code was introduced) but also the drag on motivation. It’s a huge gumption trap to be making steady progress towards a goal and then be knocked off course by an unnecessary design flaw. The research described in The Progress Principle suggests that small blockers like this have a disproportionate impact compared to their size. Time to break for a cup of tea.

I believe that software quality is a cumulative property. It’s the accumulation of many small good or bad design decisions that either make a codebase productive to work with or just too expensive to maintain.

…and, right on cue, Rangwald talks about The Tyranny of the Urgent.

1) The details of the domain have been changed to protect the innocent, so please don’t worry too much about the detail.

Thanks to @aparker42 for his comments

Going to Goto (twice)

I’ll be at Goto Aarhus, October 9-14 this year, giving a presentation and a workshop on Nat Pryce’s and my material on using Test-Driven Development at multiple levels, guiding the design of system components as well as the objects within them.

If you register with the code free1250, you’ll get a discount of 1250 DKK, and Goto will donate the same amount to Computers for Charities.

Some of us are then rushing to Goto Amsterdam, where I’ll be giving the talk again on Friday. Again the code free1250 will do something wonderful, but I’m not quite sure what.

Bad code isn't Technical Debt, it's an unhedged Call Option

I’d been meaning to write this up for a while, and now Nat Pryce has written up the 140 character version.

Payoff from writing a call.

This is all Chris Matts’ idea. He realised that the problem with the “Technical Debt” metaphor is that, for managers, debt can be a good thing. Executives can be required to take on more debt because it makes the finances work better; it might even be encouraged by tax breaks. This is not the same debt as your personal credit card. Chris came up with a better metaphor: the Call Option.

I “write” a Call Option when I sell someone the right, but not the obligation, to buy in the future an agreed quantity of something at a price that is fixed now. So, for a payment now, I agree to sell you 10,000 chocolate santas[1] at 56 pence each, at any time up to 10th December. You’re prepared to pay the premium because you want to know that you’ll have santas in your stores at a price you can sell.

From my side, if the price of the santas stays low, I get to keep your payment and I’m ahead. But, I also run the risk of having to provide these santas when the price has rocketed to 72 pence. I can protect myself by making arrangements with another party to acquire them at 56 pence or less, or by actually having them in stock. Or, I can take a chance and just collect the premium. This is called an unhedged, or “Naked”, Call. In the financial world this is risky because it has unlimited downside, I have to supply the santas whatever they cost me to provide.
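The asymmetry of the writer’s position can be sketched in a few lines. The 56p and 72p prices are from the example above; the 3p premium is a figure I’ve invented for illustration:

```java
public class CallWriter {
  // Payoff per unit to the writer of a call at expiry: the premium is
  // kept whatever happens, but if the market price ends above the
  // strike, the writer must supply at the strike and eats the difference.
  static int payoffPence(int premiumPence, int strikePence, int marketPence) {
    return premiumPence - Math.max(0, marketPence - strikePence);
  }

  public static void main(String[] args) {
    // Price stays low: the writer keeps the whole premium.
    System.out.println(payoffPence(3, 56, 50));  // 3

    // Price rockets to 72p: the writer loses 16p less the 3p premium,
    // and the loss grows without bound as the market keeps rising.
    System.out.println(payoffPence(3, 56, 72));  // -13
  }
}
```

The upside is capped at the premium; the downside is unlimited. That shape is the whole point of the metaphor.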

Call options are a better model than debt for cruddy code (without tests) because they capture the unpredictability of what we do. If I slap in a feature without cleaning up then I get the benefit immediately, I collect the premium. If I never see that code again, then I’m ahead and, in retrospect, it would have been foolish to have spent time cleaning it up.

On the other hand, if a radical new feature comes in that I have to do, all those quick fixes suddenly become very expensive to work with. Examples I’ve seen are a big new client that requires a port to a different platform, or a new regulatory requirement that needs a new report. I get equivalent problems if there’s a failure I have to interpret and fix just before a deadline, or the team members turn over completely and no-one remembers the tacit knowledge that helps the code make sense. The market has moved away from where I thought it was going to be and my option has been called.

Even if it is more expensive to do things cleanly (and I’m not convinced of that beyond a two-week horizon), it’s also less risky. A messy system is full of unhedged calls, each of which can cost an unpredictable amount should they ever be exercised. We’ve all seen what this can do in the financial markets, and the scary thing is that failure, if it comes, can be sudden—everything is fine until it isn’t. I’ve seen a few systems which are just too hard to change to keep up with the competition and the owners are in real trouble.

So that makes refactoring like buying an option too. I pay a premium now so that I have more choices about where I might take the code later. This is a mundane and obvious activity in many aspects of business—although not, it seems, software development. I don’t need to spend this money if I know exactly what will happen, if I have perfect knowledge of the relevant parts of the future, but I don’t recall when I last saw this happen.

So, the next time you have to deal with implausible delivery dates, don’t talk about Technical Debt. Debt is predictable and can be managed, it’s just another tool. Try talking about an Unhedged Call. Now all we need is a way to price Code Smells.

1) There is an apocryphal story about a trader buying chocolate santa futures and forgetting to sell them on. Eventually a truckload turned up at the Wall Street headquarters.

Machiavelli on code quality

As the doctors say of a wasting disease, to start with, it is easy to cure but difficult to diagnose. After a time, unless it has been diagnosed and treated at the outset, it becomes easy to diagnose but difficult to cure.

— Nicolo Machiavelli, The Prince

via Dee Hock, Birth of the Chaordic Age

Responding to Brian Marick

Brian’s been paying us the compliment of taking our book seriously and working through our extended example, translating it to Ruby.

He has a point of contention in that he’s doubtful about the value of our end-to-end tests. To be more precise, he’s doubtful about the value of our automated end-to-end tests, a view shared by J.B. Rainsberger, Arlo Belshee, and Jim Shore. That’s a pretty serious group. I think the answer, as always, is “it depends”.

There are real advantages to writing automated end-to-end tests. As Nat pointed out in an extended message to the mailing list for the book,

Most significantly to me, however, is the difference between “testing” end-to-end or through the GUI and “test-driving”. A lot of people who are evangelical about TDD for coding do not use end-to-end tests for driving design at the system scale. I have found that writing tests gives useful design feedback, no matter what the scale.

For example, during Arlo and Jim’s session, I was struck by how many of the “failure stories” described situations where the acceptance tests were actually doing their job: revealing problems (such as deployment difficulties) that needed to be fixed.

Automating an end-to-end test helps me think more carefully about what exactly I care about in the next feature. Automating tests for many features encourages me to work out a language to describe them, which clarifies how I describe the system and makes new features easier to test.

And then there’s scale. Pretty much anything will work for a small system (although Alan Shalloway has a story about how even a quick demonstrator project can get out of hand). For larger systems, things get complicated, people come and go, and the team isn’t quite as confident as it needs to be about where things are connected. Perhaps these are symptoms of weaknesses in the team culture, but it seems wasteful to me to take the design experience we gained while writing the features and not encode it somewhere.

Of course this comes at a price. Effective end-to-end tests take skill, experience, and (most important) commitment. Not every system I’ve seen has been programmed by people who are as rigorous as Nat about making the test code expressive or allowing testing to drive the design. Worse, a large collection of badly written end-to-end tests (a pattern I’ve seen a few times) is a huge drag on development. Is that price worth paying? It (ahem) depends, and part of the skill is in finding the right places to test.

So, let me turn Brian’s final question around. What would it take to make automated end-to-end tests less scary?

Do do XP

In this post, Tobias Mayer argues against doing Extreme Programming (XP). I have a lot of time for Tobias, but I think he’s wrong on this one. I don’t know who he’s been talking to, but some of this is “strawman” argument, and I’d be more likely to be convinced if Tobias had tried XP just the once. XP is not a universal solution, but it is one possible choice and we know how to make it work.

As an occasional XP advocate, I don’t “blame Scrum for the lack of good development practices in the software industry”, I blame the software industry. If we worked in an effective industry, we wouldn’t be having methodology wars because things would just work. Now this same industry is messing up Scrum too by just taking on its ceremonial aspects. On the other hand, to blame XP for blocking good practice is just bizarre.

XP is a tiny movement that attracted some attention. What XP (version 1) did achieve was to show that it is possible to break through the logjam of cautious procrastination that still cripples many development teams, but without resorting to hackery. It gave teams a reliable package of practices that just worked. Of course XP didn’t take over the world because it’s not suitable for everyone–not least because it requires a degree of focus and skill that is not appropriate for many teams. Kent Beck’s presentation of XP version 1 was extreme on purpose: it was designed to inspire us huddled masses, and to stretch the boundaries of what was considered possible in software development, to reframe the discussion.

I think Tobias has forgotten just how far we’ve come in the last decade. That we have a craft movement at all is because XP put the actual writing of code back into the centre of the discussion–just look at who’s involved, it’s the same people. He also forgets just how counter-intuitive many of the XP practices are, especially compared to the direction the industry was moving at the time.

Tobias writes that the good development practices were spreading slowly at the time, but I’d argue that without XP we’d still be waiting. Test-Driven Development is still not that widely accepted and even the original C3 team didn’t adopt it fully until Kent was writing his book. Refactoring had a small academic following, but it’s not very safe without the compensating practice of TDD. I suspect most teams still ban changing code unless it’s to change a feature. Pair programming is still a very hard sell and, again, works much better in the context of TDD. I’ve seen enough Scrum teams that have not found a coherent set of technical practices. To say that they just need to improve their Scrum implementation begs the question of how Scrum is adopted and the limits of self-organisation.

Some final nit-picks. There are two editions of the XP book; the second is much less than 12 years old and takes a “softer” approach to the methodology. As for the relevancy of the practices, the C3 project worked in an environment (Smalltalk/GemStone) that still outclasses what most of us use today. Much of the work in the XP community has been to try to recreate that flexibility in inadequate current technical environments. What’s really scary is how slowly this industry moves.

Keep tests concrete

This popped up on a technical discussion site recently. The original question was how to write tests for code that invokes a method on particular values in a list. The problem was that the tests were messy, and the author was looking for a cleaner alternative. Here’s the example test; it asserts that the even-positioned elements in the parameters are passed to bar in the appropriate sequence.

public void testExecuteEven() {
  Mockery mockery = new Mockery();

  final Bar bar = mockery.mock(Bar.class);
  final Sequence sequence = new NamedSequence("sequence");

  final List<String> allParameters = new ArrayList<String>();
  final List<String> expectedParameters = new ArrayList<String>();

  for (int i = 0; i < 3; i++) {
    allParameters.add("param" + i);
    if (i % 2 == 0) {
      expectedParameters.add("param" + i);
    }
  }

  final Iterator<String> iter = expectedParameters.iterator();

  mockery.checking(new Expectations() {{
    while (iter.hasNext()) {
      oneOf(bar).doIt(iter.next()); inSequence(sequence);
    }
  }});

  Foo subject = new Foo();
  subject.execute(allParameters);
  mockery.assertIsSatisfied();
}
The intentions of the test are good, but its most striking feature is that there’s so much computation going on. This doesn’t need a new technique to make it more readable, it just needs to be simplified.

A unit test should be small and focussed enough that we don’t need any general behaviour. It just has to deal with one example, so we can make it as concrete as we like. With that in mind, we can collapse the test to this:

public void testCallsDoItOnEvenIndexedElementsInList() {
  final Mockery mockery = new Mockery();
  final Bar bar = mockery.mock(Bar.class);
  final Sequence evens = mockery.sequence("evens");

  final List<String> params =
    Arrays.asList("param0", "param1", "param2", "param3");

  mockery.checking(new Expectations() {{
    oneOf(bar).doIt(params.get(0)); inSequence(evens);
    oneOf(bar).doIt(params.get(2)); inSequence(evens);
  }});

  Foo subject = new Foo();
  subject.execute(params);
  mockery.assertIsSatisfied();
}

To me, this is more direct, a simpler statement of the example—if nothing else, there’s just less code to understand. I don’t need any loops because there aren’t enough values to justify them. The expectations are clearer because they show the indices of the elements I want from the list (an alternative would have been to put in the expected values directly). And if I pulled the common features, such as the mockery and the target object, into the test class, the test would be even shorter.

The short version of this post is: be wary of any general behaviour written into a unit test. The scope should be small enough that values can be coded directly. Be especially wary of anything with an if statement. If the data setup is more complicated, then consider using a Test Data Builder.
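For the record, a Test Data Builder looks something like this. The Order type and its fields here are hypothetical, invented for illustration; the pattern is what matters:

```java
// A builder with safe defaults for every field, so each test states
// only the values it actually cares about.
public class OrderBuilder {
  private String customer = "a customer";  // default, overridable
  private int quantity = 1;                // default, overridable

  public static OrderBuilder anOrder() {
    return new OrderBuilder();
  }

  public OrderBuilder withCustomer(String customer) {
    this.customer = customer;
    return this;
  }

  public OrderBuilder withQuantity(int quantity) {
    this.quantity = quantity;
    return this;
  }

  public Order build() {
    return new Order(customer, quantity);
  }
}

// The hypothetical domain object being built.
class Order {
  final String customer;
  final int quantity;

  Order(String customer, int quantity) {
    this.customer = customer;
    this.quantity = quantity;
  }
}
```

A test can then say anOrder().withQuantity(3).build(), naming only the one value it is about, while everything else takes a sensible default.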

Mock Roles not Objects, live and in person.

At the recent Software Craftsmanship conference in London, Willem and Marc ran a session on Responsibility-Driven Development with Mocks for about 30 people. Nat Pryce and I were sitting at the back watching and occasionally heckling.

The first striking thing was that when Willem and Marc asked who was using “Mock Objects” most everyone put their hand up (which was nice), but then only a handful also said they were thinking about Roles and Responsibilities when they did (which was frustrating). We first wrote up these ideas in our paper “Mock Roles Not Objects” and much of the difficulty we see people have with the technique of Mock Objects comes from focussing on classes rather than relationships.

As it happens, an example popped up in the rest of the session, which was run as a Coding Dojo. What was interesting to me was how the group managed to turn around its design ideas. Here’s what I can remember about how it worked out.

The domain was some kind of game, with a hero who moves around an environment slaying dragons and so forth. The first couple of stories were to do with displaying the current room, and then moving from one room to another. It was a little difficult getting started because the limitations of the event didn’t allow enough time to really drive the design from outer-level requirements, but the group managed to get started with something like:

describe Hero do
  it "should describe its surroundings" do
    room = mock("room")
    console = mock("console")
    hero = Hero.new(room)

    room.stub!(:description).and_return("a room with twisty passages")

    console.should_receive(:show).with("in a room with twisty passages")

    hero.look(console)
  end
end

The expectation here says that when looking, the hero should write a text describing the room to the console. This was a place to start, but it doesn’t look right. Why is a hero attached to a room? And hero.look(console) just doesn’t read well, it’s hard to tell what it means. The tensions became clearer with the next feature, which was to have the hero move from one room to another. If we write

hero.move_to(another_room)


how can we tell that this has worked? We could ask the hero to look() again, but that means making an extra call for testing, which is not related to the intent of the test. We could ask the hero what his current room is, but that’s starting to leak into Asking rather than Telling. There may be a need for the hero to hold on to his current location, but we haven’t seen it yet.

Suddenly, it became clear that the dependencies were wrong. We already have a feature that can be told about the hero’s situation, which we can build on. If the feature were to be told about what is happening to the hero, we could use that to detect the change in room. So, our example now becomes:

describe Hero do
  it "should move to a room" do
    room = mock("room")
    console = mock("console")
    hero = Hero.new(console)

    room.stub!(:description).and_return("a room with twisty passages")

    console.should_receive(:show).with("in a room with twisty passages")

    hero.move_to(room)
  end
end
That’s better, but it’s not finished. The term Console sounds like an implementation, not a role. Most of the sword-wielding adventurers that I know don’t know how to work a Console, but they’re quite happy to tell of their great deeds to, say, a Narrator (as David Peterson suggested). If we adjust our example we get:

describe Hero do
  it "should move to a room" do
    room = mock("room")
    narrator = mock("narrator")
    hero = Hero.new(narrator)

    room.stub!(:description).and_return("a room with twisty passages")

    narrator.should_receive(:says).with("in a room with twisty passages")

    hero.move_to(room)
  end
end
The whole example now reads as if it’s in the same domain, in the language of a D&D game. It doesn’t refer to implementation details such as a Console—we might see that code when we get to the detailed implementation of a Narrator. Obviously, there’s a lot more we could do, for a start I’d like to see more structured messages between Hero and Narrator, but the session ran out of time at about this point.

Some lessons:

  1. Naming, naming, naming. It’s the most important thing. A coherent unit of code should have a coherent vocabulary, it should read well. If not, I’m probably mixing concepts which will make the code harder to understand and more brittle to change than it needs to be.
  2. When I’m about to write a test, I ask “if this were to work, who would know?”. That’s the most revealing question in B/TDD. If there’s no visible effect from an event, except perhaps for changing a field in the target object, then maybe it’s worth waiting until there is a visible effect, or maybe there’s a concept missing, or maybe the structure isn’t quite right. Before writing more code, I try to make sure I understand its motivation.

Willem’s (and many other people’s) approach is slightly different. He likes to explore a bit further with the code before really sorting out the names, and he’s right that there’s a risk of Analysis-Paralysis. I do that occasionally, but my experience is that the effort of being really picky at this stage forces me to be clearer about what I’m trying to achieve, to ask those questions I really ought to have answers to, before I get in too deep.

"He doesn't mean that about Scrum"

In very bad taste, but very funny: “Hitler’s Nightly build fails”.

and we should remember that Stalin won…

Experienced Agilistas proved wrong (again)

So, Jurgen Appelo is unhappy that some of the more experienced Agile names have been telling him what to do. In particular, apparently they’ve been doing so without understanding complexity theory; he’s not reacting well.

In between the ranting, much of what Jurgen says is obviously true. For disorganised teams, adopting Scrum and nothing else will help them get more organised and productive, as seems to be his case. But he then goes on to say that anyone who tries to clip his wings and tell him how to develop software cannot be agile because they’re not adaptive enough—and by the way, he knows lots of stuff about complex adaptive systems that other people don’t.

My first reaction is to suggest that he not underestimate some of the people he’s reacting to. Ron Jeffries can certainly be provocative, but I don’t think he does it to try to convince people he’s smart. And some of us have been investigating complex adaptive systems for years.

What I think Jurgen is missing (or at least not making explicit) is that there isn’t a single axis between chaos and order: some aspects of an organisation need to be ordered (in the end, we’re dealing with physical machines here) and some need to be complex (the people bits, usually). I may use complex adaptive techniques to understand what to build and how to communicate that, but I also want the thing to work reliably and not to be dragged down by a fear of making changes.

Sure, people in the community say dumb things, but we have actually learned some things over the last ten years. One is that we see team after team that has hit the wall because they didn’t work cleanly. Some have the luxury of a financial buffer which allows them to continue working sub-optimally. Just a few teams have understood the real trade-offs and can undercut the opposition by delivering faster and more reliably—and no-one promised that it would be easy or quick to achieve.