Category: Agile

#ukgovit: A new hope

I was delighted to be at the launch event last night for System Error: Fixing the flaws in government IT. This is the report from the Institute for Government thinktank about how classic problems with government IT projects might be avoided. The event was tweeted under the hashtag #ukgovit.

I need to declare the same interest as Glyn Moody and Harry Metcalfe: I was also involved in the research (and am the source of the Guardian case study: one of eleven).

That said, it addresses one of the things I’ve found most frustrating about government IT, as an outsider: how can you so often spend so much money and deliver so little value? What would happen if government departments tried an agile approach of delivering little and often? This report proposes a two-pronged approach: a platform based on commoditisation, co-ordination and common standards; and agile project management.

Mark O’Neill, the DCMS and DCLG CIO, described the traditional government approach to IT needs (this from memory):

We come to a river and decide we want to cross it. So we say “I know, we’ll commission someone to build a bridge”. And they spend five years designing and building a bridge, and for those five years we’re stuck on the river bank because 99.9% of a bridge is no good. But, oh, it will be a lovely bridge.

The two-pronged approach outlined above makes for faster, cheaper delivery. The research included a live agile project with the Metropolitan Police and the Home Office. One of the authors told me that of all the recommendations in all their reports, they had never felt as strongly about any as they did about the benefits of agile.

Still, I went into the launch wary about the support the report would receive in government — after all, the IfG is a thinktank and sits outside government. So I was really impressed with what I found. Not only was Mark O’Neill on the podium endorsing its recommendations (and confirming he would be implementing several of them), but so was Ian Watmore of the Cabinet Office. And he confirmed that only that morning the permanent secretaries had approved an agile approach to the implementation of the universal credit system. I also found out afterwards that the report had been approved by all the government CIOs.

There is still a long way to go. Culture and bureaucracy will have to change within the civil service. But there is a plan for trying this out and ramping it up. There is a new hope.

How to fight the laws of (not just CMS) projects

Freelance Unbound has some great laws, clearly learned the hard way, about… well, supposedly about website launches with new content management systems. But actually they can be generalised to be laws of almost any kind of big project with a strong technology element. Laws of physics? Laws of the land? Either way, you don’t need to take them lying down. Here are some suggestions about how to fight back…

1. It only exists when it’s live

The point here is that you can’t get people’s full attention unless it’s actually live. That’s virtually a law of physics. The way to fight this is to deliver early and often: start making it real for people as soon as possible, get something out there quickly so they can use it and feel what it’s like, and then build up.

2. It’s nobody’s fault but yours

This is a law of the land, and you can get a change in legislation. If people work in silos then it’ll be easy to finger-point when things go wrong. If people are brought together from across the organisation and are jointly responsible for the project being a success, then there’s a much better chance of success happening. This is because there is much greater cohesion between people with different expertise, which means more fluid communication and much less loss of understanding. So problems are less likely. And when they do happen then you can expect a much more productive reaction from all involved.

3. No-one has thought about [some critical function]

Again, this is a law that can be changed: find out early who all the stakeholders are, and make sure they’re involved. This is a pain, because it’s much easier to get a project going when there are fewer dependencies or teams involved. But inconveniently projects are judged a success at the end, not at the start. So it’s worth forgoing a bit of pain at the start — socialising the ideas, seeing who needs to be involved, involving them, listening to them, changing your thinking accordingly — to avoid a lot of pain at the end.

4. [Team which does unglamorous but critical work] is always the last to know

Yes, it may be easier to involve some people only at arm’s length, but the inconvenience you’re avoiding early on will have to be repaid with interest. As before, discovering and involving all the stakeholders early is key.

5. Things you want to change are always hard-coded — and vice versa

You can’t please all of the people all of the time. But you can choose a small number of people and make the system flexible for them. And then you can extend your system a bit so that it pleases a few more people. And then you can continue to extend and expand, including more and more people, and adding more and more flexibility in just the right places to please them, until… well, hopefully everyone agrees it’s okay to stop.

For this approach to work you do have to make sure you have the capability to adapt your system. If you’ve chosen to buy then you need to know it’s sufficiently flexible for your needs, and where it doesn’t flex (or you’ve chosen not to flex it) then you need to make sure you communicate the limitations to all the relevant people. If you’ve chosen to build then you need to make sure you have the skills to build effectively.

6. Your system will always be out of date

Every project is a response to the wider business environment. And the business environment will never stop changing. So the question to ask is: If this project is a response to business change, and the business environment is changing constantly, what resources do we need to continue to put into this system after the project is complete to ensure it continues to keep pace with the moving target it was aimed at in the first place?

I’d say almost any answer is acceptable, including “We’re happy for it to start falling behind the moment it goes live, because that bit of our business isn’t the most important part”. That particular answer will probably upset some staff, but the main thing is it’s a conscious decision. What will lead to problems is not answering the question at all.

Mind you, it’s also possible to get that answer wrong. Or, more likely, to incorrectly predict the pace of business change. If that’s a possibility then acknowledging the risk should help secure additional resources in the future.

The Agile treadmill: Command and control in disguise?

Agile comes with tons of literature on how to organise work at a very detailed level. How much of this treadmill is activity for activity’s sake, rituals, and religious manifestations of an ‘Agile Sub-Culture’ aimed at integrating a growing workforce? Can this relentless ‘heart beat’ and this esoteric jargon stifle innovation and alienate the very people Agile purports to help? Could more slack benefit a discipline which ultimately relies on people’s intelligence and creativity?

That’s how we set the scene a few days ago when I led a discussion among some peers from different companies entitled “The Agile treadmill: Command and control in disguise?” It was organised by IndigoBlue as part of their Second Wednesday series, under the Chatham House Rule. This is a summary of what we discussed, with fictionalised initials for each participant.

When people talk about how Agile they are I find this usually coincides with how rigorous their processes are. Some of the largest, most successful Agile projects I know have an exhausting hothouse atmosphere, where everyone’s role is tightly defined and cannot be deviated from.

Have we created a kind of Animal Farm, where a system designed to set us free has actually enslaved us? Does it make a difference if Agile has been introduced by developers or management? Is effective Agile synonymous with a hothouse environment?

Between us we had an awful lot of experience to help answer these questions.

How useful is Agile process?

There was general agreement with TC that there is real value in having standard processes and practices:

  • Rewriting the rulebook every time you start a project or programme is wasteful if you can capture the processes and practices that are known to work;
  • It’s easier to bring in external resources if well-known language and practices are being used in the team.

But practices can be unnecessary, and there was recognition that many practitioners have a near-religious attachment to them, and to particular ways of working. As people managing programmes of work we often don’t have the discipline to recognise what the right level of discipline is… and isn’t.

There was also concern that Agile created an alienating language. NA indicated his particular discomfort over the word “customer” because that was often not how people saw themselves or their relationship with the tech teams.

Religious Agile, variable results

So no-one was anti-process (nor anti-Agile process), and we know it has produced good results. Our own major rebuild and redesign at the Guardian is one example of that, with a very highly regimented process and an outcome that at least met the aims and expectations. There are plenty of others.

On the other hand we collectively had experience of strong-process projects which weren’t delivering.

FC talked about Agile projects in his organisation. Most of the smaller ones were fine; these were projects with just a handful of developers. But the company’s most significant piece of work was having serious problems. It was now in its third year — nothing necessarily wrong with that, we said, after all some projects are just very large. And what had been delivered so far? Not much. Ah, we see the problem. And yet, he said, to the people on the ground this was a successful Agile project, because they were doing all the right things: daily stand-ups, rigorous unit testing, prioritised iterations… all the paraphernalia of an Agile project. They couldn’t see there was a problem.

AS reported a similar experience. “When I walked into the organisation they proudly walked me through their very rigorous process. ‘But nine out of your ten projects are on Red’, I said, ‘What is that process doing for you?'”

So what is Agile, anyway?

“Is Agile a process?” asked AS, rhetorically, “No. Is there process in it? Yes.” Process helps a project fit into the overall programme, added FC.

BJ went a bit further: successful Agile is about business change, not about what happens in the software team. The Agile approach to a large project is to get just enough clarity to embark on it, which probably isn’t every last detail. The stakeholders need the confidence to deal with uncertainty, trusting that enough of that uncertainty will be continuously resolved as the project progresses.

[And with that I’m going to drop the capital A from Agile unless I’m referring to by-the-book developer-level Agile.]

RN then came in with his take on this. He works in a highly bureaucratic company where technology is driven by heavyweight specs, a focus by high level people on low level details, and delivered entirely by specialist (external) suppliers. Yet his board are continually wanting things delivered faster, to greater quality, and he sees agile as the solution to this. It means persuading his board to think about what they want at the right level — business value, not the contents of drop-down menus — and providing reassurance through clear, regular deliverables. It also means getting his suppliers to deliver in that manner. But all of that is entirely independent of unit tests, story cards, and so on.

I wondered if this meant a top-down mandate of agile was more successful than a bottom-up one. AS clarified: it’s successful if it’s business-led, not management-led. LB linked this to Kotter’s 8 steps to successful change, and in particular the first step: there has to be a sense of urgency.

I confessed that our processes in the Guardian team today are substantially less religious (or rigorous) than they were during the time of our major rebuild programme. I used to be able to confidently say we are Agile and point to intensive pair programming, lots of story cards, and so on. Today pairing is inconsistent, automated unit tests aren’t used in non-critical satellite code, and teams don’t always work to iterations. I explained this changed because our business had changed. The emphasis is no longer on transforming our platform but now much more on innovation and on smaller projects with a much wider variety of stakeholders. This makes the control freak within me distinctly uncomfortable (AS had previously commented that in many organisations command-and-control provides confidence) but we generally deliver projects as intended and have happy stakeholders, so I wasn’t going to wade in to change it. BK commented that this seemed perfectly appropriate, and still looked like an agile business.

Seeing the vision, from top to bottom

Successful agile is something that needs to permeate the organisation, from top to bottom. It’s about delivering value, and the vision for what needs to be delivered must be seen and understood at all levels. Many of our experiences of projects going awry were also experiences of where the project vision wasn’t seen by the people doing the day-to-day work.

TC told how his IT department decided it was going to achieve CMMI Level 2. After two years they had accumulated 10 rooms’ worth of documentation, and their productivity had dropped like a stone. And yes, he could measure that drop in productivity, because the team essentially did nothing else for those two years than produce documentation. The board inevitably said: Never do that again. CMMI accreditation may be useful, but it’s got to be for a purpose.

FC talked about his troubled project in its third year. “I revisited the original outcomes document a short while ago, and I couldn’t see how the activities around me are ever going to achieve those outcomes. When I ask one of the team leads what they’ve delivered, and how much there is to go, I’m just waved in the direction of a vast wall of index cards.”

AS talked about a daily scrum he attended: a two-minute meeting in which the team lead got a task update from each team member while he stood with his back to the story card board. “I had to take him to one side and tell him that he needed to be constantly referring people to the project outcomes, not just getting task updates.”

He also told the story of a business manager who wanted a change rolled out midweek — unprecedented for the organisation, as it would involve taking the website down for two hours. This was flatly refused by the Head of Systems. When they discussed the matter the issues were this: the business manager could show that if the change rolled out in the desired slot the company was projected to make an extra £1m, and if the slot was missed there wouldn’t be another opportunity for six months; the Head of Systems, meanwhile, had a target of 99.9% uptime and two hours downtime would mean he’d forfeit his bonus. “Our business must organise itself so that we are all focused on the same goals.”

What did we learn?

You may ask what our learnings were. I can confidently say there were no learnings whatsoever because “learnings” isn’t a real word. But there were probably lots of lessons. Here are three key things I came away with.

  1. Religious Agile, with an optional hothouse atmosphere, is entirely independent of a successful agile project. There are successful projects with and without religious Agile, and there are unsuccessful projects with and without religious Agile. This is underscored by…
  2. Agile is about business change. It really only works when all the stakeholders focus on the same business-level goals, understand what their part in it is (and isn’t), can deal with uncertainty, and continually work to reduce that uncertainty exactly when necessary.
  3. The business-level vision must be clear and permeate all levels. This is a corollary to the last point, but to me is significant enough to pull out. The people on the ground have to be able to relate all their activities to the business’s end goal, otherwise they will define success by other criteria — perhaps how well they adhere to a process — which are different from the actual success criteria.

Thanks to all the participants for making it such a stimulating morning.

Writing in Javascript for Palm’s webOS platform

In my spare time over the past few months I’ve been writing an app for my Palm Pre, and have finally got to the point where I’m happy enough to release it. Palm Pre and Pixi apps run on Palm’s webOS operating system in HTML and Javascript. This is a summary of how I got on…

Background: A basic usability problem

I’ve always regarded Palm as the masters of productivity. David Allen based his Getting Things Done system on his use of a Palm Pilot. I stuck to my old Palm Vx as long as I could, as I lived almost entirely out of its task list application. I bought an iPod Touch when I had to ditch my old Palm, but while Apple makes beautiful products it doesn’t seem to care about productivity apps. (And that’s understandable — if you can look cool in Starbucks with a black polo neck sweater and a slim silver laptop, why should you care about productivity? Me, I don’t look cool in polo neck sweaters.) The iPod Touch and iPhone’s calendar app, for instance, only allowed a limited choice of event recurrence options; if you had a meeting which occurred on the 3rd Thursday of every month then that was just tough. There was no task list app as standard.

While I had migrated my task list to Remember The Milk (RTM) on my desktop, the Touch’s RTM app was for me a poor user experience. For one thing you had to navigate through several screens to get to your task list. Then you couldn’t see today’s and tomorrow’s tasks in one list — they were on separate screens you had to navigate back and forth between. And when you tapped to create a new task you couldn’t start typing “Get bread”, you had to first tap on the description field to select it — that’s just an annoying unnecessary extra user action. One lesson I learnt — from GUI Bloopers, I think — is to understand what the user will primarily want to do and then absolutely minimise the work needed to do it. In a task list application I’d have thought having your tasks on the first screen might be obvious. Apparently not.

So I held out for the long-awaited Palm Pre to be my first smartphone. (And in the UK that was a very long awaiting.) Take a look at the Pre’s calendar app in this inevitably unnatural video run-through. Notice for one how the free time concertinas up so that you don’t miss knowing about appointments just because there’s a big gap before them. That’s an application made by people who really understand what their users need — or at least understand what I need.

Unfortunately, while the Pre does come with a very beautiful task list app it really is very basic. It doesn’t capture the ease of use of the Palm Pilot’s original (once again you have to drill down to actually see your tasks). And of course it was foolish to have hoped it would sync with Remember The Milk.

So, anyway, I wrote a task list app.


Palm’s webOS is based on Linux, and the basic programming environment is Javascript and HTML running in Webkit. Now I know what you’re thinking: isn’t that just what Steve Jobs announced when the iPhone first appeared, before the iPhone SDK existed, and wasn’t it unconvincing then? This is what he said:

We have been trying to come up with a solution to expand the capabilities of the iPhone so developers can write great apps for it, but keep the iPhone secure. And we’ve come up with a very. Sweet. Solution. Let me tell you about it. An innovative new way to create applications for mobile devices… it’s all based on the fact that we have the full Safari engine in the iPhone. […] don’t worry about distribution, just put ’em on an internet server. They’re easy to update, just update it on your server. They’re secure, and they run securely sandboxed on the iPhone. And guess what, there’s no SDK you need! You’ve got everything you need if you can write modern web apps…

That section of transcript came from Engadget’s live coverage, and they immediately added a comment of their own.


Well, don’t confuse that with what Palm has done, because there are three key differences. First, they’ve provided their own APIs which allow access to useful functions, including some system functions such as background alarms, network status, location, and so on. Second, you can package up your app into a single bundle and deploy it onto a device. Third, when an app runs it does so within a well-defined framework. So you can create scenes with their distinct logic and pass data between them, and the framework (called Mojo) handles the application lifecycle.

So Javascript is still the language, and HTML and CSS is still the rendering mechanism, but that’s just a language detail. You really are writing real apps on the actual device platform. It’s Javascript, Jim, but not as you probably know it.

As an example, here’s how my task list app exits from the scene in which someone has added or edited a task:

EditTaskAssistant.prototype.popScene = function(wasCancelled) {
    Mojo.Log.info("EditTaskAssistant.popScene: Entering");
    if (this.nameModel.value == '') {
        // Don't allow a blank name to be entered
        this.nameModel.value = this.config.task.name;
    }
    Mojo.Controller.stageController.popScene({
        lastScene: 'EditTask',
        task: this.config.task,
        isNew: this.config.isNew,
        wasCancelled: wasCancelled
    });
}
The Mojo.* Javascript classes are built in, and I’m using them to do a bit of logging and also to exit from this scene. The framework itself knows what the last scene was to return to, and the only thing I’ve added is some parameters to pass back.

Development environment

The webOS development environment and workflow is pretty good. It consists of

  • Eclipse;
  • The Aptana Javascript plugin;
  • Sun’s VirtualBox; and
  • The webOS SDK itself.

The SDK consists of

  • Palm Pre and Palm Pixi images for VirtualBox (thus creating an emulator);
  • Another Eclipse plugin which adds some webOS specific actions (deploy to emulator, create a new Mojo scene, etc);
  • Some command line tools to perform actions such as tailing logs, creating an app package, deploying an app, etc;
  • Graphical tools for resource monitoring and DOM inspection; and
  • Sample apps to study.

What appeals to me about all this is Palm’s relaxed attitude to technology ownership. Write an IDE? Why go to the trouble when Eclipse already exists. Write an emulator? Run it in VirtualBox. And so on. Plus, the choice of Javascript as a development language ensures a very low bar to entry.

This relaxed (Palm would say “open”) attitude extends to deployment, too. Once I’ve packaged my Javascript app I can send it to the emulator or I can hook up my Palm Pre to my laptop with the USB cable and send the app down the line. No licensing, no cryptography, no UUIDs, just drop it on the device. That also means I can e-mail you the app package and you can drop it on your webOS device, too. As a result there’s been a lot of healthy activity around so-called “homebrew” apps — apps which aren’t available in the Palm App Catalog but can still be easily downloaded and installed.

I said the development environment was “pretty good”, but I have reservations. For one thing, the debugging and troubleshooting tools are a bit basic. But my more significant reservations are around the use of Javascript as a development language. Coming from the nice, safe, statically-typed world of Java, working in a dynamically-typed language with no safety nets of compile-time type-checking or intelligent auto-completion means I have to work harder. More of my thinking goes into typing rather than into programming.

Also, Javascript is a language and platform that has its own quirks that stalled this non-expert developer, particularly when dealing with a large application…

Execution sequence

One thing I took a while to grasp was how Javascript handles its execution sequence. On the one hand we are told Javascript is single-threaded. On the other hand, it’s easy to have events fire out of order using setTimeout or in response to some user action. Eventually I worked out that Javascript will execute a sequence of code without interruption until it reaches completion, and then it will select another sequence which has queued up in the interim (from setTimeout, or a user action, or whatever). Overall this really simplifies things, and it meant I could stop worrying about concurrency problems. (It also explained why I couldn’t write any code to successfully demonstrate a concurrency problem.)
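That run-to-completion rule is easy to demonstrate. In this little sketch (illustrative code, not taken from my app) a callback queued with setTimeout can never interleave with the sequence of code that queued it:

```javascript
// Run-to-completion: the timer callback is queued immediately, but it
// cannot run until the current sequence of code has finished.
function runToCompletion(done) {
    var order = [];
    setTimeout(function () {
        order.push('timer');    // only runs once the sequence below has completed
        done(order);
    }, 0);
    order.push('first');
    order.push('second');
    return order;               // at this point the timer callback has still not run
}
```

Whatever the timer delay, the array is always ['first', 'second'] when the function returns, and the timer callback only ever sees the completed sequence.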

However Palm realised that this is potentially bad, too, because a long-running routine can lock up the application — it will become totally unresponsive while the routine runs. So in order to preserve the user experience the webOS platform kills any routine that continues for more than 10 seconds without interruption. Unfortunately when this happens webOS doesn’t tell you (e.g. by logging a message) so you’re left to guess why your software has suddenly stopped. And the documentation of this “feature” is not terribly obvious, either. I spent quite a bit of time wondering what was going on when it happened to me. But eventually I discovered the cause, found myself in a small community of similarly-frustrated developers, and posted some sample code which showed the problem in action and also demonstrated that at least webOS doesn’t kill your whole application, just the long-running routine.

So I had to adapt my code to run in smaller, disjoint chunks. Fortunately by that stage I had become rather adept at dealing with…
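The adaptation looked something like this (a simplified sketch with hypothetical names, not my actual code): hand a long loop back to the event loop in small chunks via setTimeout, so that no single uninterrupted routine ever gets near the 10-second limit.

```javascript
// Process a long list in small chunks, yielding to the event loop
// between chunks so no single routine runs for long uninterrupted.
function processInChunks(items, handleItem, onComplete) {
    var CHUNK_SIZE = 50;
    var index = 0;

    function nextChunk() {
        var limit = Math.min(index + CHUNK_SIZE, items.length);
        while (index < limit) {
            handleItem(items[index]);
            index++;
        }
        if (index < items.length) {
            setTimeout(nextChunk, 0);   // yield, then carry on
        } else {
            onComplete();
        }
    }

    nextChunk();
}
```

The cost is that everything downstream of the loop has to be written as a callback, which leads neatly to the next section.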

Callbacks and testing

Most of the interesting Javascript extensions use callbacks, in particular XMLHttpRequest and the HTML5 SQL database API, which both return their results on callbacks. Initially I structured my code around XMLHttpRequest (more precisely, Prototype’s Ajax.Request wrapper) and I avoided asynchronous storage by putting all my data in cookies, which are accessed synchronously.

But eventually I felt I had to switch from cookies to SQL and convert my synchronous storage code to asynchronous code. Not only was that conversion quite painful, but trying to find a way of writing tests around so many, er, sequential asynchronous actions sent me down a lot of dead ends. I was using Yahoo!’s YUI test framework, which is quite elegant, but I had found the limit of its usefulness here.

In the end I devised a mechanism to break up my test cases’ code into asynchronous segments with pauses to aid sequencing. So a test would create a database table if it wasn’t already there, pause, clear down the table, pause, write some data into it, pause, and so on. Essentially I was pumping many of my tests full of race conditions (although in practice it was fine… generally). What didn’t help was that to run the tests I now needed a browser which supported the HTML5 SQL database API, and on my Windows box that meant only Safari. Which, it turns out, is spectacularly disappointing: slow, resource-hungry, and oddly buggy. Test execution took three times as long, and only in small part due to the little pauses I put in between database calls.
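Stripped of the YUI Test plumbing, the idea was something like this sketch (a hypothetical helper, not the real code): run each test segment synchronously, then pause before the next one to give any pending asynchronous callbacks a chance to arrive.

```javascript
// Run an array of functions in order, pausing between each one so that
// pending asynchronous results (e.g. database callbacks) can complete
// before the next segment runs.
function runSegments(segments, pauseMs, done) {
    var i = 0;
    function next() {
        if (i >= segments.length) {
            done();
            return;
        }
        segments[i]();              // one synchronous test segment
        i++;
        setTimeout(next, pauseMs);  // the pause that (mostly) avoids races
    }
    next();
}
```

A database test then becomes a list of segments — create table, clear table, write data — with the pauses supplied for free.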

Events

But at least I settled on an approach. One area I never resolved satisfactorily was how best to handle events. Because of the way Javascript executes, or because of what my application does, there were a lot of “things that had to respond to other things”. I didn’t settle on a single way to manage those, and actually found myself using four techniques:

  1. Writing event listeners. Various functions register with an object to show their interest in an event. The object adds each function to a list. When the event occurs the object calls the functions in turn. Here, the event happens and the functions are called synchronously.
  2. A dummy function designed to be overwritten. This is a simplification of the first, where the object knows that it will have only one listener. When the event happens the object calls the dummy function. The interested code simply needs to replace the dummy function with its own implementation.
  3. DOM events. The interested code listens for a custom event in the DOM. When the interesting thing happens the custom event is fired in the DOM for the interested code to pick up. Useful if the firing code and the listening code aren’t otherwise well connected.
  4. Weird specialist class. Sometimes I found some actions depended on other things having happened first. For example, pushing out task changes depended on having first fetched a “timeline” ID from the Remember The Milk server, which in turn depended on having an authentication token, which in turn depended on having an internet connection. To handle this I encapsulated the logic in a single class. It had a method called fire() which just went off and did the next thing it could.
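To make the first technique concrete, here’s roughly the shape of an event listener implementation (a generic sketch, not code lifted from Cloud Tasks):

```javascript
// Technique 1: interested functions register with an object; when the
// event occurs the object calls each of them in turn, synchronously.
function EventSource() {
    this.listeners = [];
}

EventSource.prototype.addListener = function (fn) {
    this.listeners.push(fn);
};

EventSource.prototype.fire = function (eventData) {
    for (var i = 0; i < this.listeners.length; i++) {
        this.listeners[i](eventData);   // synchronous, in registration order
    }
};
```

The second technique is just the degenerate case of this, where the list always holds exactly one function.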

Now having four ways to handle events isn’t necessarily wrong: arguably there are different kinds of events, and each kind may well benefit from its own design pattern. But I don’t have a handy classification of events, much less a mapping from them to appropriate design patterns. So perhaps here more than anywhere else in my code I felt I was guessing.

Releasing it

Undoubtedly my experience of writing a webOS application was a positive one: the very fact that I finished the project proves that. It is the most complete and release-quality piece of software I’ve developed in my own time. And now my smartphone has its missing app: Cloud Tasks. I can continue to live by my task list even when I’ve left my desk.

With some trepidation I’ve made the application available for others to download and use, too, in case they find it useful. The trepidation exists because it’s taken a substantial chunk of my time to develop and I really want to leave it now, and get on with the rest of my life. But I know there are plenty of people who use Remember The Milk in different ways, making use of features I’ve not implemented in the webOS client, and don’t want to implement, because I don’t use them myself. It’s not very community-minded to release something with minimal support. Instead I’ve chosen to release it under the GPL, so that others can develop it and any developments remain similarly freely available.

We’ll see what happens.

Being Agile and open for a stronger business

A couple of weeks ago I presented at the Agile Business Conference 2009. Thanks to all those at the DSDM Consortium who made it such a great event. The theme of the conference was how Agile can help in adversity (e.g. an economic downturn), and I provided a case study — title above — of how the digital technology team at the Guardian had responded to various organisational changes and challenges. Looking back over the last few years I found I’d learned many lessons from working with my friends and colleagues here. This is the summary of those lessons that I offered to the delegates at ABC09…

Exploit your successes

Our huge rebuild and redesign of the Guardian website was a successful project. Once it was done we could have slumped back, exhausted. But instead we used it as a springboard to start additional exciting (and arguably controversial) work. Primarily I’m thinking of the Open Platform, led by Matt McAlister. As I wrote at the time, getting that off the ground was a huge testament to the work and faith of many people in the company, particularly given that it’s a very intangible thing at launch.

Still, the Open Platform is built on top of the work in that earlier project, and in particular in that people recognised technology success and were prepared to back more.

Sustainable business needs sustainable technology

Much of our technology, particularly on the web side, is developed in-house. On the one hand that can seem expensive, but on the other hand it does mean we have the ability to take in the direction we see fit. A particular example is the cost of scaling It does cost more money to serve more users, but it doesn’t cost as much as it might because we have control of the technology.

Broadly speaking since we own the technology we can steer it in line with business needs much more readily.

Right-levelled decision-making builds trust

Let’s make one thing clear first: “Right-level” is absolutely not an acceptable verb phrase, and I only used it because it fitted onto the slide better. Now, to the substance of this…

Any reasonably sizable projects are approved to go (or not go) by a group of directors and other senior staff, and they do this at a regular meeting on the basis of information provided by the technology team working with many people around the business. This information of course consists of timescales, costs and benefits, but also what other projects are in progress and waiting in the wings.

This may seem obvious, but the consequence is that in the event of heavy cost control (which is what happens when the economy takes a dive) that cost control is exactly in line with project expenditure, because it’s the same people who are making both kind of decisions. It should be said that this is supported by the Agile principle of delivering value, and delivering frequently, because any project can start demonstrating its value early.

Openness reduces costs and provides options

The Guardian has been open not only with the Open Platform, but also with our full content RSS feeds. Applications have been built on the latter simply because they are a known format and, with full content, have huge utility. By having open full content RSS feeds we have a means of others using our content, and if they (or we) choose that option there are no internal integration costs.

Transparency builds trust

We do a lot of internal reporting: what features we release, our current bug count, project costs, and so on. When times get tough people start asking what value you’re adding to the business, and where all that money that’s being spent on your team is actually going. Because of all our reporting it’s relatively easy to open up our books to the consultants with financial questions, and then we can have an intelligent debate about the value of that expenditure. But that’s fine, and something we tend to welcome, because it’s part of the culture that Agile fosters: within the team we debate every week the value of what we do — that’s what Agile planning is all about — and Agile retrospectives are all about seeking improvement.

And finally…

I recognise that saying all this may be a hostage to fortune: the credit crunch took hold only about a year ago, and we’re not out of the woods yet. But for now I can look back on some of the things that have gone well, and if we emerge from this economic downturn with only a few cuts and bruises then it will be in no small part due to honesty and openness creating trust, which in turn leads to healthier decision-making.

A few things I know about lean

I’ve been reading a bit about lean working recently, and this is a bit about what I’ve learned.

Lean software development is a fascinating step on from Agile, but its history is in manufacturing cars and to date I’ve only been reading up on lean manufacturing, not lean development. There are two reasons for this. First, I’d feel much more confident about lean if I know the background and reasons for things — in other words, I want to derive lean software development from first principles. Second, I was rather cautious, if not downright sceptical, that something can be translated from manufacturing to software. So if there’s any translating to be done then I want to do it myself — or at least understand someone else’s translation of some of the processes sufficiently to trust the rest of their translations.

This, then, is some of what I’ve learnt about lean working so far, based on manufacturing and with my own projections onto software development. It comes with the caveats that it’s partial and not informed by 99.9% of the work Mary and Tom Poppendieck, the acknowledged leaders in the field of lean software development. But it is informed by my reading of two fabulous books: The Toyota Way by Jeffrey K. Liker (TTW), and Lean Thinking by James P. Womack and Daniel T. Jones (LT). My interest in cars registers on no scale visible to the human eye, but learning about car production from these authors has been quite eye-opening.

In this post:

Lean challenges common sense

The “common sense” way of putting a complicated thing together is to divide the manufacturing process into pieces (the engine, the chassis, the interior, the shell), create a specialist team for each area with their specialist tools, then have them work in a long production line. The result is a very long production line with each product moving from one specialist team to another as it’s built up from beginning to end. If you’re going to be really efficient then you might batch up all the products of one type and send them through in one go, then switch to another product of another type and send those through.

Lean says this is wrong. It’s not trying to be deliberately challenging or quirky or iconclastic just for the sake of making a name for its founders. Rather, it says this is wrong because it generates hidden problems. If we look at the very long production line then every one of the products on that line is in an incomplete state and hiding a multitude of problems. And if you were to implement that single-product-batching it would be even worse.

Lean says the most efficient way of doing something is to create it from beginning to end in a single pass. The single long production line transforms into many, much smaller, teams that are each responsible for the entirety of the production of each product.

This is counter-intuitive. Did we learn nothing from Henry Ford? Well, the evidence — and there is plenty of it — says otherwise.

But I’m getting ahead of myself. Let’s start at the beginning…

Lean is about producing value

This is pretty unstartling and almost uninspiring. But it turns out that what constitutes value is often hard to define. First you have to ask “value to whom?” Often in large organisations people are set targets of something which is not the main goal of the organisation, and this means you end up optimising your processes for the wrong thing. For example, the paint team might be targeted on how quickly they can paint a car body before moving onto the next one. But that would be misleading, and might explain why there’s a retouch team further up the line to cover up the paint problems discovered later. A car company is not about producing painted bodies, it’s about producing cars, and focusing a team on painting ensures they miss the big picture.

The Guardian software team produces software, so it’s easy to see that’s where our value lies and we should optimise our processes around software. But on second thoughts I’m not entirely sure… Perhaps value should be measured from the point of view of the end user. The better we provide news and information the greater the value of what we do, and merely producing software is no good if it doesn’t benefit the end user. And when we do produce software it’s of absolutely no value until it’s released and public.

Lean is about the value stream

Once you’ve identified the value you need to map the value stream.

Let’s suppose our value is in producing software that the end user finds useful. The value stream runs from the point of idea inception to the point the user can use it. Mapping the value stream is the painful process of tracking exactly what happens to what and who, and how long it takes. This is painful because you need to be very, very honest with yourself, and map with your eyes wide open. In particular you need to make sure you track all the times something goes back for rework, and all the times the thing just sits there waiting for something to happen to it. It’s also painful because it means managers have to understand first hand what their team have to deal with — anyone who thinks they are above that due to demands of time or priorities is going to find this first step very difficult.

A typical Agile process has a backlog of work cued up before the iteration or sprint, and then hopefully delivered at the end. But there will be large gaps between a story being defined and actual development. If an iteration is three weeks then on average there will be a gap of 1.5 weeks between definition and development starting, and if we insist on any story being up to five days’ effort then a typical optimum scenario is that a single story is at least 60% non-work. This is relevant because…

Lean is all about eliminating waste

The primary purpose of lean is to eliminate waste, and I’ve not put it first in the list of things I know because we first need to define value and then map the value stream. Once we’ve done that we can tackle the waste.

Waste comes in many forms — well, seven forms according to lean lore. In manufacturing the most significant of the seven is unused inventory, which means parts that are sitting around in racks and warehouses waiting to be used. Unused inventory not only takes up unnecessary space, it actually masks problems. Here are some examples:

  • the acres of spare parts mask the fact that part production is not in sync with the demand;
  • the fact that different parts are overstocked by different levels means that it’s impossible to tell which ones are most and least over-produced;
  • if a part is only used long after it is produced then quality problems cannot be captured and addressed in time — by the time the quality flaw is discovered there will be many similarly-flawed parts in circulation.

Inventory in manufacturing seems similar to Agile stories that aren’t being worked on: they’ve been specified but are sitting around waiting to be picked off the warehouse rack. And when they’ve been developed they might be sitting around waiting to be released — after all, until it’s released it’s of no value to the customer.

This is where we return to the example above of a story that’s 60% non-work — 60% of its time is just sitting around waiting. The goal is to compress this down to an ideal 0%, but not just because we want to do things quicker. It’s because having sight of something from beginning to end, and not lose sight of it for a second. It’s because being able to focus on something means information isn’t lost, and everyone’s expertise can be brought to bear on it in one pass. If that were to happen then less would be needed to be written down on the story card because the team wouldn’t have to suffer context switching. They could also apply much greater creativity to their work, because they would see exactly why certain things were and were not being specified and contribute alternative or additional ideas without causing confusion or risking repetition.

This idea of taking something from beginning to end in a single pass is called…


In manufacturing this is a big deal. It means rather than having your factory floor as a single production line you create small teams (“cells”) which are responsible for the entire production of each unit (car, lawnmower, etc). If you have huge lathes and paint machines and so on it’s a major change to rearrange the factory floor.

Not so difficult in software development, fortunately — we tend to just have desks and computers, though in any large organisation with centralised functions you always need to win the buy-in of other people.

However, flow comes with a new and serious responsibility for those involved: the cell as a whole is responsible for producing the goods, so they must work together to ensure regular and maximal output. Let me make that concrete…

At no time when I’ve been working with cross-functional teams (software developer, client-side developer, QA, etc) has there been the perfect balance of all roles; we could always have done with one more software developer, or an extra 0.2 of a QA, etc. Much of the time the imbalance is negligible (or quietly welcome), but sometimes it’s very noticeable.  And when a cell is working together on a single deliverable (the car, the lawnmower, the software feature) then it’s up to everyone to help each other. It’s no good the client-side developer producing more widgets to test if the QA can’t keep up. They need to ignore the traditional perceived boundaries created by job titles, reach across to others, and work together to regulate the output.

I said lean is about eliminating waste, and flow helps with that in a way hinted at earlier. Flow increases quality by allowing all participants to see the thing put together from being to end without interruption. This reduces hand-off time, reduces information loss, reduces relearning, and increases knowledge and ownership. The number of times an item needs to go back for rework is reduced, and if it does need to go back then the rework that’s needed is clear and therefore quicker.


Meanwhile, all this work needs to come from somewhere and that’s what “pull” is all about. The principle of lean is to only do what’s needed, and that means only produce something that is a direct response of a specific request, and only when needed.

The distinction is clear in car manufacturing. A car company’s marketing department will devise a special offer on a particular configuration (these seats, those mirrors, any one of the following colour combinations) and the plant will have to manufacture a whole lot of those particular cars ahead of time, but only with a guess (rather than a certainty) about what the demand will be. In the lean world a car is only produced in response to a specific customer’s specific purchase: customer goes into showroom, customer orders car, order triggers build.

That’s how pull works in relation to delivering the product to the customer. But pull also works in relation to building the product inside the factory. The old method is to keep hundreds of each part in store; the lean method is that a part is only provided to the worker when they need it: when they’re running low they signal the need for more, and it’s provided for them. This triggers a chain right back potentially to the supplier of the part, ensuring they are always delivering just enough, no more and no less.

The parallel flawed system in the software world is the product backlog. (We’ll ignore the even worse scenario of waterfall’s detailed planning up front.) Work is prioritised ahead of the sprint and waits to be developed. Requirements can change, even in that gap between prioritisation and development. The consequences aren’t as terrible in the Agile world as in the waterfall world, but it still causes problems: it disrupts the team’s schedule and of course all the effort that went into the planning of the now-deprioritised story is wasted. Even if the requirements remain constant the gap between planning and developing mean knowledge is lost, or conversations need to be repeated, or the requirements turn into mini-waterfall-style requirements specifications.

The lean software alternative is to prioritise the next story only when the team is ready to work on the next story. That means while the team is developing the current feature they don’t have much certainty about what’s coming next. Like the worker in the car plant they have to signal slightly ahead of time that a new story needs to be worked out. Then the Scrum Master/business owner/internal customer needs to get something ready so that when they become available the team can all get together, thrash out the details, and set to work again.

As a manager I’m uncomfortable with this: I can no longer know what’s going into an iteration when the iteration starts. But the definition of value is not “what a manager’s comfortable with”. I will need to find other ways to ensure we can be accountable to the rest of the business, and they will need to be ways which are closely aligned with end user value — and that can only be a good thing.

The combination of flow and pull doesn’t mean the team is only working on one thing at a time. But it does mean that everyone has an equal balance of work at all times. So if Alf’s development work gets passed over to Betty’s testing then Alf and Betty need to make sure that she is expecting to finish her current piece of testing at pretty much the same time as Alf finishes his current piece of development and passes it over to her. Keeping that flow even is really important.

Lean promises the seemingly-impossible

Lean holds out the seemingly-impossible promise of increased productivity and increased quality. But here are some numbers from the literature:

  • The Puget Sound Naval Shipyard, time to prepare a repair document. Originally: 97 days. After a lean workshop and taking action: 26 days. (TTW, p.103)
  • Lantech wrapping machinery, before and after lean. Production throughput time was 16 weeks, became 14 hours to 5 days. Delivered defects per machine was 8, became 0.8. Employee time per machine was 160 hours, became 80 hours. (LT, p.121)
  • Porsche, production of the 911, before and after lean. Initial stamping to final shipping was 6 weeks, became 5 days. Inventory held was 17 days’ worth, became 4.2 days’ worth. Errors from work on the assembly line dropped 55%. (LT p.213)

I do have some concerns about lean software development, but they’re less about lean itself and more about bandwagon-jumping and doing things without really understanding the reasons. Regardless of that, it’s refreshing to find a new way of looking at what seem to be known problems, and making insights which you might not otherwise have found. It’s certainly something I’ll be spending quite a bit more time on.

An ABC of R2: Y is for YAGNI

…which stands for “you ain’t gonna need it” and is an important principle of Agile development, with strong benefits for the business.

The basis for YAGNI stems from a failure common in many software development projects: that when a developer creates a component of a system they tend to give it more flexibility than is immediately necessary, so that it can be reused in more contexts and provide more value. But the result in practice is almost always over-engineering: the component costs more to develop, the suspected additional use rarely materialises, when it does materialise it has requirements that don’t quite match the actual implementation, and the component as a whole is more brittle, less comprehensible to successive developers and more difficult to maintain.

By contrast YAGNI says “you ain’t gonna need it”: develop the absolute minimum to get it working for the immediate problem, and extend it later only if another specific needs arises. This works in Agile development because there is a wider supporting structure which enables changes to be made with minimal cost.

An excellent example of YAGNI in action was our implementation of content workflow. There are lots of things you could do with workflow. You can lock content to ensure that only one person can edit it at a time; you can branch content so that different versions can be worked on simultaneously; one of our senior stakeholders was keen on allowing numbered versions which she had seen elsewhere, allowing individual changes to be tracked and traced, and even allowing someone to switch back to a previous numbered version — but she did say that the feature was almost never used.

What we did in practice was a minimal workflow with minimal cost: we built in the ability to have content as draft and live, and very little more. That was sufficient for the staff who we knew would be using it. It seems like we could have taken the opportunity to build more, in case more people with more needs would use it. But you know what they say — you ain’t gonna need it. In the months following that very early stage in the project the company changed a lot, and among other things this included much more integrated working between editorial teams who were previously designated “web” or “print”… these days that distinction has diminished. The technological impact is that many editorial staff are using an integrated front-end, sitting outside the web system, and that’s where the workflow is handled now.

The industry-wide drive for integrated publishing changed our working environment; if we’d have developed elaborate workflow in the web system it would have been wasted effort, and a large wasted cost for the company.

An ABC of R2: X is for XL

…which was a size of problem we noted but wouldn’t tackle.

When we estimated the work for R2 up front we used t-shirt sizes for each feature: S, M, L and XL. The largest single task the team would tackle was an L, which was the equivalent of five days’ work. We felt this was a good maximum for two reasons: first, it delivered something of value within a reasonable space of time and what should have been one iteration; second, if we could imagine developing it within five days then it was probably sufficiently understood, whereas anything bigger risked being too complex to estimate without a lot more thought.

However, during the time-limited initial planning period we were required to estimate things that needed more analysis than we had time for. These features we labelled XL and gave a nominal equivalent size to: 10 days’ work.

However, we would never actually start a task that was labelled XL. We would always break it down before the event into smaller parts. The aim was that although individual estimates might be over or under, on average they should balance out and the total size of the project should remain constant.

Of course, for XL tasks it was quite possible that a more thoughtful estimate would produce a total that was far too big to be able to be balanced out by other tasks. In these cases we would have to reconsider our options and think about dealing with the problem in a different way — most likely by having a less comprehensive solution. However that didn’t tend to be too much of a problem. In many cases by the time we got to the XL task our stakeholders’ needs and priorities had changed so much that there were a lot of new options and directions that people wanted to explore, and ones that they would not have thought of in those very early planning stages.

This is an example of why planning was important, but very detailed planning would actually have been quite wasteful.

An ABC of R2: P is for pair programming

…which was, and is, a hugely important part of our software development, and something that took a long time to learn to do well.

Pair programming is when two developers sit at one machine and one keyboard to write the software. It’s very difficult to do: the driver has the pressure of someone watching their every move, and the navigator has to be aware of what’s going on because they’ll be asked to take over at any moment and they have a responsibility to keep an eye on the bigger picture. It also makes it a very collaborative process — the pair need to work out together exactly how they’re going to tackle every problem. Mat, leading our architecture team, calls this “keeping each other honest”.

Pairing looks expensive — two people apparently doing something that one could do — but that makes the mistake of thinking that all software is the same and all developers are interchangeable. Here are some of the benefits we’ve found:

  • Developers are more productive. With someone sitting beside you you can’t afford to cruise. You’ve got to be demonstrably on the ball all the time.
  • The quality of the software is much higher. I’ve listened to developers discussing how they should write a particular piece of code, suggesting alternatives and weighing up pros and cons that one individual would never have come up with by themselves.
  • Developers become much more skilled much more quickly. Everyone learns off everyone else.
  • The company’s technology investment is de-risked. Highly specialised knowledge is shared among many people, and doesn’t live just in the head of one person. This also means…
  • …Resourcing projects is easier. Because more people are able to move onto other teams more often, since (a) they are more likely to have the knowledge needed for the new team, and (b) they are less tied to their existing team since they will have shared that knowledge. This last point also means…
  • …Developers have more opportunities to learn new things. They can move onto other teams and new projects, safe that they won’t be constantly called back to their previous team, because they’ll have shared their skills and knowledge.

When a software project is complete the software itself is only just beginning its life, in operation day after day — and in the case of our software, by hundreds of people for many years to come. So that development investment has to ensure the product is of very high quality, and pair programming is part of how we ensure that.