Being Agile and open for a stronger business

A couple of weeks ago I presented at the Agile Business Conference 2009. Thanks to all those at the DSDM Consortium who made it such a great event. The theme of the conference was how Agile can help in adversity (e.g. an economic downturn), and I provided a case study — title above — of how the digital technology team at the Guardian had responded to various organisational changes and challenges. Looking back over the last few years I found I’d learned many lessons from working with my friends and colleagues here. This is the summary of those lessons that I offered to the delegates at ABC09…

Exploit your successes

Our huge rebuild and redesign of guardian.co.uk was a successful project. Once it was done we could have slumped back, exhausted. But instead we used it as a springboard to start additional exciting (and arguably controversial) work. Primarily I’m thinking of the Open Platform, led by Matt McAlister. As I wrote at the time, getting that off the ground was a huge testament to the work and faith of many people in the company, particularly given that it’s a very intangible thing at launch.

Still, the Open Platform is built on top of the work in that earlier project, and in particular on the fact that people recognised the technology success and were prepared to back more.

Sustainable business needs sustainable technology

Much of our technology, particularly on the web side, is developed in-house. On the one hand that can seem expensive, but on the other hand it does mean we have the ability to take it in the direction we see fit. A particular example is the cost of scaling guardian.co.uk. It does cost more money to serve more users, but it doesn’t cost as much as it might because we have control of the technology.

Broadly speaking, since we own the technology we can steer it in line with business needs much more readily.

Right-levelled decision-making builds trust

Let’s make one thing clear first: “Right-level” is absolutely not an acceptable verb phrase, and I only used it because it fitted onto the slide better. Now, to the substance of this…

Any reasonably sizable project is approved to go (or not go) by a group of directors and other senior staff, and they do this at a regular meeting on the basis of information provided by the technology team working with many people around the business. This information of course consists of timescales, costs and benefits, but also what other projects are in progress and waiting in the wings.

This may seem obvious, but the consequence is that in the event of heavy cost control (which is what happens when the economy takes a dive) that cost control is exactly in line with project expenditure, because it’s the same people who are making both kinds of decisions. It should be said that this is supported by the Agile principle of delivering value, and delivering frequently, because any project can start demonstrating its value early.

Openness reduces costs and provides options

The Guardian has been open not only with the Open Platform, but also with our full content RSS feeds. Applications have been built on the latter simply because they are a known format and, with full content, have huge utility. By having open full content RSS feeds we have a means of others using our content, and if they (or we) choose that option there are no internal integration costs.

Transparency builds trust

We do a lot of internal reporting: what features we release, our current bug count, project costs, and so on. When times get tough people start asking what value you’re adding to the business, and where all that money that’s being spent on your team is actually going. Because of all our reporting it’s relatively easy to open up our books to the consultants with financial questions, and then we can have an intelligent debate about the value of that expenditure. But that’s fine, and something we tend to welcome, because it’s part of the culture that Agile fosters: within the team we debate every week the value of what we do — that’s what Agile planning is all about — and Agile retrospectives are all about seeking improvement.

And finally…

I recognise that saying all this may be a hostage to fortune: the credit crunch took hold only about a year ago, and we’re not out of the woods yet. But for now I can look back on some of the things that have gone well, and if we emerge from this economic downturn with only a few cuts and bruises then it will be in no small part due to honesty and openness creating trust, which in turn leads to healthier decision-making.

A few things I know about lean

I’ve been reading a bit about lean working recently, and this is a bit about what I’ve learned.

Lean software development is a fascinating step on from Agile, but its history is in manufacturing cars and to date I’ve only been reading up on lean manufacturing, not lean development. There are two reasons for this. First, I’d feel much more confident about lean if I knew the background and reasons for things; in other words, I want to derive lean software development from first principles. Second, I was rather cautious, if not downright sceptical, that something could be translated from manufacturing to software. So if there’s any translating to be done then I want to do it myself, or at least understand someone else’s translation of some of the processes sufficiently to trust the rest of their translations.

This, then, is some of what I’ve learnt about lean working so far, based on manufacturing and with my own projections onto software development. It comes with the caveats that it’s partial and not informed by 99.9% of the work of Mary and Tom Poppendieck, the acknowledged leaders in the field of lean software development. But it is informed by my reading of two fabulous books: The Toyota Way by Jeffrey K. Liker (TTW), and Lean Thinking by James P. Womack and Daniel T. Jones (LT). My interest in cars registers on no scale visible to the human eye, but learning about car production from these authors has been quite eye-opening.

Lean challenges common sense

The “common sense” way of putting a complicated thing together is to divide the manufacturing process into pieces (the engine, the chassis, the interior, the shell), create a specialist team for each area with their specialist tools, then have them work in a long production line. The result is a very long production line with each product moving from one specialist team to another as it’s built up from beginning to end. If you’re going to be really efficient then you might batch up all the products of one type and send them through in one go, then switch to another product of another type and send those through.

Lean says this is wrong. It’s not trying to be deliberately challenging or quirky or iconoclastic just for the sake of making a name for its founders. Rather, it says this is wrong because it generates hidden problems. If we look at the very long production line then every one of the products on that line is in an incomplete state and hiding a multitude of problems. And if you were to implement that single-product-batching it would be even worse.

Lean says the most efficient way of doing something is to create it from beginning to end in a single pass. The single long production line transforms into many, much smaller, teams that are each responsible for the entirety of the production of each product.

This is counter-intuitive. Did we learn nothing from Henry Ford? Well, the evidence — and there is plenty of it — says otherwise.

But I’m getting ahead of myself. Let’s start at the beginning…

Lean is about producing value

This is pretty unstartling and almost uninspiring. But it turns out that what constitutes value is often hard to define. First you have to ask “value to whom?” Often in large organisations people are set targets of something which is not the main goal of the organisation, and this means you end up optimising your processes for the wrong thing. For example, the paint team might be targeted on how quickly they can paint a car body before moving onto the next one. But that would be misleading, and might explain why there’s a retouch team further up the line to cover up the paint problems discovered later. A car company is not about producing painted bodies, it’s about producing cars, and focusing a team on painting ensures they miss the big picture.

The Guardian software team produces software, so it’s easy to see that’s where our value lies and we should optimise our processes around software. But on second thoughts I’m not entirely sure… Perhaps value should be measured from the point of view of the end user. The better we provide news and information the greater the value of what we do, and merely producing software is no good if it doesn’t benefit the end user. And when we do produce software it’s of absolutely no value until it’s released and public.

Lean is about the value stream

Once you’ve identified the value you need to map the value stream.

Let’s suppose our value is in producing software that the end user finds useful. The value stream runs from the point of idea inception to the point the user can use it. Mapping the value stream is the painful process of tracking exactly what happens to what and who, and how long it takes. This is painful because you need to be very, very honest with yourself, and map with your eyes wide open. In particular you need to make sure you track all the times something goes back for rework, and all the times the thing just sits there waiting for something to happen to it. It’s also painful because it means managers have to understand first hand what their team have to deal with — anyone who thinks they are above that due to demands of time or priorities is going to find this first step very difficult.

A typical Agile process has a backlog of work queued up before the iteration or sprint, and then hopefully delivered at the end. But there will be large gaps between a story being defined and actual development. If an iteration is three weeks then on average there will be a gap of 1.5 weeks between definition and development starting, and if we insist on any story being up to five days’ effort then a typical optimum scenario is that a single story is at least 60% non-work. This is relevant because…

Lean is all about eliminating waste

The primary purpose of lean is to eliminate waste, and I’ve not put it first in the list of things I know because we first need to define value and then map the value stream. Once we’ve done that we can tackle the waste.

Waste comes in many forms — well, seven forms according to lean lore. In manufacturing the most significant of the seven is unused inventory, which means parts that are sitting around in racks and warehouses waiting to be used. Unused inventory not only takes up unnecessary space, it actually masks problems. Here are some examples:

  • the acres of spare parts mask the fact that part production is not in sync with the demand;
  • the fact that different parts are overstocked by different levels means that it’s impossible to tell which ones are most and least over-produced;
  • if a part is only used long after it is produced then quality problems cannot be captured and addressed in time — by the time the quality flaw is discovered there will be many similarly-flawed parts in circulation.

Inventory in manufacturing seems similar to Agile stories that aren’t being worked on: they’ve been specified but are sitting around waiting to be picked off the warehouse rack. And when they’ve been developed they might be sitting around waiting to be released — after all, until it’s released it’s of no value to the customer.
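To put a number on that waiting, here is a minimal sketch in Python of the arithmetic behind the 60% figure from the three-week iteration example. It assumes five working days in a week and that a story is, on average, defined halfway through the preceding iteration; the function name and its parameters are mine, purely for illustration.

```python
def waiting_fraction(iteration_weeks: float,
                     story_effort_days: float,
                     working_days_per_week: int = 5) -> float:
    """Fraction of a story's elapsed time spent waiting for development to start.

    Assumption: stories are defined evenly through the preceding iteration,
    so the average wait is half an iteration.
    """
    average_wait_days = iteration_weeks * working_days_per_week / 2
    return average_wait_days / (average_wait_days + story_effort_days)


# Three-week iteration, story of five days' effort:
# 7.5 days waiting out of 12.5 days elapsed.
fraction = waiting_fraction(iteration_weeks=3, story_effort_days=5)
print(f"{fraction:.0%}")  # prints "60%"
```

A smaller story, or any further wait between development finishing and release, pushes the figure higher, which is why 60% is a lower bound.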

This is where we return to the example above of a story that’s 60% non-work: 60% of its time is spent just sitting around waiting. The goal is to compress this down to an ideal 0%, but not just because we want to do things quicker. It’s because we want to have sight of something from beginning to end, and not lose sight of it for a second. It’s because being able to focus on something means information isn’t lost, and everyone’s expertise can be brought to bear on it in one pass. If that were to happen then less would need to be written down on the story card because the team wouldn’t have to suffer context switching. They could also apply much greater creativity to their work, because they would see exactly why certain things were and were not being specified and contribute alternative or additional ideas without causing confusion or risking repetition.

This idea of taking something from beginning to end in a single pass is called…

Flow

In manufacturing this is a big deal. It means rather than having your factory floor as a single production line you create small teams (“cells”) which are responsible for the entire production of each unit (car, lawnmower, etc). If you have huge lathes and paint machines and so on it’s a major change to rearrange the factory floor.

Not so difficult in software development, fortunately — we tend to just have desks and computers, though in any large organisation with centralised functions you always need to win the buy-in of other people.

However, flow comes with a new and serious responsibility for those involved: the cell as a whole is responsible for producing the goods, so they must work together to ensure regular and maximal output. Let me make that concrete…

At no time when I’ve been working with cross-functional teams (software developer, client-side developer, QA, etc) has there been the perfect balance of all roles; we could always have done with one more software developer, or an extra 0.2 of a QA, etc. Much of the time the imbalance is negligible (or quietly welcome), but sometimes it’s very noticeable. And when a cell is working together on a single deliverable (the car, the lawnmower, the software feature) then it’s up to everyone to help each other. It’s no good the client-side developer producing more widgets to test if the QA can’t keep up. They need to ignore the traditional perceived boundaries created by job titles, reach across to others, and work together to regulate the output.

I said lean is about eliminating waste, and flow helps with that in a way hinted at earlier. Flow increases quality by allowing all participants to see the thing put together from beginning to end without interruption. This reduces hand-off time, reduces information loss, reduces relearning, and increases knowledge and ownership. The number of times an item needs to go back for rework is reduced, and if it does need to go back then the rework that’s needed is clear and therefore quicker.

Pull

Meanwhile, all this work needs to come from somewhere and that’s what “pull” is all about. The principle of lean is to only do what’s needed, and that means only produce something that is a direct response to a specific request, and only when needed.

The distinction is clear in car manufacturing. A car company’s marketing department will devise a special offer on a particular configuration (these seats, those mirrors, any one of the following colour combinations) and the plant will have to manufacture a whole lot of those particular cars ahead of time, but only with a guess (rather than a certainty) about what the demand will be. In the lean world a car is only produced in response to a specific customer’s specific purchase: customer goes into showroom, customer orders car, order triggers build.

That’s how pull works in relation to delivering the product to the customer. But pull also works in relation to building the product inside the factory. The old method is to keep hundreds of each part in store; the lean method is that a part is only provided to the worker when they need it: when they’re running low they signal the need for more, and it’s provided for them. This triggers a chain right back potentially to the supplier of the part, ensuring they are always delivering just enough, no more and no less.

The parallel flawed system in the software world is the product backlog. (We’ll ignore the even worse scenario of waterfall’s detailed planning up front.) Work is prioritised ahead of the sprint and waits to be developed. Requirements can change, even in that gap between prioritisation and development. The consequences aren’t as terrible in the Agile world as in the waterfall world, but it still causes problems: it disrupts the team’s schedule and of course all the effort that went into the planning of the now-deprioritised story is wasted. Even if the requirements remain constant the gap between planning and developing means knowledge is lost, or conversations need to be repeated, or the requirements turn into mini-waterfall-style requirements specifications.

The lean software alternative is to prioritise the next story only when the team is ready to work on the next story. That means while the team is developing the current feature they don’t have much certainty about what’s coming next. Like the worker in the car plant they have to signal slightly ahead of time that a new story needs to be worked out. Then the Scrum Master/business owner/internal customer needs to get something ready so that when they become available the team can all get together, thrash out the details, and set to work again.
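As a rough illustration of the difference, the sketch below models a pull-style board in Python. Everything in it (the PullBoard class, the Story fields, the numeric value scores) is hypothetical and only there to show the mechanism: nothing is ranked until the team signals it is ready, so a change of mind while the team is busy costs nothing.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Story:
    title: str
    value: int  # hypothetical score for end-user value; higher is better


@dataclass
class PullBoard:
    """Candidate stories are held unranked until the team asks for work."""
    candidates: List[Story] = field(default_factory=list)

    def revalue(self, title: str, value: int) -> None:
        # The business can change its mind at any time before the pull.
        for story in self.candidates:
            if story.title == title:
                story.value = value

    def pull_next(self) -> Optional[Story]:
        # Prioritisation happens here, at the moment of the team's signal.
        if not self.candidates:
            return None
        best = max(self.candidates, key=lambda story: story.value)
        self.candidates.remove(best)
        return best


board = PullBoard([Story("search", 5), Story("comments", 3)])
board.revalue("comments", 8)    # requirements change while the team is busy
print(board.pull_next().title)  # prints "comments"
```

Had the two stories been ranked at the start of a sprint, "search" would have been planned first and that planning effort wasted when priorities changed.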

As a manager I’m uncomfortable with this: I can no longer know what’s going into an iteration when the iteration starts. But the definition of value is not “what a manager’s comfortable with”. I will need to find other ways to ensure we can be accountable to the rest of the business, and they will need to be ways which are closely aligned with end user value — and that can only be a good thing.

The combination of flow and pull doesn’t mean the team is only working on one thing at a time. But it does mean that everyone has an equal balance of work at all times. So if Alf’s development work gets passed over to Betty’s testing then Alf and Betty need to make sure that she is expecting to finish her current piece of testing at pretty much the same time as Alf finishes his current piece of development and passes it over to her. Keeping that flow even is really important.

Lean promises the seemingly-impossible

Lean holds out the seemingly-impossible promise of increased productivity and increased quality. But here are some numbers from the literature:

  • The Puget Sound Naval Shipyard, time to prepare a repair document. Originally: 97 days. After a lean workshop and taking action: 26 days. (TTW, p.103)
  • Lantech wrapping machinery, before and after lean. Production throughput time was 16 weeks, became 14 hours to 5 days. Delivered defects per machine was 8, became 0.8. Employee time per machine was 160 hours, became 80 hours. (LT, p.121)
  • Porsche, production of the 911, before and after lean. Initial stamping to final shipping was 6 weeks, became 5 days. Inventory held was 17 days’ worth, became 4.2 days’ worth. Errors from work on the assembly line dropped 55%. (LT p.213)
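To make those figures easier to compare, here is a small Python sketch that turns some of the before-and-after pairs above into percentage reductions. It only uses pairs quoted in the same unit; the Lantech throughput (a range) and the Porsche stamping-to-shipping time (weeks versus days) are left out rather than guess at a conversion. The labels and the reduction helper are my own naming.

```python
def reduction(before: float, after: float) -> float:
    """Proportional reduction from 'before' to 'after' (0.5 means halved)."""
    return (before - after) / before


# (label, before, after), with values taken from the figures quoted above
results = [
    ("Puget Sound repair document, days", 97, 26),
    ("Lantech delivered defects per machine", 8, 0.8),
    ("Lantech employee hours per machine", 160, 80),
    ("Porsche inventory held, days' worth", 17, 4.2),
]

for label, before, after in results:
    print(f"{label}: {reduction(before, after):.0%} lower")
```

On these numbers the reductions range from a half to nine-tenths.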

I do have some concerns about lean software development, but they’re less about lean itself and more about bandwagon-jumping and doing things without really understanding the reasons. Regardless of that, it’s refreshing to find a new way of looking at what seem to be known problems, and making insights which you might not otherwise have found. It’s certainly something I’ll be spending quite a bit more time on.

An ABC of R2: Y is for YAGNI

…which stands for “you ain’t gonna need it” and is an important principle of Agile development, with strong benefits for the business.

The basis for YAGNI stems from a failure common in many software development projects: that when a developer creates a component of a system they tend to give it more flexibility than is immediately necessary, so that it can be reused in more contexts and provide more value. But the result in practice is almost always over-engineering: the component costs more to develop, the suspected additional use rarely materialises, when it does materialise it has requirements that don’t quite match the actual implementation, and the component as a whole is more brittle, less comprehensible to successive developers and more difficult to maintain.

By contrast YAGNI says “you ain’t gonna need it”: develop the absolute minimum to get it working for the immediate problem, and extend it later only if another specific need arises. This works in Agile development because there is a wider supporting structure which enables changes to be made with minimal cost.

An excellent example of YAGNI in action was our implementation of content workflow. There are lots of things you could do with workflow. You can lock content to ensure that only one person can edit it at a time; you can branch content so that different versions can be worked on simultaneously; one of our senior stakeholders was keen on allowing numbered versions which she had seen elsewhere, allowing individual changes to be tracked and traced, and even allowing someone to switch back to a previous numbered version — but she did say that the feature was almost never used.

What we did in practice was a minimal workflow with minimal cost: we built in the ability to have content as draft and live, and very little more. That was sufficient for the staff who we knew would be using it. It seems like we could have taken the opportunity to build more, in case more people with more needs would use it. But you know what they say — you ain’t gonna need it. In the months following that very early stage in the project the company changed a lot, and among other things this included much more integrated working between editorial teams who were previously designated “web” or “print”… these days that distinction has diminished. The technological impact is that many editorial staff are using an integrated front-end, sitting outside the web system, and that’s where the workflow is handled now.

The industry-wide drive for integrated publishing changed our working environment; if we had developed elaborate workflow in the web system it would have been wasted effort, and a large wasted cost for the company.
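For illustration only, a draft-and-live workflow of the kind described above might look something like the Python sketch below. This is not the Guardian's actual implementation; the Article and Status names, the fields and the unpublish method are all assumptions made for the example. The point is what is absent: no locking, no branching, no numbered versions.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    LIVE = "live"


@dataclass
class Article:
    headline: str
    body: str
    status: Status = Status.DRAFT  # new content starts as a draft

    def publish(self) -> None:
        self.status = Status.LIVE

    def unpublish(self) -> None:
        # Assumed for the sketch: live content can go back to draft.
        self.status = Status.DRAFT


article = Article("Being Agile and open", "Body text goes here")
article.publish()
print(article.status.value)  # prints "live"
```

Something this small is cheap to throw away if, as happened here, the workflow ends up being handled somewhere else.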