Category: Agile

A few things I know about lean

I’ve been reading a bit about lean working recently, and this is a bit about what I’ve learned.

Lean software development is a fascinating step on from Agile, but its history is in manufacturing cars and to date I’ve only been reading up on lean manufacturing, not lean development. There are two reasons for this. First, I’d feel much more confident about lean if I know the background and reasons for things — in other words, I want to derive lean software development from first principles. Second, I was rather cautious, if not downright sceptical, that something can be translated from manufacturing to software. So if there’s any translating to be done then I want to do it myself — or at least understand someone else’s translation of some of the processes sufficiently to trust the rest of their translations.

This, then, is some of what I’ve learnt about lean working so far, based on manufacturing and with my own projections onto software development. It comes with the caveats that it’s partial and not informed by 99.9% of the work Mary and Tom Poppendieck, the acknowledged leaders in the field of lean software development. But it is informed by my reading of two fabulous books: The Toyota Way by Jeffrey K. Liker (TTW), and Lean Thinking by James P. Womack and Daniel T. Jones (LT). My interest in cars registers on no scale visible to the human eye, but learning about car production from these authors has been quite eye-opening.

In this post:

Lean challenges common sense

The “common sense” way of putting a complicated thing together is to divide the manufacturing process into pieces (the engine, the chassis, the interior, the shell), create a specialist team for each area with their specialist tools, then have them work in a long production line. The result is a very long production line with each product moving from one specialist team to another as it’s built up from beginning to end. If you’re going to be really efficient then you might batch up all the products of one type and send them through in one go, then switch to another product of another type and send those through.

Lean says this is wrong. It’s not trying to be deliberately challenging or quirky or iconclastic just for the sake of making a name for its founders. Rather, it says this is wrong because it generates hidden problems. If we look at the very long production line then every one of the products on that line is in an incomplete state and hiding a multitude of problems. And if you were to implement that single-product-batching it would be even worse.

Lean says the most efficient way of doing something is to create it from beginning to end in a single pass. The single long production line transforms into many, much smaller, teams that are each responsible for the entirety of the production of each product.

This is counter-intuitive. Did we learn nothing from Henry Ford? Well, the evidence — and there is plenty of it — says otherwise.

But I’m getting ahead of myself. Let’s start at the beginning…

Lean is about producing value

This is pretty unstartling and almost uninspiring. But it turns out that what constitutes value is often hard to define. First you have to ask “value to whom?” Often in large organisations people are set targets of something which is not the main goal of the organisation, and this means you end up optimising your processes for the wrong thing. For example, the paint team might be targeted on how quickly they can paint a car body before moving onto the next one. But that would be misleading, and might explain why there’s a retouch team further up the line to cover up the paint problems discovered later. A car company is not about producing painted bodies, it’s about producing cars, and focusing a team on painting ensures they miss the big picture.

The Guardian software team produces software, so it’s easy to see that’s where our value lies and we should optimise our processes around software. But on second thoughts I’m not entirely sure… Perhaps value should be measured from the point of view of the end user. The better we provide news and information the greater the value of what we do, and merely producing software is no good if it doesn’t benefit the end user. And when we do produce software it’s of absolutely no value until it’s released and public.

Lean is about the value stream

Once you’ve identified the value you need to map the value stream.

Let’s suppose our value is in producing software that the end user finds useful. The value stream runs from the point of idea inception to the point the user can use it. Mapping the value stream is the painful process of tracking exactly what happens to what and who, and how long it takes. This is painful because you need to be very, very honest with yourself, and map with your eyes wide open. In particular you need to make sure you track all the times something goes back for rework, and all the times the thing just sits there waiting for something to happen to it. It’s also painful because it means managers have to understand first hand what their team have to deal with — anyone who thinks they are above that due to demands of time or priorities is going to find this first step very difficult.

A typical Agile process has a backlog of work cued up before the iteration or sprint, and then hopefully delivered at the end. But there will be large gaps between a story being defined and actual development. If an iteration is three weeks then on average there will be a gap of 1.5 weeks between definition and development starting, and if we insist on any story being up to five days’ effort then a typical optimum scenario is that a single story is at least 60% non-work. This is relevant because…

Lean is all about eliminating waste

The primary purpose of lean is to eliminate waste, and I’ve not put it first in the list of things I know because we first need to define value and then map the value stream. Once we’ve done that we can tackle the waste.

Waste comes in many forms — well, seven forms according to lean lore. In manufacturing the most significant of the seven is unused inventory, which means parts that are sitting around in racks and warehouses waiting to be used. Unused inventory not only takes up unnecessary space, it actually masks problems. Here are some examples:

  • the acres of spare parts mask the fact that part production is not in sync with the demand;
  • the fact that different parts are overstocked by different levels means that it’s impossible to tell which ones are most and least over-produced;
  • if a part is only used long after it is produced then quality problems cannot be captured and addressed in time — by the time the quality flaw is discovered there will be many similarly-flawed parts in circulation.

Inventory in manufacturing seems similar to Agile stories that aren’t being worked on: they’ve been specified but are sitting around waiting to be picked off the warehouse rack. And when they’ve been developed they might be sitting around waiting to be released — after all, until it’s released it’s of no value to the customer.

This is where we return to the example above of a story that’s 60% non-work — 60% of its time is just sitting around waiting. The goal is to compress this down to an ideal 0%, but not just because we want to do things quicker. It’s because having sight of something from beginning to end, and not lose sight of it for a second. It’s because being able to focus on something means information isn’t lost, and everyone’s expertise can be brought to bear on it in one pass. If that were to happen then less would be needed to be written down on the story card because the team wouldn’t have to suffer context switching. They could also apply much greater creativity to their work, because they would see exactly why certain things were and were not being specified and contribute alternative or additional ideas without causing confusion or risking repetition.

This idea of taking something from beginning to end in a single pass is called…

Flow

In manufacturing this is a big deal. It means rather than having your factory floor as a single production line you create small teams (“cells”) which are responsible for the entire production of each unit (car, lawnmower, etc). If you have huge lathes and paint machines and so on it’s a major change to rearrange the factory floor.

Not so difficult in software development, fortunately — we tend to just have desks and computers, though in any large organisation with centralised functions you always need to win the buy-in of other people.

However, flow comes with a new and serious responsibility for those involved: the cell as a whole is responsible for producing the goods, so they must work together to ensure regular and maximal output. Let me make that concrete…

At no time when I’ve been working with cross-functional teams (software developer, client-side developer, QA, etc) has there been the perfect balance of all roles; we could always have done with one more software developer, or an extra 0.2 of a QA, etc. Much of the time the imbalance is negligible (or quietly welcome), but sometimes it’s very noticeable.  And when a cell is working together on a single deliverable (the car, the lawnmower, the software feature) then it’s up to everyone to help each other. It’s no good the client-side developer producing more widgets to test if the QA can’t keep up. They need to ignore the traditional perceived boundaries created by job titles, reach across to others, and work together to regulate the output.

I said lean is about eliminating waste, and flow helps with that in a way hinted at earlier. Flow increases quality by allowing all participants to see the thing put together from being to end without interruption. This reduces hand-off time, reduces information loss, reduces relearning, and increases knowledge and ownership. The number of times an item needs to go back for rework is reduced, and if it does need to go back then the rework that’s needed is clear and therefore quicker.

Pull

Meanwhile, all this work needs to come from somewhere and that’s what “pull” is all about. The principle of lean is to only do what’s needed, and that means only produce something that is a direct response of a specific request, and only when needed.

The distinction is clear in car manufacturing. A car company’s marketing department will devise a special offer on a particular configuration (these seats, those mirrors, any one of the following colour combinations) and the plant will have to manufacture a whole lot of those particular cars ahead of time, but only with a guess (rather than a certainty) about what the demand will be. In the lean world a car is only produced in response to a specific customer’s specific purchase: customer goes into showroom, customer orders car, order triggers build.

That’s how pull works in relation to delivering the product to the customer. But pull also works in relation to building the product inside the factory. The old method is to keep hundreds of each part in store; the lean method is that a part is only provided to the worker when they need it: when they’re running low they signal the need for more, and it’s provided for them. This triggers a chain right back potentially to the supplier of the part, ensuring they are always delivering just enough, no more and no less.

The parallel flawed system in the software world is the product backlog. (We’ll ignore the even worse scenario of waterfall’s detailed planning up front.) Work is prioritised ahead of the sprint and waits to be developed. Requirements can change, even in that gap between prioritisation and development. The consequences aren’t as terrible in the Agile world as in the waterfall world, but it still causes problems: it disrupts the team’s schedule and of course all the effort that went into the planning of the now-deprioritised story is wasted. Even if the requirements remain constant the gap between planning and developing mean knowledge is lost, or conversations need to be repeated, or the requirements turn into mini-waterfall-style requirements specifications.

The lean software alternative is to prioritise the next story only when the team is ready to work on the next story. That means while the team is developing the current feature they don’t have much certainty about what’s coming next. Like the worker in the car plant they have to signal slightly ahead of time that a new story needs to be worked out. Then the Scrum Master/business owner/internal customer needs to get something ready so that when they become available the team can all get together, thrash out the details, and set to work again.

As a manager I’m uncomfortable with this: I can no longer know what’s going into an iteration when the iteration starts. But the definition of value is not “what a manager’s comfortable with”. I will need to find other ways to ensure we can be accountable to the rest of the business, and they will need to be ways which are closely aligned with end user value — and that can only be a good thing.

The combination of flow and pull doesn’t mean the team is only working on one thing at a time. But it does mean that everyone has an equal balance of work at all times. So if Alf’s development work gets passed over to Betty’s testing then Alf and Betty need to make sure that she is expecting to finish her current piece of testing at pretty much the same time as Alf finishes his current piece of development and passes it over to her. Keeping that flow even is really important.

Lean promises the seemingly-impossible

Lean holds out the seemingly-impossible promise of increased productivity and increased quality. But here are some numbers from the literature:

  • The Puget Sound Naval Shipyard, time to prepare a repair document. Originally: 97 days. After a lean workshop and taking action: 26 days. (TTW, p.103)
  • Lantech wrapping machinery, before and after lean. Production throughput time was 16 weeks, became 14 hours to 5 days. Delivered defects per machine was 8, became 0.8. Employee time per machine was 160 hours, became 80 hours. (LT, p.121)
  • Porsche, production of the 911, before and after lean. Initial stamping to final shipping was 6 weeks, became 5 days. Inventory held was 17 days’ worth, became 4.2 days’ worth. Errors from work on the assembly line dropped 55%. (LT p.213)

I do have some concerns about lean software development, but they’re less about lean itself and more about bandwagon-jumping and doing things without really understanding the reasons. Regardless of that, it’s refreshing to find a new way of looking at what seem to be known problems, and making insights which you might not otherwise have found. It’s certainly something I’ll be spending quite a bit more time on.

An ABC of R2: Y is for YAGNI

…which stands for “you ain’t gonna need it” and is an important principle of Agile development, with strong benefits for the business.

The basis for YAGNI stems from a failure common in many software development projects: that when a developer creates a component of a system they tend to give it more flexibility than is immediately necessary, so that it can be reused in more contexts and provide more value. But the result in practice is almost always over-engineering: the component costs more to develop, the suspected additional use rarely materialises, when it does materialise it has requirements that don’t quite match the actual implementation, and the component as a whole is more brittle, less comprehensible to successive developers and more difficult to maintain.

By contrast YAGNI says “you ain’t gonna need it”: develop the absolute minimum to get it working for the immediate problem, and extend it later only if another specific needs arises. This works in Agile development because there is a wider supporting structure which enables changes to be made with minimal cost.

An excellent example of YAGNI in action was our implementation of content workflow. There are lots of things you could do with workflow. You can lock content to ensure that only one person can edit it at a time; you can branch content so that different versions can be worked on simultaneously; one of our senior stakeholders was keen on allowing numbered versions which she had seen elsewhere, allowing individual changes to be tracked and traced, and even allowing someone to switch back to a previous numbered version — but she did say that the feature was almost never used.

What we did in practice was a minimal workflow with minimal cost: we built in the ability to have content as draft and live, and very little more. That was sufficient for the staff who we knew would be using it. It seems like we could have taken the opportunity to build more, in case more people with more needs would use it. But you know what they say — you ain’t gonna need it. In the months following that very early stage in the project the company changed a lot, and among other things this included much more integrated working between editorial teams who were previously designated “web” or “print”… these days that distinction has diminished. The technological impact is that many editorial staff are using an integrated front-end, sitting outside the web system, and that’s where the workflow is handled now.

The industry-wide drive for integrated publishing changed our working environment; if we’d have developed elaborate workflow in the web system it would have been wasted effort, and a large wasted cost for the company.

An ABC of R2: X is for XL

…which was a size of problem we noted but wouldn’t tackle.

When we estimated the work for R2 up front we used t-shirt sizes for each feature: S, M, L and XL. The largest single task the team would tackle was an L, which was the equivalent of five days’ work. We felt this was a good maximum for two reasons: first, it delivered something of value within a reasonable space of time and what should have been one iteration; second, if we could imagine developing it within five days then it was probably sufficiently understood, whereas anything bigger risked being too complex to estimate without a lot more thought.

However, during the time-limited initial planning period we were required to estimate things that needed more analysis than we had time for. These features we labelled XL and gave a nominal equivalent size to: 10 days’ work.

However, we would never actually start a task that was labelled XL. We would always break it down before the event into smaller parts. The aim was that although individual estimates might be over or under, on average they should balance out and the total size of the project should remain constant.

Of course, for XL tasks it was quite possible that a more thoughtful estimate would produce a total that was far too big to be able to be balanced out by other tasks. In these cases we would have to reconsider our options and think about dealing with the problem in a different way — most likely by having a less comprehensive solution. However that didn’t tend to be too much of a problem. In many cases by the time we got to the XL task our stakeholders’ needs and priorities had changed so much that there were a lot of new options and directions that people wanted to explore, and ones that they would not have thought of in those very early planning stages.

This is an example of why planning was important, but very detailed planning would actually have been quite wasteful.

An ABC of R2: P is for pair programming

…which was, and is, a hugely important part of our software development, and something that took a long time to learn to do well.

Pair programming is when two developers sit at one machine and one keyboard to write the software. It’s very difficult to do: the driver has the pressure of someone watching their every move, and the navigator has to be aware of what’s going on because they’ll be asked to take over at any moment and they have a responsibility to keep an eye on the bigger picture. It also makes it a very collaborative process — the pair need to work out together exactly how they’re going to tackle every problem. Mat, leading our architecture team, calls this “keeping each other honest”.

Pairing looks expensive — two people apparently doing something that one could do — but that makes the mistake of thinking that all software is the same and all developers are interchangeable. Here are some of the benefits we’ve found:

  • Developers are more productive. With someone sitting beside you you can’t afford to cruise. You’ve got to be demonstrably on the ball all the time.
  • The quality of the software is much higher. I’ve listened to developers discussing how they should write a particular piece of code, suggesting alternatives and weighing up pros and cons that one individual would never have come up with by themselves.
  • Developers become much more skilled much more quickly. Everyone learns off everyone else.
  • The company’s technology investment is de-risked. Highly specialised knowledge is shared among many people, and doesn’t live just in the head of one person. This also means…
  • …Resourcing projects is easier. Because more people are able to move onto other teams more often, since (a) they are more likely to have the knowledge needed for the new team, and (b) they are less tied to their existing team since they will have shared that knowledge. This last point also means…
  • …Developers have more opportunities to learn new things. They can move onto other teams and new projects, safe that they won’t be constantly called back to their previous team, because they’ll have shared their skills and knowledge.

When a software project is complete the software itself is only just beginning its life, in operation day after day — and in the case of our software, by hundreds of people for many years to come. So that development investment has to ensure the product is of very high quality, and pair programming is part of how we ensure that.

An ABC of R2: I is for iterations

…which lasted two weeks and culminated in a full new release of our software. I’ve written elsewhere about what happened in one week of a particular iteration in June 2008.

However, our R2 iterations didn’t just involve implementing software. At the same time each team was also working with a business analyst and end-users to plan and clarify the work for the next iteration, and sometimes getting the last iteration’s work through its systems tests prior to release. As one of the team leads said to me, “we work in three timezones”.

At the end of an iteration the release went out. But our releases didn’t always reveal much that was different — either because we were waiting to reveal it as part of a forthcoming launch, or because it was part of our internally-facing systems. That can be an odd feeling for the development teams. In previous lives I’ve worked on projects where the launch has the whole team working right up to the last moment and so also signals the time when you can collapse with exhaustion. But our interleaving of “timezones”, and releases well ahead of launches, changed that. The launch of, say, the Sports section was a big moment for so many in the company, but most of the people on the development team had finished the work some time before and were spending more time thinking about the launch that followed.

An ABC of R2: C is for changing requirements

…which are a fact of life — certainly if your life revolves around developing software. During R2 there was a 40% churn on requirements. That means by the end of the project 40% of the work we had done had not appeared in our initial plan — some things were dumped, new things were introduced, and much was significantly altered.

The Agile way is to embrace change, and more than that it’s to structure your working practices around enabling and encouraging it. There are several ways we make this happen. One is to make sure that every task has tangible value to the end-user. Another is to prioritise the tasks so that the highest value work is delivered first. The second follows from the first: only if the tasks are individually valuable can they be arranged and rearranged.

One example of handling changing requirements is how we introduced keywords. We always believed these were important to our work, so we developed them very early — other elements of our plan were less certain and so might change, but we could safely defer such decisions until closer to the time we’d have to build them.

Managing keywords would require some slick tools if they were to be rolled out to more than a couple of staff. But our first cut was very primitive: it involved someone typing data into an Excel spreadsheet and then uploading that to a web page which would digest the spreadsheet and either process the data or report errors. It was not user-friendly. But this approach did have many virtues:

  • It was relatively quick to develop;
  • It was sufficient to enable just one person to manage the keywords, which was all that was needed then;
  • It allowed us to build further things based on keywords, such as a little component next to an article which listed the keywords associated with it;
  • It showed up real, unexpected problems arising from real-world use, that we would not have foreseen during planning;
  • It showed us what really was and wasn’t going to be important when we built a slick tool for wider use.

A 40% churn on requirements doesn’t mean there was a 40% budget overrun or time overrun, by the way. We came in on time and on budget. That’s another consequence of having working practices that embrace change rather than merely tolerate it.

An ABC of R2: B is for business analysis

…which means different things to different people. In our case it meant extracting requirements and turning them into something that could be implemented.

Business analysis is often misunderstood when it’s used in an Agile context. Agile people often think it’s not necessary — after all, they say, the analysis is best performed by the developers working closely with the customer as they go along. Many non-Agile people often think the analysis should be done up-front, written down, and then the workload on the business analysts (BAs) eases off as everyone else goes off to implement what they’ve written.

In fact business analysis was one of the most demanding jobs on R2. Analysis was indeed done up-front, at the start of the project, but only to a level sufficient to estimate the work, not to a level sufficient to implement it. Ahead of each fortnightly iteration the BAs would specify the forthcoming fortnight’s tasks in much greater detail, sufficient to be implemented. At the same time they would be dealing with queries about the iteration currently in flight. Thus at any one time they had to deal with work being planned and work being implemented.

In both cases they would be working with a very diverse group of people. For work being detailed the business analyst would take requirements from our business users and present them to the technologists for estimation; for work being implemented they would often take queries from the technologists and present them to the business users for guidance.

You could say the process of business analysis was often like being piggy in the middle, yet the personal skills of the individual business analysts (technical insight, strategic understanding, and not a little diplomacy) was key to the success of the project. The cohesiveness of guardian.co.uk — both from the outside and the inside — is in no small part due to the talents of our business analysts.

What management buy-in really means

We work using agile development processes, which obviously needs the buy-in of internal users and project sponsors. But this jumped out at me when I read it. It’s from Carolyn McCall, Chief Executive of Guardian Media Group, which owns the company I work for. She was announcing a £15m investment in our digital business, and it was reported thus:

“What we’ve done so far is our own version of web 1.0, but we want to continue to web 2.0 and what comes after that,” Ms McCall added. “We need to be agile and ready to change.”

I suspect that phrase, “We need to be agile and ready to change”, is not a coincidence. You can trace a strong lineage between the agile development work going on in Guardian Unlimited and that little sentence in a speech to the OPA. Let me tell you how I think it got there.

A story of a sentence

A long time ago we began using agile processes in the GU tech team. It was a new way of working, but it seemed to address a lot of our problems — actually, the usual ones you get in most software organisations, such as knowledge silos, lack of flexibility, impossible deadlines, and so on. Changing what you know is always a tough choice, but the GU management team were very supportive of the move to agile.

In particular, our boss at GU at the time, Simon Waldman, was very interested in its implementation and evolution. (Simon’s since moved to GMG.) Over the months and years he enquired, queried, provoked, but was never less than encouraging and constructive. (At one memorable annual review to the GU staff Simon devoted an entire slide to explain how we had stealthily removed a problematic database table. Our DBA was thrilled; I have no idea what the attendant sales people and journalists made of it.)

And agile clearly worked for GU. One notable change was that planning meetings took on much more of a business focus. Far less about technical dependencies, much more on what functionality we wanted to release when.

And then we began we work on our new Travel site (released in November). It featured a new design and new commercial opportunties. More people around the company needed to scrutinise this as it started up, because it was a pretty big project and we had a much greater responsibility to be rigorous and transparent in our delivery. We saw agile methods as being critical to the work’s success. It would allow us to delay decisions to the latest possible moment and therefore produce something that was much more relevant — both editorially and commercially — to our audience and clients.

I therefore spent some time explaining agile to various company directors… but not as much as you might expect, because clearly the principles and practices had been discussed and understood well before. I ended up having the kind of conversations about agile development with non-technical senior managers that I wish I could have had with more technical job candidates. Word was getting around. The business people scrutinising the work on the Travel site understood the process’s business benefits, and they understood how the low-level practices would provide those, and they knew those practices were working successfully within GU at the time. And indeed, the project was blessed by the the company directors, including Simon and our MD at the time… Carolyn McCall.

You can see how that word “agile” has been thrown around a lot inside the company, and was — at least then — strongly associated with Guardian Unlimited and therefore its innovation (which I’d like to think is almost synonymous with “daily work” — ahem), and it’s stuck.

Now I can’t claim that Carolyn knows what, say, test-driven development really is. I suspect she has higher level things to think about. But if it came up, I think she’d understand our practices such as refactoring (making lots of tiny internal restructurings to produce a smoother, more manageable system), continuous integration (ensuring our changes are constantly integrated into the day-to-day work, not siloed) and retrospectives (frequent reviews and actions to improve). And, come to think of it, TDD, which is really about supporting and enabling change.

It seems Carolyn understands “agile” in a way that I wish more technical people would. It’s about change, and about supporting change (“We need to be agile and ready to change”).

The hard work starts now

And what of the future? Well, agile development helps us deliver responsibly to the business — and if someone invests £15m in you, then you really need to deliver responsibily. Again, that kind of investment only happens if there’s confidence in your ability to deliver. Can we offer that confidence? Well, just this week I was in a meeting with Tom Turcan, who is our Head of Digital Media Development. The conversation went something like this. Note that development for this particular thing is already about 30% of the way through:

Expert user: “…So in summary we’ve decided that features A, B and C aren’t that important after all, and we’ve replaced them with features J and K.”

Tom: “It looks like A, B, C total 7 ideal days, and J and K total 8.5 ideal days. So that’s a net increase of 1.5 ideal days. Is that right?”

Expert user: “Er, yes.”

Tom: “So we can’t really allow that to happen. We can make changes to make the work less, or keep it at the same level, but we mustn’t start increasing it. Can you find 1.5 days’ worth of features to remove?”

Expert user: “Yes, I should think so.”

Tom: “Okay, we’ll make a note to see what you’ve decided next week.”

We couldn’t have had that conversation in a more waterfall environment. By the time we’d have reached this point the groundwork for A, B and C — the database tables, the DAO layer, etc — would have been done, and it would have been wasted effort. Instead our expert user is able to make decisions later, and Tom is able keep to keep the work under control, giving the rest of the business the confidence it needs. Our agile processes enable that to happen. And all that is known by our developers, and by our managers, and that same confidence is shared by our group Chief Executive.

Postscript

By the way, there are other things going on this week in and around Guardian Unlimited. You might want to take a look at Jeff Jarvis’s commentary on Alan Rusbridger‘s commitment to digital and his interest in being flexible. Also, my esteemed colleague Neil McIntosh tells how GU saved a cat, and offers a word of advice to a researcher at News International.

As a friend said to me the other day, “I know when you’re busy at work — your blog goes quiet.” So back to work now.