Category: Working practices

A few things I know about lean

I’ve been reading a bit about lean working recently, and this is a bit about what I’ve learned.

Lean software development is a fascinating step on from Agile, but its history is in manufacturing cars and to date I’ve only been reading up on lean manufacturing, not lean development. There are two reasons for this. First, I’d feel much more confident about lean if I knew the background and reasons for things — in other words, I want to derive lean software development from first principles. Second, I am rather cautious, if not downright sceptical, that something can be translated from manufacturing to software. So if there’s any translating to be done then I want to do it myself — or at least understand someone else’s translation of some of the processes sufficiently to trust the rest of their translations.

This, then, is some of what I’ve learnt about lean working so far, based on manufacturing and with my own projections onto software development. It comes with the caveats that it’s partial and not informed by 99.9% of the work of Mary and Tom Poppendieck, the acknowledged leaders in the field of lean software development. But it is informed by my reading of two fabulous books: The Toyota Way by Jeffrey K. Liker (TTW), and Lean Thinking by James P. Womack and Daniel T. Jones (LT). My interest in cars registers on no scale visible to the human eye, but learning about car production from these authors has been quite eye-opening.

Lean challenges common sense

The “common sense” way of putting a complicated thing together is to divide the manufacturing process into pieces (the engine, the chassis, the interior, the shell), create a specialist team for each area with their specialist tools, then have them work in a long production line. The result is a very long production line with each product moving from one specialist team to another as it’s built up from beginning to end. If you’re going to be really efficient then you might batch up all the products of one type and send them through in one go, then switch to another product of another type and send those through.

Lean says this is wrong. It’s not trying to be deliberately challenging or quirky or iconoclastic just for the sake of making a name for its founders. Rather, it says this is wrong because it generates hidden problems. If we look at the very long production line then every one of the products on that line is in an incomplete state and hiding a multitude of problems. And if you were to implement that single-product-batching it would be even worse.

Lean says the most efficient way of doing something is to create it from beginning to end in a single pass. The single long production line transforms into many, much smaller, teams that are each responsible for the entirety of the production of each product.

This is counter-intuitive. Did we learn nothing from Henry Ford? Well, the evidence — and there is plenty of it — says otherwise.

But I’m getting ahead of myself. Let’s start at the beginning…

Lean is about producing value

This is pretty unstartling and almost uninspiring. But it turns out that what constitutes value is often hard to define. First you have to ask “value to whom?” Often in large organisations people are set targets of something which is not the main goal of the organisation, and this means you end up optimising your processes for the wrong thing. For example, the paint team might be targeted on how quickly they can paint a car body before moving onto the next one. But that would be misleading, and might explain why there’s a retouch team further up the line to cover up the paint problems discovered later. A car company is not about producing painted bodies, it’s about producing cars, and focusing a team on painting ensures they miss the big picture.

The Guardian software team produces software, so it’s easy to see that’s where our value lies and we should optimise our processes around software. But on second thoughts I’m not entirely sure… Perhaps value should be measured from the point of view of the end user. The better we provide news and information the greater the value of what we do, and merely producing software is no good if it doesn’t benefit the end user. And when we do produce software it’s of absolutely no value until it’s released and public.

Lean is about the value stream

Once you’ve identified the value you need to map the value stream.

Let’s suppose our value is in producing software that the end user finds useful. The value stream runs from the point of idea inception to the point the user can use it. Mapping the value stream is the painful process of tracking exactly what happens to what and who, and how long it takes. This is painful because you need to be very, very honest with yourself, and map with your eyes wide open. In particular you need to make sure you track all the times something goes back for rework, and all the times the thing just sits there waiting for something to happen to it. It’s also painful because it means managers have to understand first hand what their team have to deal with — anyone who thinks they are above that due to demands of time or priorities is going to find this first step very difficult.

A typical Agile process has a backlog of work queued up before the iteration or sprint, and then hopefully delivered at the end. But there will be large gaps between a story being defined and actual development. If an iteration is three weeks then on average there will be a gap of 1.5 weeks between definition and development starting, and if we insist on any story being up to five days’ effort then a typical optimum scenario is that a single story is at least 60% non-work. This is relevant because…
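The 60% figure can be checked with a little arithmetic. This is only a sketch: it assumes a five-day working week and takes the average wait to be half an iteration, as above. The function name is my own invention for illustration.

```python
def waiting_fraction(iteration_weeks: float, story_effort_days: float,
                     working_days_per_week: int = 5) -> float:
    """Fraction of a story's elapsed time spent waiting rather than being worked on.

    Uses the assumption that the average wait before development
    starts is half the length of the iteration.
    """
    average_wait_days = iteration_weeks * working_days_per_week / 2
    total_days = average_wait_days + story_effort_days
    return average_wait_days / total_days


# Three-week iteration, five-day story: 7.5 days waiting out of 12.5 in total.
print(f"{waiting_fraction(3, 5):.0%}")   # 60%
# A smaller two-day story spends even more of its life waiting.
print(f"{waiting_fraction(3, 2):.0%}")   # 79%
```

Five days is the largest story allowed, so anything smaller only pushes the waiting share above 60%.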

Lean is all about eliminating waste

The primary purpose of lean is to eliminate waste, and I’ve not put it first in the list of things I know because we first need to define value and then map the value stream. Once we’ve done that we can tackle the waste.

Waste comes in many forms — well, seven forms according to lean lore. In manufacturing the most significant of the seven is unused inventory, which means parts that are sitting around in racks and warehouses waiting to be used. Unused inventory not only takes up unnecessary space, it actually masks problems. Here are some examples:

  • the acres of spare parts mask the fact that part production is not in sync with the demand;
  • the fact that different parts are overstocked by different levels means that it’s impossible to tell which ones are most and least over-produced;
  • if a part is only used long after it is produced then quality problems cannot be captured and addressed in time — by the time the quality flaw is discovered there will be many similarly-flawed parts in circulation.

Inventory in manufacturing seems similar to Agile stories that aren’t being worked on: they’ve been specified but are sitting around waiting to be picked off the warehouse rack. And when they’ve been developed they might be sitting around waiting to be released — after all, until it’s released it’s of no value to the customer.

This is where we return to the example above of a story that’s 60% non-work — 60% of its time is just sitting around waiting. The goal is to compress this down to an ideal 0%, but not just because we want to do things quicker. It’s because we want to have sight of something from beginning to end, and not lose sight of it for a second. It’s because being able to focus on something means information isn’t lost, and everyone’s expertise can be brought to bear on it in one pass. If that were to happen then less would need to be written down on the story card because the team wouldn’t have to suffer context switching. They could also apply much greater creativity to their work, because they would see exactly why certain things were and were not being specified and contribute alternative or additional ideas without causing confusion or risking repetition.

This idea of taking something from beginning to end in a single pass is called…

Flow

In manufacturing this is a big deal. It means rather than having your factory floor as a single production line you create small teams (“cells”) which are responsible for the entire production of each unit (car, lawnmower, etc). If you have huge lathes and paint machines and so on it’s a major change to rearrange the factory floor.

Not so difficult in software development, fortunately — we tend to just have desks and computers, though in any large organisation with centralised functions you always need to win the buy-in of other people.

However, flow comes with a new and serious responsibility for those involved: the cell as a whole is responsible for producing the goods, so they must work together to ensure regular and maximal output. Let me make that concrete…

At no time when I’ve been working with cross-functional teams (software developer, client-side developer, QA, etc) has there been the perfect balance of all roles; we could always have done with one more software developer, or an extra 0.2 of a QA, etc. Much of the time the imbalance is negligible (or quietly welcome), but sometimes it’s very noticeable.  And when a cell is working together on a single deliverable (the car, the lawnmower, the software feature) then it’s up to everyone to help each other. It’s no good the client-side developer producing more widgets to test if the QA can’t keep up. They need to ignore the traditional perceived boundaries created by job titles, reach across to others, and work together to regulate the output.

I said lean is about eliminating waste, and flow helps with that in a way hinted at earlier. Flow increases quality by allowing all participants to see the thing put together from beginning to end without interruption. This reduces hand-off time, reduces information loss, reduces relearning, and increases knowledge and ownership. The number of times an item needs to go back for rework is reduced, and if it does need to go back then the rework that’s needed is clear and therefore quicker.

Pull

Meanwhile, all this work needs to come from somewhere and that’s what “pull” is all about. The principle of lean is to only do what’s needed, and that means only produce something that is a direct response to a specific request, and only when needed.

The distinction is clear in car manufacturing. A car company’s marketing department will devise a special offer on a particular configuration (these seats, those mirrors, any one of the following colour combinations) and the plant will have to manufacture a whole lot of those particular cars ahead of time, but only with a guess (rather than a certainty) about what the demand will be. In the lean world a car is only produced in response to a specific customer’s specific purchase: customer goes into showroom, customer orders car, order triggers build.

That’s how pull works in relation to delivering the product to the customer. But pull also works in relation to building the product inside the factory. The old method is to keep hundreds of each part in store; the lean method is that a part is only provided to the worker when they need it: when they’re running low they signal the need for more, and it’s provided for them. This triggers a chain right back potentially to the supplier of the part, ensuring they are always delivering just enough, no more and no less.

The parallel flawed system in the software world is the product backlog. (We’ll ignore the even worse scenario of waterfall’s detailed planning up front.) Work is prioritised ahead of the sprint and waits to be developed. Requirements can change, even in that gap between prioritisation and development. The consequences aren’t as terrible in the Agile world as in the waterfall world, but it still causes problems: it disrupts the team’s schedule and of course all the effort that went into the planning of the now-deprioritised story is wasted. Even if the requirements remain constant the gap between planning and developing means knowledge is lost, or conversations need to be repeated, or the requirements turn into mini-waterfall-style requirements specifications.

The lean software alternative is to prioritise the next story only when the team is ready to work on the next story. That means while the team is developing the current feature they don’t have much certainty about what’s coming next. Like the worker in the car plant they have to signal slightly ahead of time that a new story needs to be worked out. Then the Scrum Master/business owner/internal customer needs to get something ready so that when they become available the team can all get together, thrash out the details, and set to work again.

As a manager I’m uncomfortable with this: I can no longer know what’s going into an iteration when the iteration starts. But the definition of value is not “what a manager’s comfortable with”. I will need to find other ways to ensure we can be accountable to the rest of the business, and they will need to be ways which are closely aligned with end user value — and that can only be a good thing.

The combination of flow and pull doesn’t mean the team is only working on one thing at a time. But it does mean that everyone has an equal balance of work at all times. So if Alf’s development work gets passed over to Betty’s testing then Alf and Betty need to make sure that she is expecting to finish her current piece of testing at pretty much the same time as Alf finishes his current piece of development and passes it over to her. Keeping that flow even is really important.

Lean promises the seemingly-impossible

Lean holds out the seemingly-impossible promise of increased productivity and increased quality. But here are some numbers from the literature:

  • The Puget Sound Naval Shipyard, time to prepare a repair document. Originally: 97 days. After a lean workshop and taking action: 26 days. (TTW, p.103)
  • Lantech wrapping machinery, before and after lean. Production throughput time was 16 weeks, became 14 hours to 5 days. Delivered defects per machine was 8, became 0.8. Employee time per machine was 160 hours, became 80 hours. (LT, p.121)
  • Porsche, production of the 911, before and after lean. Initial stamping to final shipping was 6 weeks, became 5 days. Inventory held was 17 days’ worth, became 4.2 days’ worth. Errors from work on the assembly line dropped 55%. (LT, p.213)

I do have some concerns about lean software development, but they’re less about lean itself and more about bandwagon-jumping and doing things without really understanding the reasons. Regardless of that, it’s refreshing to find a new way of looking at what seem to be known problems, and gaining insights which you might not otherwise have found. It’s certainly something I’ll be spending quite a bit more time on.

An ABC of R2: W is for Wiimbledon

…which was a semi-regular event of Wii tennis in the office, but a very useful part of our R2 work, too.

Each launch required a small army of technologists to be on-hand: to run the various scripts, check the results, and deal with any problems that might arise. We needed to arrange these teams carefully because launches happened overnight, so we’d need an overnight team and another team in early the next day to pick up any remaining issues.

All this was fine, but after a while someone realised we’d missed a trick. I think he actually wanted to be part of the overnight team (it’s always exciting to see these things go live) but wasn’t actually on it. So he arranged with others to bring in some games consoles, and wired one up to a big screen, the others to projectors. It was a big draw, and a great way to have an extra group of techies staying late in case the need arose.

Still, we couldn’t be cavalier with this. For example, we made sure the gamers were located far from the launch team — the launch team had a serious job to do and didn’t need distraction. But that didn’t prevent the benefits: every so often a gamer would slip out of Guitar Hero and wander over to the launch team to check up on progress, offering some advice and support if necessary. One time a critical SQL script was running worryingly slowly and a call came through to see if someone could contribute to the investigation; a couple sat down at a machine a few feet from the Mario Kart players, traced the problem, and suggested a change to the script which was agreed and did the job. That night it made the difference between “go” and “abort”.

If there’s a general lesson to be drawn from this I don’t think it’s to keep a Wii console in the stationery cupboard along with the paperclips and envelopes. But exploiting opportunities that are specific to your particular situation is probably a good thing to do, even if they aren’t enshrined in official company policy.

An ABC of R2: S is for sitebuilding

…which was the penultimate step before a launch, after the software had been built and released, and before the technical work to finally lift the curtain.

One of the big changes that was part of R2 was how we structured our content — our information architecture. Previously each piece of content lived in a section, up to two levels deep, and a lot of content was duplicated so that it could appear in more than one section at a time. An extreme example we often used was the affair around David Kelly and the consequent Hutton inquiry. Almost every story there crossed the boundaries of politics, media and daily news.

With R2 we were introducing much more nuanced keywording and more options around navigation. So the content in the old system didn’t map directly into the new system: it all had to be examined and reclassified by hand. Additionally, production staff needed to build subject pages in ways they couldn’t before — for example, the pages on Afghanistan, the British monarchy and the BAE corruption investigations. All this was called sitebuilding.

Of course, the tech team built tools and wrote scripts to make the production staff’s job easier, but some things just need human expertise and take a very long time. Typically we allowed six weeks between the time the software was released and the relevant site was launched and that was the period in which sitebuilding took place.

An ABC of R2: Q is for quality assurance

…which is a much misunderstood subject.

The R2 team consisted of a number of QAs, and the most obvious artifact that the QAs produced and worked with was the test script: a series of detailed instructions that explained what to test and how to test it. For this reason it’s too easy to dismiss QAs as testers, and that would be a mistake.

Our QA Manager, Amy, says “I don’t want us to be finding problems, I want there to be no problems to find”. That hints at how QAs should be used: they should be involved from the very start of a piece of work to identify an appropriate structure and approach that ensures greater reliability and more opportunities for testing. In the best cases a QA has guided a task in a direction that BAs, developers and architects might not have previously considered, so avoiding problems they hadn’t thought of. In the worst cases a QA has been omitted from planning a piece of work and something considered straightforward by the QA-less group has turned out expensive to test and a constant source of problems.

Testing comes at the end of the development process, and considering a QA as a tester therefore allows one to fall into the trap of including them only at the end. The quality assurance process should add value, and that can only happen if the QAs are involved from the start.

An ABC of R2: P is for pair programming

…which was, and is, a hugely important part of our software development, and something that took a long time to learn to do well.

Pair programming is when two developers sit at one machine and one keyboard to write the software. It’s very difficult to do: the driver has the pressure of someone watching their every move, and the navigator has to be aware of what’s going on because they’ll be asked to take over at any moment and they have a responsibility to keep an eye on the bigger picture. It also makes it a very collaborative process — the pair need to work out together exactly how they’re going to tackle every problem. Mat, leading our architecture team, calls this “keeping each other honest”.

Pairing looks expensive — two people apparently doing something that one could do — but that makes the mistake of thinking that all software is the same and all developers are interchangeable. Here are some of the benefits we’ve found:

  • Developers are more productive. With someone sitting beside you you can’t afford to cruise. You’ve got to be demonstrably on the ball all the time.
  • The quality of the software is much higher. I’ve listened to developers discussing how they should write a particular piece of code, suggesting alternatives and weighing up pros and cons that one individual would never have come up with by themselves.
  • Developers become much more skilled much more quickly. Everyone learns off everyone else.
  • The company’s technology investment is de-risked. Highly specialised knowledge is shared among many people, and doesn’t live just in the head of one person. This also means…
  • …Resourcing projects is easier. More people are able to move onto other teams more often, since (a) they are more likely to have the knowledge needed for the new team, and (b) they are less tied to their existing team since they will have shared that knowledge. This last point also means…
  • …Developers have more opportunities to learn new things. They can move onto other teams and new projects, safe that they won’t be constantly called back to their previous team, because they’ll have shared their skills and knowledge.

When a software project is complete the software itself is only just beginning its life, in operation day after day — and in the case of our software, by hundreds of people for many years to come. So that development investment has to ensure the product is of very high quality, and pair programming is part of how we ensure that.

An ABC of R2: O is for opportunity

…which is a word that we came to understand only slowly, particularly as a counterpart to the word “challenge”.

As we worked we inevitably came across problems; Nigel, our indefatigable programme manager, would insist on calling them “challenges”, and casting possible actions as “opportunities”, to the point that it became a running joke. But problems — sorry, challenges — are chances to raise your game, and opportunities are chances to resolve two issues with one action.

One very early challenge was to deliver our video platform without disrupting the R2 project. Delivering video was itself a major project, requiring CMS integration and embedded advertising. Our opportunity was to do that and at the same time prove that our “business as usual” team (which ran alongside the R2 team and tended to deal with small, one-off tasks) could produce work at least as complex and high-profile as anything the R2 team could.

Our video technology has been a great success. The team produced something which enshrined good principles of web publishing, and integrated perfectly with the content management system (allowing keywording, search findability, etc) built from the R2 project. Taking the opportunity to prove the capability of the business as usual team provided everyone — both inside and outside the team — with even more confidence in what we could do.

An ABC of R2: J is for JFDI

…which stands for “just do it”, and was the unofficial name of one of the development teams which sat alongside the R2 teams.

One key principle we had from the start of the project was that other development work couldn’t stop for the sake of the site rebuild. There might be less of it, but it shouldn’t dwindle to zero. And while R2 was a major long term undertaking, the rest of the work that came up inevitably had a very different shape. Consequently we had different kinds of teams.

The JFDI team handled very short turnaround work. Mostly this consisted of bugfixes, but it also included minor enhancements. It worked in a traditional Agile manner, but due to the size of the individual tasks work was reprioritised every day rather than every fortnight.

Working on the JFDI team suits some people better than others. On the one hand it’s difficult to get your teeth stuck into anything because it doesn’t last very long (or at least it shouldn’t); on the other hand you get a sense of completion every day. A lot of the time people don’t relish cycling into that team, but once they’re in they find they learn a huge amount about how the software they’ve written actually gets used. I’ve written more on this subject previously.

Overall the JFDI team has been very successful, dealing with a large and constantly-shifting workload, but also demonstrating daily progress to our internal users. Since R2 finished we’ve kept the team running in the same mode, and it continues to bring immediate benefit to people inside and outside the development teams.

Big lessons from a little project

I’ve just finished a fortnight’s holiday which I (foolishly) spent mostly in front of a PC developing a never-ending little application. But unexpectedly, despite the trivial nature of my project, I rediscovered a number of important lessons more usually associated with serious application development.

The software I’m writing is just a little Firefox plugin. I’ve been fiddling with it for so long we’re almost onto the second major release of Firefox since I started, and yet it’s probably not much more than a couple of hundred lines long. You can see it really is a minor enterprise. Despite this it’s been quite a surprise — quite a shock, even — to be reminded of some industrial-strength truths in a small and personal environment.

And they are…

1. Damn, writing software is difficult

What we non-developers know about software we know either by observing or by talking to those who do it. But anyone who wants to be trusted won’t complain about their job or bore you with details they know you don’t want to hear, and so you’ll never hear about everything there is to know about software development, even when you ask.

So one thing I discovered — again — is that writing software is really difficult. Sometimes I was flying, but more often I was crawling: piecing together information from different sources, trying to understand what was possible, learning all kinds of technologies (Javascript, XUL, XPCOM), trying to use them well, but more often trying to get them to work at all.

It’s made me respect all the more the people I work with every day who make it look so easy. Every line you write is pure logic and needs to work 100%. This is not like writing a letter or a sales proposal, where a sentence that’s only 95% perfect is more than adequate. This is like writing a legal contract from scratch for a particularly unpredictable world. Every line needs to be watertight.

2. Simple design really is difficult

Simple design is another way of saying that the software should be easy to work with and modify. You’d think this would be hard to get wrong, particularly with something which is being built as new: surely you just add small, simple pieces one at a time, with each one adding the next feature on the list. What could go wrong?

I found out first hand that simple design is difficult to achieve, even in very simple scenarios. For example, my application has a class which handles the user’s preferences. It needs to be initialised for when the application starts, and it needs to take and save the preferences when requested. All this takes place within a single object. So I wrote an initialisation function which set up the initial preferences according to how it’s been configured, I tested it, and all was well. Then I added some functions to take updated values and save them, tested those, and all continued to be well. Total: about 20 lines of code over four functions.

Then I ran the whole thing in an integrated environment and weird errors started occurring. It took me about two hours to figure out what was going on: it was a strange combination of unexpected start-up values, Firefox calling the initialisation function at unexpected times, and some confusing logic of my own which was supposed to protect against double-initialisation. In the end I decided the best solution was to throw out the whole idea of initialisation. Now the object just takes and saves preferences, and you can do that when the application starts if you want. Total: 5 lines.
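To make the shape of that change concrete, here is a rough sketch. It is written in Python rather than the plugin’s Javascript, and it is not the real code. The class names and the plain dictionary standing in for Firefox’s preference storage are all hypothetical. It only illustrates the before and after.

```python
class PreferencesWithInit:
    """'Before': a separate initialisation step, guarded against running twice."""

    def __init__(self, store: dict, defaults: dict):
        self._store = store          # hypothetical stand-in for the host's preference storage
        self._defaults = defaults
        self._initialised = False

    def initialise(self) -> None:
        # The guard is the fragile part: the host may call this at
        # unexpected times, and the store may already hold values.
        if self._initialised:
            return
        for key, value in self._defaults.items():
            self._store.setdefault(key, value)
        self._initialised = True

    def save(self, updates: dict) -> None:
        if not self._initialised:
            self.initialise()
        self._store.update(updates)


class Preferences:
    """'After': no initialisation step, just take values and save them."""

    def __init__(self, store: dict):
        self._store = store

    def save(self, updates: dict) -> None:
        self._store.update(updates)


store = {}
prefs = Preferences(store)
prefs.save({"theme": "dark"})     # at start-up, if you want some starting values...
prefs.save({"font_size": 12})     # ...and again whenever the user changes something
print(store)
```

The second class has no hidden state about whether it has been set up, so there is nothing for an unexpected call order to trip over.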

The point of this is that an apparently simple and obvious design was actually too complicated to sustain. I’m very pleased that the solution was to delete lines and simplify. But I could only do this so easily because I had complete control of the code. In other circumstances there might have been other systems which were relying on that initialisation code (however flawed it may have been), and I might have had to add to the existing complications to solve my problem; or I might not have had the time to take a fresh look at the code and simply built around the flaws out of a sense of fear of touching the wrong thing.

This is a tiny example from a tiny piece of work, but it showed me how easy it is to go wrong with a design, and how easy it is to produce software that is complicated, hard to understand, and time-consuming to fix and evolve.

3. Learning a language is more about culture than syntax

I sometimes get CVs from people who claim to know about 10 programming languages, and I’m always doubtful. Just because you’ve written an application in a language it does not follow you’ve done it well. Knowing the syntax is only the first step. You also need to have good knowledge of any libraries, and finally you need to know how to work with the grain of a language. This means you’ll use different idioms, and structure your solutions in different ways.

In my own case I’ve been writing Javascript, but it stinks because I’ve tried to use it like Java. I’ve been stuck in my old Java ways, like creating classes and carving out a deeply-nested namespace. It’s Javascript, Jim, but not as we know it. It works, it makes sense, but it looks clunky and… well, it just feels wrong, dammit.

Javascript is a prototype-based language. I can even tell you what that means, but only with my head, not with my heart. I use prototypes as a hoop to jump through to get it to do the Java-y things that I know I shouldn’t be doing in the first place. Being prototype-based doesn’t mean Javascript is a second-class language, or a dumbed-down Java. It means it’s a different kind of language to Java and should be treated as a first class language with its own ways of getting things done.

It’s a cultural thing, and you can’t claim to really know the language if you don’t operate comfortably in its culture. I don’t really know Javascript.

4. How did I ever live without automated tests?

Possibly by not spending my holidays sat in front of a PC. But aside from that, I continue to wonder at the marvel that is automated testing, and unit testing in particular. To be able to implement a change and not have to trouble your brain about the consequences is very liberating, allowing you to move ahead with confidence. It does take some work to set up the environment, but the results are worth it.

That said…

5. Your automated tests won’t cover everything

In one of my functions I unexpectedly found a truck-sized hole which had gone undiscovered despite seemingly comprehensive automated test coverage. (A loop which had a “break” when it should have had a “continue”, meaning great swathes of actions got skipped in most circumstances.) I only discovered this through integration testing (which is the fancy name I’m using for what was really “trying it out”), and found that a quirk in my unit testing setup had caused the mistake to be missed. Once I had found the cause I adjusted the main code to be more predictable and put in an automated test to trap the error, but it was only discovered through real hands-on testing.
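For anyone who hasn’t been bitten by this, here is a small illustration of the kind of mistake involved. It is a made-up Python example, not my actual function. It also assumes the testing quirk was test data that never exercised the skipping branch, which is only one way such a bug can hide.

```python
def apply_actions_buggy(items):
    """Intended to skip disabled items, but 'break' stops the whole loop."""
    applied = []
    for item in items:
        if not item["enabled"]:
            break          # bug: abandons every remaining item
        applied.append(item["name"])
    return applied


def apply_actions_fixed(items):
    """Skips only the disabled item and carries on with the rest."""
    applied = []
    for item in items:
        if not item["enabled"]:
            continue       # skip this one item only
        applied.append(item["name"])
    return applied


# Test data where every item is enabled cannot tell the two versions apart...
all_enabled = [{"name": "a", "enabled": True}, {"name": "b", "enabled": True}]
assert apply_actions_buggy(all_enabled) == apply_actions_fixed(all_enabled)

# ...but more realistic data with a disabled item in the middle can.
mixed = [
    {"name": "a", "enabled": True},
    {"name": "b", "enabled": False},
    {"name": "c", "enabled": True},
]
print(apply_actions_buggy(mixed))   # ['a']
print(apply_actions_fixed(mixed))   # ['a', 'c']
```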

6. A strong IDE sustains motivation as much as anything

Although I was using Eclipse for development, when working with Javascript it really doesn’t offer the comprehensive support you get with Java. Because Javascript is a dynamic language, and no doubt also because of the state of JSEclipse, there’s very weak support for code completion, refactoring and so on.

The consequence of this was a loss of code-writing speed, but much more than that I was suddenly able to see how easily a weak IDE allows bad software to be produced. There were many occasions when I knew that I should tidy up or refactor something, but was then suddenly hit with a premonition of the tedious steps I’d have to go through: working out which files to pick on, the manual search-and-replace, checking the context before I made a change. I had to force myself to get on with the tidying up despite knowing how painful it would be, focusing on the long-term results, and safe in the knowledge that for this little personal project I didn’t have a deadline.

It became clear to me that so many people must come under a constant barrage of pressure, with only their current strength of character to defend against the pressures of deadlines and short-term wins. It’s inevitable that too often they will give in to those pressures, leaving cumbersome code building up, and ultimately gumming up the works of the system. A strong IDE removes the barriers to those virtuous tasks of improving design and allows you not only to do your job, but to do it well.

7. Estimation is difficult

Which is just an excuse for the shameful truth: all my estimations were out by a factor of four. This is embarrassing because it’s not as if I’ve never written software before.

In retrospect the mistake I was making was to look at the component parts of a task, guess how long they’d each take, and add them up. What I should have done was try to take the forthcoming work, identify similar previous tasks, and from past experience see how long they actually took.
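A rough sketch of that second approach, assuming you keep even a crude record of past work. The task kinds and hours below are made-up figures, chosen so the guesses are out by the same factor of four.

```javascript
// Hypothetical log of finished tasks: what I guessed and what they actually took.
const history = [
  { kind: "ui-form", guessedHours: 2, actualHours: 8 },
  { kind: "ui-form", guessedHours: 3, actualHours: 12 },
  { kind: "parser", guessedHours: 5, actualHours: 20 },
];

// Estimate a new task from the actual durations of similar past tasks,
// rather than from a fresh guess about its component parts.
function estimateFromHistory(kind, tasks) {
  const similar = tasks.filter((task) => task.kind === kind);
  if (similar.length === 0) {
    return null; // nothing comparable on record
  }
  const totalActual = similar.reduce((sum, task) => sum + task.actualHours, 0);
  return totalActual / similar.length;
}

console.log(estimateFromHistory("ui-form", history)); // 10
console.log(estimateFromHistory("database", history)); // null
```

A simple mean is the crudest possible choice; the point is only that the input is what happened before, not what I hoped would happen.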

At the end of the day experience is good, but it’s how you use it that really counts.

8. You can’t know all your requirements up front

This is familiar to anyone who’s bought into Agile, but it hits home hard when it’s you who’s the user setting the requirements.

I’ve written countless requirements specifications in the past (in 60-page documents, on task cards, wherever), so I thought I really ought to be clear-thinking enough to know my own requirements up front. Wrong again. As the UI came together, as one idea sparked off another, and as I had chances to step away from the code to think about things from a distance, I started to see that my feature set was really rather disjointed, almost random. These moments of clarity caused me to add requirements, but in doing so I was also recasting the software with a new perspective. I was starting to see what the software should be doing, which was not quite the same as what I’d started out on.

Release date

Fortunately I’ve not sent out a press release announcing a release date, held a press conference, or hired the London Eye for a glitzy media event. I’m just writing software for fun, and at this rate it’s probably not going to ever see the light of day. But even then, it’s been startling to find that the germs of some of the Big Ideas of software development are still present in the smallest of projects.

QCon London 2008: A Michelin-starred deli

There were very few moments for me during QCon London 2008 of earth-shaking enlightenment, if any. But every hour of the three days of the conference there were insights and guidance that could be tucked away, and reused later to save hours, days or weeks of time elsewhere. Snake-oil salesmen were thin on the ground, and instead there were dozens of people saying one or both of:

  • This is what we did; and
  • This is what you can do.

No magic, no silver bullets, but plenty of solid advice and experience.

A good example of both of these was Randy Shoup of eBay. He had nothing to sell (other than the good name of eBay, perhaps) and his presentation was very clearly constructed to show their principles of scalability, and some concrete examples of how these work in practice. You probably wouldn’t use their periodic batch processing method to generate recommendations — if only because it’s odds on you don’t have a recommendation system — but you could take the overarching principle of “async everywhere” and apply that to the next scalable application that you need to work on.

Even the very specific presentations contained valuable points that could be generalised and reused. For example, Matt Youill and Asher Glynn of Betfair talked through how they scaled the transaction processing on their servers by a hundred-fold. Guardian.co.uk doesn’t need that kind of throughput, so the details were primarily of intellectual benefit. But a key practical lesson was how they approached the problem: by presenting it to industry players as a challenge carrying great kudos to the winning company.

All of this was summed up very nicely by the team from BBC News: John O’Donovan, Kevin Hinde and Ross Heritage. They were asked how they managed performance testing for the iPlayer. John spent a few moments describing some of the techniques they employed, but got to the point when he realised the audience really wanted some eye-opening enlightenment which he didn’t have. At this moment Kevin stepped forward and said straight out “There’s no secret sauce”. Indeed not: they just work hard and stick to strong principles.

QCon offered little in the way of secret sauces, but it did contain dozens and dozens of great ingredients you could take away and use to concoct your own wonderful dishes.

And with that analogy pushed to breaking point, I think we should leave it there.

The unstoppable urgency of web development

While I’m usually proud of the work I’m involved with, I’m rarely happy for long. There are always ways to improve, and I’m usually dissatisfied by one unmet ideal or another. Almost since I started in this field I’ve been vexed by how much of web development is “urgent” rather than “important”. This is not merely of theoretical interest; it vexes me because dealing with important issues gets you to something of lasting and strategic value, while dealing with urgent issues gets you to the next day. It’s that unmet ideal of achieving more long-lasting value that causes me to think about this.

I want to give some context around the claim that there’s more urgency in web development, and then I want to offer some ideas about why this might be. If you want the brief version, I think what makes the web generate such urgency is that…

  1. Updates are cheaper;
  2. Problems hurt less;
  3. Competition is more visible;
  4. Boundaries are blurred; and
  5. It’s easier to have more stakeholders.

But first, some context.

Some context

Although I do think web development has a surfeit of urgency, it’s fair to say that “web development” is not a clearly defined area. So much software today has some link or other to the web, it’s usually unfair to say definitively that something is or is not web development. So it’s a matter of degrees, and looking back on the software development I’ve been involved with I’d say the more web-oriented something has been the more I’ve had to deal with urgent issues, and the less web-oriented something has been the more I’ve had to deal with important issues. The most urgency comes when working on websites, and the least when working on applications which may or may not use the web for a bit of minor communication.

This issue of urgency also seems to be the perception and expectation of non-technical people. This is how we get the phrase “Internet time” and why for years I’ve heard people saying “but this is the Internet” as a justification for tight deadlines. This is something that continues today, even after the unsustainable boom and bust of the dot.com years.

When urgency is too pressing we risk doing a bad job, and releasing something which isn’t up to scratch. Of course, when urgency isn’t pressing enough we lose ground to our competitors. And of those two scenarios it’s the first one, doing a bad job, which is easiest to spot: a bug bites someone on the nose and you get to hear about it pretty fast. The second scenario, losing to the competition, happens slowly over months or years, and by the time you slip into obscurity so many events have occurred it’s hard to pinpoint exactly where you went wrong.

Urgency is no bad thing, then, but I continue to be bothered by its dominance in web development. Too much urgency leads to a loss of quality and a reduction of strategic actions. Why does this happen in web development more than elsewhere? Here are some thoughts…

1. Updates are cheaper on the web

Rolling out a new version of a website onto a server shouldn’t be regarded as a trivial affair, but it’s a lot easier than other means of releasing software. So it’s easier to release a slightly rushed product if we know the only copies anywhere in the world live on servers which are entirely under our control.

Slightly more costly than updating our own servers is the job of the IT department which has to update an application on everyone’s desktop. Here, not only do you have to be sure the desktop machines are all switched on and ready to receive the software, but an error for one person could mean they cannot work at all the next day. More costly than that is the job of burning, boxing, and shipping a DVD to hundreds of retail stores.

For both those cases, and for many other ways of releasing software, the cost of repetition and the cost of shipping something substandard are very high. But in many web applications, where the cost of distribution and the cost of error are relatively low, the consequences of problems that come from an urgent release are much less costly.

2. Problems hurt less on the web

As users, our relationship with websites and web applications tends to be less engaged than with most other applications. So although we all know undue urgency leads to more faults being released, it’s perhaps more forgivable on the web.

If I’m using a website and I encounter an error, I know the problem — the website — is theirs. I can come back later or try an alternative (there are always plenty of alternatives on the web). Either way, I’m probably not going to get very upset.

By contrast, if I encounter a problem with software running on my machine then any error is going to hurt me much more: that’s a problem with my software… it’s causing my machine to go wrong… I went to the trouble of installing that software and now it’s giving me problems… this is not good. Even if the software was installed by my IT department it’s still my software, and it’s a core part of my daily work, so finding a workaround is going to be difficult.

3. Competition is more visible on the web

One of the drivers of urgency is beating the competition, and if you and your competitors all work on the web, then they are likely to make (cheap) incremental releases, and you will notice almost every one.

If competitor A releases a new feature in week 1 then you’ll notice and ask how you can do that yourselves. If competitor B releases another new feature in week 2 then you’ll notice it and ask how you can beat that, too. By the time competitor C releases a third thing in week 3 you’ve got the pressure of the three things being delivered by your competition and nothing to show yourself.

Things are different as you move offline. Consider software companies in a competitive marketplace where software ships with features in batches. When your competition ships their new release with new features you’ll be able to assess all those features as a whole. Of course there will still be intense pressure to deliver, but your view will be more rounded and decisions of importance will play a greater role than decisions of urgency.

Consider also the internal team which delivers software which is internally-facing. Your life in this team will be no less demanding than in any other, but one thing you’ll be spared is much pressure from competitive teams. If you’re delivering the next version of the payroll system then there will be pressure of deadlines, there will be pressure of budget, but you’re very unlikely to get an unexpected new requirement due to the payroll software team at your competitor suddenly releasing their new e-mail alert module.

None of this is to say that competition is more fierce on the web. That may or may not be the case, but it’s not what I’m interested in here. The point is that the competition seems more fierce and is much more visible. This encourages urgency.

4. Boundaries are blurred on the web

A wordprocessor helps you write documents, a media player plays audio and video, but a website… a website doesn’t have clear boundaries.

Although I wrote above of competition, exactly who is a competitor on the web is not always certain. I once worked with an e-commerce music retailer who suddenly decided they needed to provide a web mail service for their customers. I’ve worked with a law firm who wanted to turn their brochureware site into a recruitment platform. Yahoo! evolved from a directory into a portal, embracing almost everything they could while the world struggled to understand what a portal might not be. Meanwhile, stepping away from the web, I don’t know of a spreadsheet which carries a discussion forum or an HR system which recommends movies.

While all this clearly accounts for the quantity of demands that occur in web development, it also accounts for the urgency because that steady drip-drip-drip of features that comes from our competition also comes from those who aren’t direct competition but are just out there on the web. The drip-drip-drip becomes a trickle (and maybe even a flood) pretty quickly, and the pressure to deliver new things is even greater.

5. It’s easier to have more stakeholders on the web

Because the web means so many things to so many people — contributing to those blurred boundaries — many more people within an organisation will consider themselves stakeholders. Marketing people see it as a marketing tool, the recruitment team see it as a recruitment tool, the sales team see it as a sales tool, and so on. And of course each one of them is right: it can be all these things.

With so many diverse stakeholders, we can expect so many more pressing deadlines, and hence much more urgency. More stakeholders also means communication and prioritisation is more difficult. It’s difficult to get everyone into the same room, it’s difficult to bring everyone to the same point of understanding, it’s difficult to weigh completely different kinds of requirements against each other. It doesn’t matter that everyone is working for the same company and therefore, at a higher level, the same goal. That goal is usually at too high a level to make a difference to everyone’s day-to-day needs.

The least pleasant example of this I remember is a colleague at a previous company who was the project manager for one of our clients. Although he was their single point of contact they managed to have several single points of contact, none of whom were very good at speaking to the others. He found he had responsibilities to the marketing manager, the distribution manager and the IT director. They all had pressing demands, and he had no contact with anyone who could bring them all into line. Indeed, that seemed to be by design, as we suspected the client used this as a way of squeezing us as a supplier. The relationship did not end happily. Although that’s a very extreme example it does highlight the kinds of difficulty that come with having many stakeholders.

A final word

Cheaper updates, less damaging problems, more visible competition, blurred boundaries and more stakeholders. This tends to be the world of web development, and therefore a world in which urgency plays a more dominant role.

Yet we can think of web development projects in which importance dominates urgency. One which springs to mind is an online banking site: this is web development if ever anything is, but one in which doing things quickly will always be trumped by doing things properly (whatever “properly” may mean for that bank). But this is also an environment where many of the criteria I’ve listed don’t hold. Most obviously updates will be fairly costly, any problems will be very damaging, and the boundary of what the site does is very well defined.

So the criteria I’ve set out above are more to explain than define. Web development tends to take place in an environment that’s different from many other software development environments, and I think it’s those features of that environment that I’ve listed that explain why urgency predominates.

I said at the beginning this vexes me because urgency causes visible problems, and eliminating urgency would seem to help eliminate those problems. But I also said I’m usually dissatisfied with one unmet ideal or another. I’ve also worked in environments where there is a distinct lack of urgency. That, of course, leads to waste and stagnation. And that’s really, really, vexing…