Technical people don’t tend to like a big bang release, but it’s often perceived as the only viable option when you launch a new service. Why is this, and is there any kind of middle ground? The recent redesign of Telegraph.co.uk is a useful way to look at this. I’m also going to use examples from Guardian Unlimited, most notably Comment is Free.
What is a big bang release?
A big bang release is one in which a large system is released all in one go. Recent examples from Guardian Unlimited include the Comment is Free site and our daily podcasts. This week the Telegraph launched its website redesign. In the offline world, UK newspapers have been in a state of almost constant change recently, kicked off largely by the Independent’s sudden transformation from a broadsheet to a tabloid in September 2003. The Times followed suit shortly after, and in September 2005 The Guardian changed to the “Berliner” size, new to any British paper. These launches can only be done in one go: you can’t have a newspaper that’s broadsheet on page 1 and tabloid on page 2.
Contrast that to…
At the opposite end of the spectrum to the big bang is the mantra “release early, release often”. This is how the Linux community claims much of its success. Much of GU’s work is done this way: while there’s always a project in the pipeline like Comment is Free, most of the time we’re making small improvements and adding new features day by day. Some of these are internal to our colleagues, some are bugfixes, but many are small features added to the public site. In January the Weekend magazine ran a photo competition with Windows XP. We worked with them to produce an online entry form that allowed our readers to upload their photographic entries directly to us.
So what are the worries about big bangs?
Big bangs give you a jump on your competitors and give you a big marketing boost. But they also put an awful lot at stake. Releasing a photo uploader is one thing. That’s more of a feature release than a product release. Consider what makes up Comment is Free, which the GU team created in collaboration with Ben Hammersley. In no particular order:
- Steve Bell’s If… cartoon every day;
- Comments on blog entries;
- Keywords applied to each entry;
- Commenters need to sign in (or register) before they can comment;
- An RSS feed for the whole blog, plus one for each contributor;
- A one-click feature for editors behind the scenes to “push” an article from the Guardian or the Observer into Comment is Free.
- Dan Chung’s photo blog;
- A list of the most active entries on the blog…
- …and the most interesting from elsewhere;
- A search facility;
- The ability to serve all this even with the inevitable battering of thousands of readers check it out every day.
And that’s not counting the editorial team powering it and the work of the designers creating something is beautiful and works across a good range of web browsers (including the famously fickle Internet Explorer).
Using the example of Comment is Free there are several kinds of difficulties with big bangs. One is that there is a lot to potentially go wrong. Another other is that there may be a big switch-over which is difficult to engineer and risky to run. A third is relates to community reaction and fourth relates to social psychology.
Difficulty 1: The lots-to-go-wrong scenario
Within hours of launch, any of the Comment is Free features above (and any number of others which I forgot) could have failed in unpredictable but surprisingly common circumstances. The important thing to remember is that this is the first time any of these features would be tested by unknown people, and the first time they’re used together in unpredictable ways. A problem with a core component of the system can have an impact on many aspects visible to the readers. A big bang can’t be rolled back without a lot of egg on face.
The problem is much more acute with anything that makes a fundamental change to a core part of a system. In any one hour we’ll be serving hundreds of thousands of pages, some of which were published five minutes ago, some of which were published five years ago. A change to the way we present articles is going to have consequences for thousands of people in myriad different situations over the course of a few minutes. If we got something wrong there, misjudged some of those situations, we could annoy hundreds or thousands of them very quickly. And since such problems can take time to fix, more people will be annoyed the longer it takes.
All but the smallest of software launches throw up some problems. Often they are too minor to worry too much about (a particular colour change not working in a particular browser, for instance). Sometimes they are superseded by more significant problems in the same launch. With a small feature release (early and often) we can spend a little time and fix it. But if a sweeping change has occurred there’s the danger of having too many things to fix in a timely fashion. Even the original big bang took several millenia to settle down, although to be fair that was mostly a hardware rollout.
Difficulty 2: The big-switch-over scenario
With some systems it’s not a case of making a big thing public in one shot; sometimes it’s a major internal mechanism that has to be switched over in one go.
Adding comments to articles might involve not just mean an addition to the underlying system, but a wholesale either/or change. The system must work not just before and after the change, but during the change, too. And everything that links into the part which changes must be somehow unlinked and relinked in good time.
If you think this language (part, unlink, etc) is vague, by the way, there’s a reason for that: exactly what needs to be changed will be many and varied. Database tables, triggers, stored procedures, software libraries, application code, outgoing feeds, incoming feeds, application servers, web servers… all these might need “unlinking” (dropping, reinstalling, stopping, bouncing, whatever). It needs careful planning, co-ordinating, and execution, and it might only be possible over several minutes, which is a very long time if you’re running a live service.
Again, rehearsing this and doing it for real in the live environment are very different things. But this time if something goes wrong you’re stuck with a broken service or something which will take just as long to roll back — which will only leave you back at square one.
Difficulty 3: Community reaction
Even if everything goes well on a technical level there’s still the danger that your user community won’t like it.
A failure for Comment is Free, Emily Bell has written, would have been if no readers had posted any comments at all. That would have been an awful lot of work wasted. But at least in that scenario we could have quietly shut it down and walked away.
A more difficult scenario would have been if people were using it, but found a fundamental feature annoying or obsctructive. A good example of this is the page width on Comment is Free. 99% of Guardian Unlimited pages are designed to fit a monitor which is 800 pixels wide. That was pretty standard six years ago when the site was designed, but is less common now. Both Comment is Free and the Telegraph’s new design require a width of at least 1000 pixels. That gives much greater design freedom and hence a better user experience — unless you can’t or won’t stretch your browser that wide, in which case it could be very alienating. If readers react badly to the 1000 pixel design then there’s very little that can be done about it. Sure you can redesign, but that’s a major undertaking.
Difficulty 4: Psychology
So, let’s suppose the features all worked, the switch-over all worked, and the community doesn’t react negatively. There’s still the problem that people, well, they just might not gel with it.
Blogs are a very good example of this. They’ve been around for almost a decade, and talk/discussion software has been around for much longer. But blogs only really took off in the last two or three years. Most of this is entirely down to getting the psychology right:
- A blog is now recognised as a largely personal affair, being the soap box for individuals rather than a free-for-all forum;
- discussion tends to be just a single column of comments rather than threaded, ensuring commenters respond more to the original poster, rather than getting into an internal discussion;
- most bloggers write less content but write it more often.
That list is not something that falls straight out of a grand vision. It’s a series of very subtle psychological and social lessons that the whole online community learned over a very long period. Similarly any software release relies on social and psychological subtleties that can’t always be guessed at, and will often overlap each other. Making a single big release will obscure any of these subtleties. If people aren’t reacting to your product in quite the way you hoped then working out why could be difficult: a small tweak to a couple of presentational elements might make all the difference in the world… or perhaps your promotion of the product just wasn’t enough.
How can we mitigate this?
From the outside it looks like Telegraph’s website redesign was executed — is being executed as I write — without breaking a sweat (though I daresay internally there were a lot of frayed nerves as well as excitement). It gives us a few clues about how to manage a big bang.
Mitigation: Splash instead of bang
One trick is to make a big splash, rather than a big bang. The website’s redesign is actually rolling out over days and weeks, not minutes or hours. The team made sure to change the high-profile front page and World Cup pages first, so ensuring the perceived impact was greatest. They get a lot of credit without risking more than they need to.
Mitigation: Step-by-step approach
Even if plaudits is not what you’re after it can be useful to slow big bangs into small, distinct steps.
A while ago at GU we need to switch from one kind of server to another. On the face of it there was no way to smooth this switch: requests either went to one server, or they went to another. Nevertheless, any kind of problem switching over could have been potentially disastrous. We took extra time to build in a middle tier, which brokered incoming requests to the back-end servers, and added into that a facility to route different classes of requests to one or other of the servers. Initially everything went to the one server. Then we altered it so that only one specific kind of request, a tiny fraction of the whole, was routed to the other. We ensured that was working smoothly before we sent another class of requests over, and so on. We took what would originally have been a dangerous all-or-nothing switch, and smoothed it out over several weeks.
Mitigation: Avoid feature creep
Another thing the Telegraph team have done is to avoid feature creep. The redesign comes with almost no new features — I can count the ticker, but the rest seem to be rearrangements of the same material. This contrasts (unfairly, but instructively) with, say, the Guardian’s Berliner reformatting where the editor took a conscious decision not just to change the size of the paper, but also the design, the section arrangements, the approach to presenting the writing (distinguishing news and comment more distinctly, referencing the online world more) and numerous internal processes.
Avoiding feature creep on software projects is very, very difficult, and it takes strong wills and authority to say No to the many voices who want just a little bit more.
Mitigation: Don’t just test — rehearse
Testing individual features is one thing: testing the whole lot together is another. That’s why rehearsing the whole sequence together is important. And there’s no point rehearsing if you’ve not built in time to make changes when (not if) you discover there are flaws in your plan.
Mitigation: Plan for natural breaks
The Telegraph also seem to have a release plan with plenty of natural breaks: once started it seems they weren’t on a time-critical unstoppable slide. Of course they will have had deadlines, but if they had to slip some of these for the sake of something else then that was always a possibility.
A good example of this is their replacement of the missing e-mail and print features. The site’s articles originally had links to mail them to friends or print them out. The original rollout omitted these features, and there was a promise that they would be restored later. However the outcry from the community was such that they interrupted their schedule and reinserted them earlier than planned. It’s a good example of listening to your community, but it also shows that you need to the flexibility to react accordingly.
Mitigation: The safe but indirect route
Big-switch-over-scenarios can be avoided by ensuring you take a route that’s not necessarily direct but is at least safe. You may end up with a very long and winding plan, however. This is a bit like the word game where you have to go from one word to another, changing one letter at a time, but still ensuring that you’ve got a word at every step of the way.
W | H | I | T | E |
W | H | I | N | E |
S | H | I | N | E |
S | H | I | R | E |
S | H | I | R | K |
S | H | A | R | K |
S | H | A | C | K |
S | L | A | C | K |
B | L | A | C | K |
This can still give you your big bang if the final step is the only visible step. The Telegraph redesign will have involved an awful lot more behind the scenes than the apparently straightforward change first seen by the public. No doubt they could have done it quicker at the expense of a more confusing user experience, but they managed to keep the end user change to minimum.
Mitigation: Trial features elsewhere
If you’re unsure of a particular part of a big bang release then it might be possible to trial just that feature elsewhere. As noted above, Comment is Free was the first large scale GU site with a thousand pixel width. It was a calculated risk. Reaction to that will inevitably inform any future services we launch.
Mitigation: Loosely coupled systems
Another mitigation technique to allow a big bang is to ensure the product being released is on an entirely separate system to anything else. If you’ve been following Comment is Free very closely, you’ll have read Ben Hammersley’s comment (you’ll need to scroll down or search for his name) that it’s based on the Movable Type blogging software. MT is certainly not what we use as our main content management system and it’s necessarily separate. This makes the big bang release easier: there is integration, but it’s loosely coupled. The systems are linked, but they’re not tied that closely that a disaster with one will adversely affect the other.
Don’t think it’s solved
All of the above approaches (and the many more I’ve inevitably missed) help, but they aren’t silver bullets. Each one has its drawbacks. A splash might not always be a commercially-acceptable compromise; there may be no obvious step-by-step solution; feature creep relies on more strength than organisations usually possess in the heat of the excitement, and even then it only limits the problems without eliminating them; rehearsing requires time and an expensive duplication of the entire production environment; natural breaks can be difficult to identify, and can rely on a running-order that isn’t compatible with commercial interests; the indirect route is necessarily lengthier and most costly; feature trials are only possible in moderation; loosely coupled systems limit the amount of integration and future development you can do.
The Telegraph has clearly encountered some of these problems.
It’s notable that “Education” disappeared from their navigation bars initially, and was only findable via the search engine, as Shane Richmond recommended on one blog comment to readers. No doubt they couldn’t find a cost-effective route from the old design to the new that encompassed preserving all the navigational elements.
The switch to a wider page was greeted with mixed reactions, and while we limited it to Comment is Free, the Telegraph dived right in and is applying it to everything.
Finally, it’s interesting that the new design is quite sympathetic to their previous one. No doubt a lot of that is due to the Telegraph having the same cores value two years ago as it has today. But it’s some way from some of the more radical ideas they had, and no doubt that is partly a constraint of having to live with two designs for period.
So where does this leave us?
I hope this explains some of the problems with big bangs, some of the ways to deal with them, and the difficulties that still remain.