Conversation versus concentration

Compare and contrast two blog entries that popped up in my RSS reader on the same day last week. In the blue corner Joel O’Software, fighting for private offices. And in the red corner, Martin Fowler, battling it out for continuous collaboration between developers and their customers.

Naturally, they’re not really at opposite ends of a spectrum — in fact, their posts are about different things — and they’d find plenty to agree on. But look at a selected passage from each and see the contrast. Here’s Joel:

Not every programmer in the world wants to work in a private office. In fact quite a few would tell you unequivocally that they prefer the camaraderie and easy information sharing of an open space. Don’t fall for it. They also want M&Ms for breakfast and a pony.

And here’s Martin:

One of Kent’s suggested names for ‘Agile’ was conversational software development – the point being that it’s a two way communication. This isn’t something like a telecoms protocol that you can define, but the back and forth discussions about how software can enhance the business are where the real value lives. Much of this conversation is of half-baked ideas, some of which grow into valuable features – often ones that aren’t things that the customer originally thought of.

It’s notable how two people renowned for being leaders in software can veer apart on what should be a fundamental issue: how should people interact? Joel is for concentration, Martin is for conversation.

It’s notable also that they do different things in the software world. Joel produces shrink-wrapped products on behalf of his own company. So does Microsoft, the company Joel used to work for and which he praises for “putting literally everyone in individual, private offices”. Martin is a gun for hire (via his employer), called in to consult on a variety of projects for different companies, no doubt 90% of the time producing in-house software for each client.

I can’t escape the feeling that their respective backgrounds inform their respective views, though I wouldn’t for a second suggest that one approach always suits one kind of output.

So, should one prefer concentration or conversation? Obviously[*] it depends on several factors, and here’s the way I see it…

First, it must come from whoever you start with. If Joel O’Software starts a one-man band and likes to work in silence, then takes on his first employee, he’s not going to want his hire to keep piping up with questions every two minutes. Similarly, if you’ve created your business by extolling the virtues of pair programming then you’ll be looking exclusively for developers who will continue that good work. If you start with one introvert or extravert, you’ll grow from there.

Second, it depends on where you think your strengths are as an organisation. Joel talks a lot about productivity and algorithms, while Martin tends to talk of people and methodologies. Each also has an interest in the other’s topics, but their chosen hot subjects are where they see the biggest gaps, and where they think they can most make a difference.

Third, it’s about how you see your team. I suspect Joel has very low staff turnover, hires developers very infrequently, and there’s no doubt he puts a lot of effort into picking the cream of the crop: he’s in a buyer’s market, and his developers will all be very smart. Martin will inevitably work with a much broader range of companies. While they will of course have made a very smart decision to hire him and his colleagues (ahem) they will tend more towards the market norm, and will also tend to be fairly large development teams — even if individual project teams are smaller. Thus Martin is going to be much more concerned about sharing information between developers, evolving designs collaboratively, establishing standards and keeping those standards refreshed.

Finally, it’s about how you see the long term. Again, low staff turnover and a tight-knit team means Joel is less concerned about knowledge silos, but an average corporate team will have average turnover and will have its average share of crises. Knowledge sharing and reducing single points of failure is essential in these cases.

All of that is why I favour conversational development. Knowledge sharing and evolving ideas are key to me as a general principle, all the more so because Guardian Unlimited is such a diverse site that there’s just so much to know. That doesn’t mean it’s easy for everyone, but for me a typical team will be stronger if it keeps every last bit of information flowing round, ideas constantly exchanged and checked, and experience continuously refreshed and revised. Concentration is often needed, but too often the price paid is too high, and is only discovered when one person is seconded to another project, has left, or is on holiday.

[*] I think there’s a progression whenever you ask “Is A or B better?” Naive or inexperienced people will always pick one or the other. Eventually they’ll come to hedge their bets because they realise things are more subtle than they previously thought, or else it just makes them sound wiser. Finally they may reach a point when they are (or regard themselves as) leaders in their field and act as evangelists or iconoclasts pushing one or other opinion heavily. You’ll see a lot of hedging on this blog.

Is it agile? Check the fixed points

One of the problems with agile development is that it’s subject to change, which means you’re in danger of changing it into something which isn’t agile. How do you know when you’ve gone too far?

This isn’t a problem for seasoned agile practitioners, but it’s a concern if you’re just starting out, or working with people who are. It’s certainly something I found looking back at my early agile days. There are some non-obvious or difficult agile practices which one might omit, but which make a big difference. How do we know what to watch out for?

My suggestion is to ensure there are some fixed points. Then you can evolve the process without straying from the ideals. My rule of thumb for creating fixed points is this:

Could I stand up at an international conference and be sure that my process would be generally indistinguishable from that of any of the speakers?

With that in mind, here’s what I’d want to be sure I did before I stood next to my peers…

[Photo: A fixed point, by lobo235]

Daily stand-up meetings at a set time

There are three things here: daily, stand-up, and set time. Making the meeting daily can be difficult, but having spoken to people who hold weekly meetings instead, it’s pretty clear those end up taking just as much time overall. In addition, daily communication keeps people from veering too far off track.

Insisting the meetings are stand-ups might be difficult and there’s no clearly tangible gain — after all, if you keep the meeting to 5 minutes, who cares whether you’re sitting or standing, or bouncing on one leg? But this is important to me because it instils a sense of urgency. A sit-down meeting fails to distinguish us from our document-heavy, procedure-driven colleagues. It’s entirely psychological, and all the better for it.

Making the meeting at a set time can be difficult, particularly if the lead has other responsibilities. But this is important to me because it forces people to focus on the project. It says “The train’s going at this time — be there or get left behind”. It tells people the project won’t wait for them. It keeps the team to a rhythm.

Iteration kick-offs and/or close-downs

It’s surprisingly easy to get away without these: some of last iteration’s story cards might not be finished, so the team just carries on; new story cards become available when the old ones are done, which could be any time. Iterations blur.

But a formal meeting to mark the iteration boundary again ensures the project and the team keep to a rhythm. Coming up to the boundary, it ensures people know ahead of time that they’ve got something to aim towards, and deadlines always focus the mind. Entering the new iteration, it gives people the chance to think afresh and renew their goals.

Mid-project retrospectives with the whole team

An end-of-project retrospective is good, but mid-project retrospectives are agile. They ensure the team really is seeking to improve not just for the next project but for the current one. It can be difficult to see the benefit of retrospectives (another time-consuming meeting), and I don’t think their value is obvious; I certainly don’t recall them cropping up in Extreme Programming Explained or in Agile Software Development with Scrum. One way to keep retrospectives effective is to ensure any actions are (a) objective and (b) assigned to named individuals. This way improvements will be much more tangible.

Continuous integration and testing

This is certainly an obvious feature of agile, but it’s also difficult. If the project is anything more than a few days long it’s worth spending the time up front to get the continuous integration and testing working. This ensures rapid change and frequent delivery. Otherwise it’s very likely that features and fixes will regress, that this won’t be discovered for long periods, and fixing the regression will take a long time, too. Agile is about frequent delivery and constantly moving forward at a regular pace. That can’t be done without continuous integration and testing.
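
As an illustration only (the class and test below are hypothetical, not GU code), this is the kind of small, automated JUnit 3-style check a continuous build can run on every commit, so a regression in a previously-fixed bug is caught within minutes rather than weeks:

import junit.framework.TestCase;

public class ArticleHeadlineTest extends TestCase {

    // A tiny stand-in class so the sketch is self-contained.
    private static class Article {
        private final String raw;
        Article(String raw) { this.raw = raw; }
        String headline() { return raw.trim(); }
    }

    // Pins down a previously-fixed bug: if the trimming ever regresses,
    // the very next continuous build goes red.
    public void testHeadlineSurvivesTrailingWhitespace() {
        Article article = new Article("Major wins vote  ");
        assertEquals("Major wins vote", article.headline());
    }
}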

Frequent customer demos

Again, not necessarily obvious because while agile talks a lot about on-site customers it doesn’t often talk about the common case of customers not being around as much as you’d like. Frequent mid-project demos are the next best thing — at Guardian Unlimited we call them “show and tells”. A customer demo has two benefits. First, it steers the team towards feature-driven development rather than architecture-driven development. (It’s very hard to impress a customer with a data object layer.) Second it encourages prioritisation of the next batch of features.

Anything else…?

There’s much more you need to do to be sure you’re being agile, but to my mind the things above are easily missed. So next time I create a team for an agile project I’m going to make sure we start with some fixed points, and evolve round those.

A field guide to simplicity

Here’s my contribution to Brad Appleton’s stimulating conversation-piece on simplicity. I’ve produced a number of short soundbites to help understand simplicity, and backed some of them up with concrete examples of how Guardian Unlimited handles its articles.

Brad makes it clear that simplicity is a confusing and often contradictory topic. But he’s written his article because

If we can come up with a modicum of modest, simple rules & principles to govern our design decisions in ways that help us minimize and manage interdependencies, eliminate constraints, and remove waste, then we are on the path to solving the real problem and meeting stakeholder needs in a way that is both simple and sustainable.

Strangely, he seems to fall foul of one of the very confusions he’s perceptive enough to identify. He says

The Agile Manifesto defines simplicity as “maximizing the amount of work not done.” But I think that’s a more accurate characterization of Lean than of simplicity.

but then goes on to cross-post his blog entry to several agile and extreme programming groups (1, 2, 3, 4,… thanks to Deborah Hartmann at InfoQ for the links). Indeed, that’s just unfortunate wording in the manifesto. The real issue, as Brad points out, is about complexity, and is something that even proponents of Big Upfront Design would support — indeed, it could be argued that Fred Brooks’s “surgical team” model of development emphasises a clear vision from a single developer, and therefore produces something uniquely unconfused.

Soundbites on simplicity

For me, it’s the following one-liner of Brad’s that jumps off the screen:

Hiding complexity isn’t the same as removing complexity.

That pithy statement not only sums up many software problems that I’ve come across, it’s also a gem just small enough to fit into my tiny, PowerPoint-addled, brain. Hiding complexity isn’t necessarily wrong, but it is often mistakenly confused with removing complexity and is thus the cause of much pain.

So inspired, I sought to produce some more soundbites. Here’s what I’ve come up with, while trying to avoid agile-specific leanness, or indeed simplicity of user interfaces (say). The observations below are all about simple design of software, something that should be relevant to any software developer or architect, agile or otherwise. There is some discussion with examples afterwards:

Time
  • Common mistake: Minimise the effort to produce results. Real goal: Minimise the effort over the total lifetime.
  • Common mistake: Take the most straightforward route. Real goal: Produce the most straightforward result.

Concepts
  • Common mistake: Hide complexity. Real goal: Remove complexity.
  • Common mistake: Use existing concepts. Real goal: Use simple concepts.
  • Common mistake: Avoid introducing new concepts. Real goal: Avoid introducing concepts new to the problem domain.
  • Common mistake: Look clever. Real goal: Be clever.

Comprehension
  • Common mistake: Ensure there are few lines of code. Real goal: Ensure there are few steps to understand the code.
  • Common mistake: State the obvious. Real goal: Increase obviousness.
  • Common mistake: Be elegant. Real goal: Be comprehensible.
  • Common mistake: Make it intuitive (to some unspecified person). Real goal: Make it intuitive to a newcomer.
  • Common mistake: Obviousness is the first thing you think of. Real goal: Obviousness takes hard work.

Simplicity and time

As Brad says: “…simple and sustainable”. We’ve all been in the situation where a simple hack costs us dear in the long run. It’s called technical debt. The cost of doing something isn’t to be calculated over the immediate future, but over all time.

You can see an example of this with our search engine, which is built on Endeca’s technology. If you search for “major” you’ll see two kinds of results appear: in the main part of the page you’ll find articles with the word “major” in them, but above those you’ll see links to politicians’ pages (e.g. former prime minister John Major) as well as films with “major” in the title (such as Major League II). Now, the first thing we did with Endeca was create an Endeca thingumy called an “Adapter” which would allow the engine to search articles. But politicians’ details are not stored as articles; they are separate entities in the database. So given that we already had articles being found, how could we best develop the system to fetch politicians’ pages? There were two options.

First we could create another Adapter, but one which allowed Endeca to read politicians’ details. The problem with this is that creating an Adapter takes time, particularly if you’ve only ever created one before. Additionally, an extra Adapter might be seen as making the search engine more complex. The second option was to allow the underlying database (Oracle) to present politicians’ details as if they were articles, and then Endeca would read them as it did any other articles. This would be quicker because we’ve had years of experience with Oracle against months of experience with Endeca. Also, the search engine software would have remained virtually untouched.

The right answer as far as simple design goes is to choose the first option. It might have taken longer, but it is completely consistent with the existing search engine architecture. The second option might not have increased the complexity of the search engine, but it would have increased the complexity of the database layer and anything that interacted with the politicians’ details. It would have been unintuitive (clever, yes, but unintuitive) and therefore more time-consuming for anyone else to maintain down the line.

We could have chosen the quick-for-now option, but it would have cost us more time in the future.
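
To make the first option concrete, here is a rough sketch in Java. The names (SearchAdapter, ArticleAdapter, PoliticianAdapter) are purely illustrative and not Endeca’s real API; the point is simply that each kind of content gets its own adapter, and the database layer stays untouched.

import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative only: not Endeca's real interface.
interface SearchAdapter {
    Iterator<String> records();   // each record is fed into the search index
}

class ArticleAdapter implements SearchAdapter {
    public Iterator<String> records() {
        // In reality: read articles from the content database.
        List<String> articles = Arrays.asList("article:major-speech", "article:major-league-ii-review");
        return articles.iterator();
    }
}

// Option one from the text: a second adapter that reads politicians from their
// own tables, keeping the search architecture consistent (one adapter per kind
// of content) and leaving the database layer alone.
class PoliticianAdapter implements SearchAdapter {
    public Iterator<String> records() {
        // In reality: query the politicians tables and map each row to a record.
        List<String> politicians = Arrays.asList("politician:john-major");
        return politicians.iterator();
    }
}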

Simplicity and concepts

This is the section where Brad’s original observation about hiding complexity fits in.

To me, one touchstone of conceptual simplicity is the importance of the domain and domain-driven design. The software problem should relate entirely to its domain. If the domain is complex then the software supporting it will necessarily be complex, but it should be no more so than the problem domain itself. Certainly the software should reflect the problem domain, but it should not bring in any complexities of its own, at least none that infect the domain layer.

A straightforward example of this can be found on the front page of Guardian Unlimited. As we saw above, most of what we produce are, of course, articles. Articles are fundamental to the system. But on the front page you can see a long list of stories, with pictures, links and sometimes little snippets of text. This is called a “trailblock”. When creating something like the front page we could have said trailblocks are just special kinds of articles. This would be the elegant solution, the clever solution, the solution which doesn’t introduce new ideas, and which therefore might be regarded as the simplest solution. But it would also be the wrong solution, because that is not how our journalists think of these things, it is not how they like to manipulate these things, and to contort an article into an unsuitable shape would have caused all kinds of problems.

What we ended up with was a trailblock being an entirely new type of content. As a result we have specialist editing tools (which are quite unlike articles’ editing tools), various ways of rendering them, and extensions such as RSS feeds which would be near-impossible if trailblocks were really just glorified articles. We introduced a new concept, but that concept was only new to the software. It was not new to the problem domain — the world that journalists inhabit.
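
A minimal sketch of that modelling decision, with hypothetical class names rather than GU’s real ones: the “clever” route would have been something like class Trailblock extends Article, but the route that matches the domain gives trailblocks their own type.

import java.util.ArrayList;
import java.util.List;

class Article {
    String headline;
    String body;
}

// One entry in a trailblock: a link, a picture and perhaps a snippet of text.
class Trail {
    String linkUrl;
    String pictureUrl;
    String snippet;
}

// A concept that is new to the software but not to the journalists' world,
// so it gets its own editing tools, rendering and feeds rather than being
// contorted into an article.
class Trailblock {
    String title;
    List<Trail> trails = new ArrayList<Trail>();
}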

Simplicity and comprehension

To take another example, I used to work with a wonderfully elegant system (now sadly defunct) which was so pluggable and so configurable that it was often impossible to work out where certain things came from. One day our sysadmin thought he’d like to try his hand at coding, so we gave him a simple job to do: the command line already had a help message, but we wanted it to be a bit more comprehensive. He returned in despair after an hour: “I’ve grepped through the entire source, and I cannot find one instance of the word ‘help’.” Yes, our code was so configurable, even the location of the help message was obscured.

Making something appear obvious is hard work, and you don’t get people admiring your work so much because, well, you’ve made it look obvious. But nor do you get people cursing you, the way a developer I know surely cursed the predecessor who, he discovered on delving into a system, had named a crucial function “.”. That’s a single full stop. Clever? Yes. Helpful? Unlikely.

Conclusion

There’s no pithy conclusion to this. I hope the guide above leads us to better understand what simplicity of software involves, and how to distinguish it from its near neighbours. Certainly it’s made me think carefully about a subject I’d previously underestimated.

Ajax versus Swing: A management question

For the last year or so within the Guardian Unlimited technical team there’s been a continual low-level question being debated: what technology should we use to build GUI tools for our editors, and in particular should Ajax play a part? It’s a debate in which a technology manager has a rather different perspective from that of a developer. In this article I want to explain those considerations.

Why is this even an issue?

It probably seems odd to ask about a GUI technology, since you would imagine — rightly — that we’ve already built an awful lot of tools for our editorial teams. They have tools to write articles, create quizzes, manage podcasts, and much more besides. But it’s come about recently due to a confluence of events: the rise of Ajax, and the realisation that most of the tools we built five years ago are Ajax-a-like tools without us having come up with the catchy moniker; the oft-discussed integration of newsrooms, and the consequential comparison of our home-grown tools with professional systems such as Adobe InCopy; and the general desire to upgrade and improve anything that’s a few years old.

Despite having Ajax-a-like tools in-house, the question as to whether we use Ajax is still pertinent because the frameworks and techniques available now mean we’re talking about a very different codebase and approach to what we’ve developed historically. We are in a position to decide on a technology which is actually new (Swing, Flash, etc) or effectively new (newer Ajax techniques, or a readily-available Ajax framework).

In some organisations this wouldn’t even be a question. Many BBC journalists use in-house Visual Basic applications, and if Guardian Unlimited was a Microsoft shop no doubt we’d be doing the same. Maybe we’d move from VB6 to VB.NET, but the core technology would remain: native Windows applications with native Windows GUI components.

But for better or worse GU is not a Microsoft shop, and we need our applications to run cross-platform — a lot of our staff use Macs as well as Windows machines. We can choose to build a tool using HTML web pages if it’s really simple, and for more complex applications we can choose either Java Swing or what we now call Ajax. For the purposes of this debate we’re really talking about a choice between Swing and Ajax, because both are building on a foundation of knowledge (Java, Javascript) that we already have.

Why is the technology manager’s view different?

Everyone in an organisation has their own perspective. Developers need to focus on getting the immediate task done; project managers need to focus on getting the immediate project done; I find that more than most other people I’m thinking about operational issues. How will it pan out in the long term? Where will it leave us 12 months from now? Five years from now? How will we grow with this technology in place?

Creating Ajax applications might be fun and cool, but there are plenty of more serious issues to consider. Here is my checklist of considerations…

Check: Minimise the number of technologies

There’s only so much an ordinary person can really be good at. Minimising the number of technologies you deal with means that you get to become expert in those, and so ensure you get the most out of them. If you use one technology everywhere, then whenever you pick up a skill or insight in one project every subsequent piece of work benefits.

By contrast, spending a relatively small amount of time on a particular technology means that you probably don’t get to use it in the most effective way. You may have been using Perl, say, over many years, but if that use has only been intermittent then you’d never learn the quirks, never learn the efficiencies, and never commit time to making sure you had a good, reliable Perl environment in which you could consistently produce really reliable code, and have the motivation to keep that environment up to date.

On the other hand, once you commit to using a technology anywhere, even in one place for one job (if we were to use Ajax, say, on the front-end website), then it’s too late — you’re already infected. The technology is already in use, and you’d better get au fait with it pretty quick. But you’ve still got some choice. You could still decide that it will only be used for that specific role, within strictly defined parameters. Nevertheless, that’s a much harder rule to enforce. You’re still at the top of a slippery slope.

Check: What is the future of that technology?

It’s of great importance that any software developed doesn’t become legacy — or at least avoids being legacy for as long as possible. Choosing a technology today that won’t get updated tomorrow means you won’t get bugfixes for it, and you won’t be able to find new libraries to deal with the changing technological landscape.

I once worked at a company where a core Unix server product was built around Netscape Enterprise Server (“voted best of 1998”). It was probably the right thing to do at the time, when Apache was just emerging, but it turned out it didn’t have a future, at least not for our purposes. It became increasingly hard to sell into companies and eventually we had to embark on a significant rewrite programme to recover lost ground.

Ajax is the technology du jour, but where will it be in two years’ time? Swing has the stability and backing of Sun Microsystems, but will the ascendancy of Ajax mean Sun takes its foot off the accelerator? You need to make a good guess at these and other market questions to ensure your software isn’t built on tomorrow’s legacy.

Check: How stable is the technology?

This is most pertinent to those technologies that are newer or developing rapidly.

I’ve seen a project built on an emerging graphics framework that changed every week. Each week the lead developer would spend a day updating his version of the framework from the trunk, fixing all the bugs and rewriting great swathes of his code because it used calls since removed or replaced.

Right now I wouldn’t want to be developing a mission-critical system on Parrot, as exciting as it looks. Even Ajax has its problems: the various Ajax frameworks will get updated periodically (we hope), Javascript implementations may or may not change subtly with browser point-releases, and you then have to ensure that the progress of your Ajax framework and the myriad browser versions you need to support keep reasonably in sync with each other.

Swing, of course, has never had a problem with lack of stability. And that’s also a problem, which is why we must also ask…

Check: How current will it remain?

Say what you like about the graphics framework that changed every week: it sure was up to date. By contrast, if you report a Swing bug to Sun tomorrow, what are the odds on it getting fixed in time for your next release?

Demanding stability and currency is of course a bit cheeky, but unfortunately they’re both needed. It is often possible to attain them, though — if your technology supplier is committed to backwards compatibility. This is a creed that Microsoft has stuck to vehemently for years, and which Sun has applied stringently to Java… so stringently, in fact, that its generics solution for Java is even more backwards compatible than Microsoft’s generics solution for .NET.

Check: How much support does it have?

Whether the technology can do the job is one thing, but you must also ask if it does it well. Does it have IDEs? Test frameworks? Deployment systems? Libraries for XML, RPC, TLA, ETC? These are usually things you can build yourself, but, really, are you going to?

Support also extends to the wider community. If you find a bug, how quickly will you be able to get help or otherwise find a solution?

Check: How ubiquitous is your new technology?

Another question for any new technology is how widely you can apply it. If you introduce a new technology you’re likely to find yourself using that to solve many other problems, even if it isn’t the most appropriate choice. You may find yourself extending an application in a way the underlying technology wasn’t really designed for.

I recall using the Informix Web Datablade for a project which was ostensibly a series of intelligent web pages. But as subsequent prototypes were delivered and new feature requests came in it became apparent that this was absolutely the wrong choice for that particular job. Of course, by then it was too late, and after a while the project was politely and calmly shut down.

Nobody wants to choose the wrong tool for the job, but you also need to look at future jobs and see what will be needed there.

Check: What is the base skillset?

This is the reason you don’t see much mention of Flash in this article. You can create great GUI applications in Flash (check out the Groovy.com demo) and Guardian Unlimited has terrific Flash skills in-house — see our interactive guides. But those guides come from our design team, not our software development team.

When considering any new technology you need to be sure about the skillset you’re starting from. Sure, everyone wants to learn Ruby. Sure, Ajax looks great on a CV. But what are the skills your team can really learn well? What does it take to be an expert? Because delivering an enterprise grade application means you need to be an expert, and there’s a world of difference between walking through the tutorial and creating your own application for a demanding client with a deadline.

Check: What is the future skillset?

And while we’re on the subject of CVs, what about all those new people you’re going to be hiring over the next few months and years? What are you going to find on their CVs? Are you going to find people with Java and Javascript? Perl and Ruby? Are your Java candidates going to want to learn Javascript? Seeking the right technology mix can make the difference between interviewing 9 developers and 90.

This issue applies not just to large technology choices, but smaller ones, too. I’ve been involved with an application where the data feed was SGML — a promising technology once, but it quickly fell to XML and no new developers arrived with SGML skills. After the initial developer moved on no-one really knew how the application was supposed to work. We rewrote it eventually, but only after an awful lot of pain.

So finally, then: Ajax or Swing?

Back to the beginning, then: Ajax or Swing? Naturally, the above checks don’t settle the matter, but I hope they give a good idea about the kind of things that need to be considered.

Interestingly, the Ajax option has been blurred significantly recently with a number of Java-to-Javascript frameworks which allow you to write an Ajax application without writing a single line of Javascript. Google Web Toolkit is one; Echo2 is another. Suddenly you can make a decision which is much less radical than one language versus another, and much more about choosing between frameworks on a more level playing field. All the above issues do need to be addressed, though. Both for this decision, and for any in the future.
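
To give a flavour of the Java-to-Javascript approach, here is a rough sketch in the early Google Web Toolkit idiom. The class names and signatures are written from memory and may differ between GWT versions, and the editor tool itself is hypothetical.

import com.google.gwt.core.client.EntryPoint;
import com.google.gwt.user.client.Window;
import com.google.gwt.user.client.ui.Button;
import com.google.gwt.user.client.ui.ClickListener;
import com.google.gwt.user.client.ui.RootPanel;
import com.google.gwt.user.client.ui.Widget;

// All of this is Java; the GWT compiler turns it into the Javascript
// that actually runs in the editor's browser.
public class EditorToolEntryPoint implements EntryPoint {
    public void onModuleLoad() {
        Button save = new Button("Save article");
        save.addClickListener(new ClickListener() {
            public void onClick(Widget sender) {
                Window.alert("Not a line of Javascript written by hand");
            }
        });
        RootPanel.get().add(save);
    }
}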

Big bangs and Telegraph.co.uk’s redesign

Technical people don’t tend to like a big bang release, but it’s often perceived as the only viable option when you launch a new service. Why is this, and is there any kind of middle ground? The recent redesign of Telegraph.co.uk is a useful way to look at this. I’m also going to use examples from Guardian Unlimited, most notably Comment is Free.

What is a big bang release?

A big bang release is one in which a large system is released all in one go. Recent examples from Guardian Unlimited include the Comment is Free site and our daily podcasts. This week the Telegraph launched its website redesign. In the offline world, UK newspapers have been in a state of almost constant change recently, kicked off largely by the Independent’s sudden transformation from a broadsheet to a tabloid in September 2003. The Times followed suit shortly after, and in September 2005 The Guardian changed to the “Berliner” size, new to any British paper. These launches can only be done in one go: you can’t have a newspaper that’s broadsheet on page 1 and tabloid on page 2.

Contrast that to…

At the opposite end of the spectrum to the big bang is the mantra “release early, release often”. This is how the Linux community claims much of its success. Much of GU’s work is done this way: while there’s always a project in the pipeline like Comment is Free, most of the time we’re making small improvements and adding new features day by day. Some of these are internal to our colleagues, some are bugfixes, but many are small features added to the public site. In January the Weekend magazine ran a photo competition with Windows XP. We worked with them to produce an online entry form that allowed our readers to upload their photographic entries directly to us.

So what are the worries about big bangs?

Big bangs give you a jump on your competitors and give you a big marketing boost. But they also put an awful lot at stake. Releasing a photo uploader is one thing. That’s more of a feature release than a product release. Consider what makes up Comment is Free, which the GU team created in collaboration with Ben Hammersley. In no particular order:

  • Steve Bell’s If… cartoon every day;
  • Comments on blog entries;
  • Keywords applied to each entry;
  • Commenters need to sign in (or register) before they can comment;
  • An RSS feed for the whole blog, plus one for each contributor;
  • A one-click feature for editors behind the scenes to “push” an article from the Guardian or the Observer into Comment is Free.
  • Dan Chung’s photo blog;
  • A list of the most active entries on the blog…
  • …and the most interesting from elsewhere;
  • A search facility;
  • The ability to serve all this even with the inevitable battering of thousands of readers checking it out every day.

And that’s not counting the editorial team powering it and the work of the designers creating something that is beautiful and works across a good range of web browsers (including the famously fickle Internet Explorer).

Using the example of Comment is Free there are several kinds of difficulties with big bangs. One is that there is a lot to potentially go wrong. Another is that there may be a big switch-over which is difficult to engineer and risky to run. A third relates to community reaction, and a fourth relates to social psychology.

Difficulty 1: The lots-to-go-wrong scenario

Within hours of launch, any of the Comment is Free features above (and any number of others which I forgot) could have failed in unpredictable but surprisingly common circumstances. The important thing to remember is that this is the first time any of these features would be tested by unknown people, and the first time they’re used together in unpredictable ways. A problem with a core component of the system can have an impact on many aspects visible to the readers. A big bang can’t be rolled back without a lot of egg on face.

The problem is much more acute with anything that makes a fundamental change to a core part of a system. In any one hour we’ll be serving hundreds of thousands of pages, some of which were published five minutes ago, some of which were published five years ago. A change to the way we present articles is going to have consequences for thousands of people in myriad different situations over the course of a few minutes. If we got something wrong there, misjudged some of those situations, we could annoy hundreds or thousands of them very quickly. And since such problems can take time to fix, more people will be annoyed the longer it takes.

All but the smallest of software launches throw up some problems. Often they are too minor to worry too much about (a particular colour change not working in a particular browser, for instance). Sometimes they are superseded by more significant problems in the same launch. With a small feature release (early and often) we can spend a little time and fix it. But if a sweeping change has occurred there’s the danger of having too many things to fix in a timely fashion. Even the original big bang took several millennia to settle down, although to be fair that was mostly a hardware rollout.

Difficulty 2: The big-switch-over scenario

With some systems it’s not a case of making a big thing public in one shot; sometimes it’s a major internal mechanism that has to be switched over in one go.

Adding comments to articles might not just mean an addition to the underlying system, but a wholesale either/or change. The system must work not just before and after the change, but during the change, too. And everything that links into the part which changes must be somehow unlinked and relinked in good time.

If you think this language (part, unlink, etc) is vague, by the way, there’s a reason for that: exactly what needs to be changed will be many and varied. Database tables, triggers, stored procedures, software libraries, application code, outgoing feeds, incoming feeds, application servers, web servers… all these might need “unlinking” (dropping, reinstalling, stopping, bouncing, whatever). It needs careful planning, co-ordinating, and execution, and it might only be possible over several minutes, which is a very long time if you’re running a live service.

Again, rehearsing this and doing it for real in the live environment are very different things. But this time if something goes wrong you’re stuck with a broken service or something which will take just as long to roll back — which will only leave you back at square one.

Difficulty 3: Community reaction

Even if everything goes well on a technical level there’s still the danger that your user community won’t like it.

A failure for Comment is Free, Emily Bell has written, would have been if no readers had posted any comments at all. That would have been an awful lot of work wasted. But at least in that scenario we could have quietly shut it down and walked away.

A more difficult scenario would have been if people were using it, but found a fundamental feature annoying or obstructive. A good example of this is the page width on Comment is Free. 99% of Guardian Unlimited pages are designed to fit a monitor which is 800 pixels wide. That was pretty standard six years ago when the site was designed, but is less common now. Both Comment is Free and the Telegraph’s new design require a width of at least 1000 pixels. That gives much greater design freedom and hence a better user experience — unless you can’t or won’t stretch your browser that wide, in which case it could be very alienating. If readers react badly to the 1000 pixel design then there’s very little that can be done about it. Sure you can redesign, but that’s a major undertaking.

Difficulty 4: Psychology

So, let’s suppose the features all worked, the switch-over all worked, and the community doesn’t react negatively. There’s still the problem that people, well, they just might not gel with it.

Blogs are a very good example of this. They’ve been around for almost a decade, and talk/discussion software has been around for much longer. But blogs only really took off in the last two or three years. Most of this is entirely down to getting the psychology right:

  • A blog is now recognised as a largely personal affair, being the soap box for individuals rather than a free-for-all forum;
  • discussion tends to be just a single column of comments rather than threaded, ensuring commenters respond more to the original poster, rather than getting into an internal discussion;
  • most bloggers write less content but write it more often.

That list is not something that falls straight out of a grand vision. It’s a series of very subtle psychological and social lessons that the whole online community learned over a very long period. Similarly any software release relies on social and psychological subtleties that can’t always be guessed at, and will often overlap each other. Making a single big release will obscure any of these subtleties. If people aren’t reacting to your product in quite the way you hoped then working out why could be difficult: a small tweak to a couple of presentational elements might make all the difference in the world… or perhaps your promotion of the product just wasn’t enough.

How can we mitigate this?

From the outside it looks like Telegraph’s website redesign was executed — is being executed as I write — without breaking a sweat (though I daresay internally there were a lot of frayed nerves as well as excitement). It gives us a few clues about how to manage a big bang.

Mitigation: Splash instead of bang

One trick is to make a big splash, rather than a big bang. The website’s redesign is actually rolling out over days and weeks, not minutes or hours. The team made sure to change the high-profile front page and World Cup pages first, so ensuring the perceived impact was greatest. They get a lot of credit without risking more than they need to.

Mitigation: Step-by-step approach

Even if plaudits are not what you’re after, it can be useful to slow big bangs into small, distinct steps.

A while ago at GU we needed to switch from one kind of server to another. On the face of it there was no way to smooth this switch: requests either went to one server, or they went to another. Nevertheless, any kind of problem switching over could have been potentially disastrous. We took extra time to build in a middle tier, which brokered incoming requests to the back-end servers, and added into that a facility to route different classes of requests to one or other of the servers. Initially everything went to the one server. Then we altered it so that only one specific kind of request, a tiny fraction of the whole, was routed to the other. We ensured that was working smoothly before we sent another class of requests over, and so on. We took what would originally have been a dangerous all-or-nothing switch, and smoothed it out over several weeks.
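
A minimal sketch of that brokering idea, with entirely hypothetical names and URLs: the middle tier looks at the class of each request and routes it to the old or new back end, and classes are moved over one at a time.

import java.util.HashMap;
import java.util.Map;

class RequestBroker {
    private static final String OLD_BACKEND = "http://old-server.internal";
    private static final String NEW_BACKEND = "http://new-server.internal";

    // Which classes of request have been moved to the new server so far.
    private final Map<String, String> routes = new HashMap<String, String>();

    RequestBroker() {
        // Start with one tiny class of request; add more entries only once
        // the previous class is proven to work smoothly.
        routes.put("search", NEW_BACKEND);
    }

    String backendFor(String requestClass) {
        String backend = routes.get(requestClass);
        return backend != null ? backend : OLD_BACKEND;   // default: the old server
    }
}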

Mitigation: Avoid feature creep

Another thing the Telegraph team have done is to avoid feature creep. The redesign comes with almost no new features — I count the ticker as new, but the rest seems to be a rearrangement of the same material. This contrasts (unfairly, but instructively) with, say, the Guardian’s Berliner reformatting, where the editor took a conscious decision not just to change the size of the paper, but also the design, the section arrangements, the approach to presenting the writing (distinguishing news and comment more distinctly, referencing the online world more) and numerous internal processes.

Avoiding feature creep on software projects is very, very difficult, and it takes strong wills and authority to say No to the many voices who want just a little bit more.

Mitigation: Don’t just test — rehearse

Testing individual features is one thing: testing the whole lot together is another. That’s why rehearsing the whole sequence together is important. And there’s no point rehearsing if you’ve not built in time to make changes when (not if) you discover there are flaws in your plan.

Mitigation: Plan for natural breaks

The Telegraph also seem to have a release plan with plenty of natural breaks: once started it seems they weren’t on a time-critical unstoppable slide. Of course they will have had deadlines, but if they had to slip some of these for the sake of something else then that was always a possibility.

A good example of this is their replacement of the missing e-mail and print features. The site’s articles originally had links to mail them to friends or print them out. The original rollout omitted these features, and there was a promise that they would be restored later. However the outcry from the community was such that they interrupted their schedule and reinserted them earlier than planned. It’s a good example of listening to your community, but it also shows that you need the flexibility to react accordingly.

Mitigation: The safe but indirect route

Big-switch-over-scenarios can be avoided by ensuring you take a route that’s not necessarily direct but is at least safe. You may end up with a very long and winding plan, however. This is a bit like the word game where you have to go from one word to another, changing one letter at a time, but still ensuring that you’ve got a word at every step of the way.

W H I T E
W H I N E
S H I N E
S H I R E
S H I R K
S H A R K
S H A C K
S L A C K
B L A C K

This can still give you your big bang if the final step is the only visible step. The Telegraph redesign will have involved an awful lot more behind the scenes than the apparently straightforward change first seen by the public. No doubt they could have done it quicker at the expense of a more confusing user experience, but they managed to keep the end user change to a minimum.

Mitigation: Trial features elsewhere

If you’re unsure of a particular part of a big bang release then it might be possible to trial just that feature elsewhere. As noted above, Comment is Free was the first large scale GU site with a thousand pixel width. It was a calculated risk. Reaction to that will inevitably inform any future services we launch.

Mitigation: Loosely coupled systems

Another mitigation technique to allow a big bang is to ensure the product being released is on an entirely separate system to anything else. If you’ve been following Comment is Free very closely, you’ll have read Ben Hammersley’s comment (you’ll need to scroll down or search for his name) that it’s based on the Movable Type blogging software. MT is certainly not what we use as our main content management system and it’s necessarily separate. This makes the big bang release easier: there is integration, but it’s loosely coupled. The systems are linked, but they’re not tied so closely that a disaster with one will adversely affect the other.

Don’t think it’s solved

All of the above approaches (and the many more I’ve inevitably missed) help, but they aren’t silver bullets. Each one has its drawbacks. A splash might not always be a commercially-acceptable compromise; there may be no obvious step-by-step solution; avoiding feature creep relies on more strength than organisations usually possess in the heat of the excitement, and even then it only limits the problems without eliminating them; rehearsing requires time and an expensive duplication of the entire production environment; natural breaks can be difficult to identify, and can rely on a running-order that isn’t compatible with commercial interests; the indirect route is necessarily lengthier and more costly; feature trials are only possible in moderation; loosely coupled systems limit the amount of integration and future development you can do.

The Telegraph has clearly encountered some of these problems.

It’s notable that “Education” disappeared from their navigation bars initially, and was only findable via the search engine, as Shane Richmond recommended to readers in one blog comment. No doubt they couldn’t find a cost-effective route from the old design to the new that encompassed preserving all the navigational elements.

The switch to a wider page was greeted with mixed reactions, and while we limited it to Comment is Free, the Telegraph dived right in and is applying it to everything.

Finally, it’s interesting that the new design is quite sympathetic to their previous one. No doubt a lot of that is due to the Telegraph having the same core values two years ago as it has today. But it’s some way from some of the more radical ideas they had, and no doubt that is partly a constraint of having to live with two designs for a period.

So where does this leave us?

I hope this explains some of the problems with big bangs, some of the ways to deal with them, and the difficulties that still remain.

We’ve won a Webby!

How great to be able to make this the first post of this new blog. Guardian Unlimited has won a 2006 Webby — winners of the Newspaper category. We were nominated with illustrious names from the world of US print, and it’s great to see our name among theirs. This is the second year we’ve won it, too, which is particularly wonderful, if slightly unsettling… it means there’s much more at stake over the next year.

It’s great to be part of such a star team. My colleague Sheila Pulham has more news and links on it (and it’s from her that I stole the unavoidable headline) over at the GU Newsblog.