Lightweight versus heavyweight: The cost is in the management

A recent conversation with a colleague got me thinking about so-called “lightweight” systems, and when they become more trouble than they’re worth. He was frustrated by some problems he was having; even more so, he explained, because he thought he was dealing with something that was “lightweight”. It’s a seductive word, and sometimes (as with other forms of seduction) when you get more involved than you should, things can get a bit sticky.

This article is an attempt to explain what lightweight really means, both in terms of benefits and drawbacks. There are also a couple of comparative examples from my own experience.

Lightweight doesn’t mean simple

People often mistake “lightweight” to mean simple or quick. But this can’t be right, because everyone wants simple and quick, and if it really meant this no-one would use anything else. Every website would be rewritten with the lightweight Ruby on Rails and every application would be sitting on top of the lightweight SQLite database. Who wouldn’t? Who doesn’t want simple? Who doesn’t want quick?

Lightweight is often good, but it must have its tradeoffs; otherwise other technologies wouldn’t exist.

From the examples below I see lightweight as offering low cost in return for low demands, high cost for high demands. Heavyweight is disproportionately high cost for low demands, but low cost for high demands.

Lightweight carries low inherent management costs. But some situations require a high degree of management control whether you like it or not. That means that if a lightweight system needs to scale up you have to wrest management from it and maintain it externally. If you can do that then the lightweight system continues to work, but if the lightweight system will not relinquish management control, or if you don’t have the discipline to keep the management going, then it won’t be effective in the long run. By contrast heavyweight systems impose management and structure of their own. This is good if you’re going to need it, as it takes the pressure of discipline off you, but it’s not effective if you didn’t need that management structure in the first place.

To illustrate this, here are a couple of lightweight/heavyweight comparison case studies…

Language example: TCL and Java

TCL is a lightweight language. You get to write Hello World in one line, it doesn’t force much structure on you, and it’s pretty relaxed about how it’s written.

TCL is so good, in fact, that it was the basis of the original Guardian Unlimited website. We built Ajax-style tools with it before Ajax was known as a concept, we generated our front page from it, we used it to integrate with our ad server.

But as our site grew the language didn’t scale with it. Clever shortcuts implemented by earlier developers confused newer developers because they obscured the purpose of the code. The lack of an imposed structure meant every foray into older code involved learning its idiosyncrasies from scratch. Development slowed down as we worked around older code. And when we wanted to redesign the website we found that through years of lightweight flexibility we had allowed ourselves to be tied into knots: it would be more effective to start again than to work with what we had.

In fact, for the most part we’re now using Java…

In contrast to TCL, Java is pretty heavyweight. Not only does Hello World require three lines (excluding any lines with just braces), but its philosophy of structure and layering percolates through from the core language to most of its add-ons. For example, to parse an XML document you have to drill through two abstraction layers before you can find the parser.
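
To make that layering concrete, here is a minimal sketch of one common route to an XML parser using the standard JAXP classes in the JDK. It is an illustration rather than the code we run in production, and the names XmlLayersSketch and rootElementName are made up for this example. You ask a factory for a builder, and only the builder will parse anything.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XmlLayersSketch {
    // Hypothetical helper: returns the name of the root element of an XML string.
    static String rootElementName(String xml) {
        try {
            // Layer 1: a factory that chooses the parser implementation.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Layer 2: a builder obtained from that factory.
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Only now do we reach the call that actually parses the document.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            // Wrap checked exceptions so callers of this sketch stay simple.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints the root element name of a tiny sample document.
        System.out.println(rootElementName("<article><headline>Hello</headline></article>"));
    }
}
```

Two objects have to be created before a single character of XML is read. That is the cost; the benefit is that the parser implementation can be swapped without touching the calling code.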

One Java framework that maintains this ethos is Hibernate, used for database access. Its architecture is complicated, and as usual this is to offer flexibility without relinquishing manageability. Recently a forthcoming release of the Guardian Unlimited website was failing its pre-production performance tests. Our developers tracked down a major cause of the problem to an inefficient query within Hibernate. They extracted some of the query’s logic up into the application layer and simplified what remained, rebalancing the work between the application and the database. Problem solved, performance restored. What’s relevant to our story is that the developers did this entirely within the architecture of Hibernate, so they didn’t compromise the design of the application and therefore didn’t add complexity.

CMS example: WordPress and the GU CMS

Over on ZDNet Larry Dignan extols the virtues of WordPress and says, effectively, “What have big content management systems ever done for us?”

WordPress is the lightweight CMS I’ve chosen for this blog, and I’m very happy with it. It’s easy to install, requires almost zero maintenance, and lets me focus on the writing. And yet I’m a strong advocate of the home-grown CMS we have for our journalists and subs on the Guardian Unlimited site. Is lightweight not good enough for our journalists? What has a big CMS ever done for us?

Well, I just looked at a current article on guardian.co.uk: “Ministers ordered to assess climate cost of all decisions”. It was created with our big CMS. What’s there that WordPress couldn’t deliver?

For a start it’s got a list of linked subjects down the side, which aren’t the same as WordPress’s tags because they’re tightly managed to ensure consistency and reliability. These subjects are also categorised, so Pollution and Climate Change are subjects under Environment, while Green politics is a subject under Politics. As I write this, I note also that the pages for Pollution and Climate Change are designed differently, with Climate Change being more pictorial and feature-led. Subject categorisations and subject-specific designs are beyond what WordPress’s tags do.

Okay, so apart from the linked subjects, the categorisations, and the subject-specific designs, what has a big CMS ever done for us? I suppose it’s worth mentioning the related advertising, which as I write includes a large ad for environmentally-friendly washing liquid. There are other contextual commercial elements, too, such as the sponsored features, links to green products and books, and offers of reducing energy bills and offsetting carbon emissions. And there are related articles and related galleries. And details of the article history, listing when and where it was first published, on what page and in what newspaper section.

Okay, so apart from the linked subjects, the categorisations, the subject-specific designs, the related advertising, the contextual sponsored features, the links to relevant products and books, the complementary offers, the related articles, the related galleries, and the article history, what has a big CMS ever done for us?

Well, I suppose it is serving over 17 million unique users a month…

I’ll stop now. The point is a lightweight CMS such as WordPress could probably do any one of these things, with a bit of work. But it isn’t designed to do anywhere near all of them. And each time it’s changed to do one more of these things it moves further away from its core architecture and closer to a point of paralysis, where nothing functions well anymore because no part of it is doing what it was designed to do. A bit like the TCL example.

Looking back

Reviewing these two examples, it’s clear that the lightweight systems became, or would become, very costly when they were pushed beyond their initial expectations. In both cases the corresponding heavyweight systems came with their own (heavy) management structure, but that management structure ensures lower running costs.

In the Hibernate example our software maintained its architecture after we’d made our performance change; anyone looking at this new code would be able to rely on previous knowledge to understand what was going on. By contrast, anyone coming fresh to a snippet of old TCL code would be starting from scratch, regardless of how much of the other TCL code they’d seen.

Similarly, the large-scale content management system at GU is internally consistent, despite its vast range of features and functionality. Once someone has learnt the principles (which, admittedly, are non-trivial) they can get to work on pretty much any part of it. Pushing WordPress to do that would have created a monster.

Lightweight systems take the management away from you. And that’s ideal, as long as you don’t need that manageability.

We (heart) UAT

When do we need user acceptance testing? And when can we get away without it?

User acceptance testing (UAT) is when your software goes in front of the user to get final sign-off — and when they ask for changes if not. In theory you shouldn’t need UAT at all (didn’t they tell you what they wanted? Weren’t you listening?), and indeed perhaps you can sometimes get away without it.

This argument is stronger if you’re considering UAT throughout the project, not just at the end. This would be the case if you’re UATing individual features, or individual deliveries in a culture of more and smaller iterations. The arguments are: more deliverables mean more UAT means more expense for the customer; smaller iterations mean a reduced chance of getting it wrong; frequent deliverables give the customer more opportunities to change anything they don’t like anyway.

However, even with small, frequent deliveries there are circumstances when it’s more important to do UAT. Looking back on some projects I’ve worked on, here are times when it was, or would have been, a good idea…

1. When the output of the software is subjective

I once worked on a project that asked what bands you liked and then based on that recommended albums by unsigned bands. This was a big leap away from the “people who liked that also liked this…” kind of recommendations, because the unsigned bands won’t have had a significant userbase, so we couldn’t rely on a self-generating wisdom-of-crowds mechanism. Plus, of course, music recommendation is highly subjective, and it was our clients who were the music experts — we implementers were mere software developers.

In this case you can see it was key for our clients to play with the system and be sure they were comfortable with it. If they weren’t, it would have meant tweaking the algorithm, not a complete rewrite of the software. In the event something rather unexpected happened. While our client’s representative was very happy with the output he also found it rather disconcerting. He felt somewhat disenfranchised from the system because it seemed so mysterious. He realised that he couldn’t demonstrate it in front of his colleagues and their investors without having an answer for the suddenly inevitable question: how does it work? And once this was explained to him he became much more comfortable again.

If the output of the system had been much more objective and predictable this instance of UAT wouldn’t have been so important. In the end it threw up an important new acceptance criterion for the client which we were quickly able to address.

2. When the software has a costly userbase

One of my past projects involved creating a user interface for a property data entry team. Individuals on the team were, as you’d expect, low-paid and largely interchangeable — after an hour of training you would know everything there was to know about the system. Your sole function was then to spend hour after unsociable hour entering information about properties you could never afford to live in for people you would never want to live next to.

But although the daily cost of individuals on the team was considered low, the cost of the team as a whole — and the value they were bringing to the business — was very high.

Problems with the user interface in this case would have had a knock-on effect across the whole team of 20 or 30 individuals, and cost their employer dear. It was key for someone to check not only that the system was responsive, but also that there were intuitive keystrokes across the data fields, data entry was reasonably forgiving, tabbing between fields happened in the order that data was presented in, and so on. In this case, the time and early feedback from one user at the vanguard saved time and cost 20 or 30 times over.

3. When you don’t have a QA department

A pretty obvious one, but if you don’t have people dedicated to quality assurance then those internal people who take on that function (probably the lucky developers or an irritated account manager) will be too close to the software and too distant from the client to pick up on all the issues that matter to that client.

A good percentage of the projects I’ve worked on have involved integrating search engines. And in almost every instance when testing has been left to the development team they’ve alighted on one or two key phrases to use as test cases and considered that sufficient. By contrast, the end user to whom this is important will have half a dozen phrases that are relevant each day — and they’ll involve accented characters, non-ASCII characters, mismatched quote marks, and so on. And that’s before we get to the results page. Without effort the expert user has given the search engine a good workout — often raising difficult issues for the developers.
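
As a rough sketch of the kind of workout I mean, the snippet below runs a handful of awkward queries against a toy in-memory search. Everything in it is hypothetical: SearchWorkoutSketch, normalise, search and the sample documents are stand-ins invented for illustration, not any real search engine’s API. The point is only that the test list should include accents, non-ASCII text and stray quote marks, not just one or two tidy key phrases.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class SearchWorkoutSketch {
    // Hypothetical sample corpus; the non-ASCII text is written with escapes
    // so the file compiles regardless of source encoding.
    static final List<String> SAMPLE_DOCS = Arrays.asList(
            "Cafe culture in Paris",
            "Climate change report",
            "Tokyo \u6771\u4EAC guide");

    // Hypothetical normaliser: strips accents, quote marks and case.
    // A real engine would do far more than this.
    static String normalise(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed
                .replaceAll("\\p{M}", "")                          // drop combining accent marks
                .replaceAll("[\"'\u201C\u201D\u2018\u2019]", "")   // drop straight and curly quotes
                .toLowerCase(Locale.ROOT)
                .trim();
    }

    // Hypothetical stand-in for the search engine under test:
    // a plain substring match on normalised text.
    static List<String> search(String query, List<String> documents) {
        String needle = normalise(query);
        List<String> hits = new ArrayList<>();
        if (needle.isEmpty()) {
            return hits; // nothing left to search for, e.g. a query of only quote marks
        }
        for (String doc : documents) {
            if (normalise(doc).contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The sort of phrases an expert user tries without thinking about it.
        List<String> awkwardQueries = Arrays.asList(
                "caf\u00e9",            // accented character
                "\"climate change",     // mismatched quote mark
                "\u6771\u4EAC",         // non-ASCII characters
                "\u201C\u201D");        // nothing but curly quotes
        for (String query : awkwardQueries) {
            System.out.println(query + " -> " + search(query, SAMPLE_DOCS).size() + " hit(s)");
        }
    }
}
```

A list like this is cheap to write once an end user has shown you what they actually type, which is rather the point of putting the software in front of them.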

Actually, even if you do have a QA department then there’s a case for UAT, because more often than not the QA team will test against written criteria. There are very often criteria which, with the best effort in the world, just don’t get realised or written down at the requirements stage. And in these cases it’s only the end user who can really tell if it’s right.

4. When you lack detailed requirements

Another fairly obvious situation, but if you’ve not had a good bit of customer input at the start of the project, then UAT is going to be even more important later on.

A sales person I worked with once had a cunning plan. We’d already produced an e-commerce site for one of our clients, and he decided to sell a duplicate system to one of their rivals. It was a good plan on paper. There was no contractual restriction for us; the new client would get their site at a relatively low cost; they’d get it quicker than usual; and we’d only have to reskin it. The deal was done over a nice lunch.

You can guess what happened in reality. When they saw what we imagined was a near-complete version they wanted some changes. The checkout system wasn’t quite to their liking; their product catalogue didn’t have quite the fields we were expecting; the order placement system didn’t align with what they had internally; and so on. It cost everyone more than they had anticipated and it was delivered later than anyone would have liked.

Certainly this is a strong case for robust specifications — or at least understanding what you’re getting into before you get into it. But it also demonstrates that lack of early information cost us dearly when the information did come along later. We hadn’t made time for UAT, which is why we delivered the final product later than planned. Fortunately for everyone there was no externally-driven deadline. If there had been then we’d have been obliged to deliver something which the client wasn’t happy with.

5. When the inputs to the software are difficult to specify

Finally, an example that brings us right to the present.

The current work to refresh the Guardian Unlimited website is much more than applying a lick of paint. We’re also producing new tools for our reporters and new ways they can tell their stories. But telling a news story is not simply a matter of filling in a few boxes and clicking Save. To make a story relevant requires a lot of hard training. To edit an entire news section additionally needs an awful lot of experience and good intuition…

As I write, MediaGuardian.co.uk is leading with three big stories and a podcast, plus some smaller items headed “More media news”. The stories have between three and six sublinks, which indicates the depth of related news that the editor has available. Yet at the same time the Technology section is showcasing six stories, all with a bit of blurb, but only one of which has sublinks (to a video and a gallery). In both cases the editors are using the same system, but the page layout has to balance itself out regardless of the highly variable input.

But though the input is highly variable it is not without bounds. A section would never be showing just one story, and an editor would almost certainly not try to highlight twenty stories and claim they had largely equal weight. The only way to really know the bounds in which the system is supposed to reasonably work is to sit an editor down with the tools and let them lay out pages in response to a real day’s stories. No amount of requirements analysis and QA experience can substitute for a real journalist responding to a real day’s events.

And that’s the beginning…

No mechanism should be applied blindly; there are times when user acceptance testing is more appropriate, and times when it is less appropriate. The important thing is to be aware first that UAT exists, and second that there are criteria which you can apply to assess how valuable it can be.