Category: Technology

“But why didn’t you write it in Ruby…?”

I thought I’d answer a question that comes up from time to time about the Java development and redesign work going on for the Guardian Unlimited site: “Why didn’t you write it in Ruby?”

I was reminded of this question because of an excellent post over at Manageability.org in which ceperez wonders aloud whether dynamic languages can scale. The post contains excellent thumbnail sketches of various languages, and the suggestion is that different languages cultivate different habits which have different consequences as an application (or project) evolves. The flipside of this question is Nati Shalom pondering why Java isn’t used more for large scale web applications. Nati’s post is one I’ve kept around for some time, wanting to follow it up, and so this post finally does that.

The question of why the Guardian Unlimited redesign isn’t taking place in Ruby is sometimes asked of me, but most often asked of any of our architects, often following a presentation of our work. Here are some answers.

Developers discuss Java and Ruby1. A personal answer

My own immediate answer to the question is usually a reaction to the fact that it is too often asked by certain kinds of people. These people are the Kool Kidz who have picked up Ruby on Rails over a weekend, created a blog in 15 minutes, sneaked a Ruby script onto their company’s servers, and are so delighted with their new toy that they think it should be the basis of the multi-million pound invoicing system currently wanted by the Accounts Department, which they’d quite like to build if they can use Ruby but which they have no intention of sticking around and supporting for the next ten years.

Unfortunately my personal answer to these people contains too many expletives to be included on a family blog.

Just as well we’ve got some other answers, then…

2. Confidence

Ruby might be a terrific language, with good support, and it might be really reliable. But the redesign work that’s going on now produced its first public release in November 2006. So the work will have started a few months before then, and the planning a while before that. Let’s round it out to January 2006, two years ago from now.

Casting your mind back two years you may recall that Ruby was an up-and-coming language, but it hadn’t really been proven running large scale websites. Sure, 37signals were working with it to great effect, but that’s only one case study, and not of a typical company. If you’re going to sink someone else’s money (that is, your employer’s money) into a big project you have a responsibility to ensure you’ll get results. You’ll want to be sure there is sufficient expertise in the marketplace to call in help when you need it, you’ll need to see several other similar projects that have succeeded with it, and you’ll probably want to know of a few projects that have failed with it so you know what not to do. With the state of Ruby at the time there just wasn’t that weight of evidence that supported its use for our purpose.

3. Scalability and socialness

Finally, a very important factor in our consideration was a kind of scalability, and what one of our architects calls socialness. It’s the scaling of the team.

There’s a world of difference between four-or-so developers writing some software and several dozen developers writing it. The day-to-day mechanics of writing software changes significantly when you’ve got so many developers that…

  • there are too many to fit round the same table;
  • you can no longer rely on saying something aloud and being heard by everyone;
  • you couldn’t possibly have recruited them all if you had wanted Indentikit developers who would all naturally solve all problems in exactly the same way;
  • any code you write is likely to be picked up next by someone you’ve not happened to speak to for more than a fortnight;
  • any code you pick up is likely to have been written by someone you’ve not happened to speak to for more than a fortnight;
  • it’s only to be expected that the people you end the project with won’t be all the same people you started with.

When all this is the case, then code consistency becomes hugely significant, and just relying on people’s own technical and communication skills isn’t going to be sufficient — you need your software to keep you on the straight and narrow. We weren’t comfortable that a language with the fluidity and flexibility of Ruby would provide that.

Postscript

I’ve just made a quick calculation: If quality is compound, and you’ve got 25 developers who each manage to code to 95% of your desired standards, then the quality of your codebase becomes 0.95^25, or 28%. Okay, it’s a pretty spurious calculation, but it goes to illustrate the point that quality is more fragile across a larger team.

In the end, the “why not Ruby” question comes down significantly to historical reasons and human reasons.

And what would we choose if we had to make the decision today? Well, we don’t have to.

Guardian Unlimited’s new look: Some background on templating

We’ve just launched the single most visible, most complex, and most trafficked page in the entire network of Guardian Unlimited sites — the Guardian Unlimited front. And in so doing we’ve revealed more of a new design that’s due to roll out over the next few months.

There are some really lovely editorial and commercial features in there, but I can’t give the whole game away now. Those tricks and subtleties will reveal themselves over the next few days, weeks and months as the news agenda and business opportunities unfold. However, I would strongly recommending looking at Emily Bell’s piece which explains what we’re doing, and Mark Porter’s background on the design.

Since this is a technology blog I want to focus on just a couple of technical areas that I’ve been tracking — templating and layout, and some of the issues we’ve had there.

Velocity hello world snapshotKeeping it simple with Velocity

Like the GU Travel site, perhaps our highest profile release in recent months, the templating was achieved with Velocity. This is no secret; Velocity was named as one of our technologies when we were recruiting for client-side developers earlier this year.I’ve noted with raised eyebrows the Velocity versus Freemarker skrimishes that have blown up on the Net from time to time — they have been both lengthy and deeply felt. I can’t honestly say that we ran a detailed and open comparison when we chose a templating framework, but we consciously chose Velocity because it enforces simplicity. I once recommended someone to the Velocity documentation page, and one of our developers piped up “That’s what I like about Velocity, it has a documentation page“. Admittedly, it’s a very long page, but it’s still one page. That’s a sign of its simplicity.

But maintaining simplicity is difficult, and it’s too easy when faced with basic tools and a complex problem to create hacks and workarounds. One of our warning signs of things going skew-whiff was when our Velocity templates were becoming difficult to understand. I know a few people who would regard such things as the norm, that workarounds are par for the course for a jobbing techie. But complexity of the solution should be proportional to the complexity of the problem, and rendering data in templates shouldn’t be complex unless the data is complex. Also, one of the reasons behind the strategic decision to keep the development in-house was to ensure quality remains high throughout the software stack — we can resolve problems at source, not just where we have access.

In the case of complicated Velocity templates I was at pains to emphasise to our client-side team that they should expect simple, intuitive, relevant information from the back-end systems, and they should use this to ensure their templates are easy to read — if they weren’t getting this then they should tell the back-end developers. Conversely the back-end developers considered it their duty to provide simple, intuitive, relevant information to the client-side team, but needed to be guided as to exactly what this meant.

Consequently, subtle changes in the way information was presented, or accessed, made big differences. The back-end team ensured not only that the domain model was adjusted and expanded to ensure simple templates, they also built a number of tools to ensure the work of templating itself was more manageable.

In the end the Velocity behind the Guardian Unlimited front does remains complex. But it’s complex only as far as the visual design itself is complex. That feels appropriate to me.

Anatomy of CSS rule, thanks to josephtateCSS: Cascading scalability snafus

Because there’s an awful lot of content on the GU front there’s an awful lot of styling. And that’s brought with it its own problems. An Agile process encourages change, and as the work came together the design and the styling necessarily evolved. However, while change can be handled easily in software (thank you, JUnit and Eclipse) that’s not really the case with CSS — there’s no reliable refactoring tool that I know of.

I’ve previously found Factor CSS which reduced a 300 line file I gave it to 40 lines. But as one of our client-side team stressed to me, “Refactoring CSS is about making it more manageable, not shorter”. IntelliJ IDEA does better. It allows renaming of IDs, class names, etc, which is a huge jump forward, but far from the current state of refactoring in object-oriented programming. So refactoring CSS is important to us because of that manageability: this is software we’re writing for the long term.

CSS editing with IntelliJ IDEAThus a short time ago we embarked on a manual refactoring exercise which rounded up all those tiny changes that had crept in throughout the Agile design process. When we began we had a single global CSS file of several thousand lines. That horrified me, though I suspect it wouldn’t be shocking to those who work on this kind of thing full time. There is debate within the team as to whether this is to be expected — “CSS doesn’t scale,” / “Yes it does, but we’re still on the learning curve”.

Either way, we’ve now split it out across several more manageable files and removed a lot of duplication and redundancy. No-one thinks we’ve achieved an ideal, but we think we’ve done what’s pragmatic for now. The team has achieved this partly through dogged hard work, and partly with a bit of innovation. The hard work needs little explanation — it’s work, and it’s hard, and it’s part of any job, although in this case it does have the virtue of rewarding you with incremental results at every step of the way. The innovation is in the way we link CSS templates to the on-page components that they refer to, so that each component pulls in its own CSS if necessary, and only if necessary. This ensures we can isolate component-specific CSS quickly and effectively, and therefore be more confident in our refactoring.

However, I’m left with the feeling that web design is a good way behind back-end software development when it comes to management.

Domain Driven Design book coverSeparating presentation and content

Of course, separation of presentation and content is second nature today, if rare in older systems. I noted with interest that when Yahoo! News UK relaunched recently one of the innovations listed was “proper separation of presentation and content for the first time”. Yahoo! News UK and GU clearly have more in common than just what’s on the surface.

But while the fact of presention and content separation is well-established, the two sides must communicate and how this is done is much less certain. To give a particular example, when we display an article the content consists of obvious things such as article body, headline, byline, etc. Our early adherence to domain driven design (DDD) led us to ensure that this content — and only this content — was visible to the Velocity templates. This sounds fine in theory, but in practice wasn’t good enough: the visual design of the page also depended on knowing non-domain information such as where the 200-word point was, or the dimensions of the associated image.

These things are part of the presentation logic, so need to be kept separate from the domain logic. On the other hand, they also need to be kept separate from Velocity, because that’s a templating framework, not a programming framework.

For these reasons passing in “fake” articles with extra (non-domain) information violated the architectural principles of DDD and lost us its advantages. We also flirted briefly with creating and exposing objects such a DecoratedArticle (using the decorator pattern) but this didn’t pass the intuition test. A DecoratedArticle is not an intuitive thing to anyone except an experienced software developer.

It took some time to evolve a solution that ticked the boxes for domain driven design, simplicity, intuitiveness and manageability. It involved small alterations in our templating patterns, among other things.

The GU front in retrospect

The work to date on the GU front has been far from trivial — not only because of the issues above, but also because of things I’ve not talked about such as performance and the software our editors need to keep updating it.

Along the way we’ve tackled new and difficult technical issues. You might think it’s odd that such issues weren’t resolved last year when we were working on the Travel site which is really part of the same system. But, as I said before, the GU front is probably the single most complex thing we could release. It’s only when you force yourself to confront the most difficult issues that you are forced to see root causes and address them at source. This is particularly so as there is much more work to come. As Emily says, “our view is that for a more technically robust site, which will be a platform for far more rapid developments in the future, an iterative approach is the best”. So we’ve made a good decision to face the difficult stuff sooner rather than later. There will be more technical challenges to tackle in the months ahead. But for now I can say we’ve resolved some really core issues in ways that should hold us in good stead for a long time to come.

Ajax versus Swing: A management question

For the last year or so within the Guardian Unlimited technical team there’s been a continual low-level question being debated: what technology should we use to build GUI tools for our editors, and in particular should Ajax play a part? It’s a debate in which a technology manager has a rather different perspective from that of a developer. In this article I want explain those considerations.

Why is this even an issue?

It probably seems odd to ask about a GUI technology, since you would imagine — rightly — that we’ve already built an awful lot of tools for our editorial teams. They have tools to write articles, create quizzes, manage podcasts, and much more besides. But it’s come about recently due to a confluence of events: the rise of Ajax, and the realisation that most of the tools we built five years ago are Ajax-a-like tools without us having come up with the catchy moniker; the oft-discussed integration of newsrooms, and the consequential comparison of our home-grown tools with professional systems such as Adobe InCopy; and the general desire to upgrade and improve anything that’s a few years old.

Despite having Ajax-a-like tools in-house, the question as to whether we use Ajax is still pertinent because the frameworks and techniques available now mean we’re talking about a very different codebase and approach to what we’ve developed historically. We are in a position to decide on a technology which is actually new (Swing, Flash, etc) or effectively new (newer Ajax techniques, or a readily-available Ajax framework).

In some organisations this wouldn’t even be a question. Many BBC journalists use in-house Visual Basic applications, and if Guardian Unlimited was a Microsoft shop no doubt we’d be doing the same. Maybe we’d move from VB6 to VB.NET, but the core technology would remain: native Windows applications with native Windows GUI components.

But for better or worse GU is not a Microsoft shop, and we need our applications to run cross-platform — a lot of our staff use Macs as well as Windows machines. We can choose to build a tool using HTML web pages if it’s really simple, and for more complex applications we can choose either Java Swing or what we now call Ajax. For the purposes of this debate we’re really talking about a choice between Swing and Ajax, because both are building on a foundation of knowledge (Java, Javascript) that we already have.

Why is the technology manager’s view different?

Everyone in an organisation has their own perspective. Developers need to focus on getting the immediate task done; project managers need to focus on getting the immediate project done; I find that more than most other people I’m thinking about operational issues. How will it pan out in the long term? Where will it leave us 12 months from now? Five years from now? How will we grow with this technology in place?

Creating Ajax applications might be fun and cool, but there are plenty of more serious issues to consider. Here is my checklist of considerations…

Check: Minimise the number of technologies

There’s only so much an ordinary person can really be good at. Minimising the number of technologies you deal with means that you get to become expert in those, and so ensure you get the most out of them. If you use one technology everywhere, then whenever you pick up a skill or insight in one project every subsequent piece of work benefits.

By contrast, spending a relatively small amount of time on a particular technology means that you probably don’t get to use it in the most effective way. You may have been using Perl, say, over many years, but if that use has only been intermittent then you’d never learn the quirks, never learn the efficiencies, and never commit time to making sure you had a good, reliable Perl environment in which you could consistently produce really reliable code, and have the motivation to keep that environment up to date.

On the other hand, if you commit to using a technology in one place, for one job (if we were to use Ajax, say, on the front-end website) then it’s too late — you’re already infected. The technology is already in use, and you’d better get au fait with it pretty quick. But you’ve still got some choice. You could still decide that it will only be used for that specific role, within strictly defined parameters. Nevertheless, that’s a much harder rule to enforce. You’re still at the top of a slippery slope.

Check: What is the future of that technology?

It’s of great importance that any software developed doesn’t become legacy — or at least avoids being legacy for as long as possible. Choosing a technology today that won’t get updated tomorrow means you won’t get bugfixes for it, and you won’t be able to find new libraries to deal with the changing technological landscape.

I once worked at a company where a core Unix server product was built around Netscape Enterprise Server (“voted best of 1998”). It was probably the right thing to do at the time, when Apache was just emerging, but it turned out it didn’t have a future, at least not for our purposes. It became increasingly hard to sell into companies and eventually we had to embark on a significant rewrite programme to recover lost ground.

Ajax is the technology du jour, but where will it be in two years’ time? Swing has the stability and backing of Sun Microsystems, but will the ascendency of Ajax mean they take their foot of the accelerator as they focus on Ajax? You need to make a good guess at these and other market questions to ensure your software isn’t based on legacy software.

Check: How stable is the technology?

This is most pertinent to those technologies that are newer or developing rapidly.

I’ve seen a project built on an emerging graphics framework that changed every week. Each week the lead developer would spend a day updating his version of the framework from the trunk, fixing all the bugs and rewriting great swathes of his code because it used calls since removed or replaced.

Right now I wouldn’t want to be developing a mission-critical system on Parrot, as exciting as it looks. Even Ajax has its problems: the various Ajax frameworks will get updated periodically (we hope), and Javascript implementations may or may not change subtly with browser point-releases, and then we have to ensure that the progress of your Ajax framework and the myriad browser versions you need to support actually keep reasonably in sync with each other.

Swing, of course, has never had a problem with lack of stability. And that’s also a problem, which is why we must also ask…

Check: How current will it remain?

Say what you like about the graphics framework that changed every week: it sure was up to date. By contrast, if you report a Swing bug to Sun tomorrow, what are the odds on it getting fixed in time for your next release?

Demanding stability and currency is of course a bit cheeky, but unfortunately they’re both needed. It is often possible to attain them, though — if your technology supplier is committed to backwards compatibility. This is a creed that Microsoft has stuck to so vehemently for years, and which Sun has applied stringently to Java… so stringently, in fact, that they created a generics solution for Java even more compatible than Microsoft’s generics solution for .NET.

Check: How much support does it have?

Whether the technology can do the job is one thing, but you must also ask if it does it well. Does it have IDEs? Test frameworks? Deployment systems? Libraries for XML, RPC, TLA, ETC? These are usually things you can build yourself, but, really, are you going to?

Support also extends to the wider community. If you find a bug, how quickly will you be able to get help or otherwise find a solution?

Check: How ubiquitous is your new technology?

Another question for any new technology is how widely you can apply it. If you introduce a new technology you’re likely to find yourself using that to solve many other problems, even if it isn’t the most appropriate choice. You may find yourself extending an application in a way the underlying technology wasn’t really designed for.

I recall using the Informix Web Datablade for a project which was ostensibly a series of intelligent web pages. But as subsequent prototypes were delivered and new feature requests came in it became apparent that this was absolutely the wrong choice for that particular job. Of course, by then it was too late, and after a while the project was politely and calmly shut down.

Nobody wants to choose the wrong tool for the job, but you also need to look at future jobs and see what will be needed there.

Check: What is the base skillset?

This is the reason you don’t see much mention of Flash in this article. You can create great GUI applications in Flash (check out the Groovy.com demo) and Guardian Unlimited has terrific Flash skills in-house — see our interactive guides. But those guides come from our design team, not our software development team.

When considering any new technology you need to be sure about the skillset you’re starting from. Sure, everyone wants to learn Ruby. Sure, Ajax looks great on a CV. But what are the skills your team can really learn well? What does it take to be an expert? Because delivering an enterprise grade application means you need to be an expert, and there’s a world of difference between walking through the tutorial and creating your own application for a demanding client with a deadline.

Check: What is the future skillset?

And while we’re on the subject of CVs, what about all those new people you’re going to be hiring over the next few months and years? What are you going to find on their CVs? Are you going to find people with Java and Javascript? Perl and Ruby? Are your Java candidates going to want to learn Javascript? Seeking the right technology mix can make the difference between interviewing 9 developers and 90.

This issue applies not just to large technology choices, but smaller ones, too. I’ve been involved with an application where the data feed was SGML — a promising technology once, but it quickly fell to XML and no new developers arrived with SGML skills. After the initial developer moved on no-one really knew how the application was supposed to work. We rewrote it eventually, but only after an awful lot of pain.

So finally, then: Ajax or Swing?

Back to the beginning, then: Ajax or Swing? Naturally, the above checks don’t settle the matter, but I hope they give a good idea about the kind of things that need to be considered.

Interestingly, the Ajax option has been blurred significantly recently with a number of Java-to-Javascript frameworks which allow you to write an Ajax application without writing a single line of Javascript. Google Web Toolkit is one; Echo2 is another. Suddenly you can make a decision which is much less radical than one language versus another, and much more about choosing between frameworks on a more level playing field. All the above issues do need to be addressed, though. Both for this decision, and for any in the future.