The benefits of OO aren’t obvious: Part IV of V, refactoring

Previously I wrote:

Sometimes you work with something for so long that you forget that other people aren’t familiar with it. This happened recently talking to a friend of mine — “I’ve got no need for object orientation”, he said, and I was shocked. […] What, I wondered to myself, were the benefits of OO that he was missing? How could I persuade him that it does have a point? So here is my attempt to trawl my memory and see how I’ve changed in response to OO, and why I think it serves me (and my projects) better. In the end I’ve found that the benefits are real, but nothing I would have guessed at when I first started in this field.

I’ve already written about data hiding, separation of concerns and better testability. Now read on…

Refactoring

This is something that builds on all the items above. If you’ve worked on any piece of code for any length of time you’ll have come to the point where you think “Damn, I really should have done that bit differently.” Refactoring is what you want to do here: “restructuring an existing body of code, altering its internal structure without changing its external behavior”.

Refactoring is helped considerably by OO. “I already refactor,” says my non-OO friend, “I always make sure common code is put into functions.” Which is good, but there’s a whole lot more to refactoring than that. Martin Fowler lists a vast number of options. But before I list my favourite refactorings (how sad is that? I have favourite refactorings) let me explain why this is so important, why it probably isn’t all entirely obvious, and how it relates to so much else here…

First, be aware that two people could write outwardly identical applications whose source code is totally different internally. At its most extreme, refactoring would enable you to evolve from one codebase to the other in lots of very small steps, at each step leaving you with the same working, stable application. That’s much more than just putting common code into functions. If you think you already have the skill and technology to entirely change your codebase without changing anything outwardly then you’re a refactoring expert. If not, then you’ve got a lot to learn (and a lot of fun yet to have).

Second, refactoring is so important because it gives you a chance to improve your code. Unit testing (and indeed all testing) makes this possible. Test driven development isn’t just about test-first development, it’s also about refactoring. The last step of a TDD sequence is to refactor. You ensure your tests pass, you refactor, and you run your tests again, to make sure they still pass — to ensure you’ve not broken anything while refactoring. A suite of unit tests acts as scaffolding around your code, allowing you to rebuild parts of it without fear of the edifice collapsing. If you’ve made a mistake in your refactoring then the unit tests should show you exactly where the problem is.
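
To make the scaffolding idea concrete, here’s a minimal sketch in Java with JUnit (the class and method names are invented purely for illustration, not taken from any real project). The tests pin down the external behaviour, so the internals of count() can be reworked as aggressively as you like; if the tests still pass, the refactoring hasn’t broken anything:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    // A hypothetical class under test.
    class WordCounter
    {
        // First attempt: a simple split-based implementation. Refactor
        // it however you like; the tests below define its behaviour.
        public int count(String text)
        {
            if (text == null || text.trim().length() == 0)
                return 0;
            return text.trim().split("\\s+").length;
        }
    }

    public class WordCounterTest
    {
        @Test
        public void countsWordsSeparatedByWhitespace()
        {
            assertEquals(3, new WordCounter().count("one  two three"));
        }

        @Test
        public void emptyStringHasNoWords()
        {
            assertEquals(0, new WordCounter().count(""));
        }
    }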

Third, taking the previous two together, you can make vast, sweeping changes to your codebase in tiny, safe steps. Simply break down the process into lots of small refactorings, and test after each one. This is how the apparently-impossible task of changing your entire codebase reliably is possible. In reality you’ll probably never want to change your entire codebase but you will want to change large chunks of it, and this is how you’d do it.

Fourth, you often find yourself needing to add a feature which seems reasonable to a client, but looks difficult to anyone who understands the structure of the application. An excellent way to add the feature is to use refactoring as follows: refactor the software so that the addition of the feature becomes a natural extension, and then add it. Martin Fowler gives an excellent worked example of this in the first couple of chapters of his book, which is (almost) worth the price of admission alone.

Now you can browse the refactoring catalogue linked to above (and here again), but here are some of my favourite refactorings…

  1. Rename variable. This is very trivial, and very easy, but all the better for it. Forgotten what a variable does? Give it a better name. It’s so easy to do, but so useful.
  2. Extract method. Never mind extracting common code into functions, I often find myself extracting one-off code into a single function (or “method”, as they’re called in OO) with a long and useful name — see the sketch after this list. This is to make the original method much shorter, so I can see it all on one screen, and also so that it reads better. I no longer need to look at ten lines and translate them in my head into “…and then sort all the even numbered vertices into alphabetical order”. Instead I have a single method called sortEvenNumberedVerticesAlphabetically(). You think that’s silly? Maybe Java or VB.NET is your first language. Personally, English is my first language.
  3. Introduce null object [1, 2]. I don’t think I’ve ever used this, but I like it a lot because it’s just very interesting. The problem is that you’ve got special logic whenever a key object is null. For example, you need a basic billing plan by default, if the customer object isn’t set up and doesn’t have its own billing plan:

    if (customer == null)
        billing_plan = Plans.basic();
    else
        billing_plan = customer.getPlan();

    This is a small example, but it’s important to realise why this is undesirable: partly because each branch point is another potential source of error, and partly because your “null customer” logic isn’t just here, but is probably scattered all over your code. It would be better to have all the logic in one place, where it can be seen and managed and tested more easily. The solution is to introduce a null object — in this case a NullCustomer object such as this:

    // Extending Customer is assumed here, so that a NullCustomer can
    // stand in wherever a Customer is expected.
    public class NullCustomer extends Customer
    {
        // ...probably lots missed out here.
        // Meanwhile the getPlan() method of NullCustomer
        // just returns the default basic plan...

        public Plan getPlan()
        {
            return Plans.basic();
        }
    }

    Every time we set up a customer we should now use this NullCustomer as a default rather than a plain null. And our original if/then/else code can be reduced to this:

    billing_plan = customer.getPlan();

    That’s simpler and clearer, it doesn’t have any branches, and the logic for a null customer lives in one place.
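
And as promised in item 2 above, here’s a small sketch of extract method in Java. Vertex, Renderer, render() and draw() are all hypothetical names made up for this example; the point is that the calling method ends up reading like English:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    // A hypothetical Vertex, just enough for the example.
    class Vertex
    {
        private final int number;
        private final String name;
        Vertex(int number, String name) { this.number = number; this.name = name; }
        int getNumber() { return number; }
        String getName() { return name; }
    }

    class Renderer
    {
        // The calling method now reads aloud...
        void render(List<Vertex> vertices)
        {
            draw(sortEvenNumberedVerticesAlphabetically(vertices));
        }

        // ...because the ten lines of filtering and sorting now live
        // behind a long and useful name.
        List<Vertex> sortEvenNumberedVerticesAlphabetically(List<Vertex> vertices)
        {
            List<Vertex> evens = new ArrayList<Vertex>();
            for (Vertex v : vertices)
                if (v.getNumber() % 2 == 0)
                    evens.add(v);
            Collections.sort(evens, new Comparator<Vertex>() {
                public int compare(Vertex a, Vertex b) {
                    return a.getName().compareTo(b.getName());
                }
            });
            return evens;
        }

        void draw(List<Vertex> vertices) { /* rendering omitted */ }
    }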

Refactoring is a wonderful thing. And it’s even more wonderful if your IDE helps you. For .NET languages you can get the Refactor! plugin, and MSDN has a video demo, too. All that code manipulation at the touch of a button!

The benefits of OO aren’t obvious: Part III of V, better testability

Previously I wrote:

Sometimes you work with something for so long that you forget that other people aren’t familiar with it. This happened recently talking to a friend of mine — “I’ve got no need for object orientation”, he said, and I was shocked. […] What, I wondered to myself, were the benefits of OO that he was missing? How could I persuade him that it does have a point? So here is my attempt to trawl my memory and see how I’ve changed in response to OO, and why I think it serves me (and my projects) better. In the end I’ve found that the benefits are real, but nothing I would have guessed at when I first started in this field.

I’ve already written about data hiding and separation of concerns. Now read on…

Better testability

Another great benefit of OO is better testability. The simplest kind of testing you can do is unit testing — taking a single class and testing its methods in isolation. By being able to instantiate individual objects and ensuring there are minimal dependencies it becomes fairly easy to test almost every branch of your code. Fairly easy, but not trivial — there’s a difference between writing merely working code and writing testable code. Writing testable code is a very good skill to acquire, and test-driven development is a further discipline that’s certainly improved my code (and my peace of mind) no end.

There are further aids to testing that OO gives you, most notably stubs and mocks. Stubs are dummy implementations used when you need a non-critical dependency (such as a Repository for an Article when you’re not actually testing the Repository). Mocks are dummy implementations which simulate a given interaction and allow you to check that interaction afterwards. Both of these are easier with dependency injection.
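
Here’s a hand-rolled sketch of the difference in Java (the names are hypothetical, echoing the Article/Repository example from Part II). A stub just satisfies the dependency, while a mock records the interaction so the test can check it afterwards:

    import java.util.ArrayList;
    import java.util.List;

    // An abstraction the class under test depends on.
    interface Repository
    {
        void store(String articleText);
    }

    // Stub: the test needs *a* Repository but doesn't care about
    // persistence, so this one simply does nothing.
    class StubRepository implements Repository
    {
        public void store(String articleText) { }
    }

    // Mock: records each call so the test can assert afterwards that
    // the code under test really did call store() with the right text.
    class MockRepository implements Repository
    {
        final List<String> stored = new ArrayList<String>();
        public void store(String articleText)
        {
            stored.add(articleText);
        }
    }

In practice a mocking library such as jMock or EasyMock will build the mock object for you, but the principle is exactly this.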

Personally I think the greatest step forward in testing is unit testing, and automated unit testing in particular — have a look at NUnit for .NET languages and JUnit for Java. Even coding for fun, I’ve found a great sense of relief in not having to think about the correctness of my code, but leaving that to my unit testing framework. It means I can leave my code at any point and not wonder about whether or not it really works — because I know it does — and not have to hold anything in my head for when I return, because everything will be verified by the tests. Pedants will say there’s a lot more to testing than unit testing, and they’d be right, but to go from next-to-no testing, or infrequent testing, to automated unit tests is a huge step forward, and I cannot recommend it highly enough.

The benefits of OO aren’t obvious: Part II of V, separation of concerns

Previously I wrote:

Sometimes you work with something for so long that you forget that other people aren’t familiar with it. This happened recently talking to a friend of mine — “I’ve got no need for object orientation”, he said, and I was shocked. […] What, I wondered to myself, were the benefits of OO that he was missing? How could I persuade him that it does have a point? So here is my attempt to trawl my memory and see how I’ve changed in response to OO, and why I think it serves me (and my projects) better. In the end I’ve found that the benefits are real, but nothing I would have guessed at when I first started in this field.

I’ve already written about data hiding. Now read on…

Separation of concerns

This is one of those things that is a consequence of OO fundamentals, but by no means obvious. The root of so many problems in software is dependencies. A depends on B, B depends on C, and before you know it everything depends on everything else. It makes things very difficult to manage: changes become slower and slower, and riskier and riskier. Separation of concerns is about moving away from this: by forcing things into classes it encourages each class to do its own specific job. This breaks the dependencies and makes the management of your code much easier.

But it’s not that easy or obvious, and requires a fair bit of imagination to see it working really well. Let’s take a very simple example.

Suppose Guardian Unlimited has an Article class, and articles can be saved into a Repository (probably a database). Then we have two classes, each with a separate concern; the Article class has a save() method which uses the Repository to save to the backing store. But hold on there — we’ve still got the chance for it to go wrong.

An easy mistake would be to have the Article manage the setup of the Repository: when you create an Article it sets up the Repository all by itself. In some ways this is nice (it’s a form of encapsulation) because you can create and save an article without worrying about how the save is done, or even that there is a Repository at all. But there is a nasty dependency here: not only are we unable to set up an Article without having a Repository, but it’s the Article which determines what the Repository is. If the Article insists on a working SQL Server database then your application will always require a working SQL Server database, even if you just want to run a little test.

Enter dependency injection [1, 2, 3]. This is a technique that is by no means an obvious consequence of object orientation, but is facilitated by it.

In our example the Article depends on the Repository. It also specifies exactly what the Repository is, because it sets it up internally. Dependency injection says we shouldn’t allow the Article to set up the Repository; rather, we should set up the Repository externally and inject it into the Article (for example by calling a setRepository() method). The dependency isn’t broken, but the dangerous element is: the particular kind of Repository is no longer a concern of the Article.
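
Here’s a minimal sketch of what that looks like in Java (the names are hypothetical, following the Article/Repository example above). Note that the Article only ever sees the Repository abstraction, never a concrete implementation:

    // The abstraction the Article depends on.
    interface Repository
    {
        void store(Article article);
    }

    public class Article
    {
        private Repository repository;

        // The Repository is injected from outside; the Article no
        // longer knows or cares which concrete kind it has been given.
        public void setRepository(Repository repository)
        {
            this.repository = repository;
        }

        public void save()
        {
            repository.store(this);
        }
    }

Constructor injection is an equally common alternative: pass the Repository in when the Article is created.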

What this means in practice is that you can feed any kind of Repository into the Article and it will still work. You can even inject a “do nothing” Repository if you want to run a quick test which doesn’t require saving to the real database. Or you could inject a disposable in-memory Repository which only lasts for a limited time.

All of this is possible with another feature of OO: abstraction. If we ensure Repository is only an abstraction (rather than a specific concrete class) then we can have all kinds of Repository implementations: a SQL Server one, an in-memory one, one which always throws an error (so you can check how your application handles errors), and so on. In VB.NET you’d use an interface for this particular example.
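
Continuing the sketch above (again with invented names), any number of concrete implementations can stand behind the Repository abstraction, and the Article will accept them all:

    import java.util.ArrayList;
    import java.util.List;

    // A disposable in-memory Repository, handy for quick tests.
    class InMemoryRepository implements Repository
    {
        private final List<Article> articles = new ArrayList<Article>();
        public void store(Article article)
        {
            articles.add(article);
        }
    }

    // A Repository which always fails, so you can check how the
    // application copes with a broken backing store.
    class FailingRepository implements Repository
    {
        public void store(Article article)
        {
            throw new RuntimeException("Deliberate failure for testing");
        }
    }

A quick test can then call article.setRepository(new InMemoryRepository()) before article.save(), and no SQL Server is required.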

Dependency injection forces concerns to stay separate and makes your code much more flexible and easy to manage. But that wouldn’t be obvious to you if you’re just starting out in OO development, and if you’re just looking at the concept of objects, inheritance, and so on.

The benefits of OO aren’t obvious: Part I of V, data hiding

Sometimes you work with something for so long that you forget that other people aren’t familiar with it. This happened recently talking to a friend of mine — “I’ve got no need for object orientation”, he said, and I was shocked. Yet I couldn’t easily explain why I thought OO was far superior to procedural languages. And thinking about it afterwards I realised that its advantages for me weren’t obvious — almost none of them are clear from OO’s fundamentals (inheritance, abstraction, etc), and almost all of the advantages are things that have been learnt by most of us recently, years after OO first appeared.

My friend spends most of his time writing embedded software, but occasionally is called upon to write a front-end interface. His front-ends are Windows-based and for his last GUI project he chose to try out VB.NET rather than his usual VB6. He was baffled by the verbosity and he felt it didn’t add anything. It all seemed rather unnecessary. And yet to me it’s long been the only way.

What, I wondered to myself, were the benefits of OO that he was missing? How could I persuade him that it does have a point? So here is my attempt to trawl my memory and see how I’ve changed in response to OO, and why I think it serves me (and my projects) better. In the end I’ve found that the benefits are real, but nothing I would have guessed at when I first started in this field.

Not the point of OO

DNJ Online has an article on migrating from VB6 to VB.NET, and it says that from this point of view the OO language is “unlike anything you’ve used before”. This is helpful in its honesty, but that particular article explains the low level differences rather than the high level aims. Try/catch blocks and namespaces are definite steps forward, but they hardly justify learning an entirely new paradigm, let alone phasing out a well-established one.

Similarly, common justifications of OO are that objects promote code re-use and allow better modelling of user requirements, but it’s hard to see how either is obvious. As regards the former, procedural languages allow code reuse via libraries. As regards the latter, it’s just as easy to see user requirements as procedural things as object-based things — easier, arguably.

For me, the advantages of OO are not obvious. But they are tangible once experienced, given the right environment. For me, those advantages are…

Data hiding

Even I’m horrified by how little fits into my tiny brain, and the more I have to keep in my head at any one time the more likely I am to make mistakes. In terms of development, the more attention I have to pay to the intricacies of my software the less attention I can pay to solving whatever the immediate problem happens to be.

Fortunately OO’s concept of data hiding (or encapsulation, if you like long words) is for people like me. By making explicit what I need to know (public methods/functions) and hiding what I don’t (private methods/functions and fields/variables) I am freed to focus on the task in hand. Private fields and methods allow my classes to manage themselves without having to burden the user with the details. You can look at the class (or its auto-generated documentation [1, 2, 3, 4]) and focus on what you need to know, without getting distracted by the details.
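
As a tiny illustration in Java (a made-up class, not from any real project), everything a caller needs to know is in the public methods, and everything else stays private:

    public class BankAccount
    {
        // Hidden: callers never see or fiddle with the raw balance.
        private long pence;

        // Public surface: all a caller needs to know.
        public void deposit(long amountInPence)
        {
            if (amountInPence < 0)
                throw new IllegalArgumentException("Deposit must not be negative");
            pence += amountInPence;
        }

        public long balanceInPence()
        {
            return pence;
        }
    }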

Anything which makes me think less is a huge benefit.

Of course, data hiding by itself isn’t enough to switch programming paradigm, and if you’re writing in VB6 you’ll already have the “private” keyword. But this is only the start…


Conversation versus concentration

Compare and contrast two blog entries that popped up in my RSS reader on the same day last week. In the blue corner Joel O’Software, fighting for private offices. And in the red corner, Martin Fowler, battling it out for continuous collaboration between developers and their customers.

Naturally, they’re not really at opposite ends of a spectrum — in fact, their posts are about different things — and they’d find a lot to agree with in each other. But look at a selected part of each of their texts and see the contrast. Here’s Joel:

Not every programmer in the world wants to work in a private office. In fact quite a few would tell you unequivocally that they prefer the camaraderie and easy information sharing of an open space. Don’t fall for it. They also want M&Ms for breakfast and a pony.

And here’s Martin:

One of Kent’s suggested names for ‘Agile’ was conversational software development – the point being that it’s a two way communication. This isn’t something like a telecoms protocol that you can define, but the back and forth discussions about how software can enhance the business are where the real value lives. Much of this conversation is of half-baked ideas, some of which grow into valuable features – often ones that aren’t things that the customer originally thought of.

It’s notable how two people renowned for being leaders in software can veer so far apart on what should be a fundamental issue: how should people interact? Joel is for concentration, Martin is for conversation.

It’s notable also that they do different things in the software world. Joel produces shrink-wrapped products on behalf of his own company. So does Microsoft, the company Joel used to work for and which he praises for “putting literally everyone in individual, private offices”. Martin is a gun for hire (via his employer), called in to consult on a variety of projects for different companies, no doubt 90% of the time producing in-house software for each client.

I can’t escape the feeling that their respective backgrounds inform their respective views, though wouldn’t for a second think that implies that one approach always suits one kind of output.

So, should one prefer concentration or conversation? Obviously[*] it depends on several factors, and here’s the way I see it…

First, it must come from whoever you start with. If Joel O’Software starts a one-man band and likes to work in silence, then takes on his first employee, he’s not going to want his hire to keep piping up with questions every two minutes. Similarly, if you’ve created your business by extolling the virtues of pair programming then you’ll be looking exclusively for developers who will continue that good work. If you start with one introvert or extravert, you’ll grow from there.

Second, it depends on where you think your strengths are as an organisation. Joel talks a lot about productivity and algorithms, while Martin tends to talk of people and methodologies. Each also has an interest in the other’s topics, but their chosen hot subjects are where they see the biggest gaps, and where they think they can most make a difference.

Third, it’s about how you see your team. I suspect Joel has very low staff turnover, hires developers very infrequently, and there’s no doubt he puts a lot of effort into picking the cream of the crop: he’s in a buyer’s market, and his developers will all be very smart. Martin will inevitably work with a much broader range of companies. While they will of course have made a very smart decision to hire him and his colleagues (ahem), they will tend more towards the market norm, and will also tend to be fairly large development teams — even if individual project teams are smaller. Thus Martin is going to be much more concerned about sharing information between developers, evolving designs collaboratively, establishing standards and keeping those standards refreshed.

Finally, it’s about how you see the long term. Again, low staff turnover and a tight-knit team mean Joel is less concerned about knowledge silos, but an average corporate team will have average turnover and will have its average share of crises. Knowledge sharing and reducing single points of failure are essential in these cases.

All of that is why I favour conversational development. Knowledge sharing and evolving ideas are key for me as a general principle, all the more so because Guardian Unlimited is such a diverse site that there’s just so much to know. That doesn’t mean it’s easy for everyone, but for me a typical team will be stronger if it keeps every last bit of information flowing round, ideas constantly exchanged and checked, and experience continuously refreshed and revised. Concentration is often needed, but too often the price paid is too high, and is only discovered when one person is seconded to another project, has left, or is on holiday.

[*] I think there’s a progression whenever you ask “Is A or B better?” Naive or inexperienced people will always pick one or the other. Eventually they’ll come to hedge their bets because they realise things are more subtle than they previously thought, or else it just makes them sound wiser. Finally they may reach a point when they are (or regard themselves as) leaders in their field and act as evangelists or iconoclasts pushing one or other opinion heavily. You’ll see a lot of hedging on this blog.

Is it agile? Check the fixed points

One of the problems with agile development is that it’s subject to change, which means you’re in danger of changing it into something which isn’t agile. How do you know when you’ve gone too far?

This isn’t a problem for seasoned agile practitioners, but it’s a concern if you’re just starting out, or working with people who are. It’s certainly something I found looking back at my early agile days. There are some non-obvious or difficult agile practices which one might omit, but which make a big difference. How do we know what to watch out for?

My suggestion is to ensure there are some fixed points. Then you can evolve the process without straying from the ideals. My rule of thumb for creating fixed points is this:

Could I stand up at an international conference and be sure that my process would be generally indistinguishable from that of any of the speakers?

With that in mind, here’s what I’d want to be sure I did before I stood next to my peers…

[Photo: A fixed point, by lobo235]

Daily stand-up meetings at a set time

There are three things here: daily, stand-up, and set time. Making the meeting daily can be difficult, but having spoken to people who hold weekly meetings it’s pretty clear that they take up the same time in the end. In addition, daily communication keeps people from veering too far off the track.

Insisting the meetings are stand-ups might be difficult and there’s no clearly tangible gain — after all, if you keep the meeting to 5 minutes, who cares whether you’re sitting or standing, or bouncing on one leg? But this is important to me because it instills a sense of urgency. A sit-down meeting fails to distinguish us from our document-heavy, procedure-driven colleagues. It’s entirely psychological, and all the better for it.

Making the meeting at a set time can be difficult, particularly if the lead has other responsibilities. But this is important to me because it forces people to focus on the project. It says “The train’s going at this time — be there or get left behind”. It tells people the project won’t wait for them. It keeps the team to a rhythm.

Iteration kick-offs and/or close-downs

It’s surprisingly easy to get away without these: some of last iteration’s story cards might not be finished, so the team just carries on; new story cards become available when the old ones are done, which could be any time. Iterations blur.

But a formal meeting to mark the iteration boundary again ensures the project and the team keeps to a rhythm. Coming up to the boundary it ensures people know ahead of time that they’ve got something to aim towards, and deadlines always focus the mind. Entering the new iteration it gives people the chance to think afresh and renew their goals.

Mid-project retrospectives with the whole team

An end-of-project retrospective is good, but mid-project retrospectives are agile. They ensure the team really is seeking to improve not just for the next project but for the current one. It can be difficult to see the benefit of retrospectives (another time-consuming meeting), and I don’t think their value is at all obvious. I certainly don’t recall them cropping up in Extreme Programming Explained or in Agile Software Development with Scrum. One way to keep retrospectives effective is to ensure any actions are (a) objective and (b) assigned to named individuals. This way improvements will be much more tangible.

Continuous integration and testing

This is certainly an obvious feature of agile, but it’s also difficult. If the project is anything more than a few days long it’s worth spending the time up front to get the continuous integration and testing working. This ensures rapid change and frequent delivery. Otherwise it’s very likely that features and fixes will regress, that this won’t be discovered for long periods, and fixing the regression will take a long time, too. Agile is about frequent delivery and constantly moving forward at a regular pace. That can’t be done without continuous integration and testing.

Frequent customer demos

Again, not necessarily obvious because while agile talks a lot about on-site customers it doesn’t often talk about the common case of customers not being around as much as you’d like. Frequent mid-project demos are the next best thing — at Guardian Unlimited we call them “show and tells”. A customer demo has two benefits. First, it steers the team towards feature-driven development rather than architecture-driven development. (It’s very hard to impress a customer with a data object layer.) Second it encourages prioritisation of the next batch of features.

Anything else…?

There’s much more you need to do to be sure you’re being agile, but to my mind the things above are easily missed. So next time I create a team for an agile project I’m going to make sure we start with some fixed points, and evolve round those.