The cost of delay of technical debt

Last week I wrote about the value of stopping work to tackle tech debt, and more specifically when we might expect to recover from the temporary stoppage. In the ensuing discussion a question came up about one of my assumptions, which was that the team was delivering at a constant rate (albeit below its potential). Really, said Richard Wild, we should expect our productivity to decline over time.

Since then I’ve tried to incorporate the idea of declining productivity while answering this question: What is the cost of delay of stopping to deal with technical debt? Or to use slightly less jargon, if we want to stop feature development and concentrate for a period just on addressing tech debt, what value does our business lose, if any, from postponing that stoppage?

Given some very particular assumptions I found this:

The cost of delaying a stoppage to address technical debt increases linearly with the delay.

In other words, if we postpone the stoppage for twice as long then overall it costs us twice as much.

Let’s be a bit more specific. If we stop to deal with tech debt then it will cost us some amount of money in the short term, but we’ll make up for it by being able to go faster afterwards, so in the end we’ll be better off. If we delay that stoppage by, say, 5 months, then we’ll still be better off overall, but we’ll have incurred an additional cost by delaying the stoppage, because for those 5 months we were consistently operating below our full potential. That additional cost is our cost of delay. If we delay the stoppage by 10 months then that cost of delay doubles. And if we delay the stoppage by 15 months then the cost of that delay is triple what it would have been had we tackled it at 5 months. This is what I mean by the cost of delay increasing linearly.

Actually, as we’ll see, it’s not strictly linear, but it seems to be as near-linear as makes no odds. So with that initial caveat, let’s look at some of the assumptions behind this claim…

Assumption 1. We’re going to stop to deal with tech debt. We are not considering the alternative of dealing with tech debt on a continuous, on-going basis, even though that’s generally a good and very sensible thing to do. In this scenario we’re assuming that for whatever reason (perhaps things have just got too difficult) a continuous approach just isn’t an option for us.

Assumption 2. Our productivity (or rate of delivery) is continually declining. In the Excel sheet referenced towards the end of this post I describe the decline as 1/x, shifted or scaled in some way. This gives us a downward curve that we can shift or stretch until it more or less fits our own particular situation.

Assumption 3. Once we’ve dealt with our tech debt our renewed productivity becomes constant. (Presumably it’s also greater than when we started, although that’s not essential for the maths to work.)

Assumption 4. The amount of work we have to do to tackle the tech debt is fixed. This is true regardless of whether we tackle it now, or in a year’s time. At first glance this might seem a little unlikely, but we’ll also say…

Assumption 5. The time required to deal with our tech debt is inversely proportional to our productivity at the point at which we start dealing with it. In other words, if our tech debt workload is 2 months of effort today, but we delay it for 12 months when we’re one third as productive, then it will take us 6 months when we pick it up at that delayed point.
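Assumptions 4 and 5 together can be sketched in a few lines of Python. This is only an illustration: the function name and the productivity figures (3.0 today, falling to one third of that) are made up for the example, chosen to match the "2 months becomes 6 months" case above.

```python
def stoppage_duration(effort: float, productivity_at_start: float) -> float:
    """Time needed to clear a fixed amount of tech-debt work (Assumption 4)
    at the productivity we have when we start (Assumption 5)."""
    if productivity_at_start <= 0:
        raise ValueError("productivity must be positive")
    return effort / productivity_at_start


# Illustrative units only: these figures are assumptions, not from the spreadsheet.
productivity_today = 3.0
productivity_in_12_months = productivity_today / 3  # one third as productive

# The workload is "2 months of effort today", so it is fixed at 2 * today's rate.
effort = 2 * productivity_today

duration_today = stoppage_duration(effort, productivity_today)
duration_delayed = stoppage_duration(effort, productivity_in_12_months)
print(duration_today, duration_delayed)
```

The workload itself never changes; only the rate at which we can burn through it does, which is why the later rectangle in the picture below is shorter and wider.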

Let’s look at that with a picture.

The horizontal axis is time, and the vertical axis is productivity, or the rate at which we’re delivering value. The blue downward curve a(t) is our default productivity, which declines as time goes on. The straight green line b(t) is what our productivity will become once we’ve dealt with the tech debt.

The two shaded rectangles are two options for dealing with the tech debt. The one on the left is the effort we have to put in if we deal with the tech debt now. We start today and finish at time tE. It has area E, which is the total effort we needed to put into it. The one on the right shows us starting the tech debt later, at time t0. It has the same area E as the first one because one of our assumptions is that we have the same amount of work to do regardless of when we start it. You can see it finishes at time t1 and the total duration (that is, t1 − t0) is greater because our productivity has declined.

In either case, when we’ve finished tackling the tech debt we can expect our productivity to jump from the blue curve a(t) to the green one, b(t).

If we tackle the tech debt immediately, then, we get something like this:

The value we end up delivering is the large green shaded area starting at time tE, which is when we’ve finished dealing with the tech debt.

On the other hand, if we delay tackling the tech debt we get something that looks like this:

In this case the total value we deliver is the blue shaded area on the left plus the green shaded area on the right. The blue shaded area is the (declining) value we deliver day by day before dealing with the tech debt. The gap in the middle, from t0 to t1, is us dealing with the tech debt. Then the green shaded area is us working at full potential.

And from these two pictures we can work out the cost of delay, which is the area in the first picture minus the area in the second one:

We can see that the cost of delay is the area of the green rectangle from tE to t1 minus the blue area that runs from 0 to t0.
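Written as a formula, the subtraction looks roughly like this. It is a sketch under the assumptions above: E is the fixed effort, b is the constant productivity after the stoppage, and T is a symbol introduced here purely for illustration, standing for any time horizon later than t1. It cancels out in the final line.

```latex
\begin{aligned}
t_E &= \frac{E}{a(0)}, \qquad t_1 = t_0 + \frac{E}{a(t_0)} \\
V_{\text{now}} &= b\,(T - t_E) \\
V_{\text{delayed}} &= \int_0^{t_0} a(t)\,dt + b\,(T - t_1) \\
\text{cost of delay} &= V_{\text{now}} - V_{\text{delayed}}
  = b\,(t_1 - t_E) - \int_0^{t_0} a(t)\,dt
\end{aligned}
```

The first term is the green value we miss out on by finishing at t1 rather than tE, and the integral is the blue value we deliver before the delayed stoppage begins at t0.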

So that’s all very nice, but what happens to the cost of delay (green minus blue) as we move t0 further and further out?

For this I took my inked scribblings [pdf] and put them into an Excel spreadsheet. (And because my maths is very rusty I also turned to Wolfram Alpha rather than engage my brain any more. The area under the blue curve a(t) relies on using integration, and it can tell you how to integrate a curve like 20/(x+4), which is very handy.)
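For anyone who would rather check the integral than trust it, here is a small Python sketch. It compares the closed form for the area under 20/(x+4), which is 20·ln(x+4) plus a constant, against a crude numerical estimate. The function names and the choice of a 12-month delay are illustrative assumptions, not taken from the spreadsheet.

```python
import math


def a(t: float) -> float:
    """Declining productivity curve used as the example: 20 / (t + 4)."""
    return 20.0 / (t + 4.0)


def value_before_stoppage(t0: float) -> float:
    """Closed form of the integral of 20/(x+4) from 0 to t0: 20 * ln((t0+4)/4)."""
    return 20.0 * math.log((t0 + 4.0) / 4.0)


def midpoint_integral(f, lo: float, hi: float, steps: int = 100_000) -> float:
    """Plain midpoint-rule estimate of the area under f between lo and hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


t0 = 12.0  # illustrative delay, in months
closed_form = value_before_stoppage(t0)
numeric = midpoint_integral(a, 0.0, t0)
print(closed_form, numeric)
```

If you change the curve a(t), the closed form has to change with it, just as with the integral equation in the spreadsheet.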

What we get depends on the precise numbers we put into Excel, but generally it turns out like this:


The blue line and green line (both measured by the right hand y-axis) are the same a(t) and b(t) we’ve seen above. The orange line (measured by the left hand y-axis) is the cost of delay of pushing out the tech debt further and further into the future. You can see it’s not quite a straight line, but it becomes near-as-dammit a straight line very quickly.

If you fiddle with the figures in the Excel spreadsheet you can see what happens in different scenarios. You can change the slope of the downward curve a(t) to make it more gentle, or approach a non-zero value. (If you do this, don’t forget to update the corresponding integral equation in the spreadsheet.) You can even see what happens if our renewed productivity is actually less than what we started with. As far as I can tell, in all (sensible) cases we get the same near-as-dammit linear cost of delay.
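If you don't have Excel to hand, the same experiment can be sketched in Python. All the numbers here are illustrative assumptions rather than the spreadsheet's actual figures: the curve a(t) = 20/(t+4), a renewed productivity b of 6, and a workload E of 10 (2 months at today's rate of 5).

```python
import math

# Illustrative parameters; all are assumptions, not the spreadsheet's figures.
K, C = 20.0, 4.0   # a(t) = K / (t + C)
B = 6.0            # constant productivity b(t) after the stoppage
EFFORT = 10.0      # fixed tech-debt workload E


def a(t: float) -> float:
    """Declining productivity before the stoppage."""
    return K / (t + C)


def value_before(t0: float) -> float:
    """Blue area: integral of a(t) from 0 to t0."""
    return K * math.log((t0 + C) / C)


def cost_of_delay(t0: float) -> float:
    """Green rectangle from tE to t1, minus the blue area from 0 to t0."""
    t_e = EFFORT / a(0.0)      # finish time if we start today
    t_1 = t0 + EFFORT / a(t0)  # finish time if we start at t0
    return B * (t_1 - t_e) - value_before(t0)


for t0 in (0, 5, 10, 15, 30, 60):
    marginal = cost_of_delay(t0 + 1) - cost_of_delay(t0)
    print(f"delay {t0:>2} months: cost {cost_of_delay(t0):7.2f}, "
          f"one more month adds {marginal:5.2f}")
```

The "one more month adds" column is the thing to watch: it should settle towards a constant, which is another way of saying the cost of delay becomes close to a straight line. With this particular 1/x-style curve the green term works out to be exactly proportional to t0, so the only curvature comes from the slowly growing logarithm.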

I find cost of delay mentally stimulating, but also troublesome. It can get very geeky, and that means almost certain alienation of senior stakeholders who don’t generally have time for that kind of thing. In this case we entered the world of integral equations, which, from that point of view, is awful.

But the good news is the outcome in this case is fairly easy to communicate, and that is this: If we’re just considering a temporary stoppage to deal with tech debt then delaying that costs the business proportionally more the longer we leave it.