Visualising data is good, but looking behind the visualisation is even better.
This became apparent to me recently while reading a fun piece of research by Justin Matejka and George Fitzmaurice:
An effective (and often used) tool used to demonstrate that visualizing your data is in fact important is Anscombe’s Quartet. Developed by F.J. Anscombe in 1973, Anscombe’s Quartet is a set of four datasets, where each produces the same summary statistics (mean, standard deviation, and correlation), which could lead one to believe the datasets are quite similar. However, after visualizing (plotting) the data, it becomes clear that the datasets are markedly different. […] Recently, Alberto Cairo created the Datasaurus dataset which urges people to “never trust summary statistics alone; always visualize your data”, since, while the data exhibits normal seeming statistics, plotting the data reveals a picture of a dinosaur. Inspired by Anscombe’s Quartet and the Datasaurus, we present, The Datasaurus Dozen…
The authors then show twelve patterns (a star, a cross, parallel lines, etc)—plus another in the shape of a dinosaur—which all have the same mean, standard deviation and correlation. They are clearly very different, but the summary statistics are the same.
The message is, of course, never trust the summary statistics alone—it’s far more informative to have a visualisation of the actual data itself.
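To make the point concrete, here is a minimal sketch that computes those summary statistics for Anscombe's Quartet, the older example the authors mention (not the Datasaurus Dozen data itself). The values are the commonly reproduced ones from Anscombe's 1973 paper; the names `QUARTET`, `pearson` and `summary` are my own, chosen for illustration.

```python
import math
import statistics

# Anscombe's Quartet: four (x, y) datasets with near-identical summary statistics.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, written out by hand."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def summary(xs, ys):
    """Return (mean x, mean y, sample sd x, sample sd y, correlation)."""
    return (
        statistics.mean(xs),
        statistics.mean(ys),
        statistics.stdev(xs),
        statistics.stdev(ys),
        pearson(xs, ys),
    )


# Print one row per dataset, rounded to two decimal places.
for name, (xs, ys) in QUARTET.items():
    print(name, [round(value, 2) for value in summary(xs, ys)])
```

The four rows should agree to two decimal places, even though a scatter plot of each dataset looks nothing like the others.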
But there’s another side of that message that’s rather hidden, and it’s something advocated by proponents of systems thinking: go see the work. The data presented in this paper was entirely fictional—there was no reality behind the data. But we’re usually presented with real data, and real data has something real and physical behind it. Real data about work—operational activity, a project, etc—has real work and real activity behind it. In Lean circles this is also called genchi genbutsu, and in the world of public sector digital development the phrase is “go see for yourself”.
So in these circumstances it’s good to look behind the statistics and visualise the data. But it’s even better to look behind the visualisation and go see what’s really happening on the ground.
In a project it’s good to go behind (say) team velocity and see the environment the team is working in, and the difficulties they face. If we’re looking at a product we should go beyond our dashboards and visit our customers or users, seeing how they work in their natural environments. If we’re looking at the economy it’s good to look behind figures such as GDP and related charts, and visit factories and workplaces to see what issues people really face on the ground.
Visualising the data is good. Going to see for yourself is an important additional step.