We spoke to Dean Jones, Head of Data Science at the National Trust, ahead of his presentation at Predictive Analytics Innovation in London on May 11 & 12.
Dean Jones is Head of Data Science at the National Trust. He holds a Ph.D. in Computer Science from the University of Liverpool and has spent the last 15 years working in the fields of natural-language processing, machine learning and data science. Following completion of his Ph.D., he worked as a Research Scientist for BT and then led text mining projects for Definiens, a Munich-based start-up, and for the National Centre for Text Mining at the University of Manchester. Before joining the National Trust, he was Chief Scientist for numero in Manchester and Head of Data Science for RecordSure in London.
Innovation Enterprise: How did you get started in data science?
Dean Jones: I’ve been interested in artificial intelligence since a disturbingly young age. I’ve followed the trajectory of AI from symbolic reasoning, which was the prevailing paradigm when I was a postgrad, to the statistical reasoning approaches which have become dominant more recently. I became very interested in machine learning as it became clear that this was an extremely effective way of solving many problems in natural-language processing. I’ve always been interested in the application of advanced reasoning and algorithms to everyday problems, and data science fits very well with this.
Are there any recent innovations in the data science community that you see as a ‘game changer’?
Lots. I love using tools like IPython and RMarkdown to interactively analyse datasets; I find them particularly useful for initial exploratory analyses. When it comes to building full-scale models based on huge datasets, the in-memory distributed processing that is possible using tools like Apache Spark and Apache Flink is a welcome step-change from the low-level coding required by map-reduce. At the National Trust, we get a lot of value from using tools like Tableau and Alteryx. Tableau allows us both to quickly investigate a dataset visually, which is often a nice way to communicate with our internal clients, and to produce reports for things like campaign evaluations. We use Alteryx to define workflows which blend data from multiple sources, and it integrates nicely with R and Tableau, which allows us to incorporate things like visualisation and model creation/evaluation into these workflows, so it’s all pretty seamless.
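The quick exploratory pass Jones describes, the kind of thing one might do in an IPython notebook before committing to a full-scale model, can be sketched in plain Python. The supporter records and field names below are entirely hypothetical, purely for illustration:

```python
from collections import Counter
from statistics import mean, median

# Hypothetical supporter records: (region, annual_visits, donated)
supporters = [
    ("North", 4, True),
    ("North", 1, False),
    ("South", 7, True),
    ("South", 2, True),
    ("Wales", 3, False),
]

# Quick summary statistics on visit frequency
visits = [v for _, v, _ in supporters]
print(f"mean visits: {mean(visits):.1f}, median: {median(visits)}")

# Breakdown by region -- the sort of cut you'd eyeball before modelling
by_region = Counter(region for region, _, _ in supporters)
print(dict(by_region))

# Overall donation rate
rate = sum(d for _, _, d in supporters) / len(supporters)
print(f"donation rate: {rate:.0%}")
```

In an interactive session each of these steps would be its own cell, with the output inspected before deciding what to model; at production scale the same aggregations would be expressed as distributed transformations in a tool like Spark rather than Python list comprehensions.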
What are the unique challenges facing the National Trust that you are looking to solve with data science?
We are a unique organisation: one of the biggest charities in the UK and one of the largest membership organisations. We’re fortunate to have lots of volunteers who work with us, we own hundreds of properties, we’re the second-biggest landowner in the UK, we have both bricks-and-mortar and online retail, and so on. Consequently, there are lots of people across the organisation who want to communicate with our supporters about all sorts of different things, and the danger is that our supporters end up swamped by lots of messages which aren’t joined up. Our challenge is to manage all of this so that we get the right message to the right supporter at the right time. We believe that data science has a huge role to play in getting this right.