Interview With Caroline Clark, Machine Learning Engineer At Argos

'2018 will see lots of change in data science'


Caroline Clark is a Machine Learning Engineer at Argos, where she is currently working in forecasting and replenishment. She has a PhD in Cosmology from the Theoretical Physics group at Imperial College London, where she carried out research in data analysis and modeling for observations of the Cosmic Microwave Background, the echo of the Big Bang. Her interests are in optimization, time series models and Bayesian techniques, as well as setting up processes for model evaluation.

We sat down with her ahead of her presentation at the Predictive Analytics Innovation Summit, which takes place in London this March 21–22.

What first sparked your interest in analytics?

I have always been deeply interested in statistics and data analysis. I became involved in experimental data analysis through the final-year project of my MSci in Physics with Theoretical Physics at Imperial College London. My project focused on processing data from a balloon-borne Cosmic Microwave Background experiment. I continued with this field of study throughout my PhD before beginning my career in data science. My PhD gave me a deep understanding of computational methods, data and statistics, as well as a background in research skills.

How important is establishing a data-driven culture? What do you think is the most important thing companies can do to instil one?

Creating a data-driven culture is one of the most important aspects for the successful integration of data science and machine learning. It's about communicating the scientific approach to problem solving and being able to clearly and concisely explain your reasoning. One of the most important pieces of the puzzle is a thorough understanding of the evaluation metrics used to measure the performance of the current system and compare against a model. Creating a culture where teams make decisions on how to progress a data science project through a clear analysis based on these metrics is essential to success.

How do you see the data scientist role changing in the future, and how do you think machine learning will impact it? Will data scientists themselves see their work automated?

I think the data science role will evolve as companies and the data science community grow and develop. There's so much exciting research happening in industry that it's hard to keep track of all the new developments in the field. I think 2018 will see lots of change in data science.

Is there a skills gap in data science? If so, what can we do to fill it?

I'm not sure if there's a skills gap. In my experience, companies aren't always clear in the job description about what they're looking for from a candidate. For example, do you need someone with strong software development skills as well as a background in data science? It can certainly be more difficult to find someone with skills in a broad range of areas, but this may not be what you need, so I'd recommend understanding the requirements of the role and targeting candidates based on this. I spend a long time reading the job description when looking for a role and I think this can help to attract the right candidate.

Is a data science team better centralized or decentralized? Why?

I have worked in both kinds of team, and I think there are benefits to both. A decentralized team can build strong relationships with business teams, so I like this approach. Ultimately, as long as there is a clear structure to enable communication with business teams and foster collaboration, I don't think it matters too much.

What new technologies and approaches to data strategy should we watch out for in data analytics in 2018?

In 2018 we are obviously going to see lots happen with respect to data privacy, which I think is going to be very exciting. However, it's going to present new challenges to data science and machine learning teams as well.

What will you be discussing in your presentation?

Establishing tools for model and data governance is an important but often overlooked topic. I will discuss guidelines for governing the usage of data and techniques that can be applied to measure, assess and prevent bias in algorithms and models.

You can hear more from Caroline, along with other industry-leading experts, at the Predictive Analytics Innovation Summit, which takes place in London this March 21–22.

