The hype around big data has swollen to mammoth levels over the past few years. Organizations in both the public and private sectors have fallen over themselves to throw money at the technology and expertise necessary for collecting data and rooting out the insights held within, while a Google search will return millions of articles extolling its virtues. However, anything that goes up must come down, and for big data, it seems that the backlash has begun.
The past year has seen a widespread malaise gather around the use of quantitative data, most noticeably in politics. William Davies, writing in the Guardian, noted that surveys conducted in the US have found that 68% of Trump supporters distrusted the economic data published by the federal government. In the UK, a research project carried out by Cambridge University and YouGov looking at conspiracy theories found that 55% of the population believes that the government ‘is hiding the truth about the number of immigrants living here.’ It could be argued that this is simply a symptom of the growing distrust developing towards government and the ‘mainstream media,’ fueled by the rhetoric of supposed ‘outsider’ politicians like Donald Trump and Nigel Farage. Indeed, you will often find that people mysteriously start to believe in the numbers when they reinforce their pre-existing belief system. It would be easy to believe, therefore, that this begins and ends in the political sphere, but ideology and personal experience trumping evidence is a worrying trend that is increasingly being seen in other areas, including business.
This distrust has not sprouted from nowhere. There have been some well-publicized failures with data over the last year that detractors have been able to use as evidence. The polls around the EU Referendum and US presidential elections, for example, were dramatically mistaken according to public perception, even though they were both well within margins of error. There has also been a significant amount of bad press around big data, with algorithms employed by Google and Facebook to curate search results and timelines being blamed for driving partisan confirmation bias. Concerns about the implications of big data collection for privacy rights are also starting to appear well founded: a number of hacks have raised fears that data could end up in the wrong hands, and the growing use of wearables to monitor employees’ every movement suggests that data may actually be enabling companies to control us completely.
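To make the point about margins of error concrete, here is a minimal sketch of the standard calculation for a simple random sample at roughly 95% confidence. The function names, the sample size of 1,000 and the vote shares are all hypothetical, chosen purely for illustration; they are not figures from any actual poll, and real polls often use weighting that widens the margin.

```python
import math

Z_95 = 1.96  # z-score for an approximate 95% confidence level


def margin_of_error(share: float, sample_size: int, z: float = Z_95) -> float:
    """Margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(share * (1 - share) / sample_size)


def within_margin(poll_share: float, actual_share: float, sample_size: int) -> bool:
    """True if the actual result lies inside the poll's margin of error."""
    return abs(actual_share - poll_share) <= margin_of_error(poll_share, sample_size)


if __name__ == "__main__":
    # Hypothetical numbers: a poll of 1,000 people puts one side on 49%,
    # and that side goes on to win with 51% of the vote.
    poll, actual, n = 0.49, 0.51, 1000
    print(f"Margin of error: +/-{margin_of_error(poll, n) * 100:.1f} points")
    print("Result within margin:", within_margin(poll, actual, n))
```

Under these assumed numbers the poll ‘called the wrong winner’ yet was still statistically sound, which is exactly the gap between public perception and what the data actually claimed.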
Leading the charge against big data is Cathy O’Neil, author of Weapons of Math Destruction. She particularly warns against the statistical models being used in education and policing, arguing that such applications can actually codify biases and exacerbate inequalities. She writes that, ‘Models are opinions embedded in mathematics. Big data is a new field, and people are essentially blindly trusting it.’ She further argues that some are ‘using people’s fear and trust of mathematics to prevent them from asking questions.’
This is true, to a degree, but the examples of misuse O’Neil cites are extremely few in comparison to the many examples of good that big data has done. Blaming big data because of a few bad actors is reckless. Some people misuse cucumbers; that doesn’t mean I’m going to start putting gravel in my tuna sandwiches. But such skepticism seems increasingly to be infecting boardrooms. In a recent survey of 2,165 data professionals commissioned by KPMG and conducted by Forrester Consulting, 49% of respondents said they believe their C-level executives don’t fully support their organizations’ data and analytics strategies. In another KPMG survey of 400 US CEOs early last year, 77% said they have some level of distrust toward the quality of the data on which they base their decisions, and around 70% said that using data and analytics leaves their organization vulnerable to reputational risk. KPMG also found that only about 34% of business leaders said they are ‘very confident’ about the insights they get from data, and just 13% said that their firm excels in the privacy and ethical use of data and analytics. A recent Economist Intelligence Unit (EIU) survey found that just 2% of respondents said they had achieved ‘broad positive results’ from their data projects.
There is not yet any real evidence that companies are scaling back their data initiatives, but the warning signs are there. A Gartner, Inc. survey found that 48% of companies invested in big data in 2016, up 3% from 2015. However, the share of those who plan to invest in big data within the next two years fell from 31% to 25% in 2016, and almost 75% of respondents said that their organization has invested or is planning to invest in big data but is stuck at the pilot stage. Only 15% of businesses reported deploying their big data project to production in 2016, up from 14% in 2015.
This is not a trend that is easily corrected. The simple truth is that, as with any relatively nascent technology, adoption has not been smooth and there are still many mistakes being made. Even when you are doing everything perfectly, it is not easy to point to amazing ROI as evidence of success, simply because there are still no IT ROI models that can be used. James Kobielus, a Big Data Evangelist for IBM, notes that, ‘Putting a dollar value on data is a very tricky endeavor. Data is only as valuable as the business outcomes it makes possible, but the data itself is usually not the only factor responsible for those outcomes. How can we tie this back to putting a monetary value on big data?’ For example, around 45% of marketers participating in a study by the Winterberry Group said they had difficulty proving a return on investment for their data-driven campaigns. The consequence of this is that senior buy-in is harder to gain and skepticism harder to argue with.
This is going to be an ongoing battle. The most important thing is to implement data projects properly. Organizations need to stop expecting data simply to drop insights into their lap. Investment and expertise are required, as is the patience to wait for results. Organizations also need to understand that they cannot just rip off Google and Facebook; they need to take a bespoke approach to meet their specific business challenges. Data evangelists then need to prove ROI any way they can, and they need to do so constantly throughout the project so that doubt isn’t allowed to creep in. Data initiatives don’t have to involve a large initial outlay, and implementation should be done slowly: firstly so that everything is done right, and secondly so that evidence can be provided to justify further investment, allowing the project to be scaled up rather than simply abandoned if results are not immediately fantastic. Mark van Rijmenam notes on datafloq that there are now ‘a number of open source tools available that are free to use and that work on commodity hardware, thereby saving a lot of money’, while storage is also becoming cheaper, with innovative cloud data warehouse solutions offering value for money. You can also point to the many examples of where data has provided ROI. McKinsey has estimated that companies can increase their operating margin by 60% by using big data, and reduce expenditure by 8%. Case studies of individual companies, such as UPS, which has saved over 39 million gallons of fuel and avoided driving 364 million miles, also provide tangible examples with which to sell the idea of data.
Ultimately, data is not infallible, and skepticism is a good thing. Cathy O’Neil is right when she points to the misuse of algorithms, but this does not mean that we should just dismiss big data, only that we need to be more vigilant. Data should not be trusted implicitly; it should be probed rigorously for weaknesses before analysis is conducted. There are many problems with relying too much on data, and perhaps we have been promised too much over the last few years. But it has been a hard slog getting people to move away from intuition to evidence-based decision making, and this sudden regression in people’s thinking is a concern. We have made too much progress to go back without a fight, and need to draw a line in the sand.