Analytics Innovation, Issue 8

Where we look at how much bad data can cost a company


In the days approaching the election, Huffington Post mounted what could only be described as a sustained assault on Nate Silver’s 538 website, publishing no fewer than three articles virulently criticizing his predictions for the US Presidential election and the models he used to make them. In his attack on Silver in HuffPo, Ryan Grim wrote: ‘The short version is that Silver is changing the results of polls to fit where he thinks the polls truly are, rather than simply entering the poll numbers into his model and crunching them.’ His article concluded, ‘If you want to put your faith in the numbers, you can relax. She’s got this.’

Given that many will have taken Grim’s advice and had faith in the numbers, it is understandable that this faith is shaken, maybe even gone. The numbers failed them. Mike Murphy, a Republican strategist who predicted a Clinton win, summed up the mood towards numbers on MSNBC, saying ‘My crystal ball has been shattered into atoms. Tonight, data died.’ But what do flaws in the election modeling and polls actually tell us this time? Is data really dead?

The answer is no, data is not dead. Indeed, it is arguably more important than ever. What the election has shown, yet again, is that you cannot use data without context. There were a number of reasons individual polls were wrong. Pollsters are less likely to question new voters, in this case white working class Americans, who voted in high numbers. It has also been claimed that there were many so-called ‘Shy Trump’ supporters who hid their voting intentions for fear of being labeled racist and sexist, echoing a phenomenon seen in Britain, where the polls were badly wrong about both the Conservative Party majority in 2015 and Brexit. Frank McCarthy, a Republican consultant with the Keelen Group, a consulting firm in Washington, DC, said: ‘People have been told that they have to be embarrassed to support Donald Trump, even when they’re answering a quick question in a telephone poll.’ He added: ‘What we’ve been hearing from the [Republican National Committee] for months is there’s a distinct difference on people who get polled by a real person versus touch tone push poll,

Politics is ultimately about more than numbers. It is about people, and people are hard, though not impossible, to predict. As Pradeep Mutalik noted ahead of the election, ‘Aggregating poll results accurately and assigning a probability estimate to the win are completely different problems. Forecasters do the former pretty well, but the science of election modeling still has a long way to go.’
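To make Mutalik's distinction concrete, here is a minimal sketch in Python. It is not any forecaster's actual model: the poll margins are invented for illustration, the function names are hypothetical, and the win probability rests on the assumption that the true margin is normally distributed around the poll average. The same average can produce very different odds depending on how much polling error you assume.

```python
import math
from statistics import mean


def aggregate_margin(margins):
    """Unweighted average of poll margins (candidate A minus B, in points)."""
    return mean(margins)


def win_probability(avg_margin, error_sd):
    """Chance the true margin is above zero, ASSUMING it follows a
    normal distribution centered on avg_margin with spread error_sd."""
    z = avg_margin / error_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical poll margins, invented purely for illustration.
polls = [4.0, 2.0, 3.0, 5.0, 1.0]
avg = aggregate_margin(polls)

# Step 1 (aggregation) gives one number. Step 2 (probability) depends
# entirely on the assumed size of the polling error.
for sd in (1.5, 3.0, 6.0):
    print(f"average lead {avg:.1f} pts, assumed error sd {sd}: "
          f"P(win) = {win_probability(avg, sd):.2f}")
```

The aggregation step is the same in every row of the output. Only the assumed error changes, and with it the headline probability, which is where context and judgment enter the forecast.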

As always, if you have any comments or would like to submit an article, please don’t hesitate to contact me at

