Big Data In The UK Election

How did the polls get it so wrong?


This piece was originally going to discuss how each party used Big Data to target specific voters, how the polls used it to make their predictions, and how accurate those predictions were.

What it will actually discuss is that there seemed to be only three people in the entire British political landscape who fully understood it.

These were Lynton Crosby, James Messina and Mark Textor.

As the population began voting at 7am on May 7th, every single poll said that the result would be too close to call; many claimed that the Conservatives had no chance of making it back into government even with a coalition. By 7am on the 8th, the truth was that they had secured a majority in the House of Commons. Almost everybody was surprised, apart from Crosby, Messina and Textor.

Unlike the consensus among the main polling companies, Messina and Textor's predictions on the morning of the 7th showed the Conservatives taking 315 seats: not enough for a majority, but significantly higher than the 280-290 seats that the other polls predicted. The real number was higher still, at 331 seats in total.

This shows that, although more accurate than the traditional polls, their model was still fairly flawed compared to the models that Nate Silver and Drew Linzer used in the US election.

So why did the pollsters get it so wrong?

Essentially, it was because pollsters used the same techniques they had used many years ago and still expected the same results. The truth, however, is that society today is a very different place from the one of twenty years ago.

Whereas then our information would have come exclusively from newspapers, TV and radio, today it comes mainly from the internet. It comes in volumes so vast that they dwarf what we previously had, and the polling companies were so far removed from this fact that they got the result almost completely wrong.

Where a newspaper may have one editorial on a political subject, across the web there may be thousands. Where a television programme may spend 10 minutes analysing an event, I can go onto YouTube and watch hundreds of hours of people bringing different perspectives to the same event.

The beauty of these interactions happening online is that they are all easily trackable.

According to many, the polling companies had simply called up a select number of people, or stopped people in the street, and asked who they were planning on voting for. This clearly did not work, and it demonstrated something I have long held to be true: your online activity is very similar to your voting activity.

Online you can hide behind a screen name and nobody is likely to know who you are or what you are doing. When voting you can put your cross in any box without anybody knowing which party you are voting for. It is the same anonymised action, and so there is a large crossover in behaviour.

Tracking online behaviour is fairly simple, and it can even be done at a drilled-down, granular level to build the most comprehensive view of who people are likely to be voting for.

Many have been disappointed at the outcome of the election, but from a data perspective the most disappointing aspect has simply been the lack of understanding of the way society gains its information and makes its decisions. Polling companies have a single job, which is to predict people's behaviour; it seems that they have either wilfully shunned or embarrassingly missed the primary tool in their arsenal in modern society.

But if this is the case then why did Messina, Textor and Crosby manage to have a slightly clearer picture?

It is simply because in James Messina they had somebody who was switched on to the effectiveness of Big Data; somebody who, when hired, simply picked up his iPhone and told his new employers that this device was what would win them the election.

He was right because he, unlike the polling companies, understood that the data that could be gleaned from the habits of mobile users would be key to accurately predicting voter patterns. In fact, mobile data is the ideal source for establishing voting patterns and the likelihood of specific seats being won or lost.

This is because you can not only get data on what is being read, communicated and viewed, but also see exactly where people are. As voting is segmented by area, it is essential to know not only what people are looking at, but also where they are likely to be voting. If most of your browsing happens in a large town but a little happens every night in a smaller village 30 miles away, the data would suggest that you work in the large town and live (and thus vote) in the smaller village.

It is a complex modelling system, but one that should be achievable if you want to get accurate information.
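A minimal sketch of the simplest version of this inference, using entirely made-up activity records and an assumed heuristic (evening and night-time activity marks the home area); the place names and hours here are illustrative, not from any real dataset:

```python
from collections import Counter
from datetime import datetime

# Hypothetical location-tagged activity for one user:
# (timestamp, area) pairs, as might come from mobile browsing data.
events = [
    (datetime(2015, 5, 1, 10, 30), "Large Town"),    # daytime: likely workplace
    (datetime(2015, 5, 1, 13, 15), "Large Town"),
    (datetime(2015, 5, 1, 21, 45), "Small Village"), # evening: likely home
    (datetime(2015, 5, 2, 11, 0),  "Large Town"),
    (datetime(2015, 5, 2, 22, 30), "Small Village"),
]

def likely_home_area(events):
    """Guess where a user lives (and so votes) by counting where their
    evening and night-time activity (7pm-7am) takes place."""
    evening = Counter(
        area for ts, area in events if ts.hour >= 19 or ts.hour < 7
    )
    return evening.most_common(1)[0][0] if evening else None

print(likely_home_area(events))  # Small Village
```

A real model would of course weight many more signals, but even this toy version separates the working-hours town from the village where the votes are actually cast.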

With this information the Conservatives could easily see that they had the edge in the way they were campaigning, whilst Labour believed the numbers that the pollsters had provided: their message wasn't working, but they were under the impression that it was.

This use of data is vital in an election where gauging public opinion is key to the success of a campaign. The fact that the companies who were meant to get it right got it so wrong is a clear indication that Big Data was not utilised to the extent that it could have been. The party that used it properly was the eventual winner, and this point alone is testament to why parties and pollsters need to be more vigilant in 2020.
