Data Is Inherently Wrong

The use of data in its current form is great, but it does have some flaws


If you are to believe the popular media, data is the be-all and end-all of businesses. It allows companies to make the most of what they currently have, whilst also making better decisions about what they are going to have in the future.

However, this is not always the case. In many instances data can be the flaw in a decision, simply because it is attached to assumptions.

For those outside the technical side of data collection and analysis, this is not something that is necessarily considered. With algorithms sitting unnoticed behind the findings, few are going to consider the complexities of their creation, or why this process alone can be enough to devalue a conclusion.

When algorithms are built, they are built for a purpose, be this to find specific types of data, run focussed mining techniques or collate similar data. However, all of this is written into an algorithm by a human, who has the natural blind spots that every human has.

This means that bias is input into the system before the system has even been run, putting a certain degree of error into the conclusions. To take the example given by pymnts, the algorithm is essentially the recipe, whilst the data is the ingredients and the programmer is the chef. You can have the correct recipe and the correct ingredients, but Gordon Ramsay with the same ingredients and recipe is still going to outperform me in the kitchen.

Machine learning and AI are two ways that people are attempting to get around this inherent bias, but this in itself is not necessarily going to provide the answer. Going back to the pymnts analogy, machine learning is not creating something based on the recipe, but is instead the process of creating the recipe itself.

This means that it too is going to have flaws, as the actions are learnt and not programmed. It would be the equivalent of learning to drive a car without having seen anybody drive a car before. You would learn how to make it operate, and you might manage to drive from one point to another and even deal with the unexpected events that crop up along the way, but you are probably not going to be a great driver.

At present it is difficult to see a way around this. Human input is flawed given the inherent bias that it brings, but a human also has the self-awareness to check and cross-reference anything they do. A machine-learnt algorithm is the opposite: it does not input biases, but because of this it may not always do things in a logical way, because that is simply how it has learnt to do them.

I am certainly overstating some elements of this, and the truth is that these options are far better than the old ways of doing business. We are also seeing, with the spread of data use, that this kind of bias and catch-22 is not impacting too heavily on businesses at the moment. However, at this stage in the development of data-driven businesses, it is something that companies need to be aware of.
