Big Data has been hyped considerably over the past 5 years, with large organizations and governments now finding it necessary to collect and analyze huge amounts of data or face criticism for not keeping up.
This is a problem identified by the Australian Bureau of Statistics (ABS) in its recent report ‘A Statistical Framework for Analysing Big Data’. The paper finds that data programmes in companies and governments are conducted backwards.
By this they mean that as much data as possible is collected before anyone knows what will be done with it. ‘They put the cart (Big Data) before the horse (business problems) and treat Big Data as a solution in search of a problem,’ according to Dr Siu-Ming Tam, the ABS's Chief Methodologist and author of the paper.
According to Tam, it is also important not to mix datasets just because we can. It may be easier than ever to merge huge datasets, but doing so can produce skewed results. A correlation that appears once datasets are merged does not necessarily indicate causality, and correlations drawn across different datasets cannot always be trusted.
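To see why correlations found by merging can mislead, consider a minimal sketch in Python. It is not taken from the ABS paper; the function names, the 30-row sample size and the use of purely random columns are all assumptions chosen for illustration. It generates one random "target" variable, then compares it against a growing number of unrelated random columns, as if more and more datasets had been merged in, and reports the strongest correlation that turns up by chance alone.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_rows, n_columns, seed=0):
    """Largest |correlation| between a random target and n_columns
    unrelated random columns (hypothetical stand-ins for merged data)."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_columns):
        # Each column is pure noise: it has no real link to the target.
        column = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(target, column)))
    return best


if __name__ == "__main__":
    # More merged columns means more chances for a strong-looking fluke.
    for n_columns in (1, 10, 100, 1000):
        strongest = strongest_chance_correlation(30, n_columns)
        print(n_columns, round(strongest, 2))
```

None of these columns has any real relationship with the target, yet the strongest correlation found can only grow as more columns are searched. That is the risk of merging first and asking questions later: with enough variables, something will always appear to be related.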
However, these headline-grabbing aspects of the paper should not distract from another important point Tam makes: there are significant benefits to using Big Data in government.
For instance, Tam discusses the use of satellite images combined with other data to predict crop yields in Australia, something that plays a vital role in both GDP and agricultural health.
So what does this paper essentially tell us?
It appears that one of the most important points Tam is pushing for is that data needs to be collected and collated correctly if the insights are to be useful for Australian governments and companies. In truth, this applies to any organization, regardless of geography.
Being selective about the data collected, and having a clear mandate for how and why it is being gathered, is imperative for the success of any data-driven company. It is often simpler to bring in as much data as possible and then try to find correlations in it by merging datasets, but there needs to be a dedication to authenticity if the insights found are going to stand up to scrutiny.
So is the bubble bursting in Australia? I would argue that it isn’t, but there need to be concerted efforts to make sure that data programmes are being theorized and implemented correctly.