When you look at the data being mined today compared to only five years ago, the key difference is that there is considerably more of it. The rate at which data is collected has grown exponentially, which should, in theory, mean better decisions and stronger business performance. However, this huge increase has also brought difficulties in the form of complex data management.
This is an issue that many companies are facing today. Below, we look at four key ways that companies should be looking to their new treasure chest of data to create better business outcomes.
Create Smart Systems
We heard from Muktha Ananda at the recent Predictive Analytics Innovation Summit about the use of complex data mining techniques at XBOX to create smart systems for players. Here, they used data from various elements of gameplay to build systems that could emulate a player's gameplay in games like Forza Motorsport 5. It worked so well that it could predict how a player would drive around a new track and interact with other players.
This use of data has been well received by XBOX users because it provides a considerable benefit to them. It is a lesson that can be brought to a more diverse business landscape too, showing that data can be more powerful than a simple design indicator or marketing approach. The key is in how the work is approached and planned before data starts being collected. Understanding in advance what data is going to be used gives companies a clear advantage once they actually have access to it.
With the level of data mining possible today, companies can create experiences that are almost totally unique to a customer. This goes well beyond simply sending them coupons for something that they buy frequently. It allows companies to become almost completely different depending on who they are interacting with. Data can show how somebody would like a site to be laid out in more than simple A/B tested designs, whether a company should design a car in a certain way or how to make key business decisions.
For instance, Wired Magazine notes the example of Brown University, which leveraged data to improve its Engineering School. The university needed to know whether to renovate the existing school or move to another site off campus. After mining a huge amount of data and cross-referencing communications, fund sharing and collaboration, it found that keeping the facility on campus would provide a far better experience and ROI. It was only by drilling down to the level of individual faculty members and students, and analyzing the millions of interactions, that this became clear.
Collect What's Needed
Companies often fall into the trap of collecting everything they can without considering their use cases. Is it really going to be useful to know that somebody who likes the band Nickelback is more likely to buy orange juice on a Thursday if it's reduced by over 15%? Choosing what data to collect in order for it to be mined is key to a successful operation and the big wins.
The companies that have excelled in their data gathering and analysis are the ones that started relatively simply and then built on what they learnt from their humble data beginnings. Throwing money and resources at collecting and storing as much data as possible straight away is never going to be the best way to get effective results.
Facebook, for instance, now has four data centres across four areas of the world. In order to look through their data, they needed to create Scuba, a complex in-memory system, just to navigate it all. Without this system, a query would take weeks to perform, finding anything would be almost impossible, and whatever they did find would likely be irrelevant by the time it was discovered. A regular company that tried to collect or mine this amount of data straight away would fail. The reason that Facebook could do it was that they had a lot to build on in the first place, collecting what was needed until they reached the summit upon which they currently sit.
Keep It Up To Date
One of the powerful elements of Facebook's data, and indeed that of any data-driven company, is that it is current and relevant. This is why collecting a huge amount of information and then not using it can be damaging. Work with what your users are currently saying, not what they previously said.
Data decays at around 2% per month, and an average company can expect to see a decay rate of 25-30% every year. This is normally caused by changing circumstances, positions or living situations, and it shows that unless you use the data you collect within a year, up to almost one-third of your work will have been wasted. Making sure that the data you are using is as recent and relevant as possible therefore ensures that you are mining the best possible data sets, not simply looking at something that happened months ago.
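To make the arithmetic behind those figures concrete, here is a minimal sketch in Python. It assumes decay compounds, meaning that each month the same fraction of the still-accurate records goes stale; the function names are purely illustrative and not taken from any particular tool. Under that assumption a flat 2% per month works out to a little over 21% per year, so the 25-30% annual range would correspond to monthly rates somewhat above 2%.

```python
def annual_decay(monthly_rate: float, months: int = 12) -> float:
    """Fraction of records that are stale after `months`.

    Assumes compounding decay: every month, `monthly_rate` of the
    records that are still accurate become out of date.
    """
    return 1 - (1 - monthly_rate) ** months


def monthly_rate_for(annual_rate: float, months: int = 12) -> float:
    """Monthly decay rate that compounds to `annual_rate` over `months`."""
    return 1 - (1 - annual_rate) ** (1 / months)


if __name__ == "__main__":
    # A flat 2% per month, compounded over a year.
    print(f"2% monthly -> {annual_decay(0.02):.1%} per year")  # 21.5%

    # Monthly rates implied by the 25% and 30% annual figures.
    for annual in (0.25, 0.30):
        print(f"{annual:.0%} per year -> {monthly_rate_for(annual):.2%} per month")
```

Whichever end of the range applies to your own data, the practical point stands: a data set left untouched for a year will have lost a substantial share of its accuracy.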
This is perhaps the most difficult element of any data department, but it is probably the most important. It goes back to the old saying 'garbage in, garbage out'. It doesn't matter how powerful your technology is or how many highly paid data scientists you have looking for correlations and patterns: if they are mining bad data, the results will be meaningless.