A recent Business2community survey of data professionals found that the data science skill with the highest correlation to project success was data mining and visualization tools.
Jeff Jonas, an IBM Fellow and Chief Scientist of IBM's Context Computing group, seems to disagree. He argues that, ‘Contrary to hype, hopes, and dreams, Big Data visualization is generally not helping humans make novel discoveries. Data visualization has two primary purposes: exploration and storytelling.’ He continues, ‘What are the odds this form factor and experience will help someone find novelty – true weak signal; that proverbial “needle in the haystack”? Answer: Slim to none.’
Jonas is essentially arguing that the primary use of data visualization is to tell stories about data we already understand. Many would argue that he is wrong about its failure to help discover new insights, but even if he is right, does that mean visualization is not the most important stage of data analytics anyway? Discovering something novel is a wonderful thing, but it is entirely pointless to a business if decision makers can’t be convinced that the finding is real and persuaded to take appropriate action. Data visualization is the best way of doing this.
The amount of data that organizations consume and produce has grown exponentially over the last few years. Comforting as it might be to have reams of data stashed away, it is only valuable when you do something with it: analyze it, visualize it. Visualization reveals intricate structure in data that cannot be absorbed in any other way. Storytelling with data visualization draws an impactful response from the audience and reinforces it with numerical evidence. Because of the way the human brain processes information, presenting data as a story gives everyone in an organization a better understanding of it, and enabling a greater range of people to make sense of what the data is saying is likely to lead to more insights.
Zoomdata CEO Justin Langseth argues that Jonas is wrong to say that data exploration does not lead to unexpected insights, noting that pitting ‘exploration leading to insight vs “aha insights” as separate things’ is wrong: ‘They are the same, really.’ He argues that, ‘The best visual is the one that allows a normal human with understanding of a business system to quickly see how the visuals match up with the system, learn new unexpected things about the system (business) if there are any, but mostly just match up with their innate understanding of the business.’ In business terms, this is the most important thing.
Data still needs people to apply it to situations. The point of data visualization is to communicate with people and engage them. People also need to be convinced of the quality of the data, and visualization helps here too. It shows whether a dataset is incomplete by making it easy to see where data is missing in a report, and it helps check validity: a quick, preliminary visualization of collected data can show trends that indicate problems in the complete data. Jonas may be right that data visualization is not the best way of finding insights, but at the moment it is still the best way of using data to support decision making, and this is ultimately the most important stage of analytics.