Key takeaways from DATAx San Francisco

We take a look back at five of the most significant insights to emanate from the two-day data extravaganza


With five stages, dozens of speakers and more than 450 people in attendance, the two-day data extravaganza that was DATAx San Francisco left us with much to mull over. Covering AI solutions, concerns about data leadership, the technicalities of machine learning and much, much more, delegates were left with a greatly enhanced understanding of the world of data and the future of the industry.

For those unable to make it this year – and those curious to know what the other stages covered – the DATAx team has pooled together its five key takeaways from the event.

To understand your data, first understand its limitations

While big data spent two days at DATAx comfortably astride a pedestal, its limitations were also repeatedly brought to our attention. We were reminded time and time again that using data efficiently requires an approach that is keenly aware of its boundaries.

"Part of being data driven, particularly when you are building a product, is stepping back and saying data doesn't always have all the answer," Cathy Tanimura, senior director of analytics and data science at Strava, explained during her presentation on the AI in Marketing stage. "Instead it is about really thinking about what the data can do to help us understand the opportunity."

Tanimura urged that, to make proper use of data, companies must focus on finding the insights that actually matter and gather data from many sources to ensure the least biased datasets possible. And while many organizations stand to benefit greatly from advancements in AI and ML, she also warned them to consider whether the technology is truly needed, adding that "the important thing is to bring it back to: Where is this business value? What does this add?"

During Day Two's AI in Healthcare summit, Kathryn Rough, research scientist at Google, addressed the audience with a similar message, emphasizing the importance of understanding the weaknesses of the vast amounts of raw medical data we have available today.

"US healthcare data alone reached 150 exabytes in 2011. For reference, five exabytes of data would contain all the words ever spoken by everyone on earth," Rough said. "It's messy and complex – and it was not intended for research purposes – and as much potential as there is there, we have to be careful in how we use it."

Like Tanimura, she encouraged companies to question the validity of the data they are using, acknowledging that poor quality data can drastically – sometimes fatally – affect patient outcomes. She also stressed the importance of compiling reports correctly: "It's crucial to transparently report, thoughtfully address limitations and not exaggerate findings."

Data governance was the festival's hottest topic

As data security becomes a central concern for customers, organizations from all walks of life are increasingly facing intense scrutiny about the way they use consumers' data. At DATAx, this was reflected in the vast number of questions leveled at speakers on the topic of data governance.

Needless to say, in the atmosphere of suspicion that surrounds the subject, many of our speakers were not able to comment on their companies' individual approaches. But one thing was decided during Day One's panel "Spotlight on Social Media": Data governance used to mean "lock it away", but the question has changed to "how do we figure out how to open it up responsibly?"

No doubt, next year's festival will see our speakers and attendees return to report some serious data governance breakthroughs – and with 72% of consumers having no idea which companies to trust with their information, according to DigiCert, this should be right at the top of most companies' lists.

Personalization: The ultimate matchmaker

The use of data to personalize users' experience came up time and time again. This is no surprise – 74% of consumers have chosen, recommended or paid more for a brand that provides a personalized service or experience, according to Infosys – and organizations today cannot afford to overlook this.

However, one way of utilizing a consumer's unique data that has proven to be incredibly effective is using it to pair up or group individuals who share behaviors, resulting in a higher satisfaction with the product.

This use of personalization was something Electronic Arts (EA) director of data science Scott Allen implored the gaming industry to adopt if it wants to develop a loyal fanbase.

Throughout his 30-minute session on the Gaming Analytics stage, Allen outlined how EA uses gamers' behavior to categorize its 500 million players into groups, allowing it to matchmake players' games "utilizing the affinity of confirmed actions".

Doing this, EA "are able to offer dynamic experiences and we are able to change the game while people are playing it", he remarked.

Taking the stage in the packed Machine Learning Innovation Summit, Shanshan Ding, machine learning (ML) lead at Hinge, talked through a very different form of matchmaking as she outlined how the dating app has used ML to more effectively couple up its users.

The team tried many different options for ML recommendation systems, Ding explained, but found that matching people romantically was not the simple personalized ranking problem that ML solutions typically work on with the likes of Spotify or Netflix. Instead, matchmaking was all about delivering its own personalized preferences for potential daters.

"It is really easy to show initial conversion if you show everyone the most attractive people," Ding noted. "But for online dating, we can have an ecosystem where all users are receiving all of the attention. There is an optimal distribution problem, so the decision was made to use ML for personalized ranking, then apply operations research to the distribution problem."

Finding a common language is key to success when working cross-functionally

Ineffective communication is the primary contributor to project failure a third of the time and has a negative impact on project success more than half of the time, according to research undertaken by the Project Management Institute. And when it comes to data and its complexities, communication becomes even more difficult when different players attempt to work together with very different levels of understanding about the technologies at play.

This struggle in communication was a subject that many of our speakers covered, and it was a dilemma Emma Huang, director, data sciences – external innovation at Johnson & Johnson, attempted to tackle during her presentation on the AI in Healthcare stage on Day Two.

First, Huang outlined how vital communication and outreach are across industries in developing the partnerships that ultimately move technology forward.

"I want to encourage everyone to make a connection, because while it might not be the right time to partner it's always good to have a conversation," she urged. "At the end of the day we all want to move in the direction of innovating healthcare."

However, she noted that the difficulties in communication between large, legacy companies like Johnson & Johnson and smaller, more nimble data startups are vast, creating an "asymmetry in inherent valuation".

"For example, a company like Johnson & Johnson will work on drug development for years but won't know the nitty gritty of data science and cannot make use of buzzwords – it all looks like black boxes to them."

Therefore, there needs to be an emphasis on avoiding jargon, argued Huang, acknowledging that this is something experts in data science have heard over and over again. However, her conclusion was that companies should start investing in 'translators' – experts – on the inside so teams can communicate.

"It's important to have internal experts. If you don't have internal experts, how can you expect to speak the same language?"

She believes we will increasingly see roles like this popping up, enabling cross-functional teams to communicate and effectively innovate going forward.

During the Gaming Analytics Summit on Day Two, Florent Blachot, associate director – data science at Ubisoft, also called for communication in an effort to speed up innovation, arguing that something as simple as understanding what members of your team do can increase productivity enormously.

"It's about mutual understanding," he said. "It is so important to educate your colleagues about what you are doing day to day, to educate them about your own work so they know what you are doing. Then it is crucial that you learn how your colleagues are working as well."

And as the pace of innovation ever-escalates, the next big data technological breakthrough is just around the corner…

DATAx was abuzz with organizations, old and new, ardently discussing their own and others' fresh technologies and exciting new use cases for data and AI. All in all, it was exceptionally clear that the much-talked-about fourth industrial revolution is already upon us and adaptation to this new way of life is critical.

What remained unclear, however, was exactly what would mark this novel era.

"Advances are merging the physical, digital and biological worlds in ways that create both huge promise and potential peril," writes the World Economic Forum. "The speed, breadth and depth of this revolution is forcing us to rethink how countries develop, how organizations create value and even what it means to be human."

With the acceptance of a new world comes the stark realization that the industry is missing… something: A gap waiting to be filled with a new technological revolution – one which will undeniably be powered by AI and data.

Several speakers and attendees offered their views on what this new pioneering tech will look like, be it chips able to power a whole new level of AI, a groundbreaking next-gen wearable or a much-needed competitor product to the waning smartphone.

As we look back, the opening statement from Edward Saatchi, CEO and co-founder of Fable, comes to mind.

"I'm here today to convince you that the next operating system (OS) will be a virtual being," he declared, noting that, as the era of the smartphone comes to a grating end, we need to start thinking about what system will mark humanity's next epoch. Saatchi believes that AI and VR alone are not yet developed enough to provide that system themselves, but the two technologies working in tandem could create an autonomous being with which we develop a deep bond, forming the next generation of OS.

Predictions aside, when it comes to data and AI, there is a vacant space for the next pioneer to muscle their way through the fray. And we, for one, cannot wait to see what the next year brings and what DATAx San Francisco 2020 will look like as it takes place on an entirely different technological landscape.
