Interview: Machine Learning Data Science Manager at Uber, Hugh Williams

"I spend a lot less time advocating for whether to use ML and much more time brainstorming how to use ML."

19 Feb

In anticipation of this year's Machine Learning Summit in San Francisco, I spoke to one of our slated speakers, Hugh Williams, data science manager at Uber.

Williams leads the Applied Machine Learning group at Uber, which is part of a larger central Machine Learning (ML) org that encompasses several teams working on ML platforms. One of those teams, Applied Machine Learning (AML), focuses on bringing technically advanced approaches (e.g., deep learning and semi-supervised learning) to production in order to solve Uber’s toughest problems.

Before Uber, Hugh spent time building ML solutions at Google and Capital One with a primary focus on developing behavioral models and intervention strategies to steer those behaviors.

So... how important would you say Natural Language Processing (NLP) is to the future of AI?

Haha, very important. It’s one of the core areas for extracting information out of unstructured data sources. Within the realm of text and speech, rule-based approaches can only get you so far. Anyone who wants to go deeper than keyword analysis will need to develop Natural Language Processing skills.

Now, I don’t want to be dismissive of simple hacks. You can learn a hell of a lot with some well-built regex parsers, but any sort of ML-based system that is attempting to mimic (or at the very least learn from) human behavior is going to require NLP.

At a minimum, knowledge of TF-IDF, n-grams, and 'classical' NLP techniques in the domain of topic modeling are critical. But to really push the boundaries, you need to learn how to leverage Deep Learning as a means of capturing the rich amount of information contained within the text.
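For readers unfamiliar with the 'classical' techniques Williams mentions, here is a minimal sketch of TF-IDF weighting over word n-grams, using only the Python standard library. The function names (`ngrams`, `tfidf`), the tokenizer, and the sample feedback strings are illustrative assumptions and do not come from Uber. The weighting used is the plain textbook form (term frequency times `log(N / df)`); production libraries typically add smoothing and normalization.

```python
import math
import re
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercase the text, split it into word tokens, and return all
    word n-grams of length 1..n_max (unigrams first, then bigrams, ...)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, n_max=2):
    """Return one {ngram: weight} dict per document.

    tf  = count of the n-gram / total n-grams in that document
    idf = log(number of documents / number of documents containing it)
    """
    grams_per_doc = [ngrams(d, n_max) for d in docs]

    # Document frequency: in how many documents each n-gram appears.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    n_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams) or 1  # avoid dividing by zero on empty text
        weights.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return weights


# Hypothetical feedback snippets, made up for illustration.
feedback = [
    "driver was late and the app crashed",
    "the app crashed during payment",
    "driver was friendly and on time",
]
weights = tfidf(feedback)
```

An n-gram that appears in every document gets weight zero, while one unique to a single document (such as "late" above) is weighted more heavily than one shared across several (such as "driver").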

And is that what you guys are doing with NLP at Uber?

Not surprisingly, we’re using it to improve our communications with our riders and drivers. For example, if you run a business wherein you collect unstructured feedback, you NEED to have ways of summarizing what your users are saying. For Uber, that means being able to detect trends in our support tickets, flag issues in rider-driver reviews, or summarize findings whenever we do in-app or one-off surveys. It’s of utmost necessity to hear what they’re telling us and NLP is the only way to do it at scale.

A couple other examples include looking at whether the language we use in email communications resonates with users and helps them understand or confuses them. NLP is also helpful in growing our support channels. We have teams of specialists helping drivers onboard over SMS and even more supporting them through our customer support system when an issue is reported through our in-app or web-based help center. NLP helps us scale both of these channels by letting us know what the common issues are, so we can develop tools that help our support teams provide better customer service for our drivers by suggesting appropriate resolutions and alternatives.

What do you think are the most important things organizations need to remember to best position themselves to adopt machine learning? And what do you think the biggest myth around AI and machine learning being propagated round your industry is?

In my mind, the answer to both is: use-cases. Before you even start building any model, you HAVE to define upfront how it’s going to be used and why that’s going to make the experience better for a user. Whether my team is building an NLP-based model for customer support, an anomaly detection system for fraud, or a computer-vision model for a research project, we sit right beside product/engineering to crystallize how the model’s output will be consumed.

I cannot tell you how many times companies and teams make this mistake. The most common example you will see in the industry is customer churn. Everyone and their cousin wants to build a churn model, but I’m willing to bet that a large portion of them fall flat not because of poor ML work, but because the company/team wasn’t clear on what to do with the predictions until after the model was built.

ML/AI isn’t going to solve all of your problems. Heck, Monica Rogati and I each wrote pieces (hers and mine) around why you shouldn’t go chasing ML just for the sake of doing so. Make sure it’s actually what your business needs AND define beforehand how it will help.

But do you think the evolution of machine learning will end up being hampered by the lack of the talent currently in the field?

No, there’s simply a supply/demand imbalance right now and it’ll get better over time. What you’re seeing is a market correction, where tons of talent from other fields are crossing over to ML, both because it’s valuable to companies and because there are a lot of ways to get into it.

Let me give you an example of why I love the current environment. Here is a sample of the fields that my team got their degrees in:

  • Transportation Engineering
  • Industrial Engineering
  • Thermal Science
  • Quantum Physics
  • Computer Science
  • Statistics
  • Theoretical Chemistry

Before ML, there’s no way all of us would have ended up in the same room, let alone be collaborating side-by-side to solve some of the coolest, most challenging problems at Uber. Yes, it’s tough finding great talent. But that difficulty pushes folks like myself to find brilliance in a wider spectrum of places.

How else has the industry’s attitude towards machine learning and other AI changed since you first entered the field?

Honestly, I don’t know about 'the industry', but I can speak for the companies I’ve worked for. In general, I’ve seen ML become a more formalized staple of business strategy as opposed to some lofty theoretical concept. It is no longer a pipe dream to use an ML-based strategy to generate efficiency. Those who already used ML are opening their minds to more complex and powerful techniques and are willing to push the boundaries now in terms of research. It’s exciting.

As a regular practitioner, it’s greatly helped elevate the conversation. I spend a lot less time advocating for whether to use ML and much more time brainstorming how to use ML.

That is exciting. Lastly, what will you be discussing in your presentation?

I will be presenting my team’s work on COTA. COTA is a collection of NLP models that make providing customer support easier for our agents and quicker for our users. It’s an awesome partnership, wherein we’re able to use the text of support issues to match them with common issues and responses, so as to expedite resolutions and increase customer satisfaction.
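To give a rough feel for what "matching the text of a support issue with common issues" can mean, here is a deliberately simple sketch that ranks candidate issue types by bag-of-words cosine similarity. This is not COTA and not Uber's method; the interview does not describe COTA's internals. The issue labels, their descriptions, and the function names (`bag_of_words`, `cosine`, `rank_issues`) are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens in the text."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical issue types with made-up descriptions, for illustration only.
COMMON_ISSUES = {
    "fare_dispute": "i was charged the wrong fare for my trip",
    "lost_item": "i left an item in the car and need it back",
    "account_access": "i cannot log in to my account",
}


def rank_issues(ticket_text, issues=COMMON_ISSUES):
    """Return (score, label) pairs sorted from best to worst match."""
    ticket = bag_of_words(ticket_text)
    scored = [
        (cosine(ticket, bag_of_words(description)), label)
        for label, description in issues.items()
    ]
    return sorted(scored, reverse=True)


ranked = rank_issues("I think I left my phone in the car last night")
best_score, best_label = ranked[0]
```

In this toy setup the ticket about a phone left in the car ranks the hypothetical `lost_item` type first. A support agent could then be shown the top few matches along with suggested responses, which is the kind of workflow the interview describes at a high level.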
