Ahead of her presentation at the Machine Learning Innovation Summit in New York on December 11 & 12, we spoke to Alexandra Abate, data scientist at Dia&Co.
Alexandra Abate is a data scientist at Dia&Co, a leading plus-size fashion styling service, where she harnesses the power of data to help stylists personalize customers' subscription boxes.
Before making her transition to the private sector, Alexandra was an astrophysicist. Her research involved designing analyses to measure the effects of dark energy on the expansion of the universe, using data science methods to extract signal from large, noisy data sets.
Why do you think we have seen machine learning use increase so dramatically in the past 3 years?
I think it’s due to a number of interrelated reasons. The most mundane is that the average laptop can easily handle large data sets and crank out results using easily implemented open source machine learning algorithms. Couple this with the endless blogs and free online courses that have cropped up in recent years to teach you how, and you have fairly democratic access to learning the right skills.
But why are people interested in doing this anyway? Machine Learning and AI are frequently written about in the media as being the next societal revolution, “the new electricity”, because they have the power to fundamentally change how we live. That’s a pretty exciting thing to be a part of, and a huge draw for people looking to work on exciting problems, apparently so much so that an AI researcher is the face of a global fragrance ad campaign!
How do you think organizations could be utilizing machine learning better?
I think it comes down to a very unflashy and uncool answer: improve the underlying infrastructure and quality of data collected, and focus on these pieces first before trying to build fancy models. Data scientists and machine learning engineers can build better machine learning models more quickly if the data available is pertinent, robust, and simple to load into the model, and if, when required, the models can easily be run at scale.
What are the biggest challenges currently facing the further spread of machine learning?
Recruiting, probably. Machine learning is still a fairly new area, so there is a very small pool of candidates with extensive real-world machine learning application experience. The harder problem, though, is that companies often don’t know exactly what they are looking for in a machine learning hire, because they too are unfamiliar with their own machine learning application and are unsure which technical and professional skills are most important to prioritize when screening candidates. This makes recruitment a slow and painful process on both sides of the equation.
Do you think that machine learning regulation is currently fit for purpose?
I would say absolutely not. It is rare to hear a discussion of ethics alongside talk of the potential of AI and machine learning. Machine learning at a company like Dia&Co is low risk in this sense: a biased algorithm could send a customer clothes she doesn’t like, causing her to waste a relatively small amount of money. However, an algorithm that uses ancillary data to produce a new kind of credit score, or to predict whether someone will be a good employee, can have devastating consequences for an individual. Bias is especially easy to introduce into these algorithms because the model cannot learn from its false negatives: no one employs the individual the algorithm erroneously rejected, so no feedback is ever generated that could improve the model. In cases such as this, regulation should allow the individual to learn why they were rejected and give them the ability to appeal.

Facebook's political advertising is another particularly topical example, and regulating political ads is being mulled over by members of Congress. Potentially, though, regulation should go further: which ads are being shown to which segments of users on a platform should be publicly available information, to make marketing practices, often designed by algorithms, transparent.
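The false-negative feedback loop described above can be sketched with a toy simulation. Everything here is hypothetical and illustrative, not any real screening system: a model penalizes candidates in group "B", and because rejected candidates never produce outcome data, the qualified people it screens out remain invisible to any retraining.

```python
import random

random.seed(0)

# Toy candidate pool: each candidate has a true quality score and a
# group label; the hypothetical screening model penalizes group "B".
candidates = [
    {"quality": random.random(), "group": random.choice("AB")}
    for _ in range(1000)
]

def model_score(candidate):
    bias = -0.3 if candidate["group"] == "B" else 0.0
    return candidate["quality"] + bias

threshold = 0.5
hired = [c for c in candidates if model_score(c) >= threshold]
rejected = [c for c in candidates if model_score(c) < threshold]

# Feedback only ever comes from hired candidates: the model never
# observes outcomes for people it rejected, so a qualified candidate
# in group "B" who was screened out generates no training signal
# that could ever correct the bias.
false_negatives = [c for c in rejected if c["quality"] >= threshold]
print(f"hired: {len(hired)}, rejected: {len(rejected)}")
print(f"qualified but rejected (invisible to retraining): {len(false_negatives)}")
```

In this toy setup every qualified-but-rejected candidate belongs to group "B", and no amount of retraining on the hired population can reveal that, which is exactly why external regulation, explanation, and appeal mechanisms matter.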
What can the audience expect to take away from your presentation in New York?
Machine learning approaches should combine algorithms, which learn unseen patterns in huge volumes of data, with expert humans, who have domain expertise that is hard to quantify, and the two should be viewed as parts of the same ecosystem. Modeling the actions of the expert human can allow your application to more easily learn the edge cases, while at the same time helping the human be more effective in the average case.
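The algorithm-plus-expert ecosystem can be sketched minimally as follows. All names and numbers are illustrative assumptions, not Dia&Co's actual system: a simple weighted model ranks items, a human expert can override its top pick, and each override is folded back into the model's weights so the expert's hard-to-quantify knowledge is gradually captured.

```python
def algorithm_rank(items, weights):
    """Rank items by a simple weighted sum over their features."""
    return sorted(
        items,
        key=lambda it: sum(weights.get(f, 0.0) for f in it["features"]),
        reverse=True,
    )

def incorporate_override(weights, chosen, skipped, lr=0.1):
    """Nudge weights toward the features the expert chose over the model's pick."""
    for f in chosen["features"]:
        weights[f] = weights.get(f, 0.0) + lr
    for f in skipped["features"]:
        weights[f] = weights.get(f, 0.0) - lr
    return weights

items = [
    {"name": "wrap dress", "features": ["dress", "floral"]},
    {"name": "blazer", "features": ["workwear", "solid"]},
]
weights = {"dress": 0.5, "workwear": 0.2}

# The model prefers the dress; suppose the stylist (the domain expert)
# overrides and picks the blazer for this customer instead.
ranked = algorithm_rank(items, weights)
if ranked[0]["name"] != "blazer":
    weights = incorporate_override(weights, items[1], items[0])

# After enough such overrides, the model handles the average case while
# the expert's corrections teach it the edge cases.
print(algorithm_rank(items, weights)[0]["name"])  # -> blazer
```

The design choice worth noting is that the human is not a fallback bolted onto the model: their decisions are a first-class training signal, which is what makes the two parts of one ecosystem rather than competitors.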
You can hear Alexandra's presentation at the Machine Learning Innovation Summit in New York on December 11 & 12.