Stop Trolling With Data

Companies are looking to data to get rid of online trolls


Trolling has become a huge issue over the past few years, and it has only increased in frequency over the past 18 months. One of the most notorious groups of trolls was especially active around Donald Trump’s election, with the Reddit community r/The_Donald celebrating his nomination as the Republican presidential candidate by posting ‘We have officially sh*tposted Donald J. Trump into the Republican Presidential Nomination. CONGRATS CENTIPEDES!!!’.

The trend has continued, with thousands of Twitter users posting vile comments about people across the world. John Nimmo, for example, was jailed for two years in February after sending antisemitic messages to a UK MP via Twitter, having previously been jailed for a similar offence in 2013. A defining tactic of hard-right activists in 2017 is the use of vile language online, gaining more momentum through the condemnation it provokes than they could through positive messaging.

One of the big issues companies face when trying to deal with this trend is that trolls can simply create new accounts after their original accounts are shut down, then repeat the same behavior over and over again. This is particularly galling for companies looking to build engagement with their customers. ‘Online Harassment, Digital Abuse, and Cyberstalking in America’, a study by Data & Society, found that over 25% of Americans had at some point refrained from posting online for fear of harassment or trolling. This is clearly not a small matter of some people being offended; it is changing the way companies interact with their audiences.

The problem is now so severe that some companies limit the number of articles open to comments because of the difficulty of moderating them across an entire site. The Times, for instance, one of the world’s leading media outlets, can moderate only 10% of the comments submitted to its site, yet cannot leave comments unmoderated because the problem has become so bad.

However, Jigsaw, an Alphabet-owned think tank, has approached this challenge head on and has been working on ways to tackle it through data and machine learning. Rather than taking the traditional approach of looking for keywords and phrases, which is simple to bypass, Jigsaw is instead using deep learning to recognize traits like aggression and irrelevancy. The tool they’ve created, Perspective, ‘scores comments based on the perceived impact a comment might have on a conversation, which publishers can use to give real-time feedback to commenters, help moderators sort comments more effectively, or allow readers to more easily find relevant information’. It is part of their Conversation AI project, which takes a wider view of the abusive and poisonous language used online.
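To make this concrete, here is a minimal sketch of what asking Perspective to score a comment looks like. The endpoint and field names follow Jigsaw’s publicly documented v1alpha1 interface, but treat the details as illustrative; `API_KEY` is a placeholder, and only the request body is built here, not the network call itself.

```python
import json

# Hypothetical sketch of a Perspective API request. The endpoint reflects
# the publicly documented v1alpha1 interface; API_KEY is a placeholder.
ANALYZE_URL = (
    "https://commentanalyzer.googleapis.com/v1alpha1/"
    "comments:analyze?key=API_KEY"
)

def build_analyze_request(text: str) -> dict:
    """Build the JSON body asking Perspective to score a comment's toxicity."""
    return {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }

payload = build_analyze_request("You are a wonderful person.")
print(json.dumps(payload, indent=2))
```

The response contains a toxicity score between 0 and 1, which a publisher can use to sort a moderation queue or warn a commenter before they post.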

At present Jigsaw is still in the relatively early stages of the project, given the hugely complex subject and the potential to get it wrong. For instance, if I were commenting on a site about dogs, the word ‘bitch’ would carry a completely different meaning than it would in a comment about a female celebrity. Traditional automated monitoring loses this context entirely, and somebody who frequently comments on a site about dogs could easily find themselves blacklisted.
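The weakness of the traditional approach is easy to demonstrate. This toy filter, with an illustrative one-word blocklist, flags any comment containing a listed word regardless of context, so the innocent dog breeder and the abusive troll are treated identically:

```python
# A minimal sketch of why naive keyword filtering fails: the same word
# is benign in one context and abusive in another. The blocklist and
# example comments are illustrative only.
BLOCKLIST = {"bitch"}

def naive_flag(comment: str) -> bool:
    """Flag any comment containing a blocklisted word, ignoring context."""
    words = {w.strip(".,!?").lower() for w in comment.split()}
    return not BLOCKLIST.isdisjoint(words)

# Both comments are flagged, although only the second is abusive.
print(naive_flag("My bitch just had six healthy puppies!"))  # True
print(naive_flag("That actress is such a bitch."))           # True
```

A deep learning model trained on moderator-labelled comments can instead learn the surrounding context that distinguishes the two, which is exactly the gap Perspective is trying to close.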

To gain as much knowledge as possible, Jigsaw is partnering with a number of organizations, including The New York Times, Wikipedia, and Disqus, giving its algorithms a huge amount of information from which to learn what is genuinely offensive and what isn’t. Initially this works through a tagging system within each platform’s CMS: moderators tag comments as ‘toxic’, which feeds into the algorithm and helps it automate these decisions in future.

At present the system is in its formative stages, building an effective database and learning from moderators’ actions, but given the progress Jigsaw has made and the acceleration in these technologies, we may well soon see data forcing the trolls back under their bridges.
