Machine Learning Can Be The Salvation Of Cyber Security

The industry is struggling, but machine learning could solve the problem


IBM’s artificial intelligence platform Watson has previously been applied to solve problems in all manner of fields, from finance to healthcare. It has seen significant results wherever it has focused its gaze, and this will likely be the same in its latest project, as the machine learning tool attempts to solve one of the most pressing threats to the world economy - cyber security.

Researchers at the tech giant have already started sourcing computer security data from its open access threat intelligence platform, X-Force Exchange, and uploading it to Watson. The system will analyze this data to identify security vulnerabilities, spam messages, and malware, among other threats, using machine learning algorithms to effectively remember what they look like so that it can identify them again without the need for humans. The tool is meant to analyze 80% of all data on the internet, much of which traditional security tools are unable to process, including blogs, articles, and videos that discuss new malware.

IBM’s move reflects a general trend in cybersecurity towards utilizing machine learning techniques to detect and neutralize threats. The threat is vast and evolving. According to an ITRC Data Breach Report, over 169 million personal records were exposed in 2015, stemming from 781 publicized breaches across the financial, business, education, government and healthcare sectors. IBM/Ponemon found that the average global cost of each lost or stolen record containing confidential and sensitive data was $154. In healthcare it was $363.

There are two trends in cybersecurity that make machine learning the perfect weapon to deal with this threat. Firstly, a tremendous amount of useful data has already been collected and stored, and it continues to grow every day. Machine learning algorithms require huge datasets to learn from, and the fact that this data is already available makes the task far easier. It is also simply far more data than a human security analyst can process. Companies need to identify threats as quickly as possible, as hackers can get in, take what they need, and get out in minutes without anyone noticing. In its latest Data Breach Investigations Report, Verizon found that almost 80% of attackers took just days to infiltrate their targets, but only a third of companies managed to detect the attacks within the same time frame. The speed at which machine learning algorithms churn through the data should greatly narrow this gap.

The second trend is the dearth of security experts needed to defend organizations’ vital infrastructure and systems. A security team only needs to slip up once, leaving one door open, to allow a devastating attack. And the need is only growing. Marc van Zadelhoff, General Manager, IBM Security, noted that ‘Even if the industry was able to fill the estimated 1.5 million open cyber security jobs by 2020, we'd still have a skills crisis in security.’ Machine learning algorithms could help to fill this gap.

There are, however, a number of issues with relying on machine learning for cybersecurity. The primary cause of hacks is still human error, and it is difficult to see how machine learning could prevent it. Many issues also do not require analysis of petabytes of data. A Windows system, for example, offers only a limited number of ways for an attacker to gain access to user credentials, and skilled experts can express these as Indicators of Attack.

Ultimately, as long as it is humans committing cyberattacks, humans will be needed to protect networks to some degree and to make the critical decisions. However, machine learning is increasingly going to be used to do the bulk of the manual labor. What it ultimately does is scale the knowledge of skilled human analysts to large volumes of data and deal with levels of complexity beyond human capabilities. With the evolution of the Internet of Things and the explosion of connected devices we’re likely to see in the next few years, the scale of data and the complexity of analysis are only going to increase. Machine learning is the only tool currently at our disposal that can cope with this, and potentially quell the surge in attacks.
