How Text Analytics Can Help Solve Our Mental Health Problems

Social media posts can indicate depression, and we need to do more to analyze them


When humans started writing about 5,000 years ago, few could have imagined quite how banal it could get. From the President of the United States making up words, to Kim Kardashian West asking her followers whether peanuts have carbs in them, Twitter has, for better or worse, changed our idea of what it means to communicate using text. Yet even the most banal tweets are revelatory about the author in some way. The written word is how we communicate our ideas, experiences, and emotions to those around us, and every tweet is a fingerprint that tells us something about the author. Their syntax, the words they use, the punctuation… all combine in such a way that the reader can identify certain characteristics of their personality and emotional state.

Through careful analysis of social media output, we can build a rich picture of an individual’s motivations, fears, emotions, and the way that they connect with others and themselves. In a research paper released by Universiti Teknologi Malaysia, it was noted that a ‘study presented in Golbeck et al. (2011) showed that humans reveal their personality trait in online communication through self-description and online statistical updates on social networking sites through which the FFM (someone who displays personality traits such as openness to new experience, conscientiousness, extraversion, agreeableness, and neuroticism) can provide a well-rounded measure of the human–computer relationship. The study observed that the personality trait of users can be estimated (in social media) to a degree of ≅11% accuracy for each factor based on the mean square error of observed online statistics. This implies that personality trait prediction can be achieved within 1/10 of its actual value.’

This has profound implications across a variety of fields, and nowhere more so than in mental health. We are now able to apply machine learning techniques to unstructured data in ways previously unimaginable, opening up swathes of text for analysis that can help identify those who are most at risk of depression. Indeed, social media giant Facebook recently came in for criticism after leaked documents revealed that it had told advertisers it could identify in real time, through posts and photos, when teenagers feel ‘insecure’, ‘worthless’ and ‘need a confidence boost’, presumably so that advertisers could step in at that moment to sell them something. This is, of course, incredibly dangerous. Companies want you to feel insecure because it makes it easier to sell you their goods. When you are bombarded with ads telling you to try a new diet pill, your perception of yourself becomes negative, which makes you more prone to believe those selling the solution. Used the right way, however, the same capability could also do a great deal of good.

According to a recent study published in Translational Psychiatry, more than 36% of teenaged girls in America are depressed or have suffered a recent major depressive episode, as have 13.6% of boys. These numbers have risen dramatically in recent years, and while there are many factors that impact an individual’s mental health, by identifying the traits most commonly associated with depression and any indicators that they could be moving to that state of mind, it is possible to intervene with therapies, even small ones you can do at home, that can counter it in some way.

Happify, for example, is an app that offers mindfulness and meditation therapies that have been found to improve the happiness of people experiencing similar emotions. At the recent Big Data and Analytics in Healthcare Summit, Ran Zilca, Happify’s Chief Data Science Officer, discussed how data could be used to identify the causes of depression.

Zilca led with the argument that observed behavior - such as how many cigarettes a person smokes a day, whether they exercise, whether they take their medication, whether they show up to work, and so forth - is noisy and messy. On the other hand, there are mechanisms and processes inside an individual that are far more structured and conducive to intervention and changing behaviors. These are behaviors that are cognitive - feelings, thoughts, and emotions - and they reveal far more about behavior.

In order to better understand these characteristics, it is first and foremost important to use psychology to understand whether what you are trying to assess through your analytics efforts is psychologically valuable. Without this, Zilca says, ‘it’s like trying to fix a car without looking under the hood’. The starting point is a set of psychologically meaningful variables. These include psychological traits, such as personality and values; psychological states, such as affect and emotional state; the topics that occupy a person’s mind, such as hopes, fears, aspirations, and attitudes; and, finally, their intentions.

There are several different channels through which to collect this data so that it can be better understood. Self-reports are flawed: they are difficult to obtain and compliance is not always high, but they are useful for obtaining labeled data. Text, on the other hand, is a far richer source. What you write and say is who you are, and recent research has revealed a number of things about what people reveal about themselves through their writing, particularly around function words such as pronouns. Pronouns (such as I, you, they), articles (a, an, the), prepositions (to, of, for), and auxiliary verbs (is, am, have) have very little meaning on their own, and the English language has fewer than 500 function words. However, they account for more than half of the words we speak, hear, and read every day, and by analyzing their use, we begin to learn how speakers are connecting with their audiences, their friends, their conversational topics, and themselves. For example, poets who use ‘I’ a lot in their work have been found to be more likely to commit suicide.

Zilca also pointed to research that showed how low levels of conscientiousness - the tendency to act in an organised or helpful way - strongly correlated with depression. People with very low conscientiousness scores have a 75% chance of being diagnosed with depression. While this doesn’t necessarily show causation, it definitely shows that the two are tied. The same is true of extraversion. While these findings are not necessarily surprising, they validate much of the literature. He also noted research from M Choudhury, an academic at Georgia Tech, who used Twitter feeds to try to predict the onset of depression within the course of a year. The researchers looked at various indicators from the text in Twitter feeds, alongside other variables from social network analysis, to predict the onset of depression with 60-70% accuracy.
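
To make the function-word idea above concrete, here is a minimal Python sketch of how one might measure the share of pronouns, articles, prepositions, and auxiliary verbs in a piece of text. The word lists, the `function_word_rates` name, and the `first_person_singular` category are illustrative assumptions, not the lexicon or method used in the research Zilca cited, and a rate on its own says nothing clinical about the author.

```python
import re
from collections import Counter

# Tiny illustrative word lists (an assumption for this sketch, not a
# complete or validated function-word lexicon).
FUNCTION_WORDS = {
    "pronoun": {"i", "me", "my", "you", "your", "they", "them", "we", "our"},
    "article": {"a", "an", "the"},
    "preposition": {"to", "of", "for", "in", "on", "with"},
    "auxiliary": {"is", "am", "are", "was", "have", "has"},
    # Hypothetical extra category: first-person singular pronouns only.
    "first_person_singular": {"i", "me", "my"},
}


def function_word_rates(text):
    """Return, per category, the fraction of tokens that fall in that category."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    rates = {}
    for category, words in FUNCTION_WORDS.items():
        hits = sum(counts[word] for word in words)
        # Avoid dividing by zero on empty input.
        rates[category] = hits / total if total else 0.0
    return rates


if __name__ == "__main__":
    sample = "I am tired of the rain and I miss my friends"
    for category, rate in function_word_rates(sample).items():
        print(f"{category}: {rate:.2f}")
```

In practice, rates like these would be one set of features among many, computed over a large number of posts per person and compared against labeled data such as self-reports, rather than read off a single tweet.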

Organizations are, for obvious reasons, desperate to understand consumer behavior, and they will do so. There are, however, far nobler goals to be achieved by analyzing our social media feeds, and with mental health becoming an increasingly serious problem for society, there is a clear need for organizations to do more to use social media data analysis to pinpoint and resolve issues as quickly as possible.

