Evaluating the Social Media Users’ Mental Health Status During COVID-19 Pandemic Using Deep Learning

Fernández-Barrera, I.; Bravo-Bustos, S.; Vidal, M.

doi:10.1007/978-3-031-59216-4_7

Part of the book series: IFMBE Proceedings (IFMBE, volume 108)

Included in the following conference series:

International Conference on Biomedical and Health Informatics

Abstract

Depression is a globally recognized disease with a great impact on the suicide rate. However, it can be diagnosed early by observing patients' behavior over time. In this research, we studied the linguistic and visual features of depressive mood in the pre- and post-pandemic periods of COVID-19 based on Flickr posts. We applied recent advances in text-based sentiment analysis and image classification, using Natural Language Processing (NLP), histograms and deep learning strategies, to characterize some of the main patterns of depression. We show that the pandemic had a relevant impact on users' behavior in social media, since the main patterns change drastically between periods. For images, we found that pre-pandemic posts were more uniform in color distribution, with medium to low levels of light intensity, and that their scenes more often showed outdoor activities. For text, we found that the topics and general sentiment were always depressive and negative in connotation; however, pre-pandemic posts described attributes of the symptomatology of depression, while post-pandemic posts are more related to the effects of isolation and fear.

1 Introduction

Multiple factors during the COVID-19 pandemic, such as the associated lockdowns and physical distancing, have affected people's health, and mental health consequences are now a worldwide problem [1]. Artificial Intelligence applications for mental health have been used to predict, detect and prevent conditions before they reach clinical-level symptoms [2, 3]. However, unlike other fields of medicine, mental health presents several challenges: there are very few validated biomarkers, which leads to a heavy reliance on patient-physician questionnaires [4]. According to the World Health Organization, in 2019, 1 in 8 people, or 970 million people worldwide, were living with a mental illness, with anxiety and depressive disorders being the most common. In 2020, this number increased significantly due to the COVID-19 pandemic. Initial estimates show an increase of 26% and 28% for anxiety and severe depressive disorders, respectively, in just one year.

At the same time, social network platforms are now widely used to express emotions, feelings, and thoughts, making them a valuable source of data for mental health research [5]. This online behavior and activity has inspired many researchers to build detection systems for health care using NLP techniques and text classification [6]. In addition, deep neural networks have improved on classical machine learning approaches for large healthcare datasets; some of the most successful models are in the field of computer vision, focusing on image and video analysis for classification, detection, and segmentation tasks [7].

In this work, we analyze users' behavior on the Flickr platform by collecting images and posts. We applied NLP (polarity and sentiment analysis) and deep learning methods to understand the visual and linguistic features associated with depression patterns. Both linguistically and visually, we found that the attributes describing the pre-pandemic and post-pandemic periods differ from each other, but each period is associated with patterns that suggest a depressive mental health status.

2 Methods

The proposed architecture is composed of two main modules. The first module handles data collection and dataset construction from Flickr users' posts, selected by characteristics closely related to depressive moods. For this purpose, filters based on the polarity and sentiment lexicon of the posts were applied. The second module deals with the extraction of relevant features and the cross-analysis of textual and visual features to find hidden patterns.

2.1 Dataset Pre-processing

To build the dataset, we first collected posts and images from Flickr using the Flickr API. The search retrieved 1,000 images per page, ordered by relevance and restricted to posts containing the “depression” tag. The features extracted were the image, title, description and associated tags of each post. The chosen population comprises the entire universe of Flickr users and is not geographically delimited. The dates considered were January 2018 to December 2019 (pre-pandemic) and January 2020 to December 2021 (post-pandemic), since the World Health Organization (WHO) declared the outbreak a public health emergency of international concern on January 30, 2020 and a pandemic on March 11, 2020 [8]. The preliminary dataset contains 14,523 images.
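The split into periods described above can be sketched as a simple date lookup. This is a minimal illustration, not the authors' code: the text does not say whether the upload date or the taken date was used, and the names `PERIODS` and `period_of` are hypothetical.

```python
from datetime import date
from typing import Optional

# Study windows as stated in the text (inclusive bounds).
PERIODS = {
    "pre-pandemic": (date(2018, 1, 1), date(2019, 12, 31)),
    "post-pandemic": (date(2020, 1, 1), date(2021, 12, 31)),
}


def period_of(post_date: date) -> Optional[str]:
    """Return the study period a post date falls in, or None if outside both."""
    for name, (start, end) in PERIODS.items():
        if start <= post_date <= end:
            return name
    return None
```

For example, `period_of(date(2020, 3, 11))` returns `"post-pandemic"`, while a post dated before January 2018 returns `None` and would be excluded.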

First, polarity was used as an initial filter for the post tags and description, using VADER (Valence Aware Dictionary and sEntiment Reasoner), a lexicon and simple rule-based model for sentiment analysis that is available in the NLTK library and specifically attuned to sentiments expressed in social media [9]. Then, we applied the NRC Emotion Lexicon (EmoLex) to both tags and description. This lexicon associates English words with eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive), and was applied using the NLTK library's WordNet synonym sets [10].
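A polarity filter of this kind might look like the sketch below. It assumes NLTK is installed and the `vader_lexicon` resource has been downloaded once; using the `compound` score as the polarity value is an assumption, since the text does not say which VADER output was used. The helper names are hypothetical.

```python
from functools import lru_cache
from typing import Callable, Iterable, List


@lru_cache(maxsize=1)
def _vader():
    # Imported lazily so this module loads even without NLTK installed.
    # Requires a one-time nltk.download("vader_lexicon").
    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer()


def vader_compound(text: str) -> float:
    """Compound polarity score in [-1, 1] (assumed to be the 'polarity')."""
    return _vader().polarity_scores(text)["compound"]


def keep_negative(texts: Iterable[str],
                  score: Callable[[str], float] = vader_compound,
                  threshold: float = 0.0) -> List[str]:
    """Keep only the texts whose polarity is strictly below the threshold."""
    return [t for t in texts if score(t) < threshold]
```

Passing the scoring function as a parameter keeps the filter independent of the lexicon, so the same function can be reused with another sentiment scorer.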

Tags filter: For the tags filter, we kept the tags with a polarity below 0 to ensure that they relate to negative sentiment. We then analyzed tag frequency and, based on an exploratory analysis, kept only tags with a frequency greater than 100. Among these, we chose only the tags with a polarity below \(-0.5\) and a sentiment lexicon score above 0.2 for negative emotions. This ensures that the selected tags are the most used ones with the strongest negative scores within this group.
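The thresholds in this paragraph can be sketched as a two-stage selection. The sketch assumes that per-tag polarity and negative-emotion scores have already been computed and are passed in as dictionaries, and that a tag is counted once per post; `select_tags` and its parameter names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def select_tags(posts_tags: Iterable[Iterable[str]],
                polarity: Dict[str, float],
                negative_score: Dict[str, float],
                min_freq: int = 100,
                max_polarity: float = -0.5,
                min_negative: float = 0.2) -> List[str]:
    """Select frequent, strongly negative tags using the stated thresholds."""
    # Stage 1: count only tags with negative polarity, once per post.
    counts = Counter(
        tag
        for tags in posts_tags
        for tag in set(tags)
        if polarity.get(tag, 0.0) < 0
    )
    # Stage 2: frequency > 100, polarity < -0.5, negative-emotion score > 0.2.
    return sorted(
        tag
        for tag, n in counts.items()
        if n > min_freq
        and polarity[tag] < max_polarity
        and negative_score.get(tag, 0.0) > min_negative
    )
```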

To the resulting list we added the tags “mentalhealth” and “mentalillness”, which, according to previous studies, are semantically related to pathologies such as depression when they appear together [11]. Once the tags have been chosen, each post is checked against this list, so that every retained post has the tag “depression” (with which the data was acquired) and at least one of the tags chosen for their closeness to the pathology.

Description filter: After the tags filter, we selected only the publications that provide a description and filtered them using the emotional components within the text. First, we removed stopwords, symbols and other non-relevant characters within the post description that could interfere with our analysis. We also used lemmatization or stemming to group similar words together or convert words to their root form.

After this, we kept only the data with a polarity value below zero and with “sadness” and “disgust” emotion scores higher than those of the other emotions. These emotions were chosen according to the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), which establishes that depressive moods are associated with the predominance of sadness and disgust [12]. The final dataset contains 580 images.
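One way to read this rule is sketched below. It assumes that both “sadness” and “disgust” must score above each of the six remaining EmoLex emotions, which is an interpretation of the text rather than a confirmed detail; emotion scores are assumed to be precomputed per description, and `is_depressive` is a hypothetical name.

```python
from typing import Dict

EMOTIONS = ("anger", "fear", "anticipation", "trust",
            "surprise", "sadness", "joy", "disgust")
TARGET = ("sadness", "disgust")


def is_depressive(polarity: float, emotion_scores: Dict[str, float]) -> bool:
    """True if polarity is negative and sadness and disgust both dominate."""
    # Highest score among the six non-target emotions (missing ones count as 0).
    others = max(emotion_scores.get(e, 0.0) for e in EMOTIONS if e not in TARGET)
    # Lowest of the two target emotions must still exceed every other emotion.
    weakest_target = min(emotion_scores.get(e, 0.0) for e in TARGET)
    return polarity < 0 and weakest_target > others
```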

2.2 Posts Features

In order to characterize the study group, linguistic and visual analyses were used. The first approach was to analyze the captions (description and tags) to explore the users' mental status; the second was to analyze the images in the data collection to understand their composition at high and low levels.

2.2.1 Linguistics Features

For text analysis, we calculated the frequency of each tag to obtain the most used tags in each period (pre- and post-pandemic). For the image descriptions, we extracted tokens to examine their frequencies and associated topics, using word clouds and NLP strategies from the NLTK library.
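N-gram frequencies of the kind described here can be computed with the standard library alone, as in the following sketch. The regular-expression tokenizer and the optional stopword set are simplifying assumptions standing in for the NLTK pre-processing, and `ngram_counts` is a hypothetical name.

```python
import re
from collections import Counter
from typing import FrozenSet, Iterable


def ngram_counts(texts: Iterable[str], n: int = 2,
                 stopwords: FrozenSet[str] = frozenset()) -> Counter:
    """Count n-grams (as tuples of tokens) over a collection of descriptions."""
    counts: Counter = Counter()
    for text in texts:
        # Lowercase, keep alphabetic tokens, and drop stopwords.
        tokens = [t for t in re.findall(r"[a-z']+", text.lower())
                  if t not in stopwords]
        # Zipping n shifted copies of the token list yields consecutive n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts
```

Calling `ngram_counts(descriptions, n=2).most_common(10)` then gives the ten most frequent bigrams for a period, which is the kind of ranking a word cloud visualizes.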

2.2.2 Visual Features

For image analysis, we pre-processed the images by reading each one in grayscale and computing a grayscale histogram. We then converted the original images to RGB to perform an RGB histogram analysis. To understand the environment or scene and the attributes in the images, we implemented a deep learning Convolutional Neural Network (CNN) using PyTorch [13] with the Places365-CNN framework [14] and the WideResNet18 architecture [15].
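The two histograms can be illustrated with a dependency-free sketch over 8-bit RGB pixels. The grayscale conversion uses the common ITU-R BT.601 luminance weights, which is an assumption, since the text does not state how the grayscale images were produced; in practice an image library would supply the pixels, and `histograms` is a hypothetical name.

```python
from typing import Dict, Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B) values in 0..255


def histograms(pixels: Iterable[Pixel]) -> Tuple[Dict[str, List[int]], List[int]]:
    """Return per-channel RGB histograms and a grayscale histogram (256 bins)."""
    rgb = {"R": [0] * 256, "G": [0] * 256, "B": [0] * 256}
    gray = [0] * 256
    for r, g, b in pixels:
        rgb["R"][r] += 1
        rgb["G"][g] += 1
        rgb["B"][b] += 1
        # Assumed luminance weights (ITU-R BT.601) for the grayscale level.
        level = int(round(0.299 * r + 0.587 * g + 0.114 * b))
        gray[min(level, 255)] += 1
    return rgb, gray
```

Summing `gray[:85]` against `gray[170:]`, for instance, gives a rough comparison of how many pixels sit at low versus high light intensity.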

3 Results

3.1 Linguistic Features

By using deep learning models and NLP strategies we identified the most relevant patterns and features associated with depression tags from different users from Flickr.

3.1.1 N-Gram Analysis

After pre-processing and cleaning the text, we obtained the word cloud representation of the post description N-grams for both groups (pre- and post-pandemic) shown in Fig. 1A and B, in which the size of each word indicates its frequency in the dataset.

The topic N-grams presented in Fig. 1C and D show, for the pre-pandemic group, evidence of seasonal sadness or depression (“depressing”, “winter”) and of sadness and distress (“gloomy”, “mental”, “affliction”). In contrast, the N-grams of the post-pandemic group mainly contained words describing topics related to the coronavirus pandemic (“isolation”, “year”, “impact”).

3.1.2 Tags Analysis

To analyze tags, we used VADER to keep tags with a polarity below zero, retaining only those with a frequency of at least 80 for better graphic representation. In Fig. 1E, the pre-pandemic tags show attributes describing symptomatological characteristics of depression, such as “sad”, “melancholic”, “gloomy” and “vulnerable”. For the post-pandemic period, on the other hand, we found attributes associated with a general feeling of fear and restlessness (“scared”, “frightening”, “apocalyptic”) and with a feeling of loneliness (“loneliness”, “isolation”, “sadness”).

3.2 Visual Features

We observed that visual features are associated with the mental health status reflected in the users' tags. We analyzed brightness, color distribution, light intensity, environment and scenes. Figure 2A and B compare the RGB and grayscale histograms by period. In the pre-pandemic period, pixel counts tend to accumulate around medium and high levels of light intensity, while in the post-pandemic period they concentrate mostly at low levels. The color distribution also follows different patterns: it is uniformly distributed among the three channels (RGB) in the pre-pandemic period, while a non-uniform distribution is observed in the post-pandemic period.

In addition, we classified the environment and scenes by period using the CNN under two classes: “indoor” (pre-pandemic: 234 images, post-pandemic: 338 images) and “outdoor” (pre-pandemic: 322 images, post-pandemic: 210 images). Therefore, indoor images increased by 44.4% and outdoor images decreased by 34.8%. We also obtained 35 main image attributes across both periods. For the pre-pandemic period, these were mostly related to outdoor activities and scenes, such as “driving”, “transporting”, “sunny” or “trees”, while in the post-pandemic period indoor activities and scene attributes were more frequent, such as “indoor lighting”, “enclosed area”, “working”, “stressful” or “far-away horizon”. The attributes and a sample of the pre- and post-pandemic images used in the CNN can be seen in Fig. 2C and Fig. 3.
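The reported percentages follow from the image counts as relative changes from the pre-pandemic to the post-pandemic period. The short check below reproduces them; the sign of each result gives the direction of the change, and the variable names are illustrative.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (pre-pandemic, post-pandemic) image counts reported in the text.
counts = {"indoor": (234, 338), "outdoor": (322, 210)}

changes = {scene: round(percent_change(pre, post), 1)
           for scene, (pre, post) in counts.items()}
# Period totals: 234 + 322 and 338 + 210 images.
totals = tuple(sum(pair[i] for pair in counts.values()) for i in (0, 1))

print(changes)  # {'indoor': 44.4, 'outdoor': -34.8}
print(totals)   # (556, 548)
```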

4 Discussion

After extracting publications (text and image), we applied the pre-processing steps and filters (VADER and EmoLex) and obtained a dataset characterized mainly by negative polarity and depression-related emotions. We observed almost no variation in the number of publications between periods (556 vs. 548 publications). This suggests that the number of users presenting symptoms of the pathology did not increase during the pandemic; however, as the frequency distributions of N-grams and tags suggest, the topics changed and became more related to the pandemic situation in that period.

For instance, N-gram topics changed from “gloomy”, “depressing” or “winter” to “work”, “isolation”, “year” and “impact”. This may be due to confinement and remote working, since people were unable to attend their workplaces or travel. This leads us to infer that there was a significant shift in the general mood due to the global pandemic, since a majority of the population was confined and forced to stay home, often in isolation from their loved ones, which can easily generate feelings of panic and anxiety [16, 17].

For the image analysis, we performed histogram and scene analyses. The histogram results indicate that users posted images with a balanced distribution of the three channels in the pre-pandemic period, while post-pandemic posts contain pictures dominated by blue and green, i.e., images with a more disturbed color balance.

In addition, the image attributes suggest a relevant change in the content of the photos, which are more outdoor-like in the pre-pandemic period and more indoor-like in the post-pandemic period. The classification of image environments shows the same result, which makes sense given the extensive quarantines. The visual feature analysis leads us to infer that, while the patterns differ between periods, both are characteristic of depressive mood, since both present distributions of color and light intensity with most pixels at low levels, which according to several studies is directly associated with depression [18, 19].

5 Conclusion

We observed a relationship between depression symptoms and the coronavirus pandemic, as shown in the N-gram and tag analyses, where the most frequent tags changed from “sad” and “depression” (pre-pandemic) to “tired” and “isolation” (post-pandemic), with an overall increased feeling of “fear” in the publications of the post-pandemic group.

These analyses are clear indicators that the impact of the situation was large enough to change users' behavior in social media, reflecting feelings, moods and even symptoms consistent with depression but originating from the pandemic, in contrast to the users of the pre-pandemic period, who reflect mostly symptoms of common depressive mood.

Finally, we observed behavioral differences between the pre- and post-pandemic periods: users associated with mental illness posted darker images and more indoor scenes during the COVID-19 pandemic. Therefore, analyzing images and posted information from social media with NLP and deep learning strategies can be useful for improving the classification of users into depressive and non-depressive groups and for supporting an early diagnosis.

References

1. Moreno, C., Wykes, T., Galderisi, S., et al.: How mental health care should change as a consequence of the COVID-19 pandemic. Lancet Psychiatry 7, 813–824 (2020)
2. Graham, S., Depp, C., Lee, E., et al.: Artificial intelligence for mental health and mental illnesses: an overview. Curr. Psychiatry Rep. 21, 1–18 (2019)
3. Lee, E.E., Torous, J., De Choudhury, M., et al.: Artificial intelligence for mental health care: clinical applications, barriers, facilitators, and artificial wisdom. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 6, 856–864 (2021)
4. Gupta, V.K., Singh, A.P.: Mental Health Questionnaire (MHQ) for managers: development and standardisation. J. Health Manag. 24, 478–487 (2022)
5. Chancellor, S., De Choudhury, M.: Methods in predictive techniques for mental health status on social media: a critical review. NPJ Digit. Med. 3, 1–11 (2020)
6. Friedman, C., Elhadad, N.: Natural language processing in health care and biomedicine. Biomed. Inform. 255–284 (2014)
7. Esteva, A., Robicquet, A., Ramsundar, B., et al.: A guide to deep learning in healthcare. Nat. Med. 25, 24–29 (2019)
8. World Health Organization: Novel Coronavirus (2019-nCoV): situation report (2020)
9. Hutto, C., Gilbert, E.: VADER: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 8, pp. 216–225 (2014)
10. Mohammad, S.M., Turney, P.D.: NRC emotion lexicon. Natl. Res. Council Canada 2, 234 (2013)
11. Xu, Z., Pérez-Rosas, V., Mihalcea, R.: Inferring social media users’ mental health status from multimodal information. In: Proceedings of the 12th Language Resources and Evaluation Conference, pp. 6292–6299 (2020)
12. Ríssola, E.A., Bahrainian, S.A., Crestani, F.: A dataset for research on depression in social media. In: Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization, pp. 338–342 (2020)
13. Paszke, A., Gross, S., Massa, F., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
14. Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40, 1452–1464 (2017)
15. Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016)
16. Wang, J., Lloyd-Evans, B., Giacco, D., et al.: Social isolation in mental health: a conceptual and methodological review. Soc. Psychiatry Psychiatric Epidemiol. 52, 1451–1461 (2017)
17. Hwang, T.-J., Rabheru, K., Peisah, C., Reichman, W., Ikeda, M.: Loneliness and social isolation during the COVID-19 pandemic. Int. Psychogeriatrics 32, 1217–1220 (2020)
18. Reece, A.G., Danforth, C.M.: Instagram photos reveal predictive markers of depression. EPJ Data Sci. 6, 15 (2017)
19. Wang, Y., Li, B.: Sentiment analysis for social media images. In: 2015 IEEE International Conference on Data Mining Workshop (ICDMW), pp. 1584–1591. IEEE (2015)

Author information

Authors and Affiliations

Facultad de Ingeniería y Tecnología, Universidad San Sebastián, Concepción, Chile
I. Fernández-Barrera, S. Bravo-Bustos & M. Vidal
Department of Computer Science, University of Concepción, Concepción, Chile
M. Vidal
Molecular and Translational Immunology Laboratory, Department of Clinical Biochemistry and Immunology, University of Concepcion, Concepción, Chile
M. Vidal

Corresponding author

Correspondence to M. Vidal.

Editor information

Editors and Affiliations

Electrical Engineering, Universidad de Concepcion, Concepcion, Chile
Esteban Pino
Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Ratko Magjarević
Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal
Paulo de Carvalho

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Cite this paper

Fernández-Barrera, I., Bravo-Bustos, S., Vidal, M. (2024). Evaluating the Social Media Users’ Mental Health Status During COVID-19 Pandemic Using Deep Learning. In: Pino, E., Magjarević, R., de Carvalho, P. (eds) International Conference on Biomedical and Health Informatics 2022. ICBHI 2022. IFMBE Proceedings, vol 108. Springer, Cham. https://doi.org/10.1007/978-3-031-59216-4_7

DOI: https://doi.org/10.1007/978-3-031-59216-4_7
Published: 30 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-59215-7
Online ISBN: 978-3-031-59216-4
eBook Packages: Engineering, Engineering (R0)
