1 Introduction

Multiple factors during the COVID-19 pandemic, such as the associated lockdowns and physical distancing, have affected people’s health, and the mental health consequences are now a worldwide problem [1]. Artificial Intelligence applications for mental health have been used to predict, detect and prevent conditions before they reach clinical-level symptoms [2, 3]. However, unlike other fields of medicine, mental health presents several challenges: for example, there are very few validated biomarkers, which leads to a heavy reliance on patient-physician questionnaires [4]. According to the World Health Organization, in 2019, 1 in 8 people, or 970 million people worldwide, were living with a mental illness, with anxiety and depressive disorders being the most common. In 2020, this number increased significantly due to the COVID-19 pandemic. Initial estimates show an increase of 26% and 28% for anxiety and major depressive disorders, respectively, in just one year.

Meanwhile, users now turn to social network platforms to express their emotions, feelings, and thoughts, which makes these platforms a valuable source of data for mental health research [5]. This online behavior and activity has inspired many researchers to build detection systems for health care using NLP techniques and text classification [6]. In addition, deep neural networks have improved on classical machine learning approaches for large healthcare datasets; some of the most successful models come from the field of computer vision, focusing on image and video analysis for classification, detection, and segmentation tasks [7].

In this work, we analyze users’ behavior on the Flickr platform by collecting images and posts. We applied NLP (polarity and sentiment analysis) and deep learning methods to understand the visual and linguistic features associated with depression patterns. Both in the text and in the images, we found that the attributes describing the pre-pandemic and post-pandemic periods differ from each other, but each period is associated with patterns that suggest a depressive mental health status.

2 Methods

The proposed architecture is composed of two main modules. The first module covers data collection and dataset construction from Flickr users’ posts, selected by characteristics closely related to depressive moods. For this purpose, filters based on the polarity and the sentiment lexicon of the posts were applied. The second module deals with the extraction of relevant features and the cross-analysis of textual and visual features to find hidden patterns.

2.1 Dataset Pre-processing

To build the dataset, we first collected posts and images from Flickr using the Flickr API. The search considered 1,000 images per page, ordered by relevance and selected according to the “depression” tag, that is, only posts that contain it. The features extracted were the image, title, description and associated tags used in the same post. The chosen population is the entire universe of Flickr users and is not geographically delimited. The dates considered were January 2018 to December 2019 (pre-pandemic) and January 2020 to December 2021 (post-pandemic), since the World Health Organization (WHO) declared the outbreak a public health emergency of international concern on January 30, 2020 and a pandemic on March 11, 2020 [8]. The preliminary dataset contains 14,523 images.
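
The split into periods can be sketched as a simple date check. This is a minimal illustration, not the authors' code: the function name `assign_period` is hypothetical, and which Flickr date field (taken or uploaded) was used is not stated in the text, so the function simply takes a date.

```python
from datetime import date
from typing import Optional

# Period boundaries as stated in the text (both ends inclusive).
PRE_START, PRE_END = date(2018, 1, 1), date(2019, 12, 31)
POST_START, POST_END = date(2020, 1, 1), date(2021, 12, 31)


def assign_period(post_date: date) -> Optional[str]:
    """Return the study period for a post date, or None if outside both windows.

    `assign_period` is a hypothetical helper; the source does not say
    whether the taken date or the upload date was used.
    """
    if PRE_START <= post_date <= PRE_END:
        return "pre-pandemic"
    if POST_START <= post_date <= POST_END:
        return "post-pandemic"
    return None
```

Posts dated outside both windows return `None` and would be dropped before any further filtering.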

First, polarity was used as an initial filter for the post tags and description. We used VADER (Valence Aware Dictionary and sEntiment Reasoner), a lexicon and simple rule-based model for sentiment analysis that is available in the NLTK library and is specifically attuned to sentiments expressed in social media [9]. Then, we applied the NRC Emotion Lexicon (EmoLex) to both tags and description. This lexicon associates English words with eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive), using the NLTK library’s WordNet synonym sets [10].

  • Tags filter: We first kept the tags with a polarity below 0, to ensure negative sentiment-related tags. We then analyzed tag frequency and, based on an exploratory analysis, kept only tags with a frequency greater than 100. Among those, we chose only the tags with a polarity value below \(-0.5\) and a sentiment lexicon score above 0.2 for negative emotions. This ensures that the selected tags are the most used ones with the strongest negative scores within this group.

    To the resulting list we added the tags “mentalhealth” and “mentalillness”, which, according to previous studies, are semantically related to pathologies such as depression when they accompany them [11]. Once the tags have been chosen, each post is checked for any tag from this list. This ensures that every retained post has the tag “depression” (with which the data was acquired) and at least one of the tags chosen for their closeness to the pathology.

  • Description filter: After the tags filter, we selected only the publications that provide a description and filtered them using the emotional components of the text. First, we removed stopwords, symbols and other non-relevant characters in the post description that may interfere with our analysis. We also used lemmatization or stemming to convert words to their root form and group similar words together.

    After this, we kept only the data with a polarity value below zero and with scores for the “sadness” and “disgust” emotions higher than those of the other emotions. These two emotions were chosen according to the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), which establishes that depressive moods are associated with the predominance of sadness and disgust [12]. The final dataset contains 580 images.
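
The two filters above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: polarity values and emotion scores are passed in as precomputed numbers (in the paper they would come from VADER and EmoLex), all function names are hypothetical, and "sadness and disgust higher than the others" is read here as both of them exceeding every other emotion score.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

EXTRA_TAGS = {"mentalhealth", "mentalillness"}  # added by hand, as in the text
TARGET_EMOTIONS = ("sadness", "disgust")


def select_tags(
    posts_tags: Iterable[List[str]],
    polarity: Dict[str, float],
    negative_score: Dict[str, float],
    min_freq: int = 100,
    max_polarity: float = -0.5,
    min_negative: float = 0.2,
) -> Set[str]:
    """Keep frequent, strongly negative tags, then add the two extra tags."""
    freq = Counter(tag for tags in posts_tags for tag in tags)
    chosen = {
        tag
        for tag, count in freq.items()
        if count > min_freq
        and polarity.get(tag, 0.0) < max_polarity
        and negative_score.get(tag, 0.0) > min_negative
    }
    return chosen | EXTRA_TAGS


def post_passes_tag_filter(tags: Iterable[str], chosen: Set[str]) -> bool:
    """Keep a post that has 'depression' plus at least one other chosen tag."""
    tag_set = set(tags)
    return "depression" in tag_set and bool((tag_set - {"depression"}) & chosen)


def description_passes(polarity: float, emotions: Dict[str, float]) -> bool:
    """Keep negative descriptions where sadness and disgust dominate.

    `emotions` is assumed to hold only the eight emotion scores,
    not the positive/negative sentiment scores.
    """
    others = [v for k, v in emotions.items() if k not in TARGET_EMOTIONS]
    weakest_target = min(emotions.get(k, 0.0) for k in TARGET_EMOTIONS)
    return polarity < 0 and weakest_target > max(others, default=0.0)
```

Excluding “depression” from the overlap check is a design choice of this sketch: since every collected post carries that tag, counting it would make the second condition trivially true.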

2.2 Posts Features

To characterize the study group, we used linguistic and visual analyses. The first approach analyzed the captions (description and tags) to explore the users’ mental status; the second analyzed the images from the data collection to understand their composition at high and low levels.

2.2.1 Linguistic Features

For text analysis, we calculated the frequency of each tag to obtain the most used tags for each period (pre- and post-pandemic). For the image descriptions, we extracted tokens to examine their frequencies and associated topics, using wordclouds and NLP strategies from the NLTK library.
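
Token and N-gram frequencies of this kind can be sketched with the standard library alone. The paper uses NLTK; the version below is an assumption-laden stand-in with a deliberately tiny stopword list and hypothetical function names, meant only to show the counting step.

```python
import re
from collections import Counter
from typing import Iterable, List

# Tiny illustrative stopword list; a real run would use a full list such as NLTK's.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "it", "to", "this"}


def tokenize(text: str) -> List[str]:
    """Lowercase, keep alphabetic runs only, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def ngram_counts(texts: Iterable[str], n: int = 1) -> Counter:
    """Count n-grams (as tuples of tokens) across all descriptions."""
    counts: Counter = Counter()
    for text in texts:
        tokens = tokenize(text)
        # zip over n shifted copies of the token list yields consecutive n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts
```

Calling `ngram_counts(descriptions, 2).most_common(20)` on the descriptions of one period would give the bigram frequencies that a wordcloud then renders by size.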

2.2.2 Visual Features

For image analysis, we pre-processed the images by reading each one in grayscale and computing a grayscale histogram. We then converted the original images to RGB to perform an RGB histogram analysis. To understand the environment or scene and the attributes in the images, we implemented a deep learning Convolutional Neural Network (CNN) using PyTorch [13] together with Places365-CNN [14] and WideResNet18 [15].
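
The histogram step can be illustrated with a dependency-free sketch. It assumes an image already decoded into a flat sequence of 8-bit (R, G, B) pixels and uses the ITU-R BT.601 luma weights for the grayscale conversion; the paper does not state which library or conversion was used, and the `low_light_fraction` helper with its threshold is purely illustrative.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def rgb_histograms(pixels: Sequence[Pixel]) -> Dict[str, List[int]]:
    """Per-channel 256-bin histograms: hist[c][v] = number of pixels with value v."""
    hist = {"R": [0] * 256, "G": [0] * 256, "B": [0] * 256}
    for r, g, b in pixels:
        hist["R"][r] += 1
        hist["G"][g] += 1
        hist["B"][b] += 1
    return hist


def gray_histogram(pixels: Sequence[Pixel]) -> List[int]:
    """256-bin histogram of gray levels, using the ITU-R BT.601 luma weights."""
    hist = [0] * 256
    for r, g, b in pixels:
        gray = int(round(0.299 * r + 0.587 * g + 0.114 * b))
        hist[gray] += 1
    return hist


def low_light_fraction(gray_hist: Sequence[int], threshold: int = 85) -> float:
    """Share of pixels darker than `threshold` (an arbitrary cut for illustration)."""
    total = sum(gray_hist)
    return sum(gray_hist[:threshold]) / total if total else 0.0
```

In practice an image library would do the decoding and binning; the point here is only that summing such histograms per period allows the light-intensity and channel distributions to be compared.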

3 Results

3.1 Linguistic Features

By using deep learning models and NLP strategies, we identified the most relevant patterns and features associated with depression tags from different Flickr users.

3.1.1 N-Gram Analysis

After pre-processing and cleaning the text, we obtained the wordcloud-based representation of the post description N-grams for both groups (pre- and post-pandemic) shown in Fig. 1A and B, in which the size of each word indicates its frequency in the dataset.

Fig. 1. Linguistic Features Analysis. A Wordcloud of posts description during pre-pandemic. B Wordcloud of post description during post-pandemic. C Wordcloud of the main topics during pre-pandemic. D Wordcloud of the main topics during post-pandemic. E Frequency of tags associated to negative sentiments during pre and post-pandemic.

Through the topic N-grams presented in Fig. 1C and D, we can observe in the pre-pandemic group evidence of seasonal sadness or depression (“depressing”, “winter”) and of sadness and distress (“gloomy”, “mental”, “affliction”). In contrast, the N-grams of the post-pandemic group mainly contained words describing topics related to the coronavirus pandemic (“isolation”, “year”, “impact”).

3.1.2 Tags Analysis

To analyze tags, we used VADER with a polarity threshold below zero and kept only tags with a frequency of at least 80 for a clearer graphic representation. In Fig. 1E, the pre-pandemic tags show attributes describing symptomatological characteristics of depression, such as “sad”, “melancholic”, “gloomy” and “vulnerable”. For the post-pandemic period, in contrast, we found attributes associated with a general feeling of fear and restlessness (“scared”, “frightening”, “apocalyptic”) and with a feeling of loneliness (“loneliness”, “isolation”, “sadness”).

3.2 Visual Features

We observed that visual features are associated with the mental health tag status observed for the users. We analyzed brightness, color distribution, light intensity, environment and scenes. Figure 2A and B show the RGB and gray histograms compared by period. In the pre-pandemic period, the pixel counts tend to accumulate around medium and high levels of light intensity, while in the post-pandemic period they accumulate mostly at low levels. The color distribution also differs between periods: it is uniformly distributed among the three channels (RGB) in the pre-pandemic period, while a non-uniform distribution is observed in the post-pandemic period.

In addition, we classified the environment and scenes by period using the CNN under two classes: “indoor” (pre-pandemic: 234 images, post-pandemic: 338 images) and “outdoor” (pre-pandemic: 322 images, post-pandemic: 210 images). Thus, indoor images increased by 44.4% while outdoor images decreased by 34.8%. We also obtained 35 main image attributes across both periods. The pre-pandemic period was dominated by attributes related to outdoor activities and scenes, such as “driving”, “transporting”, “sunny” or “trees”, whereas in the post-pandemic period indoor activities and scene attributes such as “indoor lighting”, “enclosed area”, “working”, “stressful” or “far-away horizon” were more frequent. The attribute results and a sample of the pre- and post-pandemic images used in the CNN can be seen in Fig. 2C and Fig. 3.
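
The percentages follow directly from the reported image counts. As a quick check, the relative change between periods can be computed as below; the helper name `percent_change` is illustrative.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Image counts per class as reported in the text.
counts = {
    "indoor": {"pre": 234, "post": 338},
    "outdoor": {"pre": 322, "post": 210},
}

changes = {
    scene: round(percent_change(c["pre"], c["post"]), 1)
    for scene, c in counts.items()
}
```

With these counts, `changes` holds 44.4 for indoor and -34.8 for outdoor, i.e. a rise in indoor images and a fall in outdoor images of those magnitudes.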

Fig. 2. Visual Features Analysis. A RGB histogram analysis during pre and post-pandemic. B Gray histogram analysis during pre and post-pandemic. C Attributes frequency of the images scenes during pre and post-pandemic.

Fig. 3. Sample of image dataset from A pre-pandemic period and B post-pandemic period.

4 Discussion

After extracting publications (text and image), we applied the pre-processing steps and filters (VADER and EmoLex) and obtained a dataset characterized mainly by negative polarity and depression-related emotions. We observed almost no variation in the number of publications between periods (556 vs 548 publications). This suggests that the number of users who present symptoms of the pathology did not increase during the pandemic. However, as the frequency distributions of N-grams and tags suggest, the topics changed and became more related to the pandemic situation in that period.

For instance, N-gram topics changed from “gloomy”, “depressing” or “winter” to “work”, “isolation”, “year” and “impact”. This may be due to confinement and remote work, since people were unable to go to their workplaces or travel. This leads us to infer that there was a significant shift in the general mood due to the global pandemic, since a majority of the population was confined and forced to stay home, often in isolation from their loved ones, which can easily generate feelings of panic and anxiety [16, 17].

For the image analysis, we performed histogram and scene analyses. The histogram results indicate that users posted images with a balanced distribution of the three channels in the pre-pandemic period, while in the post-pandemic period the posted pictures show blue and green dominance, i.e. images with a more disturbed color distribution.

In addition, the image attributes suggest a relevant change in the content of the photos, which are more outdoor-like in the pre-pandemic period and more indoor-like in the post-pandemic period. The exploration of the image environments gives the same result, which makes sense given the extensive quarantines. The visual feature analysis leads us to infer that, while the patterns differ between periods, both are characteristic of depressive mood, since both present distributions of color and light intensity with most pixels at low levels, which according to several studies is directly associated with depression [18, 19].

5 Conclusion

We observed a relationship between depression symptoms and the coronavirus pandemic, as shown in the N-gram and tag analyses: the most frequent tags changed from “sad” and “depression” (pre-pandemic) to “tired” and “isolation” (post-pandemic), and the publications of the post-pandemic group show an overall increased feeling of “fear”.

These analyses indicate that the impact of the situation was large enough to change users’ behavior on social media. Post-pandemic users reflect feelings, moods and even symptoms consistent with depression but originating from the pandemic, whereas pre-pandemic users reflect mostly symptoms of common depressive mood.

Finally, we observed behavioral differences between the pre- and post-pandemic periods: users associated with mental illness posted darker images and more indoor scenes during the COVID-19 pandemic. Therefore, analyzing images and posted information from social media with NLP and deep learning strategies can help improve the classification of users into depressive and non-depressive groups and support early diagnosis.