
Currently submitted to: JMIR Mental Health

Date Submitted: Mar 11, 2024
Open Peer Review Period: Mar 11, 2024 - May 6, 2024
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Machine Learning for Depression Risk Monitoring on Chinese Social Media: A Comprehensive Evaluation and Analysis

  • Zhenwen Zhang; 
  • Zepeng Li; 
  • Zhihua Guo; 
  • Jianghong Zhu; 
  • Yu Zhang; 
  • Bin Hu

ABSTRACT

Background:

Depression is a significant global public health issue that affects the physical and mental well-being of hundreds of millions of people worldwide. However, many social media users with depression go undiagnosed and struggle to access timely and effective treatment, which is increasingly becoming a major societal health concern.

Objective:

This paper aims to explore and develop an online depression risk detection method based on deep learning technology to identify individuals at risk of depression on the Chinese social media platform Sina Weibo.

Methods:

We first collected approximately 527,333 posts publicly shared over one year by 1600 individuals with depression and 1600 individuals without depression on the Sina Weibo platform. We then developed a hierarchical Transformer network to learn semantic features for each user. The network comprises two levels of Transformer structures, one at the word level and one at the sentence level. The Transformers extract the textual semantic features of each post, and the features of all posts from a user are aggregated into user-level semantic features. A classifier is then applied to predict the risk of depression. Finally, we conducted statistical and linguistic analyses of the content of posts from individuals with and without depression using the Chinese LIWC.
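The linguistic analysis mentioned at the end of the Methods can be illustrated with a minimal sketch. It assumes that each category value is the share of a group's words that fall into that category, which the abstract does not state explicitly. The lexicon `TOY_LEXICON`, the function names, and the English tokens are hypothetical stand-ins: the real Chinese LIWC dictionary is far larger, and Weibo posts would first need Chinese word segmentation.

```python
from collections import Counter

# Hypothetical mini-lexicon (category -> words). This is NOT the Chinese LIWC
# dictionary; it only mimics its shape for illustration.
TOY_LEXICON = {
    "NegEmo": {"sad", "afraid", "angry", "hopeless"},
    "Sad": {"sad", "hopeless"},
    "Anx": {"afraid"},
    "Anger": {"angry"},
}


def category_rates(tokens, lexicon):
    """Return, per category, the share of tokens that belong to it."""
    total = len(tokens)
    if total == 0:
        return {name: 0.0 for name in lexicon}
    counts = Counter()
    for token in tokens:
        for name, words in lexicon.items():
            if token in words:
                counts[name] += 1
    return {name: counts[name] / total for name in lexicon}


def group_rates(posts, lexicon):
    """Pool all tokenized posts of a group and compute category rates."""
    pooled = [token for post in posts for token in post]
    return category_rates(pooled, lexicon)


# Three already-tokenized toy posts (10 tokens in total).
posts = [
    ["i", "feel", "sad", "and", "afraid"],
    ["so", "angry", "today"],
    ["nice", "walk"],
]
rates = group_rates(posts, TOY_LEXICON)
print(rates)
```

Computing the same rates separately for the depression and non-depression groups would give per-category values that can be compared between groups.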

Results:

We divided the original dataset into training, validation, and test sets. The training set consists of 1000 individuals with depression and 1000 individuals without depression. The validation and test sets each include 600 users, 300 with depression and 300 without. Without any sampling technique, our method achieved an accuracy of 84.62%, precision of 84.43%, recall of 84.50%, and F1 score of 84.32% on the test set. With our proposed retrieval-based sampling strategy, it achieved an accuracy of 95.46%, precision of 95.30%, recall of 95.70%, and F1 score of 95.43%. These results demonstrate the effectiveness of the proposed depression risk detection model and retrieval-based sampling technique, and they provide new insights for large-scale depression detection through social media.

Language behavior analysis shows that individuals with depression are more likely to use negation words (the value of "swear" is 0.001253). This may indicate negative emotions, rejection, doubt, disagreement, or aversion. We also found that individuals with depression tend to use negative emotional vocabulary (NegEmo: 0.022306, Anx: 0.003829, Anger: 0.004327, Sad: 0.005740), which may reflect their internal negative emotions and psychological state. This frequent use of negative vocabulary could be a way for individuals with depression to express negative feelings towards life, themselves, or their surrounding environment.
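For readers less familiar with the four classification metrics reported in the Results, the following minimal sketch shows how they are derived from true and predicted labels. It assumes a binary setup in which the depression class is the positive label; the abstract does not say whether the reported figures are computed this way or averaged over both classes, and the function name `binary_metrics` and the toy labels are illustrative only.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (1 = at risk of depression, 0 = not at risk); not data from the study.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On a balanced test set such as the one described (300 users per class), accuracy is not inflated by a majority class, so it can be read alongside precision and recall without further adjustment.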

Conclusions:

The research results indicate the feasibility and effectiveness of deep learning methods in detecting the risk of depression. This provides insights into the potential for large-scale, automated, and non-invasive prediction of depression among users of online social media.


 Citation

Please cite as:

Zhang Z, Li Z, Guo Z, Zhu J, Zhang Y, Hu B

Machine Learning for Depression Risk Monitoring on Chinese Social Media: A Comprehensive Evaluation and Analysis

JMIR Preprints. 11/03/2024:58259

DOI: 10.2196/preprints.58259

URL: https://preprints.jmir.org/preprint/58259


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.
