Comparison of machine learning models for early depression detection from users’ posts

Mothe, Josiane; Ramiandrisoa, Faneva; Ullah, Md Zia

Authors

Josiane Mothe

Faneva Ramiandrisoa

Md Zia Ullah



Contributors

Fabio Crestani
Editor

David E. Losada
Editor

Javier Parapar
Editor

Abstract

With around 300 million people worldwide suffering from depression, detecting this disorder is crucial and remains a challenge for both individual and public health. As with many diseases, early detection means better medical management; social media messages, as potential clues to depression, offer an opportunity to assist this early detection by automatic means. This chapter is based on the participation of the CNRS IRIT laboratory in the task on early detection of depression (eRisk) at the CLEF evaluation forum. Early depression detection differs from depression detection in that it considers temporality: the system must make its decision about a user’s possible depression with as little data as possible. In this chapter we re-evaluate, on the different collections, the models we have developed for our participation in eRisk over the years, in order to obtain a more robust comparison. We also add new models. We use well-established classification methods, such as logistic regression, random forest, and Support Vector Machine (SVM). The data of each user, from which the system should detect whether that user is depressed, are represented as vectors composed of (a) various task-oriented features, including depression-related lexicons, and (b) word and document embeddings extracted from the user’s posts. We perform an ablation study to identify the most important features for our models. We also use the BERT deep learning architecture for comparison purposes, both for depression detection and for early depression detection. According to our results, well-established machine learning models are still better than more modern models for early detection of depression.

Citation

Mothe, J., Ramiandrisoa, F., & Ullah, M. Z. (2022). Comparison of machine learning models for early depression detection from users’ posts. In F. Crestani, D. E. Losada, & J. Parapar (Eds.), Early Detection of Mental Health Disorders by Social Media Monitoring: The First Five Years of the eRisk Project (pp. 111-139). Springer. https://doi.org/10.1007/978-3-031-04431-1_5

Online Publication Date Sep 15, 2022
Publication Date 2022-09
Deposit Date Mar 8, 2023
Publisher Springer
Pages 111-139
Book Title Early Detection of Mental Health Disorders by Social Media Monitoring: The First Five Years of the eRisk Project
ISBN 978-3-031-04430-4
DOI https://doi.org/10.1007/978-3-031-04431-1_5