Comparison of machine learning models for early depression detection from users’ posts

Mothe, Josiane; Ramiandrisoa, Faneva; Ullah, Md Zia

doi:10.1007/978-3-031-04431-1_5

Authors

Josiane Mothe

Faneva Ramiandrisoa

Dr Md Zia Ullah M.Ullah@napier.ac.uk
Lecturer

Contributors

Fabio Crestani
Editor

David E. Losada
Editor

Javier Parapar
Editor

Abstract

With around 300 million people worldwide suffering from depression, detecting this disorder is crucial and a challenge for both individual and public health. As with many diseases, early detection means better medical management; using social media messages as potential clues to depression is an opportunity to assist this early detection by automatic means. This chapter is based on the participation of the CNRS IRIT laboratory in the task on early detection of depression (eRisk) at the CLEF evaluation forum. Early depression detection differs from depression detection in that it considers temporality; the system must make its decision about a user’s possible depression with as little data as possible. In this chapter, we re-evaluate the models we developed for our eRisk participation over the years on the different collections, to obtain a more robust comparison. We also add new models. We use well-established classification methods, such as logistic regression, random forest, and support vector machine (SVM). The data of each user whom the system must classify as depressed or not are represented as vectors composed of (a) various task-oriented features, including depression-related lexicons, and (b) word and document embeddings, all extracted from the user’s posts. We perform an ablation study to identify the most important features for our models. We also use the BERT deep learning architecture for comparison purposes, both for depression detection and for early depression detection. According to our results, well-established machine learning models still perform better than more modern models for early detection of depression.

Citation

Mothe, J., Ramiandrisoa, F., & Ullah, M. Z. (2022). Comparison of machine learning models for early depression detection from users’ posts. In F. Crestani, D. E. Losada, & J. Parapar (Eds.), Early Detection of Mental Health Disorders by Social Media Monitoring: The First Five Years of the eRisk Project (111-139). Springer. https://doi.org/10.1007/978-3-031-04431-1_5

Online Publication Date	Sep 15, 2022
Publication Date	2022-09
Deposit Date	Mar 8, 2023
Publisher	Springer
Pages	111-139
Book Title	Early Detection of Mental Health Disorders by Social Media Monitoring: The First Five Years of the eRisk Project
ISBN	978-3-031-04430-4
DOI	https://doi.org/10.1007/978-3-031-04431-1_5
Public URL	http://researchrepository.napier.ac.uk/Output/3011076