Mental Health Disorder Detection beyond Social Media: A Systematic Review of Available Datasets

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/32pcux3ahpau

Abstract

Detecting mental health disorders in a timely manner is an important societal challenge. NLP and machine learning (ML) methods used to assist with detection rely on data collected primarily from social media. However, such datasets often have sampling biases and inherent ethical and privacy issues. One avenue to overcome these limitations is non-social media data. We present the first comprehensive review of non-social media, free-text datasets for mental health research. We use the PRISMA methodology to conduct our survey and we review datasets available in multiple languages. We find that non-social media, free-text datasets focus predominantly on English and on detecting depression. These datasets also vary in demographics, platforms, data types, annotation techniques, and methodologies. This systematic review also reveals key gaps and highlights opportunities to develop more diverse, reliable, and clinically relevant resources.

Details

Paper ID
lrec2026-main-494
Pages
pp. 6235-6250
BibKey
puspo-etal-2026-mental
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 - 16 May 2026

Authors

  • Sadiya Sayara Chowdhury Puspo
  • Ana-Maria Bucur
  • Stevie Chancellor
  • Özlem Uzuner
  • Marcos Zampieri

Links