LREC 2026 (main)

Identifying Contexts of Distress in College Students' Reddit Posts: A Comparative Study of Classical NLP and Large Language Models

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/2k99z869ni4v

Abstract

Mental health is a salient and growing societal concern among college students. Social media platforms such as Reddit offer a rich source of data regarding how students talk about their mental health, and NLP tools may assist in identifying when a student is struggling. In this paper, we investigate how different NLP tools can be used to extract the context surrounding college students' expressions of distress. We construct a novel dataset from Reddit posts (College Distress on Reddit, or CDR) and examine the "classical NLP pipeline" and modern generative LLMs on this data. Our dataset exploration is conducted in parallel with, and contrasted against, the Dreaddit dataset to examine cross-domain variation. Results show that standard or "classical" NLP tools extract a limited number of concrete entities, whereas generative models can infer more nuanced causes. However, LLMs struggle with knowledge extraction in specific content areas. Our work shows how important it is to be wary of LLMs, especially in mental health contexts.

Details

Paper ID
lrec2026-main-758
Pages
pp. 9657-9668
BibKey
graff-etal-2026-identifying
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Carine Graff

  • Nikhil Krishnaswamy
