
SocialStep: Fast Prediction of Social Determinants of Health

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/2t6bdjt7dvvi

Abstract

Given thousands of medical documents, how can we automatically uncover patients’ social risk factors? Social Determinants of Health (SDoH) constitute a growing class of non-clinical risk factors that shape patient trajectories. While clinically significant, automatic detection of SDoH from free text remains understudied due to scarce and imbalanced training data. Current approaches often rely on monolithic large language models. We present SocialStep, a two-step hybrid pipeline that first uses a lightweight classifier to triage sentences and then applies a Large Language Model (LLM) for multilabel classification to the relevant subset. On the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, SocialStep improves macro F1 by 5 points over the state-of-the-art baseline while running 12.2× faster. These findings demonstrate that integrating compact neural encoders with large language models provides a scalable and highly accurate framework for clinical NLP tasks, including SDoH extraction. Notably, we also observe some unexpected patterns in LLM performance. SocialStep offers a practical blueprint for hybrid model deployment that identifies critical social risk factors without prohibitive computational cost.
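The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the keyword cues, and the functions `cheap_triage`, `expensive_labeler`, and `two_step` are all hypothetical stand-ins, with simple keyword matching in place of both the lightweight classifier and the LLM. The only point it shows is that the costly second stage runs on the triaged subset rather than on every sentence.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical label cues for illustration only; the paper's real SDoH
# label set and models are not described on this page.
LABEL_CUES: Dict[str, Set[str]] = {
    "housing": {"homeless", "shelter"},
    "employment": {"unemployed", "laid off"},
    "substance_use": {"alcohol", "smokes"},
}


def cheap_triage(sentence: str) -> bool:
    """Stand-in for the lightweight classifier: may this sentence mention SDoH?"""
    text = sentence.lower()
    return any(cue in text for cues in LABEL_CUES.values() for cue in cues)


def expensive_labeler(sentence: str) -> Set[str]:
    """Stand-in for the LLM multilabel step; a real system would query a model."""
    text = sentence.lower()
    return {
        label
        for label, cues in LABEL_CUES.items()
        if any(cue in text for cue in cues)
    }


def two_step(
    sentences: List[str],
    triage: Callable[[str], bool] = cheap_triage,
    labeler: Callable[[str], Set[str]] = expensive_labeler,
) -> Tuple[List[Set[str]], int]:
    """Label each sentence, calling the expensive stage only when triage passes.

    Returns one label set per input sentence plus the number of expensive calls.
    """
    results: List[Set[str]] = []
    expensive_calls = 0
    for sentence in sentences:
        if triage(sentence):
            expensive_calls += 1
            results.append(labeler(sentence))
        else:
            # Sentences rejected by triage never reach the expensive stage.
            results.append(set())
    return results, expensive_calls


notes = [
    "Patient is currently homeless and drinks alcohol daily.",
    "Blood pressure 120/80, afebrile.",
    "He was laid off last month.",
    "Chest x-ray shows no acute process.",
]
labels, llm_calls = two_step(notes)
```

In this toy input, half of the sentences are filtered out before the second stage. Any speedup in a real deployment would depend on how selective and how accurate the first-stage classifier is.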

Details

Paper ID
lrec2026-main-846
Pages
pp. 10802-10814
BibKey
landes-etal-2026-socialstep
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Paul Landes
  • Adam Richard Cross
  • Jimeng Sun
