Pediatric Sepsis Cohort Detection Using In-Context Pointwise V-Usable Information
Proceedings of the 8th Workshop on Clinical Natural Language Processing (Clinical NLP) @ LREC 2026
Abstract
Pediatric sepsis diagnosis remains a major clinical challenge due to non-specific symptoms and a lack of reliable diagnostic criteria. Large language models (LLMs) offer a scalable way to process and interpret unstructured text in medical records. However, identifying the most suitable model is non-trivial given the rapid growth in the number of available LLMs. In this work, we propose using in-context pointwise V-usable information (pvi) to estimate task difficulty and guide model selection for pediatric sepsis cohort detection. We apply in-context pvi to 12 state-of-the-art open LLMs, using electronic medical record data from 507 patient encounters at a U.S. children’s hospital. We then compare the pvi-selected LLM against feature-rich baseline models and a fine-tuned transformer. Our results show that the pvi-selected LLM outperforms the baselines, although a feature-rich bag-of-words model with a support vector machine also achieves competitive performance. Our approach demonstrates a promising application of current LLM techniques to high-stakes clinical tasks.