GigitAI at ArchEHR-QA 2026: Prompting Strategies and Constitutional AI for Clinical Question Answering
Proceedings of the Third Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC 2026
Abstract
Answering patient questions from electronic health records requires identifying relevant evidence in lengthy clinical notes and generating faithful, patient-friendly answers. We present a systematic study of LLM prompting strategies for both tasks, evaluating 21 evidence identification methods and 13 answer generation methods across 7 language models. For evidence identification, we find that LLM prompting outperforms traditional retrieval (BM25, SBERT, BioLinkBERT) by 19 F1 points, and that prompt framing alone controls the precision–recall trade-off: inclusive framing achieves 90% recall on the development set, while balanced framing reaches 67% precision. For answer generation, we introduce a Constitutional AI pipeline that critiques and revises answers against five clinical faithfulness principles, improving BLEU and ROUGE over the constrained baseline. Our analysis reveals that chain-of-thought effectiveness is strongly model-dependent, and that simple, well-designed prompts outperform complex multi-step pipelines. We evaluate our approaches on the ArchEHR-QA 2026 shared task at CL4Health, achieving 58.0 F1 for evidence identification and an overall score of 31.8 for answer generation.
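To make the precision–recall trade-off concrete, the following is a minimal sketch of set-based scoring for evidence identification. It assumes that evidence is scored as a set of sentence identifiers compared against gold annotations; the function name, the identifiers, and the two toy predictions are hypothetical illustrations, not the shared task's official scorer or data.

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 over sentence identifiers.

    Assumption: evidence is a set of sentence IDs; this is an
    illustrative scorer, not the official shared-task evaluation.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: gold evidence sentences for one question.
gold = {"2", "5", "7"}

# An "inclusive" prompt tends to select more sentences (higher recall),
# a "balanced" prompt fewer (higher precision). Both sets are invented.
inclusive = {"1", "2", "3", "5", "7", "8"}
balanced = {"2", "5"}

for name, prediction in [("inclusive", inclusive), ("balanced", balanced)]:
    p, r, f1 = precision_recall_f1(prediction, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case the inclusive selection recovers every gold sentence at the cost of precision, while the balanced selection is fully precise but misses one gold sentence, which mirrors the direction of the trade-off described above.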