HSE NLP TEAM at MEDIQA-SYNUR 2026: Consensus Adjudication Ensemble (ACE): Balancing Precision and Recall for Schema-Bystander Clinical Extraction
Proceedings of the 8th Workshop on Clinical Natural Language Processing (Clinical NLP) @ LREC 2026
Abstract
Clinical documentation from nurse dictations is labor-intensive and error-prone, yet the dictations contain high-value observations that must be transferred into structured flowsheets. The MEDIQA-SYNUR 2026 shared task evaluates systems that extract and ontology-align 193 clinical concepts (with heterogeneous value types) from synthetic speech transcripts derived from intensive care notes. We describe the Consensus Adjudication Ensemble (ACE), a three-stage pipeline that (i) maximizes candidate coverage via complementary generators, (ii) enforces high precision through a dedicated adjudicator that operates as a verifier rather than a generator, and (iii) restores strict schema compliance using a targeted, token-efficient repair step. On the official test set we achieve an exact-match micro-F1 of 0.7996 (P=0.7812, R=0.8188), ranking 4th on the leaderboard. Beyond the competitive result, we analyze clinically relevant failure modes (hallucinated interventions, over-confident categorical labels, and unit/normalization errors) and quantify the adjudication trade-off: the adjudicator removes 2,219 candidates, of which 91.3% are genuine false positives and 8.7% are true positives removed by mistake. Finally, targeted schema repair reduces the validation context from approximately 230k tokens to fewer than 2k per document while preserving most extraction gains.
Keywords: clinical information extraction, nurse dictations, ontology alignment, ensemble methods, adjudication, error analysis
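The headline figures can be sanity-checked with a few lines of arithmetic. The sketch below assumes the standard harmonic-mean definition of F1 and reads the 91.3% / 8.7% split as fractions of the 2,219 removed candidates. The function and variable names are illustrative and are not taken from the system's code, and the derived counts are approximate because the reported percentages are rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Scores reported in the abstract.
precision, recall = 0.7812, 0.8188
f1 = f1_score(precision, recall)

# Adjudication trade-off: candidates removed by the adjudicator.
removed = 2219
# Assumption: 91.3% is a share of the removed candidates.
removed_false_positives = round(removed * 0.913)
removed_true_positives = removed - removed_false_positives

print(f"F1 = {f1:.4f}")  # F1 = 0.7996
print(f"removed false positives ~ {removed_false_positives}")
print(f"removed true positives  ~ {removed_true_positives}")
```

Under these assumptions the recomputed F1 matches the reported 0.7996, and the adjudicator's removals split into roughly 2,026 false positives and 193 true positives.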