Back to Main Conference 2024
LREC-COLING 2024main

A Multi-Task Transformer Model for Fine-grained Labelling of Chest X-Ray Reports

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/2fmkm2ffo9ni

Abstract

Precise understanding of free-text radiology reports through localised extraction of clinical findings can enhance medical imaging applications like computer-aided diagnosis. We present a new task, that of segmenting radiology reports into topically meaningful passages (segments) and a transformer-based model that both segments reports into semantically coherent segments and classifies each segment using a set of 37 radiological abnormalities, thus enabling fine-grained analysis. This contrasts with prior work that performs classification on full reports without localisation. Trained on over 2.7 million unlabelled chest X-ray reports and over 28k segmented and labelled reports, our model achieves state-of-the-art performance on report segmentation (0.0442 WinDiff) and multi-label classification (0.84 report-level macro F1) over 37 radiological labels and 8 NLP-specific labels. This work establishes new benchmarks for fine-grained understanding of free-text radiology reports, with precise localisation of semantics unlocking new opportunities to improve computer vision model training and clinical decision support. We open-source our annotation tool, model code and pretrained weights to encourage future research.

Details

Paper ID
lrec2024-main-0078
Pages
pp. 862-875
BibKey
zhu-etal-2024-multi
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 25 May 2024

Authors

  • YZ

    Yuanyi Zhu

  • ML

    Maria Liakata

  • GM

    Giovanni Montana

Links