
Statistical Section Segmentation in Free-Text Clinical Records

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

DOI:10.63317/3tsi2bjytv6w

Abstract

Automatically segmenting and classifying clinical free text into sections is an important first step to automatic information retrieval, information extraction and data mining tasks, as it helps to ground the significance of the text within. In this work we describe our approach to automatic section segmentation of clinical records such as hospital discharge summaries and radiology reports, along with section classification into pre-defined section categories. We apply machine learning to the problems of section segmentation and section classification, comparing a joint (one-step) and a pipeline (two-step) approach. We demonstrate that our systems perform well when tested on three data sets, two for hospital discharge summaries and one for radiology reports. We then show the usefulness of section information by incorporating it in the task of extracting comorbidities from discharge summaries.

Details

Paper ID
lrec2012-main-605
Pages
pp. 2001-2008
BibKey
tepper-etal-2012-statistical
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-7-7
Conference
Eighth International Conference on Language Resources and Evaluation
Location
Istanbul, Turkey
Date
21-27 May 2012

Authors

  • Michael Tepper
  • Daniel Capurro
  • Fei Xia
  • Lucy Vanderwende
  • Meliha Yetisgen-Yildiz
