Creation of a Doctor-Patient Dialogue Corpus Using Standardized Patients

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/3vxfai9xnfu7

Abstract

In this paper we describe the development of a doctor-patient dialogue corpus to support a speech-to-speech machine translation effort for English-Persian medical dialogues. The corpus was developed by recording and transcribing English-to-English dialogues between medical students and standardized patients (actors who have been trained to portray illness or injury victims), and then translating them into Persian. We discuss some of the benefits and drawbacks of creating a corpus in this way. Benefits include the ability to customize the corpus in a way that would be infeasible for actual doctor-patient data and the avoidance of privacy and legal issues; drawbacks include the fact that the Persian does not originate as speech, but as a text translation of English speech. We address concerns such as the authenticity of the dialogues and the value of such data for system development.

Details

Paper ID
lrec2004-main-223
Pages
N/A
BibKey
melvin-etal-2004-creation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26-28 May 2004

Authors

  • Robert S. Melvin
  • Win May
  • Shrikanth Narayanan
  • Panayiotis Georgiou
  • Shadi Ganjavi