
The Effect of Text Difficulty on Machine Translation Performance – A Pilot Study with ILR-Rated Texts in Spanish, Farsi, Arabic, Russian and Korean

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/5gwzcduuk297

Abstract

We report on initial experiments that examine the relationship between automated measures of machine translation (MT) performance (Doddington 2003; Papineni et al. 2001) and the Interagency Language Roundtable (ILR) scale of language proficiency/difficulty, which has been in standard use for U.S. government language training and assessment for the past several decades (Child, Clifford and Lowe 1993). The main question we ask is how technology-oriented measures of MT performance relate to the ILR difficulty levels. We understand that a linguist with ILR proficiency level N is expected to be able to understand a document rated at level N, but to have increasing difficulty with documents at higher levels. We find that some key aspects of MT performance track with ILR difficulty levels, primarily for MT output whose quality is good enough to be readable by human readers.

Details

Paper ID
lrec2004-main-382
Pages
N/A
BibKey
clifford-etal-2004-effect
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26 May 2004 to 28 May 2004

Authors

  • Ray Clifford

  • Neil Granoien

  • Douglas Jones

  • Wade Shen

  • Clifford Weinstein
