
Applying a Dynamic Bayesian Network Framework to Transliteration Identification

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/4gmr8qbctv9p

Abstract

Transliteration identification aims to enrich multilingual lexicons and improve performance in various Natural Language Processing (NLP) applications, including Cross Language Information Retrieval (CLIR) and Machine Translation (MT). This paper describes work on applying the widely used graphical modeling approach of Dynamic Bayesian Networks (DBNs) to transliteration identification. The task of estimating transliteration similarity is not very different from specific identification tasks where DBNs have been successfully applied; it is also possible to adapt DBN models from those other identification domains to the transliteration identification domain. In particular, we investigate the applicability of a DBN framework initially proposed by Filali and Bilmes (2005) to learn edit distance estimation parameters for use in pronunciation classification. The DBN framework enables the specification of a variety of models representing different factors that can affect string similarity estimation. Three DBN models associated with two of the DBN classes originally specified by Filali and Bilmes (2005) have been tested in an experimental setup for Russian-English transliteration identification. Two of the DBN models achieve high transliteration identification accuracy, and combining the models improves accuracy further.
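
As a rough illustration of the edit-distance-based string similarity that the abstract refers to, the sketch below scores a Russian-English pair with a weighted edit distance. It is not the DBN model of this paper or of Filali and Bilmes (2005). The cost table `SUB_COST`, the function names, and all cost values are hypothetical and hand-set for illustration; in the framework described above, such parameters would be learned from data.

```python
# Hypothetical substitution costs for a few Cyrillic-Latin character pairs.
# These values are illustrative assumptions, not parameters from the paper.
SUB_COST = {
    ("п", "p"): 0.0,
    ("е", "e"): 0.0,
    ("т", "t"): 0.0,
    ("р", "r"): 0.0,
}


def weighted_edit_cost(source, target, sub_cost, indel_cost=1.0, default_sub=1.0):
    """Minimum total cost of edits turning `source` into `target`."""
    rows, cols = len(source) + 1, len(target) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one indel per character.
    for i in range(1, rows):
        dp[i][0] = i * indel_cost
    for j in range(1, cols):
        dp[0][j] = j * indel_cost
    for i in range(1, rows):
        for j in range(1, cols):
            pair = (source[i - 1], target[j - 1])
            dp[i][j] = min(
                dp[i - 1][j] + indel_cost,  # delete a source character
                dp[i][j - 1] + indel_cost,  # insert a target character
                dp[i - 1][j - 1] + sub_cost.get(pair, default_sub),  # substitute
            )
    return dp[-1][-1]


def similarity(source, target, sub_cost):
    """Map the edit cost to a score in [0, 1], assuming all costs are at most 1."""
    longest = max(len(source), len(target))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_edit_cost(source, target, sub_cost) / longest
```

With this toy table, `similarity("петр", "petr", SUB_COST)` scores the pair as a perfect match, while an unrelated pair such as `("петр", "london")` scores much lower. A transliteration identifier could threshold such a score, although the threshold itself would also have to be chosen or learned.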

Details

Paper ID
lrec2010-main-622
Pages
N/A
BibKey
nabende-2010-applying
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17-23 May 2010

Authors

  • Peter Nabende
