
The Development of the Multilingual LUNA Corpus for Spoken Language System Porting

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/4ydayhanv4ks

Abstract

The development of annotated corpora is a critical step in building speech applications for multiple target languages. While the technology for developing a monolingual speech application has reached satisfactory results (in terms of performance and effort), porting an existing application from a source language to a target language is still a very expensive task. In this paper we address the problem of creating multilingual aligned corpora and evaluating them in the context of a spoken language understanding (SLU) porting task. We discuss the challenges of manually creating multilingual corpora and present the algorithms for creating multilingual SLU via Statistical Machine Translation (SMT).

Details

Paper ID
lrec2014-main-613
Pages
pp. 2675-2678
BibKey
stepanov-etal-2014-development
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26-31 May 2014

Authors

  • Evgeny Stepanov
  • Giuseppe Riccardi
  • Ali Orkan Bayer
