
Bilingual speech corpus in two phonetically similar languages

Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006)

DOI:10.63317/37jyibqd3qag

Abstract

As Speech Recognition Systems improve, they become suitable for facing new problems. Multilingual speech recognition is one such problem. In the present work, the case of the Comunitat Valenciana multilingual environment is studied. The official languages in the Comunitat Valenciana (Spanish and Valencian) share most of their acoustic units, and their vocabularies and syntax are quite similar. They have influenced each other for many years. A small corpus on an Information System task was developed for experimentation purposes. This choice will make it possible to develop a working prototype in the future, and it is simple enough to build semi-automatic language models. The design of the acoustic corpus is discussed, showing that all combinations of accents have been studied (native, non-native speakers, male, female, etc.).

Details

Paper ID
lrec2006-main-143
Pages
N/A
BibKey
alabau-martinez-2006-bilingual
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-2-4
Conference
Fifth International Conference on Language Resources and Evaluation
Location
Genoa, Italy
Date
24 May 2006 to 26 May 2006

Authors

  • Vicente Alabau

  • Carlos D. Martínez
