
Discriminating Similar Languages: Evaluations and Explorations

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/4e2puhxuby8y

Abstract

We present an analysis of the performance of machine learning classifiers on discriminating between similar languages and language varieties. We carried out a number of experiments using the results of the two editions of the Discriminating between Similar Languages (DSL) shared task. We investigate the progress made between the two tasks, estimate an upper bound on possible performance using ensemble and oracle combination, and provide learning curves to help us understand which languages are more challenging. A number of difficult sentences are identified and investigated further with human annotation.
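As an illustration of the ensemble and oracle combination mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes plurality voting as the ensemble rule (the abstract does not say which rule was used) and defines the oracle as counting a sentence correct when at least one system labels it correctly. The function names and the toy predictions are hypothetical.

```python
from collections import Counter


def plurality_vote(predictions):
    """Combine per-system label lists by plurality vote.

    `predictions` is a list of label lists, one per system, all of the
    same length. Ties go to the earliest system holding a top-voted label
    (an assumption made for this sketch).
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        voted.append(next(lab for lab in labels if counts[lab] == best))
    return voted


def accuracy(predicted, gold):
    """Fraction of sentences whose predicted label matches the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def oracle_accuracy(predictions, gold):
    """Fraction of sentences that at least one system labels correctly.

    This is an upper bound on what any per-sentence selection among
    these systems' outputs could achieve.
    """
    per_sentence = zip(zip(*predictions), gold)
    hits = sum(any(p == g for p in labels) for labels, g in per_sentence)
    return hits / len(gold)


# Toy data (hypothetical): five sentences, three systems.
gold = ["bs", "hr", "sr", "pt-BR", "pt-PT"]
systems = [
    ["bs", "hr", "hr", "pt-PT", "pt-BR"],
    ["hr", "hr", "sr", "pt-PT", "pt-BR"],
    ["bs", "sr", "sr", "pt-BR", "pt-BR"],
]

ensemble = plurality_vote(systems)
print("ensemble accuracy:", accuracy(ensemble, gold))
print("oracle accuracy:  ", oracle_accuracy(systems, gold))
```

On this toy data the oracle scores higher than the vote because one sentence is labelled correctly by only a minority of systems. The sentences that no system gets right are the ones the oracle cannot recover.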

Details

Paper ID
lrec2016-main-284
Pages
pp. 1800-1807
BibKey
goutte-etal-2016-discriminating
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Cyril Goutte

  • Serge Léger

  • Shervin Malmasi

  • Marcos Zampieri
