LREC 2022 (Main Conference)

EENLP: Cross-lingual Eastern European NLP Index

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/3a7bdciqv9e4

Abstract

Motivated by the sparsity of NLP resources for Eastern European languages, we present a broad index of existing Eastern European language resources (90+ datasets and 45+ models), published as a GitHub repository open for updates from the community. Furthermore, to support the evaluation of commonsense reasoning tasks, we provide hand-crafted cross-lingual datasets for five different semantic tasks (namely news categorization, paraphrase detection, natural language inference (NLI), tweet sentiment detection, and news sentiment detection) for some of the Eastern European languages. We perform several experiments with existing multilingual models on these datasets to define performance baselines and compare them to existing results for other languages.

Details

Paper ID
lrec2022-main-220
Pages
pp. 2050-2057
BibKey
tikhonov-etal-2022-eenlp
Editors
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20 - 25 June 2022

Authors

  • Alexey Tikhonov
  • Alex Malkhasov
  • Andrey Manoshin
  • George-Andrei Dima
  • Réka Cserháti
  • Md. Sadek Hossain Asif
  • Matt Sárdi
