European Union Language Resources in Sketch Engine

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/27jdqbuvdhiz

Abstract

Several parallel corpora built from European Union language resources are presented here. They were processed with state-of-the-art tools and made available to researchers in the corpus manager Sketch Engine. A completely new resource is introduced: the EUR-Lex Corpus, one of the largest parallel corpora currently available. It contains 840 million English tokens, and its largest language pair, English-French, has more than 25 million aligned segments (paragraphs).

Details

Paper ID
lrec2016-main-445
Pages
pp. 2799-2803
BibKey
baisa-etal-2016-european
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 to 28 May 2016

Authors

  • Vít Baisa
  • Jan Michelfeit
  • Marek Medveď
  • Miloš Jakubíček

Links