LIdioms: A Multilingual Linked Idioms Data Set

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/2md6xukrx6x5

Abstract

In this paper, we describe the LIDIOMS data set, a multilingual RDF representation of idioms currently covering five languages: English, German, Italian, Portuguese, and Russian. The data set is intended to support natural language processing applications by providing links between idioms across languages. The underlying data was crawled and integrated from various sources. To ensure the quality of the crawled data, all idioms were evaluated by at least two native speakers. Herein, we present the model devised for structuring the data. We also detail how LIDIOMS is linked to well-known multilingual data sets such as BabelNet. The resulting data set complies with the best practices of the Linguistic Linked Open Data community.

Details

Paper ID
lrec2018-main-392
Pages
N/A
BibKey
moussallem-etal-2018-lidioms
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Diego Moussallem
  • Mohamed Ahmed Sherif
  • Diego Esteves
  • Marcos Zampieri
  • Axel-Cyrille Ngonga Ngomo

Links