Improving Bilingual Terminology Extraction from Comparable Corpora via Multiple Word-Space Models

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

Abstract

There is a rich flora of word space models that have proven their efficiency in many different applications including information retrieval (Dumais, 1988), word sense disambiguation (Schütze, 1992), various semantic knowledge tests (Lund, 1995; Karlgren, 2001), and text categorization (Sahlgren, 2005). Based on the assumption that each model captures some aspects of word meanings and provides its own empirical evidence, we present in this paper a systematic exploration of the principal corpus-based word space models for bilingual terminology extraction from comparable corpora. We find that, once we have identified the best procedures, a very simple combination approach leads to significant improvements compared to individual models.

Details

Paper ID

lrec2016-main-661

Pages

pp. 4184-4187

DOI

10.63317/32tnfkvmz9nm

BibKey

hazem-morin-2016-improving

Editors

Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

978-2-9517408-9-1

Conference

Tenth International Conference on Language Resources and Evaluation

Location

Portorož, Slovenia

Date

23 - 28 May 2016

Authors

Amir Hazem
Emmanuel Morin
