
MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/52gcepr8aezz

Abstract

We present MultiVec, a new toolkit for computing continuous representations for text at different granularity levels (word-level or sequences of words). MultiVec includes word2vec's features, paragraph vector (batch and online) and bivec for bilingual distributed representations. MultiVec also includes different distance measures between words and sequences of words. The toolkit is written in C++ and is aimed at being fast (in the same order of magnitude as word2vec), easy to use, and easy to extend. It has been evaluated on several NLP tasks: the analogical reasoning task, sentiment analysis, and crosslingual document classification.
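As an illustration of the kind of distance measure the abstract mentions, the sketch below computes cosine similarity between two word vectors and between two word sequences. It is a minimal sketch and not MultiVec's API: the toy embedding values, the function names `cosine_similarity` and `sequence_vector`, and the choice to represent a sequence as the mean of its word vectors are all assumptions made for illustration.

```python
import math

# Toy 3-dimensional embeddings (hypothetical values, not produced by MultiVec).
EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.7, 0.3, 0.1],
    "movie": [0.1, 0.9, 0.2],
    "film": [0.2, 0.8, 0.3],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return dot / (norm_u * norm_v)


def sequence_vector(words, embeddings):
    """Represent a word sequence as the mean of its known word vectors.

    Averaging is an assumption of this sketch, chosen for simplicity.
    """
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        raise ValueError("no word in the sequence has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Word-level comparison.
word_sim = cosine_similarity(EMBEDDINGS["good"], EMBEDDINGS["great"])

# Sequence-level comparison.
seq_a = sequence_vector(["good", "movie"], EMBEDDINGS)
seq_b = sequence_vector(["great", "film"], EMBEDDINGS)
seq_sim = cosine_similarity(seq_a, seq_b)
```

With these toy values, "good" is closer to "great" than to "bad", and the two averaged sequences are also close to each other. A trained model would supply the vectors in place of the hand-written dictionary.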

Details

Paper ID
lrec2016-main-662
Pages
pp. 4188-4192
BibKey
berard-etal-2016-multivec
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Alexandre Bérard

  • Christophe Servan

  • Olivier Pietquin

  • Laurent Besacier

Links