SCALE: A Scalable Language Engineering Toolkit

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/4vjahevtd9da

Abstract

In this paper we present SCALE, a new Python toolkit that provides two extensions to n-gram language models. The first extension, Semantic Head Mapping (SHM), is a novel technique for modeling compound words. The second extension, Bag-of-Words Language Modeling (BagLM), bundles popular models such as Latent Semantic Analysis and Continuous Skip-grams. Both extensions scale to large data and can be integrated into first-pass ASR decoding. The toolkit is open source, includes working examples, and is available at http://github.com/jorispelemans/scale.

Details

Paper ID
lrec2016-main-612
Pages
pp. 3868-3871
BibKey
pelemans-etal-2016-scale
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 to 28 May 2016

Authors

  • Joris Pelemans
  • Lyan Verwimp
  • Kris Demuynck
  • Hugo Van hamme
  • Patrick Wambacq
