IceBATS: An Icelandic Adaptation of the Bigger Analogy Test Set
Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)
Abstract
Word embedding models have become commonplace in a wide range of NLP applications. To train and use the best possible models, accurate evaluation is needed. For intrinsic evaluation of word embedding models, analogy evaluation sets have been shown to be good quality estimators. We introduce IceBATS, an Icelandic adaptation of the large analogy dataset BATS, evaluate three different word embedding models on it, and show that our evaluation set is apt at measuring the capabilities of such models.
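As a rough illustration of what an analogy evaluation measures, the sketch below scores a toy model with the common vector-offset method: for a question "a is to b as c is to ?", it picks the vocabulary word whose vector is closest (by cosine similarity) to b - a + c. The abstract does not say which scoring method was used for IceBATS, so this is an assumption; the function names, the 2-dimensional vectors, and the single question are all hypothetical and are not taken from the dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def solve_analogy(vectors, a, b, c):
    """Answer 'a : b :: c : ?' with the vector-offset method.

    The three question words are excluded from the candidates,
    as is usual in analogy evaluation.
    """
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def analogy_accuracy(vectors, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(solve_analogy(vectors, a, b, c) == d for a, b, c, d in questions)
    return correct / len(questions)


# Hypothetical toy embeddings (invented for illustration only).
toy_vectors = {
    "maður": [1.0, 0.0],      # man
    "kona": [1.0, 1.0],       # woman
    "kóngur": [3.0, 0.0],     # king
    "drottning": [3.0, 1.0],  # queen
    "hestur": [0.0, 3.0],     # horse (distractor)
}

# One hypothetical question: maður : kona :: kóngur : drottning
questions = [("maður", "kona", "kóngur", "drottning")]

accuracy = analogy_accuracy(toy_vectors, questions)
print(f"accuracy = {accuracy:.2f}")
```

In a real evaluation the vectors would come from a trained embedding model and the questions from the analogy set, with accuracy usually reported per relation category.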