LREC 2022, Main Conference
PortiLexicon-UD: a Portuguese Lexical Resource according to Universal Dependencies Model
Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)
Abstract
This paper presents PortiLexicon-UD, a large, freely available lexicon for Portuguese that provides morphosyntactic information according to the Universal Dependencies model. The resource supplies part-of-speech (PoS) tags, lemmas, and morphological features for words, and contains 1,221,218 entries (a word form is counted once for each distinct combination of PoS tag, lemma, and morphological features). We report the lexicon creation process, its computational data structure, and its evaluation over an annotated corpus, which shows that it has high language coverage and good-quality data.
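To make the entry-counting convention and the coverage evaluation concrete, here is a minimal sketch in Python. It is not the paper's data structure or evaluation procedure: the `Analysis` record, the `build_lexicon` and `coverage` helpers, the lowercasing of word forms, and the toy rows are all assumptions made for illustration. Only the Universal Dependencies tag and feature notation (for example `NOUN`, `Gender=Fem|Number=Sing`) follows the published UD conventions.

```python
from collections import defaultdict
from typing import NamedTuple


class Analysis(NamedTuple):
    """One lexicon entry for a word form (hypothetical layout)."""
    lemma: str
    upos: str   # Universal Dependencies PoS tag
    feats: str  # UD morphological features, "Key=Value|Key=Value"


def build_lexicon(rows):
    """Map each lowercased word form to the set of its analyses."""
    lexicon = defaultdict(set)
    for form, lemma, upos, feats in rows:
        lexicon[form.lower()].add(Analysis(lemma, upos, feats))
    return lexicon


def coverage(lexicon, tokens):
    """Fraction of (form, lemma, upos) corpus tokens found in the lexicon."""
    if not tokens:
        return 0.0
    hits = sum(
        1
        for form, lemma, upos in tokens
        if any(a.lemma == lemma and a.upos == upos
               for a in lexicon.get(form.lower(), ()))
    )
    return hits / len(tokens)


# Toy data, invented for this sketch (not taken from PortiLexicon-UD).
ROWS = [
    ("casa", "casa", "NOUN", "Gender=Fem|Number=Sing"),
    ("casa", "casar", "VERB",
     "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin"),
    ("casas", "casa", "NOUN", "Gender=Fem|Number=Plur"),
]
TOKENS = [
    ("casa", "casa", "NOUN"),
    ("Casas", "casa", "NOUN"),
    ("gato", "gato", "NOUN"),   # not in the toy lexicon
    ("casa", "casar", "VERB"),
]

lexicon = build_lexicon(ROWS)
entry_count = sum(len(analyses) for analyses in lexicon.values())
corpus_coverage = coverage(lexicon, TOKENS)
```

In this sketch the form "casa" contributes two entries, one as a noun and one as a verb, which mirrors how a single word form can account for several of the lexicon's entries. Matching on lemma and PoS tag only is one possible coverage criterion; a stricter check could also compare the morphological features.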