Port4NooJ v3.0: Integrated Linguistic Resources for Portuguese NLP

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/5ddrxy69mwjo

Abstract

This paper introduces Port4NooJ v3.0, the latest version of the Portuguese module for NooJ, highlights its main features, and details its three main new components: (i) a lexicon-grammar-based dictionary of 5,177 human intransitive adjectives, together with a set of local grammars that use the distributional properties of those adjectives for paraphrasing; (ii) a polarity dictionary with 9,031 entries for sentiment analysis; and (iii) a set of priority dictionaries and local grammars for named entity recognition. These new components were derived and/or adapted from publicly available resources. The Port4NooJ v3.0 resource is innovative in the specificity of the linguistic knowledge it incorporates. The dictionary is bilingual Portuguese-English, and the semantico-syntactic information assigned to each entry validates the linguistic relation between the terms in both languages. These characteristics, which cannot be found in any other public resource for Portuguese, make it a valuable resource for translation and paraphrasing. The paper presents the current statistics and describes the different complementary and synergistic components and the integration efforts.

Details

Paper ID
lrec2016-main-201
Pages
pp. 1264-1269
BibKey
mota-etal-2016-port4nooj
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Cristina Mota

  • Paula Carvalho

  • Anabela Barreiro