
Floresta Sintá(c)tica: A treebank for Portuguese

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/45dcr6ebdxoy

Abstract

This paper reviews the first year of the creation of a publicly available treebank for Portuguese, Floresta Sintá(c)tica, a collaborative project between the VISL and the Computational Processing of Portuguese projects. After briefly describing the main goals and the organization of the project, the creation of the annotated objects is presented in detail: preparing the text to be annotated, applying the Constraint Grammar based PALAVRAS parser, revising its output manually in a two-stage process, and carefully documenting the linguistic options. Some examples of the interesting problems dealt with are presented, and the paper ends with a brief description of the tools developed, the project results so far, and a mention of a preliminary inter-annotator test and what was learned from it.

Details

Paper ID
lrec2002-main-001
Pages
N/A
BibKey
afonso-etal-2002-floresta
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 - 31 May 2002

Authors

  • Susana Afonso

  • Eckhard Bick

  • Renato Haber

  • Diana Santos

Links