Back to Main Conference 2014
LREC 2014main

The Procedure of Lexico-Semantic Annotation of Składnica Treebank

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/2kfht9vs9z4f

Abstract

In this paper, the procedure of lexico-semantic annotation of Składnica Treebank using Polish WordNet is presented. Other semantically annotated corpora, in particular treebanks, are outlined first. Resources involved in annotation as well as a tool called Semantikon used for it are described. The main part of the paper is the analysis of the applied procedure. It consists of the basic and correction phases. During basic phase all nouns, verbs and adjectives are annotated with wordnet senses. The annotation is performed independently by two linguists. During the correction phase, conflicts are resolved by the linguist supervising the process. Multi-word units obtain special tags, synonyms and hypernyms are used for senses absent in Polish WordNet. Additionally, each sentence receives its general assessment. Finally, some statistics of the results of annotation are given, including inter-annotator agreement. The final resource is represented in XML files preserving the structure of Składnica.

Details

Paper ID
lrec2014-main-380
Pages
pp. 2290-2297
BibKey
hajnicz-2014-procedure
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26 May 2014 31 May 2014

Authors

  • EH

    Elżbieta Hajnicz

Links