Using the Text Corpus to Create a Comprehensive List of Phrasal Verbs

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/26byg78236ko

Abstract

The paper describes the extraction of Estonian multi-word verbs (MWVs) from text corpora using SENVA, a language- and task-specific software tool based on SENTA, a statistical, language-independent tool (Dias et al., 2000). The outcome is a comprehensive list of 16,000 phrasal verbs. We describe the extraction tool and the principles of manual post-editing, and evaluate the outcome in terms of precision and recall by comparing the results with manually compiled electronic dictionaries and with the results of a manual extraction experiment on a subset of the MWVs.

Details

Paper ID
lrec2002-main-165
Pages
N/A
BibKey
kaalep-muischnek-2002-using
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29-31 May 2002

Authors

  • Heiki-Jaan Kaalep

  • Kadri Muischnek