Building the Valency Lexicon of Arabic Verbs

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/4fb657bzqqfx

Abstract

This paper describes the building of a valency lexicon of Arabic verbs using a morphologically and syntactically annotated corpus, the Prague Arabic Dependency Treebank (PADT), as its primary source. We present the theoretical account of valency developed within the Functional Generative Description (FGD) theory. We apply the framework to Modern Standard Arabic and discuss various valency-related phenomena with reference to examples from the corpus. We then outline the methodology and the linguistic and technical resources used in building the lexicon. The key concept in our scenario is that of PDT-VALLEX for Czech. Our lexicon will be developed by linking the conceivable entries with their instances in the treebank. Conversely, the treebank’s annotations will be linked to the lexicon. While a comparable scheme has been developed for Czech, our own contribution is to design and implement this model thoroughly for Arabic and the PADT data. The Arabic valency lexicon is intended for applications in computational parsing or language generation, and for use by human researchers. In particular, the proposed valency lexicon will be exploited during further tectogrammatical annotation of PADT and might serve to enrich the expected second edition of the corpus-based Arabic-Czech Dictionary.

Details

Paper ID
lrec2008-main-172
Pages
N/A
BibKey
bielicky-smrz-2008-building
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 to 30 May 2008

Authors

  • Viktor Bielický

  • Otakar Smrž
