Back to Main Conference 2016
LREC 2016main

A Proposal for a Part-of-Speech Tagset for the Albanian Language

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/2cjkscg8crkk

Abstract

Part-of-speech tagging is a basic step in Natural Language Processing that is often essential. Labeling the word forms of a text with fine-grained word-class information adds new value to it and can be a prerequisite for downstream processes like a dependency parser. Corpus linguists and lexicographers also benefit greatly from the improved search options that are available with tagged data. The Albanian language has some properties that pose difficulties for the creation of a part-of-speech tagset. In this paper, we discuss those difficulties and present a proposal for a part-of-speech tagset that can adequately represent the underlying linguistic phenomena.

Details

Paper ID
lrec2016-main-682
Pages
pp. 4305-4310
BibKey
kabashi-proisl-2016-proposal
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 28 May 2016

Authors

  • BK

    Besim Kabashi

  • TP

    Thomas Proisl

Links