KPWr: Towards a Free Corpus of Polish

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

DOI:10.63317/5a28vt5wze4w

Abstract

This paper presents our efforts to collect and annotate a free corpus of Polish. The corpus will serve us as training and testing material for experiments with Machine Learning algorithms. As others may also benefit from the resource, we are going to release it under a Creative Commons licence, which we hope will remove unnecessary usage restrictions and facilitate reproduction of our experimental results. The corpus is being annotated with various types of linguistic entities: chunks and named entities, selected syntactic and semantic relations, word senses and anaphora. We report on the current state of the project as well as our ultimate goals.

Details

Paper ID
lrec2012-main-574
Pages
pp. 3218-3222
BibKey
broda-etal-2012-kpwr
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-7-7
Conference
Eighth International Conference on Language Resources and Evaluation
Location
Istanbul, Turkey
Date
21-27 May 2012

Authors

  • Bartosz Broda
  • Michał Marcińczuk
  • Marek Maziarz
  • Adam Radziszewski
  • Adam Wardyński
