Manually Annotated Corpus of Polish Texts Published between 1830 and 1918

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/45z36wrvg3z9

Abstract

The paper presents a manually annotated historical corpus of 625,000 tokens drawn from Polish texts published between 1830 and 1918: fiction, drama, popular science, essays and newspapers of the period. The corpus provides three layers: transliteration, transcription and morphosyntactic annotation. Both the annotation process and the corpus itself are described in detail in the paper.

Details

Paper ID
lrec2018-main-609
Pages
N/A
BibKey
kieras-wolinski-2018-manually
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Witold Kieraś

  • Marcin Woliński

Links