Back to Main Conference 2010
LREC 2010main

Multimodal Russian Corpus (MURCO): First Steps

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2gfujyv6khxs

Abstract

The paper introduces the Multimodal Russian Corpus (MURCO), which has been created in the framework of the Russian National Corpus (RNC). The MURCO provides the users with the great amount of phonetic, orthoepic, intonational information related to Russian. Moreover, the deeply annotated part of the MURCO contains the data concerning Russian gesticulation, speech act system, types of vocal gestures and interjections in Russian, and so on. The Corpus is on free access. The paper describes the main types of annotation and the interface structure of the MURCO. The MURCO consists of two parts, the second part being the subset of the first: 1) the whole Corpus, which is annotated from the lexical (lemmatization), morphological, semantic, accentological, metatextual, socioligical point of view (these types of annotation are standard for the RNC), and also from the point of view of phonetics (the orthoepic annotation and the mark-up of accentological word structure), 2) the deeply annotated MURCO, which is annotated in addition from the point of view of gesticulation and speech act structure.

Details

Paper ID
lrec2010-main-091
Pages
N/A
BibKey
grishina-2010-multimodal
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • EG

    Elena Grishina

Links