Linguistic and Computational Problems for the Creation of an Italian Children’s Corpus of Spoken Language
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)
Abstract
In this paper we describe the criteria adopted for creating a corpus of spoken language produced by children aged six to eleven in different communicative situations, the methodology used for data collection, and the transcription, coding, and lemmatization phases. We also give some quantitative descriptions of the nouns, verbs, and adjectives present in the corpus. Qualitative analyses of the adjectives are underway. This work is part of the activities carried out within the framework of the "Corpus di Linguaggio Infantile" (C.L.I.), a special project of the Italian National Research Council (CNR).