An Environment for Dialogue Corpora Collection (ENDIACC)

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/4o273c3wuzug

Abstract

In this paper we present an environment for dialogue corpora collection (ENDIACC), which is part of our long-term program to develop methodology and tools for designing systems with Emulated Language Competence (ELC systems). ELC systems are those able to communicate interactively with their human users in human language. The key point of our research is the development of a methodology for systematic studies of the human user interacting with a machine or with another human. The methods of acquiring the initial linguistic knowledge necessary at the early stages of ELC system design are being systematically implemented for the Polish language at Adam Mickiewicz University. The element we focus on in this presentation is an open experimental setting for generating empirical data about NL dialogues in the form of dialogue corpora. The main problem with the corpus-based empirical approach is the absence of an easy and inexpensive way of collecting naturally generated dialogue recordings. The problem may be partially solved by designing experiments in which natural dialogues can be recorded. The novelty of this paper's proposal is a free, easily accessible, language-independent software platform, ENDIACC (ENvironment for DIAlogue Corpora Collection), which provides an experimental setting for collecting corpora of written (keyboard) dialogues in text mode. This platform is particularly well adapted to the collection of corpora of chat-like dialogues in text mode combined with MMS-like technologies. The system requires a graphical operating system (e.g. Windows or Linux) with a Java interpreter. It will be freely accessible for research purposes from http://main.amu.edu.pl/~zlisi.

Details

Paper ID
lrec2004-main-377
Pages
N/A
BibKey
vetulani-2004-environment
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26-28 May 2004

Authors

  • Zygmunt Vetulani