Construction of the Corpus of Everyday Japanese Conversation: An Interim Report

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/33kpbtjoikgt

Abstract

In 2016, we launched a new corpus project in which we are building a large-scale corpus of everyday Japanese conversation in a balanced manner, aiming at exploring characteristics of conversations in contemporary Japanese through multiple approaches. The corpus targets various kinds of naturally occurring conversations in daily situations, such as conversations during dinner with the family at home, meetings with colleagues at work, and conversations while driving. In this paper, we first introduce an overview of the corpus, including corpus size, conversation variations, recording methods, structure of the corpus, and annotations to be included in the corpus. Next, we report on the current stage of the development of the corpus and legal and ethical issues discussed so far. Then we present some results of the preliminary evaluation of the data being collected. We focus on whether or not the 94 hours of conversations collected so far vary in a balanced manner by reference to the survey results of everyday conversational behavior that we conducted previously to build an empirical foundation for the corpus design. We will publish the whole corpus in 2022, consisting of more than 200 hours of recordings.

Details

Paper ID
lrec2018-main-672
Pages
N/A
BibKey
koiso-etal-2018-construction
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Hanae Koiso
  • Yasuharu Den
  • Yuriko Iseki
  • Wakako Kashino
  • Yoshiko Kawabata
  • Ken’ya Nishikawa
  • Yayoi Tanaka
  • Yasuyuki Usuda

Links