A Vietnamese Dialog Act Corpus Based on ISO 24617-2 standard
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Abstract
Voice-based human-machine interaction systems such as personal virtual assistants, chat-bots, and automatic contact centres are becoming increasingly popular. As a result, conversation mining is attracting growing research attention, and standardized data play an important role in it. In this paper, we present a new Vietnamese corpus annotated for dialog acts using the ISO 24617-2 standard (2012), for emotions using Ekman's six primitives (1972), and for sentiment using the tags "positive", "negative", and "neutral". Emotion and sentiment are tagged at the functional segment level. We describe how the corpus was constructed and provide a brief statistical description of the data. This is the first Vietnamese dialog act corpus.