Deep Reinforcement Learning-based Dialogue Policy with Graph Convolutional Q-network

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/3et23bk5ig2w

Abstract

Deep reinforcement learning (DRL) has been successfully applied to the dialogue policy of task-oriented dialogue systems. However, one challenge for existing DRL-based dialogue policy methods is that their state-action representations are unstructured and cannot capture the relationship between dialogue states and actions. To alleviate this problem, we propose a graph-structured dialogue policy framework for task-oriented dialogue systems. More specifically, we use an unsupervised approach to construct two different bipartite graphs. We then generate user-related and knowledge-related subgraphs by matching dialogue sub-states against the bipartite graph nodes. A variant of the graph convolutional network is employed to encode the dialogue subgraphs. After that, we use a bidirectional gated recurrent unit (BGRU) and a self-attention mechanism to obtain high-level historical state representations, and employ a neural network to obtain high-level current state representations. The two state representations are joined to learn the action values of the dialogue policy. Experiments with different DRL algorithms demonstrate that the proposed framework significantly improves the effectiveness and stability of dialogue policies.
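As a rough illustration of the pipeline described above, the following sketch encodes a small bipartite subgraph with one graph convolution, pools the node embeddings into a current-state vector, joins it with a historical-state vector, and maps the result to action values. It makes several assumptions that are not taken from the paper: it uses the standard graph convolution with symmetric normalization rather than the authors' variant, the historical state is a fixed placeholder standing in for the BGRU and self-attention output, and all names, sizes, and weights (`gcn_layer`, `q_values`, the 4-node graph, three actions) are hypothetical.

```python
import math
import random


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always > 0 because of self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def mean_pool(node_embeddings):
    """Average node embeddings into a single subgraph vector."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]


def q_values(current_state, history_state, q_weight):
    """Join the two state vectors and apply a linear Q-head (one row per action)."""
    joint = current_state + history_state  # list concatenation
    return [sum(w * x for w, x in zip(row, joint)) for row in q_weight]


random.seed(0)

# Hypothetical bipartite subgraph: nodes 0-1 on one side, nodes 2-3 on the other.
adj = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
gcn_weight = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]

adj_norm = normalize_adjacency(adj)
current_state = mean_pool(gcn_layer(adj_norm, features, gcn_weight))

# Placeholder for the BGRU + self-attention output (assumed, not computed here).
history_state = [0.5, -0.2]

num_actions = 3
q_weight = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(num_actions)]
q = q_values(current_state, history_state, q_weight)
greedy_action = max(range(num_actions), key=lambda a: q[a])
print(q, greedy_action)
```

In a DRL setting, the weights would be trained with a Q-learning style loss rather than sampled at random; the sketch only shows how the graph encoding and the two state representations could feed a single action-value head.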