
RT-VQ2A2: Real Time Vector Quantized Question Answering with ASR

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/4fz43nwumm98

Abstract

In Spoken Question Answering (SQA), automatic speech recognition (ASR) outputs are often relayed to language models for QA. However, constructing such a cascaded framework with large language models (LLMs) in a real-time SQA setting involves practical challenges, such as noise in the ASR output, the limited context length of LLMs, and the latency of processing with large models. This paper proposes a novel model-agnostic framework, RT-VQ2A2, to address these challenges. RT-VQ2A2 consists of three components: codebook preparation, a quantized semantic vector extractor, and a dual segment selector. We construct a codebook by clustering a text corpus derived from ASR and removing outliers, which mitigates the influence of ASR errors. Extracting quantized semantic vectors through the pre-built codebook yields significant speed and performance improvements in relevant context retrieval. The dual segment selector considers both semantic and lexical aspects to deal with ASR errors. The efficacy of RT-VQ2A2 is validated on the widely used Spoken-SQuAD dataset.
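
The three components named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the hashed bag-of-words `embed` function, the plain k-means in `build_codebook`, the Jaccard overlap used as the lexical signal, and the additive scoring in `select_segments` are all assumptions chosen for illustration, and the outlier-removal step mentioned in the abstract is omitted. All function names are hypothetical.

```python
import math


def embed(text, dim=16):
    """Toy stand-in for a sentence encoder (assumption): hashed bag-of-words,
    L2-normalised. A real system would use a learned embedding model."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        # Deterministic bucket for each token (avoids Python's salted hash()).
        vec[sum(ord(c) for c in tok) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def quantize(vec, codebook):
    """Return the index of the nearest codebook entry (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in codebook]
    return dists.index(min(dists))


def build_codebook(vectors, k, iters=10):
    """Plain k-means, seeded with the first k vectors for determinism.
    Outlier removal, which the abstract mentions, is not implemented here."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in vectors:
            buckets[quantize(v, centroids)].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def jaccard(a, b):
    """Lexical overlap between two strings as token-set Jaccard similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_segments(question, segments, codebook, top_k=2):
    """Rank segments by a semantic signal (same codebook entry as the
    question) plus a lexical signal (Jaccard overlap), and keep top_k."""
    q_code = quantize(embed(question), codebook)
    scored = []
    for seg in segments:
        semantic = 1.0 if quantize(embed(seg), codebook) == q_code else 0.0
        lexical = jaccard(question, seg)
        scored.append((semantic + lexical, seg))
    # sorted() is stable, so ties keep their original segment order.
    scored = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [seg for _, seg in scored[:top_k]]
```

In use, one would embed the ASR-derived segments once, call `build_codebook` on those vectors offline, and then call `select_segments` per question so that only the selected segments are passed to the language model. Comparing discrete codes rather than full vectors is what makes the lookup cheap in this sketch; how the paper combines the semantic and lexical signals may differ.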

Details

Paper ID
lrec2024-main-1238
Pages
pp. 14204-14214
BibKey
kim-etal-2024-rt
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 - 25 May 2024

Authors

  • Kyungho Kim
  • Seongmin Park
  • Jihwa Lee

Links