LREC-COLING 2024 (main)

Correcting Pronoun Homophones with Subtle Semantics in Chinese Speech Recognition

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/2bszornssi8x

Abstract

Speech recognition is becoming prevalent in daily life. However, because the entities they refer to appear in similar semantic contexts and their Chinese pronunciations overlap, pronoun homophones, especially “他/她/它” (he/she/it), all pronounced “Tā”, are often recognized incorrectly. Correcting them automatically during the post-processing of Chinese speech recognition is challenging. In this paper, we propose three models, tailored to different application scenarios, to address this common confusion. We implement a language model, an LSTM model with semantic features, and a rule-assisted N-gram model, which together cover a wide range of requirements, from high-precision settings to low-resource offline devices. Extensive experiments show that our models achieve the highest recognition rate for “Tā” correction, improving it from 70% in popular voice input methods to up to 90%. Further ablation analysis underscores the effectiveness of our models in enhancing recognition accuracy. Our models therefore improve the overall experience of Chinese speech recognition of “Tā” and reduce the burden of manual transcription corrections.

Details

Paper ID
lrec2024-main-0360
Pages
pp. 4047-4058
BibKey
zhang-etal-2024-correcting
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 to 25 May 2024

Authors

  • Zhaobo Zhang

  • Rui Gan

  • Pingpeng Yuan

  • Hai Jin
