From LLM Prompts to Acoustic Baselines: A Scalable Pipeline for Under-Resourced Disfluent Code-Mixed Speech
Spontaneous speech in multilingual communities often involves rapid code-mixing (CM) and natural disfluencies, yet such patterns are rarely reflected in available training data for under-resourced languages. This gap limits the development of robust automatic speech recognition (ASR) systems. To address this, we introduce BEHE-CMDisfl, a fully synthetic Bengali–English and Hindi–English disfluent code-mixed speech corpus generated through a controlled Large Language Model (LLM) and Text-to-Speech (TTS) pipeline. The dataset explicitly incorporates conversational phenomena such as filled pauses, repetitions, and restarts. We evaluate its usefulness under two ASR settings. In a micro-resource scenario (∼1.3 hours), a GMM-HMM Kaldi baseline achieved a 37.74% Word Error Rate (WER) after phonetic normalization to reduce transliteration inconsistencies, and successfully retained disfluency markers in decoding. We also examined adaptation of a modern foundation model. In zero-shot testing, openai/whisper-small failed on the code-mixed speech due to severe hallucinations and looping behavior. After applying parameter-efficient fine-tuning (LoRA) for 1,000 steps, the model stabilized, reduced insertion errors, captured rapid language switching more reliably, and achieved a WER of 21.37%. These findings show that synthetic data combined with efficient fine-tuning offers a practical path for ASR development in complex low-resource disfluent CM settings.
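Both reported results are Word Error Rates, so a minimal sketch of how WER is conventionally computed may help in reading them: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR hypothesis, divided by the number of reference words. The function name and the sample utterance below are illustrative assumptions and are not taken from the paper, whose scoring setup (including its phonetic normalization step) is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical disfluent code-mixed utterance (invented for illustration):
# the hypothesis drops the repeated "I", which counts as one deletion.
reference = "um I I want chai"
hypothesis = "um I want chai"
print(word_error_rate(reference, hypothesis))
```

Under this definition, a decoder that silently removes a repetition or filled pause is penalized with a deletion, which is why retaining disfluency markers in decoding affects the score.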