Request Correction
Use this form to request corrections to the paper metadata. Select the fields that need correction and provide the correct information.
Correction Guidelines
- Click the edit button next to a field to report a correction.
- Fill in the suggested correction value for each field you want to correct.
- Provide your name and email so we can contact you if needed.
Paper Information
Fine-tuning Whisper with Spontaneous Persian Speech (SPS)
Paper Fields
This paper introduces the Spontaneous Persian Speech (SPS) dataset, designed for automatic speech recognition (ASR) tasks, along with a methodology that lays the groundwork for addressing the shortage of spontaneous speech data. The corpus aims to support research on natural, conversational Persian, which remains under-represented in current ASR resources. The dataset consists of 694 minutes of audio from 65 speakers (34 male and 31 female) and contains 526,585 tokens. The audio segmentation step produces intervals of 1.24 to 3.25 seconds, each containing 3 to 9 words. The recordings cover a variety of environments, from inside cars to homes and shopping areas, including both busy and quiet settings. We use the SPS dataset to fine-tune Whisper, and performance improves significantly for both the small and medium models as measured by Word Error Rate (WER). This work could serve as a first step toward building domain-oriented datasets for specific ASR tasks.
Authors
Expand an author to correct their information. Use the remove button to request author removal, or add a new author.
PDF Attachment
You may attach a PDF as a corrected version of the paper. Max file size: 10MB. Only PDF files are accepted.
Your Information
Author Declaration *