Methods and Tools for Speech Data Acquisition exploiting a Database of German Parliamentary Speeches and Transcripts from the Internet
This paper describes methods that exploit stenographic transcripts of the German parliament to improve the acoustic models of a speech recognition system for this domain. Both the stenographic transcripts and the speech data are available on the Internet. Using data from the Internet avoids the costly collection and annotation of large amounts of data. The automatic data acquisition technique uses the stenographic transcripts and the audio of the German parliamentary speeches, together with general acoustic models trained on different data. The idea is to generate special finite state automata from the stenographic transcripts. These automata model the possible correspondences between the stenographic transcript and what was actually spoken, i.e. an accurate transcript. The first step is to recognize the speech data using the finite state automaton as a language model. The next step is to find, extract, and verify matches between sections of recognized words and the actually spoken audio content. The automatically extracted and verified data can then be used for acoustic model training. Experiments show that, for a given recognition task from the German Parliament domain, the word error rate decreases by 20% absolute.
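The matching step can be illustrated with a minimal sketch. The abstract does not specify how matches are found or verified, so the version below makes its own assumptions: it aligns the stenographic transcript with the recognizer output using Python's standard difflib module and keeps only runs of identical words of at least a minimum length. The function name extract_matches, the parameter min_len, and the sample sentences are hypothetical and are not taken from the paper.

```python
from difflib import SequenceMatcher


def extract_matches(transcript_words, recognized_words, min_len=3):
    """Return runs of recognized words that also occur in the transcript.

    Assumption (not from the paper): a run counts as a verified match
    when at least `min_len` consecutive words agree in both sequences.
    Each result is (start, end, words), indexing into recognized_words.
    """
    matcher = SequenceMatcher(a=transcript_words, b=recognized_words,
                              autojunk=False)
    segments = []
    for block in matcher.get_matching_blocks():
        # The final block returned by difflib is a sentinel of size 0.
        if block.size >= min_len:
            start, end = block.b, block.b + block.size
            segments.append((start, end, recognized_words[start:end]))
    return segments


# Hypothetical example: the stenographer dropped the fillers "ja" and "also".
transcript = "meine damen und herren ich danke ihnen sehr".split()
recognized = "ja meine damen und herren also ich danke ihnen sehr".split()

segments = extract_matches(transcript, recognized)
for start, end, words in segments:
    print(start, end, " ".join(words))
```

In a real system the word indices would be mapped back to time stamps in the audio, so that each matched segment yields an audio snippet paired with its text for acoustic model training. That mapping is outside the scope of this sketch.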