Automatic extraction of differences between spoken and written languages, and automatic translation from the written to the spoken language
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)
Abstract
We extracted the differences between spoken and written language by applying the UNIX command "diff" to a spoken-language corpus and a written-language corpus, and we examined those differences to determine how the grammars of the two corpora are constructed. We also transformed written-language sentences into spoken-language sentences by using rules based on the extracted differences.
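
The following is a minimal sketch of the general idea, not the authors' implementation. It assumes a pair of already-tokenized sentences, uses Python's difflib in place of the UNIX "diff" command, and treats each differing span as a rewrite rule. The function names extract_rules and apply_rules and the English example sentences are hypothetical and serve only as illustration.

```python
import difflib


def extract_rules(written_tokens, spoken_tokens):
    """Return (written_span, spoken_span) pairs where the two sequences differ."""
    matcher = difflib.SequenceMatcher(
        a=written_tokens, b=spoken_tokens, autojunk=False
    )
    rules = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # "replace", "delete", or "insert"
            rules.append(
                (tuple(written_tokens[i1:i2]), tuple(spoken_tokens[j1:j2]))
            )
    return rules


def apply_rules(tokens, rules):
    """Rewrite tokens left to right, preferring the longest matching span."""
    # Pure insertions have an empty written side and no anchor, so skip them.
    table = {src: dst for src, dst in rules if src}
    max_len = max((len(src) for src in table), default=0)
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in table:
                out.extend(table[span])
                i += n
                break
        else:
            # No rule starts at this position; copy the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical example pair (assumption: whitespace tokenization).
written = "I do not know the answer".split()
spoken = "I don't know the answer".split()
rules = extract_rules(written, spoken)
converted = apply_rules("we do not agree".split(), rules)
```

In this sketch the single extracted rule maps the written span "do not" to the spoken form "don't", and applying it to a new written sentence rewrites that span while leaving the other tokens untouched. A realistic system would need aligned corpora, rule filtering by frequency, and context conditions, none of which are specified in the abstract.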