Using a Parallel Transcript/Subtitle Corpus for Sentence Compression

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/2zdefmrmcbbn

Abstract

The paper describes the construction and use of a parallel corpus that pairs transcripts of television programs with the subtitles of those same programs. The subtitles were written for hearing-impaired viewers and are in the same language as the programs (Dutch). Our goal is to convert transcripts into subtitles. We will use the corpus to learn how to perform sentence compression, in much the same way as Jing (2001).

Details

Paper ID
lrec2004-main-093
Pages
N/A
BibKey
vandeghinste-tjong-kim-sang-2004-using
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26 May 2004 - 28 May 2004

Authors

  • Vincent Vandeghinste

  • Erik Tjong Kim Sang
