LREC 2004 main

Grouping Synonymous Sentences from a Parallel Corpus

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/5cxebhgvndau

Abstract

Recent natural language processing research has focused on data and processing techniques for paraphrasing. Unfortunately, little paraphrase data is available. Some studies have reported collecting synonymous expressions from parallel corpora, but no corpus suitable for collecting sets of paraphrases is yet available. As a result, when we applied this approach to a parallel corpus, the paraphrase sets we obtained contained only a few variations of expression. In this paper, we propose a grouping method based on the idea of recursively grouping synonymous sentences that are related through their translations, and we decompose incorrect groups using the DM-decomposition algorithm. Incorrect groups contain expressions that cannot paraphrase one another because some words or expressions have different meanings in different situations. We discuss our method and experimental results on BTEC, a multilingual parallel corpus.

Details

Paper ID
lrec2004-main-067
Pages
N/A
BibKey
kashioka-2004-grouping
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26 May 2004 to 28 May 2004

Authors

  • Hideki Kashioka
