Automatic paraphrasing based on parallel corpus for normalization

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/3sguu3wb9wga

Abstract

The same meaning can be expressed in many ways in natural language, and this diversity causes difficulty in many fields of natural language processing. It can be reduced by normalizing synonymous expressions, that is, by replacing the various synonymous expressions with a single standard one. In this paper, we propose a method for automatically extracting paraphrases from a parallel corpus and using them for normalization. First, synonymous sentences are grouped according to the equivalence of their translations. Then, synonymous expressions are extracted from the differences between synonymous sentences. To account for contextual conditions, the extracted expressions contain not only the interchangeable words but also the words surrounding them. Our method has two advantages: 1) only a parallel corpus is required, and 2) various types of paraphrases can be acquired.
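The two steps named in the abstract (grouping by translation, then extracting differences with their surrounding words) can be illustrated with a minimal sketch. This is not the authors' implementation: the exact-string match on translations, the token-level diff via Python's `difflib`, the one-token context window, the function names `group_by_translation` and `extract_paraphrases`, and the toy sentences with placeholder translations `T1` and `T2` are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def group_by_translation(pairs):
    """Group source sentences whose translations are identical strings.

    Assumption: "equivalence of translation" is approximated here by
    exact string equality of the translation side.
    """
    groups = defaultdict(list)
    for source, translation in pairs:
        if source not in groups[translation]:
            groups[translation].append(source)
    # Only groups with at least two distinct sentences can yield paraphrases.
    return [group for group in groups.values() if len(group) > 1]


def extract_paraphrases(group, context=1):
    """Collect differing token spans between sentences in one group.

    Each span is padded with up to `context` tokens on both sides, so the
    extracted expression carries some of its surrounding words.
    """
    found = set()
    for sent_a, sent_b in combinations(group, 2):
        tokens_a, tokens_b = sent_a.split(), sent_b.split()
        matcher = SequenceMatcher(a=tokens_a, b=tokens_b, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag != "replace":
                continue  # keep only spans that differ on both sides
            expr_a = " ".join(tokens_a[max(0, i1 - context):i2 + context])
            expr_b = " ".join(tokens_b[max(0, j1 - context):j2 + context])
            found.add((expr_a, expr_b))
    return found


# Hypothetical toy corpus: (source sentence, placeholder translation).
pairs = [
    ("i would like a room", "T1"),
    ("i want a room", "T1"),
    ("where is the station", "T2"),
]

for group in group_by_translation(pairs):
    print(extract_paraphrases(group))
```

On this toy input, the only group is the two sentences sharing `T1`, and the extracted pair is `("i would like a", "i want a")`: the differing words plus one token of context on each side. Normalization would then replace one member of such a pair with the other, chosen as the standard form; how that standard is selected is not specified in the abstract and is left out of the sketch.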

Details

Paper ID
lrec2002-main-086
Pages
N/A
BibKey
shimohata-sumita-2002-automatic
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 to 31 May 2002

Authors

  • Mitsuo Shimohata

  • Eiichiro Sumita
