Production of NLP-oriented Bilingual Language Resources from Human-oriented dictionaries

Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2000)

DOI:10.63317/57myumogzr22

Abstract

This paper considers the main features of manually produced bilingual dictionaries that were originally designed for human use. The problem is to find a way to use such dictionaries to produce bilingual language resources that can serve as a basis for automatic text processing, such as machine translation and cross-lingual querying in text retrieval. The proposed transformation technology is based on XML parsing of a file obtained from the source data by means of a series of special procedures. Automatic procedures suffice to produce a well-formed XML file. In most cases, however, semantic problems and inconveniences remain that can be removed only interactively. The volume of this interactive work can nevertheless be minimized through automatic pre-editing and suitable XML mark-up. The paper presents the results of an R&D project carried out in the framework of the ELRA 1999 Call for Proposals on Language Resources Production. It is based on the authors' experience with English-Russian and French-Russian dictionaries, but the technology can be applied to other language pairs.
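
The abstract does not give the dictionary format or the mark-up used in the project, so the following is only a minimal sketch of the general idea: an automatic pre-editing step wraps a plain-text entry in XML, and a standard parser then checks well-formedness and extracts the structure. The entry layout (`headword = t1, t2; t3`), the element names `entry`, `hw`, `sense` and `tr`, and both function names are assumptions made for illustration, not the authors' scheme.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def entry_to_xml(line: str) -> str:
    """Wrap one 'headword = t1, t2; t3' line in hypothetical XML mark-up.

    Semicolons are assumed to separate senses and commas to separate
    near-synonymous translations within a sense.
    """
    headword, _, rest = line.partition("=")
    parts = [f"<entry><hw>{escape(headword.strip())}</hw>"]
    for sense in rest.split(";"):
        translations = [t.strip() for t in sense.split(",") if t.strip()]
        if not translations:
            continue  # skip empty senses left by stray separators
        parts.append("<sense>")
        parts.extend(f"<tr>{escape(t)}</tr>" for t in translations)
        parts.append("</sense>")
    parts.append("</entry>")
    return "".join(parts)


def parse_entry(xml_text: str) -> dict:
    """Parse the mark-up; ET.fromstring raises ParseError if it is not well-formed."""
    root = ET.fromstring(xml_text)
    return {
        "headword": root.findtext("hw"),
        "senses": [[tr.text for tr in s.findall("tr")] for s in root.findall("sense")],
    }


record = parse_entry(entry_to_xml("house = дом, здание; палата"))
print(record)
```

A parser can only confirm that the file is well-formed. Whether a comma really separates synonyms in a given entry, for instance, is the kind of semantic question the abstract says still needs interactive checking.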

Details

Paper ID
lrec2000-main-243
Pages
N/A
BibKey
fluhr-semenova-etal-2000-production
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Second International Conference on Language Resources and Evaluation
Location
Athens, Greece
Date
31 May 2000 - 2 June 2000

Authors

  • Vera Fluhr-Semenova
  • Christian Fluhr
  • Stéphanie Brisson
