Bilingual Indexing for Information Retrieval with AUTINDEX
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)
Abstract
AUTINDEX is a bilingual automatic indexing system for German and English. It is being developed within the EU-funded project "BINDEX" (IST-1999-20028, November 2000 - April 2002). The aim of the system is to automatically index large quantities of abstracts of scientific and technical papers from several areas of engineering. These abstracts are provided by the project partners FIZ (Fachinformationszentrum) Technik in Frankfurt (Germany) and IEE (Institution of Electrical Engineers) in Stevenage (England), both of which are large information providers. Automatic indexing uses a controlled vocabulary (lists of approved "descriptors") drawn from monolingual and bilingual thesauri, which FIZ and IEE have made available and also use for manual indexing. For a given abstract, the indexing process uses these thesauri to produce a list of descriptors and a list of classification codes. The AUTINDEX system also supports free indexing, that is, indexing with an unrestricted vocabulary, which delivers so-called "free descriptors". These free descriptors are used to enhance and extend the thesauri.
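The distinction between controlled and free indexing can be illustrated with a minimal sketch. This is not the AUTINDEX implementation: the thesaurus entries, classification codes, stopword list, exact-phrase matching, and frequency threshold below are all assumptions invented for illustration, as is the choice to map German and English surface forms onto a single descriptor.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ThesaurusEntry:
    descriptor: str           # approved descriptor (controlled vocabulary)
    classification_code: str  # hypothetical code, not a real FIZ or IEE code
    variants: tuple           # surface forms in either language


# Toy bilingual thesaurus; every entry and code is invented for this sketch.
TOY_THESAURUS = (
    ThesaurusEntry("heat exchanger", "C01",
                   ("heat exchanger", "heat exchangers", "Wärmetauscher")),
    ThesaurusEntry("welding", "C02", ("welding", "Schweißen")),
)

# Tiny illustrative stopword list (English and German).
STOPWORDS = frozenset({"a", "of", "the", "and", "in",
                       "der", "die", "das", "und", "von"})


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def find_phrase(tokens, phrase):
    """Return every start position where the token list `phrase` occurs."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def index_abstract(text, thesaurus=TOY_THESAURUS, min_count=2):
    """Return controlled descriptors, classification codes, and free descriptors."""
    tokens = tokenize(text)
    covered = set()  # token positions already explained by the thesaurus
    descriptors, codes = [], []
    for entry in thesaurus:
        matched = False
        for variant in entry.variants:
            phrase = tokenize(variant)
            for start in find_phrase(tokens, phrase):
                covered.update(range(start, start + len(phrase)))
                matched = True
        if matched:
            descriptors.append(entry.descriptor)
            if entry.classification_code not in codes:
                codes.append(entry.classification_code)
    # Free indexing (assumed heuristic): frequent words outside the thesaurus.
    counts = Counter(t for i, t in enumerate(tokens)
                     if i not in covered and t not in STOPWORDS)
    free = sorted(t for t, c in counts.items() if c >= min_count)
    return {"descriptors": descriptors,
            "classification_codes": codes,
            "free_descriptors": free}


result = index_abstract(
    "Welding of heat exchangers: a corrosion study. Corrosion of welded tubes."
)
print(result)
```

In this sketch the free descriptors are exactly the candidates a thesaurus maintainer might review for inclusion, which mirrors the role the abstract assigns to them. A real system would need morphological analysis rather than exact string matching, particularly for German compounds and inflected forms.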