Italian Arabic linguistic tools
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)
Abstract
This paper describes our participation in the research project 'Corpus bilingue Italiano - Arabo' (Bilingual Italian - Arabic corpus), funded under Law 488/92. The project aims to develop linguistic tools and resources for bilingual Italian/Arabic corpora, building on tools already developed by the Computational Linguistics Institute. As far as software tools are concerned, the project comprises four basic components: a) a morphological engine for the Arabic language; b) an alignment system for Italian and Arabic parallel texts; c) an automatic tagging system for Italian and Arabic texts; d) access tools, with their associated query systems, for the texts of the bilingual corpora at each text-processing step.