Title

An XML-Based Term Extraction Tool for Basque

Author(s)

I. Alegria (1), A. Gurrutxaga (2), P. Lizaso (2), X. Saralegi (2), S. Ugartetxea (2), R. Urizar (1)

(1) Ixa taldea - University of the Basque Country; (2) Elhuyar Fundazioa, Astesuain Poligonoa, 14 - 20170 Usurbil

Session

O42-TW, P24-T

Abstract

This project combines linguistic and statistical information to develop a term extraction tool for Basque. Because Basque is an agglutinative and highly inflected language, the treatment of morphosyntactic information is vital. In addition, due to the late unification process of the language, texts show greater term dispersion than those in a highly normalized language. The result is a semi-automatic, XML-based terminology extraction tool for use in technical and scientific information management.

Keyword(s)

terminology extraction, specialized corpus, XML

Language(s)

Basque

Full Paper

301.pdf