Summary of the paper

Title: Translation-oriented Word Sense Induction Based on Parallel Corpora
Authors: Marianna Apidianaki
Abstract: Word Sense Disambiguation (WSD) is an intermediate task that serves as a means to an end defined by the application in which it is to be used. However, different applications have varying disambiguation needs which should have an impact on the choice of the method and of the sense inventory used. The tendency towards application-oriented WSD becomes more and more evident, mostly because of the inadequacy of predefined sense inventories and the inefficacy of application-independent methods in accomplishing specific tasks. In this article, we present a data-driven method of sense induction, which combines contextual and translation information coming from a bilingual parallel training corpus. It consists of an unsupervised method that clusters semantically similar translation equivalents of source language (SL) polysemous words. The created clusters are projected on the SL words revealing their sense distinctions. Clustered equivalents describing a sense of a polysemous word can be considered as more or less commutable translations for an instance of the word carrying this sense. The resulting sense clusters can thus be used for WSD and sense annotation, as well as for lexical selection in translation applications.
Language: Multiple languages
Topics: Semantics, Machine Translation, Speech-to-Speech Translation, Word Sense Disambiguation
Full paper: Translation-oriented Word Sense Induction Based on Parallel Corpora
Slides: -
Bibtex:
@InProceedings{APIDIANAKI08.822,
  author = {Marianna Apidianaki},
  title = {Translation-oriented Word Sense Induction Based on Parallel Corpora},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
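
The abstract describes clustering semantically similar translation equivalents of a polysemous source word and reading the clusters as its senses. The sketch below illustrates that general idea only. It is not the paper's algorithm: the Jaccard overlap of source-side context words, the fixed threshold, and the single-link grouping are stand-in assumptions, and the function name `induce_senses` and its input format are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def induce_senses(instances, threshold=0.2):
    """Group the translations of one source word into sense clusters.

    instances: iterable of (translation, context_words) pairs, one per
    word-aligned occurrence of the source word (hypothetical format).
    threshold: minimum Jaccard overlap for two translations to be linked
    (an assumed value, not taken from the paper).
    """
    # Collect the source-side context words seen with each translation.
    contexts = defaultdict(set)
    for translation, context_words in instances:
        contexts[translation].update(context_words)

    # Union-find structure over the translations.
    parent = {t: t for t in contexts}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path compression
            t = parent[t]
        return t

    # Link every pair of translations whose contexts overlap enough.
    for a, b in combinations(sorted(contexts), 2):
        union = contexts[a] | contexts[b]
        jaccard = len(contexts[a] & contexts[b]) / len(union) if union else 0.0
        if jaccard >= threshold:
            parent[find(a)] = find(b)

    # Each connected group of translations is read as one induced sense.
    clusters = defaultdict(set)
    for t in contexts:
        clusters[find(t)].add(t)
    return sorted(sorted(c) for c in clusters.values())


# Invented toy data for English "bank" with French translations.
toy = [
    ("banque", {"money", "account", "loan"}),
    ("banque", {"account", "deposit"}),
    ("rive", {"river", "water", "boat"}),
    ("berge", {"river", "water", "grass"}),
]
senses = induce_senses(toy)
```

On this toy input, "rive" and "berge" share enough context words to fall into one cluster, while "banque" stays alone; the translations in a cluster would then serve as interchangeable candidates for instances carrying that sense.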
