Title

A Multilingual Database of Idioms

Author(s)

Aline Villavicencio (1), Timothy Baldwin (2), Benjamin Waldron (1)

(1) University of Cambridge Computer Laboratory (Villavicencio and Waldron); (2) CSLI, Stanford University (Baldwin)

Session

P10-W

Abstract

This paper presents a possible architecture for a multilingual database of idioms. We discuss the challenges that idioms pose for the creation of such a database and propose an encoding that maximises the amount of information that can be stored for different languages. Such a resource provides important information for linguistic, computational linguistic and psycholinguistic use, and allows different phenomena to be compared across languages. This comparison can provide the basis for a better understanding of regularities in idioms across languages.

Keyword(s)

Idiom, lexical database, multiword expression

Language(s)

English, Portuguese

Full Paper

760.pdf