Handling Idioms in Symbolic Multilingual Natural Language Generation

Proceedings of the 18th Workshop on Multiword Expressions @LREC2022

DOI:10.63317/3ru2nsy72n2u

Abstract

While idioms are usually very rigid in their expression, they sometimes allow a certain level of freedom in their usage, with modifiers or complements splitting them or being syntactically attached to internal nodes rather than to the root (e.g., “take something with a big grain of salt”). This means that they cannot always be handled as ready-made strings in rule-based natural language generation systems. Having access to the internal syntactic structure of an idiom allows for more subtle processing. We propose a way to enumerate all possible language-independent n-node trees and to map particular idioms of a language onto these generic syntactic patterns. Using this method, we integrate the idioms from the LN-fr into GenDR, a multilingual realizer. Our implementation covers nearly 98% of LN-fr’s idioms with high precision, and can easily be extended or ported to other languages.
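
The enumeration of generic n-node patterns can be illustrated with a minimal sketch. It assumes that a pattern is an unlabeled rooted tree in which the order of siblings is ignored; the abstract does not say whether the authors also distinguish dependency labels or word order, so this is only one plausible reading. The names `forests`, `rooted_trees` and `shape` are hypothetical and are not taken from GenDR, and the English idiom and its head indices are made up for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def forests(m):
    """All multisets of rooted trees with m nodes in total.

    A tree is encoded as the sorted tuple of its child subtrees, so a leaf
    is () and a forest is itself a sorted tuple of trees.
    """
    if m == 0:
        return frozenset({()})
    result = set()
    for k in range(1, m + 1):            # size of one tree in the forest
        for tree in rooted_trees(k):
            for rest in forests(m - k):  # remaining trees of the forest
                result.add(tuple(sorted(rest + (tree,))))
    return frozenset(result)


@lru_cache(maxsize=None)
def rooted_trees(n):
    """All unlabeled rooted trees with n nodes (assumed pattern inventory)."""
    if n < 1:
        return ()
    # A tree with n nodes is a root whose children form a forest of n - 1 nodes.
    return tuple(sorted(forests(n - 1)))


def shape(heads):
    """Map a dependency tree, given as head indices, to its generic pattern.

    heads[i] is the index of the governor of node i, or None for the root.
    """
    children = {i: [] for i in range(len(heads))}
    root = None
    for i, head in enumerate(heads):
        if head is None:
            root = i
        else:
            children[head].append(i)

    def build(i):
        return tuple(sorted(build(c) for c in children[i]))

    return build(root)


# Hypothetical analysis of "kick the bucket": kick -> bucket -> the.
idiom_heads = [None, 2, 0]
pattern = shape(idiom_heads)
print(pattern in rooted_trees(3))
```

Under these assumptions, an idiom is mapped to a pattern by discarding its words and keeping only the shape of its dependency tree, so every idiom with the same shape can share one generic rule. A realizer that also needs relation labels or linear order would have to enrich the encoding accordingly.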

Details

Paper ID
lrec2022-ws-mwe-17
Pages
pp. 118-126
BibKey
dube-lareau-2022-handling
Publisher
European Language Resources Association (ELRA)
Workshop
Proceedings of the 18th Workshop on Multiword Expressions @LREC2022
Date
20-25 June 2022

Authors

  • Michaelle Dubé

  • François Lareau