Back to Main Conference 2022
LREC 2022main

Inducing Discourse Marker Inventories from Lexical Knowledge Graphs

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4n7jwzymvd2i

Abstract

Discourse marker inventories are important tools for the development of both discourse parsers and corpora with discourse annotations. In this paper we explore the potential of massively multilingual lexical knowledge graphs to induce multilingual discourse marker lexicons using concept propagation methods as previously developed in the context of translation inference across dictionaries. Given one or multiple source languages with discourse marker inventories that discourse relations as senses of potential discourse markers, as well as a large number of bilingual dictionaries that link them – directly or indirectly – with the target language, we specifically study to what extent discourse marker induction can benefit from the integration of information from different sources, the impact of sense granularity and what limiting factors may need to be considered. Our study uses discourse marker inventories from nine European languages normalized against the discourse relation inventory of the Penn Discourse Treebank (PDTB), as well as three collections of machine-readable dictionaries with different characteristics, so that the interplay of a large number of factors can be studied.

Details

Paper ID
lrec2022-main-257
Pages
pp. 2401-2412
BibKey
chiarcos-2022-inducing
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
79-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20 June 2022 25 June 2022

Authors

  • CC

    Christian Chiarcos

Links