LREC-COLING 2024 workshop

Combining Grammatical and Relational Approaches. A Hybrid Method for the Identification of Candidate Collocations from Corpora

Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024

DOI:10.63317/4jweb7xgmtp5

Abstract

We present an evaluation of three methods for the automatic identification of candidate collocations in corpora, carried out as part of a research project on the development of a learner dictionary of Italian collocations. We compare the commonly used POS-based method and the syntactic dependency-based method with a hybrid method that integrates both approaches. We conduct a statistical analysis on a sample corpus of written and spoken texts of different registers. Results show that the hybrid method correctly detects more candidate collocations when evaluated against a human-annotated benchmark. The scores are particularly high for adjectival modifier relations. A hybrid approach to candidate collocation identification therefore seems to improve the quality of results.

Details

Paper ID
lrec2024-ws-mwe-18
Pages
pp. 138-146
BibKey
perri-etal-2024-combining
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024
Date
20 May 2024 – 25 May 2024

Authors

  • Damiano Perri
  • Irene Fioravanti
  • Osvaldo Gervasi
  • Stefania Spina
