LREC-COLING 2024 workshop

Cross-Linguistic Processing of Non-Compositional Expressions in Slavic Languages

Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024

DOI:10.63317/2gfnoco72cmz

Abstract

This study evaluates and predicts the intelligibility of non-compositional expressions in five closely related Slavic languages (Belarusian, Bulgarian, Czech, Polish, and Ukrainian) as perceived by native speakers of Russian. Our investigation employs a web-based experiment in which native Russian respondents take part in free-response and multiple-choice translation tasks. Building on previous studies of mutual intelligibility and non-compositionality, we propose two predictive factors for reading comprehension of unknown but closely related languages: 1) linguistic distances, which include orthographic and phonological distances; 2) surprisal scores obtained from monolingual Language Models (LMs). Our primary objective is to explore the relationship between these two factors and the intelligibility scores and response times from our web-based experiment. Our findings reveal that, while intelligibility scores from the experimental tasks correlate more strongly with phonological distances, LM surprisal scores appear to be better predictors of the time participants spend completing the translation tasks.
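
As a rough illustration of the two kinds of predictors named in the abstract, the sketch below computes a length-normalized Levenshtein distance between two word forms and a total surprisal from per-token probabilities. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the distances are normalized or which language models supply the probabilities, and the function names and the probability values in the usage lines are hypothetical.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Assumed normalization: divide by the length of the longer string."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def surprisal_bits(token_probs: list[float]) -> float:
    """Total surprisal in bits: sum of -log2 p over the tokens of an expression.

    The probabilities would come from a language model; here they are
    passed in directly so the sketch stays self-contained.
    """
    return sum(-math.log2(p) for p in token_probs)


# Orthographic distance between Russian "молоко" and Bulgarian "мляко" (both "milk").
print(normalized_distance("молоко", "мляко"))

# Hypothetical per-token probabilities for a two-token expression.
print(surprisal_bits([0.5, 0.25]))
```

A phonological distance could be computed with the same functions by passing phonetic transcriptions instead of orthographic forms.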

Details

Paper ID
lrec2024-ws-cogalex-10
Pages
pp. 86-97
BibKey
zaitova-etal-2024-cross
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024
Date
20-25 May 2024

Authors

  • Iuliia Zaitova

  • Irina Stenger

  • Muhammad Umer Butt

  • Tania Avgustinova

Links