LREC 2022 main

Modeling the Impact of Syntactic Distance and Surprisal on Cross-Slavic Text Comprehension

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4fm2v9ib5wbj

Abstract

We focus on syntactic variation and measure syntactic distances between nine Slavic languages (Belarusian, Bulgarian, Croatian, Czech, Polish, Slovak, Slovene, Russian, and Ukrainian) using symmetric measures of insertion, deletion, and movement of syntactic units in parallel sentences of the fable “The North Wind and the Sun”. Additionally, we investigate phonetic and orthographic asymmetries between selected languages by means of the information-theoretic notion of surprisal. Syntactic distance and surprisal are thus considered potential predictors of mutual intelligibility between related languages. In spoken and written cloze test experiments with Slavic native speakers, these predictors will be validated to determine whether variations in syntax lead to slower or impeded intercomprehension of Slavic texts.
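
As a rough illustration of the surprisal notion mentioned in the abstract (not the authors' implementation), the surprisal of a target-language unit given a source-language unit can be taken as -log2 P(target | source), estimated from counts of aligned units. In the sketch below the alignment pairs are invented toy data and the function name `conditional_surprisal` is hypothetical. It only shows why the measure can be asymmetric between the two reading directions.

```python
import math
from collections import Counter

def conditional_surprisal(pairs):
    """Map each aligned (src, tgt) pair to -log2 P(tgt | src), in bits."""
    pair_counts = Counter(pairs)
    src_counts = Counter(src for src, _ in pairs)
    return {
        (src, tgt): -math.log2(count / src_counts[src])
        for (src, tgt), count in pair_counts.items()
    }

# Toy alignments, invented for illustration: unit in language A -> unit in language B.
a_to_b = [("g", "h"), ("g", "h"), ("g", "g"), ("h", "h")]
# The same alignments read in the opposite direction (B -> A).
b_to_a = [(tgt, src) for src, tgt in a_to_b]

s_ab = conditional_surprisal(a_to_b)
s_ba = conditional_surprisal(b_to_a)

# The same correspondence "g" ~ "g" is costly in one direction and free in the other.
print(s_ab[("g", "g")], s_ba[("g", "g")])
```

With these toy counts, a reader of language A faces three possible continuations from "g" and so pays more bits for "g" than a reader of language B, for whom "g" always maps to "g". Real estimates would need aligned corpora and, most likely, smoothing for unseen pairs.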

Details

Paper ID
lrec2022-main-802
Pages
pp. 7368-7376
BibKey
stenger-etal-2022-modeling
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20 June 2022 to 25 June 2022

Authors

  • Irina Stenger

  • Philip Georgis

  • Tania Avgustinova

  • Bernd Möbius

  • Dietrich Klakow

Links