LREC 2026 Workshop

Automatic Detection of Direct and Self-Repetitions in Naturalistic Speech Recordings of French- and Dutch-Speaking Autistic Children

Proceedings of the Sixth Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments in cooperation with the MENTAL.ai consortium

DOI:10.63317/4eo4uey3z8kj

Abstract

This study investigates the use of cosine similarity measures across syntactic, lexical, and semantic vector representations to detect repetitions in the spontaneous speech of autistic children. It focuses on direct repetitions (i.e., immediate verbatim repetitions of linguistic output produced by another individual) and self-repetitions (i.e., within-speaker recurrence). The performance of similarity-based methods is then compared with state-of-the-art black-box classification models based on BERT, trained on the same data. Using spontaneous speech data from French- and Dutch-speaking autistic children, the results show that lexical and semantic similarity provide reliable cues for identifying self-repetitions, achieving high precision and recall, with F1-scores exceeding 83%, comparable to those obtained by BERT-based models. In contrast, direct repetitions are more difficult to detect using similarity-based approaches, with BERT models clearly outperforming them and reaching F1-scores above 73%. Across all conditions, syntactic similarity consistently underperforms relative to lexical and semantic measures. These findings highlight the strengths and limitations of similarity-based approaches and suggest directions for future research, particularly in improving the detection of direct repetitions and assessing the cross-linguistic generalizability of these methods.
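To make the lexical variant of the similarity-based approach concrete, the following is a minimal sketch of a cosine similarity check between two utterances represented as bag-of-words count vectors. It is an illustration only, not the authors' implementation: the tokenization, the function names (`bag_of_words`, `cosine_similarity`, `is_repetition`), and the 0.8 decision threshold are all assumptions introduced here, and the abstract does not state how the vectors or thresholds were actually built.

```python
import math
import re
from collections import Counter


def bag_of_words(utterance: str) -> Counter:
    """Lowercase the utterance and count its word tokens (assumed tokenization)."""
    return Counter(re.findall(r"\w+", utterance.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty utterance is treated as similar to nothing
    return dot / (norm_a * norm_b)


def is_repetition(previous: str, current: str, threshold: float = 0.8) -> bool:
    """Flag `current` as a candidate repetition of `previous`.

    The threshold is a hypothetical value chosen for illustration.
    """
    similarity = cosine_similarity(bag_of_words(previous), bag_of_words(current))
    return similarity >= threshold
```

Under these assumptions, a self-repetition candidate would be found by comparing a child's utterance with the same child's earlier utterances, and a direct repetition candidate by comparing it with the immediately preceding utterance of another speaker. A semantic variant would swap the count vectors for sentence embeddings while keeping the same cosine comparison.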

Details

Paper ID
lrec2026-ws-rapid6mentalai-12
Pages
pp. 146-156
BibKey
beccaria-etal-2026-automatic
Editors
Dimitrios Kokkinakis, Charalambos Themistocleous, Gaël Dias, Kathleen C. Fraser, Fredrik Öhman, Sebastião Pais
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Sixth Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments in cooperation with the MENTAL.ai consortium
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Federica Beccaria

  • Marie Kolenberg

  • Pierre Labendzki

  • Inge Zink

  • Mikhail Kissine
