Proceedings of the First Workshop on Dialects in NLP — A Resource Perspective

LREC 2026 Workshop

Palma, Mallorca, Spain | 11-16 May 2026 | 34 papers
01

A Bolu: A Structured Dataset for the Computational Analysis of Sardinian Improvisational Poetry

Silvio Calderaro, Johanna Monti

pp. 1-11 DOI: 10.63317/3d9ufzotw55s
02

Saar-Voice: A Multi-Speaker Saarbrücken Dialect Speech Corpus

Lena Sophie Oberkircher, Jesujoba Alabi, Dietrich Klakow, Jürgen Trouvain

pp. 12-23 DOI: 10.63317/2xvbjjktadtu
03

MD_NLP: Reconstructing an Australian English Heritage Dialect Corpus from the Mitchell-Delbridge Recordings through LLM-Assisted Speaker Attribution

Steven Coats

pp. 24-32 DOI: 10.63317/3iga3zzsh92p
04

Challenges in the Detection of Dialect for Historical Languages; the Case of Old Irish Text Resources

Adrian Doyle

pp. 33-47 DOI: 10.63317/2zf2k5jdm74v
05

Phonologically-aware Automatic Speech Recognition Evaluation of Low-Resource Languages: The Case of Basque Dialects

Christoforos Souganidis, Asier Herranz, Ibon Saratxaga, Eva Navas, Inma Hernaez

pp. 48-57 DOI: 10.63317/262fznwr54us
06

Systematic Normalization of Spoken Mixed-Language, Mixed-Dialect Data

Margaret Blevins

pp. 58-69 DOI: 10.63317/3bv9dmxr24p6
07

Handling Cross-Dialect Syntactic Variation: a Theory-Driven Web Resource

Emanuela Li Destri, Marco Longhin, Gaia Sorge, Sofia Ferroni, Giovanni Battista Matteazzi, Andrea Artioli, Lorenzo Carletti, Federico Motta, Giuseppe Longobardi, Cristina Guardiano

pp. 70-82 DOI: 10.63317/4x8k8si3eybn
08

Can LLM Agents Identify Spoken Dialects like a Linguist?

Tobias Bystrich, Lukas Hamm, Maria Hassan Akhter, Lea Fischbach, Lucie Flek, Akbar Karimi

pp. 83-92 DOI: 10.63317/27m2tbgjcat8
09

Beyond Accuracy: Analyzing Dialect Confusion in Automatic Speech-Based Dialect Classification

Lea Fischbach, Alfred Lameli, Lucie Flek

pp. 93-103 DOI: 10.63317/5f65nkr6qreo
10

FLEURS-Kobani: Extending FLEURS dataset for Northern Kurdish

Daban Q. Jaff, Mohammad Mohammadamini

pp. 104-109 DOI: 10.63317/267vdxnysvbe
11

Exploring the reusability of Northern Kurdish resources for Badini speech recognition

Mohammad Mohammadamini, Aveen Jalal Mohammed, Barzan Hussein Mohammed, Dezheen H. Abdulazeez, Imad Saeed Sadeeq, Dilgash Mohammed Salih, Amera Ismail Melhum, Abuobaida Abdullah Dheyab

pp. 110-115 DOI: 10.63317/2gzjngtqiqp7
12

Wancho Dialectometry: Community-created data and the Living Dictionaries project

Kellen Parker van Dam

pp. 116-123 DOI: 10.63317/4guwk869z8we
13

Dialectometry and Evaluation of the ePark Corpus for Low-Resource Formosan Language Dialects

Henry Gagnier

pp. 124-134 DOI: 10.63317/4scoopyavtvi
14

A Dialectal Corpus for Ukrainian: Collection, Classification, and Standardization

Yuliia Frund, Sina Ahmadi

pp. 135-143 DOI: 10.63317/4mkaru7y2op5
15

German Dialects Across Situations, Generations, and Regions: The REDE corpus as an Oral Resource for NLP

Hanna Fischer, Alfred Lameli

pp. 144-152 DOI: 10.63317/4fe4dkefqah9
16

A Catalog of Basque Dialectal Resources: Online Collections and Standard-to-Dialectal Adaptations

Jaione Bengoetxea, Itziar Gonzalez-Dios, Rodrigo Agerri

pp. 153-164 DOI: 10.63317/3zhrab5powcg
17

WoVis: Interactive Visualization of Word Embeddings for Semantic Change in Historical and Dialectal Language Resources

Filip Miletić, Maximilian Henkel, Rene Cutura, Sophie Sadler, Quynh Quang Ngo, Michael Sedlmair, Sabine Schulte im Walde

pp. 165-176 DOI: 10.63317/3jzkx999kfxq
18

Speaker Normalization via Voice Conversion Reveals a Human-Machine Dissociation in Dialect Classification

Caroline Kleen, Lea Fischbach, Akbar Karimi, Lucie Flek, Alfred Lameli

pp. 177-187 DOI: 10.63317/3sqk7nxsikhp
19

South Tyrolean Dialect-to-Standard Speech Translation: A Resource

Greta H. Franzini, Luca Ducceschi

pp. 188-194 DOI: 10.63317/3visgk9f8s7z
20

TransVar – the Corpus for Variation and Change Study of the Historical Transcarpathian lects

Ilia Afanasev

pp. 195-208 DOI: 10.63317/4wf34recmurr
