Proceedings of the First Workshop on Dialects in NLP — A Resource Perspective
LREC 2026 Workshop
A Bolu: A Structured Dataset for the Computational Analysis of Sardinian Improvisational Poetry
Silvio Calderaro, Johanna Monti
Saar-Voice: A Multi-Speaker Saarbrücken Dialect Speech Corpus
Lena Sophie Oberkircher, Jesujoba Alabi, Dietrich Klakow, Jürgen Trouvain
MD_NLP: Reconstructing an Australian English Heritage Dialect Corpus from the Mitchell-Delbridge Recordings through LLM-Assisted Speaker Attribution
Steven Coats
Challenges in the Detection of Dialect for Historical Languages; the Case of Old Irish Text Resources
Adrian Doyle
Phonologically-aware Automatic Speech Recognition Evaluation of Low-Resource Languages: The Case of Basque Dialects
Christoforos Souganidis, Asier Herranz, Ibon Saratxaga, Eva Navas, Inma Hernaez
Systematic Normalization of Spoken Mixed-Language, Mixed-Dialect Data
Margaret Blevins
Handling Cross-Dialect Syntactic Variation: a Theory-Driven Web Resource
Emanuela Li Destri, Marco Longhin, Gaia Sorge, Sofia Ferroni, Giovanni Battista Matteazzi, Andrea Artioli, Lorenzo Carletti, Federico Motta, Giuseppe Longobardi, Cristina Guardiano
Can LLM Agents Identify Spoken Dialects like a Linguist?
Tobias Bystrich, Lukas Hamm, Maria Hassan Akhter, Lea Fischbach, Lucie Flek, Akbar Karimi
Beyond Accuracy: Analyzing Dialect Confusion in Automatic Speech-Based Dialect Classification
Lea Fischbach, Alfred Lameli, Lucie Flek
FLEURS-Kobani: Extending FLEURS dataset for Northern Kurdish
Daban Q. Jaff, Mohammad Mohammadamini
Exploring the reusability of Northern Kurdish resources for Badini speech recognition
Mohammad Mohammadamini, Aveen Jalal Mohammed, Barzan Hussein Mohammed, Dezheen H. Abdulazeez, Imad Saeed Sadeeq, Dilgash Mohammed Salih, Amera Ismail Melhum, Abuobaida Abdullah Dheyab
Wancho Dialectometry: Community-created data and the Living Dictionaries project
Kellen Parker van Dam
Dialectometry and Evaluation of the ePark Corpus for Low-Resource Formosan Language Dialects
Henry Gagnier
A Dialectal Corpus for Ukrainian: Collection, Classification, and Standardization
Yuliia Frund, Sina Ahmadi
German Dialects Across Situations, Generations, and Regions: The REDE corpus as an Oral Resource for NLP
Hanna Fischer, Alfred Lameli
A Catalog of Basque Dialectal Resources: Online Collections and Standard-to-Dialectal Adaptations
Jaione Bengoetxea, Itziar Gonzalez-Dios, Rodrigo Agerri
WoVis: Interactive Visualization of Word Embeddings for Semantic Change in Historical and Dialectal Language Resources
Filip Miletić, Maximilian Henkel, Rene Cutura, Sophie Sadler, Quynh Quang Ngo, Michael Sedlmair, Sabine Schulte im Walde
Speaker Normalization via Voice Conversion Reveals a Human-Machine Dissociation in Dialect Classification
Caroline Kleen, Lea Fischbach, Akbar Karimi, Lucie Flek, Alfred Lameli
South Tyrolean Dialect-to-Standard Speech Translation: A Resource
Greta H. Franzini, Luca Ducceschi
TransVar – the Corpus for Variation and Change Study of the Historical Transcarpathian lects
Ilia Afanasev