Proceedings of the 19th Workshop on Building and Using Comparable Corpora (BUCC)

LREC 2026 Workshop

Palma, Mallorca, Spain, 11–16 May 2026. 12 papers.
01

Keynote: The Cross-Lingual Transfer Myth: Why Modern LLMs Still Fail Without Comparable Corpora and Representations

Els Lefever

p. 1 DOI: 10.63317/49d3g93xayzx
02

A Comparative Study of Parkinsonian Speech Corpora for Deep Learning-Based Detection of Dysarthria

Clara Ponchard, Pierre Serrano

pp. 2-8 DOI: 10.63317/27zb48j5vv5f
03

Computing Semantic Similarity for Aligning Bilingual Semi-parallel Texts: A Case Study

Steffen Frenzel, Maximilian Krupop, Manfred Stede

pp. 9-19 DOI: 10.63317/37kekuueqcz6
04

A Comparative Study in Corpus Linguistics Applied to Automatic Terminology Extraction

Mercè Vàzquez, Sergi Alvarez-Vidal, Antoni Oliver

pp. 20-29 DOI: 10.63317/2u8zp4nujj25
05

Comparable Corpora in Cross-linguistic Research: Nominal Number in English, Czech, and Greek

Konstantinos Diamantopoulos, Magda Ševčíková

pp. 30-40 DOI: 10.63317/4nz4cptoiwv3
06

Liebe Kolleg:innen, Querid@s Compañer@s: Presenting the GILDEES Corpus

Marie-Pauline Krielke

pp. 41-52 DOI: 10.63317/4tbr93ap6scq
07

A Diachronic Comparable Corpus of Spanish Digital News (2017–2026) for the Study of Stylistic Convergence in the GenAI Era

Hugo Sanjurjo-González

pp. 53-61 DOI: 10.63317/5n7czevx4vhk
08

Align and Shine: Building High-quality Sentence-aligned Corpora for Multilingual Text Simplification

Luis Kenji Hilasaca Sanchez, Nouran Khallaf, Serge Sharoff

pp. 62-71 DOI: 10.63317/55pt8xqgkge6
09

Bi-Text Mining across German Dialects: On the Role of Synthetic Training Data for Dialect Adaptation

Jing Wang, Barbara Plank, Robert Litschko

pp. 72-83 DOI: 10.63317/3gmqhegz45cn
10

Parallel Corpora of Scholarly Documents for English-French Machine Translation

Ziqian Peng, Lichao Zhu, Rachel Bawden, Maud Bénard, Éric de la Clergerie, Mathilde Huguin, Natalie Kübler, Paul Lerner, Alexandra Mestivier, François Yvon

pp. 84-95 DOI: 10.63317/2jm9pkbjkg95
11

Validating a Pipeline to Create a Comparable Corpus of Government-Issued Travel Advisories from the Internet Archives

Laura Braun, Christian Oswald

pp. 96-107 DOI: 10.63317/5c34j7cbnshd
12

Leveraging Comparable Toxicity Lexicons in Prompt Instructions for Multilingual Text Detoxification

Yassir El Attar, Esra Dönmez, Nina K. Ohlendorf, Agnieszka Falenska

pp. 108-118 DOI: 10.63317/2f5i2922qqe2
