Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024
LREC-COLING 2024 Workshop
A Bit of a Problem: Measurement Disparities in Dataset Sizes across Languages
Catherine Arnett, Tyler A. Chang, Benjamin Bergen
A Novel Corpus for Automated Sexism Identification on Social Media
Lutfiye Seda Mut Altin, Horacio Saggion
Advancing Generative AI for Portuguese with Open Decoder Gervásio PT*
Rodrigo Santos, João Ricardo Silva, Luís Gomes, João Rodrigues, António Branco
Assessing Pre-Built Speaker Recognition Models for Endangered Language Data
Gina-Anne Levow
BERTbek: A Pretrained Language Model for Uzbek
Elmurod Kuriyozov, David Vilares, Carlos Gómez-Rodríguez
Beyond Error Categories: A Contextual Approach of Evaluating Emerging Spell and Grammar Checkers
Þórunn Arnardóttir, Svanhvít Lilja Ingólfsdóttir, Haukur Barri Símonarson, Hafsteinn Einarsson, Anton Karl Ingason, Vilhjálmur Þorsteinsson
Bidirectional English-Nepali Machine Translation (MT) System for Legal Domain
Shabdapurush Poudel, Bal Krishna Bal, Praveen Acharya
BK3AT: Bangsamoro K-3 Children’s Speech Corpus for Developing Assessment Tools in the Bangsamoro Languages
Kiel D. Gonzales, Jazzmin R. Maranan, Francis Paolo D. Santelices, Edsel Jedd M. Renovalles, Nissan D. Macale, Nicole Anne A. Palafox, Jose Marie A. Mendoza
CorpusArièja: Building an Annotated Corpus with Variation in Occitan
Clamenca Poujade, Myriam Bras, Assaf Urieli
Developing Infrastructure for Low-Resource Language Corpus Building
Hedwig G. Sekeres, Wilbert Heeringa, Wietse de Vries, Oscar Yde Zwagers, Martijn Wieling, Goffe Th. Jensma
Evaluating Icelandic Sentiment Analysis Models Trained on Translated Data
Ólafur A. Jóhannsson, Birkir H. Arndal, Eysteinn Ö. Jónsson, Stefan Olafsson, Hrafn Loftsson
Exploring Text Classification for Enhancing Digital Game-Based Language Learning for Irish
Leona Mc Cahill, Thomas Baltazar, Sally Bruen, Liang Xu, Monica Ward, Elaine Uí Dhonnchadha, Jennifer Foster
Forget NLI, Use a Dictionary: Zero-Shot Topic Classification for Low-Resource Languages with Application to Luxembourgish
Fred Philippy, Shohreh Haddadan, Siwen Guo
Fostering the Ecosystem of Open Neural Encoders for Portuguese with Albertina PT* Family
Rodrigo Santos, João Rodrigues, Luís Gomes, João Ricardo Silva, António Branco, Henrique Lopes Cardoso, Tomás Freitas Osório, Bernardo Leite
Improving Language Coverage on HeLI-OTS
Tommi Jauhiainen, Krister Lindén
Improving Legal Judgement Prediction in Romanian with Long Text Encoders
Mihai Masala, Traian Rebedea, Horia Velicu
Improving Noisy Student Training for Low-resource Languages in End-to-End ASR Using CycleGAN and Inter-domain Losses
Chia-Yu Li, Ngoc Thang Vu
Indonesian-English Code-Switching Speech Recognition Using the Machine Speech Chain Based Semi-Supervised Learning
Rais Vaza Man Tazakka, Dessi Lestari, Ayu Purwarianti, Dipta Tanaya, Kurniawati Azizah, Sakriani Sakti
Inter-language Transfer Learning for Visual Speech Recognition toward Under-resourced Environments
Fumiya Kondo, Satoshi Tamura
Investigating Neural Machine Translation for Low-Resource Languages: Using Bavarian as a Case Study
Wan-hua Her, Udo Kruschwitz