Proceedings of the Second Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2026)
LREC 2026 Workshop
Findings of the Second Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2026)
Kengatharaiyer Sarveswaran, Surendrabikram Thapa, Ashwini Vaidya, Tafseer Ahmed, Bal Krishna Bal
Development of Burushaski Speech - English Text Translation Dataset
Tauqeer Saleem, Abdul Samad, Azkaa Nasir, Adina Adnan Mansoor, Fatima Faisal, Mahrukh Yousuf
Lost the Negation or Lost in Negation
Vennela Bairi, Parameswari Krishnamurthy
A Morphological Transducer for the Limbu Language
Avyaya Singh, Jonathan N. Washington
Evaluating Large Language Models for Medical Named Entity Recognition in Urdu: A Benchmark Study
Bushra Nasim, Kinza Latif, Muhammad Zohair, Muhammad Hassan Asif, Zarmeen Nasim
DR-RAG: Addressing Retrieval Misalignment in Low-Resource Urdu Question Answering
Saad Ahmad, Muhammad Hammad, Muhammad Zeeshan, Faizad Ullah, Asim Karim
Cross-Domain Evaluation of Transformer-Based Models for Punjabi Speech Emotion Recognition
Fatima Tu Zahra, Kulsoom Asim, Sandesh Kumar, Abdul Samad
SiPaKosa: A Comprehensive Corpus of Canonical and Classical Buddhist Texts in Sinhala and Pali
Ranidu Hansaka Gurusinghe, Nevidu Jayatilleke
BNLI: A Linguistically-Refined Bengali Dataset for Natural Language Inference
Farah Binta Haque, Md Yasin, Shishir Saha, Md Shoaib Akhter Rafi, Farig Sadeque
Exploring Large Language Models for Multitask Learning in Bengali Text Classification
Md. Sajjad Hossain, Kawsar Ahmed, Suny Md Ashraf Khan, Mohammed Moshiul Hoque
Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR
Rishikesh Kumar Sharma, Safal Narshing Shrestha, Jenny Poudel, Rupak Tiwari, Arju Shrestha, Rupak Raj Ghimire, Bal Krishna Bal
From Romanized to Devanagari: Enhancing Nepali Sentiment Analysis with NepaliXlit
Suraj Patel, Kashish Kumari Dhami, Norden Sherpa, Supriya Khadka
Why Does Low-Rank Adaptation Work for Hindi-English Code-Mixing? A Geometric Analysis
Shashank Vishwakarma, Rakesh Kumar
Hi-SEMFLOW: Lie Algebra–Based Semantic Flow for Span-Level Informal Language Identification in Hindi
Manikandan Ravikiran, Tanmay Tiwari, Vibhu Gupta, Rohit Saluja
NeCCo: Nepali Cultural Commonsense Benchmark for Large Language Model Evaluation
Sanket Shrestha, Raunak Regmi, Sadikshya Ghimire, Satyam Rana, Supriya Khadka
Reward-Guided Fine-Tuning of Whisper for Low-Resource Nepali Speech Recognition
Aadarsh Pandit, Yudhin Khanal, Ishan Pandey, Kushal Kunwar, Sunil Regmi
Evaluating Linguistic Knowledge of LLMs in Tamil: The ILAKKANAM Benchmark
Jeyarajalingam Varsha, Menan Velayuthan, Sumirtha Karunakaran, Rasan Nivethiga, Kengatharaiyer Sarveswaran
A Feature-Fusion Ensemble Approach for Tamil Hate Speech Detection
Sathasivam Nerujan, Kengatharaiyer Sarveswaran
Comparative Analysis of Tokenizers in Tamil Text Classification in Low Resource Settings
Gokulan Sivakumaran, Randil Pushpananda, Bandara
Improving Public Health Safety in Low-Resource Languages Using a Human-Verified Health Misinformation Corpus and Large Language Models
Sujal Maharjan, Astha Shrestha, Laxmi Thapa, Sweta Poudel, Shuvam Shiwakoti, Rabin Thapa, Kritesh Rauniyar, Surendrabikram Thapa