Request Correction
Use this form to request corrections to the paper metadata. Select the fields that need correction and provide the correct information.
Correction Guidelines
- Click the edit button next to a field to report a correction.
- Fill in the suggested correction value for each field you want to correct.
- Provide your name and email so we can contact you if needed.
Paper Information
Sentiment Analysis and Language Models for Kwanyama
Paper Fields
Kwanyama is related to Swahili, Zulu, and the more than 300 other languages in the Bantu family. Yet, unlike its better-known relatives, it remains almost entirely absent from modern Natural Language Processing (NLP). We bring Kwanyama into the LLM era of NLP through two key contributions. First, we introduce OkaSentiment, the first sentiment-labeled dataset for Kwanyama. Unlike prior African sentiment corpora that rely primarily on social media, OkaSentiment is grounded in an offline, culturally relevant domain: reviews of domestic labor relationships. The dataset is annotated by over 40 native speakers under expert supervision, with careful quality control. Second, we present OkaLM, the first language models for Kwanyama (1B, 3B, and 8B parameters), obtained by continued pretraining of LLaMA-3 checkpoints on a curated Kwanyama corpus. Together, OkaSentiment and OkaLM bring a left-behind language into the landscape of modern NLP, providing its first benchmark and language models.
Authors
Expand an author to correct their information. Use the remove button to request author removal, or add a new author.
PDF Attachment
You may attach a PDF as a corrected version of the paper. Max file size: 10MB. Only PDF files are accepted.