LREC 2018 Main Conference
Towards Language Technology for Mi’kmaq
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Abstract
Mi'kmaq is a polysynthetic Indigenous language spoken primarily in Eastern Canada, on which no prior computational work has focused. In this paper we first construct and analyze a web corpus of Mi'kmaq. We then evaluate several approaches to language modelling for Mi'kmaq, including character-level models that are particularly well suited to morphologically rich languages. Preservation of Indigenous languages is particularly important in the current Canadian context; we argue that natural language processing could aid such efforts.
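
To make the idea of a character-level language model concrete, the following is a minimal sketch of a character trigram model with add-one smoothing, scored by perplexity. It is an illustration only: the abstract does not specify the models used in the paper, so the smoothing scheme, the padding symbols `^` and `$`, and the function names `train_char_ngram` and `perplexity` are all assumptions, and the training string is a toy placeholder rather than Mi'kmaq data.

```python
import math
from collections import Counter, defaultdict


def train_char_ngram(text, n=3):
    """Count character n-grams.

    Returns a mapping from each (n-1)-character context to a Counter of
    the characters that follow it, plus the set of characters seen.
    '^' and '$' are assumed padding symbols, not part of the source.
    """
    padded = "^" * (n - 1) + text + "$"
    counts = defaultdict(Counter)
    for i in range(n - 1, len(padded)):
        context = padded[i - n + 1:i]
        counts[context][padded[i]] += 1
    return counts, set(padded)


def perplexity(text, counts, vocab, n=3):
    """Per-character perplexity of `text` under add-one smoothing."""
    padded = "^" * (n - 1) + text + "$"
    vocab_size = len(vocab) + 1  # one extra slot reserved for unseen characters
    log_prob = 0.0
    steps = 0
    for i in range(n - 1, len(padded)):
        context = padded[i - n + 1:i]
        # .get avoids inserting unseen contexts into the defaultdict.
        ctx_counts = counts.get(context, Counter())
        p = (ctx_counts[padded[i]] + 1) / (sum(ctx_counts.values()) + vocab_size)
        log_prob += math.log(p)
        steps += 1
    return math.exp(-log_prob / steps)


if __name__ == "__main__":
    # Toy placeholder text; a real experiment would use held-out corpus data.
    counts, vocab = train_char_ngram("abababab", n=3)
    print(perplexity("abababab", counts, vocab, n=3))
    print(perplexity("zzzz", counts, vocab, n=3))
```

Because the model predicts one character at a time, it never encounters an out-of-vocabulary word, which is one reason such models are attractive for polysynthetic languages where long, rarely repeated word forms are common.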