The Community and Ethics Shaping the Norwegian Sign Language Corpus
Proceedings of the LREC 2026 12th Workshop on the Representation and Processing of Sign Languages: Language in Motion
Abstract
The Norwegian Sign Language Corpus was recently published and includes language data from over 100 signers from around Norway. Collecting and building such multimodal signed language corpora has important implications for both research and deaf communities. However, care is needed to protect the personal nature of signed language data while also creating a long-term resource that is as accessible as possible to community, research, and professional stakeholders. In addition, the potential exploitation of corpus resources by commercial and other interests, which are not necessarily aligned with those of the deaf community itself, must also be considered. Here, these seemingly opposing issues and the ethics that surround them are discussed. Current best practices in Open Science (including the FAIR and CARE data principles), along with ethical discussions raised by scholars working in fields such as Deaf Studies, are shown to be important in navigating this complex research data landscape.