
A Morpho-Syntactically Annotated Corpus of Ògè Folk Narratives with a Focus on Nominal Structure

Proceedings of Resources for African Indigenous Languages (RAIL) 2026 @ LREC 2026

DOI:10.63317/22aco7jqqkan

Abstract

This paper presents a manually annotated morpho-syntactic corpus of Ògè, an under-resourced indigenous language spoken in Nigeria. The corpus consists of ten folk narratives (approximately 4,667 tokens) collected for the investigation of nominal structure. Annotation is expert-driven and includes token-level part-of-speech tagging together with a structured Determiner Phrase (DP) classification framework designed to capture language-specific nominal configurations. The scheme distinguishes between bare nouns and modified noun phrases, reflecting a central structural property of Ògè: noun forms remain morphologically stable across contexts, while modifiers exhibit formal and positional variation contributing to reference, specificity, and discourse prominence. The DP classification layer encodes both simple and complex nominal constructions, enabling systematic analysis of internal phrase structure. Designed as a reusable digital resource, the corpus supports morphosyntactic tagging, noun phrase boundary detection, and modeling of nominal structure in low-resource NLP settings. The annotated dataset will be made publicly available through the SADiLaR repository. This work demonstrates how descriptive linguistic analysis can inform annotation design and provides a replicable framework for developing structured resources for under-resourced African languages.

Keywords: Ògè, low-resource NLP, annotated corpus, nominal structure, African languages

Details

Paper ID
lrec2026-ws-rail-01
Pages
pp. 1-6
BibKey
adenuga-2026-morpho
Editors
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of Resources for African Indigenous Languages (RAIL) 2026 @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Priscilla Adenuga
