Paper Information
FiNERVINER: Fine-grained Named Entity Recognition for Vulnerable Languages of India's North Eastern Region
Named entity recognition (NER), particularly fine-grained NER (FgNER), extracts domain-specific entity information for Natural Language Processing (NLP) applications such as knowledge base construction and relation extraction. Manual annotation for creating relevant data is expensive, while distant supervision often produces noisy data. Moreover, resources for coarse-grained and fine-grained NER in Indian languages, particularly in the vulnerable languages of India’s North Eastern Region, remain scarce. This work aims at creating such a resource for three vulnerable languages: Bodo/Boro (brx), Manipuri/Meitei (mni), and Mizo/Lushai (lus), which are regarded as official languages in three Indian states and spoken by more than six million people across five countries in South and Southeast Asia. We project annotations from high-resource FgNER datasets onto the target languages using source-to-target parallel corpora and a projection tool built on a multilingual encoder. The dataset comprises over 198k sentences, 282k entities, and 2.8M tokens in each low-resource language. Our thorough analyses validate the dataset’s high quality. We further explore zero-shot and cross-lingual settings, examining the impact of script similarity and multilingualism on cross-lingual FgNER performance. The dataset, expert detector models, the agentic tool, and the interactive web application are available as open-source resources at: https://hf.co/collections/prachuryyaIITG/finerviner.