BRAGD: Constrained Multi-Label POS Tagging for Faroese
We present the first multi-label part-of-speech (POS) tagger for Faroese using linguistically informed constraints, addressing the data sparsity problem inherent in compound tag approaches. We propose the BRAGD tagset, which decomposes compound morphological tags into independent features (word class, gender, number, case, etc.). The BRAGD tagset is the third iteration of a tagset previously released for Faroese, with substantial modifications that align it more closely with Faroese grammar. We annotate the previously released Sosialurin corpus with the tagset, along with a new out-of-domain test corpus of 500 sentences from more varied and contemporary texts. To train the tagger, we use a constrained loss function that dynamically masks morphologically invalid features based on the word class (noun, verb, adjective, etc.). We fine-tune a Scandinavian transformer language model using the constrained multi-label loss, achieving an overall accuracy of 97.5%. We find that models trained with multi-label loss perform better, converge faster, and show significantly lower error rates on out-of-domain data than single-label approaches or previously reported methods for Faroese POS tagging. This confirms that the multi-label approach learns robust morphological patterns rather than memorizing domain-specific tag distributions. We release the models, the code, the systematically revised Sosialurin-BRAGD corpus annotated with the new BRAGD tagset, and the new out-of-domain evaluation corpus drawn from diverse and contemporary text types.
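The constrained loss described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the label names in `LABELS`, the validity table `VALID`, the use of per-label sigmoid cross-entropy, and the choice to mask by the gold word class at training time. The only idea it tries to show is that features which are morphologically invalid for a token's word class contribute nothing to the loss.

```python
import math

# Hypothetical label inventory, NOT the actual BRAGD feature set.
LABELS = ["class=noun", "class=verb", "gender=fem", "case=dat", "tense=past"]

# Hypothetical validity table: labels that are scored for each word class.
# Word-class labels themselves are always scored.
VALID = {
    "noun": {"class=noun", "class=verb", "gender=fem", "case=dat"},
    "verb": {"class=noun", "class=verb", "tense=past"},
}


def bce_with_logit(z, y):
    """Numerically stable binary cross-entropy for one logit z and target y."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


def masked_bce(logits, targets, word_class):
    """Mean binary cross-entropy over the labels valid for word_class.

    Labels outside VALID[word_class] are masked out, so their logits
    have no effect on the loss (assumption: the mask uses the gold class).
    """
    valid = VALID[word_class]
    losses = [
        bce_with_logit(z, y)
        for label, z, y in zip(LABELS, logits, targets)
        if label in valid
    ]
    return sum(losses) / len(losses)


# A noun token: the tense logit is masked, so changing it leaves the loss unchanged.
targets = [1, 0, 1, 1, 0]
loss_a = masked_bce([2.0, -2.0, 1.0, 1.0, 5.0], targets, "noun")
loss_b = masked_bce([2.0, -2.0, 1.0, 1.0, -5.0], targets, "noun")
```

In this sketch `loss_a` and `loss_b` are identical because `tense=past` is not a valid feature for nouns in the hypothetical table, which is the behavior the masking is meant to produce.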