Named Entity Recognition for Persian Literary Text: A Case Study on The Little Prince
Proceedings of The Seventh International Workshop on Designing Meaning Representations (DMR 2026) @ LREC 2026
Abstract
Existing Persian named entity recognition (NER) research has focused predominantly on the news and social media domains, leaving literary texts, with their distinct linguistic characteristics, virtually unexplored. This paper addresses that gap by developing a new literary NER corpus from the Persian translation of The Little Prince and using it to evaluate existing state-of-the-art Persian NER tools, which were trained exclusively on news and social media corpora. Our analysis reveals significant performance degradation on literary text and identifies systematic errors related to narrative-specific entities, metaphorical language, and discourse structures that challenge conventional NER approaches.