Interlinear Glosses as a Multilingual Pivot for Machine Translation: An Updated Study on Turkish with Restricted Resources

Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI, and DCLRL "Towards Inclusivity and Equality: Language Resources and Technologies for Under-Resourced and Endangered Languages"

DOI: 10.63317/4km7f5p8epxt

Abstract

Translating very low-resource languages is a challenge that has been approached by exploiting available linguistic cues. Among these, interlinear glosses are linguistic annotations that can bridge the gap between two languages because they carry both grammatical and lexical information. We perform a case study on a simulated low-resource condition for Turkish, a morphologically rich language, using a pipeline approach that follows Zhou et al. (2020). A source sentence is passed through a morphological analyzer and a bilingual dictionary to obtain a gloss-like representation. We then evaluate the current capacity of Neural Machine Translation systems and Large Language Models to translate interlinear glosses into fluent English. In particular, we evaluate how performance scales with multilingual glossed data and how translation is affected by pseudo-glosses. Pivoting through glosses remains a better approach than direct translation for languages with limited parallel training data. Although glosses remain helpful resources, translations are sensitive to their quality, especially that of the lexical information.
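
The gloss-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup tables `TOY_ANALYSES` and `TOY_DICTIONARY`, the function `pseudo_gloss`, the tag labels, and the example sentence are all hypothetical stand-ins for a real morphological analyzer and bilingual dictionary, and the fallback for unknown words is an assumption.

```python
# Hypothetical stand-in for a morphological analyzer:
# surface form -> (lemma, list of grammatical tags). Hand-written toy entries.
TOY_ANALYSES = {
    "kedi": ("kedi", []),
    "evlerimde": ("ev", ["PL", "POSS.1SG", "LOC"]),
    "uyuyor": ("uyu", ["PROG"]),
}

# Hypothetical stand-in for a bilingual dictionary: Turkish lemma -> English word.
TOY_DICTIONARY = {"kedi": "cat", "ev": "house", "uyu": "sleep"}


def pseudo_gloss(sentence):
    """Build a gloss-like line: translated lemma followed by its grammatical tags."""
    glossed = []
    for token in sentence.lower().split():
        # Unknown tokens are kept as-is with no tags (an assumption of this sketch).
        lemma, tags = TOY_ANALYSES.get(token, (token, []))
        # Lemmas missing from the dictionary are left untranslated.
        lexical = TOY_DICTIONARY.get(lemma, lemma)
        glossed.append("-".join([lexical, *tags]))
    return " ".join(glossed)


print(pseudo_gloss("Kedi evlerimde uyuyor"))
# prints: cat house-PL-POSS.1SG-LOC sleep-PROG
```

In such a pipeline, the resulting gloss line would then be handed to a translation model to produce fluent English. A missing or wrong dictionary entry changes the lexical part of the gloss directly, which is one way to picture the sensitivity to lexical quality mentioned above.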

Details

Paper ID

lrec2026-ws-sigul-19

Pages

pp. 183-197

DOI

10.63317/4km7f5p8epxt

BibKey

ozer-etal-2026-interlinear

Editors

Atul Kr. Ojha, Sakriani Sakti, Claudia Soria, Maite Melero, John P. McCrae, Constantine Lignos, Chao-Hong Liu, German Rigau Claramunt, Georg Rehm

Publisher

European Language Resources Association (ELRA)

ISSN

N/A

ISBN

N/A

Workshop

Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI, and DCLRL "Towards Inclusivity and Equality: Language Resources and Technologies for Under-Resourced and Endangered Languages"

Location

Palma, Mallorca, Spain

Date

11-16 May 2026

Authors

Volkan Ozer
Shu Okabe
Alexander Fraser
