LREC 2014 (main conference)

Enriching ODIN

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/57ysgpaoyaj9

Abstract

In this paper, we describe the expansion of the ODIN resource, a database containing many thousands of instances of Interlinear Glossed Text (IGT) for over a thousand languages harvested from scholarly linguistic papers posted to the Web. A database containing a large number of instances of IGT, which are effectively richly annotated and heuristically aligned bitexts, provides a unique resource for bootstrapping NLP tools for resource-poor languages. To make the data in ODIN more readily consumable by tool developers and NLP researchers, we propose a new XML format for IGT, called Xigt. We call the updated release ODIN-II.

Details

Paper ID
lrec2014-main-055
Pages
pp. 3151-3157
BibKey
xia-etal-2014-enriching
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26 May 2014 to 31 May 2014

Authors

  • Fei Xia
  • William Lewis
  • Michael Wayne Goodman
  • Joshua Crowgey
  • Emily M. Bender

Links