Infrastructure for Collaborative Annotation of Speech
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)
Abstract
Vast amounts of digital language data (primary data) and increasingly complex linguistic annotations (secondary data) are being created around the world at an accelerating pace. There is a real risk of losing much of this data unless the compilers of language resources (primary and secondary data) and the creators of tools pay more attention to the reusability of the resources and the interoperability of the tools. In this poster we report on our effort to establish best practices for the creation and dissemination of reusable speech resources in Finland. Our suggested solution allows collaborative annotation: researchers at different sites can work on the same speech data, add different kinds of linguistic annotation, and share their work with other researchers.