
On the annotation of TMX translation memories for advanced leveraging in computer-aided translation

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/4o7awuho76bb

Abstract

The term advanced leveraging refers to extensions beyond the current usage of translation memory (TM) in computer-aided translation (CAT). One of these extensions is the ability to identify and use matches on the sub-segment level (for instance, using sub-sentential elements when segments are sentences) to help the translator when a reasonable fuzzy-matched proposal is not available; some such functionalities have started to become available in commercial CAT tools. Resources such as statistical word aligners, external machine translation systems, glossaries and term bases could be used to identify and annotate segment-level translation units at the sub-segment level, but there is currently no single, agreed standard supporting the interchange of sub-segmental annotation of translation memories to create a richer translation resource. This paper discusses the capabilities and limitations of some current standards, envisages possible alternatives, and ends with a tentative proposal which slightly abuses (repurposes) the usage of existing elements in the TMX standard.

Details

Paper ID
lrec2014-main-321
Pages
pp. 4374-4378
BibKey
forcada-2014-annotation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26 May 2014 to 31 May 2014

Authors

  • Mikel Forcada
