Back to Main Conference 2016
LREC 2016main

Automatic Recognition of Linguistic Replacements in Text Series Generated from Keystroke Logs

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/4iptzmqyi7nm

Abstract

This paper introduces a toolkit used for the purpose of detecting replacements of different grammatical and semantic structures in ongoing text production logged as a chronological series of computer interaction events (so-called keystroke logs). The specific case we use involves human translations where replacements can be indicative of translator behaviour that leads to specific features of translations that distinguish them from non-translated texts. The toolkit uses a novel CCG chart parser customised so as to recognise grammatical words independently of space and punctuation boundaries. On the basis of the linguistic analysis, structures in different versions of the target text are compared and classified as potential equivalents of the same source text segment by `equivalence judges'. In that way, replacements of grammatical and semantic structures can be detected. Beyond the specific task at hand the approach will also be useful for the analysis of other types of spaceless text such as Twitter hashtags and texts in agglutinative or spaceless languages like Finnish or Chinese.

Details

Paper ID
lrec2016-main-574
Pages
pp. 3617-3623
BibKey
couto-vale-etal-2016-automatic
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 28 May 2016

Authors

  • DC

    Daniel Couto-Vale

  • SN

    Stella Neumann

  • PN

    Paula Niemietz

Links