
Identifying Copied Fragments in a 18th Century Dutch Chronicle

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4zti7y2r36wo

Abstract

We apply computational stylometric techniques to an 18th-century Dutch chronicle to determine which fragments of the manuscript represent the author’s own original work and which show signs of external source use, through either direct copying or paraphrasing. With stylometric methods, the majority of text fragments in the chronicle can be correctly labelled as the author’s own words, direct copies from sources, or paraphrases. Our results show that clustering text fragments based on stylometric measures is an effective methodology for authorship verification of this document; however, the approach is less effective when personal writing style is masked by author-independent styles or when it is applied to paraphrased text.

Details

Paper ID
lrec2022-main-631
Pages
pp. 5865-5878
BibKey
morante-etal-2022-identifying
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
79-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20 June 2022 to 25 June 2022

Authors

  • Roser Morante

  • Eleanor L. T. Smith

  • Lianne Wilhelmus

  • Alie Lassche

  • Erika Kuijpers

Links