LREC 2016, Main Conference
An Annotated Corpus of Direct Speech
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
Abstract
We propose a scheme for annotating direct speech in literary texts, based on the Text Encoding Initiative (TEI) and the coreference annotation guidelines from the Message Understanding Conference (MUC). The scheme encodes the speakers and listeners of utterances in a text, as well as the quotative verbs that report the utterances. We measure inter-annotator agreement on this annotation task. We then present statistics on a manually annotated corpus that consists of books from the New Testament. Finally, we visualize the corpus as a conversational network.
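
As an illustration only, the step from speaker and listener annotations to a conversational network might be sketched as follows. The abstract does not show the actual markup, so this sketch assumes TEI's `<said>` element with its `who` and `toWhom` attributes, marks quotative verbs with a hypothetical `<seg type="quotative">` element, and uses invented speakers and utterances rather than text from the corpus. The function name `conversation_edges` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample for illustration; element names, attribute names, and
# speakers are assumptions, not the markup or data used in the paper.
SAMPLE = """
<text>
  <p>Alice <seg type="quotative">said</seg> to Bob,
     <said who="#alice" toWhom="#bob">Follow me.</said></p>
  <p>Bob <seg type="quotative">answered</seg>,
     <said who="#bob" toWhom="#alice">Where are you going?</said></p>
  <p>Alice <seg type="quotative">said</seg> to them,
     <said who="#alice" toWhom="#bob #carol">Come and see.</said></p>
</text>
"""


def conversation_edges(xml_text):
    """Count directed speaker -> listener edges over all <said> elements."""
    root = ET.fromstring(xml_text)
    edges = Counter()
    for said in root.iter("said"):
        # Both attributes may hold several space-separated pointers.
        speakers = said.get("who", "").split()
        listeners = said.get("toWhom", "").split()
        for speaker in speakers:
            for listener in listeners:
                edges[(speaker.lstrip("#"), listener.lstrip("#"))] += 1
    return edges


edges = conversation_edges(SAMPLE)
for (speaker, listener), count in sorted(edges.items()):
    print(f"{speaker} -> {listener}: {count}")
```

Each edge weight is the number of utterances one character addresses to another; a weighted directed graph of this kind is one plausible input to a network visualization, though the paper may construct its network differently.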