Mapping the Historical Ecology of the Cyclades: A Diachronic Natural Language Processing Analysis of Travel Narratives (1700–1920)
Proceedings of the 2nd Workshop on Ecology, Environment, and Natural Language Processing
Abstract
Historical texts are a valuable source for studying a place’s ecological history, but reading them and extracting information manually is tedious and time-consuming. Natural Language Processing can extract the most relevant information from such texts quickly, effectively, and reproducibly. In this study, travel narratives about the Cyclades Islands from four time periods spanning 1700–1920 were selected for analysis. The first, quantitative step is the semi-automatic detection of geographical entities in the texts and their linkage to predefined keywords, which enables temporal and spatial statistical analysis. The output of this step is then fed into a Retrieval-Augmented Generative Synthesis pipeline, in which the text segments, together with their linked places and keywords, are processed by a locally orchestrated Large Language Model. The final output supports the understanding and interpretation of the original texts. Although the study focuses mainly on the coherence and repeatability of the workflow, it also attempts to interpret the results and relate them to the past ecological profile of the islands. The dataset and supplementary material are provided via an open-access repository.
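As an illustration of the quantitative step only, the linkage of detected places to predefined keywords can be sketched as a co-occurrence count per text segment and time period. This is a minimal sketch under stated assumptions, not the workflow used in the study: the gazetteer, the keyword list, the period labels, the sample sentences, and the function name `link_places_to_keywords` are all hypothetical, and simple whole-word matching stands in for whatever semi-automatic entity detection is actually applied.

```python
import re
from collections import Counter

# Hypothetical gazetteer and keyword list (assumptions, not the study's lists).
PLACES = ["Naxos", "Paros", "Santorini"]
KEYWORDS = ["vineyard", "goat", "forest"]


def link_places_to_keywords(segments):
    """Count (period, place, keyword) co-occurrences within each segment.

    `segments` is an iterable of (period_label, text) pairs. A place and a
    keyword are linked when both occur in the same segment.
    """
    counts = Counter()
    for period, text in segments:
        lowered = text.lower()
        # Whole-word match for place names from the gazetteer.
        places = [
            p for p in PLACES
            if re.search(rf"\b{re.escape(p.lower())}\b", lowered)
        ]
        # Whole-word match for keywords, allowing a simple plural "s".
        keywords = [
            k for k in KEYWORDS
            if re.search(rf"\b{re.escape(k)}s?\b", lowered)
        ]
        for place in places:
            for keyword in keywords:
                counts[(period, place, keyword)] += 1
    return counts


# Invented sample segments and period labels, for illustration only.
SAMPLE = [
    ("1700-1750", "The vineyards of Naxos cover the lower slopes."),
    ("1850-1900", "On Paros we saw goats grazing; no forest remains."),
    ("1850-1900", "Santorini is bare of trees."),
]

if __name__ == "__main__":
    for key, n in sorted(link_places_to_keywords(SAMPLE).items()):
        print(key, n)
```

A table of such counts, indexed by period and place, is the kind of structure that temporal and spatial statistics can be computed on. The matched segments themselves, with their linked place and keyword, would be the natural input to a later retrieval step.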