Back to Main Conference 2022
LREC 2022main

At the Intersection of NLP and Sustainable Development: Exploring the Impact of Demographic-Aware Text Representations in Modeling Value on a Corpus of Interviews

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4ifcfc833ze2

Abstract

This research explores automated text classification using data from Low– and Middle–Income Countries (LMICs). In particular, we explore enhancing text representations with demographic information of speakers in a privacy-preserving manner. We introduce the Demographic-Rich Qualitative UPV-Interviews Dataset (DR-QI), a rich dataset of qualitative interviews from rural communities in India and Uganda. The interviews were conducted following the latest standards for respectful interactions with illiterate speakers (Hirmer et al., 2021a). The interviews were later sentence-annotated for Automated User-Perceived Value (UPV) Classification (Conforti et al., 2020), a schema that classifies values expressed by speakers, resulting in a dataset of 5,333 sentences. We perform the UPV classification task, which consists of predicting which values are expressed in a given sentence, on the new DR-QI dataset. We implement a classification model using DistilBERT (Sanh et al., 2019), which we extend with demographic information. In order to preserve the privacy of speakers, we investigate encoding demographic information using autoencoders. We find that adding demographic information improves performance, even if such information is encoded. In addition, we find that the performance per UPV is linked to the number of occurrences of that value in our data.

Details

Paper ID
lrec2022-main-216
Pages
pp. 2007-2021
BibKey
van-boven-etal-2022-intersection
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
79-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20 June 2022 25 June 2022

Authors

  • Gv

    Goya van Boven

  • SH

    Stephanie Hirmer

  • CC

    Costanza Conforti

Links