Back to CL4HEALTH 2024
LREC-COLING 2024workshop

Exploring the Challenges of Behaviour Change Language Classification: A Study on Semi-Supervised Learning and the Impact of Pseudo-Labelled Data

Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024

DOI:10.63317/5g7dmmqtb9hx

Abstract

Automatic classification of behaviour change language can enhance conversational agents’ capabilities to adjust their behaviour based on users’ current situations and to encourage individuals to make positive changes. However, the lack of annotated language data of change-seekers hampers the performance of existing classifiers. In this study, we investigate the use of semi-supervised learning (SSL) to classify highly imbalanced texts around behaviour change. We assess the impact of including pseudo-labelled data from various sources and examine the balance between the amount of added pseudo-labelled data and the strictness of the inclusion criteria. Our findings indicate that while adding pseudo-labelled samples to the training data has limited classification impact, it does not significantly reduce performance regardless of the source of these new samples. This reinforces previous findings on the feasibility of applying classifiers trained on behaviour change language to diverse contexts.

Details

Paper ID
lrec2024-ws-cl4health-28
Pages
pp. 229-239
BibKey
meyer-etal-2024-exploring
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024
Location
undefined, undefined
Date
20 May 2024 25 May 2024

Authors

  • SM

    Selina Meyer

  • MF

    Marcos Fernandez-Pichel

  • DE

    David Elsweiler

  • DL

    David E. Losada

Links