Conversational Assistants to Support Patients with Heart Failure: Comparing a Neurosymbolic Architecture with GPT

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

Abstract

Conversational assistants are becoming increasingly popular, including in healthcare, partly due to the availability and capabilities of Large Language Models. There is a need for controlled, probing evaluations with real stakeholders, which can highlight the advantages and disadvantages of more traditional architectures and those based on generative AI. We present a within-group user study comparing two versions of a conversational assistant that allows patients with heart failure to ask about the salt content of food. One version was developed with a neurosymbolic architecture, and the other is based on GPT. Our objective in evaluating the two dialogue systems was not only to compare task performance but also to gain insights from real stakeholders. Results indicate that the two systems complement each other, highlighting the promise of a hybrid approach that leverages the strengths of both.