RE-LLM: Refining Empathetic Speech-LLM Responses by Integrating Emotion Nuance
Abstract
As generative AI advances, empathy in human-AI interaction becomes essential. Prior work focuses on emotional reflection, while emotional exploration, which is key to deeper engagement, remains overlooked. Moreover, existing LLMs rely on text, which captures only limited emotional nuance. To address this, we propose RE-LLM, a speech-LLM that integrates dimensional emotion embeddings and auxiliary learning.
Experiments show statistically significant gains in nearly all empathy metrics across three datasets. On ESD, RE-LLM improves the Emotional Reaction score by a relative 14.79% and 6.76% over text-only and speech-LLM baselines. Notably, it raises the Exploration score by a relative 35.42% and 3.91% on IEMOCAP, 139.28% and 9.83% on ESD, and 60.95% and 22.64% on MSP-PODCAST. It also boosts unweighted accuracy in speech emotion recognition by 5.4% on IEMOCAP, 2.3% on ESD, and 6.9% on MSP-PODCAST. These results highlight RE-LLM's enriched emotional understanding and improved empathetic response generation.
Figures
The architecture of our proposed RE-LLM comprises a speech-LLM and an emotion nuance module. The figure also depicts the preprocessing and generation pipeline, along with the training strategy of expected behavioral alignment constrained on emotion nuance.
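The abstract and figure describe fusing dimensional emotion embeddings into a speech-LLM and training with an auxiliary speech-emotion-recognition (SER) objective. The paper itself does not include code here, so the following PyTorch sketch is purely illustrative: all module names, dimensions, and the fusion-by-prefix design are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class EmotionNuanceSpeechLLM(nn.Module):
    """Illustrative sketch (not the paper's code): fuse a dimensional
    emotion embedding into the speech-LLM input sequence as a prefix
    token, and attach an auxiliary SER head for multi-task training."""

    def __init__(self, speech_dim=768, emo_dim=3, hidden=1024, n_emotions=4):
        super().__init__()
        # project speech-encoder features into the LLM embedding space
        self.speech_proj = nn.Linear(speech_dim, hidden)
        # project dimensional emotion values (e.g. valence/arousal/dominance)
        self.emo_proj = nn.Linear(emo_dim, hidden)
        # auxiliary speech-emotion-recognition head (categorical labels)
        self.ser_head = nn.Linear(hidden, n_emotions)

    def forward(self, speech_feats, emo_vec):
        # speech_feats: (B, T, speech_dim), emo_vec: (B, emo_dim)
        h = self.speech_proj(speech_feats)           # (B, T, hidden)
        e = self.emo_proj(emo_vec).unsqueeze(1)      # (B, 1, hidden)
        prefix = torch.cat([e, h], dim=1)            # prepend emotion token
        ser_logits = self.ser_head(h.mean(dim=1))    # pooled SER prediction
        return prefix, ser_logits

model = EmotionNuanceSpeechLLM()
prefix, ser_logits = model(torch.randn(2, 50, 768), torch.randn(2, 3))
# Training would combine the LLM's response-generation loss with a
# weighted cross-entropy loss on ser_logits (the auxiliary objective).
```

Under this reading, the auxiliary SER loss encourages the shared speech representation to retain emotion nuance, which the reported SER accuracy gains suggest benefits empathetic response generation.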
Keywords
speech LLM | empathetic conversational agent | speech emotion modeling
Publication Date
2025/12/06
Conference
IEEE ASRU