Emotion-Shift Aware CRF for Decoding Emotion Sequence in Conversation
Abstract
Emotion recognition in conversation (ERC) is an increasingly important topic because it improves the user experience of speech technology in daily life. In this work, we propose an emotion-shift aware decoder, formulated as a conditional random field (CRF), to address the persistent problem of poor performance when handling emotion shifts in dialogues. We conduct speech emotion recognition experiments on the IEMOCAP and NNIME corpora and achieve 74.47% unweighted accuracy, which is the current state-of-the-art performance for four-class emotion recognition on IEMOCAP. This is also the first work on ERC for NNIME, where it obtains 61.02% weighted accuracy.
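The abstract reports one result as unweighted accuracy and the other as weighted accuracy without defining either. The sketch below assumes the convention commonly used in speech emotion recognition, where weighted accuracy is the overall fraction of correctly labeled utterances and unweighted accuracy is the mean of per-class recalls. The function names and the toy labels are illustrative only and are not taken from the paper.

```python
from collections import Counter

def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correct utterances divided by all utterances (assumed definition)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally (assumed definition)."""
    totals = Counter(y_true)  # utterances per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Toy example (made-up labels): a classifier that always predicts "neu".
y_true = ["neu", "neu", "neu", "ang"]
y_pred = ["neu", "neu", "neu", "neu"]
wa = weighted_accuracy(y_true, y_pred)
ua = unweighted_accuracy(y_true, y_pred)
```

Under these assumed definitions the two metrics diverge on imbalanced data: the majority-class predictor above scores higher on weighted accuracy than on unweighted accuracy, which is why the two figures in the abstract are not directly comparable.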
Figures
An illustration of our emotion sequence decoder framework, which comprises an emotion-shift transition adjustment mechanism in the CRF, an emotion classification module, and an emotion shift module.
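To make the three components named in the caption concrete, here is a minimal sketch of Viterbi decoding over a linear-chain CRF whose transition scores are adjusted by a per-utterance shift probability. This is not the paper's formulation: the adjustment rule (add `shift_weight * p` to label-changing transitions and `shift_weight * (1 - p)` to label-preserving ones), the parameter `shift_weight`, and the function name are all assumptions made for illustration. `emissions` stands in for the emotion classification module's scores and `shift_probs` for the emotion shift module's output.

```python
def shift_aware_viterbi(emissions, transitions, shift_probs, shift_weight=1.0):
    """Decode the best label sequence with shift-adjusted transitions.

    emissions[t][k]:   score of label k at utterance t (classifier output)
    transitions[i][j]: base CRF score for moving from label i to label j
    shift_probs[t]:    assumed probability that the label at t differs
                       from the label at t-1 (index 0 is unused)
    """
    num_steps, num_labels = len(emissions), len(emissions[0])
    score = list(emissions[0])
    backptrs = []
    for t in range(1, num_steps):
        p = shift_probs[t]
        new_score, ptr = [], []
        for cur in range(num_labels):
            best_prev, best_val = 0, float("-inf")
            for prev in range(num_labels):
                # Hypothetical adjustment: favor staying when a shift is
                # unlikely, favor changing when a shift is likely.
                bonus = shift_weight * ((1.0 - p) if prev == cur else p)
                val = score[prev] + transitions[prev][cur] + bonus
                if val > best_val:
                    best_prev, best_val = prev, val
            new_score.append(best_val + emissions[t][cur])
            ptr.append(best_prev)
        score = new_score
        backptrs.append(ptr)
    # Backtrack from the best final label.
    path = [max(range(num_labels), key=lambda k: score[k])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy input (made-up scores): two labels, two utterances, flat base transitions.
emissions = [[2.0, 0.0], [0.4, 0.6]]
transitions = [[0.0, 0.0], [0.0, 0.0]]
no_shift = shift_aware_viterbi(emissions, transitions, [0.0, 0.0])
shift = shift_aware_viterbi(emissions, transitions, [0.0, 1.0])
```

In this toy input the second utterance's emission scores are nearly tied, so the decoded label is determined by the shift probability: a low value keeps the previous label and a high value switches it. That is the general behavior a shift-aware transition adjustment is meant to produce, whatever its exact form in the paper.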
Keywords
speech emotion recognition | conversation | conditional random field | emotion shift
Authors
Chi-Chun Lee
Publication Date
2022/09/18
Conference
Interspeech 2022
DOI
10.21437/Interspeech.2022-10438
Publisher
ISCA