An Interaction-aware Attention Network for Speech Emotion Recognition in Spoken Dialogs｜BIIC Lab - NTHU

Abstract

Obtaining robust speech emotion recognition (SER) in spoken interactions is critical to the development of next-generation human-machine interfaces. Previous research has largely performed SER by modeling each utterance of a dialog in isolation, without considering the transactional and interdependent nature of human-human conversation. In this work, we propose an interaction-aware attention network (IAAN) that incorporates contextual information into the learned vocal representation through a novel attention mechanism. Our proposed method achieves 66.3% accuracy in four-class emotion recognition (7.9% over baseline methods), which is also the current state-of-the-art recognition rate on the benchmark database.
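
The abstract does not give the details of the attention mechanism, so the following is only a rough illustration of the general idea of letting context utterances inform the representation of the current one. It is not the IAAN architecture: it swaps in generic scaled dot-product attention, and the function names (`context_attention`, `softmax`, `dot`), the toy vectors, and the choice to concatenate the current vector with the context summary are all assumptions made for this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def context_attention(current, context):
    """Hypothetical helper (not from the paper).

    Scores each context utterance vector against the current utterance
    vector with scaled dot-product attention, builds a weighted summary
    of the context, and concatenates it to the current vector.
    Returns (fused_vector, attention_weights).
    """
    scale = math.sqrt(len(current))
    weights = softmax([dot(current, c) / scale for c in context])
    summary = [
        sum(w * c[i] for w, c in zip(weights, context))
        for i in range(len(current))
    ]
    return current + summary, weights  # list concatenation


# Toy 2-dimensional "utterance embeddings" (assumed values).
current = [1.0, 0.0]
context = [[1.0, 0.0], [0.0, 1.0]]
fused, weights = context_attention(current, context)
```

In this toy setup the context utterance that resembles the current one receives the larger weight, and the fused vector is twice the length of the input, so a downstream emotion classifier would see both the utterance itself and a summary of its conversational context.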

Figures

An illustration of our proposed interaction-aware attention network (IAAN) for speech emotion recognition.

Keywords

speech emotion recognition ｜ interaction ｜ attention mechanism ｜ spoken dialogs

Authors