Attentive to Individual: A Multimodal Emotion Recognition Network with Personalized Attention Profile
Abstract
A growing number of human-centered applications benefit from continuous advancements in emotion recognition technology. Many emotion recognition algorithms have been designed to model multimodal behavior cues to achieve high performance. However, most of them do not consider the modulating effect of an individual's personal attributes on his/her expressive behaviors. In this work, we propose a Personalized Attributes-Aware Attention Network (PAaAN) with a novel personalized attention mechanism to perform emotion recognition using speech and language cues. The attention profile is learned from embeddings of an individual's profile, acoustic, and lexical behavior data. The profile embedding is derived using Linguistic Inquiry and Word Count (LIWC) features computed between the target speaker and a large set of movie scripts. Our method achieves a state-of-the-art 70.3% unweighted accuracy in a four-class emotion recognition task on the IEMOCAP corpus. Further analysis reveals that affect-related semantic categories are emphasized differently for each speaker in the corpus, demonstrating the effectiveness of our attention mechanism for personalization.
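To make the profile-conditioned attention idea concrete, here is a minimal sketch (not the authors' implementation; the layer sizes, network shape, and all variable names below are hypothetical) in which attention weights over frame-level features are scored from each frame concatenated with a fixed-size speaker profile embedding, so different speakers induce different attention profiles:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PersonalizedAttention(nn.Module):
    """Sketch: attention over frame features conditioned on a profile embedding."""

    def __init__(self, feat_dim: int, profile_dim: int, hidden_dim: int = 64):
        super().__init__()
        # Scores each frame from [frame features ; profile embedding].
        self.score = nn.Sequential(
            nn.Linear(feat_dim + profile_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, feats: torch.Tensor, profile: torch.Tensor) -> torch.Tensor:
        # feats: (batch, time, feat_dim); profile: (batch, profile_dim)
        expanded = profile.unsqueeze(1).expand(-1, feats.size(1), -1)
        scores = self.score(torch.cat([feats, expanded], dim=-1))  # (B, T, 1)
        weights = F.softmax(scores, dim=1)                          # over time
        return (weights * feats).sum(dim=1)  # profile-weighted pooled feature

# Toy usage: 2 utterances, 50 frames of 40-dim acoustic features,
# 8-dim profile embeddings (all sizes illustrative).
pooled = PersonalizedAttention(40, 8)(torch.randn(2, 50, 40), torch.randn(2, 8))
print(pooled.shape)  # torch.Size([2, 40])
```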
Figures
Figure: The overall PAaAN framework. We compute the dot product of each target speaker's LIWC features with those of a large speaker set from movie scripts to project the target speaker into a personal profile space.
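The projection step in the figure can be sketched in a few lines: each reference speaker's LIWC category vector contributes one axis of the profile space, and the target speaker's coordinates are dot-product similarities. The array sizes, random data, and normalization below are assumptions for illustration, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_refs, n_liwc = 500, 64            # illustrative sizes, not from the paper
liwc_refs = rng.random((n_refs, n_liwc))   # LIWC features of movie-script speakers
liwc_target = rng.random(n_liwc)           # LIWC features of the target speaker

# Dot product with every reference speaker places the target speaker
# in a profile space whose axes are the reference speakers.
profile = liwc_refs @ liwc_target          # shape: (n_refs,)
profile /= np.linalg.norm(profile)         # normalization is an assumption
print(profile.shape)
```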
Keywords
personal attribute | multimodal emotion recognition | attention | psycholinguistic norm
Authors
Jeng-Lin Li, Chi-Chun Lee
Publication Date
2019/09/15
Conference
Interspeech 2019
DOI
10.21437/Interspeech.2019-2044
Publisher
ISCA