Self-Assessed Affect Recognition Using Fusion of Attentional BLSTM and Static Acoustic Features
Abstract
In this study, we present a computational framework developed for the Self-Assessed Affect Sub-Challenge of the INTERSPEECH 2018 Computational Paralinguistics Challenge. The goal of this sub-challenge is to classify the valence scores that speakers assign to themselves into three levels: low, medium, and high. We explore fusing bidirectional LSTM (BLSTM) models with the baseline SVM models to improve recognition accuracy. Specifically, we extract frame-level acoustic low-level descriptors (LLDs) as input to a BLSTM with a modified attention mechanism, and we train separate SVMs on the standard ComParE 16 baseline feature sets with minority-class upsampling. These diverse predictions are then combined in a decision-level score fusion scheme that integrates all of the developed models. Our proposed approach achieves 62.94% and 67.04% unweighted average recall (UAR), a 6.24% and 1.04% absolute improvement over the best baseline provided by the challenge organizers. We further provide a detailed comparative analysis of the different models.
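As a rough illustration of the time-series branch, the sketch below (PyTorch) pools BLSTM outputs with frame-level attention into an utterance representation for the three-way valence classification. The paper's specific attention modification is not detailed here, so a standard additive-style attention is used; the input dimensionality (130 LLDs per frame), hidden size, and all other hyperparameters are assumptions for illustration only.

```python
# Minimal sketch of attention pooling over a BLSTM for 3-class valence.
# NOTE: sizes and the attention form are assumptions, not the paper's values.
import torch
import torch.nn as nn

class AttentiveBLSTM(nn.Module):
    def __init__(self, n_lld=130, hidden=128, n_classes=3):
        super().__init__()
        self.blstm = nn.LSTM(n_lld, hidden, batch_first=True,
                             bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)      # one attention score per frame
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                          # x: (batch, frames, n_lld)
        h, _ = self.blstm(x)                       # (batch, frames, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)     # attention weights over frames
        utt = (w * h).sum(dim=1)                   # weighted utterance embedding
        return self.out(utt)                       # low/medium/high logits

# Usage on dummy input: 4 utterances, 500 frames, 130 LLDs each.
logits = AttentiveBLSTM()(torch.randn(4, 500, 130))
posteriors = torch.softmax(logits, dim=-1)
```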
Figures
The complete schematic of our framework: upsampling the minority class in our database, training both a time-series model (a BLSTM with a modified attention mechanism) and a static model (an SVM with ComParE 16 features), and finally integrating the diverse models in a decision-level fusion scheme.
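The static branch and the fusion step of this pipeline can be sketched as follows (scikit-learn), assuming the utterance-level ComParE functionals are already extracted: minority classes are upsampled by replication, a linear SVM is trained on the functionals, and its normalized decision scores are averaged with the BLSTM posteriors. The synthetic data, SVM complexity, and equal fusion weights are placeholders for illustration, not the paper's tuned values.

```python
# Minimal sketch: minority-class upsampling, SVM on ComParE functionals,
# and decision-level score fusion with the BLSTM branch's posteriors.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.utils import resample

def upsample(X, y):
    """Replicate minority-class instances until all classes match the largest."""
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    Xs, ys = [], []
    for c in classes:
        Xc, yc = resample(X[y == c], y[y == c], replace=True,
                          n_samples=n_max, random_state=0)
        Xs.append(Xc)
        ys.append(yc)
    return np.vstack(Xs), np.concatenate(ys)

# Synthetic stand-ins for the 6373-dim ComParE functionals and the
# BLSTM branch's class posteriors (placeholders, not real data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 6373))
y_train = rng.integers(0, 3, 200)
X_test = rng.normal(size=(50, 6373))
blstm_posteriors = rng.dirichlet(np.ones(3), 50)

X_up, y_up = upsample(X_train, y_train)
svm = make_pipeline(StandardScaler(), LinearSVC(C=1e-4))
svm.fit(X_up, y_up)

# Decision-level score fusion: z-normalize the SVM margins, then take a
# weighted average with the BLSTM posteriors before the argmax.
svm_scores = svm.decision_function(X_test)            # (n_test, 3) margins
svm_scores = (svm_scores - svm_scores.mean(0)) / svm_scores.std(0)
fused = 0.5 * svm_scores + 0.5 * blstm_posteriors     # illustrative weights
y_pred = fused.argmax(axis=1)                         # 0=low, 1=medium, 2=high
```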
Keywords
computational paralinguistics | BLSTM | affect recognition | attention mechanism
Publication Date
2018/09/02
Conference
Interspeech 2018
DOI
10.21437/Interspeech.2018-2261
Publisher
ISCA