Understanding the neuro-perceptual mechanisms of vocal emotion perception continues to be an important research direction, not only for advancing scientific knowledge but also for inspiring more robust affective computing technologies. The large variability in the manifested fMRI signals across subjects has been shown to arise from individual differences, i.e., inter-subject variability. However, relatively few works have developed modeling techniques for the task of automatic neuro-perceptual decoding that handle such idiosyncrasies. In our work, we propose a novel deep voting fusion neural network architecture that learns an adjusted weight matrix applied at the fusion layer. The framework achieves an unweighted average recall of 53.10% in a four-class vocal emotion state decoding task, i.e., a relative improvement of 8.9% over a two-stage SVM decision-level fusion. Our framework demonstrates its effectiveness in handling individual differences. Further analysis is conducted to study the properties of the learned adjusted weight matrix as a function of emotion classification accuracy.
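The core idea of learning an adjusted weight matrix at the fusion layer can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the branch count, class count, function names, and uniform initialization are all assumptions; in the actual framework the weight matrix would be learned end-to-end by backpropagation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def weighted_vote_fusion(branch_probs, W):
    """Fuse per-branch class posteriors with a learnable weight matrix.

    branch_probs: (n_branches, n_classes) class probabilities from each branch.
    W: (n_branches, n_classes) adjusted weight matrix at the fusion layer
       (hypothetical shape; learned via gradient descent in practice).
    Returns a fused class posterior of shape (n_classes,).
    """
    fused = (W * branch_probs).sum(axis=0)  # element-wise weighting, sum over branches
    return softmax(fused)

# Example: 3 branches (e.g., per-subject or per-feature streams), 4 emotion classes.
rng = np.random.default_rng(0)
branch_probs = softmax(rng.normal(size=(3, 4)))
W = np.ones((3, 4)) / 3.0  # uniform initialization before training
fused = weighted_vote_fusion(branch_probs, W)
```

Compared with hard majority voting or a fixed-weight average, letting `W` vary per branch and per class gives the fusion layer the freedom to down-weight branches that are unreliable for particular emotion categories, which is one plausible way such an architecture can absorb inter-subject variability.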