RESEARCH

Behavior Computing
Other: Signal Modeling for Understanding
Integrating Perceivers Neural-Perceptual Responses Using a Deep Voting Fusion Network for Automatic Vocal Emotion Decoding
Abstract
Understanding the neuro-perceptual mechanism of vocal emotion perception continues to be an important research direction, not only for advancing scientific knowledge but also for inspiring more robust affective computing technologies. The large variability in the manifested fMRI signals among subjects has been shown to be due to individual differences, i.e., inter-subject variability. However, relatively few works have developed modeling techniques that handle such idiosyncrasies in the task of automatic neuro-perceptual decoding. In our work, we propose a novel deep voting fusion neural network architecture that learns an adjusted weight matrix applied at the fusion layer. The framework achieves an unweighted average recall of 53.10% in a four-class vocal emotion state decoding task, i.e., a relative improvement of 8.9% over a two-stage SVM decision-level fusion. Our framework demonstrates its effectiveness in handling individual differences. Further analysis is conducted to study the properties of the learned adjusted weight matrix as a function of emotion classification accuracy.
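To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the authors' architecture. It assumes each perceiver (subject) contributes a vector of class posteriors for a stimulus and that the fusion weights form a subjects-by-classes matrix; the abstract does not state the shape of the adjusted weight matrix or how it is trained, so the names `weighted_vote`, `unweighted_average_recall`, and all numeric values here are hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_vote(subject_probs, weights):
    """Fuse per-subject class posteriors with an elementwise weight matrix.

    subject_probs: (n_subjects, n_classes) posteriors for one stimulus.
    weights:       (n_subjects, n_classes) fusion weights (assumed shape;
                   fixed by hand here, learned in a real model).
    Returns the index of the class with the largest weighted vote.
    """
    fused = (weights * subject_probs).sum(axis=0)
    return int(np.argmax(fused))

def unweighted_average_recall(y_true, y_pred, n_classes):
    """Mean of per-class recalls, so every class counts equally."""
    recalls = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():  # skip classes absent from y_true
            recalls.append((y_pred[mask] == c).mean())
    return float(np.mean(recalls))

# Illustrative posteriors: 3 subjects, 4 emotion classes (made-up numbers).
probs = np.array([
    [0.40, 0.30, 0.20, 0.10],
    [0.10, 0.50, 0.20, 0.20],
    [0.35, 0.30, 0.25, 0.10],
])

# Uniform weights reduce to a plain sum of posteriors.
uniform = np.ones_like(probs)

# Hypothetical adjusted weights that down-weight the second subject.
adjusted = np.array([
    [2.0, 2.0, 2.0, 2.0],
    [0.5, 0.5, 0.5, 0.5],
    [2.0, 2.0, 2.0, 2.0],
])

print(weighted_vote(probs, uniform))   # fused decision with equal votes
print(weighted_vote(probs, adjusted))  # fused decision with adjusted votes
```

In this toy case the fused decision changes once one subject's vote is down-weighted, which is the kind of per-subject adjustment a learned fusion weight could express. Unweighted average recall averages recall over classes rather than over samples, so it is not inflated by a dominant class.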
Figures
A schematic of our proposed deep voting fusion neural network performing automatic 4-class vocal emotion decoding.
Keywords
individual difference | fMRI | vocal emotion perception | deep voting fusion neural net
Publication Date
2018/04/15
Conference
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI
10.1109/icassp.2018.8462352
Publisher
IEEE