Enforcing Semantic Consistency for Cross Corpus Emotion Prediction using Adversarial Discrepancy Learning
Abstract
Mismatch between databases poses a challenge when performing emotion recognition on an unlabeled target database recorded under practical conditions using labeled source data. Alignment between source and target is crucial for conventional neural networks, so many studies map the two domains into a common feature space. However, such work neglects the distortion of emotion semantics across different conditions: a sample may receive a high emotional annotation in the target domain but a low one in the source. In this work, we propose the maximum regression discrepancy (MRD) network, which enforces semantic consistency between source and target by adjusting the acoustic feature encoder, through adversarial training, to minimize the discrepancy on maximally distorted samples. We evaluate our framework in several cross-corpus emotion prediction experiments on three databases (USC IEMOCAP, MSP-Improv, and MSP-Podcast). Compared with a source-only neural network (SoNN) and a domain-adversarial neural network (DANN), the MRD network achieves a significant improvement of 5% to 10% in the concordance correlation coefficient (CCC) for cross-corpus prediction and 3% to 10% for evaluation on MSP-Podcast. We also visualize the effect of MRD on the feature representation to show the efficacy of the designed MRD structure.
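The abstract describes a three-step adversarial scheme in the spirit of maximum-discrepancy domain adaptation: train on labeled source data, push two regression heads apart on target samples to expose semantically distorted ones, then adapt the shared encoder to pull their predictions back into agreement. A minimal NumPy sketch of that procedure, using toy linear models; all data, dimensions, and learning rates here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labeled source and unlabeled, domain-shifted target data (illustrative).
n_src, n_tgt, d_in, d_hid = 200, 200, 10, 8
X_src = rng.normal(size=(n_src, d_in))
y_src = X_src @ rng.normal(size=d_in) * 0.3
X_tgt = rng.normal(size=(n_tgt, d_in)) + 0.5  # shifted feature distribution

# Linear encoder and two linear regressors (the discrepancy pair).
W_e = rng.normal(scale=0.1, size=(d_in, d_hid))
w1 = rng.normal(scale=0.1, size=d_hid)
w2 = rng.normal(scale=0.1, size=d_hid)
lr = 1e-2

def discrepancy(X, W_e, w1, w2):
    """Mean absolute disagreement of the two regressors on X."""
    h = X @ W_e
    return np.mean(np.abs(h @ w1 - h @ w2))

# Step A: fit encoder + both regressors on labeled source data (MSE descent).
for _ in range(200):
    h = X_src @ W_e
    e1, e2 = h @ w1 - y_src, h @ w2 - y_src
    w1 -= lr * (h.T @ e1) / n_src
    w2 -= lr * (h.T @ e2) / n_src
    W_e -= lr * X_src.T @ (np.outer(e1, w1) + np.outer(e2, w2)) / n_src

# Step B: freeze the encoder and push the regressors APART on target samples
# (gradient ascent on the discrepancy) to expose maximally distorted samples.
for _ in range(50):
    h = X_tgt @ W_e
    s = np.sign(h @ w1 - h @ w2)  # subgradient of |.|
    w1 += lr * (h.T @ s) / n_tgt
    w2 -= lr * (h.T @ s) / n_tgt

d_after_max = discrepancy(X_tgt, W_e, w1, w2)

# Step C: freeze the regressors and adapt the ENCODER to minimize the
# discrepancy, restoring semantic agreement on the target domain.
for _ in range(50):
    h = X_tgt @ W_e
    s = np.sign(h @ w1 - h @ w2)
    W_e -= lr * X_tgt.T @ np.outer(s, w1 - w2) / n_tgt

d_after_min = discrepancy(X_tgt, W_e, w1, w2)
```

After step C, the target-domain discrepancy `d_after_min` should drop below the post-maximization value `d_after_max`, which is the adversarial signal the MRD network exploits; the paper's actual model uses a deep acoustic encoder rather than these linear stand-ins.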
Figures
Adversarial discrepancy learning procedure of MRD network.
The t-SNE algorithm is employed to plot the feature representations transformed by the encoders of the MRD network, DANN, and SoNN for activation.
Keywords
speech emotion recognition | generative adversarial network | cross corpus learning | semantic consistency | domain adaptation
Authors
Chun-Min Chang, Chi-Chun Lee
Publication Date
2021/09/13
Journal
IEEE Transactions on Affective Computing
DOI
10.1109/TAFFC.2021.3111110
Publisher
IEEE