Behavior Computing
Other: Signal Modeling for Understanding
States and Traits
A robust unsupervised arousal rating framework using prosody with cross-corpora evaluation
Abstract
This paper presents an unsupervised method for producing a bounded rating of affective arousal from speech. One of the major challenges in such behavioral signal classification is the design of methods that generalize well across domains and datasets. We propose a framework that provides robustness across databases by selecting coherent features based on empirical and theoretical evidence, fusing activation confidences from multiple features, and effectively weighting the soft labels without knowing the true labels. Spearman's rank correlations (and binary classification accuracies) on four arousal databases are 0.62 (73%), 0.77 (86%), 0.70 (82%), and 0.65 (73%).
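As a rough illustration of the two figures reported in the abstract, the sketch below fuses per-feature scores into one bounded rating and then computes Spearman's rank correlation and a binary accuracy against reference annotations. It is not the paper's implementation: the function names, the toy scores, the equal weights, and the zero decision threshold are all assumptions made for the example, and the paper's label-free weighting scheme is not reproduced here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def fuse(scores_per_feature, weights):
    """Weighted average of per-feature scores (each assumed to lie in [-1, 1])."""
    total = sum(weights)
    n = len(scores_per_feature[0])
    return [
        sum(w * s[i] for w, s in zip(weights, scores_per_feature)) / total
        for i in range(n)
    ]


def binary_accuracy(ratings, labels, threshold=0.0):
    """Fraction of utterances whose thresholded rating matches a 0/1 label."""
    hits = sum((r > threshold) == bool(l) for r, l in zip(ratings, labels))
    return hits / len(labels)


# Toy data (hypothetical): two prosodic feature scores for four utterances.
pitch_scores = [-0.8, -0.2, 0.1, 0.9]
energy_scores = [-0.6, 0.3, -0.1, 0.7]
fused = fuse([pitch_scores, energy_scores], weights=[1.0, 1.0])

human_ratings = [1.0, 2.5, 3.0, 4.5]   # hypothetical reference arousal ratings
human_labels = [0, 0, 1, 1]            # hypothetical low/high arousal labels

rho = spearman(fused, human_ratings)
acc = binary_accuracy(fused, human_labels)
print(f"rho={rho:.2f}, accuracy={acc:.0%}")
```

Because a weighted average of values in [-1, 1] stays in [-1, 1], the fused rating remains bounded, which is the property the abstract emphasizes. Rank correlation is a natural fit for this kind of rating since it only asks whether utterances are ordered consistently with the reference, not whether the scales match.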
Figures
Histogram of IEMOCAP results.
Keywords
arousal rating | activation | unsupervised | knowledge-based | inter-rater reliability | cross-corpora
Authors
Chi-Chun Lee
Publication Date
2012/09/09
Conference
Interspeech
Interspeech 2012
Publisher
ISCA