RESEARCH

An analysis of the relationship between signal-derived vocal arousal score and human emotion production and perception
Abstract
Bone et al. recently proposed an unsupervised signal-derived vocal arousal score (VC-AS) based on the fusion of three intuitive acoustic features, i.e., pitch, intensity, and HF500, and have shown its effectiveness in quantifying human perceptual ratings of arousal robustly across multiple corpora. Because the system is readily applicable, this quantification scheme could foreseeably be used in multiple fields of behavioral science as an objective measure of affect. In this work, we investigate in detail the relationship of this signal-derived measure to both intended arousal expression (i.e., the production aspect) and perceived arousal rating (i.e., the perception aspect). On the perception side, our results on three databases (EMA, VAM, and IEMOCAP) indicate that VC-AS agrees with mean perception at least as well as an average individual rater does. Regarding production, we observe that intended arousal correlates more with VC-AS than with mean perception (EMA and IEMOCAP), and that VC-AS correlates more with intended arousal than with perceived arousal (EMA); these findings are surprising given that the framework is motivated by extensive affective perception studies, although there is physiological backing. Implications for the use of VC-AS in novel scientific study (e.g., to mitigate subjectivity) are further discussed.
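The perception-side comparison in the abstract (an automatic score agreeing with mean perception at least as well as an average individual rater does) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the use of Pearson correlation, the leave-one-rater-out scheme, the function names, and the synthetic data. It is not the authors' evaluation protocol and does not use EMA, VAM, or IEMOCAP.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_rating(ratings, skip=None):
    """Per-utterance mean over raters, optionally leaving one rater out."""
    kept = [r for i, r in enumerate(ratings) if i != skip]
    return [sum(col) / len(col) for col in zip(*kept)]


def average_rater_agreement(ratings):
    """Mean correlation of each rater with the mean of the remaining raters."""
    corrs = [pearson(r, mean_rating(ratings, skip=i)) for i, r in enumerate(ratings)]
    return sum(corrs) / len(corrs)


def score_agreement(score, ratings):
    """Correlation of an automatic score with the mean of all raters."""
    return pearson(score, mean_rating(ratings))


# Synthetic data only (assumption): a hidden arousal value per utterance,
# four noisy raters, and a noisy automatic score standing in for VC-AS.
rng = random.Random(0)
latent = [rng.uniform(-1, 1) for _ in range(200)]
ratings = [[v + rng.gauss(0, 0.5) for v in latent] for _ in range(4)]
score = [v + rng.gauss(0, 0.3) for v in latent]

print("average rater vs. other raters:", round(average_rater_agreement(ratings), 3))
print("automatic score vs. mean rating:", round(score_agreement(score, ratings), 3))
```

In this sketch each rater is compared with the mean of the other raters, so that a rater's own rating does not inflate the agreement figure, while the score is compared with the mean of all raters. A stricter comparison would also evaluate the score against each leave-one-out mean.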
Figures
Arousal rating system diagram showing progression from raw data (utterance ‘j’), to features, to individual feature scores, and finally to fused score pj.
Keywords
vocal arousal rating | affective perception | affective production
Authors
Publication Date
2015/09/06
Conference
Interspeech 2015
DOI
10.21437/Interspeech.2015-325
Publisher
ISCA