A Robust Unsupervised Arousal Rating Framework using Prosody with Cross-Corpora Evaluation｜BIIC Lab - NTHU

Abstract

This paper presents an unsupervised method for producing a bounded rating of affective arousal from speech. One of the major challenges in such behavioral signal classification is designing methods that generalize well across domains and datasets. We propose a framework that provides robustness across databases by selecting coherent features based on empirical and theoretical evidence, fusing activation confidences from multiple features, and effectively weighting the soft labels without knowing the true labels. Spearman's rank correlation (and binary classification accuracy) on four arousal databases is 0.62 (73%), 0.77 (86%), 0.70 (82%), and 0.65 (73%).
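The two evaluation metrics quoted in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the function names, the toy scores, the zero threshold on the predicted rating, and the median split on the reference ratings are all assumptions made for illustration.

```python
from statistics import mean, median


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def binary_accuracy(pred, ref, pred_threshold, ref_threshold):
    """Fraction of items where both scores fall on the same side of their threshold."""
    hits = sum((p > pred_threshold) == (r > ref_threshold) for p, r in zip(pred, ref))
    return hits / len(pred)


# Hypothetical data: predicted ratings on an assumed bounded scale,
# and human reference ratings on an assumed 1-5 scale.
predicted = [-0.8, -0.3, 0.1, 0.4, 0.9, -0.5]
reference = [1.5, 2.0, 3.5, 3.0, 4.5, 2.5]

rho = spearman(predicted, reference)
# Assumed binarization: predictions split at 0, references at their median.
acc = binary_accuracy(predicted, reference, 0.0, median(reference))
print(f"Spearman rho = {rho:.2f}, binary accuracy = {acc:.0%}")
```

Rank correlation compares only the ordering of utterances, so it does not require the predicted rating and the human rating to share a scale. The binary accuracy, by contrast, depends on how each scale is split into high and low arousal.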

Figures

Histogram of IEMOCAP results.

Keywords

arousal rating ｜ activation ｜ unsupervised ｜ knowledge-based ｜ inter-rater reliability ｜ cross-corpora

Authors

Publication Date

2012/09/09

Conference

Interspeech 2012

Publisher

RESEARCH
