Automatic Detection of Speech Under Cold Using Discriminative Autoencoders and Strength Modeling with Multiple Sub-Dictionary Generation
Abstract
In this paper we tackle the Cold sub-challenge of the INTERSPEECH 2017 ComParE Challenge, whose goal is to determine whether a given speech utterance was produced by a speaker with a cold. We present two frameworks. The first is a neural network-based autoencoder trained with two loss functions: the standard reconstruction error used in unsupervised autoencoders, and a hinge loss applied at the middle layer to pull utterances spoken under the same condition into similar identity code spaces. Classification is then carried out by comparing the cosine similarity between the identity code of the target utterance and the mean identity codes of the cold and non-cold utterances. With a simple logistic regression combining the predictions of our method and the baseline systems, we achieve 65.81% and 66% UAR on the development and test sets provided by ComParE 2017, respectively. The second framework is based on strength modeling, in which the confidence outputs of diverse classifiers are concatenated to the original feature space as input to a support vector machine. The feature representations are derived from multiple sub-dictionaries within the framework of GMM Fisher-vector encoding, together with eGeMAPS functional features concatenated with the confidence outputs of the diverse classifiers. We achieve 70.2% and 65.5% on the development and test sets provided by ComParE 2017, respectively.
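The classification step described in the abstract (comparing an utterance's identity code with the mean codes of the cold and non-cold utterances by cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the two-dimensional toy codes, and the tie-breaking rule (ties go to "non-cold") are assumptions made for illustration, and the autoencoder that would produce the identity codes is not shown.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(codes: Sequence[Vector]) -> List[float]:
    """Element-wise mean of a non-empty collection of equal-length vectors."""
    n = len(codes)
    return [sum(column) / n for column in zip(*codes)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_by_centroid(
    code: Vector,
    cold_codes: Sequence[Vector],
    non_cold_codes: Sequence[Vector],
) -> str:
    """Label a target identity code by its closer class mean (hypothetical helper).

    Assumption: ties are assigned to "non-cold"; the abstract does not say
    how ties are handled.
    """
    sim_cold = cosine_similarity(code, mean_vector(cold_codes))
    sim_non_cold = cosine_similarity(code, mean_vector(non_cold_codes))
    return "cold" if sim_cold > sim_non_cold else "non-cold"


# Toy identity codes, invented for illustration only.
cold_codes = [[1.0, 0.1], [0.9, 0.2]]
non_cold_codes = [[0.1, 1.0], [0.2, 0.8]]

prediction = classify_by_centroid([0.8, 0.3], cold_codes, non_cold_codes)
print(prediction)
```

In the system described above, the identity codes would come from the middle layer of the trained autoencoder rather than from hand-written vectors, and the class means would presumably be computed from labeled training utterances.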
Figures
Proposed framework with extracting features and training discriminative model.
Keywords
cold detection | discriminative autoencoders | deep neural networks | computational paralinguistics
Authors
Hao-Chun Yang, Jeng-Lin Li, Chi-Chun Lee
Publication Date
2018/09/17
Conference
International Workshop on Acoustic Signal Enhancement (IWAENC)
2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC)
DOI
10.1109/iwaenc.2018.8521319
Publisher
IEEE