Learning Conditional Acoustic Latent Representation with Gender and Age Attributes for Automatic Pain Level Recognition｜BIIC Lab - NTHU

Learning Conditional Acoustic Latent Representation with Gender and Age Attributes for Automatic Pain Level Recognition

Abstract

Pain is an unpleasant internal sensation caused by bodily damage or physical illness, and its expression varies with personal attributes. In this work, we propose an age- and gender-embedded latent acoustic representation learned with a conditional maximum mean discrepancy variational autoencoder (MMD-CVAE). The learned MMD-CVAE embeds personal attribute information directly in the latent space. Using these MMD-CVAE encoded features on a large-scale database of real patients in pain, our method achieves 70.7% in extreme-set classification (severe versus mild) and 47.7% in three-class recognition (severe, moderate, and mild). These are relative improvements of 11.34% and 17.51%, respectively, over an acoustic representation without age-gender conditioning. Further analyses reveal that, under severe pain, females have a higher maximum jitter and a lower harmonic energy ratio between F0, H1, and H2 than males, and that the minimum values of jitter and shimmer are higher in the elderly than in the non-elderly group.

Figures

Overall framework: a conditional variational autoencoder architecture with a maximum mean discrepancy criterion is used to learn acoustic representations for automatic pain classification.
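To make the maximum mean discrepancy criterion concrete, here is a minimal NumPy sketch and not the authors' implementation. It assumes a Gaussian (RBF) kernel with a hypothetical bandwidth `sigma`, the biased MMD estimator, and gender and age group encoded as 0/1 flags appended to the acoustic features. None of these choices are specified on this page, and the function names are illustrative only.

```python
import numpy as np


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances, clipped at 0 against round-off.
    sq = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq = np.maximum(sq, 0.0)
    return np.exp(-sq / (2.0 * sigma ** 2))


def mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between sample sets x and y."""
    return (
        rbf_kernel(x, x, sigma).mean()
        + rbf_kernel(y, y, sigma).mean()
        - 2.0 * rbf_kernel(x, y, sigma).mean()
    )


def condition(features, gender, age_group):
    """Append 0/1 gender and age-group flags to each feature vector.

    The binary encoding is an assumption made for this sketch.
    """
    attrs = np.stack([gender, age_group], axis=1).astype(features.dtype)
    return np.concatenate([features, attrs], axis=1)
```

In the common MMD-VAE formulation, `mmd2` would be evaluated between a batch of encoded latent vectors and samples drawn from the prior, and added to the reconstruction loss. A conditional variant would pass the output of `condition` to the encoder and append the same attribute flags to the latent code before decoding.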

Keywords

pain ｜ acoustic representation ｜ age and gender ｜ conditional variational autoencoder (CVAE)

Authors

Publication Date

2018/09/02

Conference

Interspeech 2018

DOI

10.21437/Interspeech.2018-1298

Publisher

RESEARCH