An Intelligent Infrastructure Toward Large Scale Naturalistic Affective Speech Corpora Collection｜BIIC Lab - NTHU

Abstract

The field of speech emotion recognition (SER) aims to create scientifically rigorous systems that can reliably characterize emotional behaviors expressed in speech. A key aspect for building SER systems is to obtain emotional data that is both reliable and reproducible for practitioners. However, academic researchers encounter difficulties in accessing or collecting naturalistic, large-scale, reliable emotional recordings. Also, the best practices for data collection are not necessarily described or shared when presenting emotional corpora. To address this issue, the paper proposes the creation of an affective database consortium (ADC) that can encourage multidisciplinary cooperation among researchers and practitioners in the field of affective computing. This paper’s contribution is twofold. First, it proposes the design of the ADC with a customizable-standard framework for intelligently-controlled emotional data collection. The focus is on leveraging naturalistic spontaneous recordings available on audio-sharing websites. Second, it presents as a case study the development of a naturalistic large-scale Taiwanese Mandarin podcast corpus using the customizable-standard intelligently-controlled framework. The ADC will enable research groups to effectively collect data using the provided pipeline and to contribute alternative algorithms or data collection protocols.

Figures

Customizable Intelligently Controlled Pipeline Infrastructure.

Keywords

Speech Emotion Recognition ｜ Database Consortium ｜ Affective Computing

Conference

2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII)
