Abstract
The increasing availability of large-scale emotion corpora, along with advances in emotion recognition algorithms, has enabled the emergence of next-generation human-machine interfaces. This paper describes a newly collected multimodal corpus, the NTHU-NTUA Chinese Interactive Emotion Corpus (NNIME). The database is the result of collaborative work between engineers and drama experts. It includes recordings of 44 subjects engaged in spontaneous dyadic spoken interactions. The multimodal data comprise approximately 11 hours of audio, video, and electrocardiogram data recorded continuously and synchronously. The database also includes a rich set of discrete and continuous-in-time emotion annotations provided by a total of 49 annotators. These emotion annotations cover diverse perspectives: peer-report, director-report, self-report, and observer-report. The carefully engineered data collection and annotation processes provide an additional valuable resource for quantifying and investigating various aspects of affective phenomena and human communication. To the best of our knowledge, the NNIME is one of the few large-scale Chinese affective dyadic interaction databases that have been systematically collected and organized and that will be publicly released to the research community.