A Social Condition-Enhanced Network for Recognizing Power Distance using Expressive Prosody and Intrinsic Brain Connectivity
Abstract
Culture is the social norm that often dictates a person's thoughts, decision-making, and social behaviors during interaction at an individual level. In this study, we present a computational framework that automatically assesses an individual's cultural attribute of power distance (PDI), i.e., a measure of one's acceptance of social status, power, and authority in organizations, through multimodal modeling of a participant's expressive prosodic structures and brain connectivity using a social condition-enhanced network. Specifically, we propose a joint learning approach based on a center-loss embedding network architecture that learns to “centerize” the embedding space given a particular social interaction condition, enhancing the PDI discriminability of the representation. Our proposed method achieves 88.5% and 73.1% accuracy in the binary classification task of recognizing low versus high power distance on the prosodic and fMRI modalities, respectively. After multimodal fusion, the 2-class recognition rate improves to 96.2% (7.7% relative improvement). Further analyses reveal that the average and standard deviation of speech energy are significantly correlated with the power distance index, and that the right middle cingulate cortex (MCC) achieves the best recognition accuracy among brain regions, demonstrating its role in processing a person's belief about power distance.
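The joint objective described in the abstract (a center loss tied to the social interaction condition, optimized together with cross entropy) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layout of one center per (condition, class) pair, the weight `lam`, and all function names are assumptions made for illustration.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def center_loss(embeddings, labels, conditions, centers):
    """Half the mean squared distance of each embedding to its center.

    Assumption: centers has shape (n_conditions, n_classes, dim), i.e. one
    center per (social condition, PDI class) pair.
    """
    diffs = embeddings - centers[conditions, labels]
    return 0.5 * (diffs ** 2).sum(axis=1).mean()

def joint_loss(embeddings, logits, labels, conditions, centers, lam=0.1):
    """Cross entropy plus a weighted center loss (lam is a hypothetical weight)."""
    return (softmax_cross_entropy(logits, labels)
            + lam * center_loss(embeddings, labels, conditions, centers))
```

Minimizing the center term pulls embeddings recorded under the same condition and class toward a shared point, while the cross-entropy term keeps the low and high PDI classes separable.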
Figures
The complete architecture of our multimodal social condition-enhanced network (SC-eN) for power distance recognition: ROI-based functional connectivity graph embedding; dynamic modeling of prosodic pitch and energy contours; network training by jointly optimizing setting-wise center loss with the cross-entropy criterion; and recognition using functional encoding of the network output with a support vector machine.
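As a rough illustration of the ROI-based functional connectivity step named in the caption, the sketch below builds a connectivity matrix from ROI time series using Pearson correlation. The choice of Pearson correlation, the input layout, and the function name are assumptions; the page does not specify how the connectivity graph or its embedding is actually computed.

```python
import numpy as np

def functional_connectivity(roi_timeseries):
    """Pearson-correlation connectivity between ROIs.

    Assumption: roi_timeseries has shape (n_timepoints, n_rois), one column
    per region of interest. Returns an (n_rois, n_rois) symmetric matrix.
    """
    fc = np.corrcoef(roi_timeseries, rowvar=False)
    np.fill_diagonal(fc, 0.0)  # drop trivial self-connections
    return fc
```

Each row of the resulting matrix can be read as the connectivity profile of one ROI, which is the kind of per-region input a graph embedding would start from.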
Keywords
culture dimensions | fMRI | prosody | center-loss embedding | power distance index
Authors
Publication Date
2021/04/22
Journal
IEEE Transactions on Multimedia
DOI
10.1109/TMM.2021.3075091
Publisher
IEEE