A Multimodal Interlocutor-Modulated Attentional BLSTM for Classifying Autism Subgroups During Clinical Interviews｜BIIC Lab - NTHU

Abstract

The heterogeneity of Autism Spectrum Disorder (ASD) remains a challenging and unsolved issue in current clinical practice. The behavioral differences between ASD subgroups are subtle and can be hard for experts to discern manually. Here, we propose a computational framework that models both the vocal behaviors and the body gestural movements of the interlocutors during dyadic clinical interviews of the Autism Diagnostic Observation Schedule (ADOS), capturing their intricate dependency through a learnable interlocutor-modulated (IM) attention mechanism. Specifically, our multimodal network architecture includes two modality-specific networks, a speech-IM-aBLSTM and a motion-IM-aBLSTM, that are combined in a fusion network to perform the final three-way ASD subgroup classification, i.e., Autistic Disorder (AD) vs. High-Functioning Autism (HFA) vs. Asperger Syndrome (AS). Our model uniquely introduces the IM attention mechanism to capture the non-linear behavior dependency between interlocutors, which is essential for improving discriminability among the three subgroups. We evaluate our framework on a large ADOS collection and obtain a 66.8% unweighted average recall (UAR), which is 14.3% better than the previous work on the same dataset. Furthermore, based on the learned attention weights, we analyze the behavior descriptors that are essential in differentiating subgroup pairs. We further identify the most critical self-disclosure emotion topics within the ADOS interview sessions, and find that anger and fear are the most informative interaction segments for observing the subtle interactive behavior differences between these three subtypes of ASD.
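The UAR metric reported above is the mean of the per-class recalls, so each subgroup counts equally regardless of how many sessions it contributes. The following is a minimal sketch of that computation; the function name and the toy labels are illustrative assumptions and are not data or code from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls: every class has equal weight."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, imbalanced labels (hypothetical, for illustration only).
y_true = ["AD"] * 6 + ["HFA"] * 2 + ["AS"] * 2
y_pred = ["AD"] * 6 + ["HFA", "AD"] + ["AS", "AD"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the majority class is always recognized, so plain accuracy looks better than UAR; UAR penalizes the missed minority-class samples more heavily, which is why it suits imbalanced subgroup data.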

Figures

The Framework of Multimodal Interlocutor-Modulated Attentional BLSTM (Multimodal IM-aBLSTM): The learnable weight pair αF and αB within the BLSTM is termed the interlocutor-modulated (IM) attention, which integrates the dyad's information to improve discriminability between the subgroups.

Attention Network Architecture Differences between M3 and our proposed models: We improve on the conventional attention mechanism in M3 by jointly considering both interlocutors' information to construct the IM-attention mechanism in our proposed IM-aBLSTM.
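To make the idea in these captions concrete, the sketch below shows one plausible form of attention pooling in which the attention scores depend on the hidden states of both interlocutors, applied separately to the forward and backward directions to yield a weight pair analogous to αF and αB. This is an assumption-laden illustration, not the paper's equations: the additive scoring function, the parameter names (`W_self`, `W_other`, `v`), the dimensions, and the random stand-ins for BLSTM outputs and learned weights are all hypothetical.

```python
import numpy as np


def softmax(x):
    z = np.exp(x - x.max())  # subtract the max for numerical stability
    return z / z.sum()


def im_attention_pool(h_self, h_other, W_self, W_other, v):
    """Pool h_self over time with weights scored from BOTH speakers.

    h_self, h_other: (T, D) time-aligned hidden states of the two interlocutors.
    Returns the pooled (D,) vector and the (T,) attention weights.
    """
    scores = np.tanh(h_self @ W_self + h_other @ W_other) @ v  # shape (T,)
    alpha = softmax(scores)
    return alpha @ h_self, alpha


rng = np.random.default_rng(0)
T, D, A = 5, 4, 3  # time steps, hidden size, attention size (all assumed)

# Random stand-ins for forward/backward BLSTM outputs of each interlocutor.
hf_self, hb_self = rng.normal(size=(T, D)), rng.normal(size=(T, D))
hf_other, hb_other = rng.normal(size=(T, D)), rng.normal(size=(T, D))

# Random stand-ins for learned attention parameters, one set per direction.
params_f = (rng.normal(size=(D, A)), rng.normal(size=(D, A)), rng.normal(size=A))
params_b = (rng.normal(size=(D, A)), rng.normal(size=(D, A)), rng.normal(size=A))

ctx_f, alpha_f = im_attention_pool(hf_self, hf_other, *params_f)
ctx_b, alpha_b = im_attention_pool(hb_self, hb_other, *params_b)

# Session-level representation: concatenated forward and backward contexts.
session_vec = np.concatenate([ctx_f, ctx_b])
print(alpha_f.round(3), alpha_b.round(3), session_vec.shape)
```

A conventional attention layer would correspond to dropping the `h_other @ W_other` term, so that the weights depend only on the speaker being modeled.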

Keywords

Behavioral signal processing ｜ autism spectrum disorder ｜ multimodal BLSTM ｜ attention mechanism

Publication Date

2020/01/31

Journal

IEEE Journal of Selected Topics in Signal Processing (Volume 14)

DOI

10.1109/JSTSP.2020.2970578
