A Multimodal Interlocutor-Modulated Attentional BLSTM for Classifying Autism Subgroups During Clinical Interviews
Abstract
The heterogeneity of Autism Spectrum Disorder (ASD) remains a challenging, unsolved issue in current clinical practice. The behavioral differences between ASD subgroups are subtle and can be hard for experts to discern manually. Here, we propose a computational framework that models both the vocal behaviors and body gestural movements of the interlocutors, capturing their intricate dependency through a learnable interlocutor-modulated (IM) attention mechanism, during dyadic clinical interviews of the Autism Diagnostic Observation Schedule (ADOS). Specifically, our multimodal network architecture includes two modality-specific networks, a speech-IM-aBLSTM and a motion-IM-aBLSTM, that are combined in a fusion network to perform the final differentiation among three ASD subgroups: Autistic Disorder (AD) vs. High-Functioning Autism (HFA) vs. Asperger Syndrome (AS). Our model uniquely introduces the IM attention mechanism to capture the non-linear behavioral dependency between interlocutors, which is essential for improved discriminability in classifying the three subgroups. We evaluate our framework on a large ADOS collection and obtain a 66.8% unweighted average recall (UAR), 14.3% better than previous work on the same dataset. Furthermore, based on the learned attention weights, we analyze the behavior descriptors essential for differentiating subgroup pairs. We further identify the most critical self-disclosure emotion topics within the ADOS interview sessions, showing that anger and fear are the most informative interaction segments for observing the subtle interactive behavior differences between these three subtypes of ASD.
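The abstract does not spell out the IM-attention equations, but the general idea of one interlocutor's hidden states modulating the attention pooling over the other's can be sketched as follows. This is a toy NumPy illustration, not the paper's implementation; `im_attention`, the mean-pooled interviewer context, and the scoring vector `w` are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def im_attention(h_participant, h_interviewer, w):
    """Toy interlocutor-modulated attention (illustrative, not the
    paper's exact formulation): score each participant time step
    against a summary of the interviewer's BLSTM hidden states,
    then pool the participant sequence with the resulting weights.

    h_participant: (T, d) BLSTM hidden states for the participant
    h_interviewer: (S, d) BLSTM hidden states for the interviewer
    w:             (d,)   learnable scoring vector (assumed form)
    """
    context = h_interviewer.mean(axis=0)                  # interviewer summary, (d,)
    scores = (h_participant * (context * w)).sum(axis=1)  # one score per time step, (T,)
    alpha = softmax(scores)                               # attention weights over time
    return alpha @ h_participant, alpha                   # pooled representation, weights

rng = np.random.default_rng(0)
h_p = rng.standard_normal((10, 8))   # 10 participant time steps, 8-dim states
h_i = rng.standard_normal((6, 8))    # 6 interviewer time steps
w = rng.standard_normal(8)
pooled, alpha = im_attention(h_p, h_i, w)
```

In the paper's full model, a pair of such weights (αF and αB, over the forward and backward BLSTM states) would be learned jointly with the rest of the network rather than computed from a fixed scoring vector.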
Figures
The Framework of the Multimodal Interlocutor-Modulated Attentional BLSTM (Multimodal IM-aBLSTM): The learnable weight pair αF and αB within the BLSTM constitute the interlocutor-modulated (IM) attention, which integrates the dyad's information to improve discriminability between subgroups.
Attention Network Architecture Differences between M3 and our proposed models: We improve on the conventional attention mechanism in M3 by jointly considering the interlocutors' information to construct the IM-attention mechanism in our proposed IM-aBLSTM.
Keywords
Behavioral signal processing | autism spectrum disorder | multimodal BLSTM | attention mechanism
Authors
Yun-Shao Lin, Chi-Chun Lee
Publication Date
2020/01/31
Journal
IEEE Journal of Selected Topics in Signal Processing (Volume 14)
DOI
10.1109/JSTSP.2020.2970578
Publisher
IEEE