Abstract
Children with autism spectrum disorder (ASD) have been found to have difficulty producing cohesive narratives. However, cohesion measures are usually derived through manual labeling or from expert-crafted features. In this work, we develop a novel LSTM framework that learns an embedded representation of narrative cohesion directly from data. Our lexical coherence representation achieves a promising recognition accuracy of 92% in distinguishing typically-developing (TD) children from children with ASD, compared with 73% for conventional coherence measures computed from syntax, word usage, and latent semantic analysis. We further analyze the validity of the proposed representation. When incoherence is experimentally introduced into the TD children's story-telling narratives through word-level and sentence-level shuffling, the lexical coherence representations derived from these incoherent TD samples move closer to those of the ASD samples.