Abstract
In this work, we analyze how a child's engagement behavior manifests in the child's own speech and in the speech of the psychologist interacting with the child. Visual cues such as facial gestures and gaze are known to be informative of engagement; here, we examine the less-studied speech cues of the children's non-verbal vocalizations. We study spectral, prosodic, and duration features extracted from the child's and the psychologist's vocal data. We observe that these features carry discriminative power for assessing the children's engagement levels (49.2% accuracy in classifying three levels of engagement, compared with 33% chance accuracy). We also present our results as a disengagement detection task, with precision, recall, and F-measure of 0.70, 0.42, and 0.53, respectively. The unweighted accuracy for binary classification between engagement and disengagement is 62.9%. Our results suggest that vocal cues carry useful information about the state of engagement, indicating that speech can play an effective role in engagement assessment.
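As a consistency check on the reported detection figures, the following worked equation assumes that the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, so this is an assumption. The symbols P, R, and F are introduced here for illustration only.

```latex
% Assumption: F-measure = balanced F1 (harmonic mean of precision P and recall R)
\[
F = \frac{2PR}{P + R}
  = \frac{2 \times 0.70 \times 0.42}{0.70 + 0.42}
  = \frac{0.588}{1.12}
  \approx 0.53
\]
```

Under this assumption, the reported precision and recall reproduce the reported F-measure to two decimal places.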