Joint learning of conversational temporal dynamics and acoustic features for speech deception detection in dialog games
Abstract
Deception is an intentional act by a deceiver to make an interrogator believe something is true (or false) that the deceiver believes to be false (or true); when asked to respond to questions, the deceiver purposefully shares a mix of truthful and deceptive experiences. Conventionally, automatic deception detection from speech is treated as a recognition task modeled using only the deceiver's acoustic cues. It does not include the temporal conversation dynamics between the interlocutors, and so ignores potential deception-related cues that arise when the two interlocutors coordinate their back-and-forth interaction. In this paper, we propose a joint learning framework that detects deception by simultaneously considering variations and patterns of the conversation, using both interlocutors' acoustic features and their conversational temporal dynamics. Our proposed model achieves an unweighted average recall (UAR) of 74.71% on a recently collected Chinese deceptive corpus of dialog games. Further analyses reveal that the interrogator's behaviors are correlated with the deceiver's deception behaviors, and that including the conversational features improves deception detection performance.
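For readers unfamiliar with the reported metric, unweighted average recall is the mean of the per-class recalls, so each class counts equally however many samples it has. The following is a minimal sketch of that definition only, not the authors' evaluation code; the function name, the label strings, and the toy predictions are illustrative assumptions and are not taken from the corpus.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not data from the paper).
y_true = ["truth"] * 8 + ["deception"] * 2
y_pred = ["truth"] * 8 + ["truth", "deception"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.2f}, accuracy = {accuracy:.2f}")
```

In this toy case the recall is 1.0 for the majority class and 0.5 for the minority class, so UAR is 0.75 while plain accuracy is 0.90. This gap is why UAR is commonly preferred when the classes are imbalanced.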
Figures
(a) Turn segmentation (b) Feature-level fusion (c) Deception detection framework
Keywords
deception | conversation | BLSTM | attention | speech acoustics
Authors
Publication Date
2019/11/18
Conference
2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
DOI
10.1109/apsipaasc47483.2019.9023050
Publisher
IEEE