Joint learning of conversational temporal dynamics and acoustic features for speech deception detection in dialog games｜BIIC Lab - NTHU

Download PDF IEEE Xplore

Abstract

Deception is an intentional act by a deceiver to make an interrogator believe that something is true (or false) when the deceiver believes it to be false (or true); when asked to respond to questions, a deceiver purposefully shares a mix of truthful and deceptive experiences. Conventionally, automatic deception detection from speech is treated as a recognition task modeled using only the deceiver's acoustic cues. This excludes the temporal conversation dynamics between the interlocutors, i.e., it ignores potential deception-related cues that arise as the two interlocutors coordinate their back-and-forth interaction. In this paper, we propose a joint learning framework that detects deception by simultaneously considering variations and patterns of the conversation, using both interlocutors' acoustic features and their conversational temporal dynamics. Our proposed model achieves an unweighted average recall (UAR) of 74.71% on a recently collected Chinese deceptive corpus of dialog games. Further analyses reveal that the interrogator's behaviors are correlated with the deceiver's deception behaviors, and that including the conversational features provides enhanced deception detection power.
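
For readers unfamiliar with the metric reported above, UAR is the mean of the per-class recalls, so the deceptive and truthful classes count equally even when one is rarer. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name and the toy labels (1 = deceptive, 0 = truthful) are assumptions made for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    # Recall of a class = correct predictions / samples of that class.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy labels invented for illustration only (1 = deceptive, 0 = truthful).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the deceptive class has recall 3/4 and the truthful class 1/2, so UAR is their plain average, while accuracy is pulled toward the larger class. That difference is why UAR is commonly preferred on imbalanced data.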

Figures

(a) Turn segmentation (b) Feature-level fusion (c) Deception detection framework

Keywords

deception ｜ conversation ｜ BLSTM ｜ attention ｜ speech acoustics

Authors