Video: Detecting Deception in Speech, Pt. 1: Methods

Learn more about speech technology in academia at the next SpeechTEK conference.

Read the complete transcript of this clip:

Yocheved Levitan: There are lots of challenges involved with detecting deception in text. Firstly, there's a lack of data to study deceptive speech, the way people lie. We don't always have the truth. It's very hard to know what people are comparing things with, and also when there are studies, it's very difficult to compare them with the real world.

Also, there are differences in the way people lie across cultures and across gender, et cetera. So this is a more recent approach presented in the PhD thesis of Sarah Ita Levitan, who happens to be my sister. She studied deception detection through the vehicle of speech.

Some notable features of this work are that it offers a large-scale corpus for studying deception that is multi-domain and spans cultures and genders. There was also a financial incentive for people who lie well and for people who deceive well. And it offers methods for automatically extracting features and machine-learning approaches for classifying deceptive speech.

Three main ideas are identifying cues to deception, studying speaker variability in the way people lie, and using this information to classify speech. A little bit more detail about the corpus, known as the CXD corpus, or the Columbia Cross-Cultural Deception Corpus. This is an overview of the experiment. It's an extension of the false resume paradigm, also known as a lying game, where participants came into the lab and one interviewed another, asking questions, and one participant was told to lie.

The other was supposed to detect whether they were telling the truth or not. And they were offered money for performing well. It also included personality scores and a baseline sample of their speech to compare with. Here's an example of a questionnaire that they were given where, for half the examples, they were told to fabricate a response, and then when their partner asked them, they would say the false answer. For example, "Do you have allergies to any foods?" For that example, they would lie, and then they would have to try to get away with it.

So just some other features of this corpus: 340 subjects, over 120 hours of speech, different cultures, different genders, and it is also labeled with global and local deception. They studied different units of speech. Inter-pausal units are pause-free speech. Turns are speech from just one speaker, but may include pauses. Question responses are those that directly follow a prompt like "Do you have allergies?", so the direct response to that. And then question chunks are the total response to a given question.
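As a rough illustration of the smallest unit described above, the sketch below groups time-aligned words from one speaker turn into inter-pausal units by splitting wherever the silence between two words exceeds a threshold. The `Word` type, the `split_into_ipus` function, the sample timings, and the 0.05-second threshold are all assumptions made for this example; the talk does not state how the corpus was actually segmented.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds


def split_into_ipus(words, max_pause=0.05):
    """Group one speaker's time-aligned words into inter-pausal units.

    max_pause is a hypothetical threshold in seconds: a silence longer
    than this between two consecutive words starts a new unit.
    """
    ipus = []
    current = []
    for word in words:
        # Close the current unit when the gap since the previous word is too long.
        if current and word.start - current[-1].end > max_pause:
            ipus.append(current)
            current = []
        current.append(word)
    if current:
        ipus.append(current)
    return ipus


# Made-up timings for a single turn answering "Do you have allergies?"
turn = [
    Word("no", 0.00, 0.20),
    Word("I", 0.90, 1.00),
    Word("do", 1.00, 1.15),
    Word("not", 1.15, 1.40),
]
ipus = split_into_ipus(turn)
ipu_texts = [" ".join(w.text for w in ipu) for ipu in ipus]
print(ipu_texts)
```

Here the whole list is one turn, and the long silence after "no" splits it into two inter-pausal units. Question responses and question chunks would be built one level up, by grouping turns according to the prompt they answer.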

So once they collected all that data, they extracted different feature sets. Acoustic-prosodic features are the speech features. Then, linguistic deception indicators are different features that are explicitly indicative of deception, like laughter, hesitation, et cetera. LIWC features are different dimensions of speech like formality and punctuation. And complexity studies the syntax of the speech, how complex it is. In total, there were 152 features studied, in three different groups. For example, for acoustic-prosodic, these are the pitch features, with different measures for each. Intensity is the volume.

There are also different measures for that, plus voice quality. We know that people reveal themselves to the world through their words, so it's not only useful to look at the speech sample, but to also look at the content. And that's why we use LIWC to identify different features in the text, which were obtained through transcribing the audio. And these different dimensions can give us some insight into the psychological cues of the speech.
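To make the idea of dictionary-based text features concrete, here is a minimal sketch of LIWC-style counting: each word in a transcript is checked against category word lists, and each category's share of the total word count becomes a feature. The `TOY_LEXICON` categories and word lists and the `category_rates` function are hypothetical stand-ins invented for this example. They are not the real LIWC dictionary and not the feature set used in the thesis.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only; not the real LIWC dictionary.
TOY_LEXICON = {
    "first_person": {"i", "me", "my", "mine"},
    "negation": {"no", "not", "never"},
    "hedge": {"maybe", "probably", "guess"},
}


def category_rates(transcript):
    """Return, per category, the fraction of tokens that fall in that category."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter()
    for token in tokens:
        for category, vocab in TOY_LEXICON.items():
            if token in vocab:
                counts[category] += 1
    total = len(tokens)
    # Normalize by transcript length so long and short answers are comparable.
    return {
        category: (counts[category] / total if total else 0.0)
        for category in TOY_LEXICON
    }


rates = category_rates("No, I do not have any food allergies, I guess.")
print(rates)
```

Normalizing by the number of tokens is a design choice for this sketch, so that a long answer does not score higher simply because it contains more words.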
