Video: What Is the Minimum Amount of Speech for Authentication?
Learn more about voice authentication at the next SpeechTEK conference.
Read the complete transcript of this clip:
Ben Cunningham: As we determine what's the minimum amount of speech, we try to think of it realistically. So, we've found that in testing the minimum, and I don't think this will surprise anyone, it's three syllables. And, it's not a coincidence that that's how many syllables are in Alexa, and "Hey, Google," and who's the--I don't know what Microsoft's Wake Word is--but, it's essentially the same thing.
It's designing for one-breath utterances. That's where you kind of aim to.
But the reality is, your device, the background noise, everything influences it. You could have someone do it three times perfectly, and then they call from a convertible on a Bluetooth phone and they're like, "I don't know why it doesn't work." I think we also have to account for the fact that that's working under optimal conditions, and we could all debate equal error rates all day.
We see it differently. We see it as: How do I authenticate, or how do I find fraud in a lifecycle of a phone call? We can do it without any voice, so, again, going back to that multimodal or multifactor, 80 percent of IVRs aren't voice-enabled.
So what do you do with that? You can't get 45 net seconds of speech out of someone not saying anything, so you have to look at other factors. We can use metadata analysis and actually, we could authenticate, at some level, 70 percent of people that call in based on if you have their phone number on file. Just because we can vet the Caller ID, essentially. We can say, "Hey, that Caller ID, it's not spoofed. There's no risk. So if you want to just let 'em check their balance, things like that, that goes ahead."
Other things factor in like enrollment rates: It varies on how often your callers call back in. So retailers, how many people call Home Depot more than once in their lifetime? Probably not too many. You have to think about, what does the enrollment process look like? And I'm going to go back to best practices for a second.
If you just roll out voice biometrics and just say, "Okay, everyone enroll now," there's a couple things wrong with that. One, you could, without any risk validation of who's enrolling, you don't know who's enrolling the first time. So you could be enrolling a fraudster. Secondly, if you have to spend 45 seconds, or have someone repeat the same phrase in an active situation, if your goal is to reduce average handle time, that's really gonna blow the metrics outta the water. Because now you have to take an additional two minutes to stop and explain what they're doing, how many times, what they have to repeat.
We don't really see that as optimal. So, our solution really is passive, like the other ones I've mentioned. We just stitch however much speech we get together, and the more they talk, the more we have on file. But we're also looking at device printing, metadata analysis, and all those combined. So, even if they don't talk through the entire conversation, there are still elements there that we can say, "Yeah, this person is who they're claiming to be."
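The passive, multifactor approach described in the transcript can be illustrated with a rough sketch. Everything in it is hypothetical and is not Pindrop's implementation: the `CallEvidence` class, the `access_level` function, the three access tiers, and the use of the 45 net seconds mentioned above as a voice threshold are all assumptions made for illustration. It shows speech segments being accumulated as the caller talks, with non-voice signals (a vetted Caller ID, a device match) still granting a low-risk tier when little or no speech is available.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: the "45 net seconds" figure from the transcript is treated
# here as the amount of speech needed before a voice match is trusted.
MIN_NET_SPEECH_SECONDS = 45.0


@dataclass
class CallEvidence:
    """Hypothetical container for signals gathered passively during a call."""
    caller_id_vetted: bool = False   # metadata analysis says the Caller ID is not spoofed
    device_matches: bool = False     # device print matches what is on file
    voice_matches: bool = False      # voiceprint comparison result, if any
    speech_segments: List[float] = field(default_factory=list)

    def add_speech(self, seconds: float) -> None:
        """Stitch another segment of caller speech onto what we already have."""
        if seconds > 0:
            self.speech_segments.append(seconds)

    @property
    def net_speech(self) -> float:
        """Total seconds of speech collected so far."""
        return sum(self.speech_segments)


def access_level(evidence: CallEvidence) -> str:
    """Map the combined evidence to an illustrative access tier."""
    non_voice_ok = evidence.caller_id_vetted or evidence.device_matches
    voice_ok = (
        evidence.voice_matches
        and evidence.net_speech >= MIN_NET_SPEECH_SECONDS
    )
    if voice_ok and non_voice_ok:
        return "full"
    if voice_ok or non_voice_ok:
        return "low_risk"   # e.g. letting the caller check a balance
    return "none"
```

In this sketch, a caller on a non-voice IVR with a vetted Caller ID would land in the `low_risk` tier, and would move to `full` only once enough stitched speech produced a voice match. A real deployment would presumably weigh these signals with risk scores, not booleans.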
Nuance Communications Senior Manager, Commercial Security Strategy Roanne Levitt explains how Nuance's Conversation Print addresses speech spoofing in this clip from her panel at SpeechTEK 2019.
Nuance Communications Senior Manager, Commercial Security Strategy Roanne Levitt and ID R&D VP of Sales John Amein discuss the essential requirements for complying with privacy regulations in this clip from their panel at SpeechTEK 2019.
Pindrop Director of Product Marketing Ben Cunningham discusses how liveness detection, synthetic speech detection, and other deep voice biometric technologies can fight the threat deepfakes pose in this clip from a panel at SpeechTEK 2019.
Allstate Conversational Designer Katie Lower outlines working models for assessing the viability of a conversational interface with multiple teams within an organization in this clip from her presentation at SpeechTEK 2019.
Allstate Conversational Designer Katie Lower defines the customer journey map as a visualization of the customer's process and explains why it's valuable in this clip from her presentation at SpeechTEK 2019.
Grand Studio Lead Designer Diana Deibel discusses the ethical implications of speech UIs and remaining cognizant of the inherent human elements of speech and conversation in this clip from her presentation at SpeechTEK 2019.
Grand Studio Lead Designer Diana Deibel discusses multiple approaches to making VUI design transparent--the Google vs. Alexa, system-initiated vs. user-initiated--in this clip from her presentation at SpeechTEK 2019.
Gridspace Co-Founder and Co-Head of Engineering Anthony Scodary demonstrates Grace, Gridspace's new autonomous call center agent, in this clip from his keynote at SpeechTEK 2019.
Orion Labs Head of Product Ellen Juhlin and Voicea CMO Cory Treffiletti discuss persisting challenges in speech-to-text, AI identifying intent, user expectations, and more in enterprise speech tech applications in this clip from their panel at SpeechTEK 2019.
451 Research Senior Analyst Raul Castanon discusses new findings from a recent survey on speech technology adoption in the enterprise and how adoption of devices in the consumer space has impacted enterprise adoption in this clip from his panel at SpeechTEK 2019.
Grand Studio Lead Designer Diana Deibel discusses best practices for culturally inclusive access in voice UI design in this clip from her presentation at SpeechTEK 2019.
Gridspace Co-Founder and Co-Head of Engineering Anthony Scodary discusses the transactional nature of speech and how that understanding impacts effective, AI-driven call center analytics in this clip from his keynote at SpeechTEK 2019.
Conversational Technologies Principal Deborah Dahl lays out a plan for making virtual assistants more effective in this clip from her keynote at SpeechTEK 2019.
Conversational Technologies Principal Deborah Dahl discusses the state of the art for the three pillars of conversational systems in this clip from her keynote at SpeechTEK 2019.
Conversational Technologies Principal Deborah Dahl explains how more targeted enterprise knowledge could make VAs more effective in organizations in this clip from her keynote at SpeechTEK 2019.
Nuance Communications' Roanne Levitt delineates the differences between text-dependent and text-independent biometrics and what the advent of text-independent means for IVR applications in this clip from SpeechTEK 2019.