Speech Technology Magazine


The Downside of Directed Dialogue NLU (Video)

Omilia's Quinn Agen discusses the value of context and memory in delivering human-like interactions in self-service contact centers in this clip from SpeechTEK 2018.
By The Editors of Speech Technology - Posted Oct 5, 2018

Learn more about customer self-service at the next SpeechTEK conference.

Read the complete transcript of this clip:

Quinn Agen: Since 2012, when we first started developing and deploying our own speech recognition engine, there's been considerable improvements in speech recognition accuracy. We actually leveraged deep neural networks to reach human level accuracy. Our speech recognition today operates in 17 languages, and because of the way that we leverage the deep neural networks, we can spin up new languages in a matter of one to two months.

As I mentioned previously, the key to a conversational experience is allowing your customers to speak naturally, easily switch between any topic at any point in the dialogue. And really the holy grail to being able to have a human-like conversation is context and memory. If we're all having a conversation and I can't remember what we were talking about 15 or five minutes ago, or understand what you're referring to indirectly, then we're not going to make it very far in our conversation. The same thing applies for machine-to-human conversations. The underlying technology needs to have the ability to maintain context and memory, and understand what folks are referring to when they refer to something, or change topics.

This is very different than, let's say, legacy speech technologies that have been sort of monopolized in the market by large vendors up until today. Some key sore points, from a user experience, with those legacy speech recognition technologies is it's really a structured dialogue. Your customers will hear things like “say this to do that, you can say this to do that, say cancel to go back to main menu.” Again, obviously, this is not how humans talk. We don't consider that a conversational interface.

Just to sort of dive a little deeper into the comparison with directed dialogue or speech-enabled IVR applications that are out there today, as you can see, even though you can have a speech-enabled IVR, it is essentially a DTMF menu tree, and so there are scenarios where the customer will have to say certain things to navigate that menu. And that, of course, leads to a mixed user experience, because if you present the user with an open question, natural language at the top, making him or her believe that they can speak naturally, and then two dialogue steps later you are limiting them and saying how they have to say something, that leads to very low task completion rates and poor customer experience.
