Nuance’s Dragon Voicemail-to-Text Platform Tapped by Deutsche Telekom
Deutsche Telekom has chosen Nuance's cloud-based Dragon voicemail-to-text platform to expand its existing Mobilbox Pro service. Using the Nuance solution, voicemail messages are converted to text, which is then sent to the recipient as either a text message or an email.
While the service makes things easier for users, it is not simple to build. The technology poses several hurdles, and converting voicemail to text is "an extraordinarily difficult speech recognition problem," explains Daniel Faulkner, senior vice president of mobile at Nuance.
Because these are human-to-human interactions, people tend to speak very quickly. Another problem is that the system has no way of knowing up front what someone is calling about. Messages also vary in length.
Another issue is that the speech recognition has to be speaker-independent. While Dragon NaturallySpeaking can adapt to individual speakers over time, the voicemail-to-text solution doesn't know who is going to call and leave a voicemail message. "We have to be able to accurately transcribe anyone's speech, any accent, age, gender on any topic at any duration," Faulkner says. "We have to be able to do that extremely fluently, and we are continually updating models based on the acoustic data that we have."
The process gets even more complex since most telecom operators are using fairly old voicemail systems, Faulkner explains. These systems take already compressed audio over the phone and then compress it even more to save storage space. That means that Nuance's solution receives highly compressed or even transcoded audio.
Yet another problem is location: People can be calling from anywhere, so there can be a lot of background noise.
"When you combine all of these things--speaker independence, topic independence, and highly compressed telephony audio--it is among, if not the most, difficult pure speech recognition challenge," Faulkner says. "It [visual voicemail] does make mistakes, but importantly, users can see the meaning of the message and can usually glean enough information from the transcription that they don't have to listen to the audio."
Nuance's voicemail-to-text solution shares some core speech recognition technology with Dragon NaturallySpeaking but differs in other ways. For instance, long-form dictation or transcription requires acoustic models that recognize the sounds people make in a given accent, as well as a language model trained on sequences of words.
"We have a specialized voicemail language model [for voicemail-to-text] because even though it's a very broad domain, when people leave voicemail messages they don't speak the same way as when they're dictating," Faulkner says.
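Faulkner's point about a voicemail-specific language model can be illustrated with a toy example. The sketch below is not Nuance's method. It is a minimal bigram model with add-one smoothing, and the two tiny training sets, the test phrase, and the function names are all made up for illustration. It shows only that a model trained on voicemail-style word sequences scores a typical voicemail phrase as more likely than a model trained on dictation-style text does.

```python
from collections import Counter, defaultdict
import math


def train_bigram_model(sentences):
    """Count words and word pairs (bigrams) over whitespace-tokenized sentences."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            bigrams[prev][cur] += 1
    return unigrams, bigrams


def log_prob(sentence, model):
    """Add-one smoothed log probability of a sentence under a bigram model."""
    unigrams, bigrams = model
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        followers = bigrams.get(prev, Counter())
        count = followers[cur]            # 0 if the pair was never seen
        context = sum(followers.values())
        total += math.log((count + 1) / (context + vocab_size))
    return total


# Hypothetical training text, invented for this sketch.
voicemail_text = [
    "hi it's me call me back when you get this",
    "hey give me a call back thanks bye",
    "hi call me back when you can",
]
dictation_text = [
    "the patient presented with mild symptoms period",
    "please find the attached report comma",
    "the meeting is scheduled for monday period",
]

voicemail_model = train_bigram_model(voicemail_text)
dictation_model = train_bigram_model(dictation_text)

phrase = "hi call me back"
voicemail_score = log_prob(phrase, voicemail_model)
dictation_score = log_prob(phrase, dictation_model)
print("voicemail model:", round(voicemail_score, 2))
print("dictation model:", round(dictation_score, 2))
```

A higher (less negative) score means the model finds the phrase more plausible. Production recognizers combine far larger language models with acoustic models, but the underlying idea of matching the training text to the way people actually speak is the same.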
Nuance recently conducted a survey with Research Now to better understand the text and voicemail behaviors of smartphone customers. Among the findings: 95 percent of consumers said they find text messaging more convenient than voicemail, and on average it takes six to eight hours before a consumer listens to a voicemail.
People also indicated that they prefer voice communication when the content that needs to be communicated is personal, for example, when someone is in the hospital. "Where there's a high human touch required, people will use voice messaging," Faulkner says. "When there is more day-to-day transactional information, people prefer to text."
Additionally, younger generations prefer text messages over voice. "The relevance of a service like [the one] Deutsche Telekom is releasing is quite high because you're transporting something that's been left in a voicemail into the most popular message medium, text," Faulkner says. "The volume of text messages has exploded, but voice calls aren't declining like anything you would imagine. The text explosion is primarily additive rather than a replacement [for voice]."
Nuance has seen voicemail-to-text quickly become popular in every market where the service has been introduced, and the company is betting on its continued popularity, according to Faulkner. "We find that the voicemail-to-text service is incredibly sticky," he says. "When people get it, they don't listen to voicemails anymore. I literally never do. It's something that people feel that they can't live without, and now we expect to see that in Germany."