Speech Technology Magazine

 

IVR Gets Undercover Help

'Secret agents' can make a speech app more intelligent
By Kevin Brown - Posted Aug 7, 2015

Speech recognition is continually improving; note the functionality that Google Now and Apple Siri provide on devices that are notoriously difficult to interact with using text. But has it really changed much in the contact center? Sure, a few large corporations have implemented natural language understanding (NLU) or "conversational" IVR apps using traditional speech recognition. But the seven-digit price tag, covering the initial cost of the software and the ongoing (and very expensive) fine-tuning, keeps all but the largest organizations from deploying these solutions.

Even for well-designed apps, the current state of automatic speech recognition (ASR) still prevents self-service from moving beyond very simple apps limited to a few caller inputs. Even moderately complex speech-enabled apps can lead to the dreaded "I'm sorry, I didn't understand. Could you please repeat that?" which usually drives the caller to press zero or repeat, "Operator, operator!"

So should we give up all hope for speech recognition in the contact center, unless you are a massive organization with plenty of cash? Absolutely not! Human assisted understanding (HAU), a technology developed by Interactions LLC, is an extremely potent and cost-effective means of recognizing caller intent and expanding speech-enabled self-service. Variations of it have been available for more than 10 years, but the market uptake was not high until two or three years ago. The system's adaptive-understanding technology works by taking any utterance the ASR engine fails to understand and sending it to an actual person, along with a screen pop of most likely matches.
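
For readers who think in code, the handoff described above can be sketched roughly as follows. This is a minimal illustration, not Interactions' implementation: the confidence threshold, the n-best list of (intent, confidence) pairs, and the names AsrResult, resolve_intent, and ask_agent are all assumptions made for the example. It also assumes that "fails to understand" means the top hypothesis scored below a threshold.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class AsrResult:
    """Hypothetical ASR output: (intent, confidence) pairs, best first."""
    hypotheses: List[Tuple[str, float]]


def resolve_intent(
    result: AsrResult,
    ask_agent: Callable[[List[str]], str],
    threshold: float = 0.8,      # assumed cutoff for "ASR understood it"
    max_suggestions: int = 3,    # assumed size of the agent's screen pop
) -> Tuple[str, str]:
    """Return (intent, source), where source is "asr" or "agent"."""
    if result.hypotheses:
        best_intent, best_conf = result.hypotheses[0]
        if best_conf >= threshold:
            # The ASR engine is confident enough; no human is involved.
            return best_intent, "asr"

    # Otherwise hand off to the human, with the most likely matches
    # offered as the screen-pop suggestions.
    suggestions = [intent for intent, _ in result.hypotheses[:max_suggestions]]
    return ask_agent(suggestions), "agent"
```

In a real deployment, ask_agent would play the caller's audio to the agent and wait for a selection; in this sketch it is simply any function that takes the suggestion list and returns the chosen intent.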

Over the history of HAU and similar offerings, companies have used various titles for this human assistant, but in business conversations, I tend to use "secret agent." The secret agent matches an utterance to the correct screen-pop suggestion, and the application progresses without the caller knowing that for a few seconds a human was working behind the scenes. The human ear is not perfect, but it is still much better at speech disambiguation than current speech recognition technology. The NLU apps mentioned earlier require significant amounts of tuning to ensure the grammars contain enough words to correctly match nearly all of a caller's possible utterances, but not so many words that the apps misidentify non-targeted words as the correct utterance. Moreover, the tuning to distinguish the voices of two humans, or just to keep the caller's utterances on track (like when a caller says to the IVR, "Tomorrow," immediately after saying to the child/pet, "I said stop it!"), is complex. It falls somewhere between science and art, and it does not come cheap.

In contrast, HAU is very accurate immediately upon deployment, and when combined with additional adaptive-understanding technology, it quickly and cost-effectively tunes the ASR grammars. To be transparent, I have been a major proponent of the capabilities of HAU since it first hit the market. Unveil Technologies offered similar technology, but Microsoft acquired it in 2005 and pulled the offering from the market. At least two others attempted to replicate the technology, but for various reasons, the market did not adopt it. Several years ago, Interactions began offering HAU, just as the market awoke to the understanding of what HAU can do at a cost-effective price point. Late last year, Interactions and AT&T completed a transaction where Interactions took possession of AT&T's Watson natural language understanding technologies, along with several other assets. This acquisition bodes well for the continuous improvement of adaptive-understanding for tuning, as well as its extension into additional customer contact channels.

Extremely successful hosted, speech-enabled virtual assistants with high ROIs are now within the reach of organizations big and small. It is great news for callers and for organizations' service and sales teams alike. Imagine hospitality reservations or the scheduling/changing/cancellation of home service appointments, perhaps even full enrollment into health insurance programs, all accomplished 100 percent via voice virtual agents. To spark your imagination further, what if these voice channel virtual agents perform so well they earn higher Net Promoter Scores and customer satisfaction scores than your employees do when handling such calls? I've been involved with implementation of the technology, and it truly is here and now. 

Kevin Brown is managing director at VoxPeritus, where he specializes in speech solutions and caller experience consulting. He has more than 20 years of experience designing and delivering speech-enabled solutions for on-premises and hosted environments. He can be reached at kevin.brown@voxperitus.com.
