Speech Technology Magazine

 

Natural Language Meets IVR

By Mark Nickson - Posted Apr 1, 1998
Almost all voice processing systems currently require the caller to have a touch tone phone to use the system. Voice mail / auto attendant systems and interactive voice response systems still rarely employ voice recognition. Systems that do employ voice recognition most commonly use number recognition (i.e. the system understands the numbers 0 through 9 and the words yes, no, and oh).

Think of a recent dialog you may have had with a voice recognition equipped system - "For sales - press or say 1, For service - press or say 2, For the documentation department - press or say 3." Callers are essentially forced to navigate by mapping their desired selection to a number.

First generation voice recognition has primarily been compartmentalized into three groups: number recognition, alpha recognition and word recognition.

Number recognition in its most basic form has required the caller to wait for a short beep tone prior to speaking each digit.

Alpha recognition and word recognition as implemented in first generation design also force callers to speak each letter after a tone, or expect the caller to speak the word to be recognized after a tone.

For word recognition, in its most advanced stage, the first generation recognizer allows a caller to "barge through" and essentially "blurt" out the word that is expected to be recognized.

Natural Language Recognition

Natural language recognition allows a user to speak in a "conversational tone." The recognizer essentially spots the key words that make up a dialog. With natural language recognition, the user can say "I'd like the sales department please," "sales please," or "gimme sales" and the system will branch to the sales area.
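The word spotting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it works on a text transcript rather than on audio, and the key word table and the function name spot_destination are hypothetical, not part of any particular recognizer.

```python
import re

# Hypothetical key word table. The article names only the sales example;
# the other entries are assumptions borrowed from the menu described earlier.
DESTINATIONS = {
    "sales": "SALES",
    "service": "SERVICE",
    "documentation": "DOCUMENTATION",
}

def spot_destination(utterance):
    """Return the destination for the first known key word, or None."""
    # Lowercase the transcript and split it into words, ignoring punctuation.
    words = re.findall(r"[a-z']+", utterance.lower())
    for word in words:
        if word in DESTINATIONS:
            return DESTINATIONS[word]
    return None  # no key word spotted; the application would reprompt

for phrase in ("I'd like the sales department please",
               "sales please",
               "gimme sales"):
    print(phrase, "->", spot_destination(phrase))
```

All three phrasings reach the same branch because only the key word matters; the surrounding words are ignored.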

IVR with natural language recognition

By using natural language recognition, the call can progress more quickly. The unfavorable aspects of the first generation recognizers, such as waiting for a tone before speaking each digit, letter or word, are eliminated. Natural language recognition employs echo cancellation along with word spotting to allow a caller to speak in a free, conversational manner, so long as the required words are spoken at some point in the dialog.

As compared with first generation recognizers, natural language recognition can be easier to implement. Natural language recognition can be engaged to look for certain words and translate the selection into a return code similar to a touch tone or "say number" scenario. The more flexible definition of word sets allows the application to take on a more pleasant human interface where the caller can say "size eighteen please" instead of the old way of "one," "eight." As with first generation recognition, an IVR system's call flow will differ between the touch tone and speech enabled versions of the same application.
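As a rough sketch of the return code idea, the following Python maps both the conversational phrasing and the older digit-by-digit phrasing to the same digit string that a touch tone entry would produce. The word tables and the function name return_code are assumptions made for illustration, and the input is a text transcript, not audio.

```python
# Hypothetical word sets; a real application would define its own.
UNITS = {
    "zero": 0, "oh": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}
TEENS = {
    "ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}

def return_code(utterance):
    """Translate spotted number words into a touch-tone-style digit string."""
    words = utterance.lower().replace(",", " ").split()
    digits = []
    for word in words:
        if word in TEENS:
            digits.append(str(TEENS[word]))
        elif word in UNITS:
            digits.append(str(UNITS[word]))
        # Any other word ("size", "please") is simply ignored.
    return "".join(digits) or None

print(return_code("size eighteen please"))  # conversational phrasing
print(return_code("one, eight"))            # first generation phrasing
```

Because both phrasings yield the same return code, the rest of the application can handle the result the same way it would handle keyed digits.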

To conserve on valuable natural language recognition resources, the application can employ the traditional up front screening prompt "if you're calling from a touch tone phone - please press 1 now." In this way, the majority of users can move quickly using traditional touch tone. Only those who indicate they don't have a touch tone phone are handled with the more sophisticated and expensive natural language recognition resources.
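The screening step might be sketched as follows. The RecognizerPool class, the route names, and the fallback used when every recognizer is busy are all assumptions added for illustration; the article does not say what happens in that case.

```python
class RecognizerPool:
    """Hypothetical pool of scarce natural language recognition resources."""

    def __init__(self, size):
        self.free = size

    def acquire(self):
        """Reserve one recognizer; return False if none is available."""
        if self.free > 0:
            self.free -= 1
            return True
        return False

    def release(self):
        """Return a recognizer to the pool when the call ends."""
        self.free += 1

def route_call(pressed_key, pool):
    """Route a call based on the response to the screening prompt.

    pressed_key is the key pressed during "if you're calling from a touch
    tone phone - please press 1 now", or None if the caller pressed nothing.
    """
    if pressed_key == "1":
        return "TOUCH_TONE"        # no recognizer is consumed
    if pool.acquire():
        return "NATURAL_LANGUAGE"  # caller did not indicate touch tone
    return "QUEUE"                 # assumed fallback, not from the article

pool = RecognizerPool(size=1)
print(route_call("1", pool))   # touch tone caller
print(route_call(None, pool))  # gets the only recognizer
print(route_call(None, pool))  # pool exhausted, falls back
```

The point of the design is that touch tone callers never reserve a recognizer, so a small pool can serve the minority of callers who need it.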

Mark Nickson is president of DAC Systems, 60 Todd Rd. Ste 1B, Shelton, CT 06484 and can be reached at 203-924-7000, 203-944-1618 (fax).
