Locked In

THE VOICE USER INTERFACE (VUI) is often the first point of contact for customers who are calling into a speech-enabled system. It is the element of a speech application with which callers interact. Creating an interface that is welcoming, clear, concise, and easy to navigate is essential to getting a caller to use the system. Measuring the success of that system will require a combination of metrics: customer satisfaction, return on investment, and functionality.

According to gethuman.com's database of 500 U.S. companies, users gave approximately 82 percent of companies' customer service an F. While this measurement focuses directly on a caller's ability to get to an agent, it appears to paint a fairly accurate picture of today's VUI experiences. "As indicated by the results of our standards testing, poor VUI implementation is very widespread. In fact, less than 10 percent of the 500 companies in our database got a grade of C or above," says Lorna Rankin, director of the GetHuman project. "We are not against the use of automated phone systems if they are well implemented."

"People have been writing about how to design these things correctly for 20-plus years," states Walter Rolandi, founder and owner of The Voice User Interface Co. "Unfortunately, those voices often go unheeded."

Although the state of the industry seems bleak by GetHuman's standards, there is hope. The industry is slowly improving, though it still has a long way to go. "Bad VUI design is widespread across the board, but at least some of the really bad practices that we have seen over the last decade are beginning to fade away," explains Bruce Balentine, executive vice president and chief scientist at EIG Labs.

Poor VUI design is widespread for many reasons, but two contributors to a VUI's failure stand out as the most common: a shift in focus away from the user and the limits of the metrics used to determine success or failure.

While a voice user interface cuts support costs, its primary purpose is to serve the caller's needs. With this in mind, "the single greatest concern is that it is absolutely astonishing how little time or interest is devoted in IVR development to the actual caller," Rolandi states.

The needs of the user are lost when members of marketing, internal IT, external speech recognition designers, and consultants brainstorm about how to meet goals with no representative of the caller's needs present at the meeting. Designers then naturally focus on metrics that apply to the business case, such as containment, ROI, and call completion. As a result, the very criteria used to declare success can coincide with failure from the caller's point of view.

"Part of the challenge that faces the business today is that sometimes the business metrics for what is a success or failure isn't always the same as what the end user or consumer would consider a success or failure," explains Lizanne Kaiser, customer experience designer at Genesys. Containment and call completion alone do not indicate whether a caller has completed her task. They show only that the caller didn't leave the automated system.

"Opting out is not a sign of failure," Kaiser adds. Trapping customers in a system to keep opt-out numbers low is common, and it deters users from calling into the system for assistance.

There are companies that offer surveys and other means of monitoring customer satisfaction with the service experience, but "customer satisfaction isn't good enough," Kaiser says. "Customers will often say that they are satisfied with the customer experience that they are receiving from a particular company, but that doesn't necessarily make them loyal, happy customers."
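
The gap between containment and actual task completion can be made concrete with a small sketch. The Python below is illustrative only: the record fields (`stayed_in_system`, `task_completed`) and the four sample calls are invented for this example and do not come from any real reporting tool. It simply shows that the two rates are computed from different events and can diverge.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CallRecord:
    # Both fields are hypothetical; a real system would derive them from call logs.
    stayed_in_system: bool  # the caller never reached an agent
    task_completed: bool    # the caller actually got done what he or she called for


def rate(flags: Sequence[bool]) -> float:
    """Share of True values, or 0.0 for an empty sample."""
    return sum(flags) / len(flags) if flags else 0.0


def containment_rate(calls: Sequence[CallRecord]) -> float:
    return rate([c.stayed_in_system for c in calls])


def task_completion_rate(calls: Sequence[CallRecord]) -> float:
    return rate([c.task_completed for c in calls])


# Invented sample: three callers stay in the system, but only one of them succeeds.
calls = [
    CallRecord(stayed_in_system=True, task_completed=True),
    CallRecord(stayed_in_system=True, task_completed=False),   # gave up and hung up
    CallRecord(stayed_in_system=True, task_completed=False),   # gave up and hung up
    CallRecord(stayed_in_system=False, task_completed=True),   # opted out, agent helped
]

print(containment_rate(calls))      # 0.75
print(task_completion_rate(calls))  # 0.5
```

In this made-up sample the system looks healthy on containment while half the callers leave without finishing what they came to do, and the one caller who opted out is among those who succeeded.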

Error Recovery Is Key
Designers also neglect other metrics that are key to a highly functional, successful interface. One of them is error recovery. "All VUI interfaces work very well until there is an error. The difference between a good VUI and a bad VUI is that errors are extremely damaging to the interaction in a badly designed VUI and errors get easily recovered in a good interaction," EIG Labs' Balentine says.

"Most VUI designers are focused on how to make the user experience really, really fun when there are no errors," he claims. "Instead they should be focusing on how to make the experience tolerable when there are errors because there are always going to be errors."

These common mistakes are not irreversible, though. When designing and deploying a VUI, there are basic rules and methodologies to keep in mind. Achieving a successful design means starting with the most important factor and carrying out the design from beginning to end with that factor in mind. "If [the user] elects to use automation, what you want to do is to give him automation that is as easy to use or even easier than talking to a person. Users will gladly use automation if it is easier, faster, or more convenient than talking to a person," Rolandi explains. "Design with the user's need in mind, so find out what he wants to do, find out how much time he is willing to invest in doing it, and give the user what he wants. If the user does not want to use automation, don't force him to."

There are many methods for creating a user-centered design; these include determining who the users are and writing up user profiles; identifying use case scenarios; and exposing the VUI to callers to get direct feedback. "This information not only drives the design, but also when testing the application you'll want to circle back to the caller profiles to put yourself in the typical mindset of the callers," Genesys' Kaiser states. 

Menu Complexity
Another common problem with VUIs today is the complexity of the systems and the number of menus involved. "Whenever you are routing from menu to menu, the probability of a misunderstanding or a misrouted call goes up and up and up," Rolandi says. And though usability testing is a well-known best practice, "everyone pays lip service to testing, but unfortunately some folks don't actually do it," he states.
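
A rough back-of-the-envelope sketch shows why the odds compound. It assumes, purely for illustration, that every menu routes the caller correctly with the same probability and that menus fail independently of one another; the 95 percent figure is invented, not a measurement. Under those assumptions, the chance of getting through all the menus is that probability raised to the number of menus.

```python
def end_to_end_success(per_menu_accuracy: float, menus: int) -> float:
    """Probability that every one of `menus` routing steps goes right.

    Assumes equal, independent per-menu accuracy (a simplification).
    """
    return per_menu_accuracy ** menus


# Hypothetical 95 percent accuracy at each menu.
for menus in (1, 3, 5):
    print(menus, round(end_to_end_success(0.95, menus), 3))
# 1 0.95
# 3 0.857
# 5 0.774
```

Even with a per-menu accuracy that sounds high, roughly one call in four goes wrong somewhere along a five-menu path in this toy model.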

Testing an application for usability and accuracy can decrease the probability of a misunderstanding or misrouted call and determine whether the design meets the users' needs. To get an accurate measurement, usability testing requires the system to be tested with actual end users.

While user surveys can be helpful, "testing is the ultimate way to determine how good your design is, particularly in VUIs, because there are so many different ways to do this," claims Juan Gilbert, a professor of computer science and software engineering at Auburn University.

Usability testing will also reveal loopholes in error recovery, Gilbert contends. Failure to recognize that an error has occurred is more than just having a speech recognition system with limited grammars. "An out-of-grammar response doesn't mean that the user presented out-of-grammar speech that is knowable, and so accepting the uncertainty of the input is the key to the design," he says. 

The designer must treat the VUI as an uncertain interface that she doesn't control. If she tries to place the blame for all recognition failure on the speech recognizer, then her "design philosophy guarantees a poor user interface," Balentine stresses.

But in the long run, it all boils down to planning for what may or may not be foreseeable. "There are different methods on progressive prompting that will help you design error recovery, but if you don't plan for errors, then inevitably you are going to have bad remarks on your VUI," Gilbert concludes. 
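
One way to picture progressive prompting is as a loop that plays increasingly explicit prompts and, when they run out, hands the caller to a person instead of hanging up. The Python below is a minimal sketch under stated assumptions, not any vendor's method: `collect_choice`, `scripted`, the prompt wording, and the `CHOICES` table are all hypothetical, and the recognizer is simulated by a scripted list of responses in which `None` stands for silence or a no-match.

```python
from typing import Callable, Mapping, Optional, Sequence

OPERATOR = "operator"

# Hypothetical prompts: each retry is more explicit, and the last offers touchtone.
PROMPTS = [
    "Balance or payments?",
    "Sorry, I didn't catch that. You can say 'balance' or 'payments'.",
    "Press 1 for balance or 2 for payments, or press 0 for an operator.",
]

# Hypothetical table mapping spoken words and key presses to actions.
CHOICES = {"balance": "balance", "1": "balance", "payments": "payments", "2": "payments"}


def collect_choice(
    listen: Callable[[str], Optional[str]],
    prompts: Sequence[str],
    choices: Mapping[str, str],
) -> str:
    """Play each prompt in turn; return the chosen action or OPERATOR."""
    for prompt in prompts:
        heard = listen(prompt)  # None means silence or an unrecognized response
        if heard in ("0", OPERATOR):
            return OPERATOR     # the caller asked for a person
        if heard in choices:
            return choices[heard]
        # Anything else is treated as an error; fall through to the next prompt.
    return OPERATOR             # out of prompts: queue for a person, never disconnect


def scripted(responses: Sequence[Optional[str]]) -> Callable[[str], Optional[str]]:
    """Stand-in for a recognizer: replays canned responses, then silence."""
    it = iter(responses)
    return lambda prompt: next(it, None)


print(collect_choice(scripted([None, "mumble", "2"]), PROMPTS, CHOICES))  # payments
print(collect_choice(scripted([None, None, None]), PROMPTS, CHOICES))     # operator
```

The design choice worth noting is that the error path is planned up front: every unrecognized response has a defined next step, and the final fallback is a person rather than a disconnect.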

With all the planning and testing necessary to perfect the VUI, tools are needed for designing and developing the interfaces. There is a seemingly endless array of tools for this purpose. "I've seen and used some [tools] that are usually in some limited state of development. Some are better than others, but almost everyone in this field has evolved similar methods for documenting dialogues, so there aren't any that particularly jump out," Rolandi says.

Balentine agrees. Despite the range of tools available, none appears to have attained a level of functionality that warrants recognition, he says. In fact, "the failure of the industry to build effective user interface tools is one of the great disgraces of this industry."

The tools that are available seem to place the responsibility for the design entirely on the designer. "All the designers are required to know far more than they can be expected to know to build a good application," Balentine acknowledges.

These tools have been compared to the graphical user interface tools for the Web, which would indicate that "we've lowered the bar so that everyone can build one of these and I think at the same time, we've proven that not everyone should be building one of these," says Dave Pelland, prime VUI engineer at Intervoice. 

The evolution of speech tools has been slower than that of Web tools, but "you are going to end up seeing the whole progression— it's all going to evolve similar to the Web and it's going to become much more pervasive and people are going to think about the user interface," explains Marie Jackson, senior vice president of global marketing for Intervoice.

However, until the speech tools are actually up to par, "it's like Web designers who don't have the tools and have to reinvent windowing every single time that they build a new Web site. And so, part of them are kind-of skilled at it, but the majority of them are just absolutely and completely lost," Balentine says.  

The VUI community has two paths that it can take: it can hold tight to the path of least resistance, as it has done for a while, or it can really shake things up. "It will probably go the way it has been going, which is that everyone will just make up VUI rules as they go and no one will really learn from anyone else's mistakes, in which case the VUI efforts will go away as the technology goes away," Balentine predicts.

However, the user interface community does have an opportunity to step up and bring about a change in the way the market moves. One such way could be that "user interface widgets, something along the lines of Windows, will appear for speech by third-party vendors, and not the speech vendors, that will commoditize VUI design, so VUI designers will just go away and be replaced by toolkits that can be used by IT personnel inside the Fortune 500 companies," Balentine envisions. 

Groundbreaking tools are not the only things that lie ahead for VUI design. A common set of standards that encompasses all VUI design is possible. "Many VUI designers agree with the standards, but have felt unsupported by the corporate community in trying to implement in this way," gethuman.com's Rankin notes.

Taking an overall approach to the design of automated systems rather than segmenting the rules for each niche will create the continuity and support needed to improve design. "If we are going to build speech interfaces that work, then we are going to build really simple speech interfaces that behave mechanically and consistently," Balentine says. Making the speech systems more complex is what is holding back the standards. 

Comparing VUI to the Web, "the actions that the user is engaged in are exactly the same at every Web site and the basics in design are trivial graphical differences. That is the way it has to be with speech and that's the way it's going to be with speech if speech is to survive," Balentine states.

Setting standards that make the VUI easier and more natural for the user is the future of speech. Industry practices that will lead to successful VUI design standards include "elevating the importance of the actual user's needs, making designs more user-centric, and, above all, testing these designs," Rolandi says. 

But implementing these practices and developing these standards has not been and will not be easy. "The truth is this will be a long and difficult climb for most companies, but one worth making."

GetHuman standard v1.0

1. The caller must always be able to dial 0 or to say "operator" to queue for a human.

2. An accurate estimated wait-time, based on call traffic statistics at the time of the call, should always be given when the caller arrives in the queue. A revised update should be provided periodically during hold time.

3. Callers should never be asked to repeat any information (name, full account number, description of issue, etc.) provided to a human or an automated system during a call.

4. When a human is not available, callers should be offered the option to be called back. If 24-hour service is not available, the caller should be able to leave a message, including a request for a call back the following business day. Gold Standard: Call back callers at a time that they have specified.

5. Speech applications should provide touchtone (DTMF) fall-back.

6. Callers should not be forced to listen to long/verbose prompts.

7. Callers should be able to interrupt prompts (via dial-through for DTMF applications and/or via barge-in for speech applications) whenever doing so will enable them to complete tasks more efficiently.

8. Do not disconnect for user errors, including when there are no perceived key presses (as the caller might be on a rotary phone); instead queue for a human operator and/or offer the choice for call-back.

9. Default language should be based on consumer demographics for each organization. Primary language should be assumed with the option for the caller to change language. Gold Standard: Remember the caller's language preference for future calls. Gold Standard: Organizations should ideally support separate toll-free numbers for each individual language.

10. All operators/representatives of the organization should be able to communicate clearly with the caller (i.e., accents should not hinder communication; representatives should have excellent diction and enunciation).


