August 22, 2008
By Susan Hura, Principal, SpeechUsability
Interact

Getting Users to Do What We Want

It’s a fact of life when dealing with technology: Sometimes the user needs to behave in a particular—and not entirely natural—way for the technology to work. In these cases, it’s incumbent on the designer to inform the user how he must behave. We can use an interaction metaphor that allows the user to translate his natural behavior into the behavior required by the system. We’re all familiar with the PC desktop interaction metaphor in which we open files or search for documents by hitting the right combination of keys or clicking the mouse. While there’s nothing natural about clicking a mouse to open a file, once we become accustomed to it, the unnatural behavior becomes so ingrained that it seems normal.

As much as we advertise speech as a more natural mode of interacting, we’re not immune to the problems of trying to get users to speak in somewhat unnatural ways to make the technology work. (Yes, so-called natural language exists, but until we have natural language understanding to back up our natural language recognition, this will always be a brute-force and time-consuming method that isn’t always financially feasible.) For grammar-based systems, we need to teach the user what to say and how to say it. We now have an arsenal of techniques for gently coaxing the sort of speech needed from users:
•   We know that long, involved instructions don’t work, but that modeling appropriate responses does (e.g., “Tell me the date of service, like March 14, 2008”).
•   We know that users have a much better chance of speaking words we can handle if we keep menu options brief, descriptive, and distinct.
•   We know that a good grammar can save even imperfect prompting by including likely response variations and weighting those that make the most sense. This includes accepting no-plus-correction responses at yes/no prompts.
•   We know that it’s sometimes better just to present the options available than to ask a conversational question that invites out-of-grammar responses (e.g., we don’t ask “What else do you need?” but inform users, “You can say repeat, check another item, or I’m done”).
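
To make the grammar point concrete, here is a minimal Python sketch of a yes/no handler that accepts likely response variations and no-plus-correction answers such as "no, March 15." The phrase lists and the function names are hypothetical, chosen only for illustration. A deployed system would express this in its recognizer's own grammar format and tune the phrases and weights from real caller data, and this sketch does not attempt weighting at all.

```python
import re

# Hypothetical phrase lists; a real grammar would be tuned from caller data.
YES_WORDS = ("yes", "yeah", "yep", "correct", "that's right", "right")
NO_WORDS = ("no", "nope", "incorrect", "that's wrong", "wrong")


def _strip_lead(text, phrases):
    """Return the text after a leading phrase, or None if no phrase leads."""
    # Try longer phrases first so "nope" is not read as "no" plus "pe".
    for phrase in sorted(phrases, key=len, reverse=True):
        match = re.match(rf"{re.escape(phrase)}\b[\s,.]*", text)
        if match:
            return text[match.end():].strip()
    return None


def parse_confirmation(utterance):
    """Classify a reply to a yes/no prompt as (answer, correction)."""
    text = utterance.lower().strip()
    if _strip_lead(text, YES_WORDS) is not None:
        return ("yes", None)
    rest = _strip_lead(text, NO_WORDS)
    if rest is not None:
        # Anything after the "no" is treated as the caller's correction.
        return ("no", rest or None)
    return ("no_match", None)
```

With this sketch, a caller who answers a confirmation prompt with "No, March 15" is treated as having said no and supplied the corrected value in a single turn, so the dialog does not have to reject the reply and ask again.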

Using these techniques allows us to guide the user to the somewhat unnatural way that we need him to speak to interact with our systems. Because our metaphor is spoken language, which has lots of built-in rules and conventions that users unconsciously follow, we don’t have to formally instruct users on how to interact with the system. We’re usually able to influence user behavior enough to make the technology work without calling attention to it.

However, some situations require us to influence not just the words that the user speaks but a more global behavior. The case I’m thinking of is one that many VUI designers have encountered: getting users to give enough information to route the call to the appropriate live agent. Many companies now provide freer access to live agents to improve the customer experience, just the way VUIDs have been asking them to. But this opens a can of worms: if callers go straight to an agent, we can’t use the IVR to route them to the right one, which defeats one of the purposes of having an IVR at all. Sometimes we can figure out which agent to route to from the selections the caller has already made, and in other cases we can offer a simple A-or-B choice (like “Do you need sales or service?”). When we need more fine-grained information to route the call, the intrusion is potentially annoying to callers because we’re not immediately doing what they asked. But since this is ultimately going to save them from being misrouted and wasting time, we have to find a way to motivate callers to do what we’re asking.
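
The routing decision described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the selection names, the queue names, and the function `handle_agent_request` are all hypothetical, and a real IVR would make this decision inside its own call-flow platform.

```python
# Hypothetical mapping from earlier menu selections to agent queues.
QUEUE_BY_SELECTION = {
    "billing": "billing_agents",
    "new order": "sales_agents",
    "repair": "service_agents",
}


def handle_agent_request(selections):
    """Decide what to do when a caller asks for a live agent.

    `selections` is the list of menu choices made so far, oldest first.
    """
    # Check the most recent selection first; it is the best routing clue.
    for selection in reversed(selections):
        queue = QUEUE_BY_SELECTION.get(selection)
        if queue:
            return ("transfer", queue)
    # Nothing so far identifies a queue, so fall back to a simple choice.
    return ("ask", "Do you need sales or service?")
```

If the caller has already chosen something that identifies a queue, the sketch transfers without asking anything further. The follow-up question is reserved for calls where the selections made so far do not settle the routing.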

So what’s the best way to ask a caller for the additional information we need for routing? I recently encountered two prompts that shared this goal but had vastly different outcomes. The first prompt was “OK, I can transfer you to an agent after you make a selection,” followed by a replay of the prompt the caller had just heard. This prompt affirms the caller’s request, promises to transfer him, and then tells him what to do to make the transfer happen. The second prompt was “OK, I’ll get you to an agent, but first please tell me if you need help with A, B, C, or D.” This one does the same things as the first. But, in spite of the similarities, one of the prompts worked beautifully and the other failed miserably. Which one do you think worked? Stay tuned to future columns for the answer.

Susan Hura, Ph.D., is vice president of user experience at Product Support Solutions. She can be reached at shura@psshelp.com.
