Romancing the Caller [and the Customer]

Writing directed dialogue voice user interface (VUI) prompts for a living is like being on a blind date every single day of your life. You end up asking yourself the same two fundamental questions VUI designers have to ask themselves every moment they spend slaving at their craft:

What does this person want from me? and
How can I get what I want from him?

Like on a blind date, in VUI design you’re working two angles at once: yours and the date’s. You’re trying to get your interests to align, trying to elicit the best response from him or her, and, of course, all the technical tricks you can ply come into play. You can set the mood for your date with a candlelit dinner at that romantic French restaurant that got four-and-a-half stars and only three dollar signs on some review site you found, and you can set it for your caller with that well-reviewed, natural-sounding text-to-speech engine you found for only 600 dollar signs.

You can buy your date flowers, wear your nice jacket, and build as deep a grammar as humanly possible, but technology and roses without that certain spark mean you’re sunk. Even with the best recognition in the world, if you don’t relate to your caller appropriately, then you’ll find your date getting up, splashing the Merlot in your face, and asking to speak to an agent.

Be Descriptive and Distinct

To make it work, what you say matters most. It’s about the questions you ask and how you navigate from them. In finding the right words to ask a domain’s caller (or your date), VUI designers seem to agree on two metrics for good prompt writing for directed dialogue. Prompts have to be descriptive, and they have to be distinct.

Descriptive prompts are immediately understandable to a caller. Susan Hura, vice president of user experience at Product Support Solutions and founder of SpeechUsability, has heard her share of nondescriptive prompts.

“One of my favorite ones that you hear in all kinds of voice systems is account information,” she says. “What the heck is account information? That could be anything about my account!”

It could include dialogues where she changes her address, checks on her account status, or checks on what she has ordered, Hura says. Descriptive prompts, however, narrowly define themselves in unambiguous terms, like order status or address change.

Distinct prompts, on the other hand, don’t have crossover. They’re clearly delineated, separate entities. Hura says the classic example of an indistinct prompt comes from a human resources system. An employee at the company wants to know about the matching funds his company puts into his 401(k), and the choices in the system are benefits, payroll, and policy. The employee can’t make heads or tails of where his question fits in. The 401(k) is a benefit, but by the same token, it also relates to payroll because it is being paid by the company. Then again, couldn’t it also be a question of policy?

When a user can’t make sense of the question, he’s more likely to ask for an operator or just hang up, which more or less spells the failure of an IVR. Cutting down on the number of agents needed to handle a domain is often the main reason that an IVR is implemented in the first place. If a system can’t deliver on that, then it is a waste.

Beyond the more basic concepts of descriptiveness and distinction, more complicated, larger issues challenge designers in prompt writing. One of the biggest can be the customer. You might think of the company commissioning an IVR as the chaperone on your IVR-building blind date, hovering over the project, dictating terms, sometimes hampering the designer’s ability to work his mojo. Many designers agree that one of the worst moves you can make is to let the client dictate the prompt language.

“It’s really interesting,” says Deborah Dahl, principal at speech and language consulting firm Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interaction Working Group. “Every layperson thinks he can design dialogue for some reason. Where the random layperson would never think of designing a speech recognition engine grammar or hooking up a network server, he always has an impression about the VUI design.”

Part of the problem is that a good prompt appears deceptively easy to write precisely because it is easily understood. Almost every person on the planet uses language every day of his life and builds meaning from words so often that it’s practically an unconscious process. No sooner does he think than the words form in his head. So if presented with a seemingly straightforward and simple interaction, the layman, as Dahl calls him, feels capable of competently writing a prompt. What could be easier than one sentence?

But prompt-writing isn’t as easy as just asking a question. IVRs are still a long way from being conversant with human beings. Even the best systems have limited vocabularies and can’t read human intention with human fluidity. Talking to an IVR still isn’t completely like talking to a person.

As Hura points out, “The fact is, a person never has to get you to speak exact words. A person will understand what you want if you say it in a different way, and most of the time in an IVR, that’s just not true.”

Outside of perhaps law enforcement, there’s no analogy for this kind of specific response solicitation in everyday speaking. How often in conversation do you just approach someone, start listing options, and ask him to repeat one exactly? The tools and strategies we use in everyday conversation don’t apply. Someone outside of VUI design, who doesn’t have a working knowledge of its principles and practices, probably isn’t going to have tremendous success writing a prompt.

Clients insist all the same. One of the first mistakes that they make is the inclusion of internal language that is part of their corporate culture but doesn’t necessarily have resonance with a user. A classic example that designers tend to bring up is the word “associates.” Many companies, for reasons of relations management, call employees associates rather than employees or workers. It’s often meant as a semantic gesture of inclusion, but it’s entirely internal. It doesn’t have much currency outside of the company’s cubicle walls.

“[A client] might say something like ‘In our company we refer to everyone as an associate, so we must refer to the employees as associates in the dialogue, because that’s our policy,’” Dahl says. “That goes against this whole goal of getting people to use the IVR and accomplish their goals.”

Callers into a domain’s IVR have no use for that kind of distinction and often aren’t even familiar with the term in its context. Such internal language causes unnecessary confusion, and, according to several designers, pressure to include it needs to be resisted at every turn.

Likewise, language from marketing campaigns can also prove problematic for directed dialogue. In natural language systems, marketing language sometimes works its way into user responses and requires the systems to be constantly updated and tuned to maintain good routing numbers. In a directed dialogue, it sometimes works inversely.

“We had a client several years ago who, on its Web page, came up with this cute branding term for setting up recurring payments,” Hura says. “They were called ‘paymatics.’”

The term made sense on the company’s Web site, where it was launched, because it had a larger context, she explains. The paymatics option was included with a number of other payment options. The context allowed users to infer that it was probably related to automatic payments and a portmanteau of just those words, “automatic” and “payment.” It probably also helped users that there were other options, none of which included setting up automatic payments.

For the Web, the use of the term paymatics made sense as a play on words.

On the phone, however, paymatics doesn’t make sense. A menu tree doesn’t contextualize its options in the same way. A user can’t see them all at once, double-check, and compare instantly, drawing whatever relationships he can among them. Rather, he is dealing with them one at a time and is forced to rely on memory to make comparisons to earlier options.

“If users don’t understand the terminology instantly, you’ve lost them,” Hura says. “They’re never going to go back and choose paymatics, and you know what they’re going to do instead? They’re going to ask to speak to an agent.”

For a company, that means a drain of resources and higher costs. More than that, though, a user’s confidence in the system is going to be eroded. If he calls back at any point, he’s likely to have diminished faith in the system’s abilities, get annoyed, and perhaps even begin harboring negative feelings toward the company because of its IVR’s failings. Put another way, even if an enterprise provides the greatest service or most useful product in the world, if the front door to its house is a mess, then who’s going to want to come in?

Reality Check

Another problem VUI designers might have to confront with clients is a reluctance to include negative options—such as the ability to register a complaint or even cancel an account—anywhere in the menu. In other cases, when such options are included, they’re often buried deep within the IVR and difficult to find.

“Maybe the customer doesn’t want to admit that anybody is ever going to want to cancel their accounts,” Dahl suggests. “They think that if you call wanting to cancel your account and can’t find the choice, you’re going to be happy. It’s important to face reality.

“It’s not going to make someone happier if they can’t cancel their account because of a confusing IVR menu.”

Moreover, if a user is dead-set on canceling his service, then making it difficult to do so does nothing to endear the company to him, especially if he might renew the service at a later date.

“I think that’s another one of those cases where it’s not a smart business decision. That’s what retention departments are for, right?” Hura asks. “You don’t want someone who is already getting set to cancel their service to then get pissed off because they can’t find it in the menu. For those people, by the time they do get to customer retention, they’re already annoyed, so why don’t you recognize that and say, OK, we’re going to transfer you to someone who can help you with that, and then let those people work their magic?”

For reasons like this and problems of language, enterprises ought to put some trust in the expertise of their designers and reserve judgments about design until at least some limited statistical data, even from usage tests, is available. Until then, nothing can really tell a company whether the IVR it is designing is going to work. In that absence, the best resource is the experience of a designer, her intuition, and the accumulated knowledge of her trade. To second-guess her before that is just shooting in the dark.

Wrong Answer

This goes for designers, too, suggests Bill Scholz, president of the Applied Voice Input/Output Society and founder of consulting firm NewSpeech Solutions.

“Don’t trust the quality of your assumptions without validating them with actual customer responses to your prompts,” he cautions. “Too often people will have an intuitive sense of what they think will make a good prompt, and they will then throw it into an application and won’t realize that three-fourths of the responders are confused by the prompt and answer it incorrectly.”

Good VUI design means very carefully working over a prompt with run-time analytics, Scholz maintains.

SpeechCycle’s chief technology officer, Roberto Pieraccini, would seem to agree. “I have been working a lot trying to push the notion of VUI design as an art more toward VUI design as a science,” he says.

According to Pieraccini, who considers himself something of a data-driven evangelist, “experience can lead you to a certain point after which it’s hard to understand which prompts will work better in different situations.”

He says that designers have to go beyond just usage tests and have robust and deep data pools to really implement a system.

The model that emerges from Scholz’s and Pieraccini’s views is one heavily reliant on statistics, with designers doing some intuitive work at the start, but very quickly testing the system and making adjustments as needed before advancing too far into the process. This kind of statistical approach makes it easier to come back to clients and argue away any misconceived notions they might be trying to push for, too. It also has the benefit of hard numbers and proven results.
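To make the statistical approach concrete, here is a minimal sketch of the kind of tally a designer might run on call logs to compare two prompt wordings. Everything in it is an assumption for illustration: the log records, the outcome labels, and the `success_rates` helper are hypothetical and do not come from Scholz, Pieraccini, or any particular IVR platform.

```python
from collections import defaultdict

# Hypothetical call-log records: (prompt wording, what the caller ended up doing).
# The outcome labels are invented for this sketch, not taken from a real system.
CALL_LOG = [
    ("account information", "agent_request"),
    ("account information", "hangup"),
    ("account information", "task_completed"),
    ("account information", "agent_request"),
    ("order status", "task_completed"),
    ("order status", "task_completed"),
    ("order status", "agent_request"),
    ("order status", "task_completed"),
]


def success_rates(log, success_outcome="task_completed"):
    """Return the share of calls per prompt wording that ended in success."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for wording, outcome in log:
        totals[wording] += 1
        if outcome == success_outcome:
            successes[wording] += 1
    return {wording: successes[wording] / totals[wording] for wording in totals}


rates = success_rates(CALL_LOG)
for wording, rate in sorted(rates.items()):
    print(f"{wording}: {rate:.0%} of callers completed their task")
```

With real logs, a gap like the one between these two invented wordings would only be worth acting on once enough calls had been collected to rule out chance, which is the point of testing early and adjusting before the design is locked in.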

Conflict over how to approach prompt writing isn’t reserved for just the space between clients and their designers, though. Even among designers, contention takes place at a pretty fundamental level.

Hura identifies one that she feels looms in the background of all conversations: whether IVRs ought to speak conversationally and naturally, as a human being would. “There are still some people out there who are like, ‘You want this IVR to speak exactly the way a human being would, and if a person wouldn’t say it, then the IVR shouldn’t either,’” she says.

This is an argument that has roots in the same conflict that arises between designers and clients, where the client thinks he is capable of writing prompts because it’s just language he uses every day. Between designers, however, it takes place at a much higher level. Designers often reach for a level of prompt writing that approaches the kind of conversational everyday language a client uses.

For advocates of the more “natural” prompt, the goal is to put callers at ease and make them feel like they’re going to get the results they want—that the IVR is as good as talking to a live operator. Producing that kind of IVR is the highest aspirational goal a designer can have. For Hura, however, it’s a potentially disingenuous one because the technology cannot yet meet that goal.

“The most important thing is that it works,” she says. “This is a tool. People aren’t calling up to have a chat. People are calling because they have some task that they need to accomplish. The quality of the conversation is less important than the quality of the interaction, in general, meaning was there utility in this? Were they able to accomplish what they wanted to?”

She’s quick to add that she’s not arguing for a robotic IVR. She still thinks it should be friendly and as natural as possible, but if there’s a less natural way to write a prompt that’s going to provide better results, she’s all for that—and in some cases, it breaks down that way.

In the end, the best IVR is the one that gets the job done today.
