Contradicting a Legend
One of the most well-known names in usability, Jakob Nielsen, stated that being able to remember a user interface was a component of usability. Although this may be true for graphic user experiences, I’d argue that being able to remember a voice user interface is exactly what you don’t want to accomplish in the world of speech and interactive voice response design.
Why would I contradict such a towering figure in our field? It’s because of the inherent differences in how we process visual information and social interactions. A stream of research in conversation shows that people are notoriously poor at correctly remembering their everyday communication. We’re subject to a variety of cognitive biases that cause us to incorrectly remember things that weren’t said, omit things that were said, and hear only what we want to hear. We also tend to significantly overestimate how understandable we make our own points. Some people, such as introverts and English-as-a-second-language speakers, may spend so many of their cognitive resources on the mechanics of conversation, such as taking turns and planning what to say next, that they have very little left over to remember other details about the interaction and their partner.
Generally speaking, what we retain is limited to a brief summary of what we actually experienced, colored by our own expectations and emotional states at the time. And this holds only if we are motivated to recall the conversation at all and aren’t distracted by another task. Our everyday interactions are governed by a goal to achieve something (often information transfer), not by a social goal to get to know our conversational partner at a deeper level. Thus, most of our more mundane interactions fall from memory the moment they end, assuming nothing atypical occurred and the partner didn’t do anything out of character.
Short Memory Spans
Backing up the conversation literature, service delivery research also indicates that we remember interactions with service providers only if they do something unusual, like provide us with unbelievably bad (or good) service. We go into an interaction with a sort of cognitive checklist of how the interaction should progress and how the provider should act. If everything is in line with our checklist, we’re satisfied and go on our merry way. But if something atypical happens, we remember the interaction and are more likely to tell everyone we know, thereby paying back bad service or rewarding good service with word of mouth. The Internet has given word of mouth a vastly wider audience and made it a more substantial outcome.
So what does this have to do with us as designers? If you consider that most VUIs perform mundane service interactions, the research suggests that being remembered is something to be avoided, not achieved. In fact, my own research has shown that customers are most satisfied with a VUI when it does what they want, as quickly as possible, and in a way that is in line with their expectations. This means the voice needs to be professional, the service helpful, and the language polite and efficient.
Many VUI designers know their job is to design a user-oriented interaction. They’ve come to understand that people aren’t interested in getting to know a VUI, listening to a commercial, or hearing a bunch of other stuff; they just want their bank balance or to pay their bill—in, out, and done, thank you very much.
Research also shines a harsh spotlight on those excruciating, circular conversations we’ve all had in which someone wants to change a word, or two, or three. If users don’t remember the specific words, only the gist of the conversation, why would we even get embroiled in this type of discussion? On the flip side, why would a designer adamantly insist on a single, specific wording and get into a power struggle over it?
In both cases, the person doing the asserting is forgetting that a specific word (or phrase) is only one instrument in an orchestra of finely tuned, collaborating design elements. In good VUI design, we need to focus on the big picture: the overall tone and attitude of the VUI’s language and whether it does what users really want it to do. Many specific words could work equally well within a given tone, and a less-than-stellar script can often be improved by an excellent voice talent.
So with respect to Mr. Nielsen, memorability usually isn’t a virtue in VUI design. If you’re overfocused on writing a memorable script, you might just be missing the forest for the veins on the leaves of the trees. In the end, if customers are remembering your IVR, it might just be for the wrong reasons.
Melanie Polkosky, Ph.D., is a social-cognitive psychologist and speech-language pathologist who has researched and designed speech, graphic, and multimedia user experiences for more than 12 years. She is currently a human factors psychologist and senior consultant at IBM. She can be reached at firstname.lastname@example.org.