Do Cultural Differences Make a Difference?
For the record, there is no denying that cultural differences can influence the acceptability of voice user interface (VUI) design and production practices. Furthermore, these differences need not represent historical clashes between civilizations to be problematic.
I have seen English-speaking Canadians cringe upon hearing an opening interactive voice response (IVR) greeting being nasally honked by some vivacious California Valley Girl persona. In this case, the “offended” users, just like their U.S. counterparts, are New World English speakers. Nevertheless, cultural differences do exist, and one manifestation of those differences was nothing less than repugnance toward a pervasive U.S. persona practice.
Having thoroughly acknowledged the reality and effects of cultural differences, I feel safer in asserting my belief that some of these kinds of concerns are overblown, at least with regard to telephony-based VUIs. After all, at this point in history, the telephone is not a monocultural phenomenon. It might be more accurate to say that it is a 20th-century phenomenon.
Granted, there are still remote, unsettled, even unexplored areas of the world, but, for the most part, just about anyone born since 1915 has seen and used a telephone. Practically speaking, I doubt there is much market demand for IVR in places like the jungles of Amazonia. In effect, essentially all market demand is concentrated in the industrialized world.
Regardless of their cultural differences, people all over the industrialized world share a great deal of common behavior—behavior that is second nature and that they take for granted—when using a telephone. They all understand that to make a call—whether on a cell phone, landline, or over the computer—they must first take the phone off hook, dial the number, and await an answer. Upon being answered, all callers intrinsically understand the conversational turn-taking of talking to someone by telephone, and they also understand that one hangs up after the conversation comes to an end.
This may all seem so obvious that it’s silly to discuss. My point, however, is that all telephony-based VUI users share the same knowledge about what happens during a telephone conversation. In effect, the nuts and bolts of telephone communications impose the same constraints on all who communicate by phone, regardless of their cultural backgrounds.
This common experience affords significant cross-cultural homogenization in the design and production of voice user interfaces. We telephone users have more in common than one might think.
Oddly enough, testing for some usability variables can completely ignore cultural and even language differences. Consider the following: A usability analyst is concerned that users become confused during particular states as they attempt to perform their tasks in a proposed IVR. He conducts Wizard of Oz testing with 12 users, each of whom is asked to perform six tasks. He records all sessions and subsequently listens to each recording, carefully measuring the number of seconds between the offset of each prompt in the IVR script and the onset of the user’s subsequent speech (the user’s response latency). The analyst collects these measurements for every user in every state and then averages all of the measurements for each individual state. He notices that several states have unusually long response latencies and concludes that something in those states’ prompts is causing confusion.
Given the behavioral operationalization of confusion in terms of unusually long response latencies, the analyst could perform such an assessment even of a VUI created in a language that he does not understand.
Needless to say, fixing prompt problems would require a native speaker’s command of the language, but many usability variables, such as latencies, incidences of task completion, time to task completion, or incidences of barge-ins, can be measured irrespective of the language of the interface.
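As a rough illustration of the latency analysis described above, here is a minimal Python sketch. It assumes each observation has already been timed by hand as a prompt offset and a speech onset, in seconds. The state names, the sample timings, and the rule for what counts as "unusually long" (a state mean more than 1.5 times the median of all state means) are hypothetical choices made for this example, not part of the analyst's method. Note that nothing in the calculation depends on the language of the prompts.

```python
from collections import defaultdict
from statistics import mean, median


def mean_latency_by_state(observations):
    """Average response latency per dialogue state.

    Each observation is (user_id, state, prompt_offset_s, speech_onset_s).
    Latency is the time from the end of the prompt to the start of speech.
    """
    by_state = defaultdict(list)
    for _user, state, prompt_offset, speech_onset in observations:
        by_state[state].append(speech_onset - prompt_offset)
    return {state: mean(values) for state, values in by_state.items()}


def flag_slow_states(state_means, factor=1.5):
    """Return states whose mean latency exceeds factor * median of all means.

    The cutoff rule is an assumption made for illustration only.
    """
    cutoff = factor * median(state_means.values())
    return sorted(s for s, m in state_means.items() if m > cutoff)


# Hypothetical measurements for two users across three states.
observations = [
    ("u1", "main_menu", 10.0, 10.8),
    ("u2", "main_menu", 12.0, 13.0),
    ("u1", "account_number", 30.0, 31.0),
    ("u2", "account_number", 28.0, 29.2),
    ("u1", "transfer_confirm", 55.0, 58.5),
    ("u2", "transfer_confirm", 60.0, 62.5),
]

state_means = mean_latency_by_state(observations)
slow_states = flag_slow_states(state_means)
print(slow_states)
```

With these made-up numbers, only the hypothetical transfer_confirm state is flagged, which would point the analyst to that state's prompt as the one to review with a native speaker.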
Walter Rolandi is founder and owner of The Voice User Interface Co. in Columbia, S.C. He provides consultative services in the design, development, and evaluation of telephony-based voice user interfaces and evaluates ASR, TTS, and conversational dialogue technologies. He can be reached at firstname.lastname@example.org.