January 9, 2004
By Walter Rolandi, Founder, The Voice User Interface Company, LLC
The Human Factor

Casting Users in Parts as Parts

Common Sense
I recently gave a talk on the basics of speech technologies at a gathering of utility industry customer support executives. The attendees were honestly concerned about the quality of their existing customer support systems and eager to learn how speech technologies might help. One of the points that I made during the talk struck a chord with one of the attendees and sparked a lively discussion following the presentation. During that discussion, I would hear the attendee say such things as “That’s so obvious,” “That’s common sense” and “Why didn’t I think of that?” What was so obvious?
During the presentation, I entreated the audience to avoid the temptation to think of their users as “parts” in their systems. Regrettably, I noted, this is something that even the best voice user interface (VUI) designers can tend to do. Thinking of users as “parts” is actually a natural and understandable inclination: When we think about systems, we cannot help but think systematically. And when we think systematically, everything starts to seem purposeful, regular and methodical. The trouble begins when we start thinking of our users as purposeful, regular and methodical. Or, in other words, trouble can come when we start thinking of our users as predictable. When we are convinced that we can predict what our users will say and do, the user is no longer any more mysterious than a prompt or a timer or any other component of the system. The user becomes a “part.”

Hubristic Heartache
The situation actually gets worse, because assuming that we know what a user will say is only “part” of the problem. When we cast users in parts as parts, we additionally assume that we understand the emotional disposition that the user brings to the application. This tendency is widespread, and scarcely anyone takes steps to avoid its likely consequences. Presuming to know the emotional disposition of the user almost always leads to additional usability problems.

The Good News
The good news is that there are some simple guidelines for app design and grammar development that can help designers control their presumption to know everything the user will say and do. Many of these guidelines have been discussed previously in this column (See particularly “Is it Stupid to be Clever?” Speech Technology Magazine, September-October 2002), so this article will focus instead on methods to better accommodate the emotional disposition of the user.

The Common Sense of User-Centric Design
Designers can avoid many problems that stem from user emotional disposition by profiling their users prior to design. The emotional disposition of a user should be factored into the design requirements just like any other constraint that the design must meet. Designers can address the issue by asking, answering and analyzing some basic questions. The questions should be posed on a task-by-task basis for each application:

What does the application do for the user?
How important might that task be to the user?
How much effort might the user happily invest to perform that task?
  How many steps?
  How many turns?
  How much time?

What are the circumstances that might motivate the user to perform this task?
How do those circumstances affect the emotional disposition of the user?
  Is the user at ease and relaxed?
  Is the user worried?
  Is the user angry?

What can be done to ensure a more appropriate response to those circumstances?

A simple exercise may help to illustrate how no single design motif could be appropriate for all manner of applications and how the motives and emotional disposition of the user should be factored into an application. For each of the following example applications, users and tasks, ask yourself the above questions:

Case One:
App: pizza takeout system
User: someone wanting a pizza delivered
Task: specify Pizza Margherita; delivery address; payment type

Case Two:
App: banking system
User: someone who made a mortgage payment yesterday
Task: get checking balance

Case Three:
App: utility company system
User: someone whose power has been shut off
Task: restore power

If you follow through with the exercise, you will probably note that there are some obvious differences in the motivation and emotional disposition of the users in these apps. The pizza seeker is relatively mellow and will probably be happy to use a speech system as long as doing so presents no greater hassle than talking to a person or ordering from the competition. The check floater is likely to be anxious and therefore likely to be annoyed by a multi-turn interactive process. They will complete the task, no matter how annoying, because they feel that they must. But an annoying experience under these circumstances will not soon be forgotten, and this should concern companies that really care about the quality of their customer service. The person whose power has been shut off is probably in the midst of some kind of major crisis, and while they may be motivated to slog through an automated system, one wonders if this is even an appropriate place for the non-human touch.
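
For designers who keep their requirements in code, the exercise can be written down as a set of task profiles. What follows is only a minimal sketch in Python. The names `Disposition`, `TaskProfile` and `design_note` are hypothetical and belong to no speech platform. The dispositions follow the reading of the three cases given above, and the note attached to each disposition is an assumption made for illustration, not a rule.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    """Emotional dispositions named in the profiling questions and cases."""
    RELAXED = "at ease and relaxed"
    WORRIED = "worried"
    ANGRY = "angry"          # asked about in the questions; no case below uses it
    IN_CRISIS = "in crisis"  # the utility caller, as described in the column


@dataclass(frozen=True)
class TaskProfile:
    """One task, profiled before design begins (hypothetical record layout)."""
    app: str
    user: str
    task: str
    disposition: Disposition


def design_note(profile: TaskProfile) -> str:
    """Return a rough design note for a profile.

    The mapping is an illustrative assumption: worried and angry users
    are treated alike here, which a real project might not want.
    """
    if profile.disposition is Disposition.RELAXED:
        return "multi-turn dialog is acceptable if it is no more hassle than the alternatives"
    if profile.disposition is Disposition.IN_CRISIS:
        return "question whether an automated system is appropriate at all"
    return "minimize turns; an annoying experience will be remembered"


PROFILES = [
    TaskProfile("pizza takeout system", "someone wanting a pizza delivered",
                "specify pizza, delivery address, payment type", Disposition.RELAXED),
    TaskProfile("banking system", "someone who made a mortgage payment yesterday",
                "get checking balance", Disposition.WORRIED),
    TaskProfile("utility company system", "someone whose power has been shut off",
                "restore power", Disposition.IN_CRISIS),
]

if __name__ == "__main__":
    for profile in PROFILES:
        print(f"{profile.app}: {design_note(profile)}")
```

The point of the record is not the code itself but the discipline: every task gets a disposition written next to it before any prompt or grammar is drafted.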

Parting Shot
Just as the fellow said after my presentation, this all sounds so obvious, when you think about it. The problem is that so few designers seem to give it much thought.

Dr. Walter Rolandi is the founder and owner of The Voice User Interface Company in Columbia, S.C. Dr. Rolandi provides consultative services in the design, development and evaluation of telephony based voice user interfaces (VUI) and evaluates ASR, TTS and conversational dialog technologies. He can be reached at wrolandi@wrolandi.com.
