A Fool's Revenge
The great pleasure of a dog is that you may make a fool of yourself with him
and not only will he not scold you, but he will make a fool of himself too.
What a Pity
It's a pity that humans don't act like the dogs that Samuel Butler describes when they are made to feel foolish by IVRs. Or, expressed another way, perhaps the pity is that dogs don't call IVRs…
Presumably out of a desire to create realistic, natural dialogs, many IVRs "fool" their users into thinking that they are speaking with a person. This can be a real pity, because nobody likes to be made a fool of.
What's the Cause?
While such problems often result from persona excesses, far simpler, subtler, more innocent variables can also be the cause. Sometimes no more than the most elementary prosodic variation can do the trick. One example is the length of time that passes between the initial greeting component of an opening prompt and whatever follows. For instance, consider the following prompt:
System: Hello! And thank you for calling the ACME Corporation! How may I direct your call?
Simple enough, right? What could be wrong here?
It's Not Just What You Say, But How You Say It
Let's put aside all other prosodic variables, such as intonation and stress, for the moment and manipulate a single temporal variable. Assume that we are using the exact same prompt shown above but that, using a sound editing software package, we vary the length of the pause between the initial greeting, "Hello!", and the rest that follows. Consider three specific examples:
System: Hello! (200-millisecond pause). And thank you for calling the ACME Corporation! How may I direct your call?
System: Hello! (400-millisecond pause). And thank you for calling the ACME Corporation! How may I direct your call?
System: Hello! (600-millisecond pause). And thank you for calling the ACME Corporation! How may I direct your call?
In the first example, a mere 200 milliseconds pass between the greeting and what follows. When asked, almost all users will report that they perceive this as an awkward, artificial or unnatural pause.
In the second example, 400 milliseconds pass between the greeting and what follows. At this length, most users report that they perceive a normal pause and find nothing in particular to remark on.
In the third example, a full 600 milliseconds pass between the greeting and what follows. Many users will perceive this as an awkwardly long pause.
I have informally investigated this "pause-length" variable and found that when the pause is less than 250 milliseconds, it seems unnatural to users, and when the pause is more than about 575 milliseconds, it tends to signal a turn-taking event. In other words, if the pause is too short, it sounds unnatural. If it is too long, it may still sound "natural," but it makes the user think it is now his or her turn to speak.
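As a rough sketch, the rule of thumb above could be written as a small check that flags a planned pause before a prompt is recorded. The function name, constant names, and labels here are made up for illustration, and the two cutoffs are only the informal estimates given above, not measured constants.

```python
# Cutoffs taken from the informal observations above; treat them as
# approximate starting points, not as measured constants.
TOO_SHORT_MS = 250    # below this, the pause tends to sound unnatural
TURN_TAKING_MS = 575  # above roughly this, users may take it as their turn


def classify_pause(pause_ms: int) -> str:
    """Return a rough label for the pause after a greeting (hypothetical helper)."""
    if pause_ms < 0:
        raise ValueError("pause length cannot be negative")
    if pause_ms < TOO_SHORT_MS:
        return "unnatural"
    if pause_ms > TURN_TAKING_MS:
        return "turn-taking cue"
    return "normal"


# The three example prompts discussed earlier.
for ms in (200, 400, 600):
    print(f"{ms} ms -> {classify_pause(ms)}")
```

With the three example prompts, the 200-millisecond pause is labeled "unnatural", the 400-millisecond pause "normal", and the 600-millisecond pause "turn-taking cue". Real callers will not switch perception at an exact millisecond, so a check like this is best used to prompt a listening test rather than to replace one.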
One More Example
Remember when home answering machines were still a novelty? Did you ever call someone who had deliberately recorded an answering message that was intended to make you think you had reached your party? The phone is answered and you hear your party's voice exclaim:
System: Hello? (600-millisecond pause). Gotcha! I'm not really home so just leave your message at the beep!
The system said, "Hello?" and you started talking, only to be interrupted by, "Gotcha!"…
How did that make you feel? Warm and fuzzy? Crippled by laughter? Moderately annoyed?
A Fool's Revenge
Well, unless you are a dog, I can only hope that the experience didn't make you feel foolish because most people get angry when you make them feel foolish; and in the IVR world, their anger is most commonly expressed by hanging up.