Getting to the Bottom of IVR Abandonment Rates

Of late, I've been brought on to troubleshoot a certain type of project. The projects have an eerie commonality: The data shows the speech system is performing fine. Speech accuracy is high. Tunings are performed twice yearly, with some minor cosmetic adjustments. All appears to be running smoothly.

Yet interactive voice response (IVR) abandonment rates are high, opt-outs are common, and customers despise the automated system.

And the cry I keep hearing is "Why? What are we missing? We need more data!"

Actually... we don't. We have more data than we know what to do with. We have state-of-the-art tools that collect, group, list, average, graph, crunch, calculate, and juggle it. The problem is not a lack of data.

Computers can analyze vast quantities of data. They can detect patterns, trends, and points of interest, but it takes a human to divine what those mean and what to do with them. Even if a computer can detect emotion, we still need a human with emotional intelligence to hypothesize why that emotion is occurring. Data can point us to factors that correlate with high abandonment rates across calls, yet it still takes a human to glean why.
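To make that distinction concrete, here is a minimal sketch of the kind of correlation a reporting tool can surface. The call log, field names, and numbers are all invented for illustration; no real product's schema is implied. The code can tell us that barge-ins and reprompts track abandonment, but not why callers are barging in.

```python
from statistics import mean

# Hypothetical per-call features a tuning report might extract.
# Field names and values are illustrative only.
calls = [
    {"barge_ins": 0, "reprompts": 0, "abandoned": 0},
    {"barge_ins": 1, "reprompts": 0, "abandoned": 0},
    {"barge_ins": 3, "reprompts": 2, "abandoned": 1},
    {"barge_ins": 4, "reprompts": 3, "abandoned": 1},
    {"barge_ins": 2, "reprompts": 1, "abandoned": 0},
    {"barge_ins": 5, "reprompts": 4, "abandoned": 1},
]

def pearson(xs, ys):
    """Plain Pearson correlation; no external libraries required."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

abandoned = [c["abandoned"] for c in calls]
for feature in ("barge_ins", "reprompts"):
    r = pearson([c[feature] for c in calls], abandoned)
    print(f"{feature}: r = {r:.2f}")
```

The output is a pair of correlation coefficients, nothing more; interpreting whether those barge-ins signal confidence or exasperation is exactly the human step the paragraph above describes.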

All too often during tunings, the "why" of customer dissatisfaction is presumed to be a speech accuracy issue. This sends speech scientists spinning their wheels to squeeze out an additional fraction of a percent of recognition accuracy. But high accuracy paired with low caller satisfaction points to interface issues and a disconnect between evolving caller and business goals. All the clues are in the data, but discerning the true "why" of caller discontent often requires a human perspective not typically prioritized on tuning projects.

As advanced as our tools and engines are, human beings still have the edge when it comes to listening.

Listening gives us more than the words; it gives us the emotional and mental picture of the caller. It gives us the motivations behind conversations. Listening gives us the subtle tones callers use that provide meaning beyond the literal. Listening allows us access to countless additional data points humans process naturally without thinking about it.

Instead of outsourcing the listening to a transcriber and reading the reports computers spew out in response to modular queries, we should be listening to real audio and applying the best of our emotional intelligence.

What should we listen to? Full call audio gives a holistic perspective of an evolving customer interaction. Targeted modular utterances allow us to see patterns of interaction across multiple callers. Both are indispensable, and allow us additional perspectives in tuning projects. Here are some areas where human-centric listening can contribute to devising better IVR strategies:

  • Content. Instead of focusing on word content, a human factors expert can evaluate the interaction content. How are callers interacting with the system? Are they listening to the full content of prompts? Are barge-ins motivated by confidence, or impatience? Is the call an outlier or part of a recurring pattern?
  • Silence. What is the caller not saying? When is the caller silent? Are those audible breaths due to proximity to a device or an expression of exasperation? Are background noises due to geography or originating from the user, such as the repetitive tapping of a pen or the rustling of a purse?
  • Connotation. What is the meaning the caller is attaching to a phrase? Does "I don't know" mean "I don't know the answer" or "I don't understand the question"? Does the caller's tone indicate boredom, or a sense that they don't want to bother fetching the information the call requires?
  • Escalation. How does the caller change, or not change, over the course of a call? Is an elevated emotional state the cumulative result of a series of mediocre interactions, or due to an immediate source of irritation?
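Human listening time is scarce, so it helps to spend it on the right calls. As one possible approach (a sketch only; the thresholds, field names, and scoring weights are invented for illustration), the same reporting data could triage which recordings a designer listens to in full, rather than sampling at random:

```python
# Hypothetical triage: rank calls for human review by signals that
# echo the checklist above (silence, barge-ins, escalation).
# All thresholds and field names are invented for illustration.
calls = [
    {"id": "c1", "max_silence_s": 0.8, "barge_ins": 0, "escalated": False},
    {"id": "c2", "max_silence_s": 6.2, "barge_ins": 1, "escalated": False},
    {"id": "c3", "max_silence_s": 1.1, "barge_ins": 4, "escalated": True},
    {"id": "c4", "max_silence_s": 9.5, "barge_ins": 3, "escalated": True},
]

def listen_priority(call):
    """Score a call for human listening; higher means listen sooner."""
    score = 0
    if call["max_silence_s"] > 5.0:     # unusually long silence mid-call
        score += 2
    score += min(call["barge_ins"], 3)  # repeated interruptions, capped
    if call["escalated"]:               # opted out or abandoned in anger
        score += 2
    return score

# Hand the top of this list, full audio included, to a designer.
worth_a_listen = sorted(calls, key=listen_priority, reverse=True)
print([c["id"] for c in worth_a_listen])
```

The ranking only points a human at promising audio; deciding whether the silence was exasperation or a caller hunting for an account number remains a listening job.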

Interface and call flow recommendations may be very different depending on the answers to these questions. While listening is not a skill limited to any particular discipline, designers and usability analysts are able to provide additional value, not just in determining the "whys" of customer dissatisfaction, but also in providing recommendations for prompt and flow fixes to improve interactions.

Don't dismiss these skills as "artsy" or not technical enough for tunings. We've been telling callers forever that their call may be monitored or recorded. Perhaps it is time we stopped merely monitoring and started listening.

Alexandra Auckland is a voice interaction design and tuning consultant at Sotto Voce Consulting. She can be reached at alexandra.auckland@sottovoceconsulting.com.
