Speech Technology Magazine

 

Secret Agents

The hidden value of someone behind the scenes.
By Lauren Shopp - Posted Mar 1, 2008

It’s a familiar scene: While running errands during peak business hours, customers are forced to trudge through long, grueling lines in a battlefield of other rushed, stressed shoppers. Faced with excessive wait times and dodging the cost of hiring more cashiers, consumer businesses like Home Depot and Wal-Mart have installed U-Scan self-service kiosks over the past decade. Rather than wait for a cashier to tally up their merchandise, customers check it out themselves. As anyone who has used one of these lanes knows, the U-Scan experience tends to work in extremes: a convenient in-and-out trip or a hair-pulling experience that makes people wish they had just waited the extra five minutes for a human cashier. But, if customers are frustrated by sensitive U-Scan technology, they can ask for assistance from the cashier sitting by the kiosk—a combination of automated self-service and human interaction.

Spoken Communications compares its agent-assisted interactive voice response (IVR) technology to the U-Scan brand. The Seattle-based company has carved a place for itself in the crowded contact center industry by selling a single product it says solves three major IVR-related problems: hold times, opt-out rates, and speech recognition errors. Using a technology that has been called everything from guided IVR to Wizard of Oz IVR, Spoken’s technology takes a hybrid approach to call handling, one that presents a front of nearly perfect natural language capabilities. Whether experts call this a clever trick or an insightful business decision, Spoken says it uses both an IVR and a human agent to achieve its goal of near-perfect call routing and increased customer satisfaction.

The company’s agent-assisted IVRs rely on silent agents who work behind the scenes of inbound customer calls. Listening in on several calls simultaneously, agents route each caller to what they judge to be the most appropriate department after the caller answers a single synthesized question: "How may I help you?" As the call progresses, agents keep gathering information and continue routing calls as they see fit. Opting out is still a choice, but the company claims its technology boasts an almost 70 percent completion rate. And, with one agent handling multiple calls at once, Spoken says agent productivity increases fourfold.
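As a rough illustration of the hybrid flow described above, the Python sketch below models one way a call could be routed by a machine when its guess is unambiguous and by a silent agent otherwise. This is not Spoken's implementation. The machine-first ordering, the keyword table, the `Call` class, and the `agent_decision` callback are all assumptions invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical keyword table; a real recognizer would be far richer.
DEPARTMENT_KEYWORDS = {
    "sales": ("order", "buy", "catalog"),
    "service": ("return", "refund", "broken"),
}


@dataclass
class Call:
    caller_id: str
    utterance: str                      # answer to "How may I help you?"
    route: Optional[str] = None
    history: List[str] = field(default_factory=list)


def machine_guess(utterance: str) -> Optional[str]:
    """Return a department only if exactly one keyword set matches."""
    words = utterance.lower().split()
    matches = [dept for dept, keys in DEPARTMENT_KEYWORDS.items()
               if any(key in words for key in keys)]
    return matches[0] if len(matches) == 1 else None


def route_call(call: Call, agent_decision: Callable[[Call], str]) -> str:
    """Route by machine when unambiguous; otherwise defer to the silent agent."""
    guess = machine_guess(call.utterance)
    if guess is not None:
        call.route = guess
        call.history.append("machine")
    else:
        # The caller never hears the agent; the agent only picks the route.
        call.route = agent_decision(call)
        call.history.append("agent")
    return call.route
```

In this sketch the caller's experience is the same on both paths, which is the "really smart machine" impression the article describes; only the `history` field records who actually made the decision.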

It sounds almost too good to be true, and some analysts think it is. Rather than a cure-all solution to a recurring problem in call center IVRs, some believe it’s a quick-fix solution that ignores the bigger problem in the IVR by providing an easy way out for shoddy technologies and poor prompt design. Melanie Polkosky, an independent human factors psychologist specializing in voice user interface (VUI) design, says companies like Spoken fail to recognize that user interface design is to blame for bad IVR experiences, not shortcomings within speech technology.

"The technology is really good, but it’s just that we don’t know how to handle it in a way and design it in a way that’s consistent with how humans actually interact," Polkosky states. "When we look at objective recognition accuracy, it’s usually pretty good, but people can perceive it as being poor if there are other issues in the response, such as what the system says in response to an error or how it tries to solve an error."

Little Competition...for Now
Gilad Odinak is the man at the helm of Spoken’s mission. Above all, he believes in the combination of humans and technology to create a satisfying end-user experience. Almost 10 years ago, Odinak worked with a start-up company called Wingcast, developing speech systems for navigation systems in Ford automobiles. Back then, he says, in-vehicle navigation was nowhere near as commonplace as it is today, and the idea of speech-enabling the systems was just starting to form. Odinak wanted to incorporate speech recognition into navigation systems as a way for users to state their current and end locations. But, as he recounts, nothing worked.

"The solution was building a system where the driver gets a prompt like, What is your destination? and the solution was natural language," he says. But, he adds, the technology was not yet strong enough to ensure accurate recognition. The real solution was a behind-the-scenes call center employee who listened to the driver’s end-point goal and figured it out from there. The idea of a human/machine hybrid for the speech-enabled contact center emerged from Odinak’s early work with Ford. From there, he developed Spoken’s current mantra of giving users the impression that they are talking to a really smart machine.

So when Ford abandoned the project due to budget cutbacks, the scrapped assignment was a mixed blessing. After buying all of the intellectual property associated with the project, Odinak planned to market his agent-behind-the-scenes model another way. After two years in the lab, Spoken secured a financial backer, signed its first customer, and entered the contact center market.

For now, the market houses at least one other known competitor, Aumtech. Based in New Jersey, Aumtech provides speech recognition solutions and includes agent-assisted IVR in its enhanced self-service suite of programs. Last year, Harvard University's School of Public Health implemented Aumtech's agent-assisted IVR for a diet study in which participants called into the IVR and told the system what they had eaten that day. Though the application seemed straightforward, Harvard had difficulty finding an IVR that came equipped with a grammar large enough to encompass the name of every food, and could also handle accents and background noise. Using directed dialogue and a text-to-speech (TTS) program, Aumtech used agents to screen calls, helping the IVR work more smoothly when users entered difficult-to-understand information. According to Aumtech, the deployment saved Harvard $65 per interview call and reduced interview times by 30 minutes.

Another company, Unveil Technologies, had offered a guided IVR. It was acquired by Microsoft in October 2005, and its technology was integrated into Microsoft’s Speech Server platform.

Odinak thinks others will eventually join the space. So why then, in an industry constantly faced with mounting labor costs and pressure to improve IVR systems, did the concept of a "secret" agent take so long to develop? Odinak points to the technological climate of the early part of this decade and the notion that only two methods worked: complete outsourcing to cut costs, or intensive, nonstop improvements that would make IVRs completely automated. Then, in 2005, he says, the waters parted as professionals realized that an either/or approach no longer worked.

"The outsourcing was not working well in many applications because of things that have to do with culture, accent, points of view, mindsets, and the difficulty of managing something that was so far away," he explains. "After a lot of companies and people in the industry realized the shortcomings of both approaches, we offered a practical approach that would work: use automation to the extent that we can, and then have the human intellect that the machines do not."

Though Spoken and one of its major customers say ROI can be achieved in less than a year, some speech professionals disagree on the company’s long-term cost savings. Their biggest gripe? The job of an IVR guide is dull. And when agents are bored, productivity and accuracy decline. Polkosky, whose background is in psychology, notes that the promises of agent-assisted IVRs—both monetary and otherwise—ignore the human being behind the computer.

"We know from organizational psychology that any time you put humans in a boring, repetitive job, they’re much more prone to errors," she says. "The recognition accuracy then becomes based on the humans' ability to listen and respond appropriately, combined with the IVR."

When using the program, agents’ screens fill with forms for each IVR interaction being completed. An agent views the caller’s progress as his information and call path gradually fills the screen. Call forms are tabbed; different colors represent a caller’s progress and alert agents to possible problems. An orange button means a caller is struggling and the agent should perform additional work to help guide him to the correct path.

Polkosky also points to the already high turnover rates of call center agents, and how the training of new agents would take an even bigger chunk out of a business’ investment in an agent-assisted IVR system. Donna Fluss, a principal at DMG Consulting, echoes Polkosky’s sentiments, and notes that giving agents more challenging jobs could help increase their job satisfaction.

"In this case, yes, this is unquestionably a very basic job that is going to lead to tedium, [which] often leads to higher turnover," Fluss says. "Years ago, the industry broke the belief that you needed to keep the [agent’s] job simple, when in reality you retain more agents by keeping the job more challenging."

Spiegel’s Slam Dunk
Tom Scott, chief information officer and senior vice president of operations at Spiegel Brands, would disagree. One of the earliest adopters of Spoken’s agent-assisted IVR, Spiegel, a retail catalog company, fully implemented the technology in its Virginia call centers last September, and has seen significant improvement in its operations ever since. Scott says training an agent to be a guide takes only one day, and that the process of taking on multiple calls through "experiential learning" on the job makes an agent an expert within one week.

And Spiegel had a lot to gamble with in its agent-assisted IVR deployment. The company’s previous call center routed all of its calls via a dual-tone multifrequency (DTMF) menu, and was run through the company’s telecommunications provider, AT&T. With no second-tier menu (callers could only choose between sales and service), Scott says that only about 60 percent of calls were accurately placed. Rather than invest in a traditional IVR, the company chose to bank its money on a product that Scott says was one-third less expensive than those available from other IVR vendors. His main concern was proper call routing.

"We knew there was a lot more money to be saved by being able to handle certain calls longer in the IVR," Scott states. "If we had done IVR, we would have incorporated recognition, but still would have had the issue of a possibly poor recognition rate."

Today Scott says Spoken’s technology sits at the "epicenter" of the company’s 350-seat call center, which receives 450,000 calls per month. Spiegel’s current call handling time clocks in at five minutes. The technology also has saved the company so much money that its Lillian Vernon subsidiary shut down a Philippines call center, according to an entry posted on Spoken’s "Changing Call Centers" blog. The blog entry, dated March 23, 2007, stated that each call was reduced by 20 seconds, and that all contacts are routed to the company’s Virginia call center.

Though Spoken has shown the ability to reduce business costs, skeptics still think the agent-assisted model could fizzle out. Others say that’s not the case. They argue instead that guided programs not only help to bolster the IVR’s reputation, but also that some businesses need technology like this to take baby steps toward speech-only systems.

Getting a Better Reputation
ROI figures aside, Jon Anton, a professor at Purdue University who works for the school’s Benchmark Portal Consumer Sciences Center, says any system that instills faith in the IVR is hitting the target. "For a computer to have a general, wide-open question like How can I help you? up front, the customer wonders how the computer will handle that," Anton states. "[Customers] really think the computer understands them, and they get over the initial reaction of just wanting to opt out."

Anton began working with Spoken in 2005 during the company’s early stages, and has even worked with students to complete white papers on the business. Though he notes that Spoken’s accuracy helps reluctant end users get over the IVR hump, he also says the technology can help in other areas and goes so far as to state that it improves overall system design.

"If the computer hears a request or word it doesn’t understand, the live guide marks down what the computer didn’t get and adds it to the IVR choices," Anton says. "The human allows us to evolve the messages. I see the guide as much a part of the design team as the design team itself."
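The feedback loop Anton describes, in which a guide notes what the machine missed and the phrase later becomes an IVR choice, might be sketched as follows. This is only an illustration: the `EvolvingGrammar` class, its method names, and the `promote_after` threshold are hypothetical, and the article does not say how many times a phrase must be logged before it is added.

```python
from collections import Counter
from typing import Dict, Optional


class EvolvingGrammar:
    """Phrase-to-department grammar that grows from guide-logged misses."""

    def __init__(self, phrases: Dict[str, str], promote_after: int = 3):
        self.phrases = {p.lower(): dept for p, dept in phrases.items()}
        self.misses: Counter = Counter()
        # Assumed policy: add a phrase once guides have logged it this often.
        self.promote_after = promote_after

    def recognize(self, phrase: str) -> Optional[str]:
        """Return the department for a known phrase, or None if unknown."""
        return self.phrases.get(phrase.lower())

    def log_miss(self, phrase: str, department: str) -> bool:
        """Record a guide's correction; return True if the phrase was added."""
        key = (phrase.lower(), department)
        self.misses[key] += 1
        if self.misses[key] >= self.promote_after:
            self.phrases[phrase.lower()] = department
            del self.misses[key]
            return True
        return False
```

A threshold is used here so that a single unusual utterance does not immediately change the grammar; a deployment could just as well route logged phrases to a human designer for review instead.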

But with technologies such as speech analytics available for examining the total effectiveness of an IVR’s grammars, is that human factor necessary? Fluss notes that the technology may not cure IVR grammars, but, more important, offers an alternative to businesses scared of making the leap to a speechified IVR.

"The value proposition in this is for those organizations that want to figure out a way to reduce their service cost but aren’t comfortable with full automation," she says. "If you’re not willing to use self-service but must bring down cost, [agent-assisted IVR] is a good way to do it."

Blame it on a lack of faith in the IVR when a company must be created to answer the alleged consumer outrage with automated systems. While Odinak remembers a time when either all-tech or no-tech was the answer, today’s contact center landscape is much more confusing. With the introduction of speech analytics, advanced speech recognition, natural language processing, and customer relationship management systems, businesses are forced to wade through a seemingly endless pool of options for their IVRs. Spoken, then, is just one answer, and may someday be joined by other companies that want to cash in on businesses whose trepidation about speech-only systems trumps their willingness to shell out the extra cash for enhanced VUI design and slick production. While Odinak and Scott point to substantial improvement figures, the speech industry remains filled with skeptics who wonder whether guided IVRs create boring work environments or ignore the bigger picture.

The guided approach may solve the problem for a business that feels speech is not the only, or even the right, answer for its business plan, but is a perceived bias against the IVR really all there is to blame for companies’ lack of faith in speech? Anton says that the hybrid approach "reduces a hesitancy to use the IVR" to the extent that consumers will be "willing to stay with an IVR longer next time." But one Spoken-enabled IVR may not be a cure-all for every other system out there. Poor design and out-of-date technologies still plague a number of systems.

As Polkosky states, "I’d rather see a big leap forward to a great speech system than something that’s just putting a Band-Aid on it."
