Implementation Strategies 2009

Assessing the Need

The questions are as personal as the answers

By Adam Boretz

Want it? Need it? Gotta have it? What are the costs? What are the benefits? What are the risks? What about call center agents? What about customers? What’s wrong with touchtone? What about call volumes? What about wait times? 

Not surprisingly, these questions are often incredibly complex and difficult to answer, presenting enterprises with numerous challenges, trials, and tribulations. For any company assessing its need for speech solutions, properly addressing questions like these proves to be an intricate and often confusing process that is absolutely essential to both a successful speech deployment and good customer relationship management.

Faced with so many daunting and difficult questions, many executives find themselves unsure of where and how to begin assessing their organization’s speech needs. Sadly, a simple, universal checklist or quiz to assess speech needs does not exist.

According to Bill Scholz, president of the Applied Voice Input/Output Society (AVIOS) and founder of the consulting firm NewSpeech Solutions, the first step is assessing what parts of your call center agent interactions could be converted over to speech.

“You need to identify whether it’s reasonable to create a dialogue with your end user that is simple and implementable with speech that would meet the goals that you have, that you are currently meeting with your current system,” Scholz says. “My assumption is that some subset of the activities that you are currently meeting with ever-so-expensive call center agents can, in fact, be met by a simple computerized dialogue using speech technology.”

The second step, Scholz says, is laying out a preliminary dialogue call flow to ascertain what the speech interaction would look like and then assessing the finances. 

 “That is assessing what it would cost to build and implement that application, what it would cost for the infrastructure in order to mediate that and how you would then integrate it with the existing activities performed by your live agents,” he says.

Most speech industry consultants would agree that, more often than not, speech will save companies money. 

“Probably the biggest benefit, or the benefit that most people see, is some kind of cost savings,” says Jim Larson, co-chair of the World Wide Web Consortium’s Voice Browser Working Group and an independent consultant and VoiceXML trainer. “The motivation for that is the automated systems almost always cost less than having a human respond to a call. So there’s a potential cost savings there.”

Cost Is King

And with the current financial crisis and worsening recession, saving money is definitely a top priority for almost every organization. 

“From the perspective of the corporation that requires customer access, buns in seats cost a lot of money,” Scholz says, noting the potential savings of touchtone and speech applications. “Certainly if you are faced with having to stretch every dollar because of a deteriorating economic climate, any opportunity to reduce costs suddenly increases in its appeal.”

But Scholz and Larson—who recommends beginning every speech assessment process with a simple assessment of risks and benefits—admit the issue of cost savings isn’t as simple as it appears. 

“Cost savings has its benefits, but it has its disadvantages, also, and that’s that these systems turn some people off,” Larson says. “Some people get frustrated with them.”

According to Scholz, speech’s biggest drawback is that when calling a contact center, users expect to be able to express themselves fluently and openly and to be understood. Therefore, they will often prefer talking to a live human. “You could lose some credibility by not offering them immediate access to a live human, instead requiring them to interact with a speech application,” he says.

Scholz notes that speech can be a mixed blessing: In some cases, people will be perturbed by not having an opportunity to talk to a live agent. But other people will weigh the benefits of shorter hold times and the opportunity to have immediate access to a speech application. 

“That’s part of a whole trade-off that you have to [work] through,” Scholz says. “You have to recognize what your users will be thinking when they are suddenly faced with the necessity of going to a speech application or even a touchtone app as opposed to having direct and immediate access to a live agent.”

Daniel Hong, lead analyst at Datamonitor, agrees, noting that when consumer confidence goes down, customer spending goes down as well, causing a ripple effect across different markets.

“Once that happens, customer loyalty becomes very key,” he says. “Right now it’s more about customer loyalty, customer retention, and really addressing customer concerns in the quickest way possible.”

A Numbers Game

Improving customer satisfaction—along with cost reduction and high call volumes—is among the factors that Hong recommends considering when assessing speech needs.

 “Are [call volumes] large enough to justify a significant reduction in cost from using speech recognition—does it make sense?” he says. “The second thing is: What do you want to achieve? Is it just a cost-reduction play or are you looking to actually improve the entire call flow and improve customer service?”

According to Hong, another factor that drastically affects any speech needs assessment is the specific industry in which a company operates. “There are different processes across all different markets,” he says. “There are different areas of priority among different markets because the types of calls are different and the types of demographics are different as well.”

Hong notes that in healthcare, for example, call routing is going to be a primary concern, whereas in the travel and tourism industry, the processing of transactions is more important. 

Scholz agrees, asserting that in industries requiring unstructured input and unstructured natural language from end users, speech applications are more difficult to design due to their complexity. However, in industries where the expected input from the end user is better defined, speech applications may prove more feasible. 

“It’s difficult to separate speech needs from the appropriateness of speech capabilities for your particular industry,” Scholz says.

On the issue of industry-based speech assessments, Larson offers a different perspective: “Assessments need to be made on an application-by-application basis,” he suggests.

According to Larson, some applications better lend themselves to speech than others. He cites applications that relay stock quotes or flight arrival/departure times as examples of applications that are straightforward, low-risk, and potentially a source of cost savings. So, too, are outbound healthcare applications that give appointment and prescription reminders to patients. 

Conversely, intricate applications—like stock transfers—are more complicated and risky, according to Larson, who goes even further to identify applications that “involve any kind of diagnosis” as “really kind of spooky.” 

“Simple applications with little risk and big benefit should be implemented,” he says. “Those with a greater risk or more complexity need to be more carefully examined and thought through. And I think each area has its own simple and complex applications.” 

 With the worsening economy and the many conflicting and often confusing aspects of any speech needs assessment, many companies often end up looking at the call center with an “if it ain’t broke, don’t fix it” mentality. And when faced with this question, Hong, Larson, and Scholz each offer different answers—some of them surprisingly frank.

“[Touchtone-based] IVR systems—people laugh about them and say they’re old and outdated—served a very useful purpose, and sometimes they’re faster and better than the new speech things,” Larson says.

As such, Larson recommends performing a careful cost-benefit analysis to ensure that jumping to speech is really an advantage. “I think it’s really about re-evaluating your current system and making a decision,” he says. “It’s important to decide what you want to achieve and how much money you have to pay for it. Then the answer is usually pretty self-evident.”

Scholz’s advice is to look at costs. “The single strongest [reason] for moving away from live agents and trying to automate some or all of the interaction is the very high cost per minute for a live agent as compared with the much lower cost for a speech-enabled application,” he says, “and even lower costs for a touchtone-only application.”
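
The comparison Scholz describes can be roughed out in a few lines of Python. What follows is only a sketch: the per-minute rates, call lengths, call volume, and automation rate are hypothetical placeholders, not figures from the article or from any vendor, and the function names are invented for illustration. It also assumes, for simplicity, that a call is either fully automated or fully handled by an agent.

```python
def handling_cost(calls, minutes_per_call, cost_per_minute):
    """Cost of handling `calls` calls at a flat per-minute rate."""
    return calls * minutes_per_call * cost_per_minute


def blended_cost(calls, automation_rate, auto_minutes, auto_rate,
                 agent_minutes, agent_rate):
    """Cost when a share of calls is automated and the rest go to agents.

    Simplification: a call that reaches an agent is charged only for the
    agent leg, not for any time it first spent in the automated system.
    """
    automated = calls * automation_rate
    to_agent = calls - automated
    return (handling_cost(automated, auto_minutes, auto_rate)
            + handling_cost(to_agent, agent_minutes, agent_rate))


# Hypothetical monthly inputs, chosen only to make the arithmetic visible.
CALLS = 10_000
all_agent = handling_cost(CALLS, minutes_per_call=4.0, cost_per_minute=1.00)
with_speech = blended_cost(CALLS, automation_rate=0.5,
                           auto_minutes=3.0, auto_rate=0.20,
                           agent_minutes=4.0, agent_rate=1.00)
savings = all_agent - with_speech

print(f"All agents:  ${all_agent:,.0f}")
print(f"With speech: ${with_speech:,.0f}")
print(f"Savings:     ${savings:,.0f}")
```

Rerunning the same calculation with a touchtone rate, or with a lower automation rate to reflect callers who give up on the system, shows how sensitive the savings are to the assumptions behind them.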

Like Larson, Hong admits that sometimes speech isn’t the answer. He also stresses that—regardless of one’s need (or lack thereof) for speech—contact center performance is crucially important. “I would say there’s always room for improvement within the contact center,” he says, noting that companies should never be satisfied with their current contact center performance because economics, customers, etc., can always change. “You have to be constantly fine-tuning your operation and your processes in order to stay on top of that,” he concludes.

Stick to the Plan

Make sure all stakeholders are involved at every stage.

By Adam Boretz

There's an old military saying—no doubt the sage words of a battle-worn general, sitting in a makeshift tent, poring over maps and charts, thoughtfully stroking his beard, exhaling a cloud of blue-gray smoke, and preparing to send his troops into a decisive and bloody battle—a military saying that has become something of a business cliché: “Fail to plan and plan to fail.”

And while that venerable general wasn’t thinking about business—and definitely not about speech technology—his words still resonate, particularly when a company is faced with planning the implementation of a speech solution.

When any enterprise is planning for a speech project, a host of issues, factors, processes, stakeholders, and players must be taken into consideration, both in the short and long terms. And while this process may seem daunting to anyone charged with it, there is a light at the end of the proverbial speech tunnel.

Well-defined planning processes exist to help one pursuing a speech project, according to Jim Larson, co-chair of the World Wide Web Consortium’s Voice Browser Working Group and an independent consultant and VoiceXML trainer.

“There are real established processes for doing this, and I think every company should follow these,” he says. “The biggest place where things go haywire is where they don’t—they just sort of shoot from the hip.”

While these planning processes often include steps and factors that may seem obvious, many of these seemingly self-evident elements are often forgotten or ignored. Larson says any planning process must map out the project’s requirements, lay out phases of the project, assign different tasks to each phase, determine sign-off procedures between phases, determine project length and cost, and include a final cost-benefit analysis. 

Equally important, according to Larson, are some oft-overlooked factors that are particular to speech and not inherent to planning for traditional software projects.

“Grammars are not usually used in traditional applications, but developers spend lots of time developing grammars in speech applications,” Larson says. “So one thing that needs to be included in the planning phase is the time and effort and a knowledgeable person to develop grammars.”

Plan for Mistakes

Another factor that is often overlooked is the creation of error-handling routines, Larson says. “Speech recognition is not perfect, and sometimes it misrecognizes things, so the user gets kind of an unusual result.”
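
One common shape for such an error-handling routine is a reprompt loop that gives the caller progressively more explicit instructions and then hands off to a live agent. The Python below is a minimal sketch of that idea, not any platform's actual API: `recognize` is a hypothetical stand-in for the speech recognizer, and the prompts and the limit of three attempts are assumptions made for illustration.

```python
def collect(prompts, recognize, max_attempts=3):
    """Ask for one piece of information, reprompting on recognition failures.

    `recognize` is a hypothetical callable: it takes a prompt string and
    returns the recognized value, or None on a no-match or no-input.
    Returns (value, attempts). A value of None means every attempt failed
    and the caller should be transferred to a live agent.
    """
    for attempt in range(1, max_attempts + 1):
        # Move to a more explicit prompt on each retry; reuse the last
        # prompt if there are more attempts than prompts.
        prompt = prompts[min(attempt - 1, len(prompts) - 1)]
        value = recognize(prompt)
        if value is not None:
            return value, attempt
    return None, max_attempts


# Scripted recognizer for demonstration: fails once, then succeeds.
scripted = iter([None, "checking"])
value, attempts = collect(
    ["Which account?", "Please say 'checking' or 'savings'."],
    lambda prompt: next(scripted),
)
print(value, attempts)  # checking 2
```

Returning the attempt count alongside the value makes it possible to log how many reprompts each piece of information required.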

In addition to preparing a comprehensive planning process, purchasing and deploying a speech solution requires companies to answer some important questions, according to Bill Scholz, president of the Applied Voice Input/Output Society (AVIOS) and founder of the consulting firm NewSpeech Solutions.

“First of all, can the solution best be implemented via multimodal architecture, or should it be a speech-only application?” Scholz says. “The next important decision to be made that is a fundamental part of the planning process: Can the application be implemented with directed dialogue, or do you need a fairly open-ended, unstructured dialogue, thus requiring natural language and statistical language models for the solution?”

Scholz notes the magnitude of that determination as it relates to the “huge difference in costs between those two.” Additionally, developers and planners need to consider the benefits of hosted versus non-hosted solutions and determine if their needs are best met by what Scholz calls “a canned application” from a vendor or by developing an application from scratch.

Take the Test

Another vital aspect of planning for a speech project, in Larson’s estimation, is making and taking the time for testing.

“The third thing that I think is often overlooked is simple usability testing,” Larson says. “And by usability testing, I want to involve users in every phase of the project, from requirements to prototyping to final beta testing, deployment testing, and pilot testing. I think a lot of planners underestimate how much testing needs to be done.”

Scholz also cites testing as important, noting the significance of ensuring that project plans aren’t linear, but rather contain iteration across the design, implementation, and test phases. This is important, he says, because after the initial test stage, design flaws become apparent and will require some degree of design modification. And that, in turn, will require additional testing. 

“That iteration is absolutely mandatory, and I’ve seen so many naïve project plans where they forget that,” Scholz warns. “They don’t recognize the degree to which you only learn what the thing is you’re trying to make when you first try to roll it out, way after you’ve done the initial design.”

These sentiments are echoed by Daniel Hong, lead analyst at Datamonitor, who describes any speech solution as “a kind of living organism.” 

“You have to constantly monitor it, tune it, and refine it,” he says. “Without that, your strategy will probably fail. You have to give it enough resources to continue optimizing and tuning the application.”

Hong—who emphasizes the importance of preplanning research—insists that optimization must be a formal process. “You can’t just do it once or whenever you feel like it. You have to do it monthly—weekly to monthly. It’s very repetitive, but you have to do it.”

Equally important, and perhaps more challenging than the integration of testing, is the process of bringing together the diverse group of stakeholders who should be involved in planning a speech project. To meet this challenge, Larson again points to formalized planning methodologies that divide projects into phases, assign responsibilities, and create a system of sign-offs between phases.

Larson says a good planning methodology becomes “the backbone of communication” for a speech project because individual stakeholders will understand what their parts are, what they have to sign off on, whom they report to, and what to do if sign-off doesn’t occur.

“One of the best ways to bring people together is to have a plan where everybody sees how they fit into the plan, what the responsibilities are, who signs off, and what they do,” Larson says. “That doesn’t guarantee that we’re not going to have problems, but it does minimize the problems that do occur.” 

One Purpose

But even with good planning, a speech project can be plagued by communication problems that make it difficult to bring a diverse group of stakeholders together to work toward a unified goal.

“I just keep thinking of the well-known and overused metaphor of the elephant,” Scholz says. “One guy’s got a hold of the tail, and another guy’s got a hold of the trunk, and then another guy has a hold of the leg, and none of them really see that whole elephant.”

This failure to see the “whole elephant” often relates to the varying goals and objectives within corporate structures. According to Scholz, on higher rungs of the corporate ladder, the emphasis of a speech project is on the bottom line and cost savings; on lower rungs, the focus is on security of critical corporate data; and at even lower rungs, the focus is on specific issues of server architecture, network access, storage design, load balancing, and “all the issues that go into the creation of a viable and maintainable call center.”

Hong also sees the obvious necessity of involving stakeholders—particularly executives—in the planning process, but urges companies to involve an often-overlooked group: call center agents.

“One thing that a lot of companies don’t do is they don’t ask their agents,” Hong says. “They should get input from their agents on how to design speech applications that [provide] what the customers really need.” 

Beyond internal communication, both Scholz and Larson point to external players that are important to the efficacious planning of a speech project. Scholz warns that companies should never select a vendor that is the sole-source provider for any component of the planned solution.  

“Always be sure that you have a fallback or an alternative, and be sure that the vendor you’re talking to understands that so that he is in a competitive situation, not the sole solution to your problem,” he says. “This is quite important, both from the point of view of getting a greater degree out of your vendor and not creating undue vulnerability for yourself by putting yourself in any kind of a sole-source situation.”

But in the end, Larson—like a grizzled general leading his troops into battle in the speech technology trenches—says successful planning comes down to methodology. “Use a planning methodology that your organization is experienced with and is comfortable with,” he says. “And use it religiously, and be sure to involve users in every phase.”

The Key to Success

Use the right resources, and use the resources right.

By Leonard Klie

So you’ve finally selected a vendor, identified the specific solution or solutions you want to deploy, and even negotiated with vendors and resellers for a fair price. Think your job is done? Guess again: It’s only just begun.

Now that the purchasing decisions are out of the way, the real work can begin—all leading up to the moment when, like a proud new father passing out cigars, you announce to your colleagues that your speech application has gone live. But none of it can happen in a vacuum.

“You have this giant, expensive project, and so you have to set up your lines of communication, project management groups, etc.,” advises Moshe Yudkowsky, president of Disaggregate Consulting. “Make certain that you have everyone in the room—that you’re talking to your customer service people, your documentation people, your IT people, your marketing people—and make sure that you’re all on the same page.”

Ideally, many more people than that should be involved. While it may not be possible or practical to have customers involved in this stage of the process, their voices should be heard. “Think about the total, end-to-end customer experience, and design and plan any application in that context,” says Lizanne Kaiser, senior principal consultant for voice services at Genesys Telecommunications Laboratories. “Do field research with customers—this could be with interviews or surveys—and look at their buying patterns. Involve your Web folks to see what customers are doing online. Involve store or branch office folks to see what customers are doing there, and involve [customer service representatives] to find what customers might be calling in about.”

At this stage in the process, it also helps to realize that a contact center’s interactive voice response (IVR) system is not the only customer-facing piece of a business. Nor is it necessarily the first choice among modern consumers looking to get more information or perform a transaction. 

“One of the emerging trends in the industry is that customers are using an IVR in isolation less and less. Prior to calling, they’ve had some interaction with the company already, either on the Web or somewhere else,” Kaiser notes. “Customers willing to use self-service are doing it more and more on the Web, so when they’re calling into an IVR, they want a specific person to help them with a specific problem that they can’t fix anywhere else.” 

That’s something that should be reflected in any speech project’s goals and success criteria, consultants agree. And those goals and criteria need to be spelled out clearly—for all to see—as early as possible in the process. Whether it’s new services to be offered, a set percentage of calls to be offloaded, or a set percentage of expenses to be reduced, for example, it should be clearly defined and known to everyone from the start.

“Does everyone know what success will be in terms of the application? If the goal is to divert calls from live agents, for example, choose a concrete number—say 80 percent—and make sure that goal is clear to everyone,” Yudkowsky says. “And then follow it up with a clear project plan that is also known to everyone.”

On a Clock

That project plan should include comprehensive road maps and timetables that identify when and how every aspect of the project will be delivered. “The worst surprise you can get: On the last day before everything is supposed to go live you find out that something’s not ready yet,” Yudkowsky maintains.

Therefore, in drafting the comprehensive project plan, it is vitally important to conduct a thorough inventory of all current technology and personnel resources to identify potential staffing, application, and equipment/networking gaps. 

Perform skills-based assessments of all employees to determine whether you will need to bring in additional talent to augment your existing team. Also plan time to train all employees on the new systems prior to their going live. 

Get to know the technologies involved, how they are to be used, and what their boundaries and limitations might be. Even if experienced industry professionals will be doing most of the work, managers and team members should be conversant in the various tasks and technologies involved so they can make informed decisions along the way.

In looking at the network, consider the layout of network servers, how the application will integrate with existing data sources, and the security implications of any decision made. Take, for example, a company that is implementing a unified communications solution. Before it can even think about adding voice traffic to its existing data network, it will need to ensure first that network servers, trunking, and other hardware can handle the extra load. 

Next, it will need to consider where individual applications will be housed. Call processing, speech recognition, dialogue interaction, and back-end application processing could all occur on different servers. If bandwidth between these servers or across the entire network is too narrow, latency problems could result, and they could be exacerbated if additional layers of middleware, firewalls, or security protocols have to be incorporated into the network design. Finally, security layers between servers will have to be considered, especially if all or pieces of the application are going out over the Internet or an open area network.

Along the same lines, consider network, hardware, and staffing capacity as well, and not just during normal traffic times, but also at expected peaks. For a retail call center, for example, this would be during the holiday buying rush. For an insurance company, it’s likely to come during hurricane season in the late summer and early fall. “Take these peaks and valleys into consideration so that you’re certain your application will be able to handle them prior to going live,” warns Jim Larson, co-chair of the World Wide Web Consortium’s Voice Browser Working Group and an independent speech consultant and VoiceXML trainer. 

Prior to deployment, this can be done by field-testing an application with a select group of frequent, preferred customers. “With your small pilot database, extrapolate how many calls you are receiving and apply those percentages to your entire customer base to see if your system can handle it,” Larson says.
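
The extrapolation Larson describes might look something like the following Python sketch. All of the numbers are hypothetical, as are the function names. The 25 percent headroom is an assumed cushion rather than a recommendation; because calls tend to arrive in bursts, a real sizing exercise would normally rely on a telephony traffic model instead of a flat safety factor.

```python
import math


def projected_peak_calls(pilot_customers, pilot_peak_hour_calls, total_customers):
    """Scale the pilot group's busiest-hour call count to the full customer base."""
    return pilot_peak_hour_calls * total_customers / pilot_customers


def ports_needed(peak_hour_calls, avg_call_seconds, headroom=1.25):
    """Estimate simultaneous calls in the peak hour, padded by a safety factor.

    Average concurrency is the arrival rate multiplied by the average call
    duration (Little's law). `headroom` is a hypothetical cushion.
    """
    avg_concurrent = peak_hour_calls * avg_call_seconds / 3600
    return math.ceil(avg_concurrent * headroom)


# Hypothetical pilot: 500 customers placed 40 calls in their busiest hour.
peak = projected_peak_calls(pilot_customers=500,
                            pilot_peak_hour_calls=40,
                            total_customers=200_000)
ports = ports_needed(peak, avg_call_seconds=180)

print(f"Projected peak-hour calls: {peak:,.0f}")  # 16,000
print(f"Ports to provision:        {ports}")      # 1000
```

Comparing the result with the number of simultaneous calls the platform is licensed and provisioned to handle gives a first indication of whether the system can absorb a seasonal peak.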

Should things need to be added later on, experts advise using open standards-based platforms and tools—most notably, those built around the VoiceXML standard—to ensure portability and a smooth integration with existing and future applications and technologies. “Look at the entire system,” Yudkowsky states. “Do you have a complete integration all the way through or are things just bolted in? Do you have little islands unto themselves that will create problems later on when they are not talking to one another?”

Design Decisions

And because the whole purpose of any speech application deployment is—or at least should be—talking better to customers, they should be priority No. 1 in creating the user interface. At this stage, as with all those prior, it is important for the project team to work with the VUI designer to draft detailed specifications, clearly defining how the customer will interact with every piece of the solution. The design specs should diagram the call flow with as much detail as possible about prompts, grammars, error-handling, barge-in, and live agent support.

Because customers will make all sorts of inferences about a company just by the voice of the application, it is important to define the personality of that voice as well. The personality should reflect not only the company’s brand image, but also the preferences and makeup of its customers.

Choosing the right persona for an application will also affect dialogue structure, the wording of the specific prompts, and the tone of the recordings. How things are said in the system is as important as what is said, and that principle should carry through to the selection of the voice talent. Choosing the right voice actor will ensure that prompts are delivered properly, with the appropriate tone, pitch, and stress.

What’s more, it doesn’t hurt to personalize the application as much as possible, according to Jamie Bertasi, head of the Business Solutions Unit at Microsoft’s Tellme Networks subsidiary.

For her, personalization means understanding a customer’s behavior and needs and working an application around them. “What someone does on the phone while driving is different than when he’s at his desk,” she says. “You really need to understand what he’s trying to do on the phone in either situation.”

As another example, Bertasi offers an airline call center application. “To be sure, two weeks before a flight and driving to the airport on the day of the flight, customers are calling for different reasons,” she says. “So why are you going to offer them the same menu?”

Personalization requires a great deal of customer data, but the effort is definitely worthwhile. It often helps to know the caller types and profile the full user community. In healthcare, for example, this might mean segmenting out how many callers are healthcare providers versus patients; for a utility company, it could mean residential versus commercial customers. Then, it’s always best to design for the majority, experts agree.

As a last step before fielding the application for the first time, it also helps to notify end users of the availability of the new system. “Let your customers know a few weeks before the go-live date what the system is, why you’re implementing it, what options will be available, and what the benefits will be to them,” Larson says. “And it doesn’t hurt to encourage them to try it out, and then to survey them after their first time using it to find out how they liked or didn’t like it.”

The Test of Time

Repeated tuning should be part of any speech application's maintenance schedule.

By Leonard Klie

Despite your best efforts, predeployment application testing and tuning can only go so far. It would have been nearly impossible to fully anticipate and prepare for every potential utterance a caller might make or every path a particular call might take.

And so it is at long last that your speech application is put into service. Callers start interacting with it almost immediately, and some are bound to struggle with the application.

But with all the time, effort, and expense it took to get to this point, the last thing you want to do now is throw your hands up and walk away from the application. “You can’t wash your hands of it,” says Lizanne Kaiser, senior principal consultant for voice services at Genesys Telecommunications Laboratories. “You need to think of your speech application as a living, breathing being that needs constant care.”

In the case of a speech application, that steady supply of TLC means constant testing and tuning. Without it, a speech application is not likely to survive contact with customers for very long. 

Step 1: Test the application

Testing can take many forms. With such a wide variety of tools, automated solutions, and third-party service providers available to perform this task, it can seem daunting. But it doesn’t have to be. A number of do-it-yourself strategies can give insight about what’s going on with the application and how callers are responding to it.

At a bare minimum, Kaiser suggests listening to call recordings from time to time. “It doesn’t have to be a huge sampling. Once a month, go in and listen to about 20 calls,” she says. “It gives you a chance to get a sense of how the application is doing and if it’s meeting customer expectations.”

But listening can’t be a one-sided effort, according to Jamie Bertasi, head of the Business Solutions Unit at Microsoft’s Tellme Networks subsidiary. “Listening both to the automated portion of the call as well as the calls that transfer to agents provides great insights into how callers are using the automated systems,” she says. 

Hopefully a post-deployment maintenance plan will go far beyond mere listening. Most experts agree that speech applications, especially those that get a lot of use, require much more detailed testing and tuning. “Qualitative assessment of the experience is a key process for high-performing applications,” Bertasi says. “You’ll definitely want to generate actionable insight post-deployment. Once the application is launched, the refinement process [should] start immediately.”

Properly testing an application should weigh both system performance and usability. System performance testing should include:

  • recognition performance (How accurately does the system capture the speech from a variety of people in a variety of settings?);
  • in-grammar rate (How often does the grammar include the caller’s response?);
  • call automation rate (How many calls were completed without a transfer to an agent?);
  • average number of reprompts (How many times did the application have to prompt the user for the same piece of information before it was recognized?);
  • stress testing (How many callers can the system handle at once?); and 
  • dialogue traversal testing (How many possible conversations can occur, and do calls get routed properly based on the caller’s input?). 

Usability testing should address the quality of the caller’s experience with the system, taking into account issues like: 

  • call duration;
  • call completion rates; and 
  • caller satisfaction. 

For most systems, all of this information can be acquired without a lot of work. “Most systems today have very sophisticated call logs that can tell you exactly how many calls were received, how long each call lasted, what the user did in the system during each call, and what he wasn’t able to do,” says Jim Larson, co-chair of the World Wide Web Consortium’s Voice Browser Working Group and an independent speech consultant and VoiceXML trainer. “If there was a recognition error, the log can tell you that.”
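
As a rough illustration of how several of the measures listed above fall out of a call log, consider the Python sketch below. The record layout is hypothetical, since log formats differ from platform to platform, and the sample rows are invented. Reprompts are averaged per call here as a simplification; a closer reading of the metric would count them per requested piece of information.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """One row of a hypothetical call log; real schemas vary by platform."""
    duration_seconds: float
    transferred_to_agent: bool
    responses: int             # caller utterances the recognizer evaluated
    in_grammar_responses: int  # of those, how many the grammar covered
    reprompts: int


def summarize(calls):
    """Compute basic performance and usability figures from call records."""
    if not calls:
        raise ValueError("no calls to summarize")
    total = len(calls)
    responses = sum(c.responses for c in calls)
    in_grammar = sum(c.in_grammar_responses for c in calls)
    return {
        "calls": total,
        "automation_rate": sum(not c.transferred_to_agent for c in calls) / total,
        "in_grammar_rate": in_grammar / responses if responses else 0.0,
        "avg_reprompts": sum(c.reprompts for c in calls) / total,
        "avg_duration_seconds": sum(c.duration_seconds for c in calls) / total,
    }


# Invented sample data for demonstration only.
sample = [
    CallRecord(120, False, 5, 5, 0),
    CallRecord(200, True, 4, 2, 3),
    CallRecord(90, False, 3, 3, 0),
    CallRecord(150, False, 4, 3, 1),
]
report = summarize(sample)
for name, figure in report.items():
    print(f"{name}: {figure}")
```

Tracking the same figures week over week, rather than reading them once, is what turns a log summary into a tuning tool.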

Beyond that, when used properly, smart call logging and reporting can help “to identify in which tasks—authentication, menu navigation, bill pay—callers are having trouble and why,” Bertasi says. 

Most experts agree that a good baseline for a meaningful analysis is data from at least 1,000 calls spread out during several days. That can amount to a lot of data, so the one caveat that Larson offers is this: “There are lots of metrics that can be captured in call logs, and you may need trained people to analyze these logs and spot particular trends,” he says.
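A quick check can confirm that a sample meets that baseline before anyone spends time analyzing it. This sketch assumes each call is represented only by its date, and it treats "several days" as at least three distinct days, which is an assumption rather than a figure from the experts quoted here.

```python
from datetime import date, timedelta


def meets_baseline(call_dates, min_calls=1000, min_days=3):
    """True if the sample has enough calls spread over enough distinct days."""
    return len(call_dates) >= min_calls and len(set(call_dates)) >= min_days


start = date(2009, 1, 5)
# 1,200 calls spread evenly over four consecutive days
spread = [start + timedelta(days=i % 4) for i in range(1200)]
# 1,200 calls that all arrived on a single day
one_day = [start] * 1200

print(meets_baseline(spread))   # True
print(meets_baseline(one_day))  # False
```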

But, on the upside, the same analysts can later conduct supplemental user surveys. “Have your analysts call users to find out what they liked and what they didn’t like,” Larson says. “[Postcall] surveys are fine, but they are often one-sided because the people who stay around to respond to a survey are usually only the ones who are not happy. This way, you get a more balanced account.”

Throughout the system’s life span, it’s not enough to merely identify and work around problems. Proper system maintenance also involves discovering the root causes of problems, determining how problems can be fixed, and checking that the fix solves the problem and doesn’t introduce other ones somewhere else.

As an example, analysis might show that at a particular prompt, many callers respond with a phrase that is not included in the grammar. Unlike a touchtone application in which a customer chooses from a finite number of options, a speech application opens the door to a much wider range of verbal responses, and only those covered by the grammar will be recognized. In this particular case, perhaps the grammar is too limited and should be expanded, or perhaps the prompt is confusing and needs to be reworded.
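That kind of analysis can be sketched in a few lines of Python. The example below assumes the call log can be exported as (prompt, caller phrase, matched) tuples; that format, the prompt names, and the phrases are all hypothetical.

```python
from collections import Counter, defaultdict


def out_of_grammar_report(events, top_n=3):
    """Per prompt: out-of-grammar rate and the most common unmatched phrases.

    events: iterable of (prompt_id, caller_phrase, matched_grammar) tuples.
    """
    totals = Counter()
    misses = defaultdict(Counter)
    for prompt_id, phrase, matched in events:
        totals[prompt_id] += 1
        if not matched:
            # Normalize case and whitespace so variants are counted together.
            misses[prompt_id][phrase.lower().strip()] += 1
    report = {}
    for prompt_id, total in totals.items():
        missed = sum(misses[prompt_id].values())
        report[prompt_id] = {
            "oog_rate": missed / total,
            "top_misses": misses[prompt_id].most_common(top_n),
        }
    return report


# Made-up log events, for illustration only.
events = [
    ("main_menu", "billing", True),
    ("main_menu", "pay my bill", False),
    ("main_menu", "Pay my bill", False),
    ("main_menu", "talk to a person", False),
    ("account_number", "one two three four", True),
    ("account_number", "one two three four", True),
]
report = out_of_grammar_report(events)
print(report["main_menu"])
# {'oog_rate': 0.75, 'top_misses': [('pay my bill', 2), ('talk to a person', 1)]}
```

A prompt with a high out-of-grammar rate and a few dominant missed phrases points toward expanding the grammar. A high rate with widely scattered phrases is more likely a sign that the prompt itself is confusing.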

Are customers not entering information when expected? Perhaps the dialogues are not clear, the menus too complex, the pace of the dialogue too fast, or the pause not long enough for the caller to enter the requested information.

Step 2: Tune the application

Based on information gathered during testing, be prepared to rewrite or adjust call flows, prompts, grammars, and thresholds to address concerns.

“You want to make sure you do refinements against the speech recognition,” Bertasi adds. “Analyze what’s going on in the application so you can get the best recognition possible, and fine-tune the grammars accordingly.”

But when it comes to system tuning, one mistake that many customers make is overdoing it. Once the red pen comes out, it’s easy to get caught up in the moment. That is how many projects get trapped in a battle of wills over, for example, a single word or phrase used in a prompt.

Coupled with that, people might overlook the big picture, focusing instead on a change that will affect only a very small portion of callers while ignoring other, much bigger issues.

To that end, experts recommend identifying the key problems first and prioritizing them. They suggest starting with smaller, easy-to-solve dilemmas that yield the greatest benefit and working from there to other, more complex problems. As Caroline Nelson, speech solutions team technical lead at Nortel’s Enterprise Multimedia Professional Services Unit, likes to say: “Start at the micro to impact the macro.” 
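One way to put that advice into practice is to rank each known issue by benefit per unit of effort. The sketch below is only an illustration: the issue names, caller counts, and effort estimates are invented, and callers affected per day of effort is just one possible scoring rule.

```python
def prioritize(issues):
    """Sort issues so cheap, high-impact fixes come first.

    issues: list of (name, callers_affected, effort_days) tuples,
    where effort_days must be greater than zero.
    """
    return sorted(issues, key=lambda issue: issue[1] / issue[2], reverse=True)


# Hypothetical backlog, for illustration only.
issues = [
    ("reword one greeting phrase", 20, 2),          # 10 callers per effort day
    ("add 'pay my bill' to menu grammar", 900, 1),  # 900 callers per effort day
    ("redesign authentication flow", 1500, 15),     # 100 callers per effort day
]
for name, callers, days in prioritize(issues):
    print(f"{name}: {callers / days:.0f} callers helped per day of effort")
```

Here the one-day grammar fix rises to the top, while the debate over a single greeting phrase falls to the bottom of the list.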

Even then, make changes sparingly. “Rewriting prompts and rerecording them can be expensive and [time-consuming],” warns Moshe Yudkowsky, president of Disaggregate Consulting.

“Also, be aware that every time you make a change, you are opening yourself up to cascades—places later on where you can run into trouble,” Yudkowsky adds.

That’s because all of the elements of an application are highly interdependent, from prompting and call flow to features and system personality. As a result, even a seemingly insignificant change, such as a single word in a prompt, could have unforeseen and possibly undesirable consequences later on in the call flow. Therefore, if a change is needed—no matter how small it may seem—include the VUI designer in the process. The designer can examine how much the change can impact other parts of the interaction and how consistent the new prompt will be with other parts of the design. 

It’s also a good idea to involve the agents, Kaiser maintains, because they are the ones on the front lines who will have to address customer needs around those changes. “It’s a good way for ensuring that the agents are on board with all that happens,” she says. “If you give them a way to provide input into the application, they can become its greatest champions.”

Step 3: Repeat steps 1 and 2

Ideally, there is no wrong time to perform routine system testing and tuning. However, after the first round of changes, “look at [the application] again in a few months to see if you can change anything else to make the interaction faster, smoother, and easier,” Yudkowsky says.

That kind of in-depth tuning should take place at least every six months, Larson suggests.

Finally, as users become more sophisticated and familiar with the system, you will be able to tweak prompts to shorten navigation, enable barge-in to more quickly get callers where they want to go, expand grammars, and roll out new self-service offerings. “After users have been using the system for a while, they start using the system in ways not originally anticipated,” Larson says. “Why not modify the system so the user can do what he wants [more easily]?”

Out of Site, Peace of Mind

Hosted speech solutions represent a low-cost, trouble-free alternative to on-premises software.

By Gayle Kesten

Doing more with less. If you had a voice port for every time you heard those words, then your call center might very well qualify for world domination. Indeed, times are tough, but one area that continues to gain traction is the market for hosted speech solutions.

Last year companies spent $590 million on hosted speech solutions; by 2013, the market is expected to nearly double to $1.1 billion, according to Datamonitor lead analyst Daniel Hong, who participated in a December Webinar called “Overcoming Economic Uncertainty with Speech Solutions.”

“Reliability of hosted technologies really has changed in the past 24 months and is no longer an issue,” Hong said. “Companies have become quite comfortable with in-the-cloud technologies. [Service-level agreements] are the same, and you’re able to have guaranteed performance from the hosted provider.”

Simply put, a hosted software model is one in which a third-party provider offers applications that are housed in its datacenter—synonymous with software-as-a-service (SaaS) and on-demand technology. Rather than purchasing software outright and maintaining it on premises, a customer rents the application on an as-needed basis.

The advantages of selecting a hosted software model are many, yet they all come back to one undeniable deal-maker: more money in your business’ pocket.  


Unless you’ve been living—or purposefully hiding—under a rock, you have no doubt heard that the U.S. economy is a mess. Among other factors, the recession has manifested itself in budget cuts, layoffs, and delayed technology upgrades. 

Going the hosting route enables you to eliminate the upfront, capital expense involved in purchasing software that resides on premises, says Brooks Crichlow, director of marketing at Microsoft subsidiary Tellme Networks. “Hosting gives you the ability to get smarter about your spending,” he says.

Shifting from a capital expense to an operational expense is probably one of the most compelling arguments for choosing a hosted solution, agrees Nancy Jamison, principal at Jamison Consulting. “With hosting, you don’t have the capital outlay, which is a major draw for CEOs worrying about every dime and having to be prudent for investors,” she says. “Even though there’s still an output of money, it’s coming from a different place [on a budget].”

Cost is lower in other ways, too. For example, purchasing multiple speech software licenses is a moot point, replaced by a less expensive hosting subscription that’s based on usage. From a hardware standpoint, a hosting provider not only houses the necessary infrastructure that runs your applications, but is also in charge of integrating, monitoring, maintaining, and upgrading the equipment—not to mention employing the people who perform those duties. 

“Hosting is aggregation. Having several customers using the same Web-hosted facilities distributes and cuts costs,” says Mudar Yaghi, CEO of AppTek, a hosted speech recognition software developer. “You’re just paying for the amount of infrastructure required for your application.”

The typical time to a return on investment for a hosted speech solution ranges from three to six months, while for an on-premises solution it is around nine to 15 months, according to Hong. “It depends on deployment size and the types of applications being deployed,” he said during the Webinar. “Still, by 2010 and at the current rate, 50 percent of new deployments will be hosted.”
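The arithmetic behind a payback estimate is simple: divide the upfront cost by the net monthly savings. The Python sketch below uses invented dollar figures chosen only to land inside the ranges Hong cites; they are not drawn from any real deployment.

```python
def payback_months(upfront_cost, monthly_savings, monthly_fees=0.0):
    """Months until cumulative net savings cover the upfront cost."""
    net = monthly_savings - monthly_fees
    if net <= 0:
        raise ValueError("never pays back: monthly fees meet or exceed savings")
    return upfront_cost / net


# Hypothetical figures: hosted has a small setup cost but higher monthly fees;
# on-premises has a large capital outlay but lower recurring costs.
hosted = payback_months(upfront_cost=40_000, monthly_savings=30_000, monthly_fees=20_000)
on_premises = payback_months(upfront_cost=300_000, monthly_savings=30_000, monthly_fees=5_000)

print(hosted)       # 4.0
print(on_premises)  # 12.0
```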

Hosted applications also offer advantages in terms of capacity and scalability. Think about a florist who’s about to go into his busiest time of the year, Valentine’s Day. Similarly, consider the holiday season that just ended, spiking the number of calls to retailers, shipping companies, and travel agencies. Ditto for health insurance companies that offered open enrollments for the new year.

“If you have a situation where you have a large volume of calls but it doesn’t happen consistently, such as inbound calls coming in for Christmas, you’d be paying a lot to own that much year-round capacity on-site,” says Marc LeFleur, an architect with Parlance, which offers both hosted and on-premises speech-enabled telephony applications. “A hosted solution, in which capacity is held by a solution provider, costs substantially less if you’re holding an event once or twice a year.”
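A rough comparison shows why seasonal peaks favor paying by usage. The sketch below assumes a flat yearly cost per owned port and a flat hosted rate per minute. Both pricing models and every number in it are hypothetical simplifications.

```python
def owned_cost(peak_ports, annual_cost_per_port):
    """Yearly cost, in dollars, of owning enough ports to cover the busiest month."""
    return peak_ports * annual_cost_per_port


def hosted_cost(monthly_minutes, cents_per_minute):
    """Yearly cost, in dollars, of paying a host only for the minutes used."""
    return sum(monthly_minutes) * cents_per_minute / 100


# Eleven ordinary months and one holiday month with ten times the traffic.
monthly_minutes = [50_000] * 11 + [500_000]

print(owned_cost(peak_ports=200, annual_cost_per_port=1_500))  # 300000
print(hosted_cost(monthly_minutes, cents_per_minute=10))       # 105000.0
```

In this invented scenario, owned capacity is sized for one month and sits mostly idle for the other eleven. With steadier traffic, the gap narrows and could reverse.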

A sudden influx of inbound calls isn’t the only reason to scale. Outbound IVR applications will represent the biggest area of adoption among companies in the next two years, worth roughly $525 million by the time 2010 rolls around, Hong said. Still, the reasons for having them hosted are the same as for inbound.

“With hosted, companies can prevent investment in superfluous equipment that has low utilization rates. Why invest in 20,000 IVR ports when you just need them for specific campaigns at certain times of the month?” Hong said. “With a hosted model, enterprises can increase agent effectiveness by letting them handle more complex issues with customers, whereas lower-level tasks can be handled via speech self-service.”

For smaller companies, access to capacity that would cost hundreds of thousands of dollars to own also goes a long way in leveling the playing field against bigger competitors. 

“For growing companies that want to add a new system, outbound messaging, or a nice IVR, [a hosted solution] allows those customers to have a phone system that you’d find at large enterprises without experts on staff,” LeFleur says.

In fact, part of the attractiveness of a good hosted provider is its combined experience, not only in the people it hires but also because those employees embody know-how across multiple types of applications and best practices, Jamison says. 

That type of experience can be costly to retain in-house, especially for smaller companies that can benefit from turning over the software reins so they can focus on their core competencies, such as creating a better caller experience.  

“If I reallocate head count to do other things and outsource what they do today, I’m effectively doing more with less,” says Scott Manghillis, hosted solutions product manager at Intervoice (now part of Convergys). “The beauty of [hosted software] is you make sure your vendor is meeting your SLA, and then you pay the bill. When you get called into the boardroom, all you have to do is say, ‘Yes, they’re delivering on the service.’”

Along the same lines, the speed at which a new application can be added is typically quicker with a hosted model than when the solution is kept in-house. “As new features become available, migration is included in our hosted model,” Manghillis says. “Whatever is next, the customer no longer has to care about it. That’s our job.”

Not All or Nothing

Choosing between outsourcing your applications and keeping them in-house doesn’t have to be an absolute decision. In fact, many companies, especially those that can’t afford to have an outage, are combining both approaches, Jamison says.

“Companies can develop and/or run their applications in-house but have a hosted provider take a copy of it for business continuity,” she says. “When a customer gets too many calls, they can be diverted to the hosted facility for more ports. The customer is only paying for the minutes it uses at the hosted facility when it needs them.”
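The overflow arrangement Jamison describes can be pictured as a simple routing rule: fill the in-house ports first and send anything beyond that to the host. The following Python sketch is a toy model with hypothetical names and made-up load figures, not a description of how any provider implements it.

```python
def route_call(active_in_house, in_house_ports):
    """Send a new call in-house if a port is free, otherwise to the hosted facility."""
    return "in_house" if active_in_house < in_house_ports else "hosted"


def overflow_minutes(concurrent_by_minute, in_house_ports):
    """Port-minutes that spill to the host, given concurrent calls for each minute."""
    return sum(max(0, calls - in_house_ports) for calls in concurrent_by_minute)


# Concurrent calls during five consecutive minutes, with 60 ports on premises.
load = [40, 55, 120, 90, 30]

print(route_call(active_in_house=59, in_house_ports=60))  # in_house
print(route_call(active_in_house=60, in_house_ports=60))  # hosted
print(overflow_minutes(load, in_house_ports=60))          # 90
```

Only the 90 overflow minutes would be billed by the host in this example; the rest of the traffic never leaves the premises.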

In a December white paper she wrote for Echopass, a provider of hosted contact center solutions, Jamison also points to “wild weather” as a reason companies choose both software models, and one of several factors that have led to a tipping point in acceptance of hosting.

“Hurricane Katrina in New Orleans first awakened many people to the long-term business disruption that can be caused by a storm out of control. But that was just the start,” she wrote. “Since then there has been a big increase in discussions by vendors and IT/contact center managers to build hosted services into near-term and long-range plans, not just as a precaution, but as a strategic move.” 

Increased adoption of hosted speech solutions has also paralleled improvements in Voice over Internet Protocol (VoIP), “making calls to a hosting facility entirely affordable and taking slow speed of remote application access off the table,” Jamison wrote. 

Once you’ve decided that hosting is the way to go, the next move is choosing a hosted speech provider. Here, consider how long a company has been in business, says Jim Larson, an independent speech consultant and VoiceXML trainer. “I’d go with a more established company that has a proven track record and has been in business for several years,” he advises. “They’ll be the ones that are likely to remain in business during economically stressed times.”

Talk to those companies’ customers, too. “Find out the real dirt—what they like and don’t like about the company,” Larson says. “Find out how they deal with problems.”

A provider that supports open standards is also important. “Open source gives you a lot more options,” Tellme’s Crichlow says. “You want a system that’s future-proof to protect your investment. Otherwise, you’re at the mercy of a vendor.”

Finally, as with all technology engagements, communication is key. 

“One thing we and our competitors make a big point of is communication, and making sure both the customer and the vendor truly understand what the customer wants, what the limitations are, and what the time line should be,” Parlance’s LeFleur says. “As long as the expectations are set reasonably and everyone understands what’s being asked for, it works out really well.”
