September 1, 2004
Q & A

Steve Pollock, TuVox

Q. Steve, what changes have you seen in speech over the last couple of years?
A. Over the last couple of years, the focus has really shifted from speech recognition accuracy, to deploying custom speech applications, and now to cost-effectively deploying a larger number and more diverse types of speech applications. It's hard to argue with the value speech automation can offer to improve caller satisfaction and reduce costs, but it becomes a question of unlocking that ROI and overcoming the costs of building and maintaining the applications. And so the obvious thing to do is focus on ways of reducing the costs of building, managing and maintaining sophisticated speech applications.

Q. Why do you think these things are coming about?
A. The broader interest in speech applications has forced the industry to shift. Very early on, the focus was on the largest call centers and the narrowest types of calls (relatively few dialog states). So even if it cost $1 million to design and build one application, that was ok since there were so many millions of calls. But those economics no longer work as we look at broadening the type of calls - so we need to design more dialog states - and as small- and medium-size companies want to deploy this type of automation - where there are fewer calls per application.

What we're seeing is the normal evolution that we've seen before in enterprise software from a custom development model to pre-packaged application functionality. For instance, in the CRM market you originally had to build the application yourself; pre-packaged applications eventually emerged and brought down the cost.

Q. According to recent reports from TuVox you have had nice quarter-to-quarter sales increases. To what reasons do you attribute this growth?
A. We're really seeing our customers and prospective customers embrace the notion of enterprise software for speech applications, which is really modeled after what we've seen happen in other enterprise application areas. It's the concept of pre-built applications built with and extensible through high-level tools and components. There's a whole set of tools that allow you to manage and maintain content, basically giving you the full ability to customize without any restrictions. Under PeopleSoft you'll find PeopleTools; under Siebel, you'll find SiebelTools. We think that whole model applies directly to the speech industry and we're finding a lot of acceptance around the idea and our product.

And, maybe even more important to point out, we have 100 percent referenceability across our customer base. We've been very successful meeting commitments with our customers, deploying applications quickly, meeting our quality objectives, and satisfying our customers.

Q. Please discuss a couple of recent deployments and what impact speech had for their organization.
A. One of our customers recently completed building and deploying well over 100 applications in a period of 6 months. They were looking to move from a touch-tone system to a speech application, with the specific objective of improving the total automation rate of calls. Early data now shows a 50 percent increase in automation over the touchtone system. TuVox enabled them to deploy an enterprise application with a tremendous amount of dialog content very quickly. That quick deployment time means a significant impact on cost savings, so they could begin achieving that high automation rate and meet their business objectives.

Another customer was looking to have a speech application in place as part of an effort to prove the viability of their long-term customer service plan. In order to do that, it was critical for them to have their speech application deployed in only six weeks and exceed the performance stats from their touchtone applications. This was critical for them to overcome concerns about whether or not their customer base would adopt speech applications. We were committed to helping them meet their business objectives and getting their plan in place. We built their speech application, integrated with their CTI to provide screen pops to the desktop, and of course integrated with their back end systems, and fully deployed the application on premise, and all within a six-week release cycle.

Q. When you look at your customers what are they finding that speech is doing for their organizations?
A. Clearly it varies from customer to customer. Several deal with seasonal call patterns. So, a big benefit of speech automation is evening out the call traffic and helping them more easily manage staffing during those peak times. Second, we see speech helping companies manage cost and maintain their competitiveness. If your competitors are streamlining their operations and using various types of automation, you've got to find a way to be more flexible and offer better delivery of products and services. Deploying speech applications with TuVox allows them to stay cost-competitive. And, in many cases, speech can make your call-center agents more effective. One of our customers reports that when callers are able to solve some of their problems with the system, by the time they get to an agent they're in a better mood than when they were dealing with a touchtone system that didn't provide value and left them frustrated.

Q. Are your customers that currently deploy speech continuing to invest in more speech initiatives and if so what is the motivation?
A. Oh, absolutely. Of course there are a lot of opportunities to improve touchtone applications, but speech applications offer many more avenues for improvement. Many companies we work with change their product offerings annually or even multiple times a month. To maintain a high completion rate and to improve the system, a continued focus is placed on maintaining and updating the system content. That core focus of increasing call completion requires ongoing tuning and monitoring of what's happening in the system as well as looking at changes in the business and what you need to change within the application to support that. In addition, there are more types of calls that can be automated, so there's a continual process of evaluating the kinds of calls that can be automated. It really comes down to understanding the value of a minute of agent time saved, and balancing that against the cost of building and maintaining automation.
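The balance described above, the value of agent minutes saved against the cost of building and maintaining automation, can be sketched as simple arithmetic. This is only an illustration: the function name, its parameters, and every figure in the example are hypothetical placeholders, not numbers from TuVox or its customers.

```python
def monthly_net_value(calls_per_month, automation_rate,
                      agent_minutes_saved_per_call,
                      agent_cost_per_minute, monthly_automation_cost):
    """Value of agent time saved minus the cost of the automation (per month)."""
    automated_calls = calls_per_month * automation_rate
    savings = (automated_calls
               * agent_minutes_saved_per_call
               * agent_cost_per_minute)
    return savings - monthly_automation_cost


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example = monthly_net_value(
    calls_per_month=100_000,
    automation_rate=0.30,              # share of calls fully automated
    agent_minutes_saved_per_call=2.0,
    agent_cost_per_minute=0.50,
    monthly_automation_cost=20_000,    # build cost amortized plus maintenance
)
print(f"Net monthly value: {example:,.0f}")
```

A positive result suggests a call type is worth automating under these assumptions; a negative one suggests the maintenance burden outweighs the agent time saved.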

Another driver is call routing. While it's possible to provide more types of automation with speech, you need to provide a way for callers to access the automation. Where we've seen some companies start with call routing, others evolve there to provide better overall access to their company and open up more avenues for automation.

The other driver is improving customer satisfaction. We offer a caller satisfaction initiative where we monitor callers through the system, as well as through post-call survey mechanisms, to understand how callers perceive the system, how well it's working for them, and what kinds of improvements they would like to see.

Q. What are Steve Pollock's five 'best practice' deployment considerations that you could share with our readers?
A.

1. Understand why people are calling. This information is critical to identify necessary and accurate content.
2. Understand caller preference and skill. Remember there is a difference between a power user and an occasional caller and allow direct dialogue and natural language routing. Be able to quickly accelerate someone to an agent when necessary.
3. Caller in control. There are two parts to caller in control. One is allowing the caller to opt-in to automation vs. waiting for a live agent. Another is the set of consistent global voice commands, like wait, pause, go back, repeat, and next, that allow the caller to navigate through the system. If you're asking for data that the caller doesn't have in hand, allow the caller to say "wait" while they get their credit card, so they don't hang up and call back. Also if you're providing the caller with data they need to write down, they may need to pause to capture that. These features are often what keep a caller from zeroing out to an agent, creating a better and more successful caller experience and increasing adoption rates.
4. Plan for application updates. When comparing competitive bids or products, it's important to look past the upfront costs of building a speech application. If you look at the total lifecycle cost of the application, you may spend more maintaining the application than building it. If your company's products or business practices are relatively static, that might not be a big deal. But if your business is changing and dynamic, possibly acquiring other companies or changing product lines, there may be a lot of work involved in updating and maintaining the system. Like a Web site, speech applications will change and grow over time. It's key to ensure that updates are quick, easy and cost-effective. Also, if you're looking to do the work in-house, be sure to understand the tools provided and skills required to maintain your applications.
5. Plan for extending automation. Once you've deployed your initial speech applications, callers will quickly adopt automation and request more features. Not only do you need to plan to maintain the current applications, but also to extend the applications to maximize automation. It's also important for your speech application to integrate with live agents. So, if a caller is able to partially complete their call through the system, but then needs an agent, that transition should be seamless. You've got to plan for a mechanism to bridge that experience using CTI technology while providing a great caller experience. With applications that are quick to develop and easy to deploy and maintain, expanding automation will increase customer satisfaction and improve the bottom line.
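The "caller in control" idea of consistent global voice commands can be sketched as a small dispatcher that checks for a global command before handing the utterance to whatever the current dialog step expects. This is a minimal illustration, not TuVox's implementation: the function name, the step numbering, and the action labels ("hold", "replay", "play", "local") are all assumptions made for the example.

```python
# Commands named in the interview; they should behave the same in every dialog step.
GLOBAL_COMMANDS = {"wait", "pause", "go back", "repeat", "next"}


def handle_utterance(utterance, step, last_step):
    """Return (new_step, action) for a recognized utterance.

    Steps are numbered 0..last_step. Anything that is not a global
    command is passed to the current step's own handling ("local").
    """
    command = utterance.strip().lower()
    if command in ("wait", "pause"):
        return step, "hold"      # stay put until the caller is ready
    if command == "repeat":
        return step, "replay"    # play the current prompt again
    if command == "go back":
        return max(step - 1, 0), "play"
    if command == "next":
        return min(step + 1, last_step), "play"
    return step, "local"         # not a global command


# A caller fetching a credit card says "wait" at step 2 of a 6-step dialog.
print(handle_utterance("Wait", 2, 5))
```

Checking the global commands first is one way to keep them consistent everywhere, so a caller who needs a moment has an alternative to hanging up or zeroing out to an agent.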

Q. Who do you consider your competitors and what differentiates you from your competition?
A. Our biggest competitor is custom-built applications done either by a vendor's professional services group or by internal IT groups. This is a pretty classic phase of an enterprise software market, so that's to be expected. Companies are trying to figure out, given the VXML standard, why can't we write the apps ourselves? To me, this is no different from the way companies address their Web sites. While HTML standardizes the GUI, VXML standardizes the [...]. What has emerged in the [...] (after all, they're just HTML, right?). Early HTML apps were written from scratch. However, today, if you're building an enterprise-class Web site, the more common approach is to use a complete enterprise application (complete with content management capabilities, workflow, pre-built components), or a set of high-level tools and pre-built components, like shopping carts, product viewers, search, and cross-sell. The same evolution is happening in the voice applications market, but accelerated dramatically.

The acceleration is driven by two things: first, the complexity of speech applications, and second, the clear parallel to the Web-based model. The complexity of speech applications is easy to underestimate for companies looking to build the applications in JSP or ASP and go straight to VXML or SALT. When I pick up a lot of the VXML books in the bookstore, it's easy to see how it can look easier than it is. Thus you have an emerging market of pre-built applications, or what we refer to as enterprise software for speech applications.

What first differentiates the TuVox solution is time to deployment. TuVox enterprise software has a lot of pre-built functionality designed to reduce the amount of work across the entire lifecycle of the application, giving you a tremendous advantage in actual time to deployment.

Secondly, that built-in functionality assures a higher level of quality out of the box. You're working from built-in best practices and built-in functionality that's already been tested, so you can deploy quickly and with high quality.

The third thing is around flexibility. While custom applications give you a lot of design flexibility—offering any feature you want—they don't give you business flexibility. Business flexibility is being able to quickly change the speech application in response to a change in business needs. TuVox enterprise speech software enables you to update the application very quickly and at any time.

With custom applications, a large number of resources are needed to maintain the application. Many companies become dependent on third-party organizations, because there are some key skills they simply can't replicate in-house. TuVox has provided a high level of tools and pre-built components to reduce the skill level required and the amount of resources needed in-house to build and manage the application.

When you add all of this up, it's really about the total cost of ownership of the speech application. The key point of differentiation for TuVox is looking at the total cost of ownership over a 3- or 5-year lifecycle of the application.
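A total-cost-of-ownership comparison over a 3- or 5-year lifecycle reduces to upfront cost plus recurring maintenance. The sketch below shows only the shape of that calculation. The two options and all dollar amounts are invented placeholders, not figures from the interview or from any vendor, and real comparisons would include more cost categories.

```python
def total_cost_of_ownership(build_cost, annual_maintenance_cost, years):
    """Upfront build cost plus maintenance over the given lifecycle."""
    return build_cost + annual_maintenance_cost * years


# Hypothetical option A: cheaper to build, costlier to maintain.
# Hypothetical option B: costlier to build, cheaper to maintain.
for years in (3, 5):
    option_a = total_cost_of_ownership(500_000, 300_000, years)
    option_b = total_cost_of_ownership(700_000, 100_000, years)
    print(f"{years} years: A = {option_a:,}  B = {option_b:,}")

option_a_5yr = total_cost_of_ownership(500_000, 300_000, 5)
option_b_5yr = total_cost_of_ownership(700_000, 100_000, 5)
```

With these made-up inputs the option with the lower upfront cost ends up more expensive over the lifecycle, which is why the comparison has to look past the initial bid.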

Q. Any last thoughts?
A. We're very excited to see the progress that the speech industry is making, the excitement around speech applications, and the growing adoption of speech. A lot of things are coming together for the speech industry including the overall improvement of speech recognition. And the focus of the largest software companies in the world on enhancing their own entries into the speech market is growing awareness around speech applications and driving overall demand for the industry. We're all very excited about the new approaches and our ability to go out and help customers solve core business problems with the technology.
