November 1, 2011
By Kevin Brown, enterprise architect, Miratech.
Inside Outsourcing

Are There 31 Outsourcing Flavors?

When IVR speech technology vendors, outsourcing partners, and customers hold public conversations, it sometimes sounds as if no one quite understands what the others are saying.

A simple question on a speech technology blog or public forum about outsourcing some part of an IVR service often prompts an exchange focused on reaching a common understanding of what is meant by the term "outsourcing." During the past 10 years, speech technologies and networking have progressed to the point where a speech-enabled solution can be partitioned and distributed across multiple instances, platforms, locations, or organizations. The level of understanding among the parties is frequently uneven, depending on how involved each has been with newer speech technologies and their deployments.

Let's look at the current state of IVRs and their associated speech technologies to better understand why the single term "outsourcing" is no longer definitive without clarification.

Telephony platform—Previously, almost all IVRs were behind an ACD/PBX. Today, the IVR takes 100 percent of the calls, with some going to agents and others staying in the IVR. There are some exceptions to this model, but those are not the rule.

Media controller/call controller platforms—Core IVR components provide voice browser (using VoiceXML) and call setup, monitoring, and teardown capabilities (using CCXML).

Resource managers/proxies/administration platforms—Core IVR management and assisting capabilities are provided by these components.

MRCP servers/ASR and TTS—Speech recognition and text-to-speech capabilities.

Application servers—IVR applications containing logic reside here.

Reporting servers—Databases and reporting applications reside here.

In the last century (yes, IVRs have been around that long), virtually all IVRs were premises-based, and functionality was contained in a small server stack; if you needed more ports, you deployed more server stacks, regardless of which pieces of the IVR needed to scale up. The short list above highlights the key components of today's platforms at the highest level, and it could be divided further for large deployments.

In the past five years, as high network bandwidth has become available at diminishing cost, many hosting providers and end users have begun to split these components across geographic locations. This capability opened the door to hosting some portions of the platform while keeping others in the customer's own data centers, and to splitting responsibilities, outsourcing some portions and retaining management of others, when it makes sense.

One of the more common combinations is outsourcing MRCP capabilities while retaining the rest of the platform. This helps organizations that use speech technologies and face seasonality challenges or unpredictable call volume spikes. Another common outsourcing scenario is to have most of the platform hosted while retaining applications in the customer's data centers. Content caching and high bandwidth make this combination workable, just as they support the TTS capabilities of the hosted-MRCP model described above.

A somewhat less common but growing combination is to separate reporting capabilities from the applications. That can be done either by splitting the two between hosting companies or by outsourcing only the applications or only the reporting capabilities while the customer retains the other.
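The combinations described above can be pictured as a simple mapping from each platform component to whoever runs it. The following is only an illustrative sketch: the component identifiers, the `Party` enum, and the `plan` and `outsourced` helpers are hypothetical names invented here, and "most of the platform hosted" is assumed to mean everything except the application servers.

```python
from enum import Enum


class Party(Enum):
    CUSTOMER = "customer data center"
    HOST = "hosting provider"


# Hypothetical identifiers for the six components listed earlier.
COMPONENTS = (
    "telephony",
    "media_call_control",
    "resource_management",
    "mrcp_asr_tts",
    "application_servers",
    "reporting_servers",
)


def plan(default, **overrides):
    """Build a component -> Party mapping, starting from one default party."""
    unknown = set(overrides) - set(COMPONENTS)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    return {name: overrides.get(name, default) for name in COMPONENTS}


def outsourced(deployment):
    """Return the components that are not run by the customer."""
    return [name for name, party in deployment.items()
            if party is not Party.CUSTOMER]


# Outsource only speech recognition and TTS; retain everything else.
mrcp_only = plan(Party.CUSTOMER, mrcp_asr_tts=Party.HOST)

# Host the platform but keep applications in the customer's data center
# (assumption: every component except the applications is hosted).
apps_retained = plan(Party.HOST, application_servers=Party.CUSTOMER)

# Separate reporting from applications (here, only reporting is outsourced).
reporting_split = plan(Party.CUSTOMER, reporting_servers=Party.HOST)

if __name__ == "__main__":
    for label, deployment in [("MRCP only", mrcp_only),
                              ("Apps retained", apps_retained),
                              ("Reporting split", reporting_split)]:
        print(label, "->", outsourced(deployment))
```

Writing a plan down this way makes it easier for a vendor, an outsourcing partner, and a customer to confirm that they mean the same thing by "outsourcing" before the conversation goes further.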

It is possible to outsource your main IVR hosting while outsourcing your application development to another provider and your speech tuning requirements to yet another company.

As a result of those recent developments in IVR architecture and networking capabilities, many combinations of outsourcing IVR speech technologies exist today.
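To get a rough sense of how quickly the combinations multiply, the sketch below counts the ways the six components listed earlier could be assigned to different parties. It is a deliberate simplification: it assumes every component can be placed independently of the others, which real deployments may not allow, and the component and party names are hypothetical labels chosen for illustration.

```python
from itertools import product

# Hypothetical identifiers for the six components listed earlier.
COMPONENTS = (
    "telephony",
    "media_call_control",
    "resource_management",
    "mrcp_asr_tts",
    "application_servers",
    "reporting_servers",
)


def count_plans(components, parties, retained_label="customer"):
    """Count assignments of components to parties.

    Returns (total, with_outsourcing), where with_outsourcing excludes
    the plans in which the customer retains every component.
    """
    total = 0
    with_outsourcing = 0
    for assignment in product(parties, repeat=len(components)):
        total += 1
        if any(party != retained_label for party in assignment):
            with_outsourcing += 1
    return total, with_outsourcing


if __name__ == "__main__":
    # One outsourcing partner: each component is retained or outsourced.
    print(count_plans(COMPONENTS, ("customer", "provider_a")))
    # prints (64, 63)

    # Two outsourcing partners, as when two hosting companies are involved.
    print(count_plans(COMPONENTS, ("customer", "provider_a", "provider_b")))
    # prints (729, 728)
```

Even under these simplified assumptions, a single partner yields 63 distinct plans that outsource at least one component, and the count does not yet include services such as application development or speech tuning.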

Key drivers for contemplating different outsourcing combinations are core competencies of your organization and outsourcing partner(s), call volumes and arrival patterns, call complexity, customer and contact value (think sales versus self-service of low-value products/services), and cost.

No matter how you dish it up, outsourcing speech technologies offers more flavors than ever.

Kevin Brown is an architect at HP Enterprise Services, where he specializes in speech solutions design. He has 18 years of experience in designing and delivering speech-enabled solutions, and he can be reached at kevin.c.brown@hp.com.