Considerations for Choosing a VoiceXML Platform or Service Provider

Enterprises and carriers are shifting investment from proprietary hardware-based voice systems to open VoiceXML systems in order to take advantage of the flexibility and cost savings that VoiceXML and other open standards can provide. According to Daniel Hong of Datamonitor, revenues from proprietary touchtone IVR in North America and in Europe, the Middle East and Africa (EMEA) are expected to decrease by more than 35 percent through 2009, as a growing number of businesses opt to invest in emerging open-standard IVR platforms to better leverage Web infrastructure, improve functionality and potentially graduate to speech technology to further improve routing, transactions and self-service capabilities.

VoiceXML Forum Platform Certification Program
By Ken Rehor

A primary goal of VoiceXML is interoperability. To quote the VoiceXML 2.0 standard, "VoiceXML is a common language for content providers, tool providers, and platform providers."  The VoiceXML Forum's Certification Program supports this goal by certifying that implementation platforms pass all the required certification tests. Benefits to the industry include:

  • increased choice of a broad range of products: applications, tools, platforms, and service providers
  • reduced customer confusion over language support and compatibility by testing and documenting a vendor's specific implementation
  • a common test suite available to the entire industry
  • certified third-party, independent testing

The clarity and certainty that this program provides are good for vendors, good for customers and good for the industry. Application portability allows choice of vendors for different application requirements, such as cost, scale, reliability, administration features, telephony capabilities, and other features. Another critical choice is deployment architecture: you can deploy your VoiceXML application in-house using an on-premises platform, outsource to a voice service provider, or use a combination of the two.

Customers that purchase only certified VoiceXML platforms know that the systems and services they purchase are fully standards-compliant.  This assurance is important for companies seeking to avoid the risks associated with vendor lock-in and with obsolescence by adopting a standards-based technology approach.  Additionally, the features and capabilities of certified platforms are well-understood, so customers have confidence that they understand what they're buying. According to Steve Cramoysan, research director at Gartner, "Customers should require that vendors, who claim conformance to VoiceXML, demonstrate their commitment by gaining formal certification of their platforms."

Conversely, vendors whose VoiceXML platforms have been certified make it easy for customers to select those platforms.  The certification is a badge of product maturity, industry awareness and customer focus.  The availability of industry-certified platforms can shorten the sales cycle and increase marketplace adoption generally, which is good for all vendors.

Through the end of 2005, 14 companies had certified VoiceXML platforms or services, with more vendors now in the testing process.  For a complete list of certified implementations see the VoiceXML Forum's Web site at http://www.voicexml.org/

The Certification Program consists of three principal components: the Conformance Test Suite, which is based on the W3C VoiceXML 2.0 Recommendation and the W3C VoiceXML 2.0 Implementation Report; the Test Harness, which simplifies running of the tests (available to Forum members); and formal certification testing that is administered by an independent third party on behalf of the Forum. The test suite includes more than 600 tests.  To aid vendors in preparing for certification, the test suite is available for public download from the Forum's Web site.

Complete details of the certification process are also available on the Forum's Web site.

What's Next
As vendors release new versions of their products, watch for re-certification of previously certified VoiceXML 2.0 implementations.  Also, the Forum will update the Certification Program to include support for VoiceXML 2.1 once the W3C releases the final version of the standard (probably in the first half of 2006).

To simplify the certification process, an online version of the test suite will be made available so the tests can be executed directly via the Web without having to install any software locally.

The VoiceXML Forum is developing plans for an interoperability testing event to be held in the second half of 2006 for vendors of VoiceXML tools, packaged applications, platforms, and services. For more information, or to join the effort, contact Cindy Tiritilli (cindy@voicexml.org).

VoiceXML is the "glue" of a voice Web application - it provides the structure of a voice dialog in a manner similar to HTML in a Web page. But VoiceXML is only one piece of the puzzle of components and standards in a voice Web application. Coupled with related standards like SSML, SRGS, CCXML, and MRCP, VoiceXML-based applications offer tremendous flexibility, whether deployment means purchasing equipment, outsourcing via hosting service providers, or a combination of the two.

VoiceXML initially was considered a product unto itself - the VoiceXML platform. In reality, companies were simply building open standards-based IVR platforms. Now, just six years after it was first published as an industry proposal, VoiceXML is a key feature of many products, including:

  • Open standards IVR systems
  • Media servers
  • Voicemail and messaging systems
  • PBXs and soft switches
  • Multimodal "browsers"
  • Voice control of devices, such as televisions, car radios, security systems, and more

When deciding whether to choose a VoiceXML-certified platform, a hosting service provider, or a blend of the two, there are several factors to consider.

Evaluating the various providers is a two-part process: first, find platforms or providers that offer the features required by your voice applications; then, decide whether to host the applications in-house or outsource to a service provider.

Part 1 - Evaluate Features

Does the VoiceXML platform or service provider offer the features and support you need for your voice applications? Before answering this question, you should first know what features are required for your voice application system.  These requirements can be broadly grouped into two categories: VoiceXML features - the dialog or user-interaction capabilities required by your voice applications - and deployment features - which define system properties and constraints like system availability and uptime, latency, reliability, call volumes and durations, hardware, network interfaces, security, and storage requirements.
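One way to make this first pass concrete is to write the two categories of requirements down as simple lists and compare each candidate against them. The following is only a minimal sketch in Python; the candidate names, the feature labels, and the missing_features helper are all hypothetical and are not drawn from any vendor's data sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical platform or service provider and what it offers."""
    name: str
    voicexml_features: frozenset
    deployment_features: frozenset


def missing_features(candidate, required_voicexml, required_deployment):
    """Return the required features the candidate lacks (empty set = qualifies)."""
    missing_vxml = set(required_voicexml) - candidate.voicexml_features
    missing_deploy = set(required_deployment) - candidate.deployment_features
    return missing_vxml | missing_deploy


# Illustrative requirements only; the labels are made up for this sketch.
required_voicexml = {"call-transfer", "barge-in", "builtin-grammars"}
required_deployment = {"failover", "vpn-access"}

candidates = [
    Candidate(
        "Platform A",
        frozenset({"call-transfer", "barge-in", "builtin-grammars"}),
        frozenset({"failover", "vpn-access"}),
    ),
    Candidate(
        "Provider B",
        frozenset({"call-transfer", "barge-in"}),
        frozenset({"failover", "vpn-access"}),
    ),
]

# Keep only the candidates with nothing missing.
shortlist = [
    c.name
    for c in candidates
    if not missing_features(c, required_voicexml, required_deployment)
]
print(shortlist)
```

In practice the lists would be longer and some requirements would be weighted rather than mandatory, but even a simple comparison like this shows which gaps rule a candidate out.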

An Important Part of This Equation Is Standards
Are the features you need implemented according to open standards or via proprietary implementations?  Voice applications and platforms built on open standards streamline your initial implementations, and allow for future growth, making it easier to migrate to other vendors as your business requirements change. 

A key part of any voice application system is VoiceXML conformance. The VoiceXML Forum Platform Certification Program certifies that implementation platforms conform to the W3C VoiceXML 2.0 specification. (Currently, there are 14 certified platforms.)  To be certified, a platform must support all required elements in the specification.  Certification is a critical factor, but supporting only the required VoiceXML features doesn't guarantee that the platform meets all your needs.  Features you may consider essential - such as call transfers (and specific sub-features), built-in grammars, and speech recognition (including barge-in) - are actually optional according to the VoiceXML standard and thus the certification program.
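To see which of these optional features your own applications rely on, you can scan the VoiceXML pages for the markup that invokes them. The sketch below uses Python's standard XML parser on a small made-up VoiceXML 2.0 document. The sample dialog, the telephone number, and the optional_features_used helper are assumptions for illustration, and the checks are a rough heuristic covering only the three features named above, not the certification program's definition of what is optional.

```python
import xml.etree.ElementTree as ET

# Namespace of VoiceXML 2.0 documents, in ElementTree's {uri}tag form.
VXML_NS = "{http://www.w3.org/2001/vxml}"

# A made-up dialog that uses a built-in grammar, barge-in and a transfer.
SAMPLE = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <field name="pin" type="digits">
      <prompt bargein="true">Please say or enter your PIN.</prompt>
    </field>
    <transfer name="agent" dest="tel:+15555550100" bridge="true">
      <prompt>Connecting you to an agent.</prompt>
    </transfer>
  </form>
</vxml>
"""


def optional_features_used(vxml_text):
    """Return labels for the optional features this document appears to use."""
    root = ET.fromstring(vxml_text.strip())
    used = set()
    for element in root.iter():
        tag = element.tag.replace(VXML_NS, "")
        if tag == "transfer":
            used.add("call transfer")
        if tag == "prompt" and element.get("bargein") == "true":
            used.add("barge-in")
        # A field's type attribute names a built-in grammar type.
        if tag == "field" and element.get("type"):
            used.add("built-in grammar")
        if tag == "grammar" and (element.get("src") or "").startswith("builtin:"):
            used.add("built-in grammar")
    return used


print(sorted(optional_features_used(SAMPLE)))
```

A report like this gives you a list of questions to put to each vendor: for every feature it flags, ask whether and how the certified platform supports it.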

Other components of the W3C Speech Interface Framework (developed by the W3C Voice Browser Working Group http://www.w3.org/Voice/), such as SRGS and SSML, are also required.

Let's take speech recognition as an example. When comparing vendors' ASR offerings, ask whether (and how) they support the relevant standards - SRGS and SISR.  The more complete their support for these standards, the more portable your voice applications become. Then ask not only whether the platform or service provider supports the ASR vendor and software versions you need, but also how it integrates with the ASR engine. Does it integrate via an established standard (MRCP), or through a proprietary integration (for example, to a vendor's API)?  MRCP integration allows you to select from many vendors, and possibly implement several different engines according to the recognition features you need.

Proprietary or platform-specific implementations are not necessarily a bad thing. They help identify and address gaps in the existing standards. Take speaker identification and verification (SIV), a feature not currently addressed in VoiceXML.  Today, there are many platform-specific implementations of speaker biometrics extensions to VoiceXML, and industry demand has pushed SIV into the requirements for VoiceXML 3.0.

Part 2 - In-House vs. Outsourced Deployments

Once you have a list of candidates that offer the features you need, then the decision is whether to build and deploy your voice applications in-house or outsource the hosting to a service provider. This decision comes down to cost and control.

In-house deployments give you local control of all systems - the voice server, application server, and database servers can all reside on a local network, and you can duplicate development and deployment environments. However, local control of resources also means local control of physical security, failover, reliability, and scalability - and additional IT expenses for hardware, software, support, and staffing.

On the other side, outsourced deployments can reduce costs: there are no capital investments, pricing is pay-as-you-go, and the staffing and hardware for failover, reliability, and scalability are the responsibility of the service provider. The trade-off is a loss of local control over your systems. Physical security of the equipment and networks is managed by the service provider, and you'll need VPN access or dedicated data connections to your back-end systems.

This isn't strictly an "either/or" decision - you can balance between in-house and outsourced solutions. You could deploy primarily in-house, with overflow during peak usage or failover routed to a service provider. Or, you could develop applications in-house and push to the service provider when ready to deploy.
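The overflow-and-failover arrangement can be pictured as a simple routing rule. The following Python sketch is hypothetical: the route_call function, the capacity figure, and the health flag are assumptions for illustration, and a real deployment would make this decision in the telephony network or call-control layer rather than in application code.

```python
def route_call(active_in_house_calls, in_house_capacity, in_house_healthy=True):
    """Decide where a new inbound call should be answered."""
    if not in_house_healthy:
        # Failover: the in-house platform is unavailable.
        return "service-provider"
    if active_in_house_calls >= in_house_capacity:
        # Overflow: every in-house port is busy during peak usage.
        return "service-provider"
    return "in-house"


# Assumed capacity of 48 ports, chosen only for the example.
print(route_call(10, 48))
print(route_call(48, 48))
print(route_call(0, 48, in_house_healthy=False))
```

Because the same VoiceXML application can run on both sides, a rule like this only works as intended when the in-house platform and the service provider support the same set of features.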
