Brett Azuma, Senior Vice President of Marketing, IP Unity

Q Please tell us a little about IP Unity.

A IP Unity provides the platform and applications for carriers to offer enhanced voice, data and video services to their end users. Cable companies such as Comcast and Liberty Media, as well as RBOCs, CLECs and service providers use our media server platform and enhanced services applications to improve their top line. We are deployed in both next generation and legacy networks.

As telephony and data applications were becoming more sophisticated, IP Unity's team thought that some of the application-level media processing functions, such as conference mixing, text-to-speech and speech recognition, were better handled at the network level by a dedicated product. Voicemail, messaging, conferencing and call center applications all demand the same basic media processing, and this could be provided more efficiently and economically through this new product. This idea became the media server platform, which IP Unity launched in 2000.

It quickly became apparent that while carriers agreed with our vision for the future, in order to justify investment in new equipment they needed products that they could put to immediate use and generate immediate revenue. We therefore developed a suite of market-proven applications that carriers know they can sell; applications such as Web-enabled conferencing, voicemail, unified messaging and auto attendant.

Q Please describe your Media Server and Application Server products.

A Our media server, the Harmony6000™, is a high-density, carrier-class DSP engine that performs the high-speed, high-volume media processing functions required for enhanced services. These functions include playing announcements, recording, automated speech recognition (ASR), interactive voice response (IVR), text-to-speech, billing record creation (CDR, IPDR), transcoding and fax detection. We sell the media server on a per-port basis, making us very competitive. The platform's partitionable architecture, scalability and multi-network flexibility make it ideal for carriers that want to offer multiple enhanced services on a single platform.

The Harmony6000 Application Server, or Service Execution Server, works with the media server to enable turnkey deployment of service applications such as conferencing and unified messaging. The application server accepts SIP, H.323 and VoiceXML applications and communicates with the media server through JAIN/Parlay APIs and MGCP.

Q Why should service providers and carriers deploy speech solutions?

A Speech solutions cut opex and drive revenue. Vendors most commonly discuss the second point, and the basic idea is that the "sticky" services which consumers like and pay for mostly involve what we call "enhanced services." Dial tone doesn't inspire customer loyalty, but a user-friendly unified messaging service, or a great bundle of other services, might. The economics of the media server are revolutionizing the way these services can be offered. In the past, you needed at least a 5% adoption rate to make the business case for enhanced services. With the media server platform your ROI is much better, allowing niche services to be targeted at specific market segments.

Speech technology is heavily tied into reducing opex. Take something as simple as operator services over the past 10 years. Carriers saved millions of dollars per month when they switched to automated operators, and people are now more comfortable interacting with machines. Speech technology has advanced to the point where it can handle customer service and provisioning tasks with around 95% accuracy.

Q What is the Harmony6000 Platform?

A The platform consists of our purpose-built media server coupled with our service execution server and applications. Depending on the deployment, it might also involve a standalone text-to-speech or ASR server.

Q What will the recent addition of speech recognition technology to this platform mean to your customers?

A Carriers form the bulk of our customer base, as well as some large enterprises that run their own voice networks - I guess we can think of them as carriers with 100% penetration.

Some of our carrier customers are interested in using the new speech recognition capability as a front end to hosted call center services. Many of the simple tasks handled by human operators can be managed by the media server, freeing up the human operators to be the second line of service, taking over calls that are too complicated for the media server.

We are also currently speech-enabling our messaging and conferencing applications. I'm sure we have all been in voicemail jail, when you are driving and you can't hit seven followed by nine to delete a message. With speech-enabled messaging we can speak our commands, and don't get bogged down in our voicemail system. Best of all, our platform is speaker-independent, so you don't have to program the application to understand your commands. It's all about ease of use and pushing cost out of the equation, and in these regards, we've got a winning platform.

One thing that sets our speech capabilities apart is that we have embedded (SpeechWorks') technology onto the media server itself, and encoded the algorithms into the platform, rather than drawing a port onto a speech recognition server. While all customer deployments are unique, this integration into the platform typically means better and more efficient platform service, not to mention the benefits of simplified management and a smaller footprint.

Q Do you have plans to add text-to-speech or speaker verification to this platform in the future?

A Speaker verification is in our roadmap and we will have it out to market in 2004. We have seen a lot of demand from government and enterprises for authenticated conferencing. Some organizations are wary of audio/Web conferencing because they don't know who is on the call. Speaker verification will make the conference secure.

On a simpler level, users currently enter a pass code and digit to get into an IP Unity voice conference, but with speaker verification I can simply say my name and the platform will recognize and authenticate my voice, and route me to the right conference bridge.

There is some demand for text-to-speech in unified messaging applications, so that you can listen to your emails from a simple phone, or more likely a cell phone. Harmony6000 has TTS capabilities and we provide TTS translation capabilities in our current unified messaging application.

Q How does IP Unity differ from its competitors? Who are those competitors?

A We took a lead in the market by offering a media server platform and applications. Talking to carriers with a solution to an existing problem--rather than an interesting and versatile piece of iron--has opened many doors in this economy. Within the immediate "media server vendor" category, we have Convedia and SnowShore, but there, IP Unity is unique in offering applications. Our conferencing and messaging products are easily branded by service providers, who like the fact that the buck stops with IP Unity when it comes to deploying enhanced services on the platform. Having said that, we have a number of application vendors as partners who are making applications for our platform, and some of those applications compete directly with our in-house solutions. For us, the key lies in providing choices, and ultimately our carrier customers are the ones who choose which vendor's applications they are going to use.

The interesting thing about the telecom ecosystem is that competitors are often partners too. We often come across Voyant, Latitude and other application vendors in deployment shootouts, but we also work with these application makers as partners. The same is true for the board-level companies such as Dialogic and AudioCodes; they compete at the enterprise level with IP Unity but have scaling issues at the carrier level, so we are potential partners for carrier-class deployments. There is similar overlap with software companies such as Broadsoft, Dynamic Soft and VocalData. We want to be a good partner and responsible part of the ecosystem.

Q Where do you see IP Unity in five (5) years?

A IP Unity will be a successful solutions company offering enhanced service applications that bridge TDM with IP. I think we will have mastered voice, data and video applications by then, and be offering useful applications such as simultaneous language translation using speech recognition.
