Speech on a Network
Posted May 1, 2008

Though it has become the industry buzzword of late, the concept of a services-oriented architecture (SOA) is not new. In fact, it has been around for almost 40 years. And despite varying definitions among vendors throughout that time, by now its business benefits are widely accepted: reduced application integration costs, a greater ability to reuse assets and information across channels and applications, and the ability to more quickly adapt to changing business needs.

Many larger enterprises have already made moves toward SOA adoption. In its "SOA Spending Report 2007-2008," AMR Research reported that 53 percent of the companies surveyed are already using it, and another 37 percent said they are planning to start their first SOA project by 2011.

Middle-tier businesses are just now starting down this path. Naturally, many are apprehensive, and they’re not alone. That’s because converting to an SOA is such a wide-ranging process, one that is part systems design, part architectural overhaul, part application development, part business makeover, and part attitude change. "Fundamentally, it’s a radical change that means reengineering an entire IT stack," says Dan Koloski, chief technology officer and director of strategy for the Web business unit at Empirix, a company that markets load testing, monitoring, and performance solutions for Web, contact center, Voice over Internet Protocol (VoIP), and IP storage applications. "It involves a new top-to-bottom approach to managing IT operations."

Among firms that have already dipped their toes in the SOA waters, most have thus far involved only their back-end databases and core business applications. Front-end, customer-facing speech technologies have only recently begun to enter the picture. "We’re just now starting to see increased awareness of SOA within the contact center space, especially in larger companies where the core IT functions are already in an SOA and they are looking to add more and more components," says Steve Cawn, sales team leader for speech solutions at IBM.

According to Cawn, the concept of incorporating speech into an SOA is "in its earliest stages of adoption," but is gaining ground quickly, particularly in the financial services, telecommunications, and retail/commerce industries. Coding and recoding speech applications is among the most expensive and time-consuming parts of any project, so companies in these sectors have the most to gain from sharing data and processes across channels and applications. That’s because many offer multiple modes and channels of interaction (branch offices, call centers, Web sites, kiosks, and more) and many different transaction types that can be performed through these channels.

In an SOA, "applications are not siloed but broken down into a set of components that can be shared across them," explains Steve Cramoysan, research director for enterprise communications applications at Gartner. "In an IVR that’s asking for a customer ID number in one area and a credit card number somewhere else, it’s better to develop it once and reuse it across all other applications."
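As a rough illustration of the "develop it once and reuse it" idea, the sketch below shows a single customer ID routine shared by an IVR channel and a Web channel. The function names and the 8-digit ID format are assumptions made for this example only; they do not come from any vendor product mentioned here.

```python
import re


def normalize_customer_id(raw: str) -> str:
    """Shared component: keep only the digits and check the length.

    The 8-digit rule is an illustrative assumption, not an industry standard.
    """
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 8:
        raise ValueError("customer ID must contain exactly 8 digits")
    return digits


def ivr_collect_customer_id(recognized_text: str) -> str:
    # IVR channel: recognizer or touch-tone output such as "1234 5678".
    return normalize_customer_id(recognized_text)


def web_collect_customer_id(form_value: str) -> str:
    # Web channel: form input such as "1234-5678" reuses the same component.
    return normalize_customer_id(form_value)
```

Because both channels call the same routine, a change to the ID format would be made in one place rather than once per application.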

In bringing their speech applications under the SOA umbrella, companies create a scenario where voice solutions become an extension of other business systems, including the communications networks that supply phone, fax, email, and messaging services, and software applications like customer relationship management (CRM), enterprise resource planning (ERP), and workforce management. It is important, therefore, that prior to bringing their speech systems into their SOA, companies get their other core business processes, databases, servers, networks, and infrastructures ready.

One caveat, though, is that maintaining strong ties between voice solutions and business information systems could mean that the voice solutions will have to be upgraded or modified more quickly and more often than usual to keep pace. Under an SOA, none of these applications exist in isolation; rather, they all become part of a much larger domain where each application depends on the quality, functionality, and availability of all the others. Compatibility, therefore, is of paramount importance.

"The SOA will be led by core business applications, and speech will have to be led by those applications as well," Cramoysan says. "It’s a fine aspiration to get [speech applications into] an SOA, but until the back-end systems are there, it may not be possible."

Early Engagement
That’s why Cawn and many of his colleagues recommend involving as many rungs of the corporate ladder as possible before deciding how an SOA should be structured. "As you start to look at expanding and growing your application base, don’t do it in a vacuum. Get linked into the entire enterprise, IT, and business organization," Cawn says. "Get engaged early with the IT architects who are in touch with how the entire enterprise is structured."

Cawn also advises the same for call centers. "To build it as an island unto itself is a recipe for disaster in this modern, connected world," he says.

At the same time, though, it is not wise to leave the speech application development solely in the hands of the company’s traditional Web developers. "You should still have your VUI designers build your speech systems," says Ken Rehor, an independent consultant specializing in open-standards telephony, voice, and multimodal applications. "One of the benefits of an SOA is that you can separate out your applications and build them independently."

Simply involving all the key corporate players is not enough, Rehor adds. "Make sure everyone agrees on the goals you’re trying to achieve," he says. "Just rewriting a program is not good if you do not have set goals and not everyone understands why you’re doing it."

Similarly, embedding applications and functions or tying them into specific customer interactions is no longer a good idea, according to Roberto Pieraccini, chief technology officer at hosted IVR provider SpeechCycle. "You want to be creating a universal interface for Web services. That’s a good idea," he says.

Furthermore, building a truly effective SOA means "having access to information and processes outside the core phone environment," notes Beatriz Infante, CEO of VoiceObjects.

In getting back-end systems ready, network security has to be priority number one. "You need to put in a secure infrastructure," Infante says. She notes that creating an SOA where everything is housed on a networked server or the Internet opens the door to hackers.

Unlike previous closed-loop systems, more open SOA environments are subject to the same kinds of viruses, spyware, denial-of-service attacks, spam, and phishing as other data networks and Web-based systems. They also, therefore, require the same levels of protection, including firewalls, filtering software, encryption, and network traffic monitoring. "Security, logging, and even analytics are needed more in an open-source environment," Infante maintains.

"In an SOA environment, monitoring and managing every application is critical," adds Brian Gollaher, product manager of the contact center business unit at Empirix. "You can’t deploy anything that can’t be managed and monitored. You can’t put anything on your network that you can’t secure."

But IBM’s Cawn is quick to point out that an SOA doesn’t necessarily have to involve the Web; even though SOA has taken on a heavy Web orientation, many SOAs have been built using legacy mainframe applications. "Anything can be adapted to an SOA framework. We’re still living with a large number of legacy applications out there," he says.

Slow Down
Once a business does start down the path to an SOA and has made all the necessary back-end systems adjustments, the tendency is to try to do too much at once. Many even think they will have to rip out all of their existing applications and start from scratch, something that Koloski and others think might be overkill. "It’s not about building new apps, but bringing a new level of interdependence to existing apps," he says. "Most organizations are doing it in incremental changes. Most enterprises tend to [build an SOA] as incremental improvements to existing applications."

While companies certainly can build an SOA from scratch, starting from the ground up can be a big, expensive job. "For the vast majority, you can’t ask to do a full rip and replace because of the time and money involved," Koloski says, noting that most often the approach is to "wrap existing applications around an integration layer with newer interfaces."
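A minimal sketch of the wrapping approach Koloski describes might look like the following. Everything in it is hypothetical: the legacy class, its fixed-width record format, and the sample data are invented to show the shape of an integration layer, not taken from any real system.

```python
from typing import Optional


class LegacyReservationSystem:
    """Stand-in for an existing application with an old-style text interface."""

    _records = {"ABC123": "SMITH     |0412"}  # sample data for illustration

    def lookup(self, record: str) -> str:
        # Returns "NAME      |ROOM", or an empty string if nothing is found.
        return self._records.get(record.strip().upper(), "")


class ReservationService:
    """Integration layer: a newer interface placed in front of the legacy code."""

    def __init__(self, legacy: LegacyReservationSystem) -> None:
        self._legacy = legacy

    def get_reservation(self, confirmation_code: str) -> Optional[dict]:
        raw = self._legacy.lookup(confirmation_code)
        if not raw:
            return None
        name, room = raw.split("|")
        return {"guest": name.strip().title(), "room": room.strip()}
```

The legacy class is left untouched; callers such as an IVR or a Web page would talk only to `ReservationService`, which is what makes an incremental migration possible.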

Gartner analysts have even predicted that more than 70 percent of all services deployed through an SOA will be built using existing systems and technologies.

In the same vein, early on many businesses sought out and expected vendors to deliver an SOA in a box: something they could buy off the shelf, open, plug into a network, and go. Not only do such products not exist, but analysts and consultants would urge companies to avoid them if they did. They strongly caution against jumping into an SOA too fast or on too big a scale all at once.

Doing so would only cause important details to be overlooked, because most modern speech deployments combine parts from many vendors. "One thing to pay attention to is multivendor integration," says Infante, who notes that many companies are still trying to adjust after spending the past 25 years in a closed environment where they only had to deal with one vendor.

"You have to layer in your budgeting to be a phased implementation. Because you’re doing this in a multivendor environment, you have to work with [all the vendors] to make it happen and debug the system across multiple servers if each component is going to be worked into the system," she says.

"Companies have been slow to adopt an SOA because it’s harder to do with multiple vendors than it is in a traditional, one-vendor stand-alone application stack," Koloski adds.

Modest Beginnings
It is widely agreed that it is best to start modestly with a handful of applications that can have other applications built around them or be easily expanded to accommodate new uses. 

In the speech arena, VoIP and unified communications (UC) applications are good starting points because their central goal is to merge voice, data, and multimedia solutions onto a single platform that is typically network- or Internet-based. These applications also easily interface with other central business systems, like Microsoft Office or Salesforce.com, to add components like calendaring, contact lists, and address books to traditional phone, messaging, conferencing, and telepresence applications.

That is something to which John Burke, vice president of technology for the Seaport Hotel and World Trade Center in Boston, can attest. In launching the hotel’s Seaportal, a Web-based system that delivers information, entertainment, Web and email access, direct dialing for featured guest services, and complimentary VoIP calls straight to hotel guests in their rooms, he simply created a few Web-based XML interfaces to expand existing voice and data services in the hotel and was able to get the system up and running in about 60 days. It went live in December 2006.

SessionSuite SOA Edition from BlueNote Networks served as the backbone for the project. It allowed Burke to integrate IP telephony with existing Web services and business applications without having to replace existing hardware or software. A Web interface to the property management system even leverages guest information to personalize both content and services.

Without the ability to repurpose existing systems, the project likely would have been four times more expensive, Burke estimates.

The hotel even got to keep its existing phone system. "We were really interested in leveraging our previous investments in voice technologies. We weren’t ready to get rid of our existing PBX phone switches yet," Burke says.

Eventually Burke plans to replace the hotel’s legacy phone system with a full IP network, and having an SOA will smooth the transition, he says.

VoIP and UC are not the only communications and voice-based technologies that are feeding off a growing convergence of the Web and telecommunications worlds. An increasing demand from consumers for Web-style applications on their mobile phones and other devices has also steered more traditional interactive voice response (IVR) and other call center technologies down the same SOA path.

An example cited by Rehor is an airline outbound calling application that alerts travelers when their planes will be delayed. A Web service could initiate the call, based on passenger information stored in the passenger registry. A speech synthesis engine or collection of prerecorded audio files could be used to create the message. Using three simple pieces of information (the flight number, the delay, and the new departure time), the application can inform anyone booked on the flight: "Clampett Airlines flight 676 is delayed. The new departure time is 11:21 a.m."

The same application might also provide a few options: "To stand by for a different flight, press 1. To be connected with a service representative, press 2."

"A VoiceXML developer could build the application and provide a Web services interface for the Web developer to use. Furthermore, the Web service could be used by any other application that needs to make an outbound call, without the Web developer having to be concerned with the telephony or speech technology aspects of the project," Rehor explains.

For businesses today, this represents a dramatic paradigm shift. "Historically, speech and the IVR have been siloed from the rest of the business, but as companies are seeing them as more of an IT function, the same management principles are being applied," Gartner’s Cramoysan notes.
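The airline outbound calling example Rehor describes above can be sketched roughly as follows. This is an illustration under stated assumptions, not his implementation: the `FlightDelay` type, the `place_call` callback standing in for the telephony and speech layer, and the menu action names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class FlightDelay:
    airline: str
    flight_number: str
    delay_minutes: int   # carried with the alert; not spoken in the sample message
    new_departure: str   # pre-formatted for speech, e.g. "11:21 a.m."


def build_delay_message(delay: FlightDelay) -> str:
    """Compose the announcement text from the flight details."""
    text = (f"{delay.airline} flight {delay.flight_number} is delayed. "
            f"The new departure time is {delay.new_departure}")
    # Avoid a doubled period when the time already ends in "a.m." or "p.m."
    return text if text.endswith(".") else text + "."


# Keypad options offered after the announcement (action names are hypothetical).
MENU = {"1": "standby_for_other_flight", "2": "connect_to_representative"}


def handle_keypress(digit: str) -> str:
    """Map a touch-tone digit to an action; anything else replays the menu."""
    return MENU.get(digit, "repeat_menu")


def notify_passengers(delay: FlightDelay,
                      phone_numbers: Iterable[str],
                      place_call: Callable[[str, str], None]) -> int:
    """Service entry point: the caller supplies data, not telephony details."""
    message = build_delay_message(delay)
    count = 0
    for number in phone_numbers:
        place_call(number, message)  # telephony and speech are hidden behind this
        count += 1
    return count
```

A Web developer would call only `notify_passengers`, while the voice developer owns whatever sits behind `place_call`, which mirrors the division of labor described in the quote.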

VoiceXML’s Growing Role
Also closely linked to the SOA trend is the growing use in the call center of standards-based development processes and protocols spawned from the IT and Web services domains rather than the traditional telecom world. The de facto standard has become VoiceXML, an XML-based markup language for creating machine-based dialogues that feature synthesized speech, digitized audio, recognition of spoken and touch-tone inputs, and telephone access to Web content.


The whole idea behind VoiceXML is portability and a truly open Web-based architecture that allows businesses to build applications once and run them almost anywhere. It’s a goal shared by SOA, which by definition is a framework that lets companies build, deploy, and integrate services independent of applications and the computing platforms on which they are built.

In many leading technology companies today, VoiceXML is a fundamental piece of their overall Web or SOA strategy. "The platform market once was fragmented," Cramoysan admits. "[Now] there has been a high level of VoiceXML adoption in terms of new systems coming out and very wide acceptance of VoiceXML as the standard language. What we’ve seen over the last five to seven years is a shift from legacy, proprietary applications to more open source using VoiceXML, opening the doors for interoperability from one application to another."

Cawn says he has also started to see a more rapid conversion to VoiceXML. "And as we see more people move from proprietary to VoiceXML, SOA will happen," he maintains.

In fact, Daniel Hong, senior analyst at Datamonitor, has repeatedly pointed to a dramatic rise in the number of VoiceXML-based applications in the last few years alone. Whereas slightly more than 30 percent of the 600,000 IVR ports shipped in 2005 were based on VoiceXML, he predicts that this year more than half of the ports shipped will be VoiceXML-based, and by 2010, those numbers will climb to more than 70 percent. Speech technology vendors like IBM, Nortel, Avaya, Genesys Telecommunications Laboratories, and Cisco Systems have led the way with applications that are VoiceXML-compliant and SOA-ready, according to a number of analysts.

Still, one can’t deny the existence of competing standards, including Speech Application Language Tags (SALT). Koloski notes that VoiceXML and SALT are two very different standards, "which creates inherent interoperability challenges. You still can’t take a Microsoft product and just plop it in with [just any vendor’s] product."

"You could apply SOA principles without VoiceXML, but I’m not sure why you would," Cramoysan says.

Rehor, one of the authors of VoiceXML, agrees, noting that SOA and VoiceXML leverage the same underlying technologies and share the same goals. "VoiceXML is a natural fit for an SOA," he adds.

And while some companies have opted instead to layer their SOA around proprietary, legacy systems and applications with a basic Web interface, Rehor says such a deployment fails to take into account long-term goals. "It’s like putting a shiny paint job on an old 1972 Chevy Vega. In the end, you still have an old 1972 Vega," he states.

Still, some have argued against VoiceXML by claiming that current versions of the language do not fully cover all the things that people might do with speech. Rehor notes, however, that future versions of VoiceXML will address any lingering inconsistencies with an SOA. Version 3.0 (which is due for release later this year) will contain added support for speaker verification, video, and multimodal applications, he explains.


SOA: A Picture Is Worth 1,000 Words
Whether or not speech applications are included, many industry experts argue that businesses have been slow to convert their systems to an SOA because they are still confused about what an SOA looks like. They have mistakenly been led to look at it as a puzzle with interlocking pieces. "In a puzzle, one piece only interacts with the other pieces around it. In an SOA, each piece can interact with any other piece anywhere in the puzzle," states Steve Cramoysan, research director for enterprise communications applications at Gartner.

Steve Cawn, sales team leader for speech solutions at IBM, agrees. "It’s not a puzzle. It’s more like a bicycle wheel," he says. "The hub is the server at the center and each spoke represents one of the [applications] tied into it and one another."