Speech Claws Its Way Forward


According to the Chinese Zodiac, 2010 is the Year of the Tiger, whose character traits include bravery and competitiveness. And while we here at Speech Technology aren’t basing our Year in Preview on astrology, it seems that as the speech industry catches its breath, surveys the financial wreckage, and looks toward the future, those very characteristics—bravery and competitiveness—will be more important than ever as we embark on a new year.

The bad news for 2010 is—rather obviously—the economy, which isn’t going to magically repair itself overnight. However, the good news is also the economy. Word on the street is that the economy is slowly beginning to turn around, both in general and in the speech industry.

According to a host of analysts, experts, and industry insiders, speech will benefit from the gradual economic improvement and find itself on a stronger financial footing this year. 

“The chill is starting to lift,” says Dan Miller, senior analyst and founder of Opus Research, who predicts high-single-digit growth for enterprise speech solutions. Miller also predicts double-digit growth for what he calls “recombinant solutions,” which he defines as rich phone applications involving telephony, voice processing, call processing, and personalization. 

However, growth will be neither universal nor quick. Miller predicts enterprise spending on information technology that includes some form of speech processing will not take off any time soon because many companies are continuing to leverage their existing infrastructure, hardware, and software. 

Daniel Hong, lead analyst at Ovum, also predicts single-digit growth this year, noting some enterprise speech projects that were put on hold due to the recession are slowly starting to ramp up. “We’re seeing it growing, but not as much as we anticipated earlier,” Hong says. “Spending will free up a bit more. We should see more activity [in 2010].”

Echoing Hong’s economic prognostications are Bill Scholz, president of the Applied Voice Input/Output Society (AVIOS), and Ryan Joe, an associate analyst at Ovum.

While Scholz notes that most analysts agree the economy has bottomed out and is beginning to pick up, he nonetheless remains a realist when it comes to a recovery. “We’re certainly not going to be wandering into overwhelming economic prosperity in the next year, but we will continue to improve throughout the year,” he says. “And I don’t see anything that is likely to block that continual growth.”

Joe also is hesitant to offer an overly rosy forecast for 2010. “Even if the economy does start to recover, it’s going to be a while before enterprises start to invest again,” he says. 

The Channel Is Multichannel 

A number of speech technology vendors seem to agree that 2010 will see increasing interest in and emphasis on personalization of interactive voice response (IVR) systems, multichannel solutions, outbound IVR, and hosted solutions. The coming year will also see more mergers and acquisitions, which should come as no surprise to anyone familiar with the speech industry.

According to Miller, the “inevitable” continuation of mergers and acquisitions will be a direct response to market demands. Speech processing itself—the engines for recognition and grammar building—might have reached a point of diminishing returns as more customers look to multichannel solutions, he says.

“The better mousetrap is going to be extending Web services, extending the Internet over to a multiplicity of devices, and having a voice user interface [as] one of the options that people can use to get those services,” Miller says, noting that even industry leaders like Nuance Communications are hedging their bets and offering a broader portfolio. “If that’s what the demand is for, the ecosystem that delivers that only needs a few voice players.”

Joe echoes Miller’s sentiments, noting that speech solutions providers want to offer customers end-to-end self-service solutions. “There’s this awareness that they need to be able to optimize various customer touch points,” Joe says, stressing the importance of homing in on channels like text messaging and the Web. This year’s acquisitions will not be specific to speech, but rather to customer self-service, he adds.

As such, the integration of different channels and the personalization of IVR will be more important than ever in the coming months in the enterprise space. “That to me is the big one, and the one all the vendors are going toward,” Joe says, adding that with the movement toward personalized IVR comes an investment in speech recognition—an indicator that the services around speech will increase in 2010.

This increase in multichannel does not necessarily mean an increase in multimodal, though. In particular, little movement is expected in combining voice and video this year. “There are a lot of issues with video,” Joe says, such as differing operating systems, phones, displays, and formats. “The strength of the network has to be there, and that’s a particular problem in the United States. 

“There are a lot of vendors with this sort of technology in pilot. But right now I think the trend is more toward emerging different channels,” he adds.

Scholz adds that the rise of multichannel solutions will be welcomed and predicts “better cross-channel consolidation in the coming year and coming years.” 

Hong shares that belief. “There’s more interoperability among different switches and platforms, so it’s easier to have a coherent multichannel strategy,” he says.

Among the many speech vendors looking to capitalize on this push toward multichannel solutions is Nuance. According to Steven Chambers, the company’s executive vice president of worldwide sales and chief marketing officer, Nuance has been pushing multichannel solutions for some time. 

Chambers isn’t sure the increased interest in multichannel solutions yet qualifies as a trend, but he sees the benefits it provides in economic return and consistent customer experience. “It just makes sense from a lot of levels,” he says, noting that he sees growth at Nuance and in the market in general. “People are hunkering down on the cost displacement front, and that’s leading them to speech.”

According to Miller, another trend that will continue to gain steam during the year is cloud-based and hosted speech solutions. “More than ever before, there’s hardly a decision made where some element of call processing or voice processing [isn’t] taking place in the network cloud or the public network cloud,” Miller says, noting the popularity of on-demand and software-as-a-service solutions will continue to grow. “The beneficiaries are hosted service providers that are the least monolithic and most open to having flexible arrangements.”  

Miller explains that because of the nature of new IP-based services, demand cannot always be predicted; this, he says, plays into the hands of hosted services providers that can accommodate uncertain demand and use the Internet and Web services to distribute call processing, voice processing, and customer care functions around the world. 

Roberto Pieraccini, SpeechCycle’s chief technology officer, is also bullish about the cloud, calling cloud-based speech technology and applications one of the biggest trends for 2010. “The cloud is probably going to be the thing that we’re going to see that’s going to be the big differentiator from what we’ve seen before,” he says. “[SpeechCycle is] going to put a lot of focus on the cloud deployment.”

In fact, the company has already begun its push in that direction. Late last year, SpeechCycle announced its RPA OnDemand rich phone application services and full suite of virtual agent solutions could be deployed in Windows Azure, Microsoft’s cloud computing environment. “This is something we are really excited about,” Pieraccini says.

Scholz also expects to see increases in cloud computing and cloud-based automated speech recognition (ASR) this year. As an example, he cites AT&T’s speech mashups, which allow speech and language processing to be performed on the company’s servers. “What they are doing is offering high-quality, high-end speech recognition directly accessible over the Internet,” Scholz says. “What’s important is that it has completely bypassed the application server in that process. I think it has profound implications for the future.”

Another technology to watch this year will be outbound IVR. And while interest and activity in this area won’t be overwhelming, Hong does forecast modest growth in 2010. “We’ll see a little more activity with outbound IVR,” he says, noting the focus will be primarily on dual-tone multifrequency solutions and, to a lesser extent, speech solutions.

According to Pieraccini and Chambers, both SpeechCycle and Nuance are seeing significant interest in outbound solutions. And while Chambers admits outbound is “relatively new” for Nuance, he says the company can provide solutions that “proactively notify and then service [customers] with automation.”

The increased interest in outbound IVR is directly tied to the inherent nature of the technology, Pieraccini adds. With inbound, people call into an IVR with a problem and are unhappy to find themselves speaking to a machine. This is not the case with “permission-based outbound” services like flight notifications, he explains. 

“Outbound is a very different kind of experience than inbound,” he says. “It has received more acceptance, and [people] like it.”

Markets to Watch

Looking ahead, analysts are also identifying specific markets likely to gain significant traction this year. Almost across the board, industry experts cite healthcare and mobile as the two hottest markets to watch. Analysts also point to finance, media and entertainment, and telecommunications as warm markets for 2010.

“Healthcare is so ripe for so many reasons,” Miller says, noting the push for healthcare reform in Washington, as well as the federally mandated move to digitize medical records.

Scholz looks instead to the mobile market, pointing to the rise of mobile voice search on handheld devices, voice commands on handheld devices, and voice-enabled mobile solutions, like Ribbit, as indicative of a trend that will go unabated for some time. “Watch the mobile market,” he predicts. “All of this is a harbinger of what is to come in 2010, and we’re going to see much more of that.”

Miller agrees, looking to the latter part of this year for speech-enabled applications that grow out of social and local search. “Mobile search, which is destined to have a voice element, is increasingly going to be about what to do locally,” he says, though this will not happen overnight. “By 2013 [we will] see some pretty elegant voice-activated entertainment guides.”

Miller suggests looking at places that are ready to be disrupted by a better way of servicing the customer, something that will pull voice along as more customer care originates from mobile phones.  

Hong looks to the public sector, utilities, and communications for growth in 2010. However, he labels financial services with “a big question mark.” “There’s been a lot of consolidation that’s going on in that industry, and obviously their budgets have been slashed,” he says. “They may look to outsource a bit more.”

Miller, on the other hand, views the financial sector with more optimism. And again, his reasons stem from customer service: “As finance comes out of the deep freeze, it needs an overhaul of its customer care schema.”

As the speech industry looks to trending technologies, sizes up vertical markets, attempts to stave off financial disaster, and braces for 2010 (hopefully with the requisite dose of bravery and competitiveness demanded by the Year of the Tiger), Scholz points out one final trend: Almost annually, someone makes a wild prediction and declares the coming 12 months to be “the year of speech.” “Everyone says that every year,” he says. “[2010 is] not going to be the year of speech.”
