2009: What the New Year Will Bring
“I’m a pro, not a con. The sky is not falling to me.” That’s what Tom Schalk, vice president of voice technologies at ATX, says about his industry. He, like many others in the world of speech technology, is putting on an optimistic face as he looks toward the economic future. He likens the industry’s doom-and-gloom soothsayers to Chicken Little.
That said, no one can deny that the skies are gray and, chickens aside, one has to wonder, as we look ahead, whether the optimists are being too cavalier.
Last year was undoubtedly a hard one for many businesses the world over. In December, the famously slow and deliberative National Bureau of Economic Research finally affirmed what many had long suspected: The United States is in a recession, and has been for the past year.
While it’s true that the speech industry hasn’t yet been battered by the current recession the way others have, the downturn remains the elephant in the room. Some privately wonder how much speech will be affected, while others appear poised to capitalize on recession-hit enterprises that, they argue, could save sackfuls of money by automating processes with speech solutions. But this is also an industry where enterprises have historically been slow to move on technology implementations, often grumbling over high start-up costs.
One promising trend that may help spur the reluctant into shorter sales cycles in 2009 is the continued rise of hosted solutions. By turning to third-party vendors to host solutions for them, enterprises are able to cut upfront hardware costs, which may make all the difference in smaller-scale deployments.
“When there is no money, it’s very hard to get a company to invest in technology. That’s why today you see on-demand models or the pay-as-you-go model having more interest from customers,” explains Roberto Pieraccini, chief technology officer at SpeechCycle.
Even on a larger scale, some enterprises that will eventually require on-premises solutions for security reasons are turning to hosted solutions providers as a way of getting a little try-before-you-buy action. With a hosted solution, an enterprise can get a feel for what the on-premises version will provide before taking the expensive plunge. Analysts are expecting to see this trend continue through this year and beyond.
Datamonitor associate analyst Aphrodite Brinsmead projects that North America alone will see hosted outbound interactive voice response (IVR) technologies grow at a compound annual rate of 48.3 percent through 2011. In her article “Hosted Speech and Outbound IVR Services,” she projects that the hosted outbound IVR market will expand from its current $2.6 million to $3.4 million by the end of this year. By 2011 she expects it to be worth $18.3 million.
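For readers who want to check projections like these, the arithmetic behind a compound annual growth rate can be sketched in a few lines of Python. This is a minimal illustration, not Datamonitor's method: the function names `cagr` and `project` are hypothetical, and because the article does not state the base year behind the 48.3 percent figure, the number of years is left as a parameter.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10 percent)."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, rate: float, years: float) -> float:
    """Grow start_value at a constant annual rate for the given number of years."""
    return start_value * (1.0 + rate) ** years


if __name__ == "__main__":
    # Dollar figures (in millions) are the ones quoted in the article.
    # Treating $2.6M -> $3.4M as a single one-year step is an assumption
    # based on the phrase "by the end of this year".
    print(f"Implied one-year growth: {cagr(2.6, 3.4, 1):.1%}")

    # Project the quoted starting value forward at the quoted 48.3 percent
    # rate for several horizons, since the base year is not given.
    for years in range(1, 6):
        print(f"{years} year(s) at 48.3%: ${project(2.6, 0.483, years):.1f}M")
```

Running the script prints the implied growth for the one-year step and a short table of projected values, which can be compared against the 2011 figure quoted above.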
Brinsmead sees this expansion in IVR coming from airlines, which will use it to notify passengers of flight delays and changes, and from financial firms, which will use it as a reminder system in debt collection. She also expects growth of outbound IVR to be especially robust in healthcare, where providers can use it to remind patients of appointments, to pick up their prescriptions, or to take their medications.
Datamonitor expects hosted solutions to slow down in the financial sector, but notes that growth in other vertical markets is likely to counterbalance the losses so that the general demand will remain unchanged.
Daniel Hong, Datamonitor’s lead analyst, expects that the recession will especially hurt smaller companies. “I think that the companies that have a hosted solution are the ones that will have an advantage,” he says. “The ones that have long-term financial viability, strong brand equity, and staying power are the companies that seem to have an edge.”
Hong also expects losses for voice biometrics during the year. Some have predicted that password reset applications may provide room for growth because of the cost cuts associated with them, but Hong asserts that the technology is expensive to implement, and companies are going to be conservative with their IT spending in this climate.
“They’re going to hang onto whatever investments they have. They’re going to hold off on technology refresh,” he says.
Hong expects technologies like call recording and optimization analytics to see continued investment. “[But] voice biometrics? I’m not bullish on it,” he adds.
Similarly, Bill Scholz, president of the Applied Voice Input/Output Society (AVIOS) and founder of the consulting firm NewSpeech Solutions, is expecting a slowdown in the development of statistical language models (SLMs). To build an SLM, a firm has to compile tens of thousands of actual samples. The cost of that process may end up dampening the enthusiasm for building elaborate SLMs or natural language-based applications.
Moving on Mobile
Mobile speech applications, on the other hand, are expected to be an area of growth during the coming year. Before app stores, vendors had to strike deals with telephone service carriers and device manufacturers to get their products preinstalled. Now that many applications can be downloaded wirelessly from an app store over the mobile Internet, mobile speech applications have grown more popular. App stores allow start-ups and established companies alike to try to break into the market. Google, for instance, released its Google Voice Search, which gives users voice-enabled access to its search engine, through this channel. For smaller companies, venues like the iPhone App Store open sales channels, allowing them to capitalize on Apple’s mindshare among consumers.
Mobile applications, though, are still largely speculative. With most of the applications being offered as free downloads, questions about how to monetize mobile applications loom.
“I think right now they’re trying to work out how to incorporate some sort of advertising into it,” says Datamonitor associate analyst Ryan Joe. “And that would be mostly for open search. I suspect that’s Google’s interest in it, for instance. But I don’t think anyone has a really good strategy on how to monetize speech applications on mobile devices.”
A Healthy Prognosis
Automated speech-based dictation and transcription in healthcare is also expected to see continued growth. Brinsmead reports in “Automating and Enhancing Processes Through Voice in Desktop and Back Office Environments” that, between licenses and services, the entire healthcare speech recognition market will be worth $601.2 million by 2011. That will be up from $207.9 million in 2008 and the expected mark of $265 million by the end of this year.
Nuance Communications, with its acquisition of Philips Speech Recognition in October for $96 million, is poised to claim the lion’s share of the market. Nuance was already the largest player in the U.S., but with the Philips acquisition the company has gained a strong foothold in Europe, a market it was previously unable to crack.
“They have speech recognition engines and solutions that are available in multiple languages—25, actually,” says John Shagoury, president of Nuance’s Health Care and Imaging Division. “For us, it has allowed us to rapidly expand our international presence, and to pick up a huge language portfolio and a very strong distribution channel in the healthcare market [in Europe].”
That acquisition is part of a bigger trend that analysts say we can expect to see more of throughout the year: further centralization and consolidation of industry players. When asked where they see the speech industry going, industry experts almost unanimously point to mergers and acquisitions.
“We’re continuously seeing fewer and fewer different companies growing into ever larger central companies,” Scholz says. “In the last couple of weeks, it was VoiceObjects being devoured by Voxeo, and NMS being purchased by Dialogic. It seems that every week we see more consolidation of companies that we’ve known as independents for years.”
Mergers and acquisitions “have been so fast that I’m losing track of them,” SpeechCycle’s Pieraccini adds.
The consolidation could mean a number of things for the industry. On the technological end, the combination of major research and development organizations might lead to an increase in the breadth and depth of core research. Scholz suggests that the consolidated R&D groups may even be greater than the sum of their parts. He explains that a larger core group of talented researchers may be able to feed off of each other’s capabilities and experiences.
Scholz is more pessimistic with respect to the economic implications of the consolidations, though. “I guess I could take a negative view and say that as we move toward a monopolistic domination of the core speech technology by a single industry, that could mean that some of the price reductions that we’d all hoped for remain further in the future,” he says.
With the recession upon us and the tightening of credit, for 2009 he may be right.