The 2013 Market Leaders
A lot has happened in the speech technology industry over the past 12 months. Much of the activity comes from innovation and mergers and acquisitions, which have heavily influenced the outcome of this year's Speech Technology Market Leader Awards. In fact, there are new category winners in four of the five categories this year. Read on to find out which vendors are dominating their respective categories.
The editors of Speech Technology magazine would like to thank those who participated in evaluating the vendors in the Speech Technology Industry Awards. This issue, and the awards themselves, would not be possible without the contributions of the following judges and commenters: Aphrodite Brinsmead, senior analyst, Ovum; Dick Bucci, president and principal analyst, Pelorus Associates; Deborah Dahl, principal, Conversational Technologies; Keith Dawson, principal analyst, Ovum; Donna Fluss, founder and president, DMG Consulting; Sheila McGee-Smith, president, McGee-Smith Analytics; Michael Morgan, senior industry analyst, mobile handsets/devices, ABI Research; John Ragsdale, vice president of technology research, Technology Services Industry Association; Peter Ryan, outsourcing practice leader, Ovum; Bill Scholz, president, NewSpeech; and Paul Stockford, principal analyst, Saddletree Research.
CATEGORIES and CRITERIA: Speech Technology magazine's Market Leader Awards name one Winner, two Leaders, and a Vendor Contender in each of five categories using a proprietary scoring formula that involves input from industry analysts and consultants. (A tie in Speech Analytics and Speech Self-Service this year resulted in one winner, two leaders, and two vendor contenders.) The leader companies were selected using a composite of the judges' scores (on a 5-point scale, with 5 being the highest rating) in areas including affordability, customer satisfaction, ease of use, accuracy, speed, depth of functionality, and company direction. We weighted each of these criteria according to its importance to current or potential customers.
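For readers curious how a weighted composite of this kind works in principle, the following is a minimal sketch in Python. It is not the magazine's actual formula, which is proprietary. The criterion names are a subset of those listed above, and the weights, the sample scores, and the function names `mean_scores` and `composite_score` are all hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical weights: higher means more important to customers.
# The real weights are proprietary and are not published.
WEIGHTS = {"accuracy": 3, "customer_satisfaction": 2, "affordability": 1}


def mean_scores(judge_scores):
    """Average each criterion's 1-5 rating across all judges."""
    return {c: mean(j[c] for j in judge_scores) for c in WEIGHTS}


def composite_score(judge_scores, weights=WEIGHTS):
    """Weighted average of the per-criterion means, still on the 5-point scale."""
    means = mean_scores(judge_scores)
    total_weight = sum(weights.values())
    return sum(means[c] * weights[c] for c in weights) / total_weight


# Two hypothetical judges rating one hypothetical vendor.
sample = [
    {"accuracy": 5, "customer_satisfaction": 4, "affordability": 3},
    {"accuracy": 4, "customer_satisfaction": 4, "affordability": 5},
]
print(composite_score(sample))
```

Dividing by the total weight keeps the composite on the same 5-point scale as the individual ratings, so vendors can be compared directly whatever weights are chosen.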
While speech engines could be seen as the heart of speech recognition, not all analysts foresee big growth in the market.
"Speech in enterprise contact centers is growing, but we expect the growth in spending on core engines to be modest, about the same as overall IT spending [five percent to seven percent]," says Dan Miller, senior analyst at Opus Research. "A multiplier effect...applies to the spending on core automatic speech recognition engines. Each dollar spent drives roughly seven times that in overall spending for solutions that include speech. Beneficiaries are third-party system integrators and professional services business units at the major contact center software companies."
Thanks to what can be called the Siri effect, analysts took notice of Apple this year and were particularly impressed with the accuracy, innovation, and cost of its speech engine. Siri is also becoming faster and adding features. Last month, Apple announced that the new version of Siri in iOS 7 will include new voices, languages, and integration with Bing and Twitter. Apple is also planning to integrate Siri into in-car interfaces, which will let users get directions, access maps, make phone calls, and dictate and receive iMessages, eyes-free.
"I give credit to Siri and the several Siri-lookalikes," an analyst says. "Siri was the best thing that happened to the speech recognition marketplace in decades. Personal assistants [such as Siri] will dominate the speech app development marketplace for years to come."
Microsoft also stood out with analysts, who noted that its speech engine delivered highly accurate results and cited its innovative qualities. Most recently, the company announced that its Bing voice search and voice-to-text were twice as fast and 15 percent more accurate on Windows Phone 8. The Microsoft Research team achieved these improvements over the past year by using deep neural network technology. "Bing's voice search is good and largely underappreciated," an analyst notes. "Microsoft has showcased some really excellent applications for its speech engine to do transcription and simultaneous translation in conjunction with things like videoconferencing. This could have a big impact on companies using Skype or Lync for things such as international conferences."
Google, a leader in last year's awards, rose to the top spot with analysts, who recognized it as number one across the board, giving it the best scores in accuracy, innovation, cost, ability to customize, and customer satisfaction. "Google is ranked highly for the same reason the Microsoft engine is doing well: the size of the acoustic sample pool," an analyst remarks. "Microsoft's pool is huge, but Google's pool dwarfs it. [It saves] everything anyone ever says to its system and systematically folds this massive volume of data into its acoustic models. So [the] recognition quality is continuously improving to the point [that] it is among the top few in the industry."
Another insider singled out Google's accuracy and low cost.
"Google's free speech recognition has been improving constantly based on a brute force approach to building corpora of utterances and applying some really good statistical language models," says an analyst. "It has demonstrated high levels of accuracy on mobile devices for search, commands, and dictation, and you can't beat the price."
AT&T and Nuance Communications tie for this spot on the strength of their overall speech engine features, analysts say.
AT&T made a splash in April when it signed a deal with Interactions. The company is using the Watson speech engine to power its speech-enabled virtual assistants for enterprises with customer care needs. AT&T is focusing on its Watson-enabled speech APIs, which developers can use to create apps and services with voice recognition and transcription capabilities.
Nuance made a slew of upgrades to products in voice biometrics with Nina, its virtual assistant app for mobile customer service, and in healthcare with its Dragon Medical Practice Edition. It expanded its automotive speech offerings and acquired Tweddle Connect, an application and content service delivery platform for in-car infotainment systems.