The 2015 State of the Speech Technology Industry: Speech Engine
Decades of product refinement have made modern speech systems an indispensable part of 21st-century living. From accessing information on the Internet to controlling consumer appliances, speech technology's role in consumer devices has eclipsed more conventional corporate uses, such as interactive voice response (IVR) systems or dictation into electronic health records.
And Deborah Dahl, principal at speech and language consulting firm Conversational Technologies, says we're just at the beginning of "a major broadening of the areas where speech recognition applications are used." While she expects speech recognition in call centers to continue to be an important market segment, she sees the technology "expanding more and more out from the call center to the ambient environment."
This trend started, Dahl says, with Apple's introduction of Siri in 2011, "which put usable speech interaction into millions of pockets." Siri was followed by similar offerings from other platform vendors, such as Google and Microsoft.
"As these personal assistant applications became ubiquitous, people got used to accessing the online world by voice to do common tasks and obtain generic information," Dahl says.
While enterprise-specific applications of the same technology, such as Nuance Communications' Nina and Openstream's EVA, soon emerged, they don't seem to be catching on as quickly as Dahl had previously expected. "But I believe this segment will continue to grow," she says.
Dan Miller, founder and lead analyst at Opus Research, says that as intelligent assistance takes hold globally, automatic speech recognition (ASR), text-to-speech (TTS), and integrated development tools "will be in the mix of supporting technologies."
Spending on these resources in 2015, he adds, will probably reach about $650 million. "But these resources must be tightly linked to the computational resources that support natural language understanding, machine learning, and conversation management."
A Logical Interface
The next frontier for speech, Dahl and others predict, will include interaction with the home environment, robots, sensors, and wearables. "Speech will become increasingly important for these kinds of interactions because there are just too many things for all of them to have their own graphical interfaces," Dahl says, noting that there are 35 apps in the Apple iTunes App Store to control the Philips Hue connected light bulb alone.
As speech enters these burgeoning markets, however, the focus is shifting to increasing accuracy, efficiency, and speed while reducing power consumption.
This is where a company such as Sensory has excelled. The company, Dahl maintains, "is outstanding in its niche of providing speech recognition for embedded, hands-free applications. It's accurate and has very low memory and power requirements."
Yet, for all the innovation around other aspects of the technology, speech recognition continues to represent the dominant product market. In the business world, it is on course to keep redefining customer service and customer self-service in several industries, including airlines, financial services, healthcare, warehousing, and security. In these industries, voice recognition is also fast becoming a required and much-demanded security measure.
Driven by this demand, as well as growing interest from mobile application developers, the speech market as a whole is expected to grow at a compound annual rate of 16.2 percent during the next few years, according to several analyst firms.
Also helping speech vendors boost their bottom lines is "a move from selling technologies to selling solutions," observes Bill Scholz, president of the Applied Voice Input/Output Society (AVIOS) and speech technology consultant at NewSpeech.
"We had become accustomed to vendors offering speech recognition or speech synthesis technology. Speaker verification technology was added to the list, joined by natural language understanding and dialogue management," he says. "As these technologies matured, vendors learned how to integrate them and complete solutions were offered."
As an example, Scholz cites Nuance, which he says "has grown from a technology provider focusing on ASR and TTS to a full solutions provider with a product spectrum ranging from Dragon NaturallySpeaking to Nina virtual assistants, including products targeting the healthcare industry."
Bill Meisel, president of TMA Associates and executive director of AVIOS, agrees. "There are many varieties of speech recognition, even from a single firm like Nuance," he says. "They have Web-based services aimed largely at mobile apps like NDEV, dictation
In our first State of the Speech Technology Industry issue, we reveal the latest trends and developments in eight market categories.