Two major environments are poised to define the landscape for speech in 2017: the now-ubiquitous mobile market and the burgeoning Internet of Things (IoT). While smartphone sales are flattening (with only 0.7 percent growth globally, and a leveling off in the United States, Japan, Western Europe, and Canada), the number of smartphone users worldwide is expected to exceed 2 billion this year (a third of the world’s total population). In the U.S. alone, the number of smartphone users is expected to reach 222.9 million by next year (approximately two thirds of the current population).
At the same time, analyst firm IHS estimates that the market for the Internet of Things will double in size, to 30.7 billion devices, by 2020, up from 15.4 billion last year. Deborah Dahl, principal of Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interaction Working Group, and Ahmed Bouzid, CEO of Witlingo, both predicted last year that these daisy-chained devices would eventually reach critical mass, at which point the clutter of multiple interfaces would force a more unified control system, with speech being the most elegant, natural, and likely candidate.
An apparent contradiction arises from the stagnating mobile market and the growing number of mobile users, raising this question: How will mobile devices be a growth opportunity for speech technology? The answer is mobile application development.
Gartner expects the mobile application market to reach $77 billion in revenue in 2017, and many app developers are looking to capitalize on speech to keep up with intelligent assistants such as Microsoft’s Cortana and Apple’s Siri, which are baked into mobile platforms as a matter of course.
Not surprisingly, developers are also looking to take advantage of speech in the Internet of Things. In 2016, Amazon’s smart home hub, Echo, reached significant penetration, with 3 million units sold, while its baked-in intelligent assistant, Alexa, continues to be an integration target. In fact, the catalog of Alexa applications (which Amazon calls “skills”) grew from 135 at the start of 2016 to roughly 6,000 by the end of the year, a sign of how quickly the application market for the Internet of Things is expanding. Google also looks poised to enter the software development kit (SDK) game in a big way in 2017, as it strives to ensure that its answer to the Echo, the voice-enabled Google Assistant, is at least as big a hit.
API versus SDK
Still, despite all the market potential, developers looking to add speech functionality to their products should first decide how they want to build it. If they are designing for a single platform (say, Amazon’s Alexa or Apple’s iOS) and merely want to integrate the platform’s native speech into their own applications, a software development kit (SDK) is in order. If they instead want to build outside speech technology into an independent product, they should look for solutions with robust APIs, so that the underlying speech technology is easy to integrate and runs seamlessly within the product, experts advise.
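The rule of thumb above can be sketched as a small decision helper. This is only an illustration of the reasoning, not a vendor tool: the names `ProjectPlan`, `Integration`, and `choose_integration` are hypothetical, and the platform labels are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Integration(Enum):
    SDK = "platform SDK"
    API = "third-party speech API"


@dataclass(frozen=True)
class ProjectPlan:
    # Platforms the product must run on, e.g. {"alexa"} or {"ios", "android"}.
    target_platforms: frozenset
    # True if the product will rely on the platform's own built-in speech.
    uses_native_speech: bool


def choose_integration(plan: ProjectPlan) -> Integration:
    """One platform plus its native speech suggests an SDK; otherwise an API."""
    if len(plan.target_platforms) == 1 and plan.uses_native_speech:
        return Integration.SDK
    return Integration.API


# Hypothetical projects, used only to exercise the helper.
alexa_skill = ProjectPlan(frozenset({"alexa"}), uses_native_speech=True)
standalone_app = ProjectPlan(frozenset({"ios", "android"}), uses_native_speech=False)

print(choose_integration(alexa_skill).value)     # platform SDK
print(choose_integration(standalone_app).value)  # third-party speech API
```

Real projects will weigh more than these two factors (pricing, language coverage, offline support), so treat the helper as a starting checklist rather than a verdict.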
It’s important, then, to weigh each of the available application development platforms and assess which ones are compatible with what you are trying to do. Here are the top contenders right now:
• Nuance Mix. Nuance Communications remains the industry leader in speech-specific technology and solutions, and so of course the company offers an SDK. This year, Nuance rebranded its Nuance NDEV platform as Nuance Mix, which gives developers access to Nuance’s robust speech recognition and text-to-speech voices, configurable for more than 30 languages. Mix has three-tiered pricing, with the lowest tier free for the first 20,000 transactions per month per application. Android, Apple iOS, and HTTP are supported.