Speech Technology Magazine

Multimodal Growth Pushes Speech Recognition to the Head of the Class

You can touch this but you don't need to. Speech is right behind touch when it comes to the future of multimodal design, according to new research.
By Michele Masterson - Posted Jun 3, 2015

On smart devices, touch is currently the most popular interaction mode and is expected to remain so. However, as multimodal interfaces are adopted more widely and refreshed, speech recognition will not be far behind, according to a new report from Tractica.

Peter Cooney, principal analyst at the firm and lead author of the report, believes that among the various communication modes, voice solutions will be the "most important area of growth" in mobile user interfaces.

Tractica forecasts the following attach rates for emerging interface technologies on mobile devices by 2020: speech recognition (82 percent); localized haptics (45 percent); gesture recognition (37 percent); voice biometrics (36 percent); touchless/3-D touch (28 percent); and eye tracking (18 percent).

To the naysayers who think that speech recognition is dead or, at best, treading water, Cooney brings up the public's wide acceptance of Apple's Siri, Microsoft's Cortana, and Google Now. Cooney acknowledges that "there is still a lot of work to do, but the technology is already very well developed. There's the Cortanas and so forth, but there are also the embedded solutions such as banking applications that use speech technology."

Cooney points out that acquisitions of speech solution providers in recent years have been made not only by the usual suspects, such as Nuance, Apple, and Google, but also by companies bringing speech in-house to develop their own solutions, such as Facebook, or bringing it to market, as with Amazon. Intel is yet another example, Cooney says. While the company has a massive presence in the tech market, speech isn't the first thing you'd associate with it; yet with its RealSense technology, Intel has been working on integrating speech with visuals.

The growing use of virtual assistants (what Tractica calls virtual digital assistants, or VDAs) on mobile devices will also drive greater speech adoption, Cooney says. Consumers have become so comfortable with self-service that, for some of them, "it creates a much better user experience," he adds; many people call contact centers only as a last resort. Virtual assistants are also a win for companies, as fewer calls need to be handled by agents.

Certainly, mobile interfaces are not the last stop for speech recognition. As devices and products—everything from phones to refrigerators—become "smarter" and woven into the Internet of Things fabric, Cooney believes that speech recognition has nowhere to go but up. 
