Speech Technology Magazine

 

Speech Is Set to Dominate the Wearables Market

A voice interface opens a world of possibilities for wearable devices.
By Leonard Klie - Posted May 1, 2014

For many people, their first exposure to wearable voice technology was Dick Tracy's two-way wristwatch radio, which made its debut in the syndicated newspaper comic strip in 1946. Then, in the 1960s TV show Get Smart, CONTROL agent Maxwell Smart's shoe-phone kept him in constant contact with the Chief.

Today, it's Google Glass, which has been showing up in all sorts of places in the past few months. The device, which isn't supposed to hit the market until later this year, made an appearance at the Seattle Seahawks' Super Bowl XLVIII post-game revelry on February 2. As the famed Lombardi Trophy passed through two lines of Seahawks players, wide receiver Golden Tate was seen sporting one of the devices.

British airline Virgin Atlantic in mid-February began a six-week pilot program in which concierges in the Upper Class lounge at London's Heathrow Airport used Google Glass to check in passengers, relay flight information, tell them about the weather and events at their destinations, and translate foreign language information for them. Depending on the outcome of the pilot, the airline could roll Google Glass out to other airports.

Though a few skeptics argue that the market for wearable, voice-enabled devices such as Google Glass will grow very slowly, most analyst firms are predicting a boom in short order. Frost & Sullivan, for example, identified wearable devices among the technologies that will shake up the business world this year. Wearable devices have already started to create a buzz that is only expected to grow louder in 2014, explains Archana Vidyasekar, Frost & Sullivan's visionary innovation team leader and senior research analyst.

Strategy Analytics has numbers to support this enthusiasm, predicting that global wearable device sales will surge from 15 million units in 2013 to 154 million by 2018. Of those, it forecasts that smart watches, smart glasses, and fitness bands will drive most of the growth. This represents a huge market for voice technologies, which experts say will need to be a key element in wearable devices.

"If a wearable is to do anything of complexity, it must have a connection to a smartphone," says Bill Meisel, president of TMA Associates and executive director of the Applied Voice Input/Output Society (AVIOS). "And the only way to interact with the smartphone to do anything of any complexity is by voice."

Raul Castanon, senior analyst at the Yankee Group, agrees. "It's hard to have a functional user interface with smart watches on a very small screen, so for many of these devices, it makes sense for them to be voice-controlled," he says.

"Voice technologies will be critical in many wearable devices," asserts Adam Weigold, information technology analyst at BCC Research. "Voice will not be the only interface option," he continues, but it will be "the easiest for many high-end apps."

Wearable devices will also rely on touch interfaces, Weigold says, but "improved voice control software technologies will characterize most future wearable products."

Dan Miller, senior analyst at Opus Research, advises device manufacturers to include both automated speech recognition and human-like text-to-speech to support spoken conversations. "Everyone recognizes that voice is not always the preferred method to activate and provide instructions to a device, but it cannot be ignored," he says. "When looking at input/output overall, voice would be the number one preference a little more than one out of ten times. But the instances where voice is more convenient and trusted are growing and will continue to grow as wearables become more prominent and voice is more elegantly combined with gesture in the mobile user interface."

One company that's certainly not ignoring speech in the wearable devices market is Nuance Communications, which, at this year's Consumer Electronics Show (CES) in January, introduced a number of intelligent voice-enabled systems for wearables based on its Dragon Mobile Assistant.

Working with Omate, Nuance previewed a smart watch app that enables people to speak to their watches to make calls, send email or text messages, set reminders, manage their calendars, search the Web, update social media, and access sports scores, stock quotes, restaurant recommendations, and the weather.

Using music recognition technology from Gracenote, Nuance is also making it easier to use wearable devices to identify songs as they play, discover hot new artists, and connect to music services. Users who hear a song they like can simply say, "Hello, Dragon, what song is this?" and Dragon Mobile Assistant will identify the artist, album, and track.

Additionally, Nuance is working with electronics maker Philips to integrate Dragon Mobile Assistant into the Philips Hue lighting system. People with this system will be able to command and control the lighting in their homes just by speaking to their smart watches.
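Nuance has not published the details of that integration, but the Hue bridge's local REST API is documented publicly: a light is switched by sending a PUT request to the bridge's lights/&lt;id&gt;/state resource with a small JSON body. As a rough sketch of how a transcribed voice command might be mapped to such a call (the light names, bridge address, and username below are made up for illustration):

```python
import json
import re

# Hypothetical mapping of spoken room names to Hue light IDs; a real
# setup would discover these from the bridge's /lights resource.
LIGHTS = {"kitchen": 1, "bedroom": 2}

def utterance_to_hue_request(utterance, bridge_ip, username):
    """Translate a simple spoken command into a Hue REST call.

    The Hue bridge accepts
        PUT http://<bridge>/api/<username>/lights/<id>/state
    with a JSON body such as {"on": true}.
    Returns (url, body) or None if the utterance isn't understood.
    """
    m = re.search(r"turn (on|off) the (\w+) light", utterance.lower())
    if not m:
        return None
    state, room = m.groups()
    light_id = LIGHTS.get(room)
    if light_id is None:
        return None
    url = f"http://{bridge_ip}/api/{username}/lights/{light_id}/state"
    body = json.dumps({"on": state == "on"})
    return url, body

# A voice assistant would pass its transcription here, then issue the
# PUT request (e.g. with urllib.request) against the bridge.
req = utterance_to_hue_request("Turn on the kitchen light",
                               "192.168.1.10", "devuser")
```

The speech recognizer does the hard part; once an utterance is reduced to text, the lighting control itself is an ordinary HTTP call.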

Castanon expects these types of applications to carry over to the larger portable devices market, with voice control becoming more relevant as a user interface not only for wearable devices, but also for tablets, smartphones, and laptops.

In general, he says, "users will come to expect voice functionality for mobile apps where it makes more sense to enter a command by voice than by touchscreen or keyboard."

The Need for Natural Language

While voice capabilities on these smart glasses and wristwatches are quite robust, Castanon and others see natural language understanding (NLU) as the missing link. "Without NLU, you have to structure commands in a specific way. [NLU] allows people to interact more naturally," he says.
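The difference Castanon describes can be sketched in a toy example (the command grammar and phrasings below are illustrative, not any vendor's actual system): without NLU, the device accepts only an exact command; with even crude intent extraction, varied natural phrasings map to the same action.

```python
import re

# Rigid command grammar: the phrase must match word for word.
COMMANDS = {"set timer 5 minutes": ("set_timer", 5)}

def rigid_parse(utterance):
    return COMMANDS.get(utterance.lower())

# Toy NLU-style intent extraction: tolerate natural phrasings like
# "could you set a timer for five minutes" by looking for the intent
# keyword and its slot value anywhere in the utterance.
NUMBER_WORDS = {"five": 5, "ten": 10}

def intent_parse(utterance):
    text = utterance.lower()
    if "timer" not in text:
        return None
    m = re.search(r"(\d+|five|ten)\s*minutes?", text)
    if not m:
        return None
    value = m.group(1)
    minutes = int(value) if value.isdigit() else NUMBER_WORDS[value]
    return ("set_timer", minutes)
```

The rigid parser rejects "Could you set a timer for five minutes?" outright, while the intent parser resolves it to the same action as the canonical command. Production NLU systems do this with statistical models rather than regular expressions, but the gap in user experience is the same.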

However, NLU is not without its problems. "There are still technology challenges on the natural language understanding part of speech understanding," Meisel says.

Miller says other elements will also have to be considered. "Systems must constantly get better at recognizing what a person means when he or she says something," he states. "That is partly dependent on the quality of speech recognition and partly 
