August 7, 2015
By Leonard Klie, Editor, Speech Technology and CRM magazines
FYI

W3C Pushes a Multimodal Development Standard

One of the biggest problems hindering further advancement of the Internet of Things is compatibility. Individual devices have their own interfaces—laptops and PCs use mice and keyboards, tablets have touch screens, TVs have remote controls, and cars have steering wheel buttons. Too often, manufacturers use proprietary standards and protocols, resulting in a collection of devices that cannot communicate with one another.

To address this issue, the Multimodal Interaction Working Group of the World Wide Web Consortium (W3C) published in June the first working draft of a proposed standard that would govern how smartphones and other mobile devices recognize and communicate with other networked applications and devices as part of the Internet of Things.

The proposed standard, called "Discovery and Registration of Multimodal Modality Components: State Handling," is designed to govern how multimodal applications are distributed over local networks or in the cloud. With it, personal assistants and other mobile applications "can automatically detect where you are and what networks and devices are available," says Deborah Dahl, chairperson of the W3C's Multimodal Interaction Working Group. "It's aware of your environment and what devices are around you and automatically connects to them."

The proposed standard builds on the W3C's Multimodal Architecture Specification and lays out how the system discovers and registers all the modality components, including speech applications, in a surrounding network. In this way, modality components can be automatically configured to adapt to the state of the surrounding environment.

"The beauty of it is being able to make things within an environment available dynamically as you enter that environment and disappear as you leave it," she says.

The proposed standard further sets forth the types of modality components that will be responsible for handling specific messages exchanged on the network and presents an adaptive push/pull mechanism to inform the system about changes in the state of the modality components. It also allows Internet-connected devices to enter sleep mode to conserve power when they are not being used and sets out the protocols for waking them when they are needed.
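For readers who think in code, the push/pull idea described above can be sketched roughly as follows. This is an illustration only and is not taken from the W3C draft: the `Registry` and `ModalityComponent` classes, the `State` values, and every method name are hypothetical, chosen to show a component being registered, its state changes being pushed to listeners, and the same state being pulled on demand.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Tuple


class State(Enum):
    # Hypothetical states; the draft's own vocabulary may differ.
    ACTIVE = "active"
    SLEEPING = "sleeping"


@dataclass
class ModalityComponent:
    name: str
    state: State = State.ACTIVE


@dataclass
class Registry:
    """Toy registry: tracks components and reports state changes."""
    components: Dict[str, ModalityComponent] = field(default_factory=dict)
    listeners: List[Callable[[str, State], None]] = field(default_factory=list)

    def subscribe(self, listener: Callable[[str, State], None]) -> None:
        # Push side: listeners are called whenever a state changes.
        self.listeners.append(listener)

    def register(self, component: ModalityComponent) -> None:
        # A newly discovered component is recorded and announced.
        self.components[component.name] = component
        self._notify(component)

    def unregister(self, name: str) -> None:
        # The component has left the environment.
        self.components.pop(name, None)

    def set_state(self, name: str, state: State) -> None:
        # Sleep or wake a component; notify only on an actual change.
        component = self.components[name]
        if component.state is not state:
            component.state = state
            self._notify(component)

    def poll(self, name: str) -> State:
        # Pull side: ask for the current state on demand.
        return self.components[name].state

    def _notify(self, component: ModalityComponent) -> None:
        for listener in self.listeners:
            listener(component.name, component.state)


# Usage: register a speech component, put it to sleep, then wake it.
events: List[Tuple[str, State]] = []
registry = Registry()
registry.subscribe(lambda name, state: events.append((name, state)))
registry.register(ModalityComponent("speech"))
registry.set_state("speech", State.SLEEPING)
registry.set_state("speech", State.ACTIVE)
```

In this sketch, a listener that needs immediate updates subscribes and receives pushes, while a low-priority client can simply call `poll` when it needs to know. A real implementation would exchange these messages over a network rather than through in-process calls.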

"Discovery and Registration of Multimodal Modality Components: State Handling" is currently just a working draft, and the W3C is seeking public comment before advancing it through the standards adoption process. Dahl expects to start seeing the first test cases in about a year and expects a final draft to be ready in 18 to 24 months.

"I'm really looking forward to seeing some of the field trials. This could be revolutionary in how we interact with our environments on a day-to-day basis," she says.

An eager developer community is already standing by, ready to launch applications that can add custom voice controls to everything from smartphones and smart watches to Internet-connected thermostats, toothbrushes, and garage door openers.

Big companies like Apple, Google, and Samsung are investing heavily in the technology. In fact, Samsung CEO Boo-Keun Yoon earlier this year pledged more than $100 million in funding for developers to create an open system that will spark an Internet of Things revolution.

While these companies have deep pockets to spend on the IoT, there is no shortage of smaller companies and independent developers also working on voice-enabling connected devices. One such company is Facebook-owned Wit.ai. Already, about 4,600 developers are using Wit.ai to power connected devices, including robots, home automation systems, and wearable devices. A student at the University of Waterloo in Ontario, Canada, even used Wit.ai to develop voice controls for a toaster and microwave oven.

Research firm Gartner predicts that by 2020, the number of Internet-connected devices will reach 26 billion worldwide, up from 4.9 billion today. The current number shows a 30 percent increase from 2014.
