New Standards for Virtual Assistants Sought
                
A new World Wide Web Consortium (W3C) community group aimed at developing standards for digital assistants launched June 7, following interest expressed during the SpeechTEK conference in Washington in late May.
“After the panel on ‘The Future of Speech Standards’ at SpeechTEK, several people met to talk about starting a new community group to look at use cases and requirements for possible new standards for newer types of voice applications, like virtual assistants,” says Deborah Dahl, a consultant and principal of Conversational Technologies, a member of the VoiceXML Forum, and chair of the W3C’s Multimodal Interaction Working Group.
Community groups are a relatively new type of W3C group: they are free to join and don’t require participants to work for a W3C member company, Dahl explains. The idea is to get wide input from a community of developers and users on a topic of interest.
Dahl hopes the group can help with developing interoperability standards for digital assistants, much as other W3C groups have been instrumental in developing standards for other technologies. For example, VoiceXML standards are based on use cases centered on telephony-based voice systems.
The typical interaction style that these standards support is system-initiated directed dialogue using grammars to constrain the speech recognizer. In recent years, interaction with voice applications has become much more flexible, with a user-initiated dialogue style and significantly fewer constraints on spoken input.
More recently, many of these new applications have become known as virtual assistants. Most notable among them are Apple’s Siri, Microsoft’s Cortana, Google Now, and Amazon’s Alexa.
According to Dahl, the newly proposed community group will collect new use cases for voice interaction, develop requirements for applications such as virtual assistants, and explore areas for possible standardization, producing specifications where appropriate. Depending on interest, this exploration could include the following:
• the discovery of virtual assistants with specific expertise, such as a way to find a virtual assistant that can supply weather information;
• standard formats for statistical language models for speech recognizers;
• standard representations for references to common concepts like time;
• interoperability for conversational interfaces; and
• work on dialogue management or workflow languages.
New functionality for existing voice standards can also be a topic of discussion, Dahl adds.
To date, one of the major challenges for developers working with digital assistants has been the lack of interoperability standards, according to Dahl. A developer who produced an app for Siri, for example, would have to start from scratch to develop a similar app for Cortana, and again for Google Now. “You can’t switch out from platform to platform,” Dahl says.
Ideally, the community group could develop common standards across these platforms that would solve this issue. However, Dahl admits that there will be pressure from Apple, Google, Microsoft, and Amazon to keep as much of their technology proprietary as possible. “Sometimes they don’t want to participate,” Dahl says of the large companies. “But sometimes they see the value of standards in some areas.”
Midsize companies, on the other hand, tend to be active participants in these types of groups because they want unified standards for developing apps across different platforms, according to Dahl. She expects speech application developers and voice user interface designers to be particularly interested in this group.