Increasingly, speech technology is expanding beyond voice-only applications to enhance products in other areas. Our four 2014 Speech Luminaries have furthered this effort with notable contributions in multimodal control language, wearable computing, emotion detection, and mobile user authentication.
The XML Statesman
Jim Barnett, Director of Development, Genesys
Jim Barnett is a software architect who specializes in designing contact centers and Web standards. A member of the World Wide Web Consortium (W3C), Barnett is an expert in the fields of State Chart Extensible Markup Language (SCXML), Voice Extensible Markup Language (VoiceXML), multimodal architecture, WebRTC, workflow design, call routing, IVR systems, contact center architecture, and natural language processing. He is currently a director of development at Genesys.
Barnett has been highly focused on the development of SCXML, which is used to control other resources. "You can think of that as a concept where the way a system responds depends on the state it's in," he says.
He maintains that SCXML is analogous to a car's automatic transmission. "If a car is in park or reverse, it won't drive," he says. "That's [dependent] on the state of the car. If the state of the car is that it's in drive, you won't be able to turn it off, but if you put it in park, then you're allowed to turn it off. That's when a car changes state and responds to things like turning the key. SCXML is a very powerful language for describing systems like that.
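Barnett's transmission analogy maps directly onto an SCXML state chart. The following is an illustrative sketch (the state and event names are invented for this example, not from any Genesys product): the machine only honors a key.off event while in the park state.

```xml
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="park">
  <state id="park">
    <transition event="shift.drive" target="drive"/>
    <!-- turning the key off is only honored while parked -->
    <transition event="key.off" target="off"/>
  </state>
  <state id="drive">
    <transition event="shift.park" target="park"/>
    <!-- no transition for key.off here, so the event is ignored in drive -->
  </state>
  <final id="off"/>
</scxml>
```

How the system responds to key.off thus depends entirely on which state is active, which is the point of Barnett's analogy.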
"SCXML is designed to invoke and control resources that are defined in other languages or specifications," Barnett says. "An SCXML script can invoke a VoiceXML script or directly control an ASR or TTS engine, even though VoiceXML, ASR, and TTS are not defined as part of the SCXML specification. Instead, the SCXML specification defines the hooks that you use to connect to any type of resource you want to control."
"Due to Jim's tireless work, developers in and outside of the voice application community can use SCXML to describe and automate complex processes," says James Larson, vice president of Larson Technical Services and cochair of the W3C Voice Browser Working Group.
The markup language is already being used at Genesys, Barnett says, to control different platforms, such as telephony. "[SCXML] is used as a general control language," he says. "It can be used to control pretty much anything that Genesys' platforms provide. You can use it when VXML is invoked, in how a call gets routed, to control sending an email, anything you want."
SCXML could also be used as a multimodal control language. "The multimodal field is where exciting things are happening," Barnett says. "This is where big advances are going to come over the next few years. We'll get much better at [letting people] use speech when they want to use speech and [a] graphical [interface] when they want to use that."
The Wearables Whiz
Ahmed Bouzid, Cofounder and CEO, Xowi
If the next disruption in speech technology is going to be wearable and ubiquitous, Ahmed Bouzid is one of the people leading the charge.
As cofounder and former CEO of Xowi, a McLean, VA–based start-up focused on delivering wearable voice technologies, Bouzid is convinced that the time has come for the voice-based user interface (UI). "A lot of attention and energy are being