How Watson Will Affect VUI Designers
When your primary focus is building effective automated speech interactions, it's easy to get lost in a project's technical details. Systems designed to mimic human speech and conduct valuable interactions are complicated, with many nuances to consider.
Things get more complicated when you consider all the different experiences people can have when speaking to a machine. It's not enough, for example, to evaluate a system's ability to leverage speech recognition and language understanding to complete a desired task. When assessing the total user experience, each task along the way must be measured against the users' behaviors and expectations of the system, not the designers' or developers'. That can be very difficult, especially in a call center environment, where many people call in every day with very different expectations.
And that's only on a single interaction channel. Consider how much more complicated things get when you bring automated speech technologies to multiple interaction channels. Nonetheless, as more people use mobile devices, there's a lot of interest in creating multichannel environments (Web, email, chat, phone, social media, etc.) with multimodal (talk, touch, and type) interfaces.
Further complicating matters, progressive tech enthusiasts are talking about integrating these various channels to create an omnichannel environment—one in which people can move between communication channels without losing information along the way. Given the challenges speech technology still faces on the telephone, is it ready to integrate with other channels? Are we pushing the technology to perform at a level it can't yet reach?
I think not. Actually, this could bode very well for the appreciation and adoption of speech technology, assuming, of course, there are no integration issues. That's one of the things AT&T is focusing on with its Watson technology, according to the story "AT&T's Watson Answers the Call," by Michele Masterson. By creating a multichannel platform for developers, AT&T and its partners will effectively enable users to choose the interaction channel they prefer. Doing this successfully will certainly improve their user experiences.
That's not the only speech-enabled Watson technology that can improve customer experiences. IBM's Watson can recognize spoken requests, search unstructured Web data, and deliver a relevant response in real time. The company demonstrated this ability to millions of viewers when its Watson technology beat two of the show's most successful champions on the TV game show Jeopardy! in 2011.
Imagine combining that kind of computing power with a multichannel interaction platform. It would create a self-service experience unlike anything we've ever seen before. This isn't just wishful thinking. AT&T has been building its Watson technology for decades, and IBM recently announced plans to invest more than $1 billion in its Watson Group. For more information on their efforts, read our double feature package "AT&T and IBM: Which Watson Works for You?"
So what does this mean for speech technology user interface designers and developers? Even those who have worked exclusively in speech technology environments are customer experience experts, which means they have a lot to offer graphical user interface designers as more companies build omnichannel environments.