Emotion Is the Next Frontier in Human-Computer Interaction

As the use of voice interfaces continues to increase, researchers in Japan have been busy conducting a number of studies to understand how we perceive and interact with computer voices to help engineers in future voice technology development projects.

The researchers, who hailed from the Tokyo Institute of Technology, the RIKEN Center for Advanced Intelligence Project in Japan, and gDial in Canada, found that users anthropomorphize the computer-generated agents with which they interact and prefer interactions with automated agents that match their personalities and speaking styles. They also found a preference for human voices over synthetic ones and that the inclusion of vocal fillers, such as pauses and terms like “I mean...” and “um,” improved interactions.

In general, the survey found that people preferred humanlike, happy, empathetic voices with higher pitches. The researchers also found that users tended to perceive agents more favorably when the agents were embodied and when the voice matched the body of the agent.

Did we really need a huge study to tell us that? Ever since the early days of the speech technology industry, engineers and developers have been on a never-ending quest to make voices that sound more lifelike and natural, that speak as normal humans do in everyday conversations.

That quest has led to tremendous advances in just a few short years. The introduction of artificial intelligence, natural language processing/understanding, and emotional intelligence into speech synthesis and speech recognition has certainly put the industry well on its way toward reaching its goal. That was one of our biggest findings while compiling our list of this year’s Speech Industry Award winners (starting on page 13).

Recent technology advances—many of which would have seemed unlikely just a few years ago—fueled the exploits of our 10 Speech Industry Award winners for 2021. They show that the industry is undergoing a dramatic transformation; that speech is clearly reshaping how we interact with one another and the environment around us; and that speech is becoming a more ubiquitous and essential part of our daily lives.

The speech industry continues to build on its previous successes—and failures—with a constant stream of innovations, and many of this year’s achievements can be attributed to the hard work of our 10 visionary vendors for 2021.

Thanks to these vendors, and the dozens of others that make up the industry, robotic-sounding speech synthesis, disconnected interactive voice response systems, and error-prone speech recognition and speech-to-text systems are largely relics of the past. Today’s systems are more capable, accurate, secure, and lifelike—not to mention easier to use and cheaper to deploy.

But there is still plenty of work to be done before automated speech solutions can fully engage in empathetic conversations with real people. Research firm Strategy Analytics recently reported that chatbots today are inflexible, prescriptive, and unable to work outside of their scope. Further, it found that the inability of chatbots to express emotion, attitude, or opinion, especially when they cannot solve a customer’s problem, leads to user frustration and cessation of use.

“Despite some successes in the development of empathetic chatbots, human-level intelligence is still not fully understood. Building intelligent social chatbots that can understand humans and their surrounding world requires further advances in AI, particularly as their use diversifies into critical health-related services, such as mental health support systems,” said Kevin Nolan, vice president of the UX Innovation Practice at Strategy Analytics, in a statement.

“Research has shown that a customer’s emotions have significant influence on their satisfaction with a service chatbot. Consumer reaction to error is significantly influenced by perceived competence and trust. By designing systems that are user-centric and content-driven, in addition to preventing recognized non-progress events from occurring, this will provide numerous benefits to the businesses using them,” added Diane O’Neill, director of Strategy Analytics’ UX Innovation Practice, in a statement.

The field of human-computer interaction, particularly that of voice-based interaction, is a burgeoning one that continues to evolve almost daily. As such, the recent research cited above provides an essential starting point for creating new technologies, and improving existing ones, in voice-based human-agent interaction. That should keep the industry going strong for years to come.
