Companies Are Investing in Voice, But They're Ignoring the Real Problem


In 2019, an analysis of the rise of voice technology noted that the sector's success hinged on its ability to perform in noisy environments. Two years later, voice technology has had incremental improvements, but it remains clumsy. For all its promise, voice still falls short of its lab-tested potential once deployed in the real world.

Why? The real world is dynamic and unpredictable. Making solutions that work in these kinds of environments is challenging. Instead of complex solutions that match the complexity of the real world, we have simplified solutions tested in simple ways. We need to design voice solutions for the real world.

There is massive investment happening in conversational artificial intelligence. Verint recently acquired Conversocial for $50 million, and 67 percent of businesses are expected to increase their conversational AI budgets this year. However, if companies continue to test their products with synthetic sound environment models, these investments will fail.

With businesses pouring significant investment into conversational AI and voice assistants, designing voice solutions for the real world is a fiscal imperative if these companies want to survive.

As companies increasingly look to conversational AI to front their customer-facing operations and 24-hour support becomes the norm, voice must rise to the occasion and meet the human standard.

It's time to deploy voice that's ready for the real world.

When we talk about humanizing voice technology, we want our voice assistants and voice interfaces to feel as close as possible to interacting with another human. If we expect a person not only to hear us but also to understand what we mean and want in a given situation, our voice assistants should also hear and understand us, with all our nuances and peccadilloes, and perform the given function. This is the human baseline.

When companies test their voice technology with synthetic sound environments, they develop a set of environmental sound profiles in the hope that one will match each user's real-world environment. The issue, however, is that real-world situations are dynamic. Although a profile might match at the start of an interaction, the environment inevitably changes or new variables enter the equation. The technology can't perform in the myriad edge cases that occur in everyday life.

Voice technology is complicated, so businesses exploring voice solutions need to hire a product expert.

Voice remains clunky and doesn't work properly because the way voice solutions are evaluated is not consistent with how they're going to be used. To design voice solutions that meet the human standard, we need to design for the uncertainty, complexity, and dynamics of the real world, and the tests for these solutions must be as complex as the real world.

If your business is employing voice assistants to handle customer support calls from people who might be driving, walking through a crowded mall, or standing in any other environment with interfering noise and shifting variables, those assistants are going to run into the same problems as before.

Consider this example: a business wants to install a voice-activated kiosk in the middle of a mall. In this single environment, you have to account for considerable noise: the cascading sounds of hundreds of shoppers walking across hard tile, multiple conversations and voices overlapping into a cacophony that rises and falls, a fast food employee in the food court scooping ice into a cup, and so on. You can try to replicate this scenario in a test lab, but nothing can match the dynamic sound environment. If you test your voice technology in a lab without this level of variable noise, it will fail in the real world. The kiosk will be unusable except, perhaps, in the early morning hours when there's little or no foot traffic.

Across marketing, sales, and customer service, companies deal with millions of interactions every day—all happening in dynamic, complex, and uncertain environments. If your voice solutions are not ready for the volatile world, you risk alienating and losing customers.

A Matter of Security

While businesses are focused on the user experience with voice, they also need voice solutions that guarantee the safety and privacy of their users' data. The rise (and increasing scope of responsibilities) of the chief information security officer is in direct proportion to security being named in a global study as the top priority for businesses. The same report found that 84 percent of executives said their organizations had suffered a data loss or security incident in the last two years.

Security also needs to be a top priority when developing voice technology. Users are becoming increasingly reliant on voice technology and voice assistants when exchanging sensitive information or in high-risk situations, such as driving and navigating, healthcare settings, noisy factory floors, and high-traffic retail areas.

In 2018, the Capgemini Digital Transformation Institute issued a report on voice technology revolutionizing ecommerce. Among its findings was that 28 percent of active banking and insurance service customers in the United States, United Kingdom, France, and Germany used voice assistants during transactions. However, voice agents that are unable to correctly process information, such as bank account information, lead to people repeating this information more loudly or exiting the interaction out of frustration.

According to a PwC report on the impact of voice assistants on consumer behavior, the majority of consumers have yet to use their voice assistants for activities beyond asking quick questions or as an alternative to a search engine.

Privacy concerns are often cited as a barrier to what could be commonplace activities like online shopping with voice. If your voice assistant can't yet understand you, why would you trust it with something as sensitive as health or credit card information?

Voice technology meeting the human standard is not simply a means to having a more seamless interaction with your voice assistants. Without real-world testing, voice technology is unable to perform at a high enough level to guarantee safety and security. This is how you stifle a market and chase away customers.

The Next Generation of Customer Engagement

Think about a scene from any television show where the actors are surrounded by other people. Is it loud? Difficult to hear through all the noise? Of course not. Extras are often directed to pantomime, to pretend as though they're talking so as not to interfere with the dialogue. You hear the main characters, but you don't hear the people around them.

The real world, however, is noisy and messy. We interact with other people in the midst of competing sounds and are often surrounded by other voices. Real-world sound environments are dynamic and always changing. People move. The room gets quieter or louder. Without thinking, people naturally adapt. Human hearing itself is a dynamic system.

In the next decade, we are going to see a new wave of voice technology transform customer experience and engagement. Every month, nearly 128 million people are estimated to use a voice assistant. That number is expected to exceed 135 million by 2022. Voice technology is no longer locked into the realm of stand-alone devices or mobile phones. It's being integrated into virtually every business.

The possibilities for companies are unimaginable.
