The Future of Intelligent Voice Assistants

In the future, will you use one intelligent voice assistant (IVA) to mediate and connect you to the world, or will you prefer to use direct connections to various organizations’ agents? In a recent virtual panel discussion, AVIOS members debated the pros and cons of these different models and likely future developments.

One IVA as Majordomo

Here are the arguments in favor of the single IVA theory:

• Some very big companies are investing a lot to make this model happen.

• From the user’s perspective, one IVA is easier to learn than many. Consistency is simplicity.

• To be effective, an IVA majordomo needs significant personal information that would be difficult to transmit to every one of its minion assistants. Think email, contacts, shopping lists.

• Because personal information is in one place, it is safer.

• Authentication is much easier when it is simply one IVA rather than many.

• Having an assistant that works for you and not the enterprise removes conflicts of interest.

• In the future we will have teachable intelligent assistants who will learn about you, your history, your details, your relationships with other services, your preferences, even perhaps your goals.

And here are the counterarguments:

• The flip side of the fourth point above: Because personal information is in one place, it could be less safe.

• Having a personal assistant that works for a global tech giant creates a conflict of interest.

• Skill discovery could be a challenge: “Alexa, tell Jeep to start my Jeep.”

• General IVAs do not offer consistent conversations across channels because the tech stack (automatic speech recognition [ASR], natural language understanding [NLU], dialogue management, text-to-speech [TTS]) differs, for example, among iOS, Android, POTS, and chat/SMS.

• Enterprises can’t control the brand. The branding is the general assistant.

• How do you transfer to a live person?

The Many-Minions Model

Again, let’s start with the pros:

• The enterprise can control branding and the persona.

• The costs of using the generic IVA and its cloud back end are unpredictable, and if it’s down, it’s down.

• From the user’s perspective, why would you trust a global tech giant any more than a global financial institution?

• The enterprise likely understands its own business better than the global tech giant does and is able to provide you with unique services and deals that the majordomo wouldn’t even know about.

• If you have a couple of different agents, you have some backup. Maybe one can find you a better deal than another. You have competition.

• Enterprise assistants offer a consistent omnichannel conversational experience because of a unified technology stack (i.e., you can use the same ASR across iOS, Android, and the phone, as well as the same NLU for these, plus chat).

• Reuse of these assets across channels saves investment by the enterprise.

• This model supports human-assisted understanding.

• Specialization of functions is complex and better suited to this model, as enterprise assistants can be optimized for specific domains (think healthcare).

• This model allows for consistent branding across channels.

And here are the cons:

• Enterprises would need to build support for multiple channels/agents.

• Cognitive load: Do I need to learn their names? Also, adding more apps doesn’t scale.

• If I go from one agent to another, they live in parallel but separate universes and have no idea what happened with the other one. They don’t talk to each other. If you’re stupid enough to rent cars from two different agents for the same trip, there’s no one to alert you.

Six of One…

Asking which model is better is like asking which car is better, a Honda or a Beemer: It depends. For whom? For what? It’s similar to the question of whether an independent insurance agent is better than a direct agent (one who works for only one insurance company). Or how about a financial adviser who works for a brokerage, sells you only that brokerage’s products, and is paid by the brokerage, versus a wealth manager who takes a cut from your account for their unbiased advice? For some enterprise tasks, say package tracking, using a general assistant is easy. For a complex task, like healthcare, you may want to see a specialist.

And the majordomo versus many-minions dichotomy is also false, because I might use Siri on my phone, Alexa in the kitchen, and Google in the car. So there are at least three personal agents working for me.

Nevertheless, it’s worthwhile to think through these things and educate yourself about what’s best for your situation, either as an individual or as an enterprise. 

Phil Shinn is chief technology officer of ImmunityHealth and principal of the IVR Design Group.
