Voice Assistant Technology Struggles with Monetization

I still remember my excitement when I first started using Alexa on Amazon’s Echo. It was (and is) such a category-defining product. Leveraging state-of-the-art artificial intelligence, voice assistants like Alexa, Apple’s Siri, and Google Now represented significant advances over previous-generation products. By contrast, I remember buying category-leading voice dictation software 20 years ago. It was expensive, and it had to be trained for many hours before you could use it. Even then it made too many errors, and I gave up trying to use it.

But the new voice assistants had no trouble recognizing speech and not just for dictation; they could be used for many more things and were also fun to use. With Alexa on smart speakers and Siri and Google Assistant on smartphones, it seemed as though voice interfaces were poised to go mainstream. So it is sad to read media reports about cost cutting and retrenchments at the Amazon Echo/Alexa divisions and the cutbacks of Google Assistant investments by Alphabet. The technology industry is looking to shore up profits and is rationalizing costs by reducing the resources allocated to unprofitable divisions, including the divisions working on voice assistants.

But why are these voice assistant divisions not profitable? It can’t be from lack of adoption; Siri and Google Assistant are installed on hundreds of millions of smartphones. More than 100 million Echo devices with Alexa have been sold, and Alexa is installed on a similar number of non-Echo devices. It also can’t be for lack of usage; users interact with these voice assistants billions of times every week.

Amazon is said to have sold the devices at cost and hoped to make profits from usage. But it turns out that it’s difficult to find a business model to effectively monetize voice interfaces and interactions. While there are a large number of user interactions, they tend to be for simple tasks like playing music, setting reminders, and getting information like weather updates.

Voice commerce, or voice-based shopping, hasn’t quite taken off. Compared to mobile apps or websites, the lack of product images and descriptions and the inability to read product reviews are limiting factors. Also, if a smartphone is always at hand or a computer is nearby, users are likely to shop online with those rather than with a voice assistant. Ad-based monetization, which works for other digital channels, is also perhaps not viable here, as voice ads in the middle of an interaction feel more intrusive and detract from the experience. To be sure, the web is full of ads, but we can easily tune them out, whereas you’re forced to listen to voice ads.

Unlike mobile apps on the app stores, third-party Alexa skills (i.e., voice apps) have had limited success. There are more than 150,000 skills in the Alexa catalog, but the typical Alexa user hasn’t been installing, using, or subscribing to any third-party skills. Developers are still figuring out what the killer apps are. This means limited revenue for Alexa app developers as well as for the app store.

Consumers consider voice assistants integrated into their smartphones, smart devices, home automation systems, and cars as part of those products, and so it is difficult to monetize them, as consumers are not directly paying for their use. So, in some ways, consumers sense that voice assistants are a feature, not a product.

Despite widespread adoption and significant advancements in capabilities, the monetization of voice assistant technology remains elusive. The billions of dollars invested in improving the technology and driving consumer adoption have not led to profitability, resulting in the above-mentioned cost cuts and layoffs. Often, we tend to overestimate what a technology is capable of in the short term and end up here.

What’s the way forward? That’s a topic for another day. While generalist voice assistants are tough to monetize, several opportunities and white spaces exist for specialist products, such as in customer service, in business-to-business (B2B) applications, and in building tools to improve accessibility for seniors and special-needs users. Let’s not underestimate the long-term potential. 

Kashyap Kompella is CEO of rpa2ai Research, a global AI industry analyst firm, and co-author of Practical Artificial Intelligence: An Enterprise Playbook.
