The 2022 State of Speech Analytics

Companies are increasingly relying on speech analytics for contact centers, podcast analysis, telemedicine, and other applications.

According to Research Reports World, the speech analytics market is expected to surpass $4.2 billion by 2030, growing at a 21.2 percent compound annual rate.

Among the drivers for the growth, according to the report and industry experts, are an expanding outsourcing industry, improved scalability, lower cost, and the need to enhance agent productivity due to higher volumes in contact centers.

“We see a real improvement in transcription accuracy and scalability and more vendors offering APIs to create an infrastructure that is allowing for higher accuracy at a reasonable cost,” says Daniel Ziv, Verint’s vice president of speech and text analytics global product strategy, admitting that cost is still an issue for some potential users.

“If you don’t have high accuracy, you don’t have cost-effective transcription,” Ziv adds. “That’s the fundamental requirement. That’s been a challenge in the past, both in terms of accuracy and in terms of scale and cost. So there’s a big push in the industry to make transcription more cost-effective and more accurate.”

Year in Review

The push came from a changing array of providers, according to Rick Britt, CallMiner’s vice president of artificial intelligence. A number of smaller providers entered the market offering very basic speech analytics solutions focusing on agent improvement and compliance rather than more accurate transcripts designed for a host of uses. Once those smaller companies prove they are effective with these solutions in those areas, they will likely attempt to expand into other areas.

The smaller companies see opportunity in the market as larger players look to solidify their positions through acquisitions. Examples included Qualtrics buying Clarabridge, LiveVox acquiring SpeechIQ, and Medallia buying Voci.

Larger providers are also offering analysis of contact center transcripts for a vast array of uses, from agent improvement and compliance to product, marketing, and service adjustments.

Beyond more accurate transcripts, speech analytics users need to know what to do with the transcripts once they have them, Ziv points out. For example, during the height of the pandemic, sales shifted from in-store to phone or online. In healthcare, many new applications emerged to handle dictation for doctors so they could be much more productive.

Speech analytics providers need to ensure that their technologies can handle not only the growing number of contact center interactions but also the growing sources of interactions, Britt says, including social media, chats, text, and emails.

These multichannel journeys have prompted companies to analyze together the interactions that a single customer has across different channels about a similar subject.

Britt pointed to Radial, an outsourced e-commerce technology and services company, which used the CallMiner conversation analytics platform to develop better “last mile” delivery solutions. Radial conducted sentiment analysis through CallMiner and provided the data to its end customer, which worked with Radial to improve the delivery process.

Whereas Radial is an expansive company with 2,000 to 4,000 agents working at any one time, speech analytics has worked its way down to many smaller users, including single offices.

“Many people have a personal scribe, but that is very expensive,” Ziv says. “And a human scribe, regardless of how good he or she may be, is about three times slower than you can do it with automation. And [humans] are going to be much less accurate, especially if they’re typing.”

With the amount of speech data growing swiftly and the tremendous amount companies already have in their archives, AI and machine learning are going to continue to drive the newest speech analytics solutions, experts point out.

With additional data, speech analytics solutions have become much better, according to Ziv. “Transcriptions are becoming much more predictive because you’re learning what is likely the next word. You’re not guessing just based on the sound, you’re using the context. Speech engines are much more accurate.”

Additionally, algorithms continue to get better, and computing power continues to grow, especially for users of scalable solutions in the cloud, which have more processing power than on-premises solutions, Ziv adds.

In addition to harnessing machine learning, AI, and advanced computing power, speech analytics users are moving toward APIs with more advanced analytic capabilities to drive ROI and build innovative products, according to Dylan Fox, CEO of AssemblyAI. “We refer to these as ‘audio intelligence’ features. Some of the top features we’re seeing companies implement include sentiment analysis, summarization, entity detection, [personally identifiable information] redaction, and content moderation.”

Last summer, Verint released its Da Vinci AI and Analytics, which uses advanced machine learning models, natural language processing, intent models, and sentiment models, as part of the Verint Customer Engagement Cloud Platform. The platform uses expanded linguistic and acoustic analysis capabilities and AI-powered bots to comprehend what’s being said, how it’s being said, and the corresponding agent’s actions and desktop activity. It is designed to understand the full interaction context to provide meaningful assistance in the moment to improve agent efficiency and reduce handle time.

Verint’s Real-time Agent Assist, meanwhile, provides critical notifications, knowledge, and reminders on the agent desktop to guide agents on the next best action and drive positive interaction outcomes.

All of this is increasingly important in the current business climate.

“While nearly all businesses know the importance of customer empathy, most struggle to deliver it on a consistent basis, especially given the work-from-anywhere contact center environment,” Ziv says. “Our latest innovation supports the delivery of exceptional experiences aligned with customers’ emotional states and intents for more impactful and meaningful interactions.”

The improvement in the underlying technology has pushed much of the use of speech analytics in the contact center from post-call to real time, according to Ziv. Verint’s Real-time Agent Assist, for example, is focused on using speech analytics to help contact center agents improve customer sentiment; improve compliance for collections, payments or other calls that require legal disclosures; or improve sales opportunities.

“There are focused areas that the system is looking for,” Ziv explains. “If it identifies an issue, it guides the agent or alerts a supervisor. It’s never really been that successful in the past because it was expensive to process this data. If it wasn’t in the cloud, you had to have more hardware.”

In addition to analyzing transcripts for keywords, speech analysis for sentiment has become more commonplace in the past year, according to Britt.

A Look Ahead

Beyond the improved underlying technology, changing market dynamics have increased the demand for speech analytics, according to Ziv. “This is more needed because the remote work environment has put the agent at home. Without any peers, without any supervisors, they don’t have the support network that they used to have.”

Even though the industry is slowly making its way back into the office as COVID-19 restrictions start to ease, the need to aid the remote worker is still there. Many contact centers are still operating in a hybrid model, with a significant percentage of agents still working remotely, Ziv says. “If you force people to a certain locale, that limits who you can hire. If you open it up and anybody can work from anywhere, you don’t have issues with time zones and you can go to lower-cost areas. That’s why outsourcing started.”

The advanced capabilities of speech analytics have also pushed the technology beyond the common contact center use cases, according to Fox.

With podcasts and broadcast media, for example, companies can use the technology to identify trends in how listeners react to particular content. Additionally, advertisers can use sentiment analysis to help better determine which podcasts would be the best fit for sponsorship.

And in telemedicine, the technology can be used to audit doctor-patient conversations to ensure positive outcomes or identify trends.

The underlying technology and uses of speech analytics will continue to grow, speech analytics providers expect, in line with the Research Reports World forecast.

Britt adds that speech analytics will progress to provide deeper analysis and micro-segmentations of common issues, like packages always being late in a certain geography, to quickly identify and address problems facing even a small segment of customers.

“We’re so far down the path of this technology that we can create solutions very, very quickly,” Britt says.

And those solutions are also becoming far more robust, scalable, and capable, the experts agree.

Phillip Britt is a freelance writer based in the Chicago area. He can be reached at spenterprises1@comcast.net.
