Call Recording, Transcription, and Analytics Find Strength in Unity


Voice recording, transcription, and analytics have been used by companies for years to provide better customer service through the contact center. Everyone who’s ever placed a call to customer service has likely heard the line “This call may be recorded,” or some variation of it.

As customers called in, recording would start. The recordings would then be transcribed, and those transcripts would then be fed into an analytics engine that would provide insight for agent training and other purposes.

In the past, the technologies for each stage in this process were separate, so companies would typically have to go to three different providers to implement them.

That was then, and this is now. Today, call recording, transcription, and analytics technologies have become so deeply aligned that contact center operators can now buy them from a single provider.

“There is a lot that drives having a bundled solution, what we’d like to call a platform solution, that can address multiple needs, especially if there’s such strong synergy,” says Daniel Ziv, vice president of speech and text analytics at Verint. “Recording, transcription, and analytics are all very linked together, and one impacts the other.”

As artificial intelligence and machine learning have evolved, the technology for recording, transcription, and analytics has become part of the same underlying engine, experts agree.

For speech and other technologies, thinking has shifted back and forth between purchasing best-of-breed solutions from different vendors and buying bundled solutions from a single vendor, with the latter driven partly by having “a single throat to choke” if anything goes wrong. Of course, the bundled solution providers will contend that their individual solutions are best-of-breed too.

There are a few reasons for buying these technologies as a bundled solution from a single provider rather than seeking out different vendors in the hopes of pursuing a best-of-breed strategy, according to Ziv.

Getting the solutions from different vendors is very expensive, Ziv says. “The more you bundle, the more cost-effective it is.”

Verint isn’t the only vendor to have recently unified recording, transcription, and analytics technologies into a single offering.

“We started our company because it used to be that if you wanted access to state-of-the-art speech recognition software, you had to buy and license this [expensive] enterprise software,” says Dylan Fox, CEO of AssemblyAI, a provider of automatic speech recognition and speech-to-text technology for contact centers. “Beyond the high cost—most of it paid up front—there were numerous cumbersome and restrictive agreements.”

The cost alone was a significant barrier to many companies adopting speech recognition technology, according to Fox, but costs have come way down.

“The overall accessibility of this technology has just gone way up over the past couple years. It’s only recently that the technology has become readily available for developers,” he says.

“Our goal with AssemblyAI was to build and research state-of-the-art speech recognition technology that within two years would be just as accurate as human transcription and to make that accessible to any software developer through a simple API without paying anything up front. It’s all pay-as-you-go.”
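The pay-as-you-go model Fox describes can be sketched in a few lines: cost scales with the audio processed, with nothing paid up front. The per-second rate below is a hypothetical illustration, not an actual AssemblyAI price.

```python
# Illustrative pay-as-you-go cost model; the rate is a made-up example.
HYPOTHETICAL_RATE_PER_SECOND = 0.0003  # USD per second of audio


def transcription_cost(audio_seconds: float) -> float:
    """Cost of transcribing a given amount of audio, with no up-front fee."""
    return audio_seconds * HYPOTHETICAL_RATE_PER_SECOND


# A one-hour call at the hypothetical rate:
print(f"${transcription_cost(3600):.2f}")  # $1.08
```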

Another issue with obtaining solutions from separate providers is that such solutions are often closed, so they don’t integrate with other solutions, according to Ziv. “That limits what you can do.”

Different solutions can also make upgrades more complex, Ziv adds.

Managing the technology also becomes more cumbersome when working with multiple providers, adds Greg Armor, executive vice president of Gryphon.ai, a provider of voice-driven sales technology.

“Some companies want to buy stand-alone point solutions because they solve a problem in the moment. But yet, when it comes to sales technology, they’re looking to consolidate multiple tools and multiple vendors into one solution because they’re starting to see a crowded overlap in their tech stack,” he explains. “You can have five different companies doing five different things, which means five different interfaces, five different dashboards, and five different places that sales reps have to log in to and go to. That’s not the best practice. A consolidation of the tech stack is most likely being driven by the use case of a single interface to not confuse or clutter the day and life of the sales representative.”

Providers are looking to ensure they can offer all or most of the recording, transcription, and analysis capabilities that customers want, either by developing those capabilities internally or through strategic acquisitions.

Dialpad, for example, in mid-September acquired Kare Knowledgeware, a customer experience platform for workflow orchestration, knowledge management, analytics, and business intelligence. Dialpad executives saw the acquisition as a way to enhance their artificial intelligence and natural language processing capabilities to enable conversational AI and improve the customer and agent experience.

The Dialpad-Kare self-service solution will use AI and robotic process automation (RPA) to connect customers and agents to websites and knowledge bases. During customer interactions, it can identify more complex issues and automatically route these interactions to live agents as needed.
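The escalation behavior described above can be sketched as a simple rule check: scan each customer utterance and hand off to a live agent when it suggests a complex issue. The marker phrases and labels below are illustrative assumptions, not Dialpad's actual routing rules.

```python
# Hypothetical sketch of self-service-to-agent escalation.
# Marker phrases are illustrative, not a vendor's real rule set.
COMPLEX_MARKERS = {"dispute", "legal", "escalate", "supervisor", "cancel my account"}


def route(utterance: str) -> str:
    """Return 'live_agent' when the utterance suggests a complex issue,
    otherwise keep the caller in automated self-service."""
    text = utterance.lower()
    if any(marker in text for marker in COMPLEX_MARKERS):
        return "live_agent"
    return "self_service"


print(route("What are your opening hours?"))   # self_service
print(route("I want to dispute this charge"))  # live_agent
```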

“The seamless integration with Dialpad’s Contact Center and Voice Intelligence, offering live transcriptions and a searchable archive of every call, augments and amplifies agent skill levels to make a difficult job easier and create optimal customer interactions,” Dialpad said in its acquisition announcement. “With Dialpad-Kare technology reducing average hold time, improving operational efficiency through self-service, and up-leveling agent proficiency, callers benefit from a streamlined customer experience due to increased agent availability and effectiveness.”

Bundled solutions also allow companies to identify and act on problems much more quickly because the handoff from one component to the other is nearly seamless, Ziv says. “You might want results in near real time because you want to take action as quickly as possible on a call or you’ve identified a breach or a compliance issue. If the recording is conducted separately from the analytics, there’s a time lag between getting the recording and conducting the analytics that doesn’t occur when solutions are bundled,” he states.
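The near-real-time advantage Ziv describes can be illustrated with a small sketch: when transcription and analytics share one pipeline, each transcript chunk can be scanned for compliance terms as it arrives, rather than waiting for a recording handoff. The flagged terms below are hypothetical examples.

```python
# Sketch of streaming compliance scanning over live transcript chunks.
# Flagged terms are illustrative only.
COMPLIANCE_FLAGS = {"guarantee", "risk-free", "no fees ever"}


def scan_stream(chunks):
    """Yield (chunk_index, flagged_term) alerts as transcript chunks arrive,
    so action can be taken mid-call instead of after batch processing."""
    for i, chunk in enumerate(chunks):
        lowered = chunk.lower()
        for term in COMPLIANCE_FLAGS:
            if term in lowered:
                yield (i, term)


alerts = list(scan_stream([
    "thanks for calling, how can I help",
    "this investment is completely risk-free",
]))
print(alerts)  # [(1, 'risk-free')]
```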

That consideration becomes even more critical if a company is looking to provide real-time agent guidance or other recommendations during the course of the conversation.

Technology Advancements

The shift to bundled recording, transcription, and analytics solutions has been made possible by the evolution of these technologies, both individually and together. For starters, the technology’s accuracy has improved significantly as companies collect more call data from which to tune their offerings. More data, combined with machine learning and artificial intelligence, has increased the speed and accuracy of transcriptions and analysis.

Fox says that speech recognition technology has improved 30 percent to 50 percent in the past five years.

A Speechmatics industry report on voice technology trends notes that neural networks, a sharp increase in computing power, and cloud computing have greatly impacted accuracy, feature development, and language capabilities.

But even with dramatic improvements, accuracy is still not good enough for some to make the investment. Though noise cancellation has gotten much better, some engines still have a hard time isolating speech in busy environments, while others still struggle with different accents and dialects.

If, for example, a company finds what it thinks is an excellent transcription engine for English, it will quickly learn that it might do well with U.S. English but not as well with U.K. or Australian English, warns Ryan Steelberg, president of Veritone, an artificial intelligence technology provider. “By the time they deploy on a single point or source solution, they find they need something else for managing different engines. The training against these different cognitive models is a daunting task.”

Steelberg also notes that many companies are still working their way through their digital transformations and aren’t ready yet to take advantage of bundled recording, transcription, and analysis. Companies still have silos of structured and unstructured information throughout their organizations and need to tear down those silos before truly benefiting from bundled speech.

“Companies need to make sure that they have a good handle on their data,” Steelberg says.

Yet the number of companies using the technology is definitely on the upswing, he maintains. “I think it’s because as people have been experiencing and playing with different AI cognitive models, whether that’s natural language understanding, text-to-speech, or speech-to-text, over time, they’ve seen a steep function improvement in some cognitive service offering, which is a material step up in terms of performance and accuracy.”

Continued Growth Expected

The more companies become accustomed to the benefits they can receive from bundled recording, transcription, and analytics, the more they will use it, Ziv says.

Steelberg agrees, noting that thus far many companies have taken what he calls a “microstep”—experimenting with it in one area of the business before considering adoption across other parts of the enterprise.

“There are just more and more use cases,” Ziv says. “It’s just like the way the internet opened up whole new markets. By digitizing communications, there are so many things that can be done. The growth continues to be very strong in this space. As the technology becomes more powerful and cost-effective, it opens up opportunities for more users. The industry will continue to thrive and grow and become really mission-critical in terms of engagement analytics and customer engagement.”

As with other technologies, the cost of recording, transcription, and analytics continues to decline, making it more feasible for smaller businesses and smaller call centers, Ziv says.

Moving the technology to the cloud, a trend that is expected to continue, is also going to increase adoption, particularly as call centers no longer have the expense of installing and maintaining dedicated servers on premises.

“A year from now, the vast majority of our analytics sales will be in the cloud, which will give us nimbleness and flexibility to add users, to add functionality, to create custom models, to add more predictive modeling,” Ziv says. “The cloud [for this technology] will become the de facto standard.”

But the industry isn’t quite there yet.

“We’re still in the first inning of this ecosystem,” Steelberg says. “But what you’re going to start seeing is the emergence of next-gen solutions because people now have an integrated data lake framework and AI so they can start getting the most value out of their data.” 

Phillip Britt is a freelance writer based in the Chicago area. He can be reached at spenterprises1@comcast.net.

10 Tips to Maximize Analytics Insights

Despite recent and swift advances in the technology, analytics is still greatly underused by businesses today, according to LiveVox. A recent company study found that only 11 percent of businesses were using speech analytics to its full capability.

LiveVox offered the following 10 recommendations for businesses to get the largest benefits from speech analytics:

  1. Monitor and score every call to improve contact center key performance indicators.
  2. Set keywords and phrases to recognize customer sentiment. By understanding when a customer is pleased, a call center operator can seek to replicate those conditions.
  3. Create structured data points from speech. This information can be used to understand product or service issues and drive changes.
  4. Use speech analytics for training and quality management. The insights provided can identify good and poor agent performance, giving managers clues to where they need to provide further training to ensure consistent quality.
  5. Engage agents with data so they can see for themselves where they need to improve.
  6. Manage performance in real time. This is particularly important for key business requirements like legal statements, policy adherence, and on-brand communication standards. If things aren’t going well, you can get notified promptly and respond accordingly.
  7. Recognize sales leads. The data will help identify upsell and cross-sell opportunities. The faster these can be recognized, the more likely the agent can make those offers during the call.
  8. Reduce agent time for routine items. The data can automatically populate reports, freeing agents and managers from this tedious task.
  9. Manage compliance. By monitoring, analyzing, and scoring all call center interactions, companies can mitigate unexpected regulatory risks and ensure agents are staying in compliance. The artificial intelligence can be programmed to automatically redact any sensitive information.
  10. Unlock potential. Speech analytics helps companies understand and address customer and agent trends. —P.B.
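Two of the tips above, keyword-based sentiment flags (tip 2) and automatic redaction of sensitive data (tip 9), can be sketched with simple pattern matching. The keyword lists and the card-number pattern are hypothetical illustrations, not LiveVox's implementation.

```python
import re

# Illustrative keyword lists and redaction pattern; real deployments
# would use tuned vocabularies and more robust PII detection.
POSITIVE = {"thank you", "great", "perfect"}
NEGATIVE = {"frustrated", "cancel", "unacceptable"}
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def sentiment_flags(transcript: str) -> dict:
    """Count positive and negative keyword hits in a call transcript."""
    text = transcript.lower()
    return {
        "positive_hits": sum(kw in text for kw in POSITIVE),
        "negative_hits": sum(kw in text for kw in NEGATIVE),
    }


def redact(transcript: str) -> str:
    """Mask card-number-like digit runs before the transcript is stored."""
    return CARD_PATTERN.sub("[REDACTED]", transcript)


print(sentiment_flags("I'm frustrated and want to cancel"))
print(redact("my card is 4111 1111 1111 1111, thanks"))
```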

