Real-Time Speech Transcription Spells Out Contact Center Excellence
The performance of cloud-delivered speech-to-text transcription is increasing at an astounding rate. As a result, many new speech-to-text applications are being created. In the last issue of Speech Technology, research firm MarketsandMarkets reported that it expects the speech-to-text API market to grow from its current value of $2.2 billion to $5.4 billion by 2026, a compound annual growth rate of 19.2 percent. Suddenly cloud-delivered speech-to-text is a huge trend.
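As a quick sanity check on the forecast quoted above, the compounding arithmetic can be sketched in a few lines of Python. The dollar figures and the 19.2 percent rate come from the forecast. The function names are illustrative only, and the five-year horizon in the last line is an assumption, since the forecast's starting year is not stated here.

```python
import math

def years_to_reach(start: float, end: float, annual_rate: float) -> float:
    """Years of compounding at annual_rate needed to grow start into end."""
    return math.log(end / start) / math.log(1.0 + annual_rate)

def project(start: float, annual_rate: float, years: float) -> float:
    """Value of start after compounding at annual_rate for the given years."""
    return start * (1.0 + annual_rate) ** years

# Figures quoted in the forecast, in billions of dollars.
horizon = years_to_reach(2.2, 5.4, 0.192)
print(f"Implied horizon: {horizon:.1f} years")

# Assumed five-year horizon (the starting year is not given in the text).
print(f"Value after five years: ${project(2.2, 0.192, 5):.2f} billion")
```

Under these assumptions the quoted numbers hang together: growing $2.2 billion at 19.2 percent a year takes a little over five years to reach $5.4 billion.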
In the past five years alone, the delay in receiving speech-to-text transcriptions from the cloud has dropped from 15 seconds or longer to near real time, even for many separate simultaneous conversations. This is exactly what is required for several contact center assistive functions.
With the shift to remote work for contact centers, several nice-to-have capabilities have morphed into critical can’t-live-without requirements. When everyone is under the same roof, training and then monitoring of employee progress is a matter of managing by walking around. If an agent has a question, a lead or supervisor is usually nearby to look at screens and listen to the call. However, when agents are remote, it’s a different story.
New agents frequently struggle to provide a quick answer because they don’t know how to access the information necessary to answer a question or how to proceed to the next step in a multistep process. And like all new employees, they don’t always realize what they don’t know, so they can spend time taking customers down a wrong path or they can provide incorrect information.
In the past, organizations striving for excellent customer experience addressed these complications through extended training. Currently a labor shortage exists in contact center roles, so the luxury of lengthy training is out of the question for most. Enter the can’t-live-without requirements to guide agents and immediately alert supervisors when a conversation with a customer requires additional experience. As with other contact center capabilities, vendors had been focusing on specific business priorities, but with the move to remote work, they’ve merged various additional features to remain competitive.
Initially several vendors focused on agent assist, which provides conversation guidance such as “show empathy and ownership” based on the caller’s and agent’s wording and sentiment, and/or links to appropriate documentation, such as “Advanced troubleshooting for the Widget Model 1431.” This capability also extends into using AI to pinpoint events that agents should be aware of, such as a caller voicing a complaint. In regulated industries, grievances must be identified as such and then acted on. Even experienced representatives, intent on helping the caller as quickly as possible, can miss that a call was truly a grievance.
Other vendors started with escalation alerts for assistance from supervisors. The most frequently demonstrated use case is where a high-value client threatens to pull their business during a call. In some organizations, no matter how experienced the agent is, management wants to be actively involved on those calls.
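To make the two starting points concrete, here is a minimal sketch of how agent prompts and supervisor alerts could be driven from a live transcript. It is not any vendor's implementation. The phrase lists, the `Segment` type, and the `alerts_for` function are hypothetical, and simple phrase matching stands in for the AI models a real product would more likely use.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical trigger phrases, chosen only for illustration.
COMPLAINT_PHRASES = ("file a complaint", "formal complaint", "this is unacceptable")
CHURN_PHRASES = ("cancel my account", "take my business elsewhere", "switch providers")

@dataclass
class Segment:
    """One piece of real-time transcript attributed to a speaker."""
    speaker: str  # "caller" or "agent"
    text: str

def alerts_for(segment: Segment, high_value_client: bool) -> List[str]:
    """Return alerts raised by a single transcript segment."""
    if segment.speaker != "caller":
        return []  # only the caller's words trigger alerts in this sketch
    text = segment.text.lower()
    alerts = []
    if any(phrase in text for phrase in COMPLAINT_PHRASES):
        alerts.append("agent: possible complaint, follow grievance procedure")
    if high_value_client and any(phrase in text for phrase in CHURN_PHRASES):
        alerts.append("supervisor: high-value client threatening to leave")
    return alerts

# Example: a high-value caller threatens to pull their business.
print(alerts_for(Segment("caller", "I will take my business elsewhere."), True))
```

The value of near-real-time transcription shows up here: a check like this is useful only if each segment arrives while the call is still in progress.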
No matter where vendors started from, all the major contact center players are working on or are already offering merged capabilities.
Benefits can extend past the training and managing of new agents, a function that clearly improves the customer experience. With at least one vendor, value is created over a short time via post-call speech analytics, which then drives the real-time suggestions for all agents. The results are lower call handling times, higher sales conversions, and, again, improved customer experiences—even those provided by experienced agents. Furthermore, this assistance improves the employee experience by providing immediate and recurring coaching. Imagine the difference between having a spelling and grammar check in your word processor versus your manager acting as an editor after the fact, correcting all your mistakes. The former might be a minor annoyance; the latter, on the other hand, can be quite a deflating experience.
Additional capabilities coming from newer players in the real-time speech-to-text transcription space will benefit contact centers but also likely extend into other industries. Automating tasks such as data entry and initiating the next action currently benefits contact center agents but clearly will also satisfy the requirements of use cases in the healthcare, insurance, and financial industries as well as many others.
A non-comprehensive list of major vendors with offerings in the contact center space using real-time speech-to-text transcription includes CallMiner, Cogito, Dialpad, NICE, and Verint.
Because this area of speech technology is growing so fast, we will soon follow with additional innovative examples and take a peek into how it all works and why speech-to-text transcription and its resulting capabilities are rapidly evolving.
Kevin Brown is a customer experience architect with more than 25 years of experience designing and delivering speech-enabled solutions. He can be reached at email@example.com.