Speech Analytics Captures Consumer Sentiment
Nothing reveals more about a business than the voices of its harshest critics and most ardent supporters—its customers. And there was a time when basic speech analytics was sufficient to help companies learn what their clients were thinking. The technology could analyze thousands, even millions, of customer interactions to unearth the vital intelligence needed to build effective cost containment and customer service strategies.
However, the days of basic speech analytics are long gone. That’s not to say speech analytics has been relegated to a bottom desk drawer somewhere or housed on some remote server that seldom gets used. On the contrary, the speech analytics market is as vibrant as ever, with research firm Ovum predicting it would nearly double, from $95 million in 2009 to $180 million by 2014. Thanks to changes in how vendors position the technology, it is gaining ground.
Innovations, which include text analytics integration, emotion detection, real-time capabilities, and an emphasis on actionable results, are driving adoption of speech-based solutions in the contact center and beyond.
Speaking in Text
Once marketed as a stand-alone solution, speech analytics is now more often than not packaged with text analytics to provide a 360-degree view of all the ways in which customers communicate with and about companies. Speech analytics takes care of phone conversations and other audio, while text analytics handles the written forms of communication, including text messages, email, chat sessions, blog posts, Web forums, review sites, and social media.
Naturally, companies need to extract actionable information and sentiment from the largely untapped world of social media.
“You need to be able to control all the mentions of your brand, and so companies really need to pay attention to what’s in the public domain and make sense of it,” comments Ed Shepherdson, managing director of customer interaction solutions at Coveo, an enterprise search vendor that blends speech and text data to formulate a single voice of the customer.
Such a blending, he says, makes it easier for companies to determine whether there is a correlation between what’s in the public domain and what they’re getting from customer surveys and from the call center.
This form of analytics, called multichannel analytics or analytics convergence, “is absolutely where the industry is going,” says Donna Fluss, president of DMG Consulting.
When it comes to the voice of the customer, “speech is just one of the channels involved,” Fluss says. “There’s written communications, and there’s spoken communications. And if you want to fully understand what the customer wants, you have to look at both.
“I expect the market to move more and more in the direction where one company offers both speech and text analytics in one solution set,” she adds. “It’s already happening.”
Blending the two is a fairly easy process, given that the foundation of speech analytics is the transcription of calls from spoken to written words and then the indexing of those words in relation to one another across the entire recording.
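That transcribe-then-index foundation can be sketched in a few lines. The sketch below is an illustrative assumption, not any vendor's actual implementation: transcripts are reduced to an inverted index mapping each word to the calls and positions where it occurs, which is what makes phrase and proximity searches across thousands of recordings fast.

```python
from collections import defaultdict

def build_index(transcripts):
    """Map each word to (call_id, position) pairs so that phrase and
    proximity queries can be answered across all recordings."""
    index = defaultdict(list)
    for call_id, text in transcripts.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].append((call_id, pos))
    return index

def near(index, w1, w2, window=5):
    """Return the call_ids where w1 and w2 occur within `window` words."""
    hits = set()
    for c1, p1 in index.get(w1, []):
        for c2, p2 in index.get(w2, []):
            if c1 == c2 and abs(p1 - p2) <= window:
                hits.add(c1)
    return hits
```

A query such as `near(index, "cancel", "account")` would then surface every call in which those words occur close together, regardless of which channel the transcript originally came from.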
Outside the Walls
With these new capabilities to pull voice and text together, “all content comes through a single search box, reducing the time needed to find information,” explains Kevin Calderwood, president of Vivisimo, a provider of enterprise search solutions that recently launched the Customer Experience Optimization (CXO) solution to blend information sources. “Now we can go outside a company’s firewalls to get information that is contained within the public domain, like the Web and social media.”
One of the first purely speech analytics vendors to embrace multichannel analytics was Autonomy etalk. Slightly more than a year ago, the company released Explore, which lets businesses connect to and understand almost any type of customer interaction, including audio recordings, Web site visits, chat threads, survey responses, CRM records, blog posts and responses, product reviews, email and documents, Twitter posts, social media status updates, wiki entries, videos, point-of-sale information, transaction records, news articles, and forum comments. Explore has more than 400 available connectors to internal, external, and public data sources.
According to Andrew Joiner, chief executive officer of Autonomy’s Promote Multichannel Technology Business Unit, the reason for offering such a solution is simple. “A lot of business today is being done in unstructured data,” he says. “I call on the phone, but I also send a fax or write an email.”
More than 90 percent of customers engage in multichannel behavior when dealing with companies, according to Autonomy. For instance, the same customer who called in a complaint to the contact center might have searched for products on the company’s site and commented about the company on Twitter.
Other traditional speech analytics vendors—including Nice Systems, Verint Systems, Nexidia, Utopy, and CallMiner—are also providing multichannel analytics as part of their solution sets, either on their own or through partnerships with traditional text analytics providers.
Verint, for example, partnered with Clarabridge to add Impact 360 Text Analytics to its Customer Interaction Analytics portfolio.
“We’ve definitely graduated our speech analytics to offer more customer interactions in one interface,” says Diego Lomanto, solutions marketing manager for speech analytics at Verint. “There are call recordings, but now with our new offering, there’s also email, social media, chats, notes, and surveys.”
Jeff Schleuter, vice president of marketing and business development at Nexidia, has seen interest building for text analytics and speech analytics together, but he’s not ready to call it a trend. “We’re seeing a push for text from customers. It’s definitely something that the industry is pursuing,” he says. “But it’s probably in an early stage, where more is being hyped than is in adoption.”
According to Lomanto, interest in multichannel analytics has been strongest in the hospitality, retail, and technology sectors, “because so much is written about them every day,” but the growing desire for such solutions has also come from the financial services, healthcare, and insurance industries.
In those industries, as in so many others, contact centers process thousands of calls, texts, email messages, and online chat sessions with customers daily. And while those interactions hold a wealth of valuable business insight, “the ability to put speech into context is critical,” Schleuter says.
Lomanto agrees. “Everyone wants information, but without context, it means nothing,” he says. “You want to be able to search for a word, but the real value is in the trending data.”
Therefore, vendors have been busy enhancing their speech analytics applications with sentiment analysis technologies that can identify more than just when and where a particular word or phrase appeared in the dialogue. That requires applications that can identify when customers have expressed an emotional response and can then connect those responses to a particular sentiment.
The technology needed to perform this kind of analysis is still fairly new and in some circles might still be called bleeding-edge, but it has evolved considerably in recent years. "There's a lot of attention and investment in emotion detection," Fluss observes. "It's all part of an evolving set of solutions."
A major step in that evolutionary process has been the melding of the two schools of thought on detecting emotions in speech. One school relied on acoustic qualities—such as tone, pitch, volume, speaking rate, inflection, and intensity—while the other looked at linguistic elements, such as word choice, pauses, stops, hesitations, laughter, and sighs, to determine the emotional state of a caller.
Advancements have taken the technology from basic acoustic-only or linguistic-only solutions to more advanced solutions that depend upon both models.
Utopy, for example, uses this two-dimensional approach to emotion detection to not only pick up variations in tone, pitch, etc., but also to correlate those variations with the linguistic content (including what the caller and agent said before, during, and after the emotional response occurred).
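A toy version of that two-dimensional scoring might look like the following. The feature names, weights, and word list here are invented for illustration and are not Utopy's actual model; the point is only how an acoustic signal (deviation from the caller's own vocal baseline) and a linguistic signal (emotionally loaded words) can be blended into a single estimate.

```python
# Hypothetical list of distress words for the linguistic dimension.
NEGATIVE_PHRASES = {"cancel", "unacceptable", "frustrated", "supervisor"}

def emotion_score(segment):
    """Blend acoustic cues (pitch/volume deviation from the caller's
    baseline) with linguistic cues (emotionally loaded words) into a
    single arousal estimate between 0 and 1."""
    # Acoustic dimension: how far pitch and volume stray from baseline.
    pitch_dev = abs(segment["pitch"] - segment["baseline_pitch"]) / segment["baseline_pitch"]
    volume_dev = abs(segment["volume"] - segment["baseline_volume"]) / segment["baseline_volume"]
    acoustic = min(1.0, (pitch_dev + volume_dev) / 2)
    # Linguistic dimension: fraction of words that signal distress.
    words = segment["text"].lower().split()
    linguistic = sum(w in NEGATIVE_PHRASES for w in words) / max(len(words), 1)
    # Weighted blend of the two models (weights are arbitrary here).
    return 0.6 * acoustic + 0.4 * linguistic
```

A real system would of course learn these weights from labeled calls rather than hard-code them, and would correlate the score with what was said before and after the flagged segment.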
Autonomy’s Intelligent Data Operating Layer (IDOL) derives meaning by determining patterns of information, dominant terms, and significant relationships among distant ideas using multitiered relevancy modeling.
Nexidia uses more of a lexicon approach, according to Schleuter, that looks at the words being used and when those words and phrases are employed with other words and phrases.
Coveo’s approach is to look at how the words are being spoken and used. “We look at whether the company is being spoken about in a negative or positive way by looking at all the words around the one that you’re looking for,” Shepherdson points out.
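The context-window idea Shepherdson describes can be sketched simply: score each brand mention by the sentiment-bearing words that appear near it. The word lists and window size below are illustrative assumptions, not Coveo's lexicon.

```python
# Hypothetical sentiment lexicons for illustration only.
POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"terrible", "broken", "cancel", "worst"}

def brand_sentiment(text, brand, window=3):
    """Score each mention of `brand` by counting positive and negative
    words within `window` words on either side of it."""
    words = text.lower().split()
    scores = []
    for i, w in enumerate(words):
        if w == brand:
            context = words[max(0, i - window): i + window + 1]
            scores.append(sum(c in POSITIVE for c in context)
                          - sum(c in NEGATIVE for c in context))
    return scores
```

Run over "i love acme phones but acme support is terrible", this returns one score per mention, capturing that the same sentence can speak about a company both positively and negatively.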
Another way in which speech analytics technologies are being applied to emotion detection involves talk-over analysis. That capability identifies where in a call the customer and agent are talking simultaneously—a common indicator of customer dissatisfaction and frustration. Talk-over analysis also can identify periods of silence during calls, which might indicate a gap in the agent’s knowledge about a particular subject.
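Given diarized utterance spans (who spoke, from when to when), both measures fall out of a single sweep over the call. This is a simplified sketch under the assumption that speaker turns are already timestamped; the 100 ms tick size is arbitrary.

```python
def talk_over_and_silence(segments, call_end):
    """Given (speaker, start, end) utterance spans in seconds, return the
    total seconds of talk-over (both parties speaking at once) and of
    silence (neither party speaking)."""
    overlap = silence = 0
    # Sweep the call in 100 ms ticks and check who is speaking at each one.
    for tick in range(int(call_end * 10)):
        t = tick / 10
        speakers = {s for s, start, end in segments if start <= t < end}
        if len(speakers) > 1:
            overlap += 1
        elif not speakers:
            silence += 1
    return overlap / 10, silence / 10
```

For a 10-second call where the agent speaks from 0–5 s and the caller from 4–8 s, this reports one second of talk-over and two seconds of trailing silence.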
I Want It Now!
Speech analytics is still far from perfect when it comes to detecting and responding to human emotion, and the technology is unlikely ever to match a human's abilities in that regard. One of the main reasons has been the very nature of speech analytics, which typically involves recorded conversations that are taken apart after the fact. Depending on the volume of data—where it sits, whether it's hosted or on premises, and other factors—it could take a few hours to a few days to index the audio.
But that is changing, as more and more vendors experiment with real-time analytics, looking at spoken interactions as they occur.
Michael Maoz, research vice president for customer strategies at Gartner, calls the move toward more real-time capabilities the technology’s natural progression.
“Real-time speech analytics is a technology that has been waiting for a market breakthrough,” he said in a recent report. “When fully integrated with customer-centric solutions, it will enable contact centers to realize their strategic business potential. Processes and measurement metrics will need to evolve for this to happen, but the value is clear: Uncovering customer intent and gaining insights during the actual interaction enable organizations to deliver exactly what customers need in real time. This is the key to securing the customer relationship, improving satisfaction and loyalty, and ultimately driving revenue growth.”
Vendors are taking a look at real-time capabilities “because their customers are asking them for this,” Fluss says.
The reasons are manifold, but in the end, it comes down to a single one: “You have to mine the data more quickly so you can do something with it,” Coveo’s Shepherdson says. “People need to react to changes in their businesses very quickly, and they need to react in near-real time, rather than waiting a few days or weeks to get the data.”
Today’s solutions pull together the data at much higher speeds and are far more dynamic, while the cycle times for processing the data are getting shorter and shorter.
“Once we have an initial index, we can update it very quickly,” Shepherdson points out. “The information is constantly being re-indexed as soon as something happens. You don’t have to wait for it to be turned over to the knowledge management solution or for the voice to be converted to text.
“You can set how often you want it to go in and pull data. It can be very close to real time,” he adds.
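The incremental re-indexing Shepherdson describes can be reduced to a simple idea: append newly transcribed words to a live index instead of rebuilding it. The function and calling pattern below are an assumed sketch, not Coveo's implementation.

```python
def update_index(index, call_id, words, start_pos=0):
    """Append newly transcribed words to a live word -> (call, position)
    index without rebuilding it, so fresh audio becomes searchable almost
    as soon as it is converted to text. Returns the next free position."""
    for offset, word in enumerate(words):
        index.setdefault(word.lower(), []).append((call_id, start_pos + offset))
    return start_pos + len(words)

# A polling loop would call update_index each time a new snippet arrives:
index = {}
pos = update_index(index, "call-42", ["my", "order", "never", "arrived"])
pos = update_index(index, "call-42", ["I", "want", "a", "refund"], pos)
```

Because each update touches only the new words, the interval between polls—not the size of the archive—sets how close to real time the results are.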
Coveo can supply this data in real time via dashboard widgets that open on users’ computer screens. IBM-Netezza, a data warehousing company, was one of the first companies to use those widgets. Within 30 days of the implementation, IBM-Netezza reduced the time needed to identify known problems by 67 percent and cut the number of duplicate bugs submitted to its development engineering department by 50 percent, which in turn led to a 67 percent increase in the number of customer bugs fixed.
“Coveo dashboards and the analytics they provide are allowing our managers and executives, all the way up to our CEO, to have dynamic views of our customer relationships,” Jim Coleman, principal support analyst at IBM-Netezza, wrote in an email.
He continued, “This is particularly important when one of those accounts becomes hot with an issue that requires immediate attention. Instead of spending hours manually compiling reports from data in multiple systems, we can provide them with immediate access to the information needed to make informed business decisions and respond to the customer in seconds.”
Guidance on the Fly
Nice Systems was a pioneer in real-time analytics when, in March 2010, it released Real-Time Guidance as part of its Nice SmartCenter suite of speech analytics offerings. Real-Time Guidance provides next-best-action recommendations to the agent in real time during a phone call or chat session with a customer. It does so by triggering and presenting context-sensitive recommendations and information to the agent according to preset business rules.
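Conceptually, rule-driven guidance of this kind comes down to matching trigger phrases in the live transcript and surfacing the associated recommendation. The rules and script names below are invented for illustration and do not reflect Nice's product.

```python
# Hypothetical rule table: trigger phrase -> recommendation shown to the agent.
RULES = {
    "cancel my account": "Offer the retention discount (script R-12).",
    "too expensive": "Mention the annual plan, which is 20% cheaper.",
}

def guidance(transcript_so_far):
    """Return the recommendations whose trigger phrases appear in the
    live transcript, in the spirit of rule-driven next-best-action prompts."""
    text = transcript_so_far.lower()
    return [advice for phrase, advice in RULES.items() if phrase in text]
```

In production the matching would run continuously against the streaming transcript, with rules authored by supervisors rather than hard-coded.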
Nexidia customers also can take advantage of some real-time capabilities contained within the company’s Enterprise Speech Intelligence suite. “To be able to analyze calls as they are happening and help agents improve performance and serve customers better while they are still on the phone” will be important moving forward, Schleuter says.
“For real time, the technology is there. It requires much deeper integration into the telephony environment and more tie-ins to the CRM system,” he continues. “We’re working on our first implementations now.”
One such implementation is at Page One Ventures, a supplier of management and technology solutions for social networking providers. Page One incorporated Nexidia's Enterprise Speech Intelligence suite into its operations in September; of particular interest is Nexidia's Classifier feature, which can detect language, gender, and other speaker characteristics in real time.
Callers who dial into Page One Ventures are required to record a short greeting before entering each session, and those greetings are screened to ensure that calls are routed to the appropriate agent. Prior to working with Nexidia, 83 agents manually screened the recorded greetings, but even working 24/7, they could cover only approximately 5 percent of the total volume.
Now Page One Ventures automatically reviews all caller greetings in real time and can flag relevant calls for further review. As a result, Page One Ventures has lowered agent costs by more than 60 percent while screening every greeting to ensure the best possible user experience.
“Nexidia’s technology helped us in several areas, including our quality assurance process,” Jeff Prete, general manager of Page One Ventures, said in a statement. “Now using Nexidia speech analytics, we are able to dramatically increase the volume of calls reviewed and conduct quality monitoring of our contact center with a much more rigorous set of data. This has resulted in an even better experience for all our customers.”
Schleuter says Page One Ventures is “doing something unique with how they use our technology to improve their operations,” but adds that implementations like this one are the future.
Fluss says speech analytics in general is in the early stages of its second generation, “where they are not just culling data but also making it more actionable. It’s about maximizing the results.”
“Speech analytics is definitely starting to find its rightful home,” Verint’s Lomanto adds.
News Editor Leonard Klie can be reached at firstname.lastname@example.org.