Donna Fluss, Principal, DMG Consulting

Founded in 2001, DMG Consulting LLC delivers customer-focused business strategy, operations and technology services for Global 2000 and emerging companies such as Nortel, RealNetworks, J. Jill Group, Stride Rite, Sub-Zero/Wolf, MCI, HBCS and Internet Order. DMG Consulting is a strategic advisor to companies large and small. Its mission is to leverage technology, process and people to optimize operational efficiency, sales and profits for its clients.

Speech Technology Magazine sat down with Donna Fluss to discuss contact center speech analytics and the benefits of its use.

Q. What is contact center speech analytics?

A. Speech analytics, also known as audio mining, is an emerging application that is capturing the attention of enterprises and their contact centers because of its ability to structure conversations and find hidden insights, implicit needs and wants, and the root causes of issues embedded in customer conversations. Today, speech analytics applications capture customer conversations and transform them into metadata that can be searched. The structured conversations are then analyzed using a variety of techniques, including keyword, phrase and concept search. Some speech analytics applications are able to identify concepts and trends that end users didn't even know existed. When this information is analyzed, it yields a detailed accounting of the reasons why customers call. This enables contact center managers and executives throughout the enterprise to address the issues that generate call volume and to identify competitive challenges and new revenue opportunities.

Q. How does speech analytics work?

A. There are two primary approaches to recognizing speech: large vocabulary continuous speech recognition (LVCSR) and phonetic-based search.

LVCSR engines do a speech-to-text conversion of audio files. The text file is then searched for target words and phrases.
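The search step of the LVCSR approach can be illustrated with a minimal Python sketch. It assumes the speech-to-text conversion has already happened; the transcript, the target phrases and the function name `find_phrases` are hypothetical and do not come from any vendor's product.

```python
import re

def find_phrases(transcript, phrases):
    """Return {phrase: [character offsets]} for each target phrase found."""
    hits = {}
    for phrase in phrases:
        # Word boundaries keep "bill" from matching inside "billing".
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        offsets = [m.start() for m in pattern.finditer(transcript)]
        if offsets:
            hits[phrase] = offsets
    return hits

# Hypothetical transcript standing in for the output of an LVCSR engine.
transcript = (
    "I want to cancel my account because the bill was wrong. "
    "Please cancel my account today."
)
result = find_phrases(transcript, ["cancel my account", "refund"])
print(result)
```

Because the full transcript is retained, the same text can be searched again later for phrases nobody thought of when the first query was written.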

Phonetic-based applications separate conversations into phonemes, the smallest components of spoken language. Phonetic-based applications then attempt to find segments within the long file of phonemes that match a phonetic index file representation of target words and phrases.
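The matching idea behind the phonetic approach can be sketched in a few lines of Python. This is a simplification under stated assumptions: the phoneme symbols are hypothetical ARPAbet-style labels, the function name `find_phoneme_matches` is invented for illustration, and the sketch uses exact matching, whereas a production engine would presumably score approximate matches to tolerate recognition errors.

```python
def find_phoneme_matches(stream, target):
    """Return start indexes where the target phoneme sequence occurs in the stream."""
    if not target:
        return []
    n = len(target)
    return [i for i in range(len(stream) - n + 1) if stream[i:i + n] == target]

# Hypothetical phoneme stream for the utterance "please cancel my order".
stream = ["P", "L", "IY", "Z", "K", "AE", "N", "S", "AH", "L",
          "M", "AY", "AO", "R", "D", "ER"]
# Hypothetical phonetic index entry for the target word "cancel".
target = ["K", "AE", "N", "S", "AH", "L"]

matches = find_phoneme_matches(stream, target)
print(matches)
```

The sketch only finds sequences it is handed as targets, which mirrors the limitation of phonetic search: the user has to know what to look for.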

Q. What are the differences between LVCSR and phonetic-based speech recognition?

A. At a high level, there are substantial differences between the two primary speech analytics approaches and what they are able to accomplish. The most fundamental difference is that LVCSR engines support discovery: because they transcribe the spoken words, users can go back and find words, phrases and concepts that they didn't think to ask about when defining their searches. When using phonetic-based applications, users have to know what they are looking for. Phonetic applications are not able to discover anything that they weren't asked to look for when the data was structured, because they do not structure non-targeted data.

LVCSR solutions generally require significantly more processing power (although there have been some recent innovations in this area) than phonetic solutions.

As these applications are so new, it's not yet known which approach and specific solution will be most effective in a given situation. However, we do know that language models perform best when they are optimized for a specific environment and task. Over time, the most practical approach to various challenges will become evident, but DMG Consulting believes that a winning application will include both an LVCSR engine and phonetics, as each of these methods has already proven to be effective in different situations.

Q. What is the accuracy rate of speech analytics applications?

A. Defining accuracy is not as simple as it sounds. Accuracy has to be addressed in three different areas:

  1. The speech engine - For the speech engine, accuracy is defined as the percentage of time that the engine correctly identifies a target term or event from the source audio file. Accuracy needs to address both false positives and false negatives.
  2. Query efficiency - This measures how accurately the query finds the target events.
  3. Data aggregation and reporting - The third place where accuracy can be lost is in the final step, aggregating and reporting. With constantly shifting business priorities and changing models, the system has to make it easy for the results to be properly aggregated and promptly shared with the right decision makers in a format that is action-oriented.
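One common way to account for both false positives and false negatives at the engine level is to report precision and recall. The Python sketch below is an illustration only: these two metrics are a standard choice, not necessarily the calculation any vendor uses, and the counts are hypothetical.

```python
def detection_accuracy(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a term-detection run."""
    found = true_positives + false_positives      # everything the engine flagged
    actual = true_positives + false_negatives     # everything actually in the audio
    precision = true_positives / found if found else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical counts: the engine flagged 100 occurrences of a term, 80 of them
# correct, and missed 20 occurrences that were really in the audio.
precision, recall = detection_accuracy(
    true_positives=80, false_positives=20, false_negatives=20
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision falls as false positives rise and recall falls as false negatives rise, so quoting a single "accuracy" figure without both can hide one kind of error.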

Today's speech analytics applications all provide directionally accurate information and will surface and quantify appropriate contact center trends and issues. There is, however, no question that the accuracy of the findings depends upon many factors, including how accuracy is calculated, audio quality, the underlying engine, the search criteria, how well the application is set up, tuned and maintained, and the users' knowledge and experience with the applications. Vendors are investing significantly to improve the accuracy of their results.

Q. What are the primary uses of contact center speech analytics?

A. As it stands today, the contact center is the primary user and beneficiary of speech analytics applications, as it is the source for the audio files. The current primary uses of speech analytics are: 

  1. Identifying call trends and reasons customers call/root cause analysis
  2. Increasing first call resolution/reducing call backs
  3. Reducing the volume of complaint calls (cost avoidance)
  4. Improving customer experience (QA)/improving customer loyalty
  5. Improving script adherence

These applications can also be used for activities that extend beyond the boundaries of contact centers; they can identify new product ideas, determine which marketing campaigns are most successful and why, and drive revenue. The following three uses are being discussed, but have not yet been rolled out in many organizations:

  1. Increasing revenue
  2. Identifying new revenue opportunities
  3. Improving effectiveness of marketing campaigns

Q. What are the benefits of contact center speech analytics?

A. Speech analytics is very compelling for enterprises because it reduces operating expenses, improves quality, enhances the customer experience, increases revenue and reduces corporate liability. Today, speech analytics applications provide information that is directionally accurate. They can spot trends, identify the underlying reasons for customer calls (root cause analysis), improve the effectiveness of your quality assurance program, reduce fraud, determine if your agents are adhering to their scripts and much more. Speech analytics functions as an early warning system, providing the tools to rapidly and unambiguously identify trends and issues so that the enterprise and individual managers can respond much sooner than was possible in the past.

There is no question that speech analytics can and will (if used properly) reduce contact center operating expenses, empowering contact center management to reduce call volume and increase first call resolution through root cause analysis. However, while the financial benefits of root cause analysis are great (keeping in mind that a typical customer service call costs approximately $5.00 to $10.00), the upside potential of using speech analytics for sales and marketing is even higher. Due to organizational politics and evolving product maturity, it is expected to take a couple of years for other operating areas to benefit from the structured output and business intelligence provided by speech analytics.

Q. What is the ROI from contact center speech analytics?

A. Despite their shortcomings, speech analytics solutions are viable, have been proven in the field and have a quick and quantifiable ROI. The payback from the speech analytics applications successfully implemented during the past year is approximately nine to 18 months, although it's taken longer in some companies. Achieving these benefits takes a significant investment of time and resources because best practices are just emerging and are available only through expensive professional services offerings from vendors. As time goes on and more speech analytics best practices and experts become available, it's expected that the payback will be reduced to less than nine months. Even better, once you understand how to use them, their contribution to your organization will accelerate. Once sales and marketing begin to use these applications and incremental revenue is included in ROI calculations, the payback will be even more rapid. 

Companies investing in speech analytics should count on a six to 12 month payback, with an average of nine months, but must appreciate that the only way to realize this payback is to invest their own resources in the implementation and maintenance of the application.
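The payback arithmetic can be sketched as follows. Only the cost per call comes from this interview (the midpoint of the $5.00 to $10.00 range cited earlier); the investment and the number of calls avoided each month are hypothetical figures chosen for illustration, and the function name `payback_months` is invented here.

```python
def payback_months(investment, calls_avoided_per_month, cost_per_call):
    """Months needed for avoided call costs to cover the up-front investment."""
    monthly_savings = calls_avoided_per_month * cost_per_call
    if monthly_savings <= 0:
        raise ValueError("monthly savings must be positive")
    return investment / monthly_savings

months = payback_months(
    investment=300_000,             # hypothetical total cost of the deployment
    calls_avoided_per_month=5_000,  # hypothetical result of root cause analysis
    cost_per_call=7.50,             # midpoint of the $5.00 to $10.00 range
)
print(f"Payback: {months:.1f} months")
```

The sketch counts cost avoidance only. Adding incremental revenue from sales and marketing to the monthly savings would shorten the result.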

Q. Should companies wait or invest now?

A. DMG Consulting suggests that you invest in this technology and not wait, as long as you are willing to allocate the resources necessary to implement and utilize the system properly by establishing internal best practices that allow you to use the system output on a timely basis. 

Despite their limitations, the current batch of speech analytics solutions can add substantial value to your organization. Although these solutions are far from perfect and are certain to improve immensely during the next five years, the information they are currently capable of identifying - when the applications are implemented and managed properly - will have a substantial positive impact on the way you conduct business. For now, it's fine for enterprises to get an enhanced "big picture" from these applications, which are already identifying so much data that contact centers historically didn't know or couldn't access.

Q. Who are the companies providing contact center speech analytics solutions?

A. There are five primary categories of vendors selling speech analytics products to the contact center market:

  1. QM/Recording vendors: Envision, etalk, Mercom, NICE, Verint, Voice Print, VoiceLog and Witness
  2. Stand-alone contact center speech analytics vendors: CallMiner, Nexidia and Utopy
  3. Contact center infrastructure vendors: SER
  4. Hardware-based vendor: Natural Speech Communications
  5. Other existing and emerging vendors: Aurix, ISense, Sonum Technologies and Sivox

Q. Does speech analytics work in real-time?

A. Today, most speech analytics applications analyze recorded transactions. In the future, speech analytics will work in real time, while the customer is on the line. This will enable contact centers to proactively take measures to provide customers with an outstanding customer experience that builds loyalty and drives revenue.

Q. Where can I obtain additional information about contact center speech analytics?

A. The 2006 Speech Analytics Market Report, published by DMG Consulting LLC, is the definitive guide to the emerging contact center speech analytics market. The report gives end users the product, market and vendor information they need to make an optimal technology investment and select the right solution and partner at the right price. The report provides in-depth comparisons and analysis of speech analytics functionality, technology, accuracy, best practices, trends, market projections and pricing. It examines the market and explains why speech analytics is the "next big thing" for contact centers.

