Cost Justifying a Great VUI
With the cost of the Voice User Interface (VUI) accounting for half the total cost of ownership for a speech project, it is critical that enterprises evaluating speech better understand the value that a high-quality VUI brings. One key hurdle to evaluating investment in the VUI is that the criteria are subjective and do not lend themselves well to financial analysis. It's difficult to say how much the creation of a persona, or of a user-friendly application, is really worth. A simple but powerful way to overcome this hurdle is to integrate the concept of Customer Life Time Value (LTV) into a traditional Return on Investment (ROI) analysis.

Review of a Speech Project ROI

In order to demonstrate how the costs associated with a VUI design can be quantified, let's review a speech application ROI calculation. The simplest and most frequently used ROI analysis is what we call the back-of-napkin approach. A more rigorous analysis would factor in drivers such as time to market, financing charges, telecom contract rates, transaction type and metrics, costs of call blocking and abandoned calls, Erlang calculations for system sizing, revenue capture opportunities, specific wage rates and a probable estimate of overhead reduction. However, the back-of-napkin approach is adequate for our purposes.

Scenario 1 - Manual Call Costing
  Number of Calls                         15,000,000
  Average Length (min)                          1.50
  Cost per Call Minute (fully loaded)          $1.75
  Cost per Call                                $2.63
  Total Manual Cost                      $39,375,000

Scenario 2 - Speech Call Costing
  Cost of Speech Call Min                      $0.20
  Cost of Speech Call                          $0.30
  Percent Handled by Speech                      20%
  Cost of Manual Calls with Speech       $31,500,000
  Cost of Speech Enabled Calls              $900,000
  Total Speech Cost                      $32,400,000

ROI Calculation (Comparison of Scenario 2 with 1)
  Project Life (months)                           36
  Benefit                                 $6,975,000
  Pay Back in Percent                           675%
  Pay Back in Months                             4.6

The calculation of ROI is straightforward. Estimate the total manual handling cost ($39.4 mil) to handle the project's call volume (15 mil) in Scenario 1.
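The scenario tables reduce to a handful of multiplications. As a minimal sketch in Python, the arithmetic might look like the following. The function and variable names are ours, and the formulas for the two payback figures are inferred from the table values (they match if the $900,000 speech cost is treated as the investment), so treat them as an assumption rather than a definition.

```python
def back_of_napkin_roi(
    calls=15_000_000,      # number of calls over the project life
    avg_minutes=1.50,      # average call length (min)
    agent_rate=1.75,       # fully loaded cost per live-agent minute ($)
    speech_rate=0.20,      # cost per speech minute ($)
    speech_share=0.20,     # fraction of calls handled by speech
    project_months=36,     # project life
):
    """Reproduce the Scenario 1 / Scenario 2 comparison."""
    # Scenario 1: every call is handled by a live agent.
    cost_per_manual_call = avg_minutes * agent_rate
    total_manual = calls * cost_per_manual_call

    # Scenario 2: a share of calls is automated, the rest stay manual.
    cost_per_speech_call = avg_minutes * speech_rate
    speech_cost = calls * speech_share * cost_per_speech_call
    remaining_manual = calls * (1 - speech_share) * cost_per_manual_call
    total_speech = speech_cost + remaining_manual

    benefit = total_manual - total_speech

    # Assumption: the speech cost is the "investment" in both payback figures.
    payback_percent = (benefit - speech_cost) / speech_cost * 100
    payback_months = speech_cost / (benefit / project_months)

    return {
        "total_manual": total_manual,
        "speech_cost": speech_cost,
        "remaining_manual": remaining_manual,
        "total_speech": total_speech,
        "benefit": benefit,
        "payback_percent": payback_percent,
        "payback_months": payback_months,
    }


if __name__ == "__main__":
    for name, value in back_of_napkin_roi().items():
        print(f"{name:>18}: {value:,.2f}")
```

Calling the function with a different `speech_share` or `speech_rate` shows how the benefit and payback period move with the automation rate.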
Calculate the cost ($900 k) of the portion to be automated with speech (20%) plus the cost of the remaining manual portion ($31.5 mil) in Scenario 2. The benefit is the difference between the two scenarios ($6.975 mil). The ROI can be expressed as a percentage of investment (675%), or as the number of months of savings necessary to equal the investment (4.6 months). For speech, the results are almost always less than 9 months, and typically hover in the low single digits. It's that simple. Or is it?

Call Completion Drives ROI

The back-of-napkin method is a useful way to demonstrate the powerful economics of speech, but it obscures a very important point in determining the viability of the speech project: how does the speech application achieve a 20% automation rate? How many calls were routed into the system in order to achieve this figure? What was the completion rate? Conversely, what percentage of calls were unsuccessful, and how were they handled? What impression did the unsuccessful call leave on the caller? Finally, what is the cost to the company of those unsuccessful calls? What is the penalty for failure?

To better understand call completion rates and call outcomes, let's create an example of a taxi company that automates its pick-up transactions. Due to the difficulty of automating addresses, the application may have a 60% completion rate; that is, 40% of all calls do not result in a customer inputting the information the first time and receiving confirmation of a pick-up. On the one hand, the cost per minute is very low, probably less than 10% of the cost of a live agent, and the system was very successful for the 60% of transactions that completed. However, what happened to the other 40% of the transactions? And what was the cost to the taxi company? Perhaps some frustrated callers zeroed out and waited for a dispatcher. This adds live agent cost to the original transaction.
Perhaps some called the application again successfully, but this doubled the time, and hence the cost. And perhaps some simply gave up, in which case the taxi company probably lost the transaction revenue. However, all of these outcomes pale in comparison to the possibility that the fed-up customer may simply switch brands. Permanently. It would not take more than a few road warriors, each spending $500 a year on the taxi company's services, switching brands before the ROI of the project was jeopardized. To understand how an application may fail, a speech application scorecard tracks the possible outcomes for a call and identifies the cost of each.

Speech Application Score Card

Telephone calls can be broken down into a set of events. Each of these events has an impact on the caller. The scorecard below lists the impact that speech has on normal live agent call events, and correlates them to specific measurable benefits (+) or liabilities (-). For instance, a speech application will help a call center eliminate abandoned calls, which has a positive impact on every measure of ROI. Traditional ROI methodologies focused on the benefit of successful transactions.

Application Score Card: ROI Benefits (+) and Liabilities (-)
Call Events / Call Length / Telecom Cost / Transaction Cost / Transaction Revenue / Life Time Value

Success
  Eliminate Call Blocking        +++
  Eliminate Hold Time            ++++
  Eliminate Abandoned Calls      ++
  Successful Transactions        ++++
Failure
  Unsuccessful Transactions      -----
  Agent Fallback                 --- -
  Customer Call Back             ----
  Lost Transaction               --

However, failed transactions also carry a cost. The penalty for failure can take the form of direct costs associated with additional processing, agent time and longer call lengths. But every customer interaction also contributes to the image of the company. Speech applications span customer service and marketing, and they influence the Life Time Value of the customer.
LTV is a measure of a complex set of business drivers including product, price, promotion, place and intangibles such as image and servicing.

Life Time Value (LTV)

Life Time Value is used to determine how much companies should invest in soliciting a new customer or maintaining an existing relationship. It is calculated as the Net Present Value (NPV) of the profits generated by an individual customer over his purchasing lifetime. For instance, if the road warrior above generates $150 in profit a year from his $500 in expenditures over a four-year purchasing lifetime, then the Life Time Value is the Net Present Value of $600. For simplicity's sake, we will use $500 as our road warrior's LTV.

Applied to speech application performance, customer LTV represents the stakes associated with every transaction. It is the potential lost profit that might result from a disgruntled caller switching his brand loyalties because of a bad self-service experience or, alternatively, the increased loyalty that a good transaction experience might inspire. If one in 100 customers enduring an unsuccessful speech-automated transaction decides to take his business elsewhere, then the true cost to the company is $5 per failed transaction. Compare this with the roughly two-dollar benefit of automating a live agent call, and it is clear that LTV becomes a critical driver in the ROI equation.

Putting the Pieces Together: The True ROI of Speech

Returning to the back-of-napkin example, it is now clear that a comprehensive ROI analysis needs a balance-sheet approach, in which the benefits of successful calls are offset by the penalties for unsuccessful calls. In that example, in order to successfully automate 20% of the call center's calls, it might be necessary to route 24% of the calls into the application. Four percent would fail and be handled manually or lost.
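The LTV arithmetic from the Life Time Value discussion above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names are hypothetical, profits are assumed to arrive at the end of each year, and the 8% discount rate is a placeholder chosen only to show the effect of discounting. It does not come from the analysis, which simply rounds the LTV to $500.

```python
def net_present_value(annual_profit, years, discount_rate):
    """NPV of a level annual profit, received at the end of each year."""
    return sum(
        annual_profit / (1 + discount_rate) ** year
        for year in range(1, years + 1)
    )


def liability_per_failed_call(ltv, defection_rate):
    """Expected LTV lost each time a self-service transaction fails."""
    return ltv * defection_rate


# Road warrior: $150 profit a year over a four-year purchasing lifetime.
undiscounted = net_present_value(150, 4, 0.0)    # the $600 before discounting
discounted = net_present_value(150, 4, 0.08)     # 8% is an assumed rate

# One in 100 failed callers defects, with the rounded $500 LTV at stake.
liability = liability_per_failed_call(500, 0.01)

# Per-call saving from automation, using the scenario table rates.
saving_per_call = 1.50 * 1.75 - 1.50 * 0.20

print(f"Undiscounted profit:        ${undiscounted:,.2f}")
print(f"NPV at assumed 8%:          ${discounted:,.2f}")
print(f"Liability per failed call:  ${liability:,.2f}")
print(f"Saving per automated call:  ${saving_per_call:,.2f}")
```

The comparison of the last two lines is the point: under these assumptions a single failed call puts more at risk than a successful automated call saves.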
From the perspective of application performance, roughly 15% of the calls routed to the application would be lost, routed to an operator, or require some other intervention in order to be completed, and the completion rate would be roughly 85%. Now if each of those calls represents an LTV of $100, and one in 100 of those callers decides to change his buying habits because of an unhappy speech recognition experience, the magnitude of the opportunity is apparent.

Penalty for Failure
  LTV                                        $100.00
  Liability per Unsuccessful Call              $1.00
  Completion Rate                                85%
  Number of Failed Calls                     450,000
  Cost of Failed Calls                      $450,000

Once the penalty for failed calls is factored into the ROI analysis, the adjusted TCO increases from $900,000 to $1.35 mil, or by 50%.

For enterprises evaluating VUI design services, the question should be: how much can various levels of investment increase my completion rate and the LTV of my customers? LTV is the basis on which to financially justify investment in a better VUI that improves completion rate and customer satisfaction. In fact, while this analysis has focused on eliminating the downside, it is equally true that a good VUI can increase customer loyalty, and therefore LTV. That is, the $100 LTV at stake for poorly served customers could be increased for callers impressed with the company's application. A good application can extend the customer's lifetime, increase his purchases, or both.

High Quality Speech Apps Increase Customer Life Time Value

Traditional cost justification for speech focused on the direct benefits of automating manual calls. These included lower transaction costs, reduced abandonment rates, capture of blocked calls, around-the-clock service and faster transactions. If the analysis is extended to include LTV, then the cost of low completion rates from a poorly designed and maintained application is recognized, and enterprises can use ROI to determine how much they will spend on improving the interface.
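As a rough sketch in Python, the penalty for failure can be expressed as a function of completion rate, which is what makes it usable for budgeting interface work. The function name is hypothetical, the failed-call count follows the Penalty for Failure table (the failure rate applied to the 3,000,000 speech-handled calls), and the 90% and 95% completion rates are illustrative scenarios, not measured results.

```python
def failure_penalty(speech_calls, completion_rate, ltv, defection_rate):
    """Return (failed_calls, cost_of_failed_calls) for a speech application."""
    failed_calls = speech_calls * (1 - completion_rate)
    liability_per_failed_call = ltv * defection_rate
    return failed_calls, failed_calls * liability_per_failed_call


SPEECH_CALLS = 3_000_000   # 20% of the 15,000,000 calls
SPEECH_COST = 900_000      # cost of speech-enabled calls ($)
LTV = 100.0                # Life Time Value at stake per caller ($)
DEFECTION_RATE = 0.01      # one in 100 failed callers switches brands

# 85% is the figure used in the table; 90% and 95% are hypothetical targets.
for completion_rate in (0.85, 0.90, 0.95):
    failed, penalty = failure_penalty(
        SPEECH_CALLS, completion_rate, LTV, DEFECTION_RATE
    )
    adjusted_tco = SPEECH_COST + penalty
    print(
        f"completion {completion_rate:.0%}: "
        f"failed calls {failed:,.0f}, "
        f"penalty ${penalty:,.0f}, "
        f"adjusted TCO ${adjusted_tco:,.0f}"
    )
```

At 85% this reproduces the $450,000 penalty and the $1.35 mil adjusted TCO. Under these assumptions, each five-point gain in completion rate removes $150,000 of penalty, which gives a ceiling for what the corresponding design and tuning work is worth.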
LTV closes the gap between straightforward ROI analysis, customer value and the caller's experience. This creates a model for justifying additional resources to be spent on application design, prototyping, tuning and ongoing maintenance. It also provides a benchmark for the automation of more difficult transactions, which are becoming more prevalent as enterprises aggressively deploy speech across their businesses.

John Moffly is an independent consultant to businesses with an interest in speech recognition. He can be reached at firstname.lastname@example.org.