• September 1, 2003
  • Q & A

Reza Nabavi, Senior Market Development Manager, Sun Microsystems

What is Sun doing to help customers make the best of their speech technology investment? What are Sun's plans for optimizing investments for their customers in the future?

Reza Nabavi Surveys have shown that in any IT investment, and this applies to adopting a speech technology solution as well, a significant portion of the costs (70-80 percent), and hence the greatest initial impact on the bottom line, comes from acquiring the technology. But ultimately, it is not just about the technology, but about how it is used to run the business more efficiently. Once a customer has gone through the ROI valuation and analysis and has determined that a speech solution is the way to address its business case, the real work begins, because executing on the IT solution that maximizes your ROI is hard. Balancing traditional PIN-based and/or voice-print authentication, security and privacy can be daunting. Understanding what the "killer apps" are and which application brings the most tangible and intangible value/ROI is key.

There are many questions that need to be answered. Is the technology available in the language needed? Can and should multi-modality be implemented? What are some of the proven efficiency and cost-saving methods available through existing packaged applications? How can we enable customer choice and competitive advantage via development of brands and personas? How about integration and scalability, given complexity and quality-of-service needs?

That is where Sun comes in. Sun's efforts in helping its customers and the proliferation of speech-activated solutions are basically three-fold:

1. Carrier Voice Services: Sun has created an architectural model that helps integrate voice/speech-activated solutions and basically makes them an extension of the existing web infrastructure. Targeted at the carrier community, which focuses on delivering enterprise managed services, the iForce Carrier Voice Services (CVS) Solution puts forth an architectural blueprint that uses IP as its backbone. Contrary to today's IVR solutions, which are essentially what I call "Black Box" solutions (an entire proprietary offering that houses a tightly coupled IVR application, written to a closed API, which runs on a proprietary hardware and telephony solution), CVS is based on the concept of distributed functionality, not distributed objects. Key functions of a speech-based solution deployment (application platform, OAM&P, speech server, signal processing, and middleware) are clustered together. Communications between these key functions take place via industry-standard protocols (SIP, MRCP, RTSP), all over an IP backbone. Data and applications are also represented and rendered via standard languages (CCXML, SRGS, VoiceXML, HTML). The CVS model allows for plug-and-play of components within the entire solution without impact on any functional aspect of the speech solution.

2. Partnership is the business model: Over the past couple of years, Sun has built successful partnerships with an array of key players in the speech technology market. These include players such as Nuance, SandCherry, BeVocal and Comverse, all of whom have proven, validated and tested their applications and speech solutions on Sun Solaris SPARC servers. These validated solutions and partnerships have essentially reaffirmed Sun's position as a platform of choice for deploying voice-activated applications and services. The benefits to customers of speech technology are clear: value through simplification, integration and predictability, taking the burden off customers and/or integrators.

3. Sun's Project Orion: We all know that today's voice-web solutions are complicated and involve many interconnecting dependencies on enterprise infrastructure software, potentially from different vendors as well as from the same vendor. A straightforward upgrade to new enterprise infrastructure software can take hours of planning to ensure version compatibility, not only between enterprise infrastructure software components but also with the VoiceXML interpreters and speech engines that interface with them. Project Orion is about integrating all the enterprise infrastructure software in the software factory so that when it is delivered to a customer, it is production ready. The goal is to free the customer from concerns about release version misalignment and compatibility issues in enterprise infrastructure software. If it's in the Software System, it has been designed to work together and to operate on the same common Solaris SPARC, Solaris x86 and Sun Linux platforms. Doing software factory integration means greater predictability and increased confidence in enterprise infrastructure software for the customer, allowing redirection of existing resources to other revenue-supporting business activities.

ROI may take a backseat in the future, but right now it's driving the development direction of the industry. What is the Sun perspective on this?

RN I couldn't agree with this more. Chances are that ROI will again take more of a back seat in the future, when we get into more prosperous economic times during which there is considerably less scrutiny of IT expenditures. But in my view, that is a distant future. Today, it is vital to have a healthy ROI analysis for any R&D or solution deployment, and that includes speech and voice-enabled solution deployments. Let me elaborate on the reasons why.

External factors aside, internal reasons are enough to substantiate the need for a healthy ROI analysis. Research has shown that almost two-thirds of all IT projects completely fail, resulting in gross waste of human and capital resources (around $80B-$145B/year), and that only 9 percent of the technology investments that are completed come in within budget and on time. Research has further shown that 75 percent of the companies that had done a decent ROI analysis and had confidence in their metrics obtained the results they expected.

To complicate matters, CIOs and CTOs no longer solely control the company's IT budget. Each business unit has its own ability to purchase software and hardware to improve its productivity and processes. The result is 25 percent to 50 percent more IT spending than budgets indicate. There are companies out there that have invested millions of dollars in software that does not work and/or that they have never used and will probably never, ever use because of resistance. These are rising fixed costs that carry risks.

CIOs and CTOs are also now facing new roles and challenges. Capital/IT expenditures, which comprise the lion's share of organizations' expenses, are under close scrutiny. CIOs and CTOs are all under the gun to justify every significant expense and demonstrate how the costs will improve the bottom line in the near future. They also now have to do more with less money. IT spending budgets are not increasing (global spending went from $642B in 2000 to $573B in 2001; carrier IT spending was reduced from $150B to $60B). Further, they are expected to communicate more with stakeholders. Autonomous spending is capped at $100,000.

So we can see why it is so important to have a healthy ROI model. This is our perspective on why ROI matters, and we are building answers and justifications for such IT expenses into our products and solutions.

Both short- and long-term ROI depend heavily on a company's 'fit' with a speech solution. How do you help a company identify whether it's a viable candidate for a speech solution?

RN This is actually a two-part question. It implies both the fitness of the application or business problem being solved by a speech technology solution, and whether the cost or investment required to deploy it fits the bill. The answer to both can be found by analyzing the critical components of a successful ROI valuation. Any company that carefully considers these components (Human Factor, the Process, Data Hygiene, Metrics, and Technology) of an ROI for a speech-based solution deployment, or any other solution for that matter, can determine whether it is a fit or not.

Most companies, for example, just look at the raw Technology (hardware and software) costs to address the business problem and application. The most overlooked component is the Human Factor, i.e., the cost of training the people (employees, suppliers, customers) who will be using these systems. If a company sends 10 employees to offsite training at $35/hr, that is $28K. Unless your training returns $28K of benefit somewhere down the road, through a reduction in live agents or higher productivity, it is difficult to justify the investment. Companies also ignore the time spent overcoming natural organizational and consumer resistance to doing things differently. There is already some level of consumer skepticism towards speech recognition and how well it replaces IVR or live agents. Organizational change issues further compound the problem. Therefore, internal and external resistance, as well as the efforts and costs associated with overcoming it, must be factored in and planned for. Clean data, metrics for evaluating the success of a speech-activated system, and maintenance and upgrade costs are also key considerations.

Fortunately, there are plenty of ways to address these critical components from both a business application and an ROI standpoint. In the ROI valuation arena alone, there are as many methodologies available as there are management consulting firms. Sun has made this easier, though. By building collaborative Professional Services teams, Sun can take a potential speech technology customer through the entire cycle of ROI valuation. Sun technology partners (e.g., Nuance) offer tools and expertise that help transform different interactions with a customer's CRM/IVR systems into a well-designed grammar. Using the VoiceXML language and interfaces, these dialog flows are transformed into Web-based applications that all run on Sun Solaris and SunONE environments. Through these analyses and this expertise, Sun and its partners can readily ascertain a company's viability as a speech solution adopter and speech-enable its solutions and services.
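
The training example above does not state how many hours each employee spends in training. As a rough check, and assuming the $28K covers wages only, the figure works out to 80 hours per employee, or about two working weeks. The short Python sketch below reproduces that arithmetic; the function names are illustrative and are not taken from any Sun or partner tool.

```python
def training_cost(employees: int, hourly_rate: float, hours_each: float) -> float:
    """Wage cost of sending a group of employees to training."""
    return employees * hourly_rate * hours_each


def implied_hours(total_cost: float, employees: int, hourly_rate: float) -> float:
    """Hours per employee implied by a total training cost (wages only)."""
    return total_cost / (employees * hourly_rate)


# Figures from the interview: 10 employees, $35/hr, $28K total.
hours = implied_hours(28_000, 10, 35)
print(hours)                          # hours per employee implied by the $28K figure
print(training_cost(10, 35, hours))   # reproduces the $28K total
```

Any benefit estimate, such as reduced live-agent time or higher productivity, would need to exceed this total before the training pays for itself.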

Are there new tools or processes in place to help make deployments more cost effective?

RN Without a doubt! All key players in the market (Nuance, BeVocal, Intervoice, Avaya, ScanSoft, Audium, et al.) offer development tools as well as deployment tools. Many ISVs also offer ROI tools for determining break-even points and performing analysis beforehand. Many ISVs are also beginning to offer creation and testing of standards-based VoiceXML applications. Many are also beginning to incorporate some of the emerging standards such as SGML and SSML, as well as app-server and other back-end integration code generation. Most of the tools vendors have tested and validated their solutions on Sun platforms. Sun also offers its own SunONE object-oriented tools, which have been integrated with some of the above vendors. Sun plans to augment its SunONE tool suite by incorporating other capabilities, in particular visual dialog module design as well as J2EE app-server and other back-end code generation capabilities.
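
As an illustration of what a break-even calculation involves, here is a minimal Python sketch. It is not drawn from any vendor's ROI tool; the function name and all dollar figures are hypothetical, and it assumes a one-time upfront cost followed by constant monthly savings and running costs.

```python
import math
from typing import Optional


def break_even_months(upfront_cost: float,
                      monthly_savings: float,
                      monthly_running_cost: float) -> Optional[int]:
    """Return the first whole month in which cumulative net savings
    cover the upfront cost, or None if the deployment never pays back."""
    net_per_month = monthly_savings - monthly_running_cost
    if net_per_month <= 0:
        return None  # running costs eat all the savings
    return math.ceil(upfront_cost / net_per_month)


# Hypothetical deployment: $120,000 upfront, $15,000/month saved,
# $5,000/month to operate and maintain.
print(break_even_months(120_000, 15_000, 5_000))
```

A fuller model would also account for the training, change-management and upgrade costs discussed earlier in the interview.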

Do you think the development of industry standards has affected customers' willingness to make a speech solution investment? If so, how?

RN Absolutely. In Sun's view, four key factors have contributed to the growth of speech technology and solutions: the evolution of speech recognition, the proliferation of mobile devices, voice business application dynamics and the advent of voice technology standards. Voice standards are the most significant of these, since they have simplified solutions developed in both the plumbing and platform areas. Platform standards specifications (VoiceXML, SSML, CCXML, SRGS) are governed by the W3C, while plumbing standards (RTP, SIP, RTSP, MRCP) are overseen by the IETF. All of these standards have helped bring to market hundreds of open, standards-based voice-activated applications and services. Today, VoiceXML 2.0 has attracted millions of developers from the IVR and Web application development communities. We are also seeing packaged applications based on standard VoiceXML that are designed to simplify the building of open, web-based CRM and financial applications based on voice activation. This standardization promises to further proliferate speech technology solutions in the coming year, particularly in the areas of multi-modal application development, call control and speech synthesis. Today, VoiceXML's maturity (over 30 different VoiceXML interpreters have been developed since its introduction) has brought about a certain comfort level among potential adopters of voice-activated solutions.
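
To give a sense of why VoiceXML appeals to Web developers, the Python sketch below builds a minimal VoiceXML 2.0 document with the standard library and parses it back like any other XML page. The helper name `build_greeting`, the form id and the prompt text are made up for illustration; a real application would be served to a VoiceXML interpreter rather than printed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C VoiceXML 2.0 specification.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_greeting(prompt_text: str) -> str:
    """Return a one-prompt VoiceXML 2.0 document as a string."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "greeting"})
    block = ET.SubElement(form, "block")
    prompt = ET.SubElement(block, "prompt")
    prompt.text = prompt_text
    return ET.tostring(vxml, encoding="unicode")


document = build_greeting("Welcome. How can I help you today?")
print(document)

# The result is ordinary XML, so standard Web tooling can parse it.
root = ET.fromstring(document)
spoken = root.find(f".//{{{VXML_NS}}}prompt").text
```

Because the dialog is just markup, it can be generated by the same application servers and back-end code that already produce HTML pages.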

You are scheduled to participate in a panel named "Leverage Your Web Investments Using Speech" on Wednesday, Oct. 1 during SpeechTEK. What are your expectations of that session?

RN This session, in my view, speaks to the core premise of the "Voice Web," or the Web-based aspect of speech technology solutions. My intentions and expectations are quite simple: to drive home the notion that the voice-activated solutions of today are open and are essentially an extension of the web-based application and services model. In any market segment, you have your skeptics and your late adopters. So I view this session as another opportunity to counter some of their objections. This is also an opportunity to educate the audience not only on today's open speech platform solutions that help cut costs and improve productivity, but also on how Sun's Java technology allows leveraging of existing web and back-end infrastructure.
