Five years of growth in the speech recognition market reveal some interesting trends in the financial services industry and in user adoption. Many financial services institutions (including banks, brokerages, credit unions and insurers) have shed their conservative images by aggressively developing company-wide "speech strategies" to improve customer service, accommodate the highly mobile user, and drive costs out of the business. An industry segment historically known for lengthy product evaluations and fraught with regulatory issues has looked beyond the confines of the call center in its application of speech technologies. As such, the financial services industry provides a model for other industries to emulate in deploying a myriad of speech recognition solutions. Moreover, financial services has seen industry-wide acceptance of speech recognition thanks to the demonstrable results of multiple and sometimes identical application deployments, which influence 2003 budgets and help expedite the otherwise lengthy evaluation cycle.
Since 1997, investors have been able to get stock quotes or place a trade over any phone via a voice-activated interface. Automated speech recognition (ASR) solved a very simple problem for brokerage firms: stock names were difficult to enter on a keypad and brokers' time was expensive. In its July 2002 report, Automated Speech Recognition: Moving Telephone Automation from Touch-Tone to the Tip of Your Tongue, The TowerGroup sees continued adoption in this industry segment alone, estimating that global spending on ASR technology for retail brokerage applications will grow at a compound annual growth rate (CAGR) of 6.2% from $6.3 million in 2001 to $9.0 million in 2007. Similar trends are projected for banks, mortgage lenders and insurance companies as well.
For example, the Gartner Group expects a single-digit increase in financial services' IT spending in 2003, with resilient spending in the banking sector and increasing IT spending in insurance, according to the Preview of 2003 IT Spending Growth Among Vertical Markets.
Why Growth in Spending In A Tough Economy?
Projected growth of the speech market in financial services is based in part on the experience of the past five years. In that time, speech deployments have blossomed within the financial services community, fanning out from straight ASR deployments for specific tasks to deployments of multiple speech technologies across the enterprise. Given the increasing number of deployed speech applications among financial services institutions (FSIs), awareness among players is rising and reports of consistent return on investment (ROI) continue to be compelling:
- A third-party evaluation conducted by The Kelsey Group represents one of the most comprehensive ROI studies of fully deployed speech services in the speech industry. The study revealed that, on average, respondents from a variety of industries, including financial services, are saving $1.02 million annually from their ASR deployments, and that cost recovery for ASR deployments averaged 9.5 months.
- According to a benchmark study of bank call centers by the Center for Customer-Driven Quality at Purdue University, speech interaction reduces call time by 35% compared to touchtone.
Couple these statistics with company-specific case studies and you can quickly see why FSIs continue to invest in speech applications in a down economy. At a recent user group, Citibank Germany described its experience with speech technology. The company had expected a return on its speech application within five years; in fact, it achieved its goals in only five months.
Beyond ASR: TTS Is Expanding the Possibilities
While early investments in speech recognition technology were for very specific types of applications that would be next to impossible to deploy using touchtone technology, banks, credit unions and insurance companies have changed the scope for all players. These FSIs have presented ASR to their customers and members with an aggressive focus on enhancing the overall customer experience in the enterprise. They realized that this technology is a "convenient" interface. In addition, it wasn't a bad way to differentiate their offerings from those of their competitors. A number of market leaders are currently establishing speech as their primary interface with callers; their competition has noticed and is following suit.
But banks, credit unions and insurers haven't limited their investment to ASR; they have introduced complementary Text-to-Speech (TTS) technology to their customers. In the past few years, TTS has achieved better voice quality, making it seamlessly interoperable with an ASR engine. TTS has more personality than previously thought possible, and the synthesized quality can speed application deployment and reduce costs by eliminating the time and expense of audio studio recordings.
A number of FSIs have invested in TTS for applications like bill payment. TTS has allowed for flexibility and personalization of an application that previously assigned random numbers to payees, a user interface that is wholly counter-intuitive and can require a "cheat sheet" to remember the number assignments. While ASR streamlines that call flow ("go ahead and say the name of the bill you'd like to pay"), TTS confirms the payee name to the caller, saving time and money that would previously have been spent on a voice talent. This type of application has a solid business case in the call center: compare an average agent cost per call of $2.30 to an average cost per speech-automated call of $0.25.
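To make the per-call comparison concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the two per-call costs are the figures quoted above, while the function name `automation_savings` and the call volume of 100,000 are hypothetical and not drawn from any study cited here.

```python
# Per-call costs quoted in the article (US dollars).
AGENT_COST_PER_CALL = 2.30
SPEECH_COST_PER_CALL = 0.25


def automation_savings(calls_automated: int) -> float:
    """Dollars saved when `calls_automated` calls move from agents to speech.

    Assumes every automated call costs exactly the quoted averages,
    which is a simplification made for illustration.
    """
    return calls_automated * (AGENT_COST_PER_CALL - SPEECH_COST_PER_CALL)


# Savings on a single call, and the same savings as a share of agent cost.
savings_per_call = AGENT_COST_PER_CALL - SPEECH_COST_PER_CALL
reduction = savings_per_call / AGENT_COST_PER_CALL

# Hypothetical volume, chosen only to show the scale of the effect.
example_volume = 100_000
example_savings = automation_savings(example_volume)

print(f"Savings per call: ${savings_per_call:.2f}")
print(f"Cost reduction:   {reduction:.1%}")
print(f"Savings on {example_volume:,} calls: ${example_savings:,.2f}")
```

On these averages each automated call costs $2.05 less than an agent-handled one, a reduction of roughly 89%. Actual savings depend on how many calls complete in the automated system without transferring to an agent.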
Any increase in automation is a win for FSIs.
Speech Outside of the Call Center
In an industry whose competitive landscape has been consistently changed by government regulations, it is not surprising that new rules have prompted FSIs to re-evaluate the way they interact with their customers. Recently, the Americans with Disabilities Act (ADA) Accessibility Guidelines for Automatic Teller Machines required that "speech output should be supplied for all displayed text and labels." Based on this new requirement, banks are now leading all other financial institutions in the expansion of speech to an area outside of the call center: ATMs.
In October of 2002, the Canadian Imperial Bank of Commerce (CIBC) began using embedded TTS in English and French to allow the visually impaired to conduct financial transactions at its Automated Banking Machines (ABMs). These new ABMs meet the needs of the more than 100,000 clients of the Canadian National Institute for the Blind (CNIB) who are unable to read print because of a disability. CIBC's customers can plug in an audio headset, initiate transactions and have the ABM respond using TTS. In addition to allowing for ADA compliance, TTS in the ABM allows for personalization of prompts, transaction-screen advertisement, transaction mini-statements, and revenue generators such as money transfers, prepaid phone recharge and ticket/stamp dispensing.
By moving what was historically considered a call center technology onto devices such as ATMs, FSIs have once again leaped to the forefront of speech technology adoption. Whether pushed by regulatory compliance or a strong business case, FSIs are constantly looking for ways to serve customers in all channels more cost-effectively while retaining high rates of satisfaction. Speech is helping to meet this need.
With So Many Options, How Are FSIs Determining The Right Applications?
Because financial institutions have a history of successful experiences with speech technologies, the question "Should we use speech?" has matured into "How do we best use speech?" Leading financial institutions recognize that speech technologies can generate strong business benefits in a number of areas: customer service call centers, internal help desks, main switchboard call routing, ATMs and, eventually, even PDAs. In addition, the range of technologies being deployed, namely ASR, TTS and speaker verification (not discussed in this article, though FSIs are leading adoption of that technology as well), adds depth and complexity to the evaluation process that all financial institutions must now undergo.
All of these factors have pushed FSIs to approach speech investments from a strategic point of view. FSIs are organizing project teams to create a "speech strategy," a company-wide blueprint for achieving significant business results from the deployment of speech technologies throughout an organization. The change from five years ago is stark: today multiple business units are getting involved in strategic speech decisions. FSIs' speech strategies take an enterprise-wide view, setting the stage for the first and subsequent speech services, in order to achieve the maximum benefits in customer satisfaction, operational efficiency and branding. This strategic approach to speech technologies makes FSIs a model for other industries to emulate when deploying a myriad of speech recognition technologies.
