A Whole New World
Businesses are focused on immediate ROI and on implementing new technologies to drive revenue. And during the past 18 months, speech technology has been delivering on its promise. Speech-enabled automated contact center agents are becoming increasingly common, and conversational navigation systems are no longer limited to luxury cars - they're used even in the mid-priced Honda Accord and Acura MDX. In a pervasive computing world, where millions of people are connected to billions of devices, the user interface will be key - and speech is one such interface. We've made a lot of progress. The reality of a "Jetsons"-like lifestyle is a lot closer than one might expect.
Still, we need to bear in mind that building a good speech application requires very specific skills and considerations, and that we need to go beyond transferring assumptions about the Web world into the speech world. Let's start with the most fundamental level - interfaces. Programming speech applications is quite different from programming graphical user interfaces (GUIs). Conversational user interface (CUI) development requires knowledge of how people behave on the phone and with their devices, and of how to get them where they need to go, whether they need transactions or information, in the shortest amount of time. Developers must have a firm understanding of how to create an interface that is inviting and will encourage any user, new or experienced, to return to a company's service again and again.
Therefore, companies will gain the greatest success with speech applications when they work with existing enterprise infrastructures, partner with experienced companies that support open standards and deploy speech applications in applicable areas of their business. Every opportunity comes with unique challenges.
However, with a clear understanding of these challenges, along with the benefits, and a well-planned strategy for execution, companies both large and small will find that an investment in speech technology reaps significant rewards.
UNDERSTANDING THE BENEFITS
One of the largest returns on investment for companies with substantial customer service departments is the self-service call center application, powered by Interactive Voice Response (IVR) call center technology. In fact, a study by research firm In-Stat/MDR states that live agents cost $1 to $5 per call, as opposed to 20 cents for a speech recognition system.
In addition, non-traditional devices such as cars, PDAs and home appliances have much more power and memory than they did five to 10 years ago. They are shipping with more memory, wireless connectivity, increased capabilities and hard disk drives. These smart devices with embedded operating systems and technology are becoming the equivalent of yesterday's servers, showing us how high-end computing system capabilities such as voice recognition and speech translation are filtering down into devices that never had intelligence before.
The exponential growth of processing power has helped increase the quality and usability of speech technologies. This is allowing large, complex and sophisticated applications to run on a gamut of offerings from servers to small devices, making speech technologies a vital interface. No longer a standalone technology, speech has become part of the pervasive computing infrastructure, helping us to access information anywhere and at any time.
What does this mean for businesses looking to implement speech technologies? The technology is taking on a whole new role in this evolving mobile and keyboardless world. Speech makes communication easier and allows us to access data through new areas of computing such as Web portals, handhelds, cars and customer call centers. Allied Business Intelligence (ABI) forecasts that the speech market will increase to $897.8 million in 2003, up from $677 million in 2002, and will reach $5.3 billion by 2008. This includes platforms, technologies and related services - signifying new business opportunities for existing enterprises and entrepreneurs in an emerging market.
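To put the figures above in perspective, the arithmetic can be sketched in a few lines of Python. This is only a rough illustration: the per-call costs and market sizes are the ones quoted above, while the call volume and the function names are hypothetical, chosen purely for the example.

```python
# Per-call costs in cents, as quoted above; integers avoid float rounding.
LIVE_AGENT_CENTS = (100, 500)   # $1 to $5 per live-agent call
SPEECH_CENTS = 20               # 20 cents per automated call

# Market forecast in millions of dollars, as quoted above.
MARKET_2002, MARKET_2003, MARKET_2008 = 677.0, 897.8, 5300.0


def savings_dollars(automated_calls, live_agent_cents):
    """Dollars saved when `automated_calls` move from live agents to speech."""
    return automated_calls * (live_agent_cents - SPEECH_CENTS) / 100


def annual_growth(start, end, years):
    """Compound annual growth rate between two market sizes."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    calls = 500_000  # hypothetical: calls handled by the speech system per year
    low, high = (savings_dollars(calls, c) for c in LIVE_AGENT_CENTS)
    print(f"Savings on {calls:,} automated calls: ${low:,.0f} to ${high:,.0f}")
    print(f"Implied growth 2002-2003: {annual_growth(MARKET_2002, MARKET_2003, 1):.1%}")
    print(f"Implied yearly growth 2003-2008: {annual_growth(MARKET_2003, MARKET_2008, 5):.1%}")
```

The sketch covers per-call operating costs only. It leaves out the cost of building and deploying the speech application, which any real business case would need to include.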
So, what type of technology is right for your company and how do you find your niche in the market?
MARKET OPPORTUNITIES
Today's speech market is evolving into four distinct areas:
call centers for self-service applications;
telecommunications and carriers, where speech will play a role in delivering voice services to subscribers;
embedded technology, as users look for new ways to communicate with non-traditional devices such as cars and home appliances; and
the work force, where field and sales force professionals will use speech to simplify everyday activities and increase productivity while traveling or in the field.
We're seeing real-world deployments including:
Automated contact centers: Interactive Voice Response (IVR) technology in call centers allows customers of companies such as investment management firm T. Rowe Price to gain access to account information and 401(k) plans over the phone, 24/7, without subjecting callers to hold times or requiring them to respond to rigidly structured menus.
Telematics: Using speech technologies in automotive applications. For example, the 2003 Honda Accord offers drivers a voice-activated mapping system that provides driving directions to and from any specified address or location in the United States.
Voice portals: Using a speech interface to access enterprise data, allowing consumers and mobile workers to access information and transactions without being tethered to their desks. Orange Dominicana, in the Dominican Republic, for instance, is now providing easier access to its entertainment and information services using speech technology. By voice-enabling items such as movie listings and local events through Web and voice portals, the company gives its wireless customers in the Caribbean nation an opportunity to access enterprise-level information while on the go. The company also manages telephone dialogue via speech recognition and text-to-speech features, handling more than 10,000 calls per hour.
Devices: The ability to squeeze convenient speech recognition into ever-smaller devices, such as cell phones, PDAs like the Hewlett-Packard iPAQ and other mobile devices, provides users with another option to easily access information (including PIM and e-mail). Combined with rapidly advancing mobile technologies, devices can, in effect, evolve into "personal concierges" - for instance, IBM Research's Artificial Passenger prototype demonstrates how a driver can interact with a speech-enabled agent in his car. The device converses with him, offers alternative routes to avoid upcoming traffic jams, rebooks flights when it finds out the flight is delayed and plays "name that tune" with the driver when it senses he's getting tired and needs to be kept awake.
Accessible technologies: IBM's Home Page Reader (HPR) allows blind and low-vision users to easily surf the Web, using text-to-speech synthesis to speak Web-based information aloud just as it is presented on the computer screen. HPR is now produced in 11 languages and distributed worldwide.
Desktop: As the longest-running platform for speech technologies, desktop applications such as IBM's ViaVoice provide users with the ability to navigate computer programs and compose e-mails and memos via voice - freeing them from dependence on the keyboard and mouse.
Telecommunications: Additional services in the telecom sector, such as unified messaging, offer a combination of voice, fax and regular text messages as objects in a single mailbox that a user can access either with a regular e-mail client or by telephone.
WHAT OPPORTUNITY BEST FITS YOUR NEEDS?
Every company has its own set of needs to create a productive, efficient workforce. Enterprises want to integrate new technologies that show a visible ROI. This may be as simple as increasing workforce productivity, streamlining existing business processes or even enhancing customer service offerings.
To begin the process, your team must first look at which sector of your business will most benefit from speech technology. For instance, a company with a mobile workforce may want to voice-activate its back-end databases for easy, on-the-fly access to information such as inventory availability and current product pricing. A mobile salesperson checking on inventory at a retail store or on a factory floor, for instance, could report stock availability or request new stock by voice rather than by filling in forms.
Another area to take into consideration is where your company is looking to increase efficiency. To many, this means making important data and functions easily accessible by speech-enabling everyday functions such as PIM or e-mail for mobile sales and field force workers. In this case, you'll want to ask, "Is my mobile workforce getting the data they need when they need it?" For instance, real estate agents in Irvine, Calif., can now dial into a call center via a cell phone and access property listings through NewportWork's AnyTimeMLS - a service that uses speech to access information in local Multiple Listing Service (MLS) databases. This allows agents to close an additional three to five properties annually because they no longer lose time returning to the office for up-to-date property specs.
Finally, you'll want to determine how end-users will access the voice-enabled data. Will they want to drive in their car and hear directions via voice? Will they use their smartphone or PDA to set up appointments or correspond by e-mail while on the road? Or are they looking to call into a data center to gather inventory and price figures such as property specs and Multiple Listing Service (MLS) real estate information?
START NOW FOR FUTURE SUCCESS
If your enterprise has not already started using speech and voice technologies, here are some tips on how to begin in order to gain a competitive advantage:
Get going: It's never too early to start planning and prioritizing project initiatives. Include all constituents involved in the integration process, with IT taking a leadership role and working with business units to identify opportunities and create an enterprise strategy that can achieve sound business results. Initiatives led by individual business units can result in incompatible solutions that may prove costly to support and integrate.
Be clear on objectives: Understand from the start how your speech solution will help to achieve objectives such as cost management, customer service satisfaction, process improvement, revenue and competitive gain and/or customer loyalty and retention, by developing a solid, bottom-line business case for proceeding.
In addition, be realistic about the budget. Be aware of the cost of deploying speech applications as well as the ROI. Remember, any project that is under-funded is doomed to fail.
Let a seasoned vendor assist: Engage experts and trusted partners at the beginning of the process to better understand the options and make the right choices relative to the platform architectures, speech application and end-user usability and functionality.
Deploy a pilot: Select a small group of individuals and find out if they can navigate easily and effortlessly using the speech framework your team has created. Choose participants who represent the eventual users of the application - potential customers, mobile workers and clients. Track the usage of the speech application and gather feedback. This small proof-of-concept pilot will give your organization a sense of the potential ROI as well as provide you with enough information to plan for improvements to the application before its full rollout.
Deploy and assess: After deploying your application to a larger audience, make sure to assess the project on a quarterly basis to gauge its success and adjust plans accordingly.
Listen to your customers: This is the most crucial step to successfully introducing a new application into your enterprise. Customers are your most direct indication of whether or not your application will succeed, and their feedback will ultimately guide the growth and success of your deployment.
Speech is becoming an essential tool as we look to find new and improved ways of accessing information and completing transactions in this increasingly mobile and untethered world. In a multimodal future, speech technology will provide greater flexibility to communicate with our ever-shrinking devices. This will allow us to combine graphic and speech user interfaces to access and manage information anyplace, anytime and in a way dictated by the situation. While competitive and economic pressures increase the value of immediate access to information, ROI is the end goal. A mobile work force empowered with anytime, anyplace access will be more efficient and more productive, positively impacting the company's bottom line. In this age of convergence and mobile devices, the freedom to use voice interactions sets the stage for truly pervasive computing. By starting now, your organization can improve the efficiency of your mobile sales and field force and also create competitive differentiation through new types of transactions and customer services, localized information and personalization.
Gene Cox is director of Pervasive Computing at IBM. He is responsible for overseeing IBM's product strategies to lead the computing industry beyond corporate walls and desktops, into a new era of e-business at your fingertips, anytime and anywhere - using wireless and embedded technologies.