Jim Caldwell, Director of WebSphere Infrastructure Product Management, IBM
IBM connects people, wherever they go, to the information and applications they need, using wireless and voice middleware on the server side to support the broadest spectrum of mobile networks and a wide array of devices on the client side. IBM's solutions enable users of telephones, wired or wireless, to conduct business transactions or access information by speaking.
Speech Technology Magazine sat down with Jim Caldwell to discuss his new role and the changes occurring with speech inside and outside of IBM.
Q. Congratulations on your appointment as director of WebSphere Infrastructure Product Management at IBM. What will your new role entail?
A. My team is responsible for a set of products including our speech products: IBM Embedded ViaVoice and IBM WebSphere Voice Server; our commerce product: IBM WebSphere Commerce; and our application infrastructure products: IBM WebSphere Application Server including Community Edition and WebSphere XD.
Q. What does IBM provide relative to speech technologies and products?
A. IBM develops speech engines (ASR & TTS) and languages. We deploy this technology in two areas, enterprise speech and embedded speech. Our enterprise speech provides WebSphere-based speech technology for conversational self-service applications in use by customers such as T. Rowe Price and Prudential, to name a few. Our award-winning embedded speech is well recognized in the automotive space with customers like GM OnStar, Honda, Pioneer and XM Satellite Radio with VoiceBox. We are also expanding into additional markets including consumer electronics and service provider solutions, enterprise solutions and set-top box/digital media solutions. Wake Forest University, Miami Children's Hospital with Teges Corporation and Openstream are examples of customers in these markets.
Q. Why did speech move into the IBM WebSphere organization?
A. A lot of times these kinds of technologies are nurtured in IBM as Emerging Business Opportunities, but as they mature, they graduate from EBO status and become major contributors to our end-to-end solutions. The IBM WebSphere platform is key to a Service Oriented Architecture, and SOA is key to being an "on demand" business. With more than 2,400 SOA customer engagements, IBM helps companies become "on demand," so they can react quickly to threats and new opportunities. The Contact Center is a crucial gateway into a company, so speech can be a vital part of the Contact Center strategy to empower customers to complete the transactions they need to complete on a 24X7 basis. Our embedded speech focus complements this by addressing the need for a more natural user interface for connected devices.
Q. What changes, if any, do you have planned for IBM's server-side speech recognition programs?
A. We will continue to innovate with speech technologies and grow the use of speech as part of a larger end-to-end solution. My job is to make sure we leverage the larger IBM community and make sure our worldwide organization is aware of the benefits that speech can bring to our customers. There are tens of thousands of customers using WebSphere technology today, and many of those customers may be able to see great value in conversational self-service.
Q. What, in your opinion, will continue to drive the growth of speech technologies?
A. The whole field of unified communications, which encompasses SIP, VoIP and open standards, is driving the use of speech for conversational self-service, authentication and even speech-to-speech translation. Speech integration and the ability to touch customers in multiple ways will help grow Service Oriented Architecture within companies. Using the SOA example, speech services can exist either inside or outside of corporate IT. Hosting of speech self-service will drive growth, giving self-service abilities to companies without the up-front investments.
Another very practical answer is the success we and our customers are having with the technology. A good example is in the automotive space. Our customers are providing significant differentiation by using our embedded speech in components such as car navigation systems. Our ability to provide services to help our customers integrate speech into their solutions is also key.
Q. What do you think will be the "next big thing" in speech?
A. The next "big thing" in speech will be when speech stops being a "big thing." Just like when cell phones moved from being a novelty to being a commonplace item, speech will become a natural mode of interaction to access services and information. This will be driven by the integration of speech into our systems and solutions as well as the evolution of the user interface to a more natural conversational interface.
We recently released a new version of IBM Embedded ViaVoice which provides something we call "freeform commands." "Freeform commands" enable our customers to provide a much more natural conversational interface which does not require the user to memorize specific predetermined commands. We see the user interface evolving to a full conversational interface over the next several years.
Q. Is there anything that you would like to add?
A. In the May/June 2006 issue of your magazine, you published an excellent article by analyst Nancy Jamison on "Innovation," featuring the incredible contributions IBM Research has made to the worldwide speech community over 35 years and its continued investment in the development and progress of future conversational technologies for devices. "IBM is certainly at the forefront of what this column is about - innovation," wrote Jamison. "Whether research results are near term or in development for decades, they innovate in new areas and push the envelope in existing ones. From improving self-service applications in a contact center, adding to the efficacy of machine translation and human communication, to improvements in the core technology, IBM is vastly improving the quality of speech technologies." While IBM's previous breakthroughs in speech recognition are part of our proud legacy, our commitment to furthering speech technology in the future is solid.