Ed Miller is the CEO of LumenVox. In a Q&A with NewsBlast, Ed gives some background on his company and discusses more general issues such as standards and benchmarking.
Q Tell us about LumenVox. How did you get started? What markets do you have a presence in? What technologies do you develop?
A LumenVox is a wholly owned subsidiary of Progressive Computing, which has been in business for over 18 years. Progressive wanted to speech-enable its voicemail systems and call center, so we went to several speech companies to discuss the technology and their products, but their pricing was exorbitant. We then turned to the academic community and went to CMU (Carnegie Mellon University), MIT and CU Boulder for training and education in speech technology and development systems. That process began about 3 years ago; we introduced the result to Progressive's voicemail system about a year and a half ago. Eventually, we decided to package our speech recognition engine and sell it to small and midsize companies that wanted speech recognition capabilities but could not afford the price tags of the existing speech companies. Since a big part of the development cost in speech recognition is on the technical, programming side, we also developed an extremely easy-to-use GUI designer and application engine, which allows anyone with basic computing knowledge to develop a speech application without needing extensive speech application or programming experience.
Q What do you believe will be key market drivers for this technology in the short-term? Long-term?
A In the current state of the economy, I see the key market drivers being ease of use and price point. Speech is the most natural interface for humans; the technology has finally reached the point where speech-enabled applications work extremely well for the average user, particularly in previously human-intensive applications such as call centers and customer support.
In the long term we believe speech will become the primary human-to-machine interface. The over-inflated promises of speech recognition in the past have created the collective mindset that speech recognition is all smoke and mirrors. We must prove to the general business community that speech applications work at the level their customers demand before speech-driven systems will be really integral to their businesses.
Q What vertical market segments do you see supplying the most growth for speech technology developers and why?
A The call center industry is one of the best market segments for speech recognition technology. Speech can easily supplement the call center wherever human operators are unavailable or in short supply. There will always be some need for human-to-human interaction, but many customers will receive quick answers to most questions through a much more natural interface than the old touch-tone systems.
Q What should the speech technology industry as a whole be doing to increase the growth rate of speech technology deployments?
A We need to have a stable set of standards for speech applications and create easy-to-use tools to help foster a rich community of independent speech application programmers. Reporting and logging standards should also be developed, to allow customers to reliably estimate the accuracy of the products and to enable communication between processes and technologies without undue work on the part of programmers or users.
In LumenVox's case, our Speech Driven Information System (SDIS) was specifically designed to be compatible with any future standards. Our own internal specifications can be translated into other formats, which allows us to interface with the current crop of speech technologies, particularly with respect to reporting.
Q How should an enterprise evaluate their needs for using speech technology?
A An enterprise should look at the way it communicates with its customers. If it has an existing touch-tone IVR system, it has a great starting place for a speech application. Speech systems can augment virtually any customer service department, decreasing hold time and increasing customer satisfaction. Call routing is often the easiest place to quickly and inexpensively realize the efficiency savings of speech.
The basic question that an enterprise should ask is how many of its customers' calls request the same types of information, e.g. account balances, telephone extensions, or repeated help questions that can be answered automatically. Those are the kinds of questions that eat up a customer service representative's (CSR) time and cost the enterprise money. If a speech application can alleviate some of the repetitive work, then the CSR can move quickly to the more complex questions that an automated system can't deal with.
Q Describe a successful speech technology implementation and why you thought it was successful. Please include any benchmark statistics that support your thoughts.
A I feel that any company that implements speech technology and has 30 percent or more of its calls handled completely by computer is successful. Speech technology should not be used to completely replace real customer service agents, but to increase their productivity. We are trying to provide another vehicle for self-service, in a world that is requiring this autonomy more and more.
Progressive Computing deployed its speech application in the call center during Q1 of 2002. The system was set up to let customers either go directly to a live agent or use self-service. By August of 2002, out of the 29,025 calls received, over 10,000 calls were handled by the speech application, eliminating CSR intervention for common questions.
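As a quick check of those figures against the 30 percent yardstick, the arithmetic can be sketched in a few lines of Python. The function name and the threshold constant are illustrative assumptions, not part of any LumenVox product; because the interview says "over 10,000" calls, the result is a lower bound.

```python
# Hypothetical helper: share of calls fully handled by the speech application.
SUCCESS_THRESHOLD = 0.30  # the "30 percent or more" yardstick from the interview


def self_service_rate(total_calls: int, automated_calls: int) -> float:
    """Return the fraction of calls handled without a CSR."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return automated_calls / total_calls


# Figures quoted in the interview; 10,000 is a floor ("over 10,000").
rate = self_service_rate(total_calls=29_025, automated_calls=10_000)
print(f"At least {rate:.1%} of calls were automated")  # At least 34.5% ...
print("Meets the 30 percent yardstick:", rate >= SUCCESS_THRESHOLD)
```

On these numbers the deployment clears the 30 percent mark by roughly four and a half percentage points.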
Q Provide us with your thoughts on the various standards that are being implemented and discussed.
A Both VoiceXML and SALT are on the frontier of speech technology. Obviously VoiceXML has been around longer than SALT, and both have their strengths and weaknesses. VoiceXML has a larger community of developers, but it is essentially a procedural language. SALT has the ability to easily create mixed-initiative and event-driven dialogs. LumenVox is closely watching and experimenting with these two technologies before we determine which one (or both) to implement in our Speech Driven Information System. Currently the Speech Recognition Engine can be developed to sit below a VoiceXML interpreter, due to the engine's hardware independence; because SALT is still a newly released technology, we have not yet worked with it as much.
Q The industry has talked about conducting a "benchmark test" of the various suppliers of speech technology. Do you think this would be beneficial for customers? If you answered yes to this question, please describe the method you think would be most beneficial to buyers of speech technology solutions.
A I do feel that this would be beneficial to customers, simply because each company has different tuning and training tools that it utilizes. Our industry especially needs to focus on the most basic engine and functionality, and then expand the benchmark system to include the 'added' features and parameters every company develops. For example, most companies offer several versions of the same product at different prices with somewhat different performance characteristics, such as trading slightly less accurate recognition for a reduced CPU load and decode time. The problem is that potential customers can't really evaluate the various solutions, except by very subjective, more or less ad hoc methods. Benchmarks would allow potential customers to compare the various companies on quality, functionality and price. One benchmark already exists in the academic community, based on word-error rates, but even that rate is viewed as the 'best measure so far' and not really the 'best' method of measuring accuracy, so perhaps a little cooperation with the academic and government arenas would be profitable.
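For readers unfamiliar with the word-error rate mentioned above, it is conventionally the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The Python below is a minimal sketch of that standard definition using a word-level edit distance; the function name and the sample utterances are illustrative assumptions, not a description of any vendor's benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up utterance: one substituted word out of five reference words.
print(word_error_rate("what is my account balance",
                      "what is the account balance"))  # 0.2
```

A rate of 0.2 here means one word in five was wrong. The measure treats every word as equally important, which is one reason it is regarded as a 'best measure so far' rather than a complete picture of application accuracy.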
Q How do you provide confidence to your customers and partners that LumenVox has the ability to support their technology needs?
A We have always had the outlook that all our customers are the top priority, and we pride ourselves on that view. For us, the customer experience begins when they first contact us, so we strive to provide the support and information they request within a very reasonable period of time. In the past, we have offered demonstration systems tailored to a customer's environment, where they can call a toll-free number and get a feel for how well our system works. We also provide the SDIS and Speech Recognition Engine in a single-port version free from our website. With this type of introduction to LumenVox, new customers gain confidence that our product works and, even more importantly, that we will work to make it even better for each individual client. Our attitude helps customers to believe in our company, service and ability to support their needs.