What We Need Is A Killer App
Waiting for the Revolution
Three or four years ago, speech industry pundits were fond of saying things like "The speech technology industry can expect near exponential growth" and "This is the year for speech recognition."
While the industry has certainly grown since that time, few would argue that it has enjoyed the tremendous growth that some had envisioned. When industry leaders discuss the state of the speech market, one can often hear it said that what the industry really needs is the emergence of a "killer app."
Killer apps are, of course, software applications that are so obviously valuable and desirable that the desire to obtain them can drive an entire market. In the early days of personal computers, the business case for spreadsheets and word processor applications was often sufficient to justify purchasing the computers necessary to run them. To date, however, no obvious analogy exists in the speech industry.
Public Awareness
As an industry consultant, I am frequently reminded how little the public knows about the abilities of speech recognition technologies. Even when dealing with veteran DTMF-based IVR professionals, it is often necessary to "begin with the basics." Sometimes there is an issue of clients having unrealistic expectations of the technologies, but far more frequently, folks simply do not know the basic capabilities of the technologies. They are often unaware of where the technologies are already deployed and how they are being used to solve real business problems. One thing is certain: The speech industry could benefit from an inexpensive, potentially ubiquitous application that showcased some of the desirable abilities of its technologies. Such a product or application would need to be functionally effective and provide a feature or service that is above and beyond existing alternatives.
Possible Candidate?
During the public discussion of the national "do not call" registry, such an application came to mind. Why not create a digital answering machine with an embedded speech recognizer that intercepts all calls to one's home? The idea is to answer each call before the phone rings and ask the caller with whom they would like to speak. The grammar would be essentially limited to the names of household members. If the caller says an appropriate name, the device would then emit a ring sequence that is unique to the particular household member being called. In theory, no member of a household would ever have to answer a call that is not specifically intended for them. Thus, pesky telemarketers who mispronounce names, hesitate and cause a timeout error, or simply attempt to speak with the "head of the household" could not get through. In fact, all unresolved calls would automatically be routed to the message-taking component of the machine, which would include a terse instruction to telemarketers to remove your number from their calling list. I have discussed this idea with various people in the industry and the consensus seems to be that such a device could be produced and sold for less than $50. Furthermore, almost everyone has agreed that a $49 answering machine that only rings your telephone when the name of a household member has been correctly recognized, and even then does so in a way that is unique to each family member, would be a real bargain.
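The screening logic described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a design for an actual device: the household names, the ring pattern labels, and the `screen_call` function are all hypothetical, and the recognizer is reduced to a string it is assumed to have heard (or `None` when the caller hesitates and the prompt times out).

```python
from typing import Optional

# Hypothetical household: the recognizer's grammar is limited to these names,
# and each name maps to its own distinctive ring pattern.
RING_PATTERNS = {
    "alice": "short-short",
    "bob": "long-short",
    "carol": "short-long-short",
}

# Hypothetical label for routing a call to the message-taking component.
TAKE_MESSAGE = "take-message"


def screen_call(recognized: Optional[str]) -> str:
    """Decide what to do with a call before the phone ever rings.

    `recognized` stands in for the recognizer's output: the text it heard,
    or None if the caller said nothing and the prompt timed out.
    """
    if recognized is None:
        # Timeout: treat as an unresolved call.
        return TAKE_MESSAGE
    name = recognized.strip().lower()
    # Anything outside the grammar is also unresolved and goes to messages.
    return RING_PATTERNS.get(name, TAKE_MESSAGE)


if __name__ == "__main__":
    for heard in ["Alice", "head of the household", None]:
        print(repr(heard), "->", screen_call(heard))
```

In this sketch a caller who asks for "Alice" triggers her ring pattern, while a request for the "head of the household" or a timeout falls through to the message-taking path, which is where the terse instruction to telemarketers would be played.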
Business Reality
As attractive as the idea may seem, the business case for such a device may be questionable. Although it remains to be seen how enforceable the "do not call" legislation will be, it is likely to reduce the incidence of telemarketer calls. On the other hand, some have suggested that telemarketing companies need only set up shop a few steps across the U.S. border to make their calls exempt from the legislation. In any event, there already exist several products that perform some of these functions. One product intercepts the call after an initial ring and sends a signal to the intruding predictive dialing machine that the number is invalid. Another product prevents the phone from ringing but informs every caller that the number is off limits to telemarketers. Callers "in the know" are obliged to enter specific DTMF numbers to effect individual rings for the homeowner, spouse, children, etc. Just how callers become in the know is a mystery. Neither of these alternative devices takes messages.
Wide Appeal
Questions as to the business case notwithstanding, the overall idea of a $49 call screening answering machine is extremely attractive on the surface. The user interface would have to be expertly designed in order to ensure ease of use for both the owner of the device and those who call him. But if such a device were inexpensively available, millions of users might quickly come to know the capabilities of speech technologies and, presumably, think very positively of them. There is no doubt about it: People will think highly of a technology that does something for them.
Dr. Walter Rolandi is the founder and owner of The Voice User Interface Company in Columbia, S.C. Dr. Rolandi provides consultative services in the design, development and evaluation of telephony-based voice user interfaces (VUI) and evaluates ASR, TTS and conversational dialog technologies. He can be reached at firstname.lastname@example.org.