The Prescription for Speech Delivered at SpeechTEK 2011
                
NEW YORK (SpeechTEK 2011) - Ordinarily, a conference’s closing remarks are supposed to give attendees a positive takeaway, but this year’s closing keynote on August 10 posed a challenge instead.
Conference co-chairs Jim Larson and Susan Hura both maintained that people still hate using interactive voice response (IVR) systems and urged attendees to improve them at all costs.
“The current state of the speech industry is not very happy,” said Hura, president of SpeechUsability. “Speech in the IVR context is not winning rave reviews from the general public. People by and large have a negative opinion of this technology.”
Larson shared a similar outlook. “IVRs are giving speech a bad name,” he said. “Let’s clean them up.”
As proof of customer dissatisfaction with IVRs, both speakers cited the growing number of companies promising customers that they will speak to a live person rather than an IVR when they call, and the continuing presence of sites like GetHuman.com that offer consumers ways to bypass IVR menus when calling a customer service line.
To make matters worse, the speech-enabled IVR is no longer “the only kid in the sandbox,” Larson added. “IVR must play nicely with the other kids.”
By “the other kids,” Larson meant mobile and multimodal applications that rely on many other input and output formats. In fact, very few mobile applications rely on speech as the main interface, both Larson and Hura noted. In response, the industry will need to develop applications “with multiple interfaces that are quick, friendly, and easy to understand,” Larson said.
He then outlined the following six-part prescription for moving the speech industry forward: 
1. Improve your company’s IVR rating to greater than 4 out of 5.
2. Publish a definitive set of user interface guidelines for multimodal mobile applications based on experimental results.
3. Create a collection of world-class multimodal mobile applications.
4. Grow IVR apps into multimodal mobile apps.
5. Develop new multimodal mobile applications.
6. Develop and publish best practices for multimodal mobile apps.
As part of his prescription for the industry, Larson also suggested that developers do away with proprietary systems and platforms and, instead, build to industry standards that provide greater integration and portability. “You can write a native app for each platform—Apple’s iOS, Windows, and Android—but wouldn’t it be better if one app could work on them all?” he asked. Larson suggested using standard languages and programming interfaces, such as VoiceXML or HTML5.
Hura’s prescription contained only three parts:
1. Figure out where speech makes sense. It has to fit within the user’s need and context of use.
2. Make speech work flawlessly by reducing the number of recognition errors.
3. Set the bar higher than just usability in terms of metrics.
She defined usability as the ability of the customer to do what the system was designed for.
“That’s not good enough anymore,” she said, arguing that applications should be valuable, in that they solve a real problem for customers; pleasant; efficient, so they don’t waste callers’ time; and transparent, so they don’t get in the way of what the caller wants to do. 
“Thus far, in terms of providing value, efficiency, and a pleasant interface, we have failed,” Hura said. “What we are doing as an industry is too little, too late. We have not done enough to provide excellent customer experiences.”
She also urged fellow designers to “make capturing user feedback as important as gathering requirements from the business stakeholders.”
News Editor Leonard Klie can be reached at lklie@infotoday.com.