June 1, 2007
By Michael Cohen, Manager, Speech Technology Group, Google, Inc.
The View from AVIOS

Highlighting Metrics and Mobility

SpeechTEK West was an exciting meeting, with talks covering a wide range of topics, including technology, application deployment, standards, and business analysis. There were also a number of excellent educational workshops covering voice user interface design and personalization, and the development of multimodal applications received particular attention. Given limited space, I’ll comment here on just a few highlights and observations.

There were two speech industry trends that I found especially promising. The first was an increased focus on concrete data analysis techniques applied to live, in-service applications. A number of talks described tools for gathering such data, discussed which measures are most relevant, and provided case studies showing how this type of analysis leads to insights and improvements in application performance through successive iterations.

One noteworthy presentation was from Marci Kirkpatrick, a project manager at AT&T. She described a number of measures that AT&T tracked for a deployed call center system, including retry rates (broken out by dialogue state), self-service success rates, hang-up rates, and repeat-call rates. As a real-world example, Kirkpatrick showed a case in which a prompt change actually degraded recognition performance for a particular dialogue state, leading to more retries and reduced success. Because of AT&T’s ongoing measurements, the company was able to detect the problem quickly, correct it, and ultimately achieve improved results. This was a persuasive illustration of the importance of tracking such quantitative data in live systems. Often, changes that intuitively make sense have subtle problems that can only be detected by analyzing substantial amounts of real usage data.
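To make the idea concrete, here is a minimal sketch of how per-state retry rates might be computed and compared before and after a prompt change. It is not AT&T's tooling: the log format (pairs of dialogue state and a retry flag), the state names, the sample counts, and the 0.05 threshold are all assumptions made up for illustration.

```python
from collections import defaultdict


def retry_rates(events):
    """Return {state: retries / attempts} for each dialogue state.

    `events` is an iterable of (state, was_retry) pairs. This log
    format is hypothetical, chosen only for the example.
    """
    attempts = defaultdict(int)
    retries = defaultdict(int)
    for state, was_retry in events:
        attempts[state] += 1
        if was_retry:
            retries[state] += 1
    return {state: retries[state] / attempts[state] for state in attempts}


def regressed_states(before, after, threshold=0.05):
    """List states whose retry rate rose by more than `threshold`."""
    return sorted(
        state
        for state in after
        if state in before and after[state] - before[state] > threshold
    )


# Made-up sample data: the prompt for "get_account" was changed,
# while "main_menu" was left alone.
before_log = (
    [("get_account", False)] * 9 + [("get_account", True)] * 1
    + [("main_menu", False)] * 8 + [("main_menu", True)] * 2
)
after_log = (
    [("get_account", False)] * 7 + [("get_account", True)] * 3
    + [("main_menu", False)] * 8 + [("main_menu", True)] * 2
)

before = retry_rates(before_log)
after = retry_rates(after_log)
flagged = regressed_states(before, after)
print(flagged)
```

In a live system the same comparison would run over far larger volumes of calls, and the threshold would need to account for normal day-to-day variation before a state is flagged.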

Lizanne Kaiser, the senior principal consultant for voice services at Genesys Telecommunications Laboratories, discussed the need to define and prioritize metrics early in the design process and gave a number of examples of how poorly chosen metrics can result in bad design decisions. She described differences in transactional and informational tasks, and how those differences influence the choice of appropriate metrics.

Mobile Opportunities

The second trend I noticed was excitement about the mobile opportunity, with a focus on new types of applications and services that can be delivered to mobile phones. This is a healthy development for the speech industry. It is exciting to see people so focused on new ways to serve the needs of end users via the rapidly developing mobile industry. Speech technology clearly has a key role to play in making mobile services useful and compelling.

This excitement was evident in the many sessions on mobile and multimodal applications and user interface design, discussions of efforts to come up with standards for multimodal applications, a multimodal workshop, and an excellent keynote address from Mike McCue, founder and CEO of Tellme Networks, which was recently acquired by Microsoft.

The focus on new applications and services this year contrasted with past meetings where the dominant focus was on call center applications to automate calls previously handled by live agents. Certainly, call center automation remains an important area where speech technology can help meet needs and deliver ROI, and there were a few sessions that focused on the topic. Historically, though, the industry focus on minimizing human-agent interaction with end users has often led to suboptimal design decisions in terms of the end user experience. The renewed focus on innovative applications and new types of value we can bring to end users will hopefully lead to a new generation of applications and services that will provide long-term growth areas for the industry and compelling new experiences for users.

Mike Cohen is the manager of speech technology at Google. He is vice president and a member of the board of directors of AVIOS.
