Voice Still Searching for Its Place

SAN DIEGO — The first-ever Voice Search Conference opened here today with a Q&A-style panel that explored the characteristics and uses of voice search, along with the technological and sociological hurdles to its proliferation. Moderator Bill Scholz, a speech technology consultant at NewSpeech, opened the line of questioning by acknowledging that voice search refers to both audio mining and conducting data searches by voice, but the Q&A focused exclusively on the latter.

Despite the uptake of voice search in the automotive field—popularized by Ford’s Sync system powered by Microsoft—most developments in voice search occur in the mobile telephony space because product cycles for mobile phones are brief. "By contrast, cars last a long time—five to seven years," says panelist Jordan Cohen, a senior scientist at SRI International. "You build a car and by the time its life cycle is over, the computer is out of date."

There are still areas where the combination of voice technology and search technology needs to be worked out. Jim Larson, an independent consultant and VoiceXML trainer, and Bill Meisel, president of TMA Associates, were both in the audience and questioned whether voice is an efficient user interface. Larson pointed out that the presentation of data culled from a voice search still hasn’t been optimized. "Nobody thinks of voice as a primary interface," Meisel added.

"Speech is an inefficient way to present information," concedes panelist Alex Rudnicky, the principal systems scientist at the School of Computer Science at Carnegie Mellon University. "There are practical solutions such as multimodal interfaces."

Panelist Michael Phillips, co-founder and chief technology officer at Vlingo, says his company tries to overlay speech on existing paradigms like GUIs, so users don’t have to be trained to interact with a speech interface.

"We treat this more like Web search," says Phillips. "Let users speak whatever they want. We had to get rid of explicit application-level constraints and take advantage of characteristics of mobile phones." Often, this means relying heavily on user preferences culled from use cases and call histories, in a way that makes voice search easy to use without violating a caller’s privacy.

"This is not a speech interface question," Phillips adds. "It’s a mobile question."

Panelist Yoon Kim, CEO at Novauris Technologies, agrees that it’s important to integrate voice search with historical data, user history, and contextual information to provide more accurate search results. Yet he emphasizes that the comparison between voice search and text-based Web search is overstated. Both philosophies ultimately try to pin down when, and in what specific scenarios, voice search will be used.

"When people use voice search to get something, the queries and the types of words they use are different in voice search and text search," he says. "Web search on the PCs is research. You’re taking time to look at various kinds of content."

Searching by voice on a mobile device, by contrast, needs to happen quickly and results need to be simple. "So what you’re expecting out of the mobile experience in terms of content coming back is very different," Kim says.
 
