Collaboration will focus on human-machine interaction.
Researchers at the International Computer Science Institute (ICSI) are working with Microsoft to advance the state of the art in human-computer interaction that relies on speech and other modalities.
"[This] creates a win-win situation," says Roberto Pieraccini, director of ICSI. "Both Microsoft and ICSI have among the best researchers in speech, so collaboration among them has the potential to produce substantial advancements in the field. Microsoft researchers are working with a focus on commercial realizations, while ICSI researchers follow a more academic and curiosity-driven approach. Thus, in this collaboration, ICSI has a unique opportunity to work on specific problems and real data, while Microsoft can benefit [from] the ICSI long-term research vision."
Under the partnership, researchers will use information conveyed by the melody and rhythm of speech, known as prosody, to improve automatic speech understanding.
"One of the first projects is that of understanding how to use speech prosody—the intonation always present in natural speech—to extract information which can be used to identify the intention and the emotional state of the speaker," Pieraccini says. "While there have been several attempts in the past to use prosodic information at the level of speech recognition, we have not been able to use that effectively. In this study we are trying to push those attempts further."
Elizabeth Shriberg and Andreas Stolcke, principal scientists with the Conversational Systems Laboratory (CSL) at Microsoft and ICSI external fellows, will lead the effort. CSL, an applied research group within Microsoft's Online Services Division based at the Microsoft campus in Sunnyvale, Calif., is exploring novel ways to interact naturally with computer systems and services using speech, natural language text, and gesture. Its aim is to enable conversational understanding of users' inputs and intentions across a range of devices, from mobile phones to Xbox consoles in the living room. CSL conducts research spanning a range of scientific disciplines, from acoustic to semantic and affective language processing.