GALE Winds Blow at BBN

BBN Technologies announced today that it has received an additional $5 million in funding from the Defense Advanced Research Projects Agency (DARPA) for the Global Autonomous Language Exploitation (GALE) program. GALE’s goal is to develop software technologies that analyze large volumes of speech and text in multiple languages, using engines that automatically process the language and consolidate it for military personnel and English-speaking analysts. The program is currently in year three of a projected five years. Other participants include IBM and SRI.

During the first two years of the program, BBN met or exceeded annual accuracy goals for translation of foreign-language newswire text and broadcast news. In January, the company was awarded $13 million to continue its research on processing Arabic speech and text. The additional $5 million will allow BBN to meet its accuracy goals in Mandarin Chinese.

Later this summer, BBN and the other participating vendors will have to meet target benchmarks. These include 80 percent accuracy on 80 percent of spoken Mandarin content and 85 percent accuracy on 85 percent of spoken Arabic content. GALE’s ultimate goal is 95 percent accuracy within two and a half years.

While this seems steep, it’s par for the course when dealing with DARPA’s requirements. "All DARPA-funded research is about making huge jumps," says Prem Natarajan, vice president of speech solutions at BBN. "Every time we’re doing a DARPA project, it looks unrealistic when we start it. It looks really challenging when we’re halfway through it. And it looks like a pretty significant accomplishment when we’re done with it."

Ultimately, DARPA’s goal is what amounts, in theory, to a universal translator. The automatic translation technology is based on an algorithmic approach rather than a phoneme-based one, and DARPA is primarily interested in the technical challenges that a given language poses. "One key requirement is that the technology is language-independent, which means you don’t have to reinvent a solution for every new language," Natarajan says. "When you have a new language, it configures the software with the new data so you don’t need linguistic experts."

While BBN has been working on Chinese since the late 1990s, interest in the language, according to Natarajan, has increased immensely during the past six years. This may be due partly to political interest in the region and partly to the fact that Chinese, like Arabic, differs significantly from alphabetic languages.

Currently, BBN’s work focuses primarily on Arabic, Mandarin Chinese, and English. "English language modeling continues to be an important part," Natarajan adds. "Without good English models, we can’t translate into English."
