The solution provides advances in translation software via a combination of statistical and rule-based methodologies.
AppTek released the industry's first complete hybrid machine translation (HMT) system, a full integration of statistical and rule-based translation methodologies.
“The special thing about hybrid machine translation is the close integration right at the core of the rule-based system and the statistical machine translation system,” says Braddock Gaskill, software architect at AppTek.
Gaskill credits AppTek’s significant experience with rule-based translation, together with its state-of-the-art statistical translation system, for giving the company the expertise to bring the two together in hybrid form.
Machine translation (MT) is typically implemented via one of two methodologies. Statistical systems, which produce highly fluid but less literal output, learn from example translations and apply statistical techniques to language data. Rule-based systems use grammatical knowledge based on linguistic rules defined by users, resulting in translated information that is often more accurate, but harder to read and at times robotically literal.
According to Hassan Sawaf, chief scientist at AppTek, the company’s hybrid model is unlike any other system on the market today, a fact that has led some universities to attempt to copy the hybrid model.
“Even if companies attempt to hybridize they only do hybridization insofar as that they basically combine translation memory with machine translation,” he says. “Hybridization like we do and a tight integration of rule-based features and statistical-based features are unique.”
AppTek's HMT solution provides a full integration of both methodologies instead of simply adding rules to the statistical system or a minor statistical module to the rule-based engine.
“By combining the rule-based system with the statistical, you sort of bring the literal meaning over into the statistical output and the combination becomes very strong,” Gaskill adds.
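To make the idea of combining the two signals concrete, here is a toy sketch in Python. It is not AppTek's system, whose internals the article does not describe. It assumes a simple log-linear combination in which each candidate translation carries a hypothetical statistical fluency score (`lm_prob`) and a hypothetical rule-based agreement score (`rule_match`); the names, weights, and numbers are all invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate translation with two illustrative (made-up) scores."""
    text: str
    lm_prob: float     # hypothetical fluency probability from a statistical model
    rule_match: float  # hypothetical share of rule-based constraints satisfied (0..1)


def hybrid_score(c: Candidate, w_stat: float = 1.0, w_rule: float = 1.0) -> float:
    """Weighted sum of log scores (a log-linear combination, assumed here)."""
    eps = 1e-9  # guards against log(0)
    return w_stat * math.log(c.lm_prob + eps) + w_rule * math.log(c.rule_match + eps)


def best_candidate(cands, w_stat: float = 1.0, w_rule: float = 1.0) -> Candidate:
    """Return the candidate with the highest combined score."""
    return max(cands, key=lambda c: hybrid_score(c, w_stat, w_rule))


# Invented candidates: one fluent but loose, one literal but stiff, one in between.
candidates = [
    Candidate("fluent but loose", lm_prob=0.30, rule_match=0.40),
    Candidate("literal but stiff", lm_prob=0.05, rule_match=0.95),
    Candidate("balanced", lm_prob=0.20, rule_match=0.90),
]

if __name__ == "__main__":
    print(best_candidate(candidates).text)
```

Setting `w_rule=0` in this sketch reduces it to a purely statistical ranking, and setting `w_stat=0` reduces it to a purely rule-driven one; with both weights active, a candidate that is reasonably fluent and reasonably literal can outrank either extreme.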
According to AppTek, the hybrid approach to MT improves translation of large volumes of speech and text compiled from a variety of sources, and helps linguists, translators, and analysts become more productive more quickly and more cost-effectively.
“You get the benefit of a better translation overall,” Gaskill says. “And you stay closer to the facts. It’s a big advantage over certain other statistical systems.”
According to AppTek, the three metrics used to measure the efficacy of a translation system—fluency, informativeness, and adequacy—are all supported in the hybrid system, with greater performance, quality, and accuracy.
“Those three components are supported by the hybridization of the system because we gain fluidity from the statistical element, we gain informativeness from the rule-based element, and adequacy basically comes over from [both the statistical and rule-based elements],” Gaskill says. “The combination is very strong so all three metrics are helped by it.”
Sawaf says the reaction to the hybrid system—which was in beta for about a month prior to the release—has been very positive from both new and existing customers. And Gaskill notes that AppTek isn’t yet finished with its HMT.
“Almost every month we’re adding a new language pair to the hybrid system,” he says. “We’re taking our old rule-based systems and recombining them with new data and new trainings and building new hybrid systems based off of them.”