Meta Develops Speech-to-Speech Translation for Oral Language

Meta last week introduced a speech-to-speech translation system powered by artificial intelligence for Hokkien, a primarily oral language spoken within the Chinese diaspora.

The translation system is part of Meta's Universal Speech Translator project, which is developing AI methods to allow real-time speech-to-speech translation across many languages.

Because Hokkien doesn't have a standard written form, producing transcribed text as the translation output doesn't work, so Meta focused on speech-to-speech translation. The company developed a variety of methods, such as using speech-to-unit translation to translate input speech into a sequence of acoustic units, from which it then generated waveforms. It also relied on text from a related language, in this case Mandarin Chinese.

The Hokkien translation model is still a work in progress and can translate only one full sentence at a time.

Along with the model, Meta released SpeechMatrix, a large collection of speech-to-speech translations developed through its natural language processing toolkit, LASER. These tools will enable other researchers to create their own speech-to-speech translation systems.
