Automatic Dialogue Replacement Will Translate to Big Profits


The astonishing breakthrough known as voice cloning—in which, thanks to artificial intelligence and related technologies, a speaker’s voice can be morphed into dozens of languages instantly—will change the media landscape as we know it. And one of the more obvious and potentially lucrative applications of this technology is automatic video/movie dialogue replacement, also known as voice-to-voice translation or revoicing.

Indeed, such a step forward seems inevitable, as a movie or show with voice-to-voice translation should draw far bigger audiences than one whose only translation option is subtitles. Such revoicing may bring the strongest financial rewards in under-served languages: the global lack of content in those languages means there is pent-up demand for voiced content that caters to their speakers.

Once the basic technical setup and partnerships are in place for automatic translation, the direct tech costs for translating streaming programs will be $35 to $55, added to which will be the negotiated creator royalties and residuals, plus profit share with software partners for translated versions. To achieve the highest-quality results from automation, add $85 per runtime hour for human participation during the captioning and final review stages, both of which require special training because the methodologies are unique for automatic voice translation.
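
As a rough illustration of the arithmetic above, the following Python sketch totals the direct costs for one translated program. The article does not state the unit for the $35 to $55 tech cost, so the sketch assumes it applies per runtime hour, like the $85 human-review figure. The function and class names are hypothetical, and royalties, residuals, and partner profit share are left out because they are negotiated case by case.

```python
from dataclasses import dataclass

# Figures quoted in the article. ASSUMPTION: the tech cost is per runtime hour;
# the article only states that unit for the human-review figure.
TECH_COST_LOW = 35.0
TECH_COST_HIGH = 55.0
HUMAN_REVIEW_PER_HOUR = 85.0


@dataclass
class CostRange:
    low: float
    high: float


def estimate_direct_cost(runtime_hours: float, with_human_review: bool = True) -> CostRange:
    """Hypothetical helper: direct cost range for one translated version.

    Excludes royalties, residuals, and software-partner profit share.
    """
    if runtime_hours < 0:
        raise ValueError("runtime_hours must be non-negative")
    human = HUMAN_REVIEW_PER_HOUR if with_human_review else 0.0
    return CostRange(
        low=runtime_hours * (TECH_COST_LOW + human),
        high=runtime_hours * (TECH_COST_HIGH + human),
    )


if __name__ == "__main__":
    estimate = estimate_direct_cost(runtime_hours=2.0)
    print(f"${estimate.low:.2f} to ${estimate.high:.2f}")
```

Under these assumptions, a two-hour program with human review comes to $240.00 to $280.00 in direct costs; if the tech cost turns out to be charged per program or per language instead, only the two constants and the multiplication need to change.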

Which Types of Content Have the Most Revenue Potential?


Telenovelas

Translating telenovelas (non-U.S. soap operas, made primarily, but not exclusively, in Latin American countries) holds high revenue potential, since nearly all of them have never been translated into the target languages. They are hugely popular worldwide, garnering 5 percent to 20 percent of most countries’ media-watching population. In 2021, Televisa generated $1.6 billion in revenue from telenovelas, with 25 million unique visitors to its content website in May 2021 alone. This single company has a library of 87,000 telenovelas. TV Telemundo, owned by NBCUniversal, is now available in 48 countries, including on the pay-TV satellite platform DStv, owned by South African firm MultiChoice. Turkish-produced telenovela series are due to exceed $600 million in sales this year, airing in 150 countries. India also produces reams of telenovelas, few of which are translated.

Only some telenovelas are translated, and thus far only a fraction of the total have been revoiced into other languages; even fewer are professionally voiced by talent in under-served languages. Overall, the amount of existing translation is tiny compared with the potential. Importantly, telenovela producers can no longer afford to pay heavily for translations. They are in a serious crisis brought on by the pandemic, which pushed many production companies into bankruptcy because they could not film new episodes. Viewership rose sharply during the pandemic, then dropped like a rock afterward. All of these factors are opening a wide door for new approaches to affordable translation.


Bollywood

Bollywood releases 150 to 200 films per year. Its movie libraries are filled with musical productions, but there is now a strong movement toward non-musical movie production, an effort that is expected to grow and that will produce content more amenable to automatic revoicing. Some distributed movies are subtitled in some languages, but almost none are revoiced. Automatic revoicing may be the answer to more effective globalization of Bollywood’s non-musical content. Distributors will need to caption only one time in a special way, then use those captions to drive dialogue replacement. The result will be content available in dozens of languages, fully lip-synced and revoiced, from a single captioning effort.

Nature Shows

According to the BBC, more than 1 billion people watched Planet Earth II and Blue Planet II in the past three years. Disney and Apple are investing heavily in wildlife programming as part of their efforts to lure subscribers to their streaming services. In recent years, nature shows have flourished on cable and public broadcast networks, with roughly 130 original nature series airing in 2019, more than the previous three years combined, according to Nielsen. This trend was bolstered by Disney’s $71.3 billion acquisition of Fox’s entertainment assets in 2019, a deal that included National Geographic, which now has hundreds of hours of content on the Disney+ streaming service.

Libraries of Past TV Shows and Movies

Given that TV shows have been around for 70 years, and not just in English (and movies much longer), there is a huge backlog to draw from and translate into major and minor languages. Not all of it will be applicable to every culture, and not all of it will be available for translation, but the content that is available makes for very interesting revenue possibilities, especially when translated cheaply through automation. Even semi-modern blockbuster movies and shows may be ripe for auto-translation into languages for which translations are rarely provided, such as Setswana, Croatian, Marathi, and Swahili.

Corporate Media

Revenue from this sector—whose content includes corporate videos, CEO messaging, podcasts, training, and similar material—is not to be underestimated. This sector alone will ensure sufficient revenue so that investors can avoid financial losses during the development stage.

Partnerships Are Key

During dialogue replacement, video elements will be altered, and these alterations will be peppered throughout a program. To control the costs of automatic translation, deals will be struck with, or buyouts made of, the developers of the various software used in the automatic translation process. The list of partners will include makers of these applications:

  1. Voice cloning software, used either on a recurring-role basis or to create voice-cloned text-to-speech (TTS).
  2. AI TTS software, used to create multiple emotional variations until TTS can be programmatically emotive.
  3. Speed matching software, which re-times video frames to coordinate with the translated TTS timing.
  4. Voice identification software, to identify changes in speaker and gender.
  5. Automatic lip sync software, to adjust the onscreen lips to the new language.
  6. Foley filler software. If the music-and-effects track is not separate, this software extracts background ambient sound obtained from the original soundtrack during sequences with no dialogue, then transfers and multiplies it into the translated version.
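
To make the way these six components could fit together more concrete, here is a minimal Python sketch of a revoicing pipeline. It is an illustration only: the stage order, the `RevoicingJob` record, and every function name are assumptions rather than anything specified by the vendors or by this article, and each stage is a placeholder that merely records its name where real partner software would be called. The one rule taken directly from the list is that Foley filler runs only when the music-and-effects track is not separate.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RevoicingJob:
    """Hypothetical record passed between stages; it only tracks what has run."""
    title: str
    target_language: str
    has_separate_me_track: bool  # is the music-and-effects track delivered separately?
    completed: List[str] = field(default_factory=list)


Stage = Callable[[RevoicingJob], RevoicingJob]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage; a real one would call the partner's software."""
    def run(job: RevoicingJob) -> RevoicingJob:
        job.completed.append(name)
        return job
    return run


def build_pipeline(job: RevoicingJob) -> List[Stage]:
    # ASSUMPTION: this ordering is chosen for illustration only.
    names = [
        "voice_identification",  # detect changes in speaker and gender
        "voice_cloning",         # cloned voices for the detected speakers
        "emotive_tts",           # generate emotional variations of each line
        "speed_matching",        # re-time video frames to the translated audio
        "lip_sync",              # adjust onscreen lips to the new language
    ]
    # Foley filler is needed only when there is no separate music-and-effects track.
    if not job.has_separate_me_track:
        names.append("foley_filler")
    return [make_stage(n) for n in names]


def run_pipeline(job: RevoicingJob) -> RevoicingJob:
    for stage in build_pipeline(job):
        job = stage(job)
    return job


if __name__ == "__main__":
    job = run_pipeline(RevoicingJob("Sample Episode", "sw", has_separate_me_track=False))
    print(job.completed)
```

Keeping each stage behind the same small interface is one way to swap a partner's product in or out without touching the rest of the chain, which matters if deals or buyouts change which software is used.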

Media translation as an investment will almost certainly not fail. At absolute worst, investors will get their money back for any development funds invested, whatever the dollar amount. At best, the new-audience floodgates will open for content streamers.

Creating a movie translation does not relieve the translator of responsibility for paying royalties and residuals to the media creator. With automation, the royalties and residuals will be significantly less expensive, but they remain a requirement. In years to come, the cost of automatic revoicing will be so low that profits can indeed be shared in a fair manner without financial pain.

Sue Reager specializes in across-language speech communication, applications, and context engines. Her innovations are licensed by Cisco Systems, Intel, and telecoms worldwide. For 20-plus years Reager was responsible for translating television shows, movies, cartoons, documentaries, and corporate videos in Europe, Africa, South America, and the United States.
