iQIYI Holds Voice Cloning Challenge

iQIYI, a Chinese online entertainment service, has partnered with multiple organizations to hold the Multi-Speaker, Multi-Style Voice Cloning Challenge through Feb. 11.

The contest aims to enhance the quality of synthetic speech while reducing dependence on training datasets. iQIYI hopes participants can improve the intelligibility and naturalness of synthetic speech even when resources are limited.

The competition comprises two categories: 'few-shots' and 'one-shot.' Target speakers for voice cloning validation and evaluation are provided for both categories.

In the few-shots category, each speaker has a different speaking style with 100 available samples. In the one-shot category, each speaker has a different speaking style with only five samples. For both categories, contestants will be provided with two base datasets for base model training, with each dataset containing 5,000 training samples in different speech styles. Winners will be selected in each category based on speaker similarity, speech quality, style/expressiveness and pronunciation accuracy.

Through this challenge, iQIYI hopes to team up with researchers and build solutions for low-resource voice cloning with advanced deep-learning technology and multi-stylistic voice morphing technology.
