NVIDIA Joins Mozilla Common Voice Initiative
NVIDIA, at its Speech AI Summit, introduced a speech artificial intelligence ecosystem built with Mozilla Common Voice to develop automatic speech recognition models that work across languages worldwide. The new ecosystem focuses on building crowdsourced multilingual speech corpora, releasing open-source pretrained models, and expanding the speech data available for low-resource languages.
According to NVIDIA, the initiative will focus on helping AI models understand speaker and language diversity, different accents, and different noise profiles. Developers will be able to train their models on Mozilla Common Voice datasets and then offer those pretrained models as open-source automatic speech recognition architectures.
The Mozilla Common Voice platform currently supports 100 languages, including six recent additions (Taiwanese, Bengali, Cantonese, Tigre (Eritrean), Meadow Mari, and Toki Pona), and offers more than 24,000 hours of speech data from 500,000 contributors worldwide, including a growing number of female speakers.
Through the Mozilla Common Voice platform, users donate audio data by recording sentences as short voice clips, which Mozilla validates upon submission to ensure dataset quality.
Funding goes to eight projects that use voice technology to advance financial inclusion, access to reliable information, and legal rights for marginalized communities.
Low-resource languages in South Africa and elsewhere are finally catching the attention of speech tech developers.