Mozilla Announces Open Source Common Voice Speech Recognition Datasets

Mozilla has announced an expansion of its crowdsourced Common Voice project. The project, which is about a year old, is building an open source speech recognition dataset, and it is now opening up to more languages. Mozilla wants volunteers from around the world to record themselves reading short snippets of text through a web or mobile app.

According to VentureBeat, “Mozilla launched the first fruits of its Common Voice datasets in English back in November, a collection that contained some 500 hours of speech and constituted 400,000 recordings from 20,000 individuals. Today, Mozilla officially kick starts the process of collecting voice data for three more languages — French, German, and — a little randomly — Welsh. Another 40 tongues are currently being prepped for the data collection process, with the likes of Brazilian Portuguese, Chinese (Taiwan), Indonesian, Polish, and Dutch already halfway toward being ready to start crowdsourcing voice data.”
