Amazon Says Semi-Supervised Learning Builds Better Speech Recognition

Last week, Amazon announced that it has used a large unlabeled data set to expose Alexa to a variety of human sounds in an effort to improve speech recognition. The data set is thought to be one of the largest ever used to train an acoustic model. Amazon said in a blog post, “We developed an acoustic model, a key component of a speech recognition system, using just 7,000 hours of annotated data and 1 million hours of unannotated data.”

The Amazon scientists say they’re using semi-supervised learning, which trains artificial intelligence engines on a combination of sounds tagged by human beings and sounds tagged by machines. According to Amazon, the result was a 10% to 22% reduction in speech recognition errors. The scientists say this method works better than training only on the human-annotated sounds.

According to the blog post, “Compared to a model trained only on the annotated data, our semi-supervised model reduces the speech recognition error rate by 10% to 22%, with greater improvements coming on noisier data. We are currently working to integrate the new model into Alexa, with a projected release date of later this year.”
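For readers who want a concrete picture of semi-supervised training, the sketch below shows one generic form of it, often called self-training or pseudo-labeling. It is a toy illustration only, not Amazon's pipeline, which the blog post excerpts above do not describe in detail. The nearest-centroid "model," the one-number "sounds," the `quiet` and `loud` labels, and the `min_margin` threshold are all assumptions made up for the example.

```python
from statistics import mean


def fit_centroids(samples):
    """Fit a toy nearest-centroid model: one mean feature value per label."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Return (label, margin), where margin is how much closer the best
    centroid is than the runner-up. A small margin means low confidence."""
    ranked = sorted(centroids, key=lambda label: abs(value - centroids[label]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(value - centroids[runner_up]) - abs(value - centroids[best])
    return best, margin


def self_train(annotated, unannotated, min_margin=1.0):
    """One round of pseudo-labeling (hypothetical helper for illustration).

    1. Train a "teacher" on the human-annotated samples only.
    2. Let the teacher label the unannotated samples, keeping only
       the predictions it is confident about.
    3. Train a "student" on the annotated plus machine-labeled samples.
    """
    teacher = fit_centroids(annotated)
    pseudo_labeled = []
    for value in unannotated:
        label, margin = predict(teacher, value)
        if margin >= min_margin:
            pseudo_labeled.append((value, label))
    student = fit_centroids(annotated + pseudo_labeled)
    return student, pseudo_labeled


# Made-up data: a small annotated set and a larger unannotated one.
annotated = [(0.0, "quiet"), (1.0, "quiet"), (9.0, "loud"), (10.0, "loud")]
unannotated = [0.5, 2.0, 5.0, 8.0, 11.0]

student, pseudo_labeled = self_train(annotated, unannotated)
print(pseudo_labeled)
print(student)
```

In this toy run the sample at 5.0 sits exactly between the two groups, so the teacher's prediction falls below the confidence threshold and that sample is left out of the student's training set. Real acoustic models are far larger and work on audio features rather than single numbers, but the same split between a small human-labeled set and a large machine-labeled set applies.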

Providence St. Joseph Health Makes Same-Day Express Care Appointment Scheduling Available On Amazon Alexa

With the voice request, "Alexa, open Providence Health Connect," or "Alexa, open Swedish Health Connect," consumers can schedule Express Care appointments.

09 Apr 2019

Yappa Debuts First Audio/Video Social Commenting Tool to Encourage Less Toxic Online Interactions

Free new plugin makes it fun and easy for site visitors to use voice or video to record quick comments.

09 Apr 2019
