Two Mics Are Better than Seven According to Amazon

In a post on its Alexa blog, Amazon says it has developed a new acoustic modeling framework that can boost performance by unifying speech enhancement and speech recognition. The blog says: “The unified acoustic model is optimized solely on the speech recognition criterion. In experiments, we found that a two-microphone system using our new model reduced the ASR error rate by 9.5% relative to a seven-microphone system using existing beamforming technology.”
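To make the size of that improvement concrete, the short Python sketch below shows how a relative error-rate reduction translates into absolute terms. The 10% baseline error rate is purely a hypothetical figure chosen for illustration; Amazon's post, as quoted here, gives only the relative number.

```python
def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    return (baseline_error - new_error) / baseline_error

# Hypothetical baseline: a seven-microphone system with a 10% error rate.
# This value is an assumption for illustration, not a figure from Amazon.
baseline_wer = 0.100

# A 9.5% relative reduction scales the baseline error rate by (1 - 0.095).
new_wer = baseline_wer * (1 - 0.095)

print(f"baseline: {baseline_wer:.2%}, two-mic system: {new_wer:.2%}")
print(f"relative reduction: {relative_reduction(baseline_wer, new_wer):.1%}")
```

Under that assumed baseline, the two-microphone system's error rate would be 9.05%, an absolute gap of just under one percentage point.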

Amazon detailed its findings in a pair of papers at this year’s International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

The company explains: “Classical beamforming technology is intended to steer a single beam in an arbitrary direction, but that’s a computationally intensive approach. With the Echo smart speaker, we instead point multiple beamformers in different directions and identify the one that yields the clearest speech signal. That’s why Alexa can understand your request for a weather forecast even when the TV is blaring a few yards away.”
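The multiple-beamformer idea Amazon describes can be illustrated with a toy Python sketch. It is not Amazon's implementation: it assumes a two-microphone array, simple delay-and-sum beamformers that each apply a fixed whole-sample delay, and output energy as a stand-in for "clearest speech signal." The function names and the test signal are hypothetical.

```python
import random

def delay_and_sum(mic_a, mic_b, delay):
    """One fixed beam: average mic_a with mic_b advanced by `delay` samples."""
    out = []
    for n in range(len(mic_a)):
        m = n + delay
        b = mic_b[m] if 0 <= m < len(mic_b) else 0.0  # zero outside the buffer
        out.append(0.5 * (mic_a[n] + b))
    return out

def energy(signal):
    """Sum of squared samples, used here as a crude clarity score (assumption)."""
    return sum(s * s for s in signal)

def pick_best_beam(mic_a, mic_b, candidate_delays):
    """Run every fixed beam and return the delay whose output scores highest."""
    scored = [(energy(delay_and_sum(mic_a, mic_b, d)), d) for d in candidate_delays]
    best_energy, best_delay = max(scored)
    return best_delay

# Toy input: the same noise-like source reaches mic_b three samples after mic_a.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mic_a = source
mic_b = [0.0] * 3 + source[:-3]

best = pick_best_beam(mic_a, mic_b, candidate_delays=range(-5, 6))
print("selected beam delay (samples):", best)
```

The beam whose delay matches the source's arrival offset adds the two channels coherently and so scores highest. In a room with a loud television, raw energy would favor the loudest source rather than the talker, so a practical system would need a score tied more closely to speech quality.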

Google Introduces All-neural, On-device Speech Recognizer to Power Speech Input in Gboard

Google revealed that it has made speech-to-text transcription available offline for Gboard users.

04 Apr 2019
