Two Mics Are Better than Seven According to Amazon
In a post on its Alexa blog, Amazon says it has developed a different acoustic modeling framework that can boost performance by unifying speech enhancement and speech recognition. The blog says: “The unified acoustic model is optimized solely on the speech recognition criterion. In experiments, we found that a two-microphone system using our new model reduced the ASR error rate by 9.5% relative to a seven-microphone system using existing beamforming technology.”
Amazon detailed its findings in a pair of papers at this year’s International Conference on Acoustics, Speech, and Signal Processing.
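The 9.5% figure in the quote is a relative reduction, not a drop of 9.5 percentage points. A minimal sketch of the arithmetic follows; the baseline error rate is a hypothetical value chosen for illustration, since the blog post as quoted here does not give absolute error rates.

```python
def apply_relative_reduction(baseline_error_rate, relative_reduction):
    """Return the error rate after a relative reduction (0.095 means 9.5%)."""
    return baseline_error_rate * (1.0 - relative_reduction)


# Hypothetical seven-microphone baseline of 10% errors (not from the source).
baseline = 0.10
two_mic = apply_relative_reduction(baseline, 0.095)
print(f"baseline: {baseline:.4f}, two-mic system: {two_mic:.4f}")
```

Under that assumed baseline, the two-microphone system would sit a little under one percentage point lower, which is why relative and absolute reductions should not be confused.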
The company explains: “Classical beamforming technology is intended to steer a single beam in an arbitrary direction, but that’s a computationally intensive approach. With the Echo smart speaker, we instead point multiple beamformers in different directions and identify the one that yields the clearest speech signal. That’s why Alexa can understand your request for a weather forecast even when the TV is blaring a few yards away.”
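The idea of pointing several fixed beamformers in different directions and keeping the best one can be sketched in a few lines. This is an illustration only, not Amazon's implementation: it assumes simple delay-and-sum beams with whole-sample delays, a simulated four-microphone array, and output energy as a stand-in for "clearest speech signal" (the quoted passage does not say which criterion Amazon uses). All function and beam names are hypothetical.

```python
import random


def delay(signal, n):
    """Shift a signal right by n samples (n >= 0), zero-padding the start."""
    return [0.0] * n + signal[:len(signal) - n]


def delay_and_sum(channels, steering_delays):
    """Delay each microphone channel by its steering delay, then average."""
    aligned = [delay(ch, d) for ch, d in zip(channels, steering_delays)]
    return [sum(samples) / len(aligned) for samples in zip(*aligned)]


def energy(signal):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


def pick_best_beam(channels, beams):
    """Run every fixed beam and return the name of the highest-energy one.

    Energy is an assumed proxy for signal clarity, used only for this sketch.
    """
    outputs = {name: delay_and_sum(channels, d) for name, d in beams.items()}
    return max(outputs, key=lambda name: energy(outputs[name]))


def demo():
    rng = random.Random(0)
    # Broadband source standing in for speech.
    source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    # The wavefront reaches each successive microphone 2 samples later.
    arrival = [0, 2, 4, 6]
    channels = [
        [s + rng.gauss(0.0, 0.5) for s in delay(source, a)]  # add mic noise
        for a in arrival
    ]
    # Three fixed beams; "left" compensates exactly for the arrival delays.
    beams = {
        "left": [6, 4, 2, 0],
        "front": [0, 0, 0, 0],
        "right": [0, 2, 4, 6],
    }
    return pick_best_beam(channels, beams)


if __name__ == "__main__":
    print(demo())
```

Because the beams are fixed in advance, the per-frame work is limited to running each beam and comparing scores, rather than re-steering a single beam toward an arbitrary direction.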
Google revealed that it has made spoken-to-written text translation available offline for Gboard users.