What is end-to-end deep learning STT?

Sponsored by Deepgram

Adoption of speech-to-text (STT) and voice technology in general has been held back by high costs: both standard STT fees and the computing costs behind them. Legacy tri-gram models with add-on AI require both CPU and GPU resources. Because of these costs, businesses transcribe only a sample of their calls, and the same costs keep companies from adopting voicebots or full call analytics. To solve these business problems and move voice technology past the “Trough of Disillusionment” to the “Slope of Enlightenment,” companies such as Deepgram took a different direction and used deep learning to build an end-to-end STT solution that does not rely on the legacy tri-gram model at all. End-to-end deep learning STT uses only neural networks to transcribe audio to text in one step: audio goes in, and text and metadata come out. The technical architecture is what makes the difference. Continue reading to learn more!
