Deepgram Unveils AutoML for Speech Recognition

Speech solutions provider Deepgram today introduced Deepgram AutoML, a tool that streamlines artificial intelligence model development so engineers, data scientists, and others can unlock insights from their audio data. AutoML, a mechanism by which new AI models can be constructed and tuned automatically, has existed for natural language processing and for image and vision tasks, but Deepgram's AutoML is built specifically for automatic speech recognition (ASR).

"With our approach organizations can deploy not only one, but tens or thousands of models trained to the needs of their specific company, target industries, or largest customers in an automated way," Scott Stephenson, CEO of Deepgram, wrote in a blog post.

"As the first company to offer this innovative technology for ASR, we're furthering our mission to be the de facto speech company, offering the world's fastest, most accurate and scalable speech solution. AutoML training capabilities are one of many ways Deepgram enables customers to extract value from their audio and deliver on the vision of an AI-enabled enterprise."

With Deepgram AutoML, data scientists no longer have to do the following:

  • Select input audio features;
  • Denoise audio;
  • Tune hyperparameters of Hidden Markov Models or neural networks;
  • Modify underlying algorithms or architectures;
  • Maintain custom vocabulary lists; or
  • Apply model ensembling with keyword boosting or stacking.

"Deepgram AutoML reduces the time and effort needed to deploy speech recognition, enabling humans to spend more cycles on overall strategy and processes to successfully integrate AI into their organizations. Humans have been, and always will be, an essential part of automating speech recognition, as they are the only ones who can define what accuracy means, derive intuitions about their data, and create or curate new training data. Deepgram AutoML pushes the frontier of how AI helps humans evolve next generation AI," Stephenson explained.

With Deepgram AutoML, customers begin by selecting a specific audio source. Next, they select a Deepgram base model, such as general, phone call, or meeting. Then they select the training method and submit the model for training. After training completes, they review model performance (e.g., accuracy improvement). If additional gains are required, further training teaches models to recognize specific audio examples. Finally, customers select the top-performing model and deploy it to the cloud with one click.
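
The review step in that workflow, comparing candidate models and choosing the top performer, can be illustrated with a short sketch. This is not Deepgram's code or API. It assumes accuracy is measured by word error rate (WER), a common ASR metric that the announcement does not name, and the model names and transcripts are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def pick_best_model(reference: str, transcripts: dict[str, str]) -> tuple[str, dict[str, float]]:
    """Return the model name with the lowest WER, plus all scores."""
    scores = {name: word_error_rate(reference, text) for name, text in transcripts.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical example: one human-verified transcript, two model outputs.
reference = "please transfer me to the billing department"
transcripts = {
    "base_phone_call": "please transfer me to the building department",
    "custom_trained": "please transfer me to the billing department",
}
best, scores = pick_best_model(reference, transcripts)
print(best, scores)
```

In practice the comparison would run over a held-out set of many recordings rather than a single sentence, and the definition of accuracy would be whatever the team decides matters for its data.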

"AutoML is the next frontier for artificial intelligence to allow teams to reach unprecedented levels of accuracy needed to solve business problems. We could not be more excited to be the first to provide AutoML for ASR," Stephenson concluded.
