Google Announces General Availability of Cloud Speech-to-Text, Support for New Languages, and More

According to a Google blog post, “we're making our Cloud Speech-to-Text and Text-to-Speech products more accessible to companies around the world, with more features, more voices (roughly doubled), more languages in more countries (up 50+%), and at lower prices (by up to 50% in some cases).” Last year, Google announced premium models for video and enhanced phone in beta. Now, those models are generally available.

Additionally, Google is announcing the general availability of multi-channel recognition, which helps the Cloud Speech-to-Text API distinguish between multiple audio channels (e.g., different people in a conversation). The feature is useful for call or meeting analytics and other use cases involving multiple participants.
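As a rough illustration of how per-channel recognition might be requested and consumed, the Python sketch below builds a request body and groups transcripts by channel. The field names (`audioChannelCount`, `enableSeparateRecognitionPerChannel`, `channelTag`) reflect the v1 REST API as best understood here and should be checked against Google's current reference; the bucket URI, helper names, and sample response are hypothetical, and no network call is made.

```python
import json
from collections import defaultdict


def build_multichannel_request(gcs_uri, channel_count, language_code="en-US"):
    """Build a JSON-serializable body for a recognize request (sketch only)."""
    return {
        "config": {
            "languageCode": language_code,
            # Number of channels in the input audio (assumed field name).
            "audioChannelCount": channel_count,
            # Ask for each channel to be transcribed separately (assumed field name).
            "enableSeparateRecognitionPerChannel": True,
        },
        "audio": {"uri": gcs_uri},
    }


def transcripts_by_channel(response):
    """Group the top transcript of each result under its channel tag."""
    by_channel = defaultdict(list)
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        by_channel[result.get("channelTag", 0)].append(alternatives[0]["transcript"])
    return dict(by_channel)


# Hypothetical two-channel call recording; the URI is a placeholder.
request_body = build_multichannel_request("gs://example-bucket/call.wav", 2)
print(json.dumps(request_body, indent=2))

# Hand-written sample response for illustration, not real API output.
sample_response = {
    "results": [
        {"channelTag": 1, "alternatives": [{"transcript": "Thanks for calling."}]},
        {"channelTag": 2, "alternatives": [{"transcript": "Hi, I have a billing question."}]},
        {"channelTag": 1, "alternatives": [{"transcript": "Sure, I can help with that."}]},
    ]
}
grouped = transcripts_by_channel(sample_response)
```

Grouping by channel tag is what makes call analytics straightforward: if each speaker is recorded on a separate channel, each channel's transcript list corresponds to one participant.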

Availability isn’t the only way Google is hoping to make Cloud Speech-to-Text more accessible to potential customers. It’s also offering discounts. According to the announcement, “For standard models and the premium video model, customers that opt-in to our data logging program that was introduced last year, will now pay 33% less for all usage that goes through the program.” It continues, “We’ve cut pricing for the premium video model by 25%, for a total savings of 50% for current video model customers who opt-in to data [logging].”
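The "total savings of 50%" figure can be sanity-checked with a little arithmetic. The sketch below assumes the two discounts compound multiplicatively (the 33% data-logging discount applied on top of the 25% price cut); the announcement does not spell out the calculation, so this is an interpretation, and the function name is illustrative.

```python
def combined_discount(price_cut, logging_discount):
    """Return the total fractional saving when two discounts compound."""
    # Fraction of the original price still paid after both reductions.
    remaining = (1 - price_cut) * (1 - logging_discount)
    return 1 - remaining


# 25% price cut on the premium video model, then 33% off for data logging.
total = combined_discount(0.25, 0.33)
print(f"Combined saving: {total:.2%}")
```

Under that assumption the combined saving comes to just under 50%, consistent with the rounded figure in the announcement.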

If you’re interested in giving the service a whirl, you can use the $300 GCP credit to start testing. The first 60 minutes of audio you process every month with Cloud Speech-to-Text are always free.

Google also announced support for seven new languages or variants, and 31 new WaveNet voices (and 24 new standard voices) across those new languages.

Radisys Unveils New Advanced Speech Recognition Support for Enhanced In-Call Speech Services

Radisys says its MediaEngine solution reduces the cost of deploying speech-enabled services by more than 90% and allows CSPs to capitalize on mass-market acceptance of person-to-application interaction via speech.

26 Feb 2019

CallMiner and Morae Global Announce Partnership to Deliver Conversational Behavioral Analytics for Financial Services Risk Mitigation and Regulatory Compliance

Voice surveillance based on conversational content proactively reveals compliance and fraud threats.

21 Feb 2019

