The 2017 State of the Speech Technology Industry: Speech Developer Platforms

• IBM Watson. Named one of the strongest software platforms for the Internet of Things this year by Forrester Research, the Watson IoT platform can “serve a broad range of advanced IoT use cases,” according to the research firm. Forrester also posited that IBM “is well positioned for market leadership.” IBM Watson is capable of speech-to-text and text-to-speech and is accessed through a RESTful API or WebSocket. Standard use of the API is free for the first 1,000 minutes of transcription per month, with a scaling premium service tailored to meet developer needs and budgets.

• Amazon Alexa Skills. Amazon offers a diverse set of APIs for building on Alexa, including the Alexa Voice Service, which allows developers to add and customize Alexa’s voice control functions for everything from turning on the lights to ordering a pizza to finding a sports score. For developers who want to work with the technology that enables Alexa and the Echo to recognize wake words and speech cues in high-noise environments, a developer kit is available for Amazon partner Conexant’s AudioSmart CX20921 Voice Input Processor.

• Google Cloud Speech API. Google provides its own API for building speech recognition and voice transcription into developer products. Speech recognition can be tailored by context and is noise-robust, with support for more than 80 languages. Thanks to Google’s machine learning, accuracy improves as the data set of utterances increases. Google offers per-minute pricing with discounts based on usage.

• Avaya Breeze. Avaya offers the Avaya Breeze platform to enable developers to add its functionality into the service end of any product. Breeze features what Avaya calls “Snap Ins”—applications that increase the capabilities of the Breeze platform within a developer product. Of special note to developers is the Real-Time Speech Snap In, which allows speech search and analysis and provides speech recognition and text-to-speech capabilities based on Nuance technology.

• Genesys Voice Platform (GVP). Genesys gives developers a VoiceXML-based set of standards for designing automated speech solutions, aimed specifically at interactive voice response (IVR) and chatbots.

• SoundHound Houndify. For developers looking for broader-ranging applications from their speech platforms, there’s SoundHound’s Houndify, which offers an API that provides speech recognition and natural language understanding with context, enabling complex queries and follow-up capabilities for intelligent assistants. Developers who want to build speech-enabled assistants with human-like interfaces quickly may find it a good fit.

• Wit.ai. For developers who want an open-source solution for creating voice-controlled devices, Wit.ai, which Facebook acquired in early 2015, is an attractive option. Wit.ai is a small collection of APIs that can be used over HTTP or from Python, Ruby, and Node.js. Use is free, with no specified rate limit on accessing the resources, provided that developers who plan on using the tools heavily contact Wit.ai ahead of time.
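
Several of the platforms above price by the minute, and IBM Watson's free tier covers the first 1,000 minutes of transcription per month. As a rough illustration of how a developer might budget against that allowance, here is a minimal Python sketch. The function names are hypothetical, and the flat `rate_per_minute` is an assumption for illustration only: the article describes the premium tier as scaling, and no actual price is given here.

```python
# Free monthly allowance cited above for IBM Watson transcription.
FREE_MINUTES_PER_MONTH = 1000


def billable_minutes(minutes_used: float,
                     free_minutes: float = FREE_MINUTES_PER_MONTH) -> float:
    """Return the minutes that fall outside the monthly free allowance."""
    if minutes_used < 0:
        raise ValueError("minutes_used must be non-negative")
    return max(0.0, minutes_used - free_minutes)


def estimated_monthly_cost(minutes_used: float, rate_per_minute: float) -> float:
    """Estimate a monthly bill.

    rate_per_minute is a hypothetical flat rate supplied by the caller;
    it is not a published price, and real tiered pricing would differ.
    """
    return billable_minutes(minutes_used) * rate_per_minute
```

Under these assumptions, a month with 800 minutes of transcription stays inside the free tier, while 1,500 minutes leaves 500 minutes to be billed at whatever rate applies.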

Tye Pemberton is a freelance writer based in Savannah, Ga. He can be reached at tyepemberton@gmail.com.


