Speech Developer Platforms: Reshaping the Market By Mimicking App Stores

Want to understand what is happening with speech application development platforms? Then take a close look at smartphone application stores. Led mainly by consumer-focused voice application suppliers, developers are now focusing their attention on extending these emerging platforms to support both horizontal and vertical market needs. The building blocks for these changes were largely put in place in 2017 and are expected to take off this year.

As new, often social media–based channels have emerged, interest in traditional voice applications has ebbed. In fact, the speech application development platforms market showed modest growth, in the low single digits, in 2017. But those numbers are starting to shift, and a major upswing is expected in the coming years because application design is changing dramatically. 

In the past, vendors built all of the components needed in voice applications: They supplied the engines, development tools, and application programming interfaces (APIs). But that model is now in flux. “The desire for companies to build speech applications from scratch themselves is waning,” says Chris Connolly, vice president of solution strategy at Genesys, a provider of customer experience and contact center solutions. 

Instead, two types of platforms have emerged, and both are trying to attract third parties, he says.

The first set of solutions comes from big, well-known, extremely deep-pocketed consumer-focused technology suppliers, like Amazon Web Services, Apple, Google, and Microsoft. They have been building out large, multifunctional, often cloud-based voice application development platforms. 

In addition, suppliers like Aspect Software, Avaya, Cisco Systems, Enghouse Interactive, Genesys, Nuance Communications, and Plum Voice have created solutions that focus on enterprise needs. Their systems are used in contact centers, customer relationship management, and sales force automation.

Reshuffling the Deck 

Increasingly, consumer-based systems are providing the basic speech functionality, and the enterprise vendors are migrating their software on top of those platforms. One common foundational item is support for VoiceXML. The standard emerged in 1999, with AT&T, IBM, Lucent, and Motorola as the main drivers. A second version arrived in 2004, and drafts of a third followed through 2010, so the interface is now well understood and widely implemented.

The standard specifies how interactive media and voice dialogues between humans and computers are designed. These interactions can be carried over Voice over Internet Protocol (VoIP) or public switched telephone network (PSTN) lines. VoiceXML applications are developed and deployed much like web pages: a voice browser interprets VoiceXML documents and renders them as audio, in the same way a web browser interprets and visually renders the Hypertext Markup Language (HTML) it receives from a web server. These interactions can be integrated into various applications. They have been used to develop audio and interactive voice response (IVR) applications, such as customer service and automated self-service portals.
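
To make the web-page comparison concrete, here is a minimal sketch of what such a document might look like, generated with Python's standard library. The function name build_greeting_vxml and the greeting text are hypothetical and chosen only for illustration; the sketch is not tied to any vendor's platform, and a real IVR application would add grammars, input fields, and error handling.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C for VoiceXML 2.x documents.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_greeting_vxml(greeting: str, lang: str = "fr-FR") -> str:
    """Return a minimal VoiceXML document that speaks a single greeting.

    Hypothetical helper for illustration; the element layout follows the
    basic vxml > form > block > prompt pattern.
    """
    # Root element: declares the VoiceXML version, namespace, and language.
    vxml = ET.Element(
        "vxml",
        {"version": "2.1", "xmlns": VXML_NS, "xml:lang": lang},
    )
    # A form is the basic dialog unit; a block holds executable content.
    form = ET.SubElement(vxml, "form", {"id": "greeting"})
    block = ET.SubElement(form, "block")
    # The prompt text is what the voice browser reads to the caller.
    prompt = ET.SubElement(block, "prompt")
    prompt.text = greeting
    body = ET.tostring(vxml, encoding="unicode")
    return '<?xml version="1.0"?>\n' + body


if __name__ == "__main__":
    print(build_greeting_vxml("Bonjour et bienvenue."))
```

A voice browser that fetched this document would speak the prompt text to the caller, much as a web browser would display a paragraph of HTML.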

With VoiceXML as the foundation, the voice application market is starting to resemble that of mobile apps: a few suppliers offering millions of applications in their online stores. The large vendors supply base functionality and then third parties build on top. The third-party applications can be as simple as a French female customer greeting or as complex as contact center call routing.

Right now, this market is in a fledgling state, so the platforms and even the nomenclature are unfinished and in varying stages of development. 

“Amazon seems to be leading the pack,” says Deborah Dahl, principal at Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interaction Working Group. Amazon’s software applications, which it calls “skills,” numbered just 10,000 at the start of 2017. By the end of the year, that number had grown to 24,000.

Initially, Google termed its software “actions” but later renamed them “apps.” Microsoft calls its solutions “Cortana skills,” highlighting their interoperability with its Cortana voice assistant. But for all their efforts, Google, Microsoft, et al. were still putting a lot of the pieces for their platforms in place in 2017. Consequently, they do not have as many third-party add-ons as Amazon does. 
