Speech: It's Not Different, Stupid

With Bill Gates as a headliner at a speech show, the question naturally arises as to whether speech technology will become more widely available, riding Microsoft’s ‘low-price/high-volume’ software model. That model, which has given rise to a monoculture at the operating system and server layers of the IT infrastructure, encourages uniformity of middleware components and development methodologies. It is also a 180-degree turnabout from where speech in the enterprise and service provider market has been for the past decade.

While it was interesting to discuss the art of speech, and revel in the fictive lives of speech personas, doing so hasn’t been particularly rewarding commercially. Indeed, several high-profile voice industry train wrecks (Sprint’s Claire, AOLByPhone) have taught the hard lesson that pushing technology and users too hard can lead to very bad reviews.

But forget the critics; let’s focus on economics. In a recent survey of speech implementers by Zelos Group, we found that the vast majority of implementers were in the single-digit range in terms of speech applications deployed or scheduled to be deployed in 2004. Likewise, in a recent survey of over 200 carrier speech programs across four continents, Zelos Group found that just under 80 percent of carriers were deploying only one application, and 17 percent only two.

Imagine selling your CIO or CFO a million-dollar program for a Web site that did one thing – you’d be laughed out of the room. So what, really, are we looking at with speech customers who are able to pull two or three apps a year out of the hat, if that? Why isn’t the pace of application throughput faster, more like the Web? "Speech is different." Three words that we’ve heard all too often, and words that should strike dread in the breast of anybody who wants to see speech become ubiquitous. Bottom-line realities dictate that job No. 1 should be to make speech less different, and more like the Web that today’s application servers drive – the same open-standards servers can drive speech apps. A lot of this has to do with process, but just as important is the need to bring speech into the monoculture of the enterprise IT infrastructure.

On the process side of the equation, implementers told us they want the following: (1) better methodologies for (post-sale) requirements definition – the internal business units that will benefit still have a steep learning curve before they can apply speech to their business processes; (2) more packaged VUI and best practices, less professional services – customers want more science, less art; and (3) more flexible testing and iterative design processes – customers don’t want to buy dedicated platforms just to do development on.

On the infrastructure side, implementers told us they want open tools for development. For VoiceXML, coders come from Java culture, not a compiler environment; for SALT and ASP.NET environments, some coders may want VB and others may want to get closer to the metal, but none want some alien drag-and-drop IVR tool. More importantly, in the specialized world of speech, tooling has all too often been tied so closely to the run-time that the two are impossible to separate. The extensions that turn these tool/run-time hybrids into proprietary platforms need to be brought into alignment with accepted processes for managing extensions, such as the Java Community Process.

What does this mean in terms of dollars? In the dawn of a market where Microsoft is an active participant, along with IBM, BEA, and Sun, it means that speech needs to become one of several key enterprise IT utilities. Like e-mail. Like directory servers. Like authentication processes. Nobody sells e-mail on the basis of a single application; pitching a speech project to displace an agent or touchtone app on ROI alone is like pitching an e-mail platform on its ability to communicate with workers on the road.

The tunnel vision promulgated by the various flavors of the call center upgrade/ROI story has indeed impeded a larger marketing story. In light of the On Demand zeitgeist, the idea of access – the natural, flexible and ubiquitous access that only speech can deliver – is being paid for by huge budgets of demand-generation advertising from IBM, Microsoft, and others. Take the money; it’s a gift. Just remember, speech is not, repeat, not different.

And don’t take my word for it, please. As Cisco, IBM and Microsoft start to crank up the volume on their speech solutions for the broader IT marketplace, they will deliver the kinds of process improvements customers are asking for: packaged VUI solutions, objects and templates, fast iterative design methodologies and standardized tooling. As speech becomes part of the IT portfolio, it will become different in another (good) way – just a different access method to the same business processes driving value across the enterprise.

Mark Plakias is a partner and senior consultant for The Zelos Group. He can be reached at 212.366.0895.
