Optimizing Your Life(cycle)

One thing I’ve learned in doing this column is that whenever you write about price disruptions, you get mail. So here we go.

Think about your buying behavior and the things you like to buy. Typically they come from
(1) somebody you’ve bought from before and trust; and/or
(2) a category you know a lot about, perhaps one you studied for a long time before committing.

Now let’s season this with an observation from IBM Software Group GM Steve Mills, who’s had a bit of experience with selling software. "Nobody wants to buy software," Mills noted recently. "There is no positive gratification or out of the box experience."

This from an executive who sells about $14 billion in software annually. So if you were selling $14 billion of something that people didn’t particularly enjoy buying, you’d have to really fit into one of our two criteria above, in IBM’s case probably No. 1. But since you’re not IBM, there has to be a real compelling reason why somebody would pick you.

In the speech market today, compelling reasons come from organizations that have:
(1) hit the wall and are in some deep pain about their inbound call handling; or
(2) studied speech really hard and are willing to make a purchase from the (relatively small) community of vendors they know and trust.

In other words, it’s not a large community, as the overall size of the market attests. At Zelos Group, we recently completed a detailed life-cycle analysis of a typical speech deployment, based on a real-world experience base of 20 deployments for both premises-based and hosted solutions. It involves 16 discrete project phases and a cast of at least 34 different individuals. It shows in vivid fashion how many different stakeholders, each requiring their own ‘look’ at the process, are involved in something that, let’s be honest, has traditionally been a niche populated by specialists. It also explains why most engagements we hear about tend to be high-end, mission-critical apps with dramatic ROIs – they are the result of solving significant pain points. Which raises the next question: When do we move speech from the ER to the assembly line? When does it become a standard access methodology instead of an exceptions-based remedial methodology? When people can buy it from someone they know, trust and have purchased other stuff from – or when it comes as part of another purchase.

Based on recent conversations with large IT players – Cisco, HP, IBM, Microsoft – it’s pretty clear that the life-cycle model we documented will be supplanted by a more commoditized and componentized delivery model. In this next-gen model, speech will be encapsulated as components, ultimately delivered as Web Services, that are instantiated on common application servers, and accepted by the customer as another access channel within a unified presentation-layer tooling and runtime environment. Underneath this layer are the same business logic and rules that drive their Web infrastructure, which isn’t standing still, either. While this should not come as news to anybody who can spell VoiceXML, it is worth keeping some context in mind for this architectural shift. A few context sanity checks:

Only about 20 percent of enterprise IT developer resources are spent on new development, while the other 80 percent goes to maintaining what is already in the asset base;
The corollary of this is that adding servers is the last thing enterprise IT managers want to hear – the most exciting way to reduce cost in the enterprise is to get more utilization out of the servers they already have;
Indeed, the most financially rewarding investment is in the growing array of techniques out there to eliminate server proliferation and optimize what you’ve got – especially when you’re in a vertical like telecom, where utilization can average under 20 percent; and
As for call center expense reduction, we recently worked on an analysis of offshore call center outsourcing that showed a fully-loaded cost reduction of 30 percent.

Out of these data points, the key word that comes through is ‘optimization’ – and the ‘what’ being optimized is enterprise Web infrastructure. That’s what’s taking up the footprint in the data center; that’s what’s driving routers and app servers and storage. The vendors in this space are heavily penetrated in the enterprise IT world, and their optimization model includes radically more integrated, and less expensive, methods to instantiate speech into the infrastructure.

Today’s incumbents strive to sell specialized servers dedicated to speech processing and telephony self-service (granted, some, like Edify, justify it on the backs of multichannel delivery). But the data center incumbents are adding speech to the mix as an additional wrinkle: it’s a plug-in, a set of extensions. Priced accordingly.
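For readers who like to see the plug-in idea in code, here is a minimal sketch, not anybody’s shipping product. It assumes a made-up balance lookup as the shared business logic and shows the speech channel as just one more renderer next to the Web one. Every name in it (`get_balance`, `render_html`, `render_voicexml`, `handle`, the sample account) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from html import escape
import xml.etree.ElementTree as ET

# Hypothetical in-memory data standing in for the enterprise back end.
_BALANCES = {"1001": 250.75}


@dataclass
class Balance:
    account_id: str
    amount: float


def get_balance(account_id: str) -> Balance:
    """Shared business logic: it knows nothing about presentation."""
    return Balance(account_id, _BALANCES[account_id])


def render_html(b: Balance) -> str:
    """Web channel: wrap the result in an HTML fragment."""
    return f"<p>Account {escape(b.account_id)} balance: ${b.amount:.2f}</p>"


def render_voicexml(b: Balance) -> str:
    """Speech channel: wrap the same result in a minimal VoiceXML 2.0 document."""
    vxml = ET.Element("vxml", version="2.0", xmlns="http://www.w3.org/2001/vxml")
    form = ET.SubElement(vxml, "form")
    block = ET.SubElement(form, "block")
    prompt = ET.SubElement(block, "prompt")
    prompt.text = f"Your balance is {b.amount:.2f} dollars."
    return ET.tostring(vxml, encoding="unicode")


# Adding speech means registering one more renderer, not standing up a new server.
CHANNELS = {"web": render_html, "speech": render_voicexml}


def handle(channel: str, account_id: str) -> str:
    """Run the shared logic once, then hand the result to the requested channel."""
    return CHANNELS[channel](get_balance(account_id))


if __name__ == "__main__":
    print(handle("web", "1001"))
    print(handle("speech", "1001"))
```

The point of the sketch is the shape, not the details: the lookup is written once, and the speech channel is an extension registered alongside the Web channel rather than a dedicated box with its own logic.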

Finally, in a world where more and more applications development, in addition to a wide range of business processes, is being moved offshore, displacing agent costs is fighting the last war. The role of speech automation in a global call center environment is almost virgin territory. What makes speech attractive in this context is its role as a front end to international private-line circuits that terminate at Citrix-based workstations in India and elsewhere. Key data can be kept onshore (that’s why Citrix is there, btw), and calls can be routed with data collected by the XML-based speech asset – a kind of global screen pop. So where does this put us, the end of the beginning? Or the beginning of the end for speech as we’ve seen it develop over the past 10 years? Hey, send me an e-mail and tell me what you think.

Mark Plakias is a partner and senior consultant for The Zelos Group. He can be reached at (212) 366-0895.
