Speech Technology Magazine

 

Coping with the Terrible 2000s: CAT's Prospects in 2006

By Dan Miller - Posted Mar 1, 2006

We are now halfway through the first decade of the new millennium. It's time to take stock of the factors that have had the greatest influence on automated speech's progress into enterprise and service provider IT infrastructures as a major component of Conversational Access Technologies (CAT). To use the terminology made popular by the Boston Consulting Group (BCG) back in the 1980s, CAT is a "problem child." Compared to its data center technology peers, speech has not performed up to its potential in a marketplace that is characterized by both size and growth.

This is where the notion of "The Terrible 2000s" comes into play. I don't mean that CATs are two years old. They just behave like it.

"The Terrible Twos" are a constant tug-of-war during which the individual learns to measure his or her value to the world at large. It's the time that a person individualizes, separates from his or her parents and learns how to behave as a member of a larger society.

The $3 Billion Threshold

CAT entered the IT and telecom scene in the late 1980s with the advent of client-server computing, interactive voice response (IVR), the 'intelligent network' (IN) and the beginnings of online transaction processing. Its proud parents set high expectations for their offspring to have wonderful careers in finance, travel, hospitality, health care and retail. However, to succeed in business, one's growth is limited by the ability to work and play well with others.

The components of CAT now include speech processing, call processing, application development environments, application software, professional services, maintenance and hosting. It's been a struggle just to get attention. Opus Research's model shows CAT spending growing from about $600 million in 2005 to more than $1.6 billion in 2009. Add another billion in outsourced VoiceXML-based interactive voice response, and spending on CAT tops $2.5 billion. Over the same period, Gartner projects spending on Web services based on service-oriented architecture (SOA) growing from over $14 billion to $189 billion.
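
To put those two forecasts on the same footing, the implied compound annual growth rates can be worked out with a few lines of Python. This is only a rough sketch: it assumes both forecasts span 2005 to 2009 (four yearly steps), it treats "about $600 million" and "over $14 billion" as exact starting points, and the function name cagr is purely illustrative.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two spending levels."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the article, in billions of US dollars.
# Assumption: both forecasts run from 2005 to 2009.
YEARS = 2009 - 2005

cat_growth = cagr(0.6, 1.6, YEARS)     # Opus Research CAT model
soa_growth = cagr(14.0, 189.0, YEARS)  # Gartner SOA-based Web services

# CAT spending in 2009 once outsourced VoiceXML-based IVR is added.
cat_total_2009 = 1.6 + 1.0

print(f"CAT implied annual growth: {cat_growth:.1%}")
print(f"SOA implied annual growth: {soa_growth:.1%}")
print(f"CAT plus outsourced IVR in 2009: ${cat_total_2009:.1f} billion")
```

Under those assumptions, CAT grows at a little under 30 percent a year while SOA-based Web services spending nearly doubles annually, which is the gap the "problem child" label is meant to capture.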

Investment in emerging architectures accounts for significant spending, where the major objects of expenditure carry names like enterprise service bus (ESB) on the enterprise side and the IP Multimedia Subsystem (IMS) for carriers and service providers. In the name of promoting mobile applications and multimodal delivery of services, businesses are expected to spend roughly $49 billion in 2005. CAT's mission during these Terrible 2000s is to build a clearer link with such spending initiatives.

Several other technologies have grown to dwarf "pure" CAT. IP telephony infrastructure will reach $3 billion in 2005, and that "three billion" threshold represents the milestone where emerging IT and telecommunications technologies assume sufficient gravitas to be taken seriously. This heavyweight status has been attained by providers of enterprise resource planning (ERP), customer relationship management (CRM), business process outsourcing (BPO) and mobility.

CAT's Checkered Present

The picture of automated speech's success is a mosaic. Both skill and art are required to fit pieces into disparate parts of emerging infrastructures in a pleasing way. Success occurs on a case-by-case basis and, as is characteristic of any problem child, comes with the help of third parties. In the near term, the greatest influence comes from providers of professional services, including software developers, system integrators and business process outsourcers. Indeed, over half of the spending on CAT-based solutions will go toward professional services and maintenance. This follows the overall IT industry's pattern of spending.

This approach amounts to a "take-n-bake" in which "widgets" and "go-fasters" shorten the time it takes to deploy new, pleasing applications. CAT will be able to take advantage of a library of best practices encapsulated in Java code that are the product of years of implementations. Following the path set forth by the Web, OpenSpeech Dialog Modules (OSDMs) from voice specialist Nuance and reusable dialog components (RDCs) from IT giant IBM become shining examples of the growing code base that lives at "the abstraction layer" of CAT architecture.

By decade's end, today's "novel" applications will become routine and, indeed, codified, ideally adhering to "best practices" that will lead to greater customer and implementer satisfaction.


Dan Miller is the founder and senior analyst at Opus Research. He published Telemedia News & Views, a monthly newsletter covering developments in voice processing and intelligent network services. He can be reached at dmiller@opusresearch.net.

