James Colby, Assistant Vice President of Marketing, Comverse Voice Solutions
Q What are Comverse's thoughts concerning the impact of speech technology on network carriers?
A Speech technologies and the associated services that they enable represent a huge set of opportunities for both wireless and wireline network carriers. Speech recognition enables operators to offer a variety of enhanced services that promote incremental usage, increase in-car safety and allow callers to interact with services using natural, intuitive interfaces. Services such as voice portals, voice-controlled messaging, voice dialing and the voice initiation of group conference calls, where the user simply has to state the names of the people they want to talk to, have proven successful where launched by some of the world's leading-edge carriers. These services, perceived by end users to represent "value add", have been shown to result in greater levels of network loyalty and higher Average Revenue Per User.
Additionally, speech-activated services offer network carriers - like most other enterprises - the ability to control their costs of operation while providing their customers with a more efficient means of managing their accounts. Maintaining large armies of Customer Service Representatives is a huge burden on a carrier's bottom line - automated services, such as change of address or balance checking, can be performed efficiently at a fraction of the cost of having a human fulfill such a transaction.
Carriers are also in the novel position of being able to combine both "value added" and "cost saving" services in a single user portal - simplifying the interaction with the network for a caller even further.
Further opportunities exist for carriers to host speech driven applications for enterprises in the network. Much like the Internet model, businesses would like to out-source the management of infrastructure that supports their customer interface. Web sites and IVR / enterprise voice portals can be hosted efficiently on shared platforms in the network. As trusted suppliers of telephony and voice services, carriers are the natural supplier of managed services to the enterprise market.
Q What are the key trends you see in platforms and middleware supporting speech technology for network service providers?
A Open, open and open. In order for carriers to be able to effectively serve their markets with a host of relevant, dynamic and custom solutions, they will have to be able to enlist the support of third party application developers. As with the Internet model, or that established by DoCoMo in Japan, success will depend upon the ability of unrestricted numbers of developers to build applications and services based upon open standards.
Comverse believes that the same principles will hold true in the voice arena. As a proponent of the VoiceXML and SALT standards, Comverse is embracing them and implementing support for them on its platform technologies. Existing customers of the Comverse Trilogue and Voice Portal platforms will be able to execute VoiceXML applications built by independent developers. Users of the Comverse voicemail services will be able to access a variety of additional speech driven applications when accessing their mailbox - extending the revenue possibilities for carriers as a result.
To facilitate this fundamental shift from closed systems, Comverse is constructing a set of tools and capabilities that will simplify the process of managing the development and support of 3rd party services.
Q Who are your speech technology partners and why did you choose those companies?
A Comverse takes the position that it is technology vendor "agnostic". Comverse has developed close partnerships with leading technology suppliers such as SpeechWorks and Nuance but often finds that its customers ultimately make the selection of the speech recognition or text-to-speech engine for a particular product. The decision can be driven by a number of criteria, but it often comes down to the maturity of the speech model for a particular market and language.
Q What should the speech technology industry as a whole be doing to increase the growth rate of speech technology deployments?
A There are a number of initiatives that are already underway in the industry and plenty more that could have a positive effect on the acceptance and adoption of speech technologies.
On the technical side, operators and applications developers and/or enterprises need improved tools to simplify and accelerate the time it takes to build a speech driven service. Pre-defined routines that encapsulate the typical flow of a dialogue between human and machine not only simplify the construction of a service, but ensure that developers who are not experts in human factors can construct a man-machine interface that provides a caller with an intuitive and satisfying experience. Once in place, that service needs to be monitored and enhanced; again, well-designed tools simplify such a task.
The industry also needs to better promote and market the benefits associated with speech-driven services. In addition to advertising the possibilities offered by voice-controlled products to the mass market - whether aimed at the consumer or business user - the network carriers could do much more to explain the merits and the typical returns that result from enterprise centric applications. Again, the carriers have the most substantial reach into the enterprise market segment, but today it is typically technology suppliers, system vendors or small ASPs that attempt to sell the virtues of speech user interfaces to mid and large size businesses. With the limited resources available to these advocates, the message is often not getting through.
Q Describe a successful speech technology implementation and why you thought it was successful. Please include any benchmark statistics that support your thoughts.
A Comverse supplied Sprint PCS with the application technology at the heart of its Voice Command dialing service. The benefits to the operator are clear. In addition to being perceived as a leader and innovator in enhanced services, Sprint PCS is encouraging its customers to program a network-based address book with personal contacts. With such a network dependency, it is believed that users will be less inclined to change networks.
Recent feedback from Sprint PCS at a Nuance hosted conference provides a testimony to the success of the product. In addition to generating in the region of $5 per month per subscriber for the service, Sprint PCS was able to report the following about its Voice Command customers:
- "75% of heavy users say that they are extremely or very satisfied with Voice Command"
- "81% of users indicated that Voice Command performs as they expected"
- "Voice Command customers have… higher ARPUs, higher MOUs, higher retention"
Q What applications or services will be most important in the near-term? In the long-term?
A Comverse is focused on delivering improved ease of use and valuable new functionality to its core suite of messaging and communication products. The capacity to send, reply to and filter voice and email messages, together with the ability to call one or more parties, through voice control of a personal or enterprise address book empowers users to communicate more naturally and freely. This nucleus of communication tools will be further enhanced with multi-modal user interfaces to further simplify and enrich user control and interaction.
Comverse believes that in addition to its core communication tools, there will not be a few "killer apps", but rather many, suited to the specific needs of a variety of market segments and user types. For example, entertainment, gaming and music services appear well suited to the youth segment. At the other end of the spectrum we already appreciate that voice-commerce applications will serve the enterprise. Again, the ability to develop and host a multitude of applications will require open platforms, standards adherence and development partners.
Q What are your thoughts concerning the developing standards such as VoiceXML and SALT and their impact upon the future of speech technology?
A As stated above, the adoption of open standards is key. Comverse is enhancing its voice service platforms to be VoiceXML compliant and, as a founder member of the SALT initiative, is working to further empower enhanced service users through the development of multi-modal user interfaces.
Q What would you like to see from speech technology developers to accelerate the adoption of speech technologies in the market place?
A This question has largely been addressed in my answer to the fourth question above. I mentioned two facets of the underlying technology that make deployments simpler - development toolkits and predefined man-machine dialogue routines. These routines should ideally be defined in VoiceXML so that they can be easily modified as necessary for a particular application or market.
In addition, it is imperative for Comverse that our speech technology partners continue to develop and support new language models. The Comverse customer base is a global one and calls for the support of many more languages than those typically offered by the main vendors that Comverse has historically partnered with. It is usually the case that a deployment of a speech enabled system will not proceed unless recognition can be demonstrated in the native tongue of the buyer.
Q Could you describe interesting aspects of research Comverse has done with users of speech services? What do they like and dislike about using speech technology?
A Comverse has a very significant investment in human factors experts that focus solely on developing speech-enabled products. This multi-lingual team is responsible for defining the interface between a user and all of the Comverse core voice products. Whereas building a user interface that employs DTMF or "touch tone" could be considered a science, the design of a "VUI" or Voice User Interface can definitely be considered an art.
To validate the design of a VUI, Comverse regularly conducts primary market research where it monitors and records how users interact with its products. It is this attention to detail that results in a smooth "user experience" when interacting with a product. What is clear is that when a VUI is designed well, a user may have little or no reaction to it. A poorly designed interface will cause an extreme reaction - confusion, frustration and, very quickly, a refusal to use the product at all.
Comverse actually differentiates between the types of speech user interface it develops for its products. The type of interface is usually a function of the underlying application, but it is also tuned to the type and experience of the user. Applications that are used relatively infrequently, for instance a service that automates customer care functions, need to have an interface that clearly explains to a caller how to "navigate" the interface, explaining almost step-by-step what the user's options are. Such an interface is obviously useful for a first time caller but becomes boring and cumbersome for regular users.
The Comverse core communication suite is linked behind a common VUI. This "voice portal" provides an enriched access to, and advanced control of, the messaging, dialing and address book features that callers use on a regular basis. Comverse calls this type of speech user interface a "conversational" experience. Callers can invoke functions in single commands where it is expeditious to do so. For instance, saying "Call Mike Jeffries in his office" is more effective - and less irritating - to a regular user than having to engage in a protracted interaction with the system along the lines of "Call," "Call who?", "Mike Jeffries," "Where?", "Office." These "complex" instructions work well in other contexts such as sending emails, playing voice messages from a particular person or scheduling a meeting.
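As an illustration only, and not a description of Comverse's actual implementation, the contrast between a single "complex" command and a step-by-step dialogue can be sketched in Python. The sketch assumes the recognizer has already turned speech into text. The names `CallRequest`, `parse_call_command` and `follow_up_prompts`, and the small command grammar, are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CallRequest:
    """Slots needed to place a call; None means the caller has not said it yet."""
    name: Optional[str] = None
    location: Optional[str] = None


# Hypothetical grammar: "call [<name>] [in|at|on [his|her|their|the] <location>]"
_CALL_PATTERN = re.compile(
    r"^call"
    r"(?:\s+(?P<name>.+?))?"
    r"(?:\s+(?:in|at|on)\s+(?:(?:his|her|their|the)\s+)?(?P<location>\w+))?$",
    re.IGNORECASE,
)


def parse_call_command(utterance: str) -> CallRequest:
    """Fill as many slots as possible from one recognized utterance."""
    match = _CALL_PATTERN.match(utterance.strip().rstrip(".!?"))
    if match is None:
        raise ValueError(f"not a call command: {utterance!r}")
    return CallRequest(name=match.group("name"), location=match.group("location"))


def follow_up_prompts(request: CallRequest) -> List[str]:
    """Return only the questions the system still has to ask."""
    prompts = []
    if request.name is None:
        prompts.append("Call who?")
    if request.location is None:
        prompts.append("Where?")
    return prompts


# A regular user supplies everything at once, so no prompts remain.
full = parse_call_command("Call Mike Jeffries in his office")
print(full, follow_up_prompts(full))

# A bare "Call" falls back to the step-by-step dialogue.
bare = parse_call_command("Call")
print(bare, follow_up_prompts(bare))
```

The design point is that the same slots serve both styles of caller: the system prompts only for what is missing, so an experienced user is never forced through the protracted exchange.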
This conversational style also embodies an awareness of the context in which a particular application is being used. On receiving an email the user can say "Add him to my address book" if the mail is from a new contact or "Call her" if the sender is already known. These commands, intuitive to the user, hide the underlying complexity that links a variety of key applications together and shares "context" between them.
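One way to picture this sharing of context is a session object that remembers which contact is currently "in focus", such as the sender of the email being read, so that a pronoun can be resolved against it. The Python sketch below is a simplified assumption for illustration, not Comverse's design. `Contact`, `SessionContext` and `handle_command` are hypothetical names, and the phone number is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

PRONOUNS = {"him", "her", "them"}


@dataclass
class Contact:
    name: str
    number: str


@dataclass
class SessionContext:
    """State shared between the mail, dialing and address book applications."""
    address_book: Dict[str, Contact] = field(default_factory=dict)
    focus: Optional[Contact] = None  # e.g. sender of the message being reviewed


def handle_command(utterance: str, ctx: SessionContext) -> str:
    """Resolve a pronoun command against the contact currently in focus."""
    words = utterance.lower().rstrip(".!?").split()
    if len(words) < 2 or words[1] not in PRONOUNS:
        return "Sorry, I did not understand."
    if ctx.focus is None:
        return "Who do you mean?"  # no context available to resolve the pronoun
    if words[0] == "add" and words[-2:] == ["address", "book"]:
        ctx.address_book[ctx.focus.name] = ctx.focus
        return f"Added {ctx.focus.name} to address book"
    if words[0] == "call" and len(words) == 2:
        return f"Calling {ctx.focus.name} at {ctx.focus.number}"
    return "Sorry, I did not understand."


# The mail application sets the focus when the user opens a message.
ctx = SessionContext()
ctx.focus = Contact(name="Mike Jeffries", number="555-0100")
print(handle_command("Add him to my address book", ctx))
print(handle_command("Call him", ctx))
```

Because the focus lives in the shared session rather than in any one application, the address book and dialing functions can both act on a contact that the mail application introduced.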
In summary, ease of use is the primary factor that determines the success of speech-enabled applications. Through extensive research, Comverse believes that you may have one chance to get a Voice User Interface right. If a user is frustrated by their first interaction with a service, they may never use it again.