Each of us has dreams, but sometimes we fall back on excuses when those dreams are not achieved.
The joining of Internet capability with television has opened up new possibilities for speech as an interface. Video and audio telephone connections are now possible over conventional telephone lines as well as over higher-bandwidth fiber optic and other fast connections. "Interactive television," which could also make use of speech recognition, is as yet a largely unexplored technology.
Industry Leader Focus
Two hot topics in the world of speech technology these days are multimodal input and customer service. Multimodal input is the process of combining technologies - keyboard, mouse and speech, to name one combination - to form the backbone of a more natural way of communicating with the computer.
Listen Up: How Natural-sounding Speech Engines are Changing the Industry
After a 15-year adolescence, text-to-speech technology is coming of age. Every TTS vendor's goal - a truly natural-sounding, voice-activated computer interface that can read text aloud like a human being - is now within reach of the development community. Industry observers all along have said TTS would have to make a quantum leap before it could achieve anything near the natural-sounding speech necessary for broad market acceptance. Today's synthesizers make that leap possible by using new processing and linguistic models to convert computer text into speech that is nearly indistinguishable from actual recorded human speech. TTS is speaking and the market is finally taking note.
One of the best microphones for speech recognition uses active noise canceling technology. Purchasers often contact us and say, "Microphone sound on the audio setup test is in the high part of acceptable, sound quality is at the bottom of acceptable. What is wrong with my microphone?" Our first question is, "Have you tried dictating with the microphone?" The answer is usually no. The user is invariably pleased with the improved accuracy upon dictating.
In a report published in the April/May 2000 issue of this magazine, Sergei Kochin of Knowles Electronics surveyed 13,690 speech recognition purchasers and found that fewer than 15 percent of them used speech software at least one hour per week. This means that roughly five out of every six buyers are using the software rarely or never! Internal Microsoft studies place the abandonment rate for speech recognition software even higher, at 95 percent.
Speech on the Go: It's Becoming a Wireless World
Speech recognition technology and associated systems have emerged to satisfy consumers' needs for simplicity and for efficiency in operations across several industries. For this simple reason, the technology has already become an important asset to many commercial interests and the industry appears, at least to many market analysts, to be on the verge of a healthy, rapid growth period.
With apologies to my friend James Carville, I paraphrased his often-maligned quote from the 1992 presidential campaign to emphasize the theme of this edition of Speech Technology Magazine. It really is all about our customers and the experiences they derive from using speech technology in their everyday lives. How does it impact them? How does it make their lives better? How does it save them money? How does it improve revenues? And the many other questions they want answered when implementing a speech application.
Speech-enabled appliances, including handheld computers, promise to be everywhere in the near future - in the office, at home and on the road - enabling users to easily interact with people, to control consumer appliances and to access personal and public information. To be used effectively, these appliances will support speech interfaces to intelligent software agents that perform various types of searching and computational tasks on behalf of the user. The sidebar titled "A Day in Jack's Life" (page 26) illustrates how pervasive speech will become in the future. The components of the speech interface will likely be distributed among various hardware components connected via a communication network. This article outlines the types of applications that speech-enabled appliances will support, the architectural environments in which speech-enabled appliances will work and the types of speech-enabled appliances users can expect to use.
The Answer to Customer Service?
Customers often experience a trade-off when they try to find satisfaction through machines: what they gain in convenience they lose to the modern-day annoyances that come with the new technology. Today, you can check your savings account balance at any time of the night, but only after coping with about five rounds of touch-tone multiple-choice questions. You can order clothing online, but if there are delivery problems, e-mailing customer service can be frustrating, especially if you have to wait hours or days for a response.
The Voice Web
Telephones have been like turtles watching the Internet rabbit run. Although telephony infrastructure is changing rapidly, the way the telephone interacts with the user - and what the user does with the telephone - has barely changed for decades. The growth in wireless phones has made telephone service available almost anywhere, but the telephone is still used mostly for contacting specific phone numbers to talk to a person. When the telephone is used for contacting automated systems, the touch-tone interface is notoriously inconvenient and frustrating.
The Meaning Should Be Clear When Choosing the Words
When you examine the individual words that make up the phrase, you are likely to decide that it refers to the ability to hear a voice and say, "That sounds like Susan!" When machines can do this, it is called speaker identification or speaker recognition. The focus of these phrases is not the person as such; what is being examined and recognized is, indeed, that person's voice.