Continuous Speech: Better Over Time
Developers of speech recognition products typically, and for the most part fairly, claim that their products get better with use. Speech products recognize words more accurately as they become accustomed to a person's speech patterns. In this article, we describe the performance of IBM's ViaVoice and Dragon's NaturallySpeaking over the two months since our previous article.

After their initial training sessions, IBM's ViaVoice appeared to be perhaps slightly less accurate than Dragon's NaturallySpeaking. With time, and the training that goes with it, IBM's performance appears to have improved. However, Dragon may still have an edge.

One may speak very fast to both systems with high accuracy in a quiet, stable environment. Both systems seem to degrade somewhat in noisy environments or with microphone mismanagement. The most important factor in determining accuracy is the clarity with which one speaks. Assuming one's vocabulary and phraseology are within the machine's general expectations, both programs yield outstanding accuracy. Both systems also degrade quickly if one mumbles or slurs one's words. "Unusual" vocabulary or contexts may confuse the programs, although both can be "taught" new vocabulary and phrases.

Background Noise
Neither system seems to be fond of background noise. When one is not speaking but has the microphone on, both systems may produce unwanted output. In this respect, the IBM microphone continues to seem somewhat more irritable than the Dragon microphone, introducing unwanted stray words. No doubt both companies can improve on this.

The recording of the sound behind the words continues to be a helpful feature in IBM ViaVoice, as does the built-in speech synthesizer. These two features, missing from Dragon NaturallySpeaking Personal Edition version 1.0, are expected to be incorporated in the Deluxe edition of Dragon's NaturallySpeaking, as well as in possible upgrades and new versions of the Personal Edition. Thus some aspects of editing are easier with the IBM product, when one makes use of the recorded voice behind the words or the speech synthesizer.

Otherwise, the Dragon seems somewhat easier to use for editing, since one can correct the text more easily by voice. One may use voice commands to select text and change or correct it. Because the Dragon microphone picks up less stray noise and produces less unwanted output, one can leave it on more easily than the IBM microphone, and so use it more easily for editing. Turning the IBM microphone on and off is a somewhat time-consuming process, which makes it less attractive for editing material by voice.

If one wishes to edit entirely by typing or other non-voice methods, these differences between the two products are less important. But people who find writing by voice easier than typing may also find editing by voice more natural than typing. Nevertheless, methods of editing are a very personal matter and differ widely. Editing is clearly a different process from producing a first draft, and some people may prefer to use voice for one but not the other.
We suggest that users installing both systems on the same computer install IBM ViaVoice first, since installing the IBM product after the Dragon may temporarily disable the Dragon. This problem can be fixed by briefly reinstalling the Dragon.

In summary, IBM's ViaVoice and Dragon's NaturallySpeaking are outstanding, fully functional products in their present form. We expect both will be improved by upgrades and new editions. IBM has hinted at the release of a new version called ViaVoice Gold, incorporating more command and control features as well as possibly some new voice correction features. Upgrades of Dragon's NaturallySpeaking Personal Edition and the release of the Deluxe edition are expected to include speech synthesis, recording of the speaker's voice behind words, the ability to use multiple speakers/users per copy, the ability to develop different topical vocabularies even for a single speaker, and more robust macro writing capabilities. At the present time, both products can be used well together and complement one another.

The question people ask most frequently is which speech recognizer to buy for dictation. We would advise readers to consider buying both the Dragon NaturallySpeaking Deluxe edition and IBM ViaVoice, since the two systems can work together. Speech recognition software has become so relatively inexpensive that the real question is usually when to invest in a new computer or upgrade. Therein lies the challenge! Stay tuned here for updates.

Peter Fleming and Robert Andersen, speech recognition consultants, may be reached at firstname.lastname@example.org or (617) 923-9356.
Dragon Sweeps Comdex
As we went to press, Dragon's NaturallySpeaking became the first product ever to win a "Grand Slam" of major awards at COMDEX. During the week of COMDEX, the continuous speech recognition product became the only product to win all of the following awards:
- PC Week, Best of COMDEX (Utility Software)
- PC Magazine, Technical Excellence (Software)
- PC/Computing, MVP (2 awards, Usability Achievement of the Year, and Best Input Device)
- Home Office/Small Office Computing Editors Pick (Most Innovative Product)
"It is extremely gratifying to be recognized in this way by the computer industry," said Janet Baker, president and co-founder of Dragon Systems.