Continuous Dictation: Speech in Hand

Welcome to the new world of mobile speech recognition! Small hand-held recorders now allow the user to dictate almost anywhere, free of the computer. Later, the material can be downloaded to a computer for processing by speech recognition software, and the computer will automatically type out the recorded material. This is a very exciting development in the history of speech recognition.

A variety of recording devices and linking software is currently available. Dragon Systems offers NaturallySpeaking Mobile and NaturallyOrganized. The linking software is available on all of Dragon's higher-end dictation products. Dragon has developed a small, lightweight (about 4 oz.) portable hand-held digital dictation device, with removable as well as built-in electronic recording media. The Dragon recorder is ergonomically designed and easy to use. Recordings are downloaded from the device through a special cable to a serial port for subsequent recognition. A related product for use with the Dragon recorder is Dragon NaturallyOrganized. After the dictation is downloaded to the computer, this program allows voice commands to record important information or send emails. The dictation linking software can be integrated with a variety of organizer programs such as ACT, Pilot, Goldmine and Lotus Notes.

Lernout & Hauspie has released VoiceXpress Mobile, an adjunct to their dictation product. This package includes the new Olympus DS-150 digital recorder, which is one of the smallest and lightest recorders on the market. Weighing slightly less than 2 oz., it is quite easy to use. L&H VoiceXpress Mobile contains an excellent feature: it keeps the recording "behind" the words transcribed on the screen. This allows the user to double-click on a word, and that word, as well as the surrounding words, will be played back.
The correction dialog box appears, and you can then make the appropriate correction, which fixes the text of the document and also trains the software. Because a double-click plays back the surrounding material, correction and transcription are easier. IBM, Dragon, and Philips also keep the voice recording behind the words; however, double-clicking does not play the adjacent material in the way the L&H system does, so the L&H approach gives the user more context.

The Olympus DS-150 digital recorder has approximately 75 minutes of recording time in internal memory. It does not accept removable media. The related Olympus D1000 digital recorder does accept removable memory cards. It is a somewhat larger and more complex digital recorder, which can also be used with speech recognition systems. Microphones and speakers can be added to it as well.

IBM's ViaVoice 98 speech recognition software can also be used with the Olympus D1000 and the Olympus DS-150 digital recorders. Packaged products containing recorder and software are available. IBM packages with the Norcom recorders have also been available.

Philips' FreeSpeech 95, 98, and 2000 can also be used with any recorder that produces a wave file. Philips can play back and transcribe wave files as part of its regular system. Philips features a moving highlighter, which shows which words are associated with which sounds as the dictation is played back. The highlight bounces along the words on the screen as the background recording of the speaker's voice is played, so the user can see the relationship between the sound and the words. Philips has also developed a tiny digital recorder the size of a PC card, which functions as a PC card and can be inserted into the computer for download. It contains a small microphone and is designed especially for attaching voice notes or sound files to email.
Norcom offers two hand-held, high-quality analog recorders for speech recognition use. The newer model, the 2440, contains the very helpful "pass through" feature, which allows the user to train the speech recognition system using the recording device as a microphone. This makes the training process somewhat easier. Notably, the Olympus DS-150 also has this "pass through" feature. The Norcom recorder uses "mini" cassette tapes that hold 15 minutes per side, or 30 minutes per tape. Norcom sells high-quality tapes for about eight dollars, with a higher amount of iron adhered to the tape, which allows better recording. Cheaper tapes of lower, but possibly sufficient, quality are also available. The quality of the analog recording is excellent. The Norcom has an internal microphone, as well as a jack for adding a different microphone. These units are somewhat larger than the digital recorders. Norcom also offers a manual/pedal tape transcribing device. The Norcom recorders have the obvious advantage of a low-cost removable medium that can easily be transported, and users of the Norcom have given favorable feedback. The Norcom includes a special device through which the tapes are played back into the computer, providing a good electronic connection and very good quality.

VXI offers an interesting analogous product, the Portable Parrott, which allows other conventional analog tape recorders to be played back into the computer. The VXI device is reported to equalize and balance the electronic signals, delivering the appropriate signal strength to the computer for optimal speech recognition. It is also reported to remove some of the distortion, hums and hisses that sometimes occur when the recording is played back through the recorder's own output system. VXI also offers a small stub "lollipop" microphone, which can be plugged into a small hand-held recorder for increased sound quality and some noise cancellation.
Olympus also offers a noise-canceling "lollipop" microphone, as do several other vendors.

Dictaphone and Sony, among others, also offer portable digital recorders that can be used for speech recognition transcription. Dictaphone has a product called the "Walk About" for portable digital dictation. Sony has long offered a variety of digital recorders, including a very high quality "Walkman" type, an older digital hand-held tape recorder, and some newer, lighter-weight digital recorders.

Certain caveats apply to using these portable hand-held recorders for speech recognition dictation. Proper use of the microphone, whether internal or added to the recorder, is very important for accurate recognition. The quality of speech, the clarity of pronunciation, the consistency of the recording, the distance of the mouth from the microphone, and other microphone parameters must be held constant to obtain optimal recording and recognition quality. The relative absence of background noise, whether achieved through a quiet environment or an adequate noise-canceling microphone, is also very important. Obviously, one must have one's speech recognition system tuned up and working well. If one cannot dictate well directly into the computer, it is unrealistic to expect the recorder to deliver high-accuracy recognition. A good sound card in the computer, enough memory, a good speech recognition software system, proper microphone usage, and environmental management are all important in obtaining accurate speech recognition results. With careful attention to these factors, however, it is possible to obtain very high quality recognition.

Peter Fleming, a speech recognition consultant, may be reached at aris@world.std.com, or 617 923-9356.
