July 23, 2008
By Leonard Klie, Editor, Speech Technology and CRM magazines
Features

Is Speech to Blame for Bad Transcripts?

Though modern Internet and digital communications technologies have made audio easier to record, manipulate, and transfer from source to destination, generating transcripts from that audio is often fraught with problems, according to officials at a New York-based transcription services company.

Chief among the problems cited by TruTranscripts is sound quality, though current shortcomings in speech-to-text technologies also come into play.

Particularly for several of its medical, financial services, and media company clients, "speech recognition was not where it needs to be," says Sonia Houmis, business development manager at TruTranscripts.

Though she would not elaborate on the specific speech recognition products used, Houmis notes that her firm has had a number of clients that were disappointed with transcripts produced through speech-to-text systems in the past year. Garbled phrases and unintelligible words, especially with multiple voices or the presence of ambient sound, were among the biggest problems cited with the technology. Also common were recognition errors associated with the large, specialized, and highly technical vocabularies used, especially in the medical profession.

"They had to type in what the voice recognition couldn’t get," Houmis says. "The technology’s not where it should be, and they were wasting time and money."

That's why a number of speech recognition vendors, such as Nuance Communications' Dictaphone Division and Philips Speech Processing, have created dictation and transcription software geared toward specific vertical markets, such as the medical and legal professions. These applications come with industry-specific vocabularies, templates, macros, dictionaries, and shortcuts built in, and because they are adaptive, they can automatically store new terminology, correct mistakes, and handle rephrasing and formatting issues based on individual preferences, officials at Nuance point out.

Many of the recognition problems could more likely be linked to the quality of the recordings themselves, something even officials at TruTranscripts concede. "Although digital technology has made recording easier, it is still possible to capture bad sound or audio with quality issues," says Sandra Arroyo, technical support supervisor at TruTranscripts. "These factors play a role in the interpretation of the dialogue and ultimately the accuracy of the transcript."

"A lot of it is in the placement of the recorder," Houmis adds. "People pretty much take a handheld recorder and place it on the desk or a chair next to them and don’t really think about how to capture the sound."

To that end, TruTranscripts recommends the following techniques when recording an interview or event:
• Use lapel, lavaliere, or other external microphones, such as omni-directional and table conference microphones.
• Set the recording speed to standard or SP for higher sound quality.
• Do not use voice-activated devices, as they often miss key pieces of dialogue.
• Pay careful attention to surrounding conditions, background noises, and ambient sounds.
• Test the voice dynamics of participants. Some individuals speak loudly in the beginning and then trail off, so adjustments might have to be made during the recording session.

"A lot of people feel intimidated when they are asked to wear a microphone for fear that wearing a microphone will change the dynamics of the interview, but it does help tremendously," Houmis concludes.
