2023 Vertical Markets Spotlight: Speech Technology in Healthcare

Speech in healthcare is slowly starting to branch out from medical transcription—its primary usage—to provide outbound follow-up checks and even serve as biomarkers that can indicate disease progression or that some medical care is warranted.

Medical transcription is still the leading use of speech technology in healthcare and likely will be for many years to come. Research firm IMARC Group valued the global medical transcription market at $66.5 billion in 2021 and expects it to reach $96.5 billion by 2027, growing at a 6.4 percent compound annual rate. It includes in its numbers not just the speech recognition technology but the supporting electronic health record systems, picture archiving and communications systems, radiology information systems, and more.
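The growth figures above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the 6.4 percent rate compounds annually over the six years from 2021 to 2027.

```python
# Sanity check of the market figures quoted above.
# Assumption: annual compounding over six periods (2021 -> 2027).

def project_value(start, rate, years):
    """Compound a starting value at a fixed annual rate."""
    return start * (1 + rate) ** years


def implied_cagr(start, end, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end / start) ** (1 / years) - 1


# Figures from the article, in billions of U.S. dollars.
projected_2027 = project_value(66.5, 0.064, 6)
growth_rate = implied_cagr(66.5, 96.5, 6)

print(f"Projected 2027 market: ${projected_2027:.1f} billion")
print(f"Implied annual growth: {growth_rate:.1%}")
```

Under that assumption, both results round to the figures cited in the report, so the start value, end value, and growth rate are consistent with one another.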

The report notes that with advancements in technology, healthcare providers are opting for medical transcription to optimize operational efficiency, improve quality of patient care, and minimize capital expenditure while helping them better manage and maintain patient care and treatment records.

Besides this, the report says, medical transcription aids in reducing staff burden by eliminating the need to file patient forms, discharge summaries, and operation and progress notes and creates an organized and appropriate patient history to help doctors analyze current physical conditions and prepare ideal treatment plans.

Medical transcription technology has been around for some time, but it’s still in its infancy, according to some experts, who see it evolving significantly as artificial intelligence becomes more embedded.

“Medical transcription continues to get better and better,” says Daniel Ziv, vice president of speech and text analytics global product strategy at Verint. Because of that, more healthcare providers are incorporating it into day-to-day operations.

Baljit Singh, cofounder of Simbo, a provider of conversational AI solutions for healthcare, says medical providers are increasingly using advanced transcription technology with embedded AI because they lack the time or resources to do transcriptions themselves. Another huge selling point is the ability to record provider-patient conversations in real time, meaning the provider doesn’t need to recall and summarize everything for documentation after appointments or at the end of the day, he adds.

“We are using conversational AI technology to reduce the burden on physicians, clinicians, and nurses in healthcare,” Singh says. “Our natural language understanding is very strong from the perspective of understanding the clinical context of what is happening in the conversation. We are looking at all possible ways where voice conversational AI can be used to reduce the burden on the staff.”

Singh adds that the emergence of large language models (LLMs) and generative AI will enhance speech tech’s benefits to healthcare providers. Both of these are so new that the full range of their capabilities has yet to be revealed.

This past March, Nuance Communications launched Nuance Dragon Ambient eXperience (DAX) Express, a workflow-integrated, fully automated clinical documentation application that combines conversational and ambient AI with OpenAI’s newest LLM, GPT-4. Medical professionals can use it to draft clinical notes for immediate clinical review and completion after each patient visit in the exam room or via telehealth.

DAX, which is part of Nuance’s Dragon Medical portfolio, launched in 2020. The DAX Express version makes AI available to more than 550,000 Dragon Medical users. In addition to GPT-4, DAX Express also leverages Microsoft’s Azure cloud platform.

“Nuance and Microsoft came together with the goal of helping to digitally transform healthcare, and today we are marking the next step forward in the ongoing evolution of AI-powered solutions for overburdened care providers,” said Mark Benjamin, Nuance’s CEO, in a statement. “We’ve taken the power and advanced reasoning capabilities of GPT-4 and integrated it into our proven outcomes-focused AI technologies in a tested and responsible way. Our state-of-the-art blend of conversational, ambient, and generative AI will accelerate the advancement of the care delivery ecosystem beyond what Nuance or Microsoft could have achieved separately.”

And in April, 3M Health Information Systems began working with Amazon Web Services to bring AWS machine learning and generative AI services, including Amazon Bedrock, Amazon Comprehend Medical, and Amazon Transcribe Medical, to its M*Modal ambient clinical documentation and virtual assistant solutions.

Also in April, Microsoft and Epic expanded a long-standing partnership to develop and integrate the Azure OpenAI Service with Epic’s electronic health record software.

Another recent development is the birth of speech-based documentation solutions for specializations within the larger medical field. Several solutions have been available for some time for radiology. Nuance, for example, introduced the PowerScribe One radiology reporting platform in 2018. 3M Health Information Systems has also offered M*Modal Fluency for Imaging for some time.

Other areas of documentation specialization have included oncology and ophthalmology. And just last month, HCA Healthcare and Augmedix started deploying AI-enabled medical dictation software for acute-care settings.

Speech for Outbound Calls

Speech technology is also helping medical professionals with the after-visit patient experience by automating the follow-up phone calls to ensure patients follow doctors’ instructions, fill prescriptions, and aren’t experiencing problems that would necessitate their return to the care facility.

“A nurse may need to make 20 of these calls a day,” Singh says, noting that automating the follow-up calls and capturing patient responses enables the nurse to focus on more critical duties rather than spending much of the day on the phone.

Patients who received phone calls two days after discharge from a hospital emergency room were less likely to need a return trip to the hospital, according to a study by researchers from the University of California San Francisco and the University of Colorado.

Similar technology is also helping healthcare providers uncover serious health conditions.

Like many other healthcare providers, Wolters Kluwer Health has offered outgoing messages to check on patients’ health, a critical part of ongoing care for those who have recently been discharged from the hospital or have health conditions such as heart disease or diabetes.

Wolters Kluwer Health had been recording responses from patients through its Emmi interactive voice response system. The healthcare information, services, and software provider is refining its IVR to capture the audio of patient responses and analyze them to identify voice biomarkers to detect or follow the progression of certain diseases and conditions. Those recordings could also be used for sentiment analysis to inform care teams about what’s happening with the patient beyond what just words could tell them.

When the voice biomarker technology is launched later this year, it will give healthcare providers yellow flags and red flags to help indicate that patients’ conditions (including their mental health) might be degrading and human follow-up could be necessary, says Freddie Feldman, voice user interface design director at Wolters Kluwer Health.

“We’re not saying someone is depressed,” Feldman says. “We are saying that there’s an elevated level of stress in their voice that indicates that they may be depressed.”

Speech analytics is also being used to diagnose physical and cognitive conditions and track their progression. The technology uses just a few seconds of speech data and measures qualities like volume, tone, pitch, speaking rate, articulatory precision, word search time, and other factors.

“As cognitive disease develops, subtle changes in speech occur, which can be precisely detected using our powerful speech analytics,” Judy Smythe, CEO of Aural Analytics, one of the companies leading in this area, said recently. Other speech players that are heavily invested in this area include Sonde Health and Winterlight Labs.

“Today’s healthcare companies are realizing how vocal biomarkers can engage people earlier in their health. The data and insights found in voice can power health monitoring and patient stratification so issues can become apparent well before a costly medical event occurs,” said David Liu, CEO of Sonde Health, in a statement.

Among the conditions that such speech biomarkers can help detect or track are Parkinson’s, Lou Gehrig’s disease, Alzheimer’s, dementia, stroke, migraine, schizophrenia, chronic obstructive pulmonary disease (COPD), and COVID-19. Speech analytics is even being used to aid in clinical trials of some experimental medications and to detect concussions among players in contact sports like football and hockey.

And one more innovative use of speech in healthcare settings follows a trend that is taking root in the consumer electronics space—voice control.

Clarius Mobile Health, a provider of handheld ultrasound systems, in early May added Voice Controls to enable clinicians to control multiple imaging functions with voice commands. With the new AI-powered Voice Controls, users don’t have to put down the Clarius ultrasound scanner mid-procedure or ask an assistant to adjust the image. Users can speak commands to freeze the image, adjust gain and depth, capture images and videos, and switch imaging modes.

“The addition of the Voice Controls feature is brilliant,” said Dr. John Arlette, a dermatologist, in a statement. “Procedures are faster, and I can focus on the patient instead of being distracted by mechanical adjustments to the settings on my device.”

And in the end, patients benefit when caregivers can spend more time with them and less time doing mundane tasks. Speech tech, in its many forms, is helping them do so, and it’s just starting to scratch the surface of what’s possible.

Phillip Britt is a freelance writer based in the Chicago area. He can be reached at spenterprises1@comcast.net.