Vertical Markets Spotlight: Speech in Healthcare

The National Library of Medicine has said that speech technology is not being appropriately explored in healthcare, even though tech innovation, especially in the area of artificial intelligence, is providing unprecedented opportunities for industrywide transformation.

It’s not that speech technology isn’t being used by healthcare providers; the issue is that right now it is not being used enough or to its full potential.

“Speech technology and voice recognition technology have really picked up as a way to augment and enhance healthcare interactions,” says Rebecca Wettemann, CEO and principal of Valoir. Speech technology, she says, can record information with far greater precision than a physician’s notes or memory, which is critical for short- and long-term patient health.

DeepScribe, a company that offers AI-based medical transcription services, reports that physicians spend nearly 38.5 percent of their clinical hours on documentation, which 59 percent of clinicians cited as a primary source of burnout. Additionally, healthcare providers needlessly spend $102 billion per year, and lose revenue, by having highly skilled workers spend nearly half their day on basic data entry.

Healthcare providers that have installed Nuance Communications’ Dragon Medical One and Dragon Ambient eXperience (DAX) transcription and documentation solutions have reportedly seen an average seven-minute decrease in the patient encounter time, a 50 percent reduction in documentation time, and an 83 percent improvement in document quality, according to the company.

Ken Harper, Nuance’s vice president and general manager of healthcare virtual assistants and ambient clinical intelligence, also notes that the time savings and better focus on care have resulted in a 79 percent reduction in caretaker fatigue and burnout, which became a critical issue during the COVID-19 pandemic.

University of Michigan Health-West, based in Grand Rapids, Mich., is using Nuance Dragon Medical One and Dragon Ambient eXperience to leverage conversational AI and ambient clinical intelligence. The technology is designed to automatically capture important conversations, enabling clinicians to engage in natural conversation with patients and their family members while a dedicated mobile app securely captures the conversation at the point of care. No explicit voice commands are required.

The technology converts these conversations into comprehensive clinical notes tailored to each specialty while adhering to strict documentation standards, enabling appropriate coding. It also integrates with electronic health record systems to pull in patient context, deliver the final notes, and enable care teams to complete a growing list of tasks in real time with virtual assistants.

During University of Michigan Health-West’s pilot of the Nuance technology, which included 15 primary care clinicians, physician note-taking time was reduced from 35 minutes to 11 minutes per patient per day. As a result, patient wait times, which rose to around 20 minutes during the pandemic, were reduced to 10 minutes.

After piloting the technology, University of Michigan Health-West deployed it across the entire department of more than 170 primary care providers, including all of its family care, internal medicine, pediatrics, and behavioral health practices.

Currently, medical transcription is the biggest use of speech technology in the industry, relieving providers of the tedium of note-taking during conversations with patients and family members so they can spend more time on patient care. Modern systems not only feature automated speech recognition that takes notes with greater accuracy, but they can also gather much more complete data that can be cross-applied automatically to help doctors treat patients with similar conditions, Wettemann says.

But even then, the use of speech technology for medical transcription is still in its early stages of adoption, according to Wettemann. The industry has just begun to scratch the surface of where speech can go.

That growth potential was a big reason that Microsoft paid $19.7 billion to acquire Nuance in a deal announced in 2021. The deal took nearly a year to close as government regulators around the world scrutinized it.

Nuance will collaborate with Microsoft to accelerate the Microsoft Cloud for Healthcare initiative that was announced in 2020. The plan is to combine the Microsoft Cloud for Healthcare with Nuance’s medical dictation and transcription tools.

“Completion of this significant and strategic acquisition brings together Nuance’s best-in-class conversational AI and ambient intelligence with Microsoft’s secure and trusted industry cloud offerings,” said Scott Guthrie, executive vice president of the Cloud and AI Group at Microsoft, in a statement. “This powerful combination will help providers offer more affordable, effective, and accessible healthcare and help organizations in every industry create more personalized and meaningful customer experiences.”

The technology doesn’t aid only the doctors but also the support staff, says Ed Miller, CEO of LumenVox. “It’s one of the hardest domains, not only because of the importance of what has to be transcribed but also because it has to be transcribed accurately. The terms that [medical] specialties use are quite complex Latin derivatives. Small changes can mean mistakes.

“Voice is critically important for providing more efficient communication across healthcare providers’ systems, for all people involved in the care of a person. You can imagine if you were going to type in some of these very long, complex names, you’re going to need to have a spell checker and other associated technologies,” Miller adds.

Miller also argues that speech technology is “sticky,” meaning that once doctors start using it, they are unlikely to stop.

“The good news for patients is that more information is being captured digitally; the bad news for the clinicians is that it is complex. There are a lot of things that need to be captured,” Harper says. “There’s a lot of information that has to be captured when patients first come in, whether they are seeing their primary care physician or they are using a specialist for something that is unique to them.”

Historically, electronic health record vendors have needed to design keyboard- and mouse-friendly interfaces, which severely limited the ability to properly capture complex information, Harper explains. “The amount of information a provider has to put into the electronic health record is immense.”

Speech recognition technology makes the process much more efficient, particularly now that such technology is cloud-based, enabling better systemwide use and collaboration, Harper says. “Historically, the way speech has been used in healthcare is as a one-person system, meaning that the physician speaks what he or she wants into the health record. What limits it is that you still have to speak explicitly and you still have to speak directly to the computer. It’s like redoing your day’s work.”

That is no longer the case with technology that can capture speech across a number of devices.

Regulatory Concerns

Speech technology could have more healthcare applications, but the stringent data protection safeguards of the U.S. Health Insurance Portability and Accountability Act (HIPAA) have held it back to some extent, says Clifton Wiser, Voice Foundry vice president of solution architecture.

“Healthcare is a unique space just because it’s littered with [personally identifiable information],” Wiser says. “We’ve found a tremendous amount of attraction lately with some of the speech and comprehension engines.”

The natural language technology behind current speech and comprehension engines could identify personal information and then automatically redact that information from the record, increasing compliance with HIPAA regulations, according to Wiser.

“But we’re just starting where that technology can make real progress in being able to identify sensitive information and redact it to lower your risk,” Wiser says.

Wiser says that once doctors can be completely confident that personal information is properly redacted, speech technology can have much broader use.

The technology is not quite there yet, he says, noting that such capability is still considered bleeding edge and will likely take a while to advance to the point that the redaction capabilities meet full HIPAA requirements.
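The identify-and-redact idea Wiser describes can be illustrated with a deliberately simple sketch. The Python below is an assumption-laden toy, not how any vendor's comprehension engine works: it swaps in fixed regular expressions where production systems would use trained language models, the pattern set and the `redact` function are hypothetical, and nothing here would by itself satisfy HIPAA requirements.

```python
import re
from typing import Dict, Tuple

# Hypothetical patterns for illustration only. Real redaction engines rely on
# trained language models and cover many more identifier types (names,
# addresses, record numbers, and so on).
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def redact(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace each matched identifier with a bracketed label.

    Returns the redacted text and a count of replacements per label,
    so a reviewer can see what was removed.
    """
    counts: Dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


# A made-up transcript fragment; none of these values are real.
note = "Patient DOB 04/12/1961, SSN 123-45-6789, callback 555-867-5309."
redacted, counts = redact(note)
print(redacted)
print(counts)
```

A pattern list like this misses anything it was not written for, such as a name spoken mid-sentence, which is one reason the approach Wiser describes depends on language understanding rather than fixed rules.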

Beyond medical transcription, Wettemann and Miller both see speech eventually helping to improve diagnosis and treatment of certain medical conditions by matching people with similar symptoms. Already, speech analysis is being used to help identify people with some conditions, including mental health issues, cognitive decline, Alzheimer’s disease, respiratory illnesses like COVID-19, and elevated stress levels, using certain biomarkers and speech characteristics like volume, tone, and pitch.
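Speech characteristics such as volume and pitch can be measured directly from an audio signal. The sketch below is a minimal illustration under stated assumptions: it uses a synthetic 200 Hz tone rather than real speech, takes root-mean-square amplitude as a stand-in for volume, and estimates pitch with a basic autocorrelation search. The function names and parameter defaults are hypothetical, and clinical voice-biomarker systems are considerably more sophisticated.

```python
import math


def rms_volume(samples):
    """Root-mean-square amplitude, a simple proxy for loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def estimate_pitch_hz(samples, sample_rate, min_hz=80.0, max_hz=400.0):
    """Estimate the fundamental frequency by autocorrelation.

    Tries every lag corresponding to a pitch between min_hz and max_hz
    and returns the frequency whose lag makes the signal best match a
    shifted copy of itself.
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic test signal (an assumption for the demo): a 200 Hz sine wave
# at half amplitude, sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
        for n in range(2048)]

print(f"volume (RMS): {rms_volume(tone):.3f}")
print(f"pitch (Hz):   {estimate_pitch_hz(tone, SAMPLE_RATE):.1f}")
```

Measurements like these are only raw inputs. Linking them to a specific condition requires validated models and clinical study, which this sketch does not attempt.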

“From what I’ve heard, they’ve already done some amazing things in focusing on particular strategies, like ophthalmology, oncology, radiology,” Miller says. “Some of the latest and greatest of what they are doing is now integrated into the Microsoft stack. It will be really important to see how Microsoft leverages some of the high-tech capabilities of the Azure cloud and all of the collaboration tools that they have to provide a more complete solution for healthcare services. It just makes for a great business.”

Phillip Britt is a freelance writer based in the Chicago area. He can be reached at spenterprises1@comcast.net.
