Medical Diagnosis Applications Are the Frontier of Speech Tech


To appreciate how voice can be used in medical diagnosis, we first need to understand biomarkers and digital biomarkers. A biomarker is a biological characteristic that serves as an indicator of a particular condition, or a biological state. Examples of biomarkers include blood glucose levels for diagnosing diabetes and protein markers in urine for monitoring kidney function. A digital biomarker is a digital representation of a biomarker, and is often generated from data collected through digital devices. Examples of digital biomarkers include heart rate variability measured through wearable devices for assessing cardiovascular health, and smartphone accelerometer data used to monitor physical activity and detect falls.

In short, biomarkers are biological characteristics, while digital biomarkers are digital representations of them. Traditional biomarkers often require invasive procedures or laboratory tests; digital biomarkers can be collected through non-invasive methods like wearable devices or smartphones. Digital biomarkers offer advantages in terms of ease of collection, real-time monitoring, and potential for large-scale data aggregation.

The human voice holds great potential as a digital biomarker, as articulated speech contains rich contextual information. For example, changes in tone, pitch, and cadence can indicate mental states like stress, anxiety, or excitement. Similarly, voice quality parameters can reveal subtle changes in physiological states like fatigue, sleepiness, or hydration levels.

The production of human speech draws on several different biological systems in the body. Changes and anomalies in how these systems function alter the voice, often in ways too subtle for the untrained ear to detect. But machine learning algorithms trained on a large corpus of voice data can pick up these changes, and this paves the way for voice-based diagnosis.
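To make the idea of a measurable voice feature concrete, here is a minimal Python sketch that estimates pitch (fundamental frequency) from a short audio frame using autocorrelation. It is an illustration only, not a method drawn from any study mentioned here: the function name `estimate_pitch_hz`, the 75 to 400 Hz search range, and the synthetic test tone are all assumptions, and real systems typically combine many such features before any model is trained.

```python
import numpy as np

def estimate_pitch_hz(signal, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency of a voiced frame by autocorrelation.

    fmin and fmax bound the search and are assumed values for adult speech.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove any DC offset
    # Autocorrelation for non-negative lags only
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(sample_rate / fmax)  # shortest period considered
    lag_max = int(sample_rate / fmin)  # longest period considered
    # The lag with the strongest self-similarity approximates the pitch period
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag

# Synthetic stand-in for a voiced frame: a 120 Hz tone plus one harmonic
sample_rate = 16000
t = np.arange(0, 0.05, 1 / sample_rate)  # 50 ms frame
frame = np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 240 * t)
pitch = estimate_pitch_hz(frame, sample_rate)
print(f"Estimated pitch: {pitch:.1f} Hz")
```

On this synthetic frame the estimate should land close to 120 Hz. Tracking how such a value drifts across recordings over time is one simple example of the kind of signal a diagnostic model might learn from.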

As a biomarker, voice is extremely sensitive: it can reflect even slight changes in a person’s physiological or psychological state. This makes it useful for detecting early warning signs of conditions like depression, anxiety disorders, or neurological diseases, which in turn allows for earlier diagnosis and treatment. The approach has shown promise in diagnosing neurodegenerative disorders such as Parkinson’s and Alzheimer’s. Several large-scale research efforts, such as the National Institutes of Health’s Bridge2AI program, are currently studying voice-based diagnostic applications.

In the future, voice-based screening is expected to become another triage tool for healthcare practitioners. It can serve patients as a self-monitoring tool and clinicians as a way to track disease progression. And as an early diagnosis tool, voice could be hugely beneficial: it could free up lab testing and other resources and ease the burden on the healthcare system.

So voice holds great potential as a digital biomarker, but before it can be scaled to become part of everyday clinical practice, several research, ethical, and practical considerations must be addressed:

  • Standardization: Establishing standardized protocols for data collection, processing, and analysis to ensure consistency across studies.
  • Validation: Conducting large-scale validation studies to confirm the effectiveness of voice-based biomarkers in various populations and conditions.
  • Data privacy and security: Protecting patient data collected through voice-based biomarkers by ensuring secure storage, transmission, and processing.
  • Informed consent: Obtaining informed consent from patients before collecting and using their voice data for diagnostic or monitoring purposes.
  • Bias and equity: Ensuring that voice-based biomarkers are not biased toward specific demographics or populations.
  • Regulatory compliance: Complying with relevant regulatory requirements, such as those related to patient data privacy and informed consent.
  • Clinician training: Educating clinicians on how to interpret voice-based biomarkers and integrate them into clinical practice.
  • Integration with existing tools: Developing methods for integrating voice-based biomarkers with existing diagnostic tools and clinical workflows.
  • Patient engagement: Ensuring that patients are informed and engaged in the use of voice-based biomarkers, particularly if they will be used to monitor their condition or treatment.
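The validation and bias items above lend themselves to a simple check: compare how often a screening tool catches true cases in each demographic group. The Python sketch below is one hypothetical way to do that. The function name `sensitivity_by_group`, the group labels, and the records are all made up for illustration and do not come from any real study.

```python
from collections import defaultdict

def sensitivity_by_group(records):
    """Return, per group, the share of true cases that the tool flagged.

    records: iterable of (group, has_condition, flagged) tuples.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for group, has_condition, flagged in records:
        if has_condition:
            positives[group] += 1
            if flagged:
                true_positives[group] += 1
    # Only groups with at least one true case appear in the result
    return {g: true_positives[g] / positives[g] for g in positives}

# Toy records, invented purely to show the calculation
records = [
    ("group_a", True, True), ("group_a", True, True),
    ("group_a", True, False), ("group_a", False, False),
    ("group_b", True, True), ("group_b", True, False),
    ("group_b", True, False), ("group_b", False, True),
]
rates = sensitivity_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: sensitivity {rate:.2f}")
```

A large gap between groups, as in this toy data, would be a signal to collect more representative voice samples before any clinical use.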

We can expect voice-based diagnosis to find many uses as it becomes part of personalized and precision medicine and an important diagnostic tool.

Kashyap Kompella, CFA, is an industry analyst, author, educator, and adviser. He is the founder of the AI advisory outfits RPA2AI Research and AI Profs and a For Humanity Certified AI Auditor.
