Speech Technology Magazine

 

Voice Biometrics Needs to Adapt

Accuracy has been a top-of-mind concern
By Leonard Klie - Posted Aug 29, 2017

As companies continue to increase their use of voice biometrics as a way to identify and authenticate callers, accuracy has been a top-of-mind concern. False readings can either deny access to legitimate customers or grant access to fraudsters, either of which can have serious impacts on businesses and their customer relationships.

Companies have known for some time that certain factors, such as mood, surroundings, and colds, can affect the accuracy of such solutions, but new research by Pindrop, an Atlanta-based provider of enterprise solutions to secure phone and voice communications, found another factor that can have an even more dramatic effect on how well voice biometrics works: the aging process.

Pindrop recently completed a two-year study in which it found that voice biometric accuracy declines sharply over time, with the equal error rate (EER) nearly doubling in two years as speakers age.
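For readers unfamiliar with the metric, EER conventionally stands for equal error rate: the operating point at which impostors are falsely accepted as often as legitimate callers are falsely rejected. The Python sketch below shows one simple way such a figure could be estimated from match scores. It is an illustration only, not Pindrop's method; the function name `equal_error_rate` and the sample scores are hypothetical.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the equal error rate from two lists of match scores.

    Higher scores are assumed to mean a closer match. Every observed
    score is tried as a decision threshold, and the threshold where the
    false accept rate (FAR) and false reject rate (FRR) are closest
    determines the estimate.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = None
    eer = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # Legitimate callers scoring below the threshold are rejected.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # Impostors scoring at or above the threshold are accepted.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Hypothetical scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(genuine, impostor)
```

A lower EER means the system separates legitimate callers from impostors more cleanly. If a speaker's newer recordings score lower against an old voice print, the genuine scores move toward the impostor scores and the EER rises.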

Elie Khoury, principal research scientist at Pindrop, led the study, which analyzed former president Barack Obama’s speech patterns in recordings of more than 400 speeches during his eight years in office and found a significant degradation in voice biometrics systems’ ability to recognize Obama’s voice. During that time, the voice accuracy rating dropped by 23 percent, and at the high accuracy threshold that most banks use to identify incoming callers, most systems would have started rejecting the former president after just two years.

In the same test run on recordings of former president George W. Bush, recognition accuracy degraded even faster.

The Pindrop research team also tracked 122 speakers in six languages (English, French, German, Spanish, Dutch, and Italian) over two years and found that error rates doubled from 4 percent to 8 percent in that time. Degradation was already noticeable after just four months.

This could be problematic, given that 48 percent of people call into their banks’ call centers only once every eight months, according to the research.

The study also found that the female voice does not change as much as the male voice, and voices of men age 60 and over change the most dramatically. Additionally, people’s voices do not all age at the same rates.

The research, while alarming, isn’t much of a surprise. As people age, so too do the muscles, vocal cords, and other structures that they use to speak. Other factors, such as emotional states, stress levels, hydration, overall health, activity levels, and even the time of day can all affect voice pitch and speed and, therefore, the accuracy of voice biometrics systems. External factors also play a role, as callers could sound different on landlines or mobile phones, for example.

“There are lots of things that could cause variances in an individual’s voice,” admits Dan Miller, founder and lead analyst at Opus Research.

For this reason, he and others recommend biometrics approaches that rely on more than one factor. Among other modalities, irises and fingerprints stay the same over the course of people’s lives.

But the fact that voices change with age shouldn’t be a deal breaker for companies looking to add voice biometrics to their contact center systems. Voice biometrics can be calibrated to take into account the passage of time.

“There are ways to build more dynamic voice templates that are based on both physical and behavior characteristics,” Miller says. “You can anticipate certain changes to captured utterances and still retain high accuracy scores based on them.”

Another solution could be to have customers update their voice prints periodically.

Beyond that, though, there are other things that limit the effects of the aging process. “There is a lot of information stored in voice print data that does not change,” Miller says.

As proof, Nuance Communications earlier this year conducted its own research on voice biometrics systems using recordings of actors Arnold Schwarzenegger and Morgan Freeman, comparing samples that were recorded more than 30 years apart. Despite the passage of that much time, systems were able to identify the samples as coming from the same person in both cases.

Brett Beranek, a director of product strategy at Nuance, said in a recent blog post that systems today can achieve this level of sophistication due to smart adaptation, automatically adjusting individual voice prints on file with each successful authentication to the system.
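The idea of adjusting a stored voice print after each successful authentication can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Nuance's implementation: it assumes a voice print is a numeric feature vector, that matching uses cosine similarity against a fixed threshold, and that adaptation is a simple weighted blend. The class `AdaptiveVoicePrint` and its `threshold` and `rate` parameters are hypothetical.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means identical direction)."""
    a, b = _normalize(a), _normalize(b)
    return sum(x * y for x, y in zip(a, b))


class AdaptiveVoicePrint:
    """Hypothetical voice print that drifts toward recent accepted samples."""

    def __init__(self, enrolled, threshold=0.8, rate=0.1):
        self.template = _normalize(enrolled)  # stored voice print
        self.threshold = threshold            # minimum score to accept (assumed)
        self.rate = rate                      # weight given to each new sample (assumed)

    def authenticate(self, sample):
        """Return True if the sample matches; adapt the template only on success."""
        score = cosine_similarity(self.template, sample)
        if score < self.threshold:
            return False  # rejected samples never alter the stored print
        sample = _normalize(sample)
        blended = [
            (1 - self.rate) * t + self.rate * s
            for t, s in zip(self.template, sample)
        ]
        self.template = _normalize(blended)
        return True


# Hypothetical two-dimensional "voice" vectors, for illustration only.
voice_print = AdaptiveVoicePrint([1.0, 0.0])
accepted = voice_print.authenticate([0.9, 0.1])  # close to the enrolled voice
rejected = voice_print.authenticate([0.0, 1.0])  # very different voice
```

Updating only on accepted samples is the key design choice in this sketch: the stored print can follow gradual changes in a caller's voice, while a rejected attempt leaves it untouched. How well such a scheme keeps up depends on how often the customer calls and how much weight each new sample receives.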

“Age may be a very sensitive topic, requiring tact when the subject arises in conversation, but to an adaptable voice biometric engine, your voice is wonderful no matter what your age,” he concluded.
