Embedding Speech Biometrics on Devices

Speech identification can authenticate a person accurately not only under normal conditions but also under trying ones: if a person speaks quickly one time and slowly another, if the speaker has a cold or mild laryngitis, or if the speaker calls from a variety of devices. However, if the sound quality is too poor or the health condition is too severe, speech biometrics may not work.

“If your mother wouldn’t recognize your voice, neither would [any speech biometrics] system,” says Mia Puzo, manager of the biometric security team for Nuance. But, unlike examples shown in some spy movies, voice biometrics cannot be fooled by recordings of a person’s voice, according to Yaghi.

Strong Growth Expected

Due to the above advantages and the need to thwart fraudsters trying to exploit authentication systems, market intelligence firm Tractica is predicting major growth for the voice and speech recognition market over the next several years. Opus Research reports that 56% of firms in Singapore have either implemented or are implementing voice biometrics-based solutions. Companies in Spain and Portugal are using voice signatures for signing contracts. Tractica further predicts that the value of the voice and speech recognition market will rise from $1.1 billion last year to $6.9 billion by 2025.

In a report summary, Tractica explained that these technologies have “undergone a transformation in recent years” and pointed to advancements in AI technology that have helped to propel the rise of speech interfaces for smartphones and other devices. As principal analyst Mark Beccue explained, “It was the emergence of smartphones and cloud computing that was the real game changer for this market.”

Beyond those advances, biometric providers continue to refine their technologies to use ever smaller device footprints and to handle less-than-perfect audio conditions. Nuance’s Dragon Drive, for example, adds voice biometrics and advanced audio processing technologies to recognize and allow multiple passengers to interact simultaneously with the automotive assistant.

Financial services firms, healthcare providers, and retailers all want easy ways to authenticate customers that protect their accounts and their privacy, says Efrat Kanner-Nissimov, product marketing director for NICE. “Voice identification is very easy to use on a mobile device. It’s one of the easiest methods to use for fraud prevention.”

Earlier this year, Nuance launched its own fraudster identification solution, ConversationPrint, a “behavior-conversational” innovation that can identify fraudulent activity in real time based on the choice of words and patterns of speech or writing during an interaction with a human or virtual assistant.

Consumers are generally comfortable with voice biometrics as well. According to a study from the University of Texas at Austin’s Center for Identity, titled “Consumer Attitudes About Biometric Authentication,” about 75% of consumers are very comfortable or somewhat comfortable with voice biometrics. Obtaining the initial voiceprint from which to positively identify the customer in future communications is easy as well, according to Kanner-Nissimov.

Voice biometrics uses either text-independent or text-dependent enrollment. Text-independent enrollment allows the user to say any combination of words, as long as the sample is long enough to capture the necessary biometric information. At authentication, the user can again say any combination of words (it doesn’t need to be the same phrase). With text-dependent authentication, the user repeats a specific set of words at enrollment, then repeats another specific set (it may or may not be the same) for authentication.
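As a rough illustration of the two modes described above, the following Python sketch treats a voiceprint as a plain list of numbers. Everything in it is an assumption made for illustration: the feature vectors, the `enroll` and `verify` helpers, the cosine-similarity comparison, and the 0.8 threshold are hypothetical and do not reflect how Nuance, NICE, or any other vendor actually builds or matches voiceprints.

```python
import math
from typing import List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def enroll(samples: Sequence[Sequence[float]]) -> List[float]:
    """Average several feature vectors into one voiceprint template (assumed approach)."""
    if not samples:
        raise ValueError("at least one enrollment sample is required")
    count = len(samples)
    return [sum(s[i] for s in samples) / count for i in range(len(samples[0]))]


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so phrase comparison ignores formatting."""
    return " ".join(text.lower().split())


def verify(
    template: Sequence[float],
    sample: Sequence[float],
    threshold: float = 0.8,  # hypothetical value, not a vendor setting
    prompted_phrase: Optional[str] = None,
    spoken_phrase: Optional[str] = None,
) -> bool:
    """Text-independent when prompted_phrase is None; text-dependent otherwise."""
    if prompted_phrase is not None:
        # Text-dependent: the caller must also say the specific words requested.
        if spoken_phrase is None or _normalize(spoken_phrase) != _normalize(prompted_phrase):
            return False
    return cosine_similarity(template, sample) >= threshold


# Hypothetical feature vectors standing in for real audio features.
template = enroll([[1.0, 0.0, 0.2], [0.9, 0.1, 0.2]])
same_speaker = [1.0, 0.05, 0.25]
other_speaker = [0.0, 1.0, 0.0]

text_independent_ok = verify(template, same_speaker)
text_dependent_ok = verify(
    template, same_speaker,
    prompted_phrase="My voice is my password",
    spoken_phrase="my voice  is my PASSWORD",
)
wrong_phrase = verify(
    template, same_speaker,
    prompted_phrase="My voice is my password",
    spoken_phrase="open sesame",
)
impostor = verify(template, other_speaker)
```

In this sketch the only difference between the two modes is the extra phrase check; the voice comparison itself is the same in both cases.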

The text-dependent option can positively identify someone in as little as 1.5 seconds, according to Puzo.

Yet Nuance and the other speech biometrics providers continue to offer other ways to confirm the authentication of customers/employees via voice. Earlier this year Nuance added these methods:

Device ID to analyze audio to determine the device type and model used during the interaction.
ANI ID to analyze the metadata in a phone call to identify inconsistencies and detect phone number spoofing. The geo-location feature detects a caller’s geographic location via the phone network.
Synthetic ID to detect speech produced by software. The most recent release includes detection of a wide array of synthetic voice technologies, including those generated by deep neural networks.
Liveness ID to detect voice recordings through intra-session voice variation liveness testing.
Playback ID to detect voice recordings through audio anomalies created by the recording and playback process.
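One way to picture how checks like these might sit alongside a voiceprint match is the Python sketch below. It is a hypothetical illustration only: the `CallSignals` fields, the `decide` function, the three outcomes, and the 0.8 threshold are assumptions chosen for this example and are not Nuance's API or decision logic.

```python
from dataclasses import dataclass


@dataclass
class CallSignals:
    """Hypothetical per-call results, one field per kind of check listed above."""
    voice_score: float        # similarity to the enrolled voiceprint, 0.0 to 1.0
    device_recognized: bool   # device type and model match what is expected
    ani_consistent: bool      # call metadata shows no sign of number spoofing
    synthetic_speech: bool    # speech appears to be produced by software
    liveness_passed: bool     # voice varies within the session as live speech does
    playback_detected: bool   # audio shows recording and playback artifacts


def decide(signals: CallSignals, accept_threshold: float = 0.8) -> str:
    """Return "accept", "step_up", or "reject" (assumed outcomes for illustration)."""
    # Evidence of a spoofed voice overrides even a strong voiceprint match.
    if signals.synthetic_speech or signals.playback_detected or not signals.liveness_passed:
        return "reject"
    if signals.voice_score < accept_threshold:
        return "reject"
    # The voice matches, but the call context looks unusual: ask for another factor.
    if not signals.ani_consistent or not signals.device_recognized:
        return "step_up"
    return "accept"


clean_call = CallSignals(0.93, True, True, False, True, False)
replayed_call = CallSignals(0.97, True, True, False, True, True)
new_device_call = CallSignals(0.91, False, True, False, True, False)

clean_result = decide(clean_call)
replayed_result = decide(replayed_call)
new_device_result = decide(new_device_call)
```

The ordering reflects one possible design choice: spoofing evidence is checked first so that a replayed or synthetic voice is rejected no matter how closely it matches the enrolled voiceprint.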

Faster Password Resets

The advantages offered by speech biometrics encouraged German delivery provider Deutsche Post DHL to automate the password reset process for employees, a process that had previously taken more than four hours, according to VoiceTrust, which offers voice-activated password reset solutions.

In the case of Deutsche Post DHL, employees voluntarily enrolled their voices by calling a password reset phone number or by requesting an automated call-back from a webpage, and the system stored their unique voiceprints. Those voiceprints were then used to authenticate the employees automatically each time they required a password reset, making the process more secure and more convenient.

More than 25,000 employees enrolled in the solution, which, according to VoiceTrust, shortened password resets from 4.5 hours to 15 minutes and reduced help desk costs by 16%.

Some Drawbacks

Despite all of the positive attributes listed by Kanner-Nissimov and others, speech biometrics won’t work in all instances. Sometimes there is too much background noise, severe laryngitis, a poor connection, or other issues that render the speaker unrecognizable. Older versions of mobile devices, like flip phones, may not have the ability to download the necessary voice verification applications.

There are other challenges as well, according to Opus’s Sanjith: “Even though mobile biometrics may be implemented in centralized, hybrid, and on-device architectures, and utilize the on-device sensors—microphone, camera, fingerprint-sensor, etc.—the best performance is achieved when the biometric template is created for singular use (ongoing verification) on the same device” rather than a single template for multiple devices, which tends to lead to poor performance, he says.

So if a user had multiple smartphones, for example, she would need to enroll on each device separately, Sanjith says. “While the actual biometric enrollment process is generally not too onerous (e.g., say the same phrase three times, place your finger on the sensor three to five times, etc.), the pre-enrollment process of ensuring the bona fides of the user, which is referred to as the ‘ground truth,’ can add friction and reduce positive user participation.”

Additionally, there are the 25% of customers in the University of Texas at Austin study who aren’t too comfortable with voice biometrics. Permission is necessary to collect and store a person’s biometric information. While most customers tend to give that permission, some tech giants like Facebook, Google, Shutterfly, and Snapchat were sued last year when some consumers claimed their biometric information was handled illegally.

But the lawsuits and other challenges notwithstanding, voice biometrics providers and research firms like Tractica and Opus expect strong growth for the technology in the next several years.

Phillip Britt is a freelance writer based in the Chicago area. He can be reached at spenterprises@wowway.com.
