Speech technology sees government growth spurt.
Contrary to what people may think, the government does hear us, thanks to speech technology advances. At the federal, state, and local levels, applications such as voice biometrics and speech recognition are showing up in a range of deployments, from Medicare compliance to Hurricane Sandy notifications. "There are many reasons to compel government agencies to make improvements, and the easy and obvious first level of improvement is about engaging with speech," says Scott Fischer, chief operating officer at MicroAutomation, a Voxeo preferred partner.
Thanks to a slew of ever-changing laws concerning eligibility, liability, compliance, and confidentiality, speech technology in the government healthcare sector is hot.
The most significant U.S. legislation to affect speech technology in the next year is likely to be the Health Information Technology for Economic and Clinical Health (HITECH) Act, enacted in February 2009, which mandates that all healthcare providers switch to electronic health records (EHRs), also known as electronic medical records (EMRs), by 2014.
The federal government will offer tax incentives under the Medicare EHR Incentive Program to providers that make the switch and can prove meaningful use. Under the program, which falls under the HITECH Act, organizations that qualify and achieve meaningful use by 2014 will be eligible for incentive payments, but those that haven't complied by 2015 will face penalties.
"With meaningful use, there are government incentives [and] government penalties for hospital systems to deploy EMRs, but you can't just install software," says Peter Mahoney, chief marketing officer at Nuance Communications. "Medical facilities actually have to prove that they're being used in a meaningful way. Dragon is one of the key technologies that is used to help hospitals—not only in EMR systems to get them deployed, but to get doctors to actually use them."
Another growth area comes from a change in the government's Medicare program, involving accountable care organizations (ACOs)—organizations formed by groups of doctors and other healthcare providers to coordinate care for people with Medicare. Currently, participation in an ACO is voluntary for providers. The Medicare Shared Savings Program (MSSP) and other initiatives related to ACOs are made possible by the 2010 Affordable Care Act. ACOs serve more than a million people with Medicare in 40 states and Washington, D.C.
Mahoney explains that hospitals are transitioning from a model of being paid for delivering a service, such as a test, to being paid based on patient outcomes. "To do this, you need to do a really good job of tracking patient information and categorizing it, which is driving significant deployment of natural language," he says. "Healthcare is somewhat ahead of the curve in using speech technology in some areas because of the necessity across all healthcare, whether it's government or not. We've seen high adoption here."
There are many healthcare initiatives, all with some common considerations: the need to protect privacy while providing secure access, and the need to provide an audit trail that meets industry regulatory standards, according to VoiceVault. The company points out that collecting traditional pen-and-paper signatures is not only time-consuming but costly as well, whereas obtaining voice signatures is a relatively simple process and is as legally binding as a written contract.
Voice biometrics can be used to facilitate compliance, secure access to EMRs, reduce costs, and provide iPad access to EMRs for medical practitioners in a secure manner.
VoiceVault helps companies use voice biometrics to generate legally binding e-signatures over the phone to efficiently add security to the health insurance application process. The company's e-signatures are recognized as legally binding under several government acts, such as the E-Sign Act, HIPAA/CMS, and FDA 21 CFR Part 11. Already, VoiceVault has amassed some large clients in the healthcare industry, including WellPoint/Anthem, Aetna, and Blue Cross/Blue Shield of Kansas City.
"Traditionally, health insurers are a little behind the times from a technology standpoint, but we really see that they are engaging in voice biometrics," says Julia Webb, executive vice president of sales and marketing at VoiceVault. "I look at it as a great install base, where we can extend the enrolled voiceprints to other applications, such as information about benefits or premiums. It's also very relevant right now because of healthcare reform. The healthcare providers we've been talking to are saying, 'How are we going to deal with this major influx of applications from people who are moving policies?"
VoiceVault reports that using voice biometrics decreases mailing and back-end costs and increases closure rates. The company says that its VoiceSign solution has enabled organizations to increase telephone sales closure rates by more than 20 percent and to cut the administrative costs of the paper trail that accompanies handwritten signatures by up to 80 percent.
"With changes in healthcare laws, we see people looking for individual medical policies moving away from small group plans. Voice signature solutions allow customers to apply over the phone," Webb says, adding that the efficient verification process enables organizations to handle a large number of applications and policy changes.
Speech Arms Users
Voice biometrics is also widely used for surveillance and identification in military and police organizations, with users including agents, soldiers in the field, and forensic specialists.
In December 2012, Speech Technology Center (STC), based in Russia, rolled out the world's first bimodal system on a nationwide level in Ecuador. The solution combines facial and voice identity biometrics that are used for criminal forensics. Voice samples and photographs of suspected criminals can be placed in the database and used for comparison purposes to detect matches among suspects. The state-of-the-art system took about a year to roll out.
The company's voice biometrics solution has a reliability rate of 97 percent, and when voice is combined with facial biometrics, the rate is closer to 100 percent. The technology's algorithms are able to deliver reliable results even if a face has been altered.
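One common way to combine two biometric modalities is score-level fusion, where each engine returns a match score and the scores are blended before a decision is made. The sketch below is only an illustration of that general idea, not STC's algorithm: the function names, the equal weighting, and the 0.8 decision threshold are all assumptions made for the example.

```python
from typing import Optional


def fuse_scores(voice: Optional[float] = None,
                face: Optional[float] = None,
                voice_weight: float = 0.5) -> float:
    """Blend per-modality match scores (each assumed to lie in [0, 1]).

    Hypothetical weighted-sum fusion; falls back to whichever single
    modality is available when only one score is supplied.
    """
    if voice is None and face is None:
        raise ValueError("at least one modality score is required")
    if voice is None:
        return face
    if face is None:
        return voice
    return voice_weight * voice + (1.0 - voice_weight) * face


def is_match(score: float, threshold: float = 0.8) -> bool:
    """Accept the identity claim when the fused score reaches the threshold."""
    return score >= threshold


# Voice only, face only, or both modalities together.
voice_only = fuse_scores(voice=0.9)
face_only = fuse_scores(face=0.7)
combined = fuse_scores(voice=0.9, face=0.7)
```

In practice the weights and threshold would be tuned on labeled data, and a weak score in one modality could be offset by a strong score in the other.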
"The entire system is more reliable because you're not using just one form of identification," says Alexey Khitrov, president of SpeechPro, the U.S. subsidiary of STC. "The idea was to make the system as usable as possible. You can use just the face, the voice, or both."
"Voice and face identification are providing new and valuable investigative capabilities," said Mikhail Khitrov, chief executive officer of STC, in a statement. "The biometric technologies providing the foundation of the system have proven to be reliable and robust in even the most challenging conditions. As biometric technologies mature, we're seeing a growing demand for these kinds of tailored voice and multimodal biometric solutions—not just in Latin America, but in the global marketplace."
In June 2010, SpeechPro launched the world's first nationwide automatic voice identification system for the government of Mexico. That deployment helped more than 250 law enforcement agencies throughout Mexico collect, manage, and search hundreds of thousands of voiceprints in their fight against crime. The company is reportedly negotiating similar deployments elsewhere in Latin America, Asia, and Europe.
For now, though, the focal point is Latin America. "In Latin America in particular, they are dealing with crimes like kidnapping, drug trafficking, and organized crime," Alexey Khitrov says. "It's such a big issue that [governments] are willing to boost their technology to counter it."
STC is also making headway in U.S. law enforcement following a strategic partnership deal last year with DataWorks Plus to integrate voice identification technologies into its law enforcement software application platform. Several law enforcement agencies have committed to pilot the system, which will take voice samples of criminals when they are arrested.
STC's success in Ecuador and Mexico is "helping to build confidence in it throughout the U.S. and Europe," Alexey Khitrov says. "We're seeing a growing interest in biometrics in general as a result."
Madrid-based voice biometrics software solution provider Agnitio has been a growing presence in the field. In early 2012, the company launched Kivox Mobile, a product for secure authentication in mobile devices using a person's voice. The software can perform voice authentication on Android smartphones or tablets without being connected to a network.
"Kivox Mobile will bring voice authentication closer to the user," said Emilio Martinez, CEO of Agnitio, in a statement. "You will teach your device to recognize your voice, whenever you want and at the pace you want. The device will improve its recognition capabilities over time, and you will be able to test how it works in multiple situations. Agnitio technology is used around the world to identify persons of interest to fight terrorism and improve homeland security. Reducing the size of our engine so that it can run locally in a consumer mobile device is truly unique."
In May 2012, the U.S. government's Advanced Analytic Capabilities Subgroup of the Technical Support Working Group awarded Agnitio a research and development contract to deliver improved voice biometrics–based mobile phone security capabilities. With the contract, Agnitio's mobile voice authentication solution will be used in various devices and operating systems to secure remote authentication in field tactical operations.
"Our technology has been tested by many end users," Martinez says. "In addition to being accurate…our systems can process millions of conversations in a few minutes. The time you need to create a voiceprint is roughly thirty to forty seconds, but once you have that, you can do matching in a fraction of a second. We're very optimistic about voice biometrics."
The U.S. Army is employing several types of speech technologies. Next IT, a provider of Intelligent Virtual Assistant (IVA) technology, created Sgt Star, a virtual assistant that the Army uses to communicate with recruits and their families. Sgt Star—who routinely shares personal information with potential recruits—has answered more than 11 million sensitive, personal, and potentially life-altering questions ranging from "Will I have to serve in combat?" to "Are the showers co-ed?"
"[An] advantage of our platform is that it is able to work with any speech recognition commodity, since the power is in the conversational understanding," said Denise Caron, Next IT's chief technology officer, in a statement. "We believe in proving that our technology works, and Sgt Star is the first of many upcoming deployments, including releases in healthcare."
Several speech technology providers, including SRI International, are working on more technical solutions. SRI was recently awarded a $7.1 million contract for the first phase of a five-year, $41.5 million project with the Defense Advanced Research Projects Agency (DARPA).
SRI will provide research with the goal of developing systems that can translate foreign languages accurately, no matter what the source, and provide clarification and instantaneous interpretation.
The DARPA contract is part of the BOLT program, a worldwide research project focusing on language technology. SRI also has worked with DARPA's Global Autonomous Language Exploitation effort, which develops software that analyzes and translates speech and text in various languages. Part of the initiative, the Spoken Language Communication and Translation System for Tactical Use, uses technology to facilitate communication between the U.S. military and foreigners.
The International Computer Science Institute (ICSI) has long been involved in government-driven speech initiatives and in November announced its Babel project, on which it was working with the Intelligence Advanced Research Projects Activity. Babel focuses on building speech recognition solutions with self-imposed time and data limitations for a variety of languages. ICSI said that a team of researchers will be focusing on speech technology's basic principles rather than smaller improvements that have been made to existing technology. The scientists believe this study could be helpful for keyword-search systems for languages without much transcribed audio.
Nelson Morgan, the deputy director and leader of the speech group at ICSI, says that the advances made in speech recognition have also, ironically, been a hindrance. Because they have the "curse of being relatively good," there's been less impetus to change the technology, he explained in a statement.
Speech solutions are also a hot commodity in providing the technology behind N11 services, such as 911 and 511, and have more than proved their worth in disasters, such as Hurricane Sandy, and for notifications, such as in the case of mass shootings.
VoltDelta's N11 offerings include a number of features that help callers quickly pinpoint the information they need. Disambiguation avoids confusion between like-sounding topics or points of interest, which increases success rates. Personalization remembers the stretch of road a motorist last asked about so that the system can provide targeted traffic updates more quickly.
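To make the two features concrete, here is a minimal sketch of how a 511 dialog might disambiguate like-sounding road names and remember a caller's last request. It is an assumption-laden illustration, not VoltDelta's implementation: the road list, the `CallerMemory` class, the prompt wording, and the 0.8 similarity cutoff are all hypothetical, and text similarity from Python's standard `difflib` module stands in for acoustic confusability.

```python
import difflib
from typing import Dict, List, Optional

# Hypothetical road names; a production 511 grammar would be far larger.
ROADS = ["Route 17", "Route 70", "Route 78", "Garden State Parkway"]


def candidate_roads(heard: str, roads: List[str] = ROADS) -> List[str]:
    """Return known roads that closely resemble the recognized text, best first."""
    return difflib.get_close_matches(heard, roads, n=3, cutoff=0.8)


class CallerMemory:
    """Remembers the last road each caller asked about (keyed by caller ID)."""

    def __init__(self) -> None:
        self._last_road: Dict[str, str] = {}

    def remember(self, caller_id: str, road: str) -> None:
        self._last_road[caller_id] = road

    def last_road(self, caller_id: str) -> Optional[str]:
        return self._last_road.get(caller_id)


def next_prompt(heard: str, caller_id: str, memory: CallerMemory) -> str:
    """Choose the next prompt: answer, disambiguate, or fall back to history."""
    matches = candidate_roads(heard)
    if len(matches) == 1:
        # Unambiguous: answer and remember the road for the next call.
        memory.remember(caller_id, matches[0])
        return f"Here is the traffic report for {matches[0]}."
    if len(matches) > 1:
        # Several like-sounding roads: ask the caller to choose.
        return "Did you mean " + ", ".join(matches[:-1]) + f", or {matches[-1]}?"
    previous = memory.last_road(caller_id)
    if previous is not None:
        # Nothing recognized: offer the road this caller asked about last time.
        return f"Would you like an update on {previous}?"
    return "Which road would you like to hear about?"
```

With this road list, a recognition result of "Route 70" is close enough to "Route 78" and "Route 17" that the sketch asks the caller to choose instead of guessing.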
"Government is a key focus of our business," says Terry Saeger, senior vice president and general manager. "We're getting a lot of traction in the 511 space. That's a pretty complex speech recognition task because of all the points of interest and the unbounded grammars that are involved with space, locations, and points of interest. With 511, it's not just speech, it's information that's needed, such as accidents and bad weather."
VoltDelta, which has a contract with the state of New Jersey, proved its mettle when Hurricane Sandy hit the state.
"We have a hybrid deployment model with the state, where we have some premise-based equipment and also have our traditional hosted call center technology in our data centers," Saeger says. "When the storm hit, obviously there were huge spikes in traffic, five or six times the normal volume, but it wasn't a challenge. The building that housed the on-premise equipment got flooded and lost power, and we had seamless fail-over to our data centers."
Saeger maintains that the future of 511 is bright, and he sees more mobile applications integrating with speech applications. He also points to a lot of 511 turnover this year.
"Because of the nature of the 511 business, they tend to be five-year contracts," Saeger says. "511 really got going about ten years ago, so companies have been going through a few [contract] cycles, and now we're going into a stage where states are going through evaluation processes."
MicroAutomation has several contracts with government agencies, such as the Department of the Treasury's Financial Management Service (FMS), which provides a type of help desk to callers asking about tax issues.
"In every regard, [government agencies are] like the customer service department of a major business, in that they're helping people using the same technologies of intuitive menus; they are looking at natural language processing within their IVRs; they're looking at multiple languages," MicroAutomation's Fischer explains. "All of these commercial characteristics are equally relevant."
Fischer says that such technologies involving speech are either in place or on a road map to be put into place.
"Sometimes what we're finding is that the government can appear to be slow to adopt to many of these new technologies and capabilities, but that's only because they have to be bulletproof and have to withstand not only usage requirements but any political issues that could potentially arise," Fischer notes.
Staff Writer Michele Masterson can be reached at firstname.lastname@example.org.