The Speech Accessibility Project Could Open Doors, Literally
Speech recognition’s come a long way. And along the way it sometimes seems that people with disabilities are left out of the conversation. Years ago, when the only speech recognition option was discrete speech, pretty much the only people who were willing to—who wanted to—use the technology were people with disabilities, because it was a better alternative (if not the only one) to whatever strategy they were currently using.
As the technology improved, accessibility features were added, making it more usable for and exciting to people with disabilities. At some point, though, the pendulum reversed: Accessibility features were removed, positioning speech as more of a business tool than an accessibility tool.
All these years later, the technology has evolved to the point where doctors dictate and navigate within EMRs, and virtual assistants answer questions, set timers, and play music. Speech recognition may have become ubiquitous, but it isn’t necessarily the panacea for which many people with disabilities had hoped.
That may be changing. And for the better.
Researchers at the University of Illinois Urbana-Champaign (UIUC), in a consortium with Amazon, Google, Meta, Microsoft, and Apple, are working on the Speech Accessibility Project (SAP) to introduce dysarthric speech into voice algorithms at a larger scale than ever before.
Mark Hasegawa-Johnson, professor at UIUC, spearheaded the project after years of working to improve ASRs’ recognition of dysarthric speech. “This perennial question of ‘how much data is enough data to make a speech recognizer that really works’ now has a meaningful answer,” he says.
Turns out it’s 1,000 hours of transcribed speech.
Hasegawa-Johnson approached colleagues at the sponsoring companies, and the resulting consortium is creating a dataset that will provide meaningful results for people with dysarthric speech. The project is collecting speech recordings from people with Parkinson’s disease, amyotrophic lateral sclerosis (ALS), cerebral palsy, stroke, and Down syndrome, recruiting participants through partnerships with organizations that serve each of these communities, including the Davis Phinney Foundation, LSVT Global, and Team Gleason.
While there is a small stipend for participants, Hasegawa-Johnson says participants seem to be motivated by altruism and the recognition that creating ASRs that recognize dysarthric speech will make the world a better place.
The database will first be used by engineers at the five sponsoring companies; hopefully in about six months, it will be available to any company or university in the world that wants to create speech technology that’s effective for people with dysarthria.
Over the years I’ve had clients with profoundly affected speech. These users could be incredibly successful using speech recognition—if they were willing to put in the work to train the software to recognize their voices. And we could do more than dictate into the computer. It was possible—complicated, but possible—to control one’s environment as well, like changing the channel on the television or opening a door.
For those who accepted the challenge, the result was that the computer recognized them better than most people did. This is another reason SAP has so much potential: An automatic speech recognizer can achieve better recognition than a human listener. We did it on an individual level; imagine generalizing those results to everyone with affected speech. That’s what this project aims to do.
Affected speech often goes hand in hand with mobility issues, adding another layer of dependence that the new technology could address. Speech-language pathologist Clarion Mendes, who is also part of the UIUC team, envisions what this project could provide, say, five years from now: A user might ask the refrigerator to open or the microwave to heat food for 30 seconds. It could even open physical doors, not just in one’s home, as in the old days, but everywhere. “That’s a game changer for independence and quality of life,” she says.
“One of the interesting things about this project is that if it works very well, nobody will notice any difference in the world except that all of [their] devices will suddenly now work,” says Hasegawa-Johnson.
If you or someone you know has Parkinson’s disease and would like to participate in this project, go to https://saa.beckman.illinois.edu/Identity/Account/Register to register. Individuals in the other categories should sign up on the mailing list at https://speechaccessibilityproject.beckman.illinois.edu/contact-us.

Robin Springer is an attorney and the president of Computer Talk, a consulting firm specializing in implementation of speech recognition technology and services, with a commitment to shifting the paradigm of disability through awareness and education. She can be reached at (888) 999-9161 or firstname.lastname@example.org.