Vertical Markets Spotlight: Speech in Education
Speech technology hasn’t traditionally played much of a role in the day-to-day education of students, but in the past few years, products have emerged to help underachievers and overachievers stay on track, provide individuals with disabilities access to course materials, and enable administrators to gauge students’ mental health. However, finding funding and the appropriate contacts to lead such initiatives can be challenging.
In the typical classroom, one teacher and maybe an assistant or two are responsible for working with a handful to dozens of students, and the ratios get larger as budgets grow tighter. Consequently, lessons lack personalization. The teacher presents information to everyone in a rote way. Students who pick up the main concepts quickly cannot leap ahead to new topics, and others who fail to grasp the key points fall behind.
Text-to-speech solutions are helping to bridge the gaps. The technology reads course material aloud so pupils can work through courseware at their own pace, speeding up when concepts are clear and slowing down when they are not.
The solutions are evolving quickly, enabling two-way communication between a virtual tutor and the student. The tutors provide not only verbal reading of the materials but also real-time feedback. In the best case, they stimulate critical thinking and answer questions about the lesson plan.
The tools benefit schools in a number of ways. Teachers become more productive as they serve more pupils. Students gain a better understanding of the lesson plan.
Perhaps one surprising development is that deployments have been more common in higher education (about 70 percent) than in K-12, according to data collected by ReadSpeaker, a text-to-speech technology provider.
“I thought that younger students would need more help learning, but the reality is so much of K-12 learning still is done the old-fashioned way, [with] the chalkboard,” explains Matt Muldoon, ReadSpeaker’s North America president. “Higher education still has lectures but also presents more content, such as books, electronically, so they have been more willing to use technology to bridge learning gaps.”
The COVID-19 pandemic might also have created a shift. Schools were forced to deploy virtual content that supports remote learning as social distancing rules went into effect. As a result, K-12 awareness of the technology’s potential, and interest in the new speech solutions, might still rise.
Speech Bridges Disabilities Gap
Not every student arrives at school outfitted with the same skills. In many cases, students have physical limitations, such as vision or hearing problems, that limit their ability to access courseware. Students with disabilities like cerebral palsy might not be able to input information in the same way as their classmates.
Schools have also been trying to close the chasm between students who have disabilities and those who do not. Speech-to-text solutions provide a valuable tool, empowering students with disabilities to put down their words in an efficient and effective way. “Most of our sales in the education market start with us talking to someone from the disability services area,” says Ed McGuiggan, vice president and general manager of Dragon Professional at Nuance Communications.
ReadSpeaker also plays heavily in the assistive technologies market. In fact, the company was started 20 years ago to address such problems. “The four founders wanted to help a blind friend use the web and built a speech system that articulated what they were seeing on the computer screen,” Muldoon explains. Increasingly, such solutions are being embedded into educational software, like courseware and learning management systems.
Giving Students a Voice
Speech recognition is beginning to impact education in more ways, but a challenge has been the evolution of student voice recognition systems. “Children are not a homogenous group,” explains Martyn Farrows, CEO of SoapBox Labs. “A child speaks much differently at 2 to 3 than 5 to 6 and 9 to 10.” As a result, few datasets have been collected and few AI models have been built for that younger group.
Compounding the problem: Inclusiveness is a priority in education but is an area where speech recognition systems sometimes falter. The products often do not accurately account for students from all backgrounds and abilities. Recognition is lower for students with pronounced accents, obscure dialects, and unsophisticated speech patterns. In addition, youngsters tend to speak in slang, and the systems do not recognize such words.
SoapBox Labs tried to account for such limitations in building its solution. In business since 2013, the vendor worked with Ireland’s National Center for Leadership and Innovation and Trinity College to create its speech engine. “We created a dataset from scratch that focused on equity and understand firsthand how challenging the work is,” Farrows says.
Because of that, vendors want to make sure that their systems account for speech differences among children but do not want to establish “boil the ocean” objectives, goals so broad that they cannot sustain a sound business case.
In April, SoapBox Labs teamed with Imagine, which develops adaptive learning programs to accelerate literacy and English language development for students in grades pre-K-6. The two are integrating speech recognition into the courseware with the goal of replacing cumbersome, manual assessments of students’ reading and language acquisition with automated scoring, which provides teachers with more accurate student reading and language learning assessments. The feedback loop included in the SoapBox voice engine is built to make it easier to monitor progress, giving teachers more frequent and better insights into where students need support on their literacy and language journeys.
Speech Aids Mental Health Professionals
Understanding students’ moods and mind-sets is another stumbling block for schools. Youngsters walk into class with all sorts of complications in tow, possibly from unhealthy home environments. As a result, they might be thinking about harming themselves or others. “Schools are ground zero in dealing with children’s mental health issues,” says Dr. Yared Alemu, founder and clinical director at TQIntelligence. In fact, 70 percent of students today receive special services, and the numbers have gone up since the COVID pandemic.
Schools are ill-equipped to function as full-service mental health facilities. School therapists typically are assigned many students and meet with them only occasionally, once a week or so. “Traditionally, mental health professionals lacked tools to measure if children were getting better, worse, or staying the same as they received mental health services,” Alemu says. Consequently, therapists cannot identify students who might be reaching a breaking point and provide them with the support needed to mitigate an impending crisis.
Psychologists founded TQIntelligence, which built a voice recognition system capable of identifying students who are at risk from trauma. The system does not depend on the content of what pupils say. Instead, it searches for idiosyncrasies in how they speak. “Trauma impacts parts of the nervous system and three parts of the brain, which impairs a person’s ability to generate speech,” Alemu notes. In essence, speech illustrates what is going on with the student psychologically. The solution relies on artificial intelligence and machine learning to automatically examine the speech patterns, correlate them to the person’s state of mind, generate a report detailing how well or poorly the person is doing, and provide possible next steps.
The company partnered with Family Ties Enterprises and is integrating Clarity AI, TQIntelligence’s digital health platform, into in-school mental health services programs in Georgia schools. Clarity AI uses predictive analytics to identify and target high-risk patients, with the aim of improving treatment outcomes and reducing unnecessary hospitalizations, ER visits, and other high-dollar services.
Clearly, speech use is branching out in school systems but still faces many hurdles before reaching widespread adoption. As vendors collect voice samples to build their data models, privacy has become a growing concern, one that intensifies when vendors gather information from children.
Suppliers try to assuage such concerns by devising techniques that anonymize data collection. They amass information about students, such as age, gender, and background in some cases, but do not link individuals to samples.
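The approach described above can be illustrated with a minimal Python sketch. It is not any vendor's actual pipeline: the record types, field names, and the `anonymize` function are hypothetical and chosen only for illustration. The idea is simply to keep the demographic attributes (age, gender, background) while replacing anything that ties a sample to an individual with a random identifier.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RawRecording:
    """A voice sample as first collected (hypothetical schema)."""
    student_name: str
    student_id: str
    age: int
    gender: Optional[str]
    background: Optional[str]
    audio: bytes


@dataclass(frozen=True)
class AnonymizedSample:
    """What is stored for model training: no name, no student ID."""
    sample_id: str
    age: int
    gender: Optional[str]
    background: Optional[str]
    audio: bytes


def anonymize(recording: RawRecording) -> AnonymizedSample:
    """Copy demographics and audio, dropping direct identifiers.

    The sample ID is random (not derived from the student ID), so it
    cannot be reversed to recover who produced the sample.
    """
    return AnonymizedSample(
        sample_id=uuid.uuid4().hex,
        age=recording.age,
        gender=recording.gender,
        background=recording.background,
        audio=recording.audio,
    )
```

Because each call draws a fresh random identifier, two samples from the same student are not linked to each other either. This sketch only removes the explicit link between a student and a sample; it does not address whether the audio itself could identify a speaker.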
In addition, educators are cautious. “Schools set very high bars in terms of accuracy because the stakes are so high,” Farrows says. “It is not a case like with most [interactive voice response] systems, where if the system gets the word wrong, the impact is minimal.” Many schools require at least 95 percent accuracy, which is a difficult bar to hit given the uniqueness of the student input.
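The article does not say how schools measure that 95 percent figure. One common way to score a recognizer is word error rate (WER), with accuracy taken as one minus WER. The Python sketch below assumes that definition; the function names and the default threshold are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Counts the substitutions, deletions, and insertions needed to turn
    the recognizer's output (hypothesis) into what was actually said
    (reference).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def meets_accuracy_bar(reference: str, hypothesis: str,
                       required_accuracy: float = 0.95) -> bool:
    """True if (1 - WER) reaches the required accuracy (assumed metric)."""
    return 1.0 - word_error_rate(reference, hypothesis) >= required_accuracy
```

Under this definition, a single misrecognized word in a six-word sentence already drops accuracy to about 83 percent, which suggests why short, variable utterances from children make the bar hard to clear.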
Finding the decision maker in a school also requires persistence. “We have seen the buying persona change significantly since the pandemic,” says ReadSpeaker’s Muldoon. “Historically, we worked almost exclusively with the disability services department. Now, the director of inclusion, dean of student services, and even classroom teachers are approaching us.”
Securing funding can also be problematic. Schools find themselves with little latitude in their budgets. The Coronavirus Aid, Relief, and Economic Security (CARES) Act, which was passed during the pandemic, helped, but that funding is starting to run out. The budgeting cycles are long as well. “Education has its own purchasing rhythm, and its buying cycles are much longer than those in the commercial market,” Farrows notes.
Paul Korzeniowski is a freelance writer who specializes in technology issues. He has been covering speech technology issues for more than two decades, is based in Sudbury, Mass., and can be reached at firstname.lastname@example.org or on Twitter @PaulKorzeniowski.