ASR Cleared for Takeoff

With increased alerts and safety measures at airports, people aren't the only ones needing supervision; the planes and their flight plans are monitored by air traffic controllers to ensure accuracy and safety. Speech technology is helping the Federal Aviation Administration (FAA) train the people who are charged with that difficult job every day.

Training for air traffic controllers at the FAA Academy in Oklahoma City used to require four to five pseudo-pilots who would act as real pilots during simulated take-offs and landings. Those pseudo-pilots have now been replaced by simulators that are equipped with speech recognition software that accepts and responds to voice commands as a real pilot in the air would.

"The FAA felt that there was a need to change how they do training in the tower environment and simulation was a way to reduce training time. The goal was if simulation could reduce training by twenty-five percent to thirty percent, then it would be worth the investment," says John O'Leary, air traffic simulation training manager at the traffic control tower at California's Ontario Airport.

The use of speech has enabled the FAA to cut the average training time at its main facility and on-site training facilities at Chicago's O'Hare Airport, Miami International Airport, and Ontario by about three-quarters. While it may have taken 100 hours to certify trainees using traditional training, the speech-enabled simulator has certified trainees at Ontario in as few as 25 hours, O'Leary says.

"When it takes a few million dollars to train an air traffic controller and the industry is currently short 11,000 controllers, that is a significant savings in cost and time getting people into the field," says Gary Pearson, vice president of advanced programs at Adacel, the developer of the air traffic control simulation equipment and software in use by the FAA Academy.

The Adacel simulators, called MaxSim Control Tower Simulators, use speech recognition software from Nuance Communications. They take the air traffic controller through an array of scenarios. Within each simulation, the speech recognition technologies accept the air traffic controller's spoken commands and create a verbal response from a simulated pilot. Using those voice inputs, the simulator controls the movements of the plane. Each simulator can support up to six environments and is designed to be 10 percent more challenging than the most difficult event at any airport.

In implementing the technology, the FAA's biggest challenge was to gather the necessary vocabulary to improve recognition rates and times. This had to be carried out early in the project and required a list of specific phrases from the 7110.65 air traffic control handbook.

The system was then designed to support 40 to 50 phrases from the handbook with as many as 209,000 variations per phrase. The voice instructions for the air traffic control role support 500,000 commands, with the ability to combine up to eight commands in one transmission.
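To give a rough sense of how a handful of phrases can fan out into so many variations, here is a minimal sketch of a slot-based phrase template. It is not Adacel's or Nuance's grammar format; the template wording, the comma separator, and the function names are all hypothetical and chosen only for illustration. The one figure taken from the article is the limit of eight commands per transmission.

```python
from itertools import product
from math import prod

# Hypothetical phrase template: each slot lists the alternatives a speaker may use.
# The wording is illustrative only and is not taken from the 7110.65 handbook.
TAXI_TEMPLATE = [
    ["taxi to", "taxi via"],
    ["runway", "taxiway"],
    ["one", "two", "three"],
]

# Limit stated in the article: up to eight commands in one transmission.
MAX_COMMANDS_PER_TRANSMISSION = 8


def count_variations(template):
    """Number of distinct phrasings: the product of the slot sizes."""
    return prod(len(slot) for slot in template)


def expand(template):
    """Spell out every phrasing the template allows."""
    return [" ".join(words) for words in product(*template)]


def split_transmission(text, separator=","):
    """Split one transmission into commands (assumes a comma separator)."""
    commands = [part.strip() for part in text.split(separator) if part.strip()]
    if len(commands) > MAX_COMMANDS_PER_TRANSMISSION:
        raise ValueError("too many commands in one transmission")
    return commands
```

Because the count is a product of the slot sizes, adding even one more slot with a few alternatives multiplies the total, which is how a single phrase can plausibly reach hundreds of thousands of variations.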

Every simulation is recorded and the wave files are run automatically through a batch recognition test process, which highlights what the performance was and where the issues were. From this, the Adacel simulator adjusts the grammars and the systems to deal with the problems and corrects the user through conversation, rather than just rejection notices on the screen.
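A batch test of this kind might be sketched as follows. This is an assumption-laden illustration, not the Adacel process: it presumes each recording has already been run through a recognizer and that a reference transcript exists for it, and the function names and file names are hypothetical. It scores each utterance by word-level edit distance and lists the recordings that contained errors.

```python
def word_edit_distance(reference, hypothesis):
    """Word-level Levenshtein distance between two transcripts."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, 1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, 1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # word dropped
                               current[j - 1] + 1,       # word inserted
                               previous[j - 1] + cost))  # word substituted
        previous = current
    return previous[-1]


def batch_report(results):
    """results: iterable of (file_name, reference_text, recognized_text).

    Returns the overall word error rate and a list of (file_name, errors)
    for every recording that was not recognized perfectly.
    """
    total_errors = 0
    total_words = 0
    problems = []
    for file_name, reference, recognized in results:
        errors = word_edit_distance(reference, recognized)
        total_errors += errors
        total_words += len(reference.split())
        if errors:
            problems.append((file_name, errors))
    error_rate = total_errors / total_words if total_words else 0.0
    return error_rate, problems
```

The list of problem recordings is the part that would feed back into grammar adjustments: it points to the specific transmissions where recognition broke down.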

The FAA also wanted to avoid having to train the simulator on each air traffic control student's voice. To that end, the speech recognition component was designed to be speaker-independent and to work with various accents. "With speech recognition, you don't have to train a person or record their speech to get the computer to recognize their speech," O'Leary explains.

The speaker-independent design boosted recognition rates to 98 percent. "I was surprised that the system was able to recognize various dialects and speech patterns," O'Leary admits. "As long as the words were correct, it would recognize them."

Once the project had overcome grammar and voice-training hurdles, the implementation was set into motion. The FAA originally bought four speech-enabled simulators for its formal academy in Oklahoma City and then purchased three more—one for each training site in Miami, Chicago and Ontario. It is installing a fourth on-site simulator at the airport in Phoenix, and expects to have that equipment up and running in late February.

So far, four new air traffic controllers have been trained in Ontario using the simulator, and other controllers have used the simulator there to refresh training they'd already received. Miami began on-the-job simulation training with a new hire in October and in March expects to receive four additional new hires, all of whom will have been trained on the simulators there.

"There is initial frustration with the student, especially if she is transferring from one facility to another and wants to apply the language used at a previous facility because the simulation environment forces her to follow the rules of the 7110.65 handbook," O'Leary asserts.

There are plans to expand the program to other states as well. The FAA also plans to incorporate speech into another simulator, the Stars Simulation—the next generation of air traffic radar.

"ATC has been a particularly demanding domain for speech technology because not only do we have to recognize what was said, but we also have to understand it and we have to understand the pieces of the phrase that make the aircraft do certain things," Adacel's Pearson explains.

Prior to working with the FAA, Adacel had been working with the U.S. Air Force on air traffic control simulation at Wright-Patterson Air Force Base in Ohio under a Defense Department contract that has been in place since 2002. The Air Force had been researching speech recognition in air traffic control training for two years prior to issuing the Adacel contract.

"At the time speech recognition was fairly new and there were a lot of people who thought it would not do the task for us," says Shelby Monhollen, program manager at Wright-Patterson. "The commands were limited, the voice recognition rates were not able to keep up with air traffic control commands, and the accuracy left something to be desired. Most people who were using it in the ATC simulation environment were relying on a pseudo-pilot backup to correct mistakes in recognition," Monhollen explains.

That has not been the case at the FAA Academy. The simulator has allowed the academy to eliminate the need for pseudo-pilots, reducing the necessary staff to one instructor.

"Voice recognition helps in being able to run the simulator with fewer people than before," O'Leary says. "The selling factor in terms of determining the use of speech was that it reduces the number of operators required to run the scenarios.

"It allowed us to look at developing problems without needing additional staff to support the problems in terms of getting the airplanes to follow the instruction. That was a significant savings in terms of manpower that speech recognition brings to simulation," he notes.
