Speech Recognition for the Warehouse Comes of Age
Speech technology has been used in warehousing operations since the 1980s, but only recently has it hit its stride and moved into the mainstream.
SR of the 1980s
The warehouse was a great market for the fledgling speech recognition (SR) industry of the 1980s. Unlike telephone applications that are characterized by large numbers of infrequent callers, warehouse operations have small numbers of repeat users. That matched the speaker-dependent SR of the time. Unlike free-form dictation, warehouse applications generally need relatively small vocabularies. That was ideal for the small-vocabulary, word-based technology of the ’80s that required users to train each word. Verbex, the warehouse-industry leader (later acquired by Voxware), for example, offered a vocabulary of up to just 80 words and phrases.
Twenty years ago, warehouse SR was hardware-based. The user’s headset was wired to a wearable radio frequency transmitter that communicated with a receiver and SR engine housed in a ruggedized proprietary box. Later, standard PCs replaced the boxes, and Vocollect introduced a wearable computer called Talkman that stores user input until it can be uploaded to a computer.
SR systems also had limited interoperability. A receiving dock application I worked on combined SR with barcodes. Barcodes captured basic information about the shipment, but dock workers also would verbally describe the condition of the containers, material shortages/overages, and the pallets on which containers rested. SR automated those verbal communications, but a custom application had to be written to format the SR output for the warehouse databases and integrate it with barcode data.
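To give a feel for what that kind of glue code involves, here is a minimal sketch in Python. It is not the application described above. The phrase set, the condition codes, the field names, and the assumption that the recognizer returns short two-word strings with counts as digits (for example "short 2") are all hypothetical, chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical mapping from spoken condition words to database codes.
CONDITION_CODES = {"good": "OK", "damaged": "DMG", "wet": "WET"}


@dataclass
class ReceivingRecord:
    shipment_id: str          # from the barcode scan
    container_condition: str  # from the spoken report
    pallet_condition: str     # from the spoken report
    quantity_delta: int       # negative = shortage, positive = overage


def parse_utterance(utterance: str) -> tuple[str, object]:
    """Turn one recognized phrase into a (field name, value) pair."""
    words = utterance.lower().split()
    if len(words) != 2:
        raise ValueError(f"unrecognized utterance: {utterance!r}")
    subject, detail = words
    if subject in ("container", "pallet") and detail in CONDITION_CODES:
        return f"{subject}_condition", CONDITION_CODES[detail]
    if subject in ("short", "over") and detail.isdigit():
        count = int(detail)
        return "quantity_delta", -count if subject == "short" else count
    raise ValueError(f"unrecognized utterance: {utterance!r}")


def build_record(barcode: str, utterances: list[str]) -> ReceivingRecord:
    """Combine one barcode scan with the spoken report for that shipment."""
    fields = {
        "shipment_id": barcode.strip(),
        "container_condition": "OK",
        "pallet_condition": "OK",
        "quantity_delta": 0,
    }
    for utterance in utterances:
        name, value = parse_utterance(utterance)
        fields[name] = value
    return ReceivingRecord(**fields)


record = build_record("SHP-1001", ["container damaged", "short 2", "pallet good"])
```

The barcode supplies the shipment identity, the spoken phrases fill in the condition and count fields, and anything outside the small expected vocabulary is rejected rather than guessed at.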
Why would anyone use such limited technology? Because SR improves worker productivity and safety, and it can save money for the company.
Warehouse applications are generally hands-busy, eyes-busy, and mind-focused operations. SR allows workers to keep their eyes, hands, and minds on their jobs. Workers verbalize reports via SR as they work, which reduces errors related to shifting between performing tasks and reporting. SR protects workers from repetitive strain injury and other injuries (e.g., from barcode wanding or wearing heavy input devices around their necks for eight hours at a time), and can be incorporated into protective earphones. A Verbex SR system even outperformed me in one high-noise environment. I couldn’t hear what the person beside me said, but the SR performed beautifully.
Warehouse SR is still dominated by Voxware and Vocollect, but it is no longer bleeding-edge technology. "Speech has hit the mainstream, especially in retail, where voice is seen as a key technology for helping to drive cost out of the supply chain," says Stephen Gerrard, Voxware’s vice president of marketing. SR algorithms are phonetic, which enables SR to support unlimited vocabularies with minimal training. Warehouse solutions are Internet-capable. Voxware is fully VoiceXML-compliant, and Datria, a VoiceXML solutions provider, has a line of warehouse products that leverage Cisco's IP telephony. SR solutions also interface with all major supply chain back-end software.
These changes have fostered partnerships with a spectrum of warehouse management systems, technology, and solutions providers who have moved SR into a global marketplace.
The most significant change was the advent of SR-quality audio for the ruggedized, wireless mobile devices used in warehouse operations. The most widely used of these devices are manufactured by Motorola/Symbol, LXE, and Psion.
Motorola and other hardware manufacturers simply responded to the marketplace, according to Richard Nedwich, Motorola’s senior manager of voice technology in mobile computing. "Our customers wanted to extend the use of Symbol products to voice-directed picking. They felt that using speech on products already used for data would make their workers more productive and save them a lot of money," he says.
"The ability to put voice on those devices represents a great market expansion opportunity," adds Larry Sweeney, Vocollect’s cofounder and vice president of product management.
It is, indeed, a new and exciting market.
Judith Markowitz is technology editor of Speech Technology magazine and has been an analyst in the speech processing industry for more than 20 years. She can be reached at (773) 769-9243 or firstname.lastname@example.org.