Freedom in the Field
Several weeks ago, the heating and air conditioning unit in my building was on the fritz and needed service. After tinkering with it, and recruiting my super to do the same, I called HVAC Company XYZ, navigated through the directed-dialogue prompts in its interactive voice response (IVR) system, eventually spoke to a live agent, and scheduled a service call.
A few days later, the HVAC Company XYZ technician—let’s call him Mack—arrived. I watched Mack through the window as he parked his van, grabbed a toolbox, flipped through some papers stuck to a clipboard, and rang the bell. I buzzed him in and we exchanged a few pleasantries. He arranged his tools, positioned his clipboard on a chair, asked me a few questions, made some notations on his clipboard, and jotted down a few notes on a separate pad. And then Mack set to work. In a matter of minutes, he had fixed the HVAC unit.
What followed was about 10 minutes of paperwork, with me signing things and spelling my last name and Mack checking off boxes and making notations and shuffling through papers. As he was writing and juggling his toolbox and clipboard, I had what I considered a stroke of brilliance.
“You know, I bet there’s a speech solution that would really help you out with all that paperwork,” I said. “You could probably just speak it into a cell phone—speech technology could convert your speech into text.”
Mack gave me a look—one that I took to mean “Speech? Isn’t that something they use in contact centers?” but which was probably a lot closer to “Can you please shut up so I can do my job?”—and left, no doubt off to another service call, followed by more needless paperwork before he returned to the office to enter all the day’s data into the company’s customer relationship management (CRM) system.
Mack and HVAC Company XYZ—and a lot of enterprises—still fail to realize that speech technology isn’t just for contact centers anymore. The truth is that speech technology has proved itself in the call center. And while many callers may not want to interact with a speech system, they are more than resigned to that fact. Enterprises across the world have seen the cost-cutting benefits and reaped the return on investment (ROI) that speech technology provides.
Now, more than ever, speech is moving out of the contact center and into other parts of the enterprise, where the technology is often even more efficacious, according to Deborah Dahl, principal at speech and language technology consulting firm Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interaction Working Group.
“There’s a good argument that enterprise applications have a real leg up on an application designed for [public users] because an enterprise…can train employees how to use an application,” she says, noting that someone calling an IVR has no training and may not even want to use the application.
“That has a huge advantage,” she adds.
Judith Markowitz, president of J. Markowitz Consultants and an independent analyst in the speech and voice biometrics fields, agrees, and also notes that traditional IVRs often serve as a transition for speech to spread to other areas.
“There are different ways in which speech is spreading,” Markowitz says. “One is through the call center into different areas, like security. Another is the use of speech outside of the contact center.”
Donna Fluss, founder and president of DMG Consulting, also agrees, but offers a slightly different perspective. She contends that embedded speech will pull enterprises into a much wider use of the technology.
Speech is already being used in a variety of areas outside of the contact center. Among the most prevalent, and those that show the most promise for enterprise customers, are:
- voice-directed warehousing and logistics solutions;
- mobile applications; and
- salesforce automation.
Of the three, warehousing is probably the hottest growth area for speech right now, according to Daniel Hong, lead analyst at Datamonitor.
The market for voice picking solutions will grow to almost $900 million by 2014, he predicted in a recent report titled “The Guide to Voice Solutions in Warehouse Environments.” He sees the primary drivers for investment as increased productivity and accuracy, reduced costs, hard ROI, soft ROI, and improved worker safety.
Making Sense of It All
“If you have a hands-free, eyes-up environment, speech recognition makes a huge deal of sense,” Hong says. “Safety is of primary concern.”
Markowitz also sees the clear benefits speech can provide to the warehouse operator. “Speech brings efficiencies to the operations, and that also translates into cost savings. If a person can do more picking faster with fewer errors, you have done other things, but you’ve ultimately saved money.”
Speech solutions in the warehouse can also cut down on injuries and improve safety, Markowitz adds. “If you have a person whose eyes are on the task and whose hands are on the task rather than having a handheld device you have to look at, then you have less chance for accidents,” she says. “Also, if you have something that you can attach to your belt and a headset you don’t have to lift a device or twist or move your arm over and over to manipulate an input device, then you save on healthcare costs.”
In the fertile area of speech solutions for the warehouse, Hong refers to Vocollect as an “800-pound gorilla,” but also points to other companies like Datria as key players in the space.
According to Tom Upshur, vice president of product management and marketing at Vocollect, the company sees—and has seen for years—significant growth in the use of voice.
The company, which has been applying speech to the warehouse since the late 1990s, started with grocery redistribution and moved from there, offering a wide range of speech-enabled, hands-free, eyes-up solutions to warehouses.
“The real benefit to the warehouse environment is when you have dedicated selectors. This is the most efficient way to maximize their effectiveness,” he says, noting that the company’s picking solutions can result in 99.97 percent accuracy rates. “You have hands-free and eyes-free technology so they can dedicate all of their time and efforts to selecting.”
And, as the “800-pound gorilla,” Vocollect has seen its share of success. The company’s technology is in 85 percent of grocery distribution centers. BiRite Foodservice Distributors, Giant Eagle, Price Chopper, and Southeast Frozen Foods are just a few of the company’s many grocery customers.
Additionally, Upshur points out that Vocollect is the only company doing more than $100 million in voice business per year, with $200 million of goods moved daily via its technology. “From a marketing standpoint it’s really not about selling voice anymore,” he says. “It’s about selling Vocollect voice.”
But despite Vocollect’s impressive numbers and reputation, James Greenwell, president and CEO of rival Datria, makes a compelling argument to look elsewhere when searching for a voice-enabled warehouse solution.
Datria—a supplier of configurable packaged speech-enabled mobility solutions for field service and asset management—has a very different approach when it comes to speech in warehousing.
Datria is not interested in getting its technology onto mobile devices, but rather, on back-end enterprise resource planning (ERP) and CRM servers, Greenwell says.
Make the Call
“We really see this as revolutionary because there’s no call center in here,” Greenwell says. “There’s no typing in here. It’s all data collection, speaking, and Bluetooth-enabling, and all over Internet IP.”
Greenwell says that with Datria speech technology a forklift driver doing voice picking arrives at work and simply makes an eight-hour phone call into the company’s mainframe, logging on just like a worker might at a desktop, but all over the Internet with Voice over IP.
“Nobody else in the marketplace does that,” he says. “All the other voice warehouse vendors have a proprietary [radio-frequency]-based solution. Nobody’s doing it as IP telephony.”
And by providing warehouse solutions in this manner, Greenwell says Datria can provide customers significant benefits—particularly if they are considering going with Vocollect and purchasing a “$3,000 device” for a warehouse worker.
“That $3,000 device we replace with a $300 Cisco phone,” he says. “And the phone doesn’t have all that firmware on it. It’s an off-the-shelf phone. And that phone calls back to the server that has our software with the Nuance software to do all that same recognition on the server.”
Greenwell points to the company’s recent work with Coca-Cola Enterprises (CCE) as proof positive for the Datria model. CCE—which stocks 1 million locations each day from 432 warehouses—came to Datria because it had a quality control problem.
The company had 2,500 pickers whose work was monitored by 800 checkers who inspected each pallet before it was shipped. The picking process involved printing out a pick list that a forklift driver put on a clipboard and read as he went along.
Datria changed all this when it gave pickers a holstered Cisco phone with a headset and audio pick lists provided by speech technology, allowing the company to save time and see significant ROI by eliminating expensive hardware.
“It’s Voice over IP. It’s off-the-shelf IP telephony. There’s no middleware, techno-babble, or firewall magic,” Greenwell says. “And it’s not about the devices.
“Now you’re efficiently looping through the warehouse, picking what you need, filling an order, dropping it off, and having the next one read to you,” he says, noting that Datria deployed the solution at CCE’s first 25 warehouses in three languages in two months. CCE then deployed the solution at 75 more warehouses—without any involvement from Datria.
So far, CCE is very pleased with the Datria system. “I think it’s going very well,” says Michael Jacks, senior manager of logistics and transportation systems at CCE. “We were able to save $2 million—there’s a cost avoidance of $2 million—by not having to buy the expensive equipment.”
The lower cost of the solution was one key benefit, but there have been many others. “The training was significantly reduced because it’s voice-independent as opposed to being voice-dependent. There are no speech samples,” he adds. “And by putting the Nuance voice engine on a server rather than on a small [Windows] CE device, we can increase the footprint of Nuance and get the recognition people were concerned about.”
Going a little farther outside the four walls of most enterprise operations, Markowitz sees speech for salesforce automation as a way of providing mobile workers with information and messages faster via mobile devices and working to make salespeople more efficient.
“If you can do that on the road using any device, then you have certainly saved money, but you have made it tremendously easier for the salespeople to do input/output of what they need just by saying it,” Markowitz says. “For them, it’s really efficiency, ease of use, and, once again, a kind of accuracy.”
One of the speech providers leading the way in the salesforce automation space is California’s Ribbit—a company that made waves in the CRM space with its software-as-a-service solution that speech-enables Salesforce.com.
The impetus for Ribbit for Salesforce—launched in May 2008—relates to the “massive trend toward mobility” and the “significant trend” toward increased sales productivity via the field use of CRM systems, according to Greg Goldfarb, vice president and general manager for enterprise applications at Ribbit.
“When we built out our first commercial application for Salesforce.com, we were very squarely focused on how do you enable productivity for people who are out and about during the day,” Goldfarb says.
Goldfarb—who stresses that Ribbit for Salesforce is only one of many applications the company is developing across the enterprise—says Ribbit’s approach assesses the challenges to productivity in the field and addresses them in unique, easy-to-use ways.
Ribbit for Salesforce revolves around three fundamental challenges to productivity:
- enabling salespeople to be more responsive to customers and therefore more competitive;
- facilitating information sharing and accelerating collaboration between workers in the office and workers in the field; and
- facilitating higher visibility of customer interactions when disparate devices, software, and technology are being used.
Ribbit for Salesforce brings together day-to-day sales tools like mobile phones, email, Salesforce.com, and text messaging and ties these different information portals together via voice-to-text technology. It works something like this: If a saleswoman is in a meeting, all her voicemail is converted to text and sent to her mobile device so she can address pressing matters and improve responsiveness. Once the meeting is over, she can hop into her car, access her mobile device, and speak her meeting notes. The content gets converted to text and entered into the company’s CRM system. When this is finished, the saleswoman can continue to drive while she uses her mobile device to call in and dictate a draft of an email. Upon returning to her office, she can sit down at her desk and open up Ribbit for Salesforce on her computer. Now, she can edit and send the email, look at customer records, and check messages—the system stores them as both text and voice files—that partners or customers may have left.
“So I have a much better 360-degree view of what’s gone on with the customer,” Goldfarb says of the system.
Additionally, Goldfarb says Ribbit for Salesforce offers users an insurance policy for mobile phones. If a user loses mobile coverage or power, the solution provides an online version of the cell phone that runs in Salesforce.com.
“If I’m traveling in Asia where my cell phone doesn’t work, I can still answer and make phone calls from my PC as if I’m using my cell phone,” Goldfarb says. “One of our underlying principles here is [that] to drive user adoption of something it’s got to be dead simple and be a very slight transition in terms of their existing behavior. So this works with your existing Salesforce account, your existing cell phone number. And it is really something that enables you to do more with what you’ve already got.”
Goldfarb asserts that Ribbit is the first company that has linked mobile voice communications to CRM workflow. “Integrating your mobile phone line, voice messaging, voice-to-text, and CRM—they all work together to really eliminate a lot of admin work and enable salespeople to focus on selling, not admin,” he says. “And that really works to accelerate the sales cycle and make salespeople more effective.”
Ribbit’s efforts are paying dividends. “Adoption is going quite well,” Goldfarb says, citing Drive Financial Services as a success story. Since adopting Ribbit’s technology, the company has seen a monthly savings of $50,000.
“They’re seeing some really quick, early returns,” Goldfarb says. “And in this environment that’s the key to acquiring customers and making customers successful: giving them something that is really easy for their users to pick up and use and something that delivers immediate productivity returns.”
But despite the success of these companies and the increasing use of speech by enterprises outside the contact center, not everything is wine and roses in the world of speech technology. Companies are, in fact, facing a severe economic downturn and a global recession.
Hong says that the economy will affect the market for speech in enterprise applications “regardless of what vendors may be saying.” He foresees longer sales and purchasing cycles and predicts that companies will be reluctant to invest despite ROI, especially if they have to purchase new mobile devices.
“I think [companies are] going to take a look at what they have now, see if they can prolong or leverage those a bit more effectively, and then just have a wait-and-see approach,” he says. “And then maybe starting in 2010 it’s going to start picking up again.”
But many speech vendors simply disagree. “I say that’s hogwash,” Greenwell says, when asked about the economic downturn holding back investments in speech technologies.
And that is something Greenwell and Upshur—who points to a strong fourth quarter in 2008 and a strong first quarter in 2009 at Vocollect—can agree on.
“We’ve actually been very pleased with how our business is going, and we have seen really no reduction in interest and in actual purchase of voice solutions,” Upshur says. “People continue to look at the need to improve efficiency, improve speed, and improve accuracy.”
In fact, many speech vendors remain very positive about the future of speech in enterprise applications.
Dan Villanueva, vice president of marketing at Vangard Voice Systems, a provider of AccuSpeech Mobile Voice Technology and the Mobile Voice Platform for enterprise-wide voice deployment, sees the federal government, particularly the Department of Defense, as an untapped market for speech.
“Wherever mobile transactions are occurring, that’s where voice needs to go,” he says. “That’s what it was designed for.”
That’s especially true in the private sector. “In terms of enterprise mobility, you have to get real work done. You have to cut the costs. You have to be faster. You have to be more accurate. You have to be more responsive,” Villanueva says. “Wherever you see mobility across industry, across government, that’s where voice is going to go.”
Greenwell is equally optimistic. “There are massive logistical models that have been established with old paradigms of technology and infrastructure which could be replaced and upgraded and made far more efficient, and I think we’re only starting to understand what those are.”
Fluss provides a more balanced assessment of the speech industry and the economic downturn. She notes that speech has never had what she calls “the year of adoption,” a year in which speech became the hot, must-have technology.
“It’s never had that kind of year, but it’s slowly but surely finding its way into consumers’ pockets and pocketbooks,” she says. “It’s slowly but surely finding its way into a growing number of cars, and not just the high-end [models]. It’s slowly but surely finding its way into warehouses and manufacturing facilities. And it will slowly but surely find its way into our homes. Ten to 15 years from now it’s just going to be taken for granted.”
On the Back Burner
Fluss even sees embedded speech being used in the kitchen for everything from preheating the oven to setting the clock on the microwave.
“Why is it that you can’t use speech?” she says. “There’s no reason, actually, why you can’t use speech, except that today the cost is too high. Those are embedded uses of speech—futuristic, but, by the way, only slightly.”
Maybe they’re not so futuristic. In Europe, home appliances manufacturer Indesit has been working for the past year with speech companies Loquendo and Amuser to prototype an oven that lets users set cooking times, temperatures, and more with voice commands.
“The underlying technology is really very strong,” Fluss adds, noting that a lot of great applications get developed during tough economic times. “And now it’s just a question of coming up with cost-effective applications and building up adoption.”
So maybe it’s really just a matter of time before Mack and HVAC Company XYZ get on board. And maybe next time I suggest a speech solution to help Mack get out from under his reams of paperwork, he won’t look at me like I have three heads.