Enterprise Strategy: Success on All Levels

Implementing a speech technology system without considering all of the necessary stakeholders is like playing chess without considering all of the pieces. It’s possible to succeed, but it is much more challenging.
There are many considerations from concept to go-live and beyond that require input from various members of an enterprise. Allocating the funds to make it happen, knowing which solution to buy and how to put it in place, and then continually maintaining, testing, and fine-tuning the system all require different skill sets. Because a successful speech technology implementation requires involvement from a variety of stakeholders throughout an organization, the ultimate goal should be clearly defined and all the parties involved should execute on it. A critical element of any company’s strategic goal should be not only to cut service and support costs but also to provide better service to customers.
Speech Technology magazine’s Associate Editor Stephanie Staton spoke with industry experts, each one offering advice to one of three groups involved in a speech technology implementation: C-level executives, project leaders, and builders. Speaking to the concerns of C-level executives is Deborah Dahl, principal at Conversational Technologies. Susan Hura, founder of consulting firm SpeechUsability, addresses important issues to consider from the developer and designer perspective. Finally, Ian Jacobs, strategic analyst at Frost & Sullivan, offers advice to project leaders of a speech implementation. With a sound project strategy, the cooperation from all stakeholders involved, and the advice from these three experts, your organization should avoid a speech technology stalemate.

Develop Requirements That Meet Company and Customer Goals
C-level executives and technology buyers determine what solutions will be available to the company and what type of budget will be available to support the solution’s design, development, deployment, and maintenance. To determine the pertinent information that this audience will need, Deborah Dahl, principal at speech and language technology consulting firm Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interaction Working Group, draws on more than 25 years of experience in the fields of spoken computer-human dialogues and natural language understanding.

Speech Technology: Why should C-level executives be interested in speech technologies?
Dahl: C-level executives should see speech technologies as a way to reduce costs and improve customer service. Cost reduction in terms of needing fewer agents is clear. Improved customer service can be a benefit of speech applications, but this doesn’t come automatically. If the system is well-designed so that it can be used efficiently and accurately, improved customer service will follow.

ST: What should C-level executives look for when determining whether to use speech technologies in their companies?
Dahl: Executives need to start by understanding what they want speech to do and what the technology is actually capable of doing. Speech is appropriate for automating well-defined telephone interactions that don’t lend themselves well to touchtones and don’t involve complex interactions that would require agents. Speech applications also need to be on a sufficient scale to justify the cost of developing the system.

ST: What steps should be completed before purchasing the speech system?
Dahl: Learning about speech technology and the speech industry is an important place to start. This can be accomplished by attending industry trade shows, reading industry magazines, and talking with experts and vendors. The C-level executive should then look at how speech might be applied in his business and make the business case. Input should be gathered from call center agents, IT people, marketing people, and customers. This needs to be followed by carefully developing requirements so that both the enterprise and the vendor have the same expectations.

ST: What are some of the common mistakes made when purchasing speech technology?
Dahl: One pitfall is expecting too much. It’s very important to understand the technology and what it can and can’t do. Overly ambitious systems will be inaccurate and frustrating to callers. Another risk with speech is putting in a system that callers hate because it wastes their time or seems ridiculous. Another risk is making unnecessary design changes. Both of these can be addressed by taking the callers’ needs into account from the beginning and looking at the system from their perspective.

ST: What tips can you offer for purchasing speech to keep costs down while still getting the best bang for the buck?
Dahl: One key factor in keeping costs down in purchasing speech is to reduce the need for design changes in the development process. Some ways to do this include the following:
• Understand what the technology is capable of. Don’t try to do too much, but at the same time, do fully exploit the technology.
• Define requirements carefully. Work with the vendor to define what you need from the speech system to support your business, and what the speech system is expected to do.
• Make sure that the design of the user interface accommodates the intended end users by involving them early in the design process.
• Validate designs early on through simulated interactions with actual users.

ST: Are there ways speech technologies can benefit departments other than the contact center?
Dahl: One area in the enterprise where speech can be extremely valuable is field force automation. Whenever field personnel need to call in to record hours, get assignments, or file a report, speech is an excellent alternative since it doesn’t require someone to answer the phone. In addition, because the users (the field personnel) are employees, they can be trained to use the system effectively.
Another area where speech is valuable is in situations where the users have their hands and eyes busy. A good example is picking merchandise in a warehouse, where voice input can be much faster than keying in a product number. Equipment maintenance is another area where speech can add value. A speech interface can make it much easier to consult a manual while the user is climbing around a large piece of equipment.
The main benefits of employee-facing speech outside the call center are improved employee productivity and a reduction in errors that are made because the employee is distracted by trying to operate a keyboard or keypad.

Balance Enterprise, End User, and System Requirements
The technology community determines how to most effectively and efficiently design and test the systems that the enterprise purchased. These individuals will determine how the system operates and interacts with end users. Susan Hura, founder of SpeechUsability, a voice user interface (VUI) design, evaluation, and consulting firm, and a member of the board of directors of the Applied Voice Input Output Society (AVIOS), addresses the roles and responsibilities that this community carries.

ST: What should a designer keep in mind when designing a speech system?
Hura: The designer needs to keep in mind that his job is a balancing act; it is actually a three-way, triangular balancing act. The first corner of the triangle is the needs of the business client, who is requesting the system to be designed. He has specific business problems, and it is the designer’s job to create an automated system that will help him solve these problems. The second corner is the needs of the end users. It is not enough that they can use it; designers have to build a willingness to use the system. The last point on the triangle—one that we tend not to think about as much as we should—is the fact that we have to do this well within the limits of the speech technology itself. There are times when we are too unaware of what the technology is really good at and what it is not. Usability, as perceived by the end users, should be the real baseline; they have to have technology that functions properly to reach the goal of being easy to use.

ST: How should a developer approach and work with a design?
Hura: No matter how well-specified a design is, there are always going to be options from a development perspective. There are always going to be different ways to implement any particular design. It is vitally important that developers be in very close contact with designers because designers have a very specific vision of how they want the system to operate. It is incumbent upon the developers to make sure they are doing it in the way that the designer intended. On the flip side of that, it is also very important for the designers to have continued involvement in the project when it gets to the development phase and to try to design things that are maybe not easily implementable, but implementable in a way that creates a good system overall. It is a two-way street; designers and developers really should be working hand in hand.

ST: What are the most important steps in a project?
Hura: The requirements phase is hugely important. We need to do a better job in our requirement-gathering. This means gathering requirements from each of the three points of the triangle. It is gathering the business requirements and setting up success criteria for those goals at the beginning of the project. You have to set specific goals and figure out how you are going to measure them from the user side. Also in terms of user requirements, you need to gather as much information as you can about the mindset of the users as they are coming into this interaction. On the technology side, it is really important for developers to be involved in this requirement-gathering phase. If you have developers there at the very beginning, then they are going to get the understanding of what the overarching goals are and what kinds of things are going to be important to end users. It also is going to allow for that initial level-set between the developer and designer.
In terms of what happens after that, you basically need to take the requirements and go design. You have to iterate on this designing and testing and gathering information. When you get to the end here, designers should be involved in the pilot testing, call monitoring, and cycles of tuning.

ST: What are the most important elements of the design and development of speech?
Hura: The first thing you want to think about is giving your users early success in the application. Giving them the opportunity to give you one answer that they don’t have to think about and that you are sure won’t receive a recognition error can do wonders in terms of building user confidence. If you can come up with a first prompt where you are going to give the user some success in the vast majority of cases, that is a great step for any application. If it is not possible to have that wildly successful initial prompt, at least do whatever you can to make sure your users don’t start out in an error condition.
In terms of what is important overall, it varies a lot from one project to another. One of the things you should keep in mind at a more general level is that we should concentrate more on preventing errors than on what to do with them once they occur. Writing error prompts for no matches and time-outs takes a lot of time because we typically write several different error-handling prompts for each initial prompt. The error conditions end up taking more time than the positive path through the system. What we should be spending more time on is building designs that prevent us from getting into those error conditions in the first place.

ST: What forms of testing should be performed regularly on these types of technologies?
Hura: On an ongoing basis, you should do call monitoring, which simply means having the VUI designer or maintainer of the application go through and listen to a representative number of calls periodically. Exactly how often you are going to want to do these tests depends on how often you make changes to the application, how high your call volume is, and whether or not you are seeing any problems. If there is evidence of problems, then obviously you are going to want to jump right in. In addition to that, running tuning reports on a fairly regular basis makes sense. If you make very few changes to your application, you could probably get away with doing a round of tuning once or twice a year, but if there are changes being made to the application, you will want to go through and do call monitoring or a limited round of tuning quarterly. The other thing you should be willing to consider every one to two years is having some kind of third-party review of your application. Again, usability testing could be a logical follow-up if during tuning or call monitoring you noticed particular problems and you were trying to diagnose what was happening and why.

ST: What measurements/indicators should be met before rolling the system out?
Hura: That is something that you are going to want to look at on a case-by-case basis. It is all about making sure that you will be meeting the can vary dramatically from one group of users to another. There are a lot of things that play into that, like having done adequate requirement-gathering up front. Mainly, what you will be looking at is whether you have good enough recognition performance and whether people will be reasonably able to respond to the prompts that you have written; both of those things have a subjective and an objective component. You want to get reactions and opinions from users for the subjective measurements and calculate and keep track of statistics for the objective measurements. There will always be two sides as to whether an application is ready to deploy. Most likely you will need these two things when looking at success criteria as well.

Minimize, Minimize, and Minimize Again
Project leaders, business managers, and department heads are there to roll out the systems that the enterprise purchased and to ensure that the goals set forth by the C-level executives are being met. These individuals will be involved with all aspects of the system’s operation, including testing, tuning, and maintaining it. Ian Jacobs, a strategic analyst with Frost & Sullivan’s North American Information and Communications Technology Practice, has more than 12 years of experience working with vendors and end users on automatic call distributor systems, IP contact centers, CRM enablement, and offshore outsourcing of customer support. He addresses the concerns of this group and offers his suggestions for the continuous improvement of the solution.

ST: Who needs to be involved in the planning, implementation, and tuning of the speech system?
Jacobs: If we are talking about the contact center, then we are talking about points where there are actual customers on the other end of the phone calling in. You will obviously need the technologist who can tell you what the technology can do, but you also need user-experience designers who understand a customer’s perspective on how things work. The problem that we have found is that most user-experience designers have a lot of experience in visual application design. The rules and shortcuts that they have come up with in that milieu don’t really work in the voice arena. It is much more difficult to present a wide array of information in a voice application than it is on screen. You need somebody who actually focuses on the user experience, not necessarily people who code applications or build the voice application themselves. There also need to be domain experts, so if the domain in this case is customer service, you need customer service experts.
There also need to be several different tiers of testers involved once you build your first iteration of the system. Advice that probably only works for a smaller company is that if you have time, not for the first iteration but maybe the next-to-final one before going live, get the CEO to call in. You will find out really quickly how frustrating it is if the CEO finds it frustrating. More realistically, there are product line managers, for example, who could call in if we are talking about a consumer goods company. Somebody from the product side of the thing that is being supported should be calling in. Then, of course, you should have the people who stand in for the customer calling in. In most cases, that would be a contact center agent.
The only thing I would add if we were talking about internal uses and internal helpdesks is an advocate for the employees who are actually going to use the system. It is a lot easier to mandate that all employees who are going to use the system do the voice recording of numbers or whatever the system needs to tune to a specific person’s voice. You can make your employees do that, but you can’t make your customers do that. You need an employee who can say when something is overly burdensome or taking too much time.

ST: What types of baseline measurements should be taken before the speech implementation begins?
Jacobs: If you are talking about customer interactions, something that is fairly easy to do, if you have a decent trouble ticket system or the equivalent, is to look at the bulk of the calls that you get and analyze the top two or three issues that people call in for. What you are trying to do is automate those two or three things that make up 20 percent of total issues but 80 percent of the call volume. You also need to analyze what sorts of things actually are possible. You want to measure the potential level of frustration and then design it to one step before frustration.
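The analysis Jacobs describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `top_issues_by_volume`, the flat list of issue labels, and the 80 percent cutoff are all hypothetical stand-ins for whatever a real trouble ticket system exports.

```python
from collections import Counter

def top_issues_by_volume(ticket_issues, coverage=0.8):
    """Return the most frequent issues that together reach `coverage` of calls.

    `ticket_issues` is assumed to be one issue label per call, as might be
    exported from a trouble ticket system (hypothetical format).
    """
    counts = Counter(ticket_issues)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0
    selected, covered = [], 0
    # Walk the issues from most to least frequent until the target share is met.
    for issue, n in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append((issue, n))
        covered += n
    return selected, covered / total

# Made-up sample data: 100 calls across four issue types.
tickets = (["balance"] * 50 + ["reset password"] * 30
           + ["address change"] * 10 + ["other"] * 10)
selected, share = top_issues_by_volume(tickets)
```

If the selected list comes back with two or three entries, those are the natural candidates for automation; a long list suggests call volume is too evenly spread for a small speech application to capture most of it.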

ST: What should call center managers and project leaders measure once the system is in place?
Jacobs: Some of the obvious ones are call abandonment rates or call kick-outs. If you follow the gethuman.com standard and allow people to jump out of an IVR into the agent queue, you need to figure out how many people are jumping out and at what point they are jumping out. Usually you will find that the problems are in specific areas.
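One way to find where callers jump out is to tally exits by the last prompt reached. The sketch below assumes a hypothetical call log in which each call records the prompts it visited and a final outcome; the field names and outcome labels are invented for illustration and are not taken from any particular IVR platform.

```python
from collections import Counter

def exit_rates_by_prompt(calls):
    """Return {prompt: share of calls reaching it that left the IVR there}.

    Each call is assumed to be a dict with 'prompts' (names in the order
    visited) and 'outcome' ('completed', 'opted_out', or 'abandoned').
    """
    reached = Counter()
    exited = Counter()
    for call in calls:
        prompts = call["prompts"]
        # Count each prompt once per call, even if the caller looped back to it.
        reached.update(set(prompts))
        if call["outcome"] in ("opted_out", "abandoned") and prompts:
            exited[prompts[-1]] += 1
    return {prompt: exited[prompt] / reached[prompt] for prompt in reached}

# Made-up sample log of four calls.
calls = [
    {"prompts": ["main", "account"], "outcome": "completed"},
    {"prompts": ["main", "account"], "outcome": "opted_out"},
    {"prompts": ["main"], "outcome": "abandoned"},
    {"prompts": ["main", "account"], "outcome": "opted_out"},
]
rates = exit_rates_by_prompt(calls)
```

Sorting the result by rate points to the specific areas where, as Jacobs notes, the problems usually cluster.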
You also need to measure faulty or incorrect inputs. If there is some way that you can capture those, you can figure out whether the problem is a cultural thing, such as accents or pronunciation. Those are the things that you should be measuring for the continuous improvement.
The best practice here would be to minimize, minimize, minimize. Make sure you only do one or two things and do them really well.

ST: What are some reasonable goals to expect within the first year and then over the long run?
Jacobs: First of all, there is definitely going to be cost savings, especially in the human resources area, if it is done right. That is a completely reasonable goal. You need to have reasonable expectations. It is not going to completely transform your business in a year, especially if you take my advice and minimize, minimize, minimize.
As far as revenue gains, there are some uses of voice systems where companies can reasonably expect to see gains in revenue, but they are very specific uses.
You are going to find that if you set the system up right, you are going to see greater employee satisfaction in the contact centers because most people who work there like to help people and are happy to get rid of the mundane tasks.

ST: What implementation pitfalls should the call center manager avoid?
Jacobs: As a general category, the biggest one is taking the rules of the visual realm and applying them to voice. There are specific things voice has a harder time doing than visual applications and people still do it. One is figuring out how to get users to choose from a very long list of data. Natural language speech recognition tries to solve that problem, but then you run into accent issues or pronunciation issues. Presenting too many options to the user is another pitfall, and one that you can easily avoid by limiting the number of options. The main strength that visual applications have that voice doesn’t is that it is very easy for a visual application designer to present the constraints of the application to the user. To do the same thing in a voice application would require the application to tell you each time you select something what options are available to you. You have to design the interactions in a way that the users understand quickly what they can and cannot do. One of the ways to do that is to not mix input methods. If it is only a touchtone application, then that is all that you will get. As long as you say what the limits are once, you can avoid the pitfalls of having confused users.
Because voice applications have a hard time expressing the constraints to the user, users tend to make more mistakes with input with voice systems than they do with visual applications. The voice applications should be extremely tolerant of incorrect input, but they are not. You need to build in exceptional tolerance and then gently redirect (without chastising) the users.
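A rough sketch of what tolerant input handling with gentle redirection might look like in dialog logic follows. Everything here is hypothetical: the function `handle_turn`, the variant lists, the prompt wording, and the three-attempt limit are illustrative choices, not features of any particular speech platform.

```python
def handle_turn(utterance, accepted, attempt, max_attempts=3):
    """Decide what to do with one caller response.

    `accepted` maps each menu choice to the set of phrasings treated as that
    choice. Returns ("accept", choice), ("reprompt", message), or
    ("transfer", message).
    """
    normalized = utterance.strip().lower() if utterance else ""
    # Be tolerant: accept any known variant of a choice, not one exact phrase.
    for choice, variants in accepted.items():
        if normalized in variants:
            return ("accept", choice)
    # After repeated misses, hand off to an agent instead of looping.
    if attempt >= max_attempts:
        return ("transfer", "Let me connect you with someone who can help.")
    # Redirect gently by restating the options, without blaming the caller.
    options = ", ".join(accepted)
    return ("reprompt", f"You can say {options}.")

# Made-up menu with a few accepted phrasings per choice.
accepted = {
    "balance": {"balance", "my balance", "account balance"},
    "payments": {"payments", "pay", "make a payment"},
}
first = handle_turn(" My Balance ", accepted, attempt=1)
second = handle_turn("umm", accepted, attempt=1)
third = handle_turn("", accepted, attempt=3)
```

The reprompt restates what the caller can say rather than pointing out the mistake, which is one simple way to redirect without chastising.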
