Focus Groups Can Optimize Speech Applications

In designing a voice user interface, no one can express customers’ preferences better than the customers themselves.
By Nava Shaked - Posted Nov 1, 2007
It is often said that the challenge in planning a speech application lies in its design. The focus has moved from core technology—engines and algorithms—to implementations and voice user interface (VUI) design. Much has been made of the importance of VUI as a success factor in speech-based systems. Since we as designers are trying to build a relationship with the customer throughout the dialogue, it is very important to give the user a tool to enable a successful and convenient service, task completion, and a sense of comfort.

Our task is to create an experience that brings the user back to the system, and to automate as many activities as possible. A bad experience will either fail to bring the user back to the service or push the user to a live agent.

As designers, we also have a commitment to our clients; we must consider the needs of the company, marketing issues, and the branding and persona associated with the organization. The system’s VUI must maintain and reflect the organizational consensus—the very same one with which end users are so familiar.

To bring these two sides together, we must acknowledge that no one can take the place of end users in telling us how they like to be served and what makes for a positive experience. Although general surveys of user behavior and preference exist, each organization must check with its own end users separately to get results that apply to its own service.

One best practice is using focus groups for the application, enabling us to identify and characterize important building blocks of grammar, lexicon, and dialogue. These include speech application perceptions and prejudices, aliases and semantic networks, terms with double meanings and ambiguities, and style and VUI preferences.

A case in point: while I was designing an IVR for Hpoalim Bank, the company assembled five focus groups from among its customers, using the following criteria:
• Keeping the right mix of users: active IVR users, users who are not active on the IVR (they can be Internet users), and active users of existing speech-enabled applications.
• Covering two major age groups: 25- to 40-year-olds, and 40- to 60-year-olds.
• Maintaining a balance between genders: 50 percent male and 50 percent female. This holds true for a bank, but numbers can vary depending on a company’s customer base.
• Keeping a moderate group size to get valuable results.
• Allowing enough time for people to get familiar with the focus group concept and feel comfortable expressing their thoughts. Two- to three-hour sessions worked best.
• Recording all sessions to allow for several reviews.

The focus groups revealed solid, applicable data. From the professional side we got a very clear list of aliases and new expressions for the application’s grammar and disambiguation lexicon—banking terms versus user terms. We also got ideas about new phrasing of messages, proposals for the proper order of lists, and suggestions for the flow of the system’s dialogue. From the behavioral side, we noticed that users, while interacting with a machine, naturally adjust their speaking styles or languages accordingly.
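The alias list described above can be pictured as a lookup from the phrases users actually say to the banking terms the application understands. The following is only a minimal sketch in Python, not the bank's grammar: the intent names, the phrases, and the functions `build_alias_index` and `resolve` are all hypothetical, and the naive substring matching stands in for whatever the real speech grammar does.

```python
# Hypothetical alias lexicon: every intent name and phrase below is
# illustrative and does not come from the bank's actual grammar.
ALIAS_LEXICON = {
    "order_checks": ["order checks", "new checkbook", "more checks"],
    "account_balance": ["balance", "how much money do i have"],
    "transfer_funds": ["transfer", "move money", "send money"],
}


def build_alias_index(lexicon):
    """Invert the lexicon into a phrase -> set of intents lookup."""
    index = {}
    for intent, phrases in lexicon.items():
        for phrase in phrases:
            index.setdefault(phrase.lower(), set()).add(intent)
    return index


def resolve(utterance, index):
    """Return the sorted intents whose aliases appear in the utterance.

    More than one result signals an ambiguity that the dialogue
    would need to resolve with a follow-up question.
    """
    text = utterance.lower()
    matches = set()
    for phrase, intents in index.items():
        if phrase in text:  # naive substring match, for illustration only
            matches |= intents
    return sorted(matches)
```

In a sketch like this, each new expression heard in a focus group becomes one more phrase in the list for its intent, and an utterance that resolves to several intents marks a spot where the dialogue needs a disambiguation question.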

We noticed a strong preference for open questions followed by instructive prompts, such as “What would you like to do? You can ask for...” We also discovered that users preferred a guided message, such as “Next time just say ‘order checks’ to get your service,” and that they think it’s crucial to announce that the service is free.
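The prompt pattern users preferred can be sketched as a few small helpers. This is a hedged illustration in Python rather than the deployed dialogue: the prompt wording follows the examples quoted above, while the function names and the option list passed in by the caller are assumptions made for the sketch.

```python
OPEN_QUESTION = "What would you like to do?"


def instructive_prompt(options):
    """Build the follow-up that lists what the caller can ask for.

    `options` is supplied by the caller; any values used with this
    sketch are hypothetical service names.
    """
    return "You can ask for " + ", ".join(options) + "."


def opening_prompt(options):
    """Open question first, instructive prompt second."""
    return OPEN_QUESTION + " " + instructive_prompt(options)


def guided_hint(shortcut):
    """Guided message that teaches the caller a direct phrase."""
    return f"Next time just say '{shortcut}' to get your service."
```

Keeping the open question and the instructive prompt as separate pieces makes it easy to test different orderings and wordings with later focus groups.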

Conclusions from these sessions might correspond to general surveys, but they allowed the bank to relate specifically to its users and design the application accordingly. There was a clear need to bridge the professional language of banking and everyday conversation, and the same bridging was done for the conversational style users preferred. Most important, focus groups allow you to check not only what users say they feel or will do, but also their actual behavior.

Our job will always be to balance technology and business requirements while keeping end users in mind. Using several different focus groups gives the best results and ensures better coverage of the language. It also secures a stronger prompt flow and a sense of convenience for users without compromising the company’s branding.

Nava A. Shaked is the CRM and Call Center Practice Leader at IBM Israel. She is a member of the AVIOS board of directors and the chair of the AVIOS chapter in Israel. She can be reached at snava@il.ibm.com.

