Bueno? Are You Listening to Your Spanish Speakers?

Spanish speakers are the fastest growing population in the United States. To realize the full potential of this largely untapped market, businesses must plan for the impact this will have on self-service automation; voice user interface designers need to incorporate best practices targeted for these users; and the industry as a whole should offer technology solutions and packaged applications specific to Spanish as spoken in the United States.

Bienvenido

If you think your business' approach to voice self-service doesn't need to provide support in Spanish… That current market demands don't justify investing in speech recognition technology for U.S.-Spanish speakers… Or that the solution is simply to "translate" your English audio prompts and recognition grammars… Think again!

As of the 2000 Census, 13 percent of the U.S. population is Hispanic/Latino - a stunning 58 percent increase over 1990. Twenty-eight million people (11 percent) speak Spanish, and 13 million (five percent) speak primarily Spanish-only. Hispanics are now the largest U.S. minority, with an annual purchasing power estimated in 2003 at more than $653 billion. And this unprecedented growth is predicted to continue. By 2050, one out of every four people in the United States will be Latino.

Understandably, this increase has affected call center traffic, making it a key opportunity for speech automation. However, the business impact and technical challenge of multilingual applications are often unknown, misunderstood or left as one of the last requirements in a speech implementation. Our challenge as an industry, as VUI designers and as business decision-makers is: How do we serve this growing, multilingual and multicultural market?

One of the most challenging tasks to encapsulate in custom or pre-packaged applications is address recognition, especially in the context of multiple languages and cultures. Our primary research findings on U.S. addresses in Spanish reveal surprising results that underscore the need for better understanding of this target market.

Your Competitors May Already be Speaking to Your Customers

Today, key vertical industries market directly to Hispanic consumers (see Figure 1) and leading companies are exploring Spanish speech applications to win and retain this lucrative consumer base.

Unfortunately, Spanish often isn't given the attention it deserves. Frequently it's an afterthought: "Oh, and it also needs to work in Spanish." It's assigned inappropriate resources: "Sure, we have an internal employee who had Spanish in high school and can do the translations." Or, it's simply disregarded: "Well, if they want to do business with us, they need to speak English." Only later do companies wonder in surprise about their investment in the Spanish application: "Why isn't it working as well as in English?"

Today it is possible to create quality U.S.-Spanish applications. But VUI designers, program managers and business decision-makers need expertise in the cultural and linguistic characteristics of this caller population in order to evaluate which functionality is readily supported by the current technology and which functionality requires a more custom solution. Also, projected call volume increases should be considered when cost-justifying customization.

English Callers are from Mars. Spanish Callers are from Venus.

When using speech automation, English and Spanish callers may respond very differently. If the application asks "What's your account number?", English speakers typically say their number. But in response to the equivalent Spanish question "¿Cuál es su número de cuenta?", Spanish speakers often give more details about why they're calling.

U.S.-Spanish speakers come from diverse backgrounds as shown in Figure 2, but here are some generalizations about their language and cultural identity:

  • May be reluctant to use automation.
  • Strongly value relationships.
  • Can be more reserved until rapport is built.
  • Communication evolves from formal ("Usted"-'you') to informal ("tú"-'you').
  • Generally more conversational, with more social greetings and background information.
  • Often involve families in the decision-making process.
  • More background noise or side-speech during a call.
  • Multiple calls may be needed to complete a transaction by phone as the caller gets family consensus.
  • Even younger consumers in the prime of their earning potential prefer to use Spanish - many are recent immigrants.

Speech applications must reflect these differences, since they can greatly impact a caller's acceptance of an automated system, their perception of a company's or product's brand, and ultimately their purchasing decisions. Language is powerful word-of-mouth advertising: 90 percent of consumers state that positive customer service influences their decision to do business with a particular brand and 50 percent who have a negative experience tell an average of five friends and family members. Given that Hispanics/Latinos more strongly retain their culture and language, you can't afford not to get this right.

Best Practices

So, you're thinking of putting Spanish on your voice self-service roadmap? See Figure 3 for a checklist of things to keep in mind when getting started. Our Top 10 Survival Strategies combine industry best practices as applied to U.S.-Spanish applications, with the authors' experience on 27 Spanish or multilingual speech-recognition projects executed at Edify, Syntellect and other industry leaders.

What's So Interesting about U.S. Addresses in Spanish?

A major difference between English and Spanish as spoken in the United States is address capture. Edify recently conducted primary research on how U.S.-Spanish speakers say their addresses, because it wasn't entirely clear how the addresses would be spoken.

Additional challenges include:
  • Frequent grammar updates as new roads and housing are built.
  • Specialized dictionary entries for place names (e.g., Vallejo (vah-LAY'-hoh) rather than the native Spanish (vah-YAY'-hoh) or the phonetic English (VAL'-le-joh)).
  • In-grammar accuracy percentages in the mid-to-high-80s due to grammar complexity, as opposed to the mid-90s for other types of grammars.

The situation is even more complicated in Spanish.

First, many Spanish speakers may not know their ZIP code for various cultural or logistical reasons. Also, when saying their ZIP code and other numbers, Spanish speakers are more likely to use natural numbers than their English counterparts:

10001 as: "diez mil uno" ('ten thousand one')
Rather than: "uno, cero, cero, cero, uno" ('one zero zero zero one').

However, there are more challenging recognition and dialog issues than the ZIP code. More interesting is how speakers say the street address, since the vocabulary and word order vary between the two languages.

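To make the ZIP code observation above concrete, here is a minimal sketch in Python of how an application might normalize both spoken styles to the same five digits after recognition. It is an illustration only, not the approach used in the research described here: the function name spoken_zip_to_digits, the word tables and the zero-padding rule for natural-number readings are all assumptions, and a production grammar would need to cover many more variants (for example, digits spoken in pairs).

```python
import unicodedata

# Hypothetical word tables for this sketch (accents are stripped before lookup).
UNITS = {
    "cero": 0, "un": 1, "uno": 1, "dos": 2, "tres": 3, "cuatro": 4,
    "cinco": 5, "seis": 6, "siete": 7, "ocho": 8, "nueve": 9, "diez": 10,
    "once": 11, "doce": 12, "trece": 13, "catorce": 14, "quince": 15,
    "dieciseis": 16, "diecisiete": 17, "dieciocho": 18, "diecinueve": 19,
    "veinte": 20, "veintiun": 21, "veintiuno": 21, "veintidos": 22,
    "veintitres": 23, "veinticuatro": 24, "veinticinco": 25,
    "veintiseis": 26, "veintisiete": 27, "veintiocho": 28, "veintinueve": 29,
}
TENS = {
    "treinta": 30, "cuarenta": 40, "cincuenta": 50, "sesenta": 60,
    "setenta": 70, "ochenta": 80, "noventa": 90,
}
HUNDREDS = {
    "cien": 100, "ciento": 100, "doscientos": 200, "trescientos": 300,
    "cuatrocientos": 400, "quinientos": 500, "seiscientos": 600,
    "setecientos": 700, "ochocientos": 800, "novecientos": 900,
}
DIGITS = {word: value for word, value in UNITS.items() if value < 10}


def _strip_accents(text):
    """Remove combining accent marks so 'dieciséis' matches 'dieciseis'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def spoken_zip_to_digits(utterance):
    """Return a 5-digit ZIP string for a Spanish utterance, or None.

    Accepts either five single digits or one natural number below 100,000.
    Assumption: a natural-number reading is zero-padded on the left.
    The parser is permissive and does not validate Spanish number grammar.
    """
    cleaned = _strip_accents(utterance.lower()).replace(",", " ")
    tokens = [t for t in cleaned.split() if t != "y"]
    if not tokens:
        return None

    # Digit-by-digit style: "uno, cero, cero, cero, uno"
    if all(t in DIGITS for t in tokens):
        if len(tokens) == 5:
            return "".join(str(DIGITS[t]) for t in tokens)
        return None

    # Natural-number style: "diez mil uno"
    total, current = 0, 0
    for t in tokens:
        if t == "mil":
            total += (current or 1) * 1000
            current = 0
        elif t in UNITS:
            current += UNITS[t]
        elif t in TENS:
            current += TENS[t]
        elif t in HUNDREDS:
            current += HUNDREDS[t]
        else:
            return None  # out-of-vocabulary word
    value = total + current
    return f"{value:05d}" if value <= 99999 else None


print(spoken_zip_to_digits("diez mil uno"))
print(spoken_zip_to_digits("uno, cero, cero, cero, uno"))
```

Both calls should yield the same ZIP code, which is the point of the sketch: the recognizer needs to accept both ways of saying the number, and the application can then treat them identically.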
But callers living in the United States frequently see or hear addresses in English. This may influence how they say them, even when speaking in Spanish. For instance, callers may mix Spanish and English words or word order, prompting a number of different hypotheses about possible patterns as outlined in Figure 6. If there is too much variation in the vocabulary or word order, the Spanish street-address grammars would be multiple times larger than their English counterparts, yielding a recognition accuracy far below the high-80s.

To determine if there are predictable enough patterns to achieve reasonable recognition rates, data based on actual caller utterances was collected. Details about this case study, our methodology and findings can be found in Kaiser and Ahlén (2004). Figure 7 highlights the results of this research, based on 70 calls of Spanish-speaking utility customers saying their home address to a call center agent.

When saying their street address, the majority of the speakers used a mix of English and Spanish vocabulary. The street number was always said in Spanish (100 percent), and the street name was almost always said in English (98 percent), often with a heavy Spanish accent, e.g., "Main" might be pronounced like the English word "mine." The street type was frequently omitted (47 percent); most speakers who did give it said it in English (40 percent), with a fair number using an equivalent Spanish word (12 percent).

A mixture of English and Spanish word order was also observed. In terms of grammar complexity, the two key elements with respect to word order are street number and street name, since there are many different possible street numbers and names. If the word order between these elements is fairly fixed, this would greatly reduce the grammar complexity. Word order for direction and street type is less important, since there aren't as many possible options and thus their word order has little impact on the overall grammar complexity. Surprisingly, we found that in 91 percent of the cases the order between the street number and the street name consistently followed the English pattern, with the number (spoken in Spanish) preceding the name (said in English), even though sometimes Spanish word order was used for the direction or street type in the same utterance.

These findings suggest it should be possible to create grammars and dictionaries yielding reasonably good recognition performance for U.S. addresses in Spanish, particularly if the system's audio prompts can guide the callers toward saying their address in this format, with the street number preceding the street name, all in the same utterance. We refer the interested reader to Kaiser and Ahlén (2004) for suggestions on how to word such prompts and recommend that usability studies be performed to determine the best prompting strategies.

This research underscores that it's not simply a matter of translation. If the English prompts and grammars had simply been handed off to a translator, the street address would likely have been translated into the most common format used in Spanish-speaking countries - with the street name preceding the street number. So the resulting design would have been ill-equipped to handle the actual caller responses.

Industry Challenge

To take support for U.S.-Spanish to the next level, the industry now needs:
  • Packaged solutions for U.S.-Spanish address- and name-capture to parallel existing English offerings.
  • Recognition engines specialized for U.S.-Spanish, including U.S.-Spanish acoustic models, standard dictionaries with U.S.-Spanish-specific vocabulary and U.S.-Spanish grammars.
  • TTS engines specific to U.S.-Spanish, mixing the vocabulary, word order and pronunciation in appropriate ways and in the same voice. Neither Spanish- nor English-based TTS engines can currently provide the right mix without extensive customization and costly tuning.

Moreover, these offerings have been lacking for quite some time. In 2000, a previous client had wanted address capture in both languages. But, since there was no speech-industry offering for U.S. addresses in Spanish and a custom solution was cost-prohibitive, the client was persuaded not to automate Spanish addresses, but instead transfer callers to live agents. Rather than talking clients out of features that require support in U.S.-Spanish, we challenge the industry to provide solutions for this area of targeted growth and opportunity.

Spanish speakers are the fastest growing market segment in the United States. Today, they are 10 percent of your customers. Soon they will be 25 percent of your customers. Let's start talking to them!

References

Kaiser, Lizanne and Sondra Ahlén. (2004). "Are you Listening to your Spanish Speakers? How Spanish speakers in the U.S. say U.S. addresses." Presented at AVIOS~SpeechTEK Spring conference, San Francisco, CA.

"Missed Opportunities - Two-thirds of Major Advertisers Missing Opportunities." Retrieved April 16, 2004 from Association of Hispanic Advertising Agencies' website: http://www.ahaa.org/research_study_02/pressconf_document_web.pdf

Olvera, Eduardo. (2004). "VUI Design and Development Tips and Best Practices. Spanish as a 2nd language." Presented at AVIOS~SpeechTEK Spring conference, San Francisco, CA.

Sondra Ahlén, M.A. is the principal VUI consultant and owner of SAVIC, an independent consulting firm in Silicon Valley. She works with industry companies such as Edify and Nuance, specializing in Spanish and Portuguese language applications. She can be reached at sondra@savicvoice.com.

Lizanne Kaiser, Ph.D. is principal consultant in the speech solutions group at Edify Corporation. She can be reached at Lizanne.Kaiser@edify.com.

Eduardo Olvera, M.S.E. is a senior speech analyst in the speech services group at Syntellect, Inc. He can be reached at eolvera@syntellect.com.
