VUIs in Vehicles: Meeting Customer Expectations

How well do automotive speech-enabled interfaces meet customer expectations? One could say that it depends on the application, but beyond voice dialing, the room for improvement seems endless. In fact, only recently has the usability of in-vehicle voice technology begun to meet customer expectations.

Speech interfaces are most prominent in high-end vehicles; example applications include climate control, radio control, voice dialing, telematics services, traffic, and navigation. Consumers who purchase high-end vehicles generally expect high performance, and speech interfaces that are built into the car are no exception. The ultimate goal is to provide voice interfaces that approach human-to-human interaction. Talk to your car the way you want, but focus on driving.

Voice interfaces in today's vehicles have significant advantages over other interfaces such as knobs, buttons, controllers, and touch screens. The most obvious advantage is lower driver distraction: eyes remain on the road and hands remain on the wheel. However, a button, typically embedded in the steering wheel, must be pushed to initiate dialogue. Pressing the voice button is usually followed by radio muting and a short beep to signal to the driver that the car is listening. Learning to use the voice button is fairly easy, although the techniques vary depending on the make and model of the vehicle.

Audio feedback plays an important role in automotive speech interfaces. The basic idea is to play back audio prompts that indicate what the system understood. For example, a user may change a radio station by saying, "FM radio station eighty-nine point five." The audio playback (a male or female prompt) could contain the same wording: "FM radio station eighty-nine point five." The user then knows the spoken utterance was understood. The user may mean the same thing, but say the utterance differently. For example, "Radio station eighty-nine point five" would also elicit the prompt playback of "FM radio station eighty-nine point five," depending on the active grammars and dialogue design. That is, if the radio is already in FM mode, "FM" might be an optional grammar item. The point is that the wording of the audio feedback can teach the user the key words in the grammar: users who mimic the audio feedback will experience optimum performance. For voice dialing, audio feedback is used to indicate a recognized phone number or the nametag of the destination party when dialing by name.
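The optional "FM" item and the fixed feedback wording can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not how any production recognizer is built: the function names `recognize` and `feedback_prompt`, the regular-expression grammar, and the tiny number vocabulary (eighty and ninety only) are all hypothetical, and the input is assumed to be text that a recognizer has already transcribed.

```python
import re

# Hypothetical, deliberately tiny vocabulary for spoken station numbers.
UNITS = ["zero", "one", "two", "three", "four",
         "five", "six", "seven", "eight", "nine"]

# Toy grammar: [FM] radio station <tens> [<unit>] point <digit>
PATTERN = re.compile(
    r"^(?P<band>fm )?radio station "
    r"(?P<tens>eighty|ninety)(?: (?P<unit>" + "|".join(UNITS[1:]) + r"))?"
    r" point (?P<frac>" + "|".join(UNITS) + r")$"
)


def recognize(utterance, current_band):
    """Return the recognized slots, or None if the utterance is out of grammar."""
    # Normalize case, hyphens, and spacing before matching.
    text = " ".join(utterance.lower().replace("-", " ").split())
    match = PATTERN.match(text)
    if match is None:
        return None
    # "FM" may be omitted only when the radio is already in FM mode.
    if match.group("band") is None and current_band != "FM":
        return None
    return {
        "tens": match.group("tens"),
        "unit": match.group("unit"),
        "frac": match.group("frac"),
    }


def feedback_prompt(slots):
    """Always answer with the full key-word form, whatever the user said."""
    whole = slots["tens"] + ("-" + slots["unit"] if slots["unit"] else "")
    return f"FM radio station {whole} point {slots['frac']}"


short_form = recognize("Radio station eighty-nine point five", current_band="FM")
full_form = recognize("FM radio station eighty-nine point five", current_band="AM")
print(feedback_prompt(short_form))
print(feedback_prompt(full_form))
```

Both utterances produce the same prompt, so a driver who drops "FM" still hears the complete phrasing and can learn it. In this sketch the short form is rejected when the radio is not in FM mode, which is one possible reading of "optional grammar item"; a real dialogue design might instead ask the driver which band was meant.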

Another aspect of audio feedback is persona. Male and female personas are both common, and they are available as recorded speech, text-to-speech, or both. Some vehicles allow the driver to select a male or female voice. There can also be different voices depending on the application (that is, you may hear one voice for voice dialing and another voice for radio control). To maintain consistency, some believe that a single voice (male or female) should be used for all audio feedback, even when accessing off-board telematics services that are automated (such as traffic).

Speech is critical for certain vehicle applications, such as navigation, satellite radio control, and voice dialing. For navigation, the ultimate user experience is to say your destination as though you're telling a cab driver where you want to go. Today's capability is limited (though quite impressive), and it has not reached the point where the system will recognize a request such as, "Take me to an economy hotel with air conditioning within 12 miles of the airport." As for satellite radios, they offer stations that are grouped by category. Using a dial to scroll through stations isn't as appealing as saying the station number you want and having the option of hearing a list of stations from a spoken category. For satellite radio control, speech-enabled interfaces are available today.

Some in-vehicle speech interfaces are not considered critical. Controlling radio volume is a good example of a superfluous speech interface. Imagine the radio playing at a low volume. To turn up the volume by voice, one pushes the voice button on the steering wheel, which causes the radio to mute as a beep is heard. "Volume up" is uttered and the radio sound comes back on, a few decibels louder. It is easier to skip voice and use the radio volume buttons (+ and -) embedded in the steering wheel. Other non-critical speech interfaces, such as setting the temperature in the vehicle, make more sense to use. New ones are coming: "How much gas do I have?" "How many miles can I go?" "What's the temperature outside?" "When is my next maintenance?"

In summary, the automotive industry is in the process of taking interactive voice technology to the next level of performance. Embedded speech technology will converge with off-board speech-enabled telematics services, which include vehicle-centric services that often involve vehicle location. As dialogue flexibility improves, allowing users to speak more naturally, it will be interesting to see how audio feedback evolves. Will the feedback remain short and concise and lead users to speak only what is necessary? Today's vehicle dialogues are directed, but extremely useful. You won't see complex dialogue designs until some of the basics work better (consider destination entry, for example). Over time, we will experience more flexible dialogues that work well for first-time users who didn't take the time to read the manual.

Dr. Thomas Schalk is vice president of voice technology at ATX Group, a company that provides telematics services to the automotive market. He is a former president and current member of the AVIOS board of directors. He can be reached at tschalk@axtg.com.
