Designing a Speech Interface? Learn From Web Design Fails


Early in 2021 I wrote an article that described how difficult it was to use various websites—actually, every single website in my area—to find a dose of COVID-19 vaccine. That article discussed the lessons a website designer can learn from speech interfaces that handle customer requests. Two salient questions emerged:

• How would an intelligent human operator handle the interaction?

• Does the website use its vast, instant access to current data and previous interactions to make the search as easy and painless as possible?

Unfortunately, the answer to the second question was “no.” Not only were the websites amazingly clumsy, but almost all wouldn’t even remember what I did 30 seconds previously and demanded that I re-enter the same information over and over again.

Now it’s time to ask the opposite question: What can the speech community learn from our experiences with web pages? I spend all day in front of my terminal, and I often have to leave the safe havens of Stack Overflow and venture onto corporate websites in search of information. A well-designed website is quite the rarity; but if I pay attention, the annoyances and mistakes on websites can provide valuable lessons.

Don’t get me wrong: Speech interfaces do avoid problems that plague most websites. I think the best example is the creative ways that website designers make entering a date as difficult as possible. Want to know what year I was born? Drop-down menus to select my year of birth, a favorite among websites for some reason, are particularly annoying at my age: I have to scroll back more than six decades. Speech interfaces just ask me to say the year.

Here’s a different source of irritation: entering a phone number. If I send someone my phone number via email, I usually just write it out with a +1 in front for the country code. Websites, on the other hand, have a wide variety of different and vexing requirements. Some demand that you enclose the area code in parentheses and put a dash between the exchange and line numbers. Other websites stipulate that you enter your phone number without any spaces. Almost all throw a fit if I include the country code; I’ve yet to see one that lets me write +1, or will accept dots instead of dashes in the number. I have conjectures about why they do this but, to paraphrase one of my favorite high school teachers, these are excuses, not reasons. Speech interfaces don’t make this mistake: They do not require that I pronounce the spaces or dashes in a phone number.
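A web form could be just as forgiving as a speech interface. The following is a minimal sketch in Python, not a description of how any particular website works. It assumes ten-digit North American numbers with an optional country code of 1, and the function name normalize_us_phone is made up for illustration.

```python
import re


def normalize_us_phone(raw: str) -> str:
    """Return the number as +1XXXXXXXXXX, or raise ValueError.

    Assumption: ten-digit North American numbers, optionally
    preceded by the country code 1.
    """
    # Drop everything that is not a digit: spaces, dots, dashes,
    # parentheses, and the leading plus sign.
    digits = re.sub(r"\D", "", raw)

    # Accept an optional leading country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]

    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")

    return "+1" + digits
```

With this approach, "+1 (312) 555-0142", "312.555.0142", and "3125550142" all produce the same stored value, and the user never sees a formatting complaint.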

And yet sometimes interactive voice response (IVR) systems will make extraordinary formatting demands that generate user errors. Years ago—thankfully, it’s been fixed—an IVR interface at a local hospital asked me to enter “the group number. The group number is the first two digits to the left of the dash in your account number.” I failed at this simple task—it was too mind-boggling to figure out what they needed, and I kept entering the two digits to the right of the dash. I’ve run across similar errors many times in both voice and web interfaces: The system insists that I manually parse a number instead of letting me enter the entire number.
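The alternative is to accept the whole account number and let the software do the parsing. Here is a rough Python sketch of that idea. The account format (digits, a dash, more digits) and the rule that the group number is the two digits immediately to the left of the dash are assumptions chosen purely for illustration; the hospital's actual format is not known.

```python
import re


def group_number(account: str) -> str:
    """Extract a group number from a full account number.

    Hypothetical layout: digits, a dash, more digits, e.g. "48213-07".
    Assumption: the group number is the two digits immediately to the
    left of the dash.
    """
    # Tolerate stray spaces around the number and around the dash.
    match = re.fullmatch(r"\s*(\d{2,})\s*-\s*(\d+)\s*", account)
    if match is None:
        raise ValueError("expected digits, a dash, then digits")

    left_of_dash = match.group(1)
    return left_of_dash[-2:]
```

The caller types or says the entire number once, and the system, not the customer, decides which digits matter.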

Consider another web annoyance. How often do you go to a website and, as you try to navigate, a pop-up advertisement shows up offering you some good or service or subscription? I find these bizarre and rage-inducing; I’ve yet to meet a customer who thought these were fun and interesting. Telephony services often include the same nonsense. When my internet service goes down I call my internet service provider (ISP), and I’m greeted by an announcement that cheerfully tells me that I can use the website to handle my account instead of calling. That’s the definition of adding insult to injury: the only reason I’m calling is that my internet service is down, and my ISP knows (or should know) my phone number and could perhaps guess that I’m calling for a good reason. Why doesn’t my ISP look at my phone number and remember that I only call when there’s an outage?

The other voice counterpart, offering me goods and services before letting me make a menu selection, I find particularly frustrating. Unlike on a website, I can’t even click to close the window and escape the sales pitch; I must wait and then try to get my actual business done.

The irritating pop-up that asks me to consent to various forms of cookie tracking (thanks, European Union!) finds its counterpart in the ever-popular IVR announcement “please listen carefully as the menu items have changed.” I don’t actually believe that these change every day, or every week, or every month, or even every year.

By the way, I realize that website designers have run “A/B testing” on website designs and determined that pop-up boxes, etc., do work. My retort is simple: It depends on what your goals are. Their goal is to sell me something. My goal is to accomplish a task and move on to the next one. Pop-up boxes have no place in a website, and their equivalents have no place in IVR design.

One final note: My web browser lets me block most nuisances via settings and plug-ins. IVR systems do not offer these options. If my only options for doing business with a company are an annoying IVR and an annoying website, I will do my best to choose another company.

Moshe Yudkowsky, Ph.D., is president of Disaggregate Consulting and author of The Pebble and the Avalanche: How Taking Things Apart Creates Revolutions. He can be reached at speech@pobox.com.
