Speech Technology Magazine

With Voice Systems, Hiding Complexity Is a Must

Added functionality shouldn't equal a burden on users
By Susan L. Hura - Posted Aug 1, 2016

How should designers manage increasing complexity in automated systems without overwhelming the user? The rule of thumb is to hide as much complexity as possible behind the scenes while exposing the user to just enough of it to have control and freedom. Two of my recent interactions demonstrated that striking this balance is tricky but achievable.

The successful example came from a new Siri feature on the iPhone. I use Siri multiple times per day for voice texting, as it beats tapping out messages on a tiny keyboard. A couple of weeks ago, I needed to text my daughter while I was out running errands. Before pulling out of a parking lot, I pressed the button on my iPhone to start dictating a text, assuming I’d have to wait till I reached my destination to send it. But to my surprise, Apple had added a new texting-in-the-car function: Instead of silently displaying the results of my dictated text onscreen (as it does outside the car), Siri read my message aloud and then asked if I was ready to send it. I’m normally skeptical of using this term in relation to design, but I was delighted.

Apple did everything right: It identified an unmet user need, added intelligent functionality to address it, and enabled the functionality to spring into action when appropriate, behind the scenes. I didn’t have to learn anything new, change any settings, or even turn the feature on—the functionality was there when I needed it and added zero complexity to the interaction.

Apple avoided complicating the interface by making smart assumptions. It correctly assumed that switching to audio confirmation of voice texts just makes sense when the user is driving. And the developers got the technology to do the work of detecting when the user is in the car and switching to the new modality automatically. Bravo!

I found an opposite example on the website lovemyecho.com, dedicated to news about Amazon Echo and opinions from blogger April Hamilton. I’ve generally found her writing to be informative and witty, but a post titled “Alexa’s Growing Pains” defends added complexity in the interface of Alexa (Echo’s voice service) in a way I find unsupportable.

The post describes newly added functionality for playing music on Echo and its effects on interaction. Hamilton’s thesis is that Alexa’s early interaction model was too simple to meet user needs. The new functionality allows users to ask for music by genre as well as by artist or song within their Amazon Music Library. Hamilton happily reported that she was able to say “Alexa, shuffle new wave,” a command not previously supported because it refers to a genre she set up.

The post goes on to mildly chastise another user who complained that a very similar command that used to work for him (“Alexa, play smooth jazz”) no longer did. Hamilton explains that “smooth jazz” is no longer interpreted the way it used to be because of how it’s represented in Prime Music genres, and suggests several longer, more complex commands the user could speak to get the result he wanted. She ends the post defending the new demands on users:

“With each new capability, Alexa gains new vocabulary and the user gains a new or improved level of control, but it comes at the cost of increased interaction model complexity.... Remember: we wanted Alexa to get smarter, we wanted more options. The more options available to Alexa, the more specific your request must be.”

I couldn’t disagree more with this analysis and conclusion.

Interactions need not become more complex when features are added. Simplicity for its own sake is not necessarily the end goal of interaction design, but as systems become more complex, as design guru Don Norman puts it, “it is the job of the designer to manage that complexity with skill and grace.”

New functionality should not burden the user. If Alexa made some intelligent assumptions when interpreting user requests, the resulting interaction could remain simple and still support new functionality. The specifics of what assumptions make sense need to be discovered via research with Alexa users, but the basic idea is to stop taking music requests at face value. Alexa could capitalize on what she knows about the user’s music library and make decisions about how to interpret a request for something like “smooth jazz” intelligently. A smart decision strategy could allow Alexa to achieve what I experienced with Siri: added functionality while maintaining simplicity for the user.
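To make the idea of a decision strategy concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `MusicLibrary` fields, the `CATALOG_GENRES` set, and the order in which interpretations are tried are assumptions for illustration only, not a description of how Alexa actually works. As noted above, the right assumptions would have to come from research with real users.

```python
from dataclasses import dataclass, field


@dataclass
class MusicLibrary:
    """Hypothetical stand-in for what the assistant knows about one user."""
    user_genres: set = field(default_factory=set)  # genres the user set up
    artists: set = field(default_factory=set)
    songs: set = field(default_factory=set)
    # Maps a past request phrase to the interpretation that worked for it.
    history: dict = field(default_factory=dict)


# Illustrative catalog-wide genres, not a real service's list.
CATALOG_GENRES = {"jazz", "smooth jazz", "new wave"}


def interpret_request(phrase, library):
    """Return (interpretation, normalized_phrase) for a music request.

    Assumed priority: what worked before, then the user's own library,
    then the wider catalog, and only then a clarifying question.
    """
    key = phrase.strip().lower()

    # 1. A phrase that used to work keeps working the same way.
    if key in library.history:
        return (library.history[key], key)

    # 2. Prefer the user's own library over the wider catalog.
    if key in library.user_genres:
        return ("user_genre", key)
    if key in library.artists:
        return ("artist", key)
    if key in library.songs:
        return ("song", key)

    # 3. Fall back to the catalog.
    if key in CATALOG_GENRES:
        return ("catalog_genre", key)

    # 4. Ask only when nothing plausible matches.
    return ("ask_user", key)


if __name__ == "__main__":
    library = MusicLibrary(
        user_genres={"new wave"},
        history={"smooth jazz": "catalog_station"},
    )
    print(interpret_request("New Wave", library))
    print(interpret_request("smooth jazz", library))
    print(interpret_request("polka", library))
```

In this sketch, "smooth jazz" keeps resolving the way it did before the new feature arrived, and "new wave" resolves to the genre the user set up. In both cases the user speaks the same short command and the system does the work of choosing an interpretation.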


Susan Hura, Ph.D., is a principal and founder of SpeechUsability, a VUI design consulting firm. She can be reached at susan@speechusability.com.
