Speech Technology Magazine


Wearable Voice: The Next Big Thing

Wearable technology lets me take voice apps wherever I go, with little effort on my part.
By Ahmed Bouzid - Posted Sep 9, 2013

Like most people, I suspect, I first acquired a laptop, then a smartphone, and then a tablet, in that order. With my iPhone, I was suddenly able to do things in places and situations that I could not do before: mainly, at first, send and receive email wherever I was. I could also browse the Web, check the weather, look up stocks – and do all of that wherever I was. More crucially, I could do things with a much simpler sequence of movements: instead of lugging a laptop around, opening the laptop with two hands (and I would probably need to sit to put the laptop on my lap), unlocking it, and navigating to an application, I simply whipped out my iPhone, unlocked it, and clicked and swiped and I had what I wanted. And I could do it with one hand while walking. It was (and still is) a magical experience. 

With the iPad, I discovered that the sphere of things that I could do didn't expand as significantly as it did when I started using my iPhone. Reading on my iPad was definitely much better for my eyes than reading on my iPhone, and typing email was easier on my iPad. But at the same time, the device was larger and felt cumbersome to carry everywhere I went. On the other hand, nothing compares to the iPad when I want to share with people around me an image, map, or PowerPoint presentation.

In a nutshell, the progressive acquisition of my devices did two things: it enabled me to do new things in new situations, and it made certain tasks easier to do than before. In technical design terms, the progression resulted in an increase in both use cases and usability.

The next obvious wave that will increase both the sphere of use cases and the degree of usability is wearable technology.

Up to now, we have had to divert our eyes and our hands to the device that we wanted to use. We need to locate the laptop with our eyes, open it with both hands, fix our eyes on the screen to navigate the icons, and type on the keyboard and click the mouse with our fingers. The same goes for the tablet and the smartphone: engaging those devices requires us to focus our eyes and hands on the task. Think about what it takes to read a simple text message that has arrived on your iPhone. You hear the ding-ding and feel the buzz. Of course, you immediately reach for the device with one hand, bring it in front of your face, and unlock it by tapping with the fingers of the other hand. You keep holding the device with the first hand as you read the text, your visual attention almost fully consumed by it. Having determined that the text does not warrant a response, you press the home button and put the device away.

But what if I were driving, with both of my hands on the wheel and my eyes on the road, and I received that irresistible ding-ding and buzz of a text message? What if I were gardening and both of my hands were dirty when I received that text? What if I were washing the dishes, or fixing a broken pot, or painting the wall? In such situations, I want to process the text (at least find out who sent it) without needing to stop what I am doing and disrupt my activity flow for a text that might not be worth reading in the first place.

And that is where wearable technology comes in. A smartwatch, for instance, can let me see who is calling me, who has sent me an email, or which tweets of mine have been retweeted, with a simple turn of the wrist and a quick glance. I could wear a device that takes a picture every 30 seconds for as long as I have it on. I could wear a bracelet that notifies me with a different color flash depending on the event (email vs. text vs. phone call). Wearable devices are so compelling because they are always on the user, doing the user's bidding with minimal effort.

Two things set a wearable device apart. First, unlike the smartphone, let alone a tablet or a laptop, the user does not have to fetch it to use it: it is already on the body. Second, because it is always on the user, there is no need to remember to keep it nearby. The first has to do with the interaction effort, while the second has to do with the effort to keep the device within range of use.

In the context of voice interactions, a third important characteristic needs to be highlighted: because the wearable device is on the user, the distance between the user and the device will always be small and, as a consequence, the speech recognition will have a high level of accuracy and the text-to-speech output will have good audibility. Imagine receiving a text while fetching soda and pretzels. I am away from my iPhone, and both my hands and my eyes are busy fetching the goodies from the basement, when my voice-enabled device tells me that it has a text for me. I respond, effortlessly, by simply saying, “Read me the text,” and the device reads me the text while I am walking back up.

This combination (effortless interaction, no need to remember to take the device with you, and no need to go looking for it) meaningfully expands the sphere of use cases and the usability of technology. In the case of wearable voice, a whole world of situations where interactions were not possible because our hands or our eyes, or both, were busy, now opens up. Moreover, because of the nature of voice, the cost of performing many actions can be significantly reduced. Actions that I perhaps didn't bother to do as often before, I do now, because the cost of performing them is a fraction of what it used to be. I can quickly ask for the weather, sports scores, or traffic, or ask a question, because doing so takes one or two actions (push and speak, for instance) rather than a tedious sequence of tapping, swiping, and (God forbid) typing.

With voice, the level of intrusiveness is minimized. I can keep the flow of my writing and the gaze of my eyes fixed on my screen as I listen to a readout of the two Twitter direct messages that I just received, rather than having to stop what I was doing and move my whole attention to the device just to read the messages.

A small wearable device that can interact with you by voice, understand what you say, and speak back its responses has all the marks of good, life-enhancing technology. Such technology needs to be where and when you need it without interfering with your life flow; it needs to do what is asked of it without requiring you to expend much effort; and just as importantly, it needs to get out of the way as soon as it is done with its job.

The challenge is clear: making the technology perform effectively and efficiently in the rough-and-tumble flow of life is no easy feat. But in the case of wearable voice, the promise is real and it is massive. And the effort to bring it fully to life is well worth it.

Ahmed Bouzid, Ph.D., is CEO and co-founder of XOWi, a startup company focused on delivering wearable voice technology products and solutions. Its first product, the XOWi Voice Companion, will be launched on Kickstarter in late September. Bouzid is also co-author of a new book about voice user interface design titled Don't Make Me Tap! A Common Sense Approach to Voice Usability.
