How to Design Voice-Enabled Experiences Kids Will Love 


Some advances are such a natural fit that they quickly move from novelty to expectation. This is especially true of voice technology. As a natural, intuitive interface, voice is permeating every aspect of children's lives, much like touch screens before it. Whether it's video chat, interactive storybooks, gaming, or learning and literacy experiences, kids expect to engage verbally with technology.

But too often, children inherit hand-me-down technology rather than something built for their unique experiences and circumstances, or in the case of speech technology, their unique speech patterns, pitches, and behaviors. Both the physiology and environmental context of children's voices differ greatly from adults'. Children have smaller vocal folds and underdeveloped larynges, meaning they can't always enunciate words clearly and correctly. They over-punctuate words, use unpredictable or erratic tones, and mix up words. They also often have to communicate in loud environments, like classrooms, that bring together a huge range of accents, dialects, and academic and socio-economic backgrounds. No wonder speech engines built for adults misunderstand kids so often.

For speech technology in education, the stakes couldn't be higher. Imagine the implications of children being told that the words they're reading out loud are wrong when, in fact, the words are correct and the technology misheard their accents.

So how can we tap the power of speech technology at a time when students urgently need all the support they can get?

In their book Conversations with Things: UX Design for Chat and Voice, authors Diana Deibel and Rebecca Evanhoe make the case for developing speech technology with a human-focused approach rather than a tech-centered one. A voice system should accommodate users, not force them to change their behavior to be understood. To create a true user-focused experience, designers and developers must apply that philosophy from the ground up when building for kids. They must reframe their design decisions with the specific needs of children in mind.

Building on what Deibel and Evanhoe write about, below are the four key areas I use as a framework when designing a voice-first experience for kids.

Whether your activity is built with assessment or entertainment in mind, the core loop of Prompt, Listen, Respond remains relevant. Emphasis will vary case by case, but, along with onboarding, these steps are part of every voice experience.
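For developers, the Prompt, Listen, Respond loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a real speech integration: the function name run_turn and its three callable parameters are assumptions made for this example, and the stand-in functions below replace any actual microphone or speech engine.

```python
from typing import Callable, Optional


def run_turn(
    prompt: Callable[[], None],
    listen: Callable[[], Optional[str]],
    respond: Callable[[Optional[str]], None],
) -> Optional[str]:
    """Run one Prompt, Listen, Respond turn and return what was heard.

    All three callables are hypothetical hooks supplied by the caller.
    """
    prompt()          # the product speaks and sets expectations
    heard = listen()  # None stands for "nothing was understood"
    respond(heard)    # always acknowledge the turn, even when heard is None
    return heard


# Stand-in functions so the sketch runs without audio hardware.
log = []
result = run_turn(
    prompt=lambda: log.append("prompt: Which animal says moo?"),
    listen=lambda: "cow",
    respond=lambda heard: log.append(f"respond: I heard {heard}!"),
)
```

Keeping the three steps as separate hooks makes it easier to shift emphasis between them from one activity to the next.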

1. Onboarding

The first thing to decide is who goes first: the product/experience or the user (the child). When dealing with young children, consider letting the product begin. In most cases, this will allow the experience to provide context and establish rules.

If the experience is a trivia quiz, for example, establish rules first, then ask a question: "Quiz time! There are going to be five questions. I'll ask a question and then give you two answers. Then you tell me the right one. Let's start." If it's their first time playing, then you've established what's about to happen and what's expected of them.
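As a rough sketch, the quiz introduction quoted above could be generated from the quiz settings so the spoken rules always match the activity. The helper names build_quiz_intro and say_number are hypothetical, and the sketch assumes at least two questions and two answers so the plural wording stays correct.

```python
# Spoken forms for small numbers; larger values fall back to digits.
NUMBER_WORDS = {2: "two", 3: "three", 4: "four", 5: "five"}


def say_number(n: int) -> str:
    """Return a spoken-style word for small numbers (hypothetical helper)."""
    return NUMBER_WORDS.get(n, str(n))


def build_quiz_intro(num_questions: int, num_answers: int = 2) -> str:
    """Build an intro that states the rules before the first question."""
    if num_questions < 2 or num_answers < 2:
        raise ValueError("this sketch assumes at least two questions and answers")
    return (
        f"Quiz time! There are going to be {say_number(num_questions)} questions. "
        f"I'll ask a question and then give you {say_number(num_answers)} answers. "
        "Then you tell me the right one. Let's start."
    )


intro = build_quiz_intro(5)
```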

2. Prompting

Expectation-setting carries through to all aspects of the conversational experience. Contrary to popular opinion, enabling children to do or say anything they want doesn't create a delightful open-ended experience. They're much more likely to become frustrated and feel a paralysis of choice. A defined experience rather than a blank canvas gives children agency and helps them feel included. When children decide which path to take in an interactive story, they become protagonists and equal partners.

Think about cognitive load and how children can hold fewer options in their heads. Consider limiting choices to options A, B or C. Give children too many choices and they'll just pick the last one they remember rather than select something meaningful. When a screen is available, use visuals and other multimedia cues to define options.
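One way to enforce the "A, B or C" limit in code is to cap the list of options before it is ever spoken. The sketch below is illustrative only: build_choice_prompt and its max_choices parameter are assumed names, and keeping the first three options is just one possible policy for deciding what to drop.

```python
from typing import Sequence


def build_choice_prompt(
    question: str, options: Sequence[str], max_choices: int = 3
) -> str:
    """Build a spoken prompt that offers at most max_choices options."""
    choices = list(options)[:max_choices]  # drop anything beyond the cap
    if len(choices) < 2:
        raise ValueError("offer at least two options")
    if len(choices) == 2:
        spoken = f"{choices[0]} or {choices[1]}"
    else:
        spoken = ", ".join(choices[:-1]) + f", or {choices[-1]}"
    return f"{question} {spoken}?"


prompt_text = build_choice_prompt(
    "Where should we go next:",
    ["the cave", "the river", "the forest", "the castle"],
)
# prompt_text == "Where should we go next: the cave, the river, or the forest?"
```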

3. Listening

During conversations, eye contact and head nods demonstrate that you're paying attention. Consider the equivalent for your experience. When children are taking their turns, what is your product doing to demonstrate it's anticipating a response from the child?

The goal is twofold: to show that their input is anticipated and to indicate when they are being heard. See this as an opportunity to inject extra personality into your product. With whom or what is the child having the interaction? Do you have an on-screen character you can use? Perhaps a character can lean in to listen better. Having your user interface react by glowing or pulsing a little in response to the child speaking can create a connection. Even a basic reaction is enough to maintain engagement.
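The two listening signals described above, anticipating input and confirming it is being heard, can be modeled as a small state-to-cue mapping. This is a sketch under stated assumptions: the cue names ("idle", "lean_in", "pulse") and the function listening_cue are invented for illustration, and a real product would connect them to its own animation system.

```python
def listening_cue(is_childs_turn: bool, child_is_speaking: bool) -> str:
    """Pick a visual cue for the current moment in the conversation.

    Cue names are hypothetical placeholders for on-screen animations.
    """
    if not is_childs_turn:
        return "idle"      # the product is talking; no listening cue needed
    if child_is_speaking:
        return "pulse"     # show the child they are being heard
    return "lean_in"       # show that a response is anticipated


cues = [
    listening_cue(False, False),
    listening_cue(True, False),
    listening_cue(True, True),
]
```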

4. Responding

This is where the delight can truly happen. This is where children get to see the results of their actions with your product/experience. They've spoken to their devices, answered questions, or asked their favorite characters to do a dance, and they've received responses. Even if you only give minimal feedback, say, in an educational assessment scenario, it's critical to acknowledge that the child has been heard successfully.

On a side note, kids love hearing their own voices being played back to them. Something most adults dislike or dread is an (almost) endless joy for young kids.

Interactive voice experiences are still in their infancy, and we're constantly refining what we know works for kids. What's undeniable is that there is no one-size-fits-all approach. A first-grader completing an oral reading assessment on an app has different needs than a fifth-grader commanding a digital character in an instructional game. Each requires its own cues, approaches, and accommodations for children's unique needs, languages, and unpredictable behavior to create a successful experience. For that first-grader, though, meaningful feedback is the most powerful tool to develop the skills and confidence needed to become a life-long learner.

The promise of speech technology looms large for our youngest of users. We just need to play at their level.
