Voice Interfaces and Contactless Interactions

In the wake of the novel coronavirus, we are all much more conscious of the physical surfaces we touch in our daily lives, in our workplaces, common areas, and commercial spaces: door handles, elevator buttons, vending machines, ATMs, and so on. The list of devices we use for transactions and interactions is long. We are now likely to see a shift in our collective preferences toward contactless devices and toward eliminating or reducing touchpoints. What is the role of voice and speech technologies, and to what extent can they enable this transition?

Where You Don’t Want to Use Voice Controls

It’s tempting to look for more voice-enablement use cases, but in some scenarios, the technology may not be needed, or other technologies may be more suitable. So let’s start by ruling out use cases. Many non-voice touchless technologies already exist. Automatic doors, almost all washroom gadgetry, lighting, and escalators rely on motion or other sensor-based inputs; these technologies are already mature and widely adopted.

Physical access control can use voice authentication mechanisms (if desired), but facial-recognition systems (warts and all) are more mature and more widely adopted. Note that I am referring here to voice authentication for secured doors and gates, not for phone-based IVR systems. There are novelty use cases like using Alexa to pay for gas (when using an Alexa-enabled car), but several non-touch alternatives exist for this use case as well, such as near field communication (NFC). Finally, there is little marginal utility in voice-enabling devices like smartphones and home electronics that are typically used only by you or your family in the confines of your home. Of course, voice assistants for these devices might make for better user experiences and efficiency, but that’s a topic for a different day.

Where It Makes Sense to Use Voice Controls

There is a plethora of touchscreens in public areas; when you visit the mall, for example, you’ll find a touchscreen interface for the shop directory. Such applications are prime candidates for voice-enabling. Devices used by many people—elevators, televisions, air-conditioning units, lights, and curtains/blinds in hotels—also make good candidates. Ditto for certain kinds of gym equipment, like treadmills.

Examples abound in the office setting: projectors in meeting rooms, light switches, video conferencing equipment, coffee machines, and microwaves in employee cafeterias can all be voice-controlled. Some devices (like TVs) are already voice-enabled, and we can expect their voice functionality to be enhanced; for the rest, we can expect it to be added.

The straightforward and limited range of operations performed by these devices (there are only so many things you can do with an elevator) makes it relatively easy to add voice-control capabilities. Depending on the context of use, such as how frequently a person uses the device, some of this control will happen via voice assistants on smartphones and some via smart speakers.
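To illustrate why a limited range of operations keeps things simple, here is a minimal sketch in Python of how a transcribed request might be mapped to an elevator action. It assumes a separate speech-to-text step has already produced the transcript, and the vocabulary, action names, and the function `parse_elevator_command` are all hypothetical, invented for this illustration rather than taken from any real product.

```python
import re
from typing import Optional, Tuple

# Hypothetical vocabulary; a real elevator product would define its own.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}
SIMPLE_COMMANDS = {
    "open": "OPEN_DOORS",
    "close": "CLOSE_DOORS",
    "lobby": "GO_TO_LOBBY",
}


def parse_elevator_command(transcript: str) -> Optional[Tuple[str, Optional[int]]]:
    """Map a transcribed utterance to (action, floor), or None if not understood."""
    # Lowercase the text and split it into simple alphanumeric tokens.
    words = re.findall(r"[a-z0-9]+", transcript.lower())

    # Floor requests: the utterance mentions "floor" and contains a number,
    # written either as digits or as one of the known number words.
    if "floor" in words:
        for word in words:
            if word.isdigit():
                return ("GO_TO_FLOOR", int(word))
            if word in NUMBER_WORDS:
                return ("GO_TO_FLOOR", NUMBER_WORDS[word])

    # Everything else: match the first known keyword.
    for word in words:
        if word in SIMPLE_COMMANDS:
            return (SIMPLE_COMMANDS[word], None)

    # Unrecognized requests are rejected rather than guessed at.
    return None
```

With this sketch, a request such as "Floor 5, please" would map to a go-to-floor action, while an unrelated question would simply be ignored. The whole vocabulary fits in two small tables, which is the point: a device with few operations needs only a small, fixed set of commands.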

Beyond equipment control, speech technologies will come in handy elsewhere: speech-to-text tools will be used to generate transcripts of meetings and online lectures, for instance. We’ve just scratched the surface of possibilities here, and we can expect to see more innovative and creative uses of voice interfaces.

Altered consumer behavior and preferences mean that voice control could end up being a product differentiator. Retrofitting existing devices may not always be possible, but when the next versions of various products are released, it seems inevitable that greater voice control capabilities will be part of the package.

For greater accessibility, and to make technology more inclusive for people with speech impairments, gesture-based controls can complement voice controls. Voice controls can also be more environmentally friendly because surfaces that are not touched need less cleaning and maintenance. Compared to apps installed on smartphones, voice controls can even be privacy-preserving if they don’t store or transmit data.

We can expect certain consumer habits picked up during this public health crisis to be long-lasting. This suggests the acceleration of contactless options for interacting with the physical world. Voice interfaces are convenient, inclusive, and well-suited for a wide range of applications and will find greater adoption in the days and years ahead. 

Kashyap Kompella is the CEO of rpa2ai Research, a global AI industry analyst firm, and is the co-author of Practical Artificial Intelligence: An Enterprise Playbook.
