2021 Speech Industry Award Winner: Sensory Is Living on the Edge
Privacy is a top consumer concern these days, and Sensory has it covered. The speech technology provider is a pioneer in embedded software, or speech on the edge, and addresses the privacy issue in a number of ways. First, the high accuracy of its wake word technology minimizes the likelihood that devices are listening when consumers don’t want them to. Second, the Santa Clara, Calif.-based company’s technology keeps voice data private and secure; voice requests never leave the device and are never stored.
Sensory this year went all in on voice technologies that operate 100 percent on device. Because of the edge architecture of Sensory’s technology, customers don’t have to deal with Wi-Fi, apps, third-party hardware, or always-listening cloud-connected assistants.
This was the basis for one of Sensory’s most significant product releases to date. In January, the company launched VoiceHub, a free and flexible online portal for creating and designing voice user interfaces. Coupled with the release were companion mobile apps for both iOS and Android devices.
With VoiceHub, Sensory integrated support for TrulyNatural, its large-vocabulary speech recognition solution that runs entirely on device, with customizable natural language understanding (NLU) capabilities. Voice user interface designers can leverage this new capability to produce natural language-enabled products capable of supporting dozens of languages and dialects.
Sensory put out a beta release of VoiceHub in October, and voice user interface designers around the world quickly jumped on it to create hundreds of voice AI models for dozens of automotive, wearable, smart speaker, and smart home products. One of them was STMicroelectronics; that vendor’s marketing director for microcontrollers, Daniel Colonna, credited Sensory VoiceHub with enabling developers to design working prototypes of natural language-capable products “in a matter of minutes, not days.”
Another company leveraging Sensory’s edge-based voice technologies was Knowles, which incorporated VoiceHub and TrulyHandsfree, Sensory’s wake word engine, into its AISonic White Goods Standard Solution, a development kit for integrating voice into smart appliances.
On that same front, Farberware microwaves became the first of many home devices to feature a custom, private voice interface from Sensory.
The Farberware FM11VABK voice-controlled microwave oven features Sensory’s TrulyHandsfree and TrulyNatural technologies to provide all the benefits of a custom voice assistant without the privacy compromises of cloud-based, general-purpose assistant platforms.
The Farberware microwave can understand numerous voice commands, such as “Open door,” “Cook popcorn,” “Set timer to one minute 36 seconds,” “Defrost,” or “Reheat for two minutes.” The large vocabulary recognizer, with a custom statistical language model and NLU, can also support more complex actions like “Cook four baked potatoes.” The Sensory NLU engine looks for intents within a limited domain.
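To make the idea of finding intents within a limited domain concrete, here is a minimal rule-based sketch in Python. It is not Sensory’s NLU engine and does not use any Sensory API; the function names, the intent labels, and the small number-word table are all assumptions made for illustration, and a production system would rely on a statistical language model rather than hand-written rules.

```python
import re

# Hypothetical number-word table; only covers values needed for illustration.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
}

def to_number(token):
    """Return the numeric value of a digit string or number word, else None."""
    if token.isdigit():
        return int(token)
    return NUMBER_WORDS.get(token)

def parse_duration(tokens):
    """Sum up '<number> minute(s)' and '<number> second(s)' phrases in seconds."""
    seconds = 0
    pending = 0  # number words seen since the last unit word
    for token in tokens:
        value = to_number(token)
        if value is not None:
            pending += value            # "thirty" + "six" -> 36
        elif token in ("minute", "minutes"):
            seconds += pending * 60
            pending = 0
        elif token in ("second", "seconds"):
            seconds += pending
            pending = 0
        else:
            pending = 0                 # a number not followed by a unit is not a duration
    return seconds

def parse_command(utterance):
    """Map a transcribed utterance to an intent dict (assumed intent names)."""
    tokens = re.findall(r"[a-z0-9]+", utterance.lower())
    if not tokens:
        return {"intent": "unknown"}
    if tokens[:2] == ["open", "door"]:
        return {"intent": "open_door"}
    if "timer" in tokens:
        return {"intent": "set_timer", "seconds": parse_duration(tokens)}
    if tokens[0] == "defrost":
        return {"intent": "defrost"}
    if tokens[0] == "reheat":
        return {"intent": "reheat", "seconds": parse_duration(tokens)}
    if tokens[0] == "cook":
        rest = tokens[1:]
        quantity = 1
        if rest and to_number(rest[0]) is not None:
            quantity = to_number(rest[0])
            rest = rest[1:]
        return {"intent": "cook", "food": " ".join(rest), "quantity": quantity}
    return {"intent": "unknown"}
```

Under these assumptions, `parse_command("Cook four baked potatoes")` yields a `cook` intent with a quantity of 4, while anything outside the microwave’s domain falls through to `unknown`. That narrow scope is what keeps such a recognizer small enough to run on the device.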
Sensory’s TrulyHandsfree edge-based wake word and phrase recognition engine also became the de facto model for “Hey Siri” in dozens of countries.
With the release of the Hey Siri model, Sensory now supports most popular virtual assistants in the United States. It also provides wake word models for Tencent, Baidu, Naver, Rakuten, and others. The new Siri models support multiple languages across North America, Europe, and Asia.
Sensory’s TrulyNatural embedded speech recognition software was also part of the latest public beta release of Zoom Rooms for Android, iOS, macOS, and Windows.
Powered by Sensory, voice commands are now supported on all Zoom Rooms platforms. Participants in Zoom Rooms can now use their voices to wake the room system by saying “Hello Zoom” and then employ voice commands to start the meeting, leave the meeting, check in, and more. All voice commands are processed locally, never in the cloud.
Zoom and Sensory have worked together in the past to leverage TrulyNatural to create domain-specific recognizers that can handle common voice requests for controlling meetings. They have also developed more complex voice tasks, like alphanumeric recognition for entering meeting IDs and passcodes.
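As a rough illustration of what alphanumeric entry involves after recognition, the Python sketch below turns a transcript of spoken digits and letters into a meeting ID or passcode string. It is an assumption-laden example, not Zoom’s or Sensory’s code: the function name `spoken_to_code`, the digit-word table, and the choice to treat “oh” as zero are all hypothetical.

```python
# Hypothetical mapping from spoken digit words to characters.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

def spoken_to_code(transcript):
    """Convert a transcript such as 'eight two five' into the string '825'.

    Digit words become digits, single letters or digits are kept as-is,
    and anything else is rejected so a misrecognized word cannot slip into
    the code silently.
    """
    chars = []
    for raw in transcript.lower().split():
        token = raw.strip(",.")
        if not token:
            continue                      # token was only punctuation
        if token in DIGIT_WORDS:
            chars.append(DIGIT_WORDS[token])
        elif len(token) == 1 and token.isalnum():
            chars.append(token)           # a spelled-out letter or a bare digit
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    return "".join(chars)
```

Rejecting unknown tokens outright, rather than skipping them, is a deliberate choice in this sketch: a meeting ID with a dropped character is worse than asking the speaker to repeat it.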
“Sensory’s technology checked all the boxes for us: accurate, fast, and private,” said Cynthia Lee, lead product manager at Zoom, in a statement.