Picovoice Brings Real-Time Speech Recognition to Offline Devices
Picovoice, a Canadian AI startup, has developed a real-time speech recognition engine that can run offline anywhere, from a $5 Raspberry Pi Zero to within a web browser.
Despite the ubiquity of speech-enabled devices, processing speech in the cloud has raised major privacy concerns about the uploading and handling of personal voice data. Cloud speech recognition also has fundamental limitations in terms of latency, reliability, and cost-effectiveness at scale.
Picovoice says offline speech recognition has the potential to address these issues by eliminating the need for connectivity and tapping into readily available compute resources on billions of devices. Until now, however, the computational cost of speech recognition algorithms has made it impossible to achieve comparable accuracy on an edge device.
Picovoice has developed deep learning technology specifically designed to run speech recognition efficiently on commodity hardware with limited compute resources. Its bespoke voice AI technology enables Picovoice to run real-time speech-to-text on a $5 Raspberry Pi Zero or locally within a web browser. This lowers latency and cost while respecting user privacy by not requiring their speech data to leave their device.
Picovoice routinely publishes open-source benchmarks for its products, including its recent speech-to-text engine. The benchmarks indicate that the software matches the accuracy of major cloud providers such as Google and Amazon while running locally on a small embedded device.
Picovoice software is used by dozens of enterprise licensees, including Fortune 500 companies. LG, Whirlpool, and Local Motors are among those the company can name publicly. Picovoice technology has already received immense interest from enterprises that are trying to push AI to the edge.