Picovoice Brings Real-Time Speech Recognition to Offline Devices

Picovoice, a Canadian AI startup, has developed a real-time speech recognition engine that can run offline anywhere, from a $5 Raspberry Pi Zero to within a web browser.

Despite the ubiquity of speech-enabled devices, processing speech in the cloud has raised major privacy concerns about the uploading and handling of personal voice data. Cloud speech recognition also has fundamental limitations in latency, reliability, and cost-effectiveness at scale.

Picovoice says offline speech recognition has the potential to address these issues by eliminating the need for connectivity and tapping into the compute resources readily available on billions of devices. Until now, however, the computational cost of speech recognition algorithms has made it impossible to achieve comparable accuracy on an edge device.

Picovoice has developed deep learning technology specifically designed to run speech recognition efficiently on commodity hardware with limited compute resources. Its bespoke voice AI technology enables Picovoice to run real-time speech-to-text on a $5 Raspberry Pi Zero or locally within a web browser. This lowers latency and cost while respecting user privacy, because users' speech data never has to leave their devices.

Picovoice routinely publishes open-source benchmarks for its products, including its recent speech-to-text engine. The benchmarks indicate that the software matches the accuracy of major cloud providers such as Google and Amazon while running locally on a small embedded device.

Picovoice software is used by dozens of enterprise licensees, including Fortune 500 companies. LG, Whirlpool, and Local Motors are among those the company can name publicly. Picovoice says its technology has already received immense interest from enterprises that are trying to push AI to the edge.

Related Articles

Picovoice Launches Speech-to-Text

Picovoice enters the voice transcription market with cloud-level accuracy on the edge.

Nuance Expands Open Architecture to Deliver More Flexible, Enterprise-Grade Conversational AI

New Intelligent Engagement Services will expand its open, cloud-agnostic framework, and the cloud-native approach will bolster third-party integration capabilities and offer organizations more control.

McDonald's to Acquire Apprente, a Provider of Voice Technology

McDonald's buys Apprente--a provider of voice-based, conversational technology--in a move that will integrate new teams with advanced technology skill sets into the McDonald's business.

SPi Global Launches Dubbing Solution

SPi Global announced the release of SmartDub, a solution that enhances Text-to-Speech (TTS) technology to generate synthetic voices, creating a more natural and immersive experience for users.

Uniphore Seeing Momentum: Raises $51 Million in Series C Funding Led by March Capital Partners

New round is Uniphore's largest to date and one of the biggest in conversational AI in recent months.

Inference Solutions Earns Frost & Sullivan’s Customer Value Leadership Award

Inference Solutions' Studio Platform is highlighted by Frost & Sullivan for leading development of the fast-growing Intelligent Virtual Agent market.

Chatmeter Adds Google Q&A Feature to Social Suite

Chatmeter's Google Q&A helps brands gain the ability to act on user-generated responses at scale while improving customer experience, local rankings and voice search.