Adobe and Speechmatics Deliver Speech Recognition On-Device for Premiere
Speechmatics has deepened its partnership with Adobe to develop an on-device speech-to-text model in Premiere that delivers near-cloud accuracy while keeping all audio local to the device.
Speechmatics has been Adobe's partner since 2021, when Adobe first included speech-to-text (STT) in Premiere.
With this new technology, studios, agencies, and production companies handling content before it goes public can work from anywhere, whether on a film set, between client meetings, or on a flight, with no dependency on a network connection and no interruption to the work. Text-based editing of video and audio, caption creation, and speaker labeling through speaker diarization all run locally and privately.
The new Speechmatics on-device model has been trained on millions of hours of speech to handle accented speech, non-native speakers, and noisy environments such as field reporting or film sets.
The new on-device model in Premiere processes one hour of audio in about 55 seconds. It runs on Windows and Mac, using the latest AI acceleration techniques across a broad range of hardware, including the latest M5 Macs, NVIDIA RTX, AMD GPUs, and older hardware such as Intel Macs.
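To put the quoted speed in perspective, the short sketch below converts "one hour of audio in about 55 seconds" into a speedup over real time and estimates how long a shorter clip might take. The function names are illustrative, and the estimate assumes processing time scales linearly with audio length, which the announcement does not state; actual times will vary by hardware.

```python
def speedup_over_real_time(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many times faster than real time the audio was processed."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def estimated_processing_seconds(audio_seconds: float, speedup: float) -> float:
    """Estimate processing time, ASSUMING it scales linearly with audio length."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# Figures quoted in the announcement: 1 hour of audio in about 55 seconds.
speedup = speedup_over_real_time(audio_seconds=3600, processing_seconds=55)
print(f"Speedup over real time: about {speedup:.1f}x")

# Hypothetical 20-minute clip, under the linear-scaling assumption.
clip_estimate = estimated_processing_seconds(audio_seconds=20 * 60, speedup=speedup)
print(f"Estimated time for a 20-minute clip: about {clip_estimate:.1f} s")
```

By this rough measure, the quoted figure works out to roughly 65 times faster than real time.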
"Adobe's global creator community speaks hundreds of languages and dialects. Since 2021, our partnership has focused on making sure speech technology works for everyone, whether you're editing in Scottish English, Mexican Spanish, or Cantonese. Today, millions of users can benefit from accurate transcription that works anywhere, on-device for privacy, and in the cloud for scale, without compromising performance. As Adobe builds toward LLM-powered creative workflows, having a speech foundation that truly understands diverse voices becomes even more critical. We're proud to be part of that future," said Katy Wigdahl, CEO of Speechmatics, in a statement.