Speechmatics Launches Language Identification
Speechmatics, a provider of autonomous speech recognition technology, today launched Language Identification (Language ID) as a component of its speech-to-text engine to automatically identify the predominant language spoken in any media file.
Language ID removes the manual step of selecting a language pack when the language of a file is not explicitly specified. It not only helps users identify unknown languages but also adds metadata about the language of the spoken audio.
Speechmatics' speech-to-text engine is trained on millions of hours of unlabelled, more representative voice data covering hundreds of thousands of individual voices. The company applied the same technique to a diverse set of audio data to identify the predominant spoken language.
"Up until now, identifying languages without human intervention has been costly and time-consuming for users of speech-to-text. However, with our new Language ID, this will be a thing of the past and allow customers to swiftly identify and transcribe media files with less hassle and more efficiency. We can't wait for our customers to use this Language ID and see it deliver accurate and valuable results," said Speechmatics CEO Katy Wigdahl in a statement.
This latest update works with pre-recorded media files, supports 12 languages (English, German, Spanish, French, Hindi, Italian, Japanese, Korean, Mandarin Chinese, Dutch, Portuguese, and Russian), and adds a confidence score that indicates how certain the engine is about the predominant language.
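
As an illustration of how a predominant-language result with a confidence score might be consumed, the following minimal Python sketch picks a language pack only when the prediction is confident enough. It does not use the Speechmatics API. The function name `choose_language_pack`, the language codes, the 0-to-1 confidence range, and the 0.8 threshold are all assumptions made for this example.

```python
from typing import Optional

# Codes for the 12 languages named in the announcement.
# The specific codes are an assumption for this sketch,
# not values confirmed by Speechmatics.
SUPPORTED_LANGUAGES = {
    "en", "de", "es", "fr", "hi", "it",
    "ja", "ko", "cmn", "nl", "pt", "ru",
}


def choose_language_pack(
    predicted: str,
    confidence: float,
    threshold: float = 0.8,          # hypothetical cut-off
    fallback: Optional[str] = None,  # e.g. route to manual review
) -> Optional[str]:
    """Return the predicted language if it is supported and confident enough.

    Assumes the confidence score lies between 0 and 1.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if predicted in SUPPORTED_LANGUAGES and confidence >= threshold:
        return predicted
    return fallback
```

With these assumptions, `choose_language_pack("de", 0.93)` selects the German pack, while a low-confidence or unsupported prediction returns the fallback so the file can be handled manually.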