YouTube Expands Video Transcription Option for All

Back in November, YouTube added a feature that generates video captions and transcripts for videos uploaded to its servers. At the time, the feature had only been enabled for a small number of channels that usually feature talks and interviews, such as a handful of universities as well as PBS and National Geographic.

Yesterday, YouTube began rolling out the robo-transcription option across its entire Web site, much to the delight of the speech technology community.

YouTube, which is owned by Google, uses Google’s automatic speech recognition (ASR) and voice search technologies, together with its own captioning system, to generate transcripts of the audio in videos. People who post videos to YouTube can download the auto-generated transcript, make any necessary corrections, and upload the corrected transcript along with the video. People with content already posted to YouTube can select a “request processing” option to add captioning to their video tracks.

Google was quick to point out that the transcription option is limited for now: it works only for videos with audio in English, and it will have problems where the audio track is not very clear. However, many in the industry view it as a shot in the arm for speech-to-text as a whole.

“It provides market validation for us,” says Tom Wilde, CEO of RAMP, a provider of content optimization, speech-to-text, and auto transcription for closed captioning, video search, search engine optimization, targeted advertising, and other applications.

“Historically, Web video had a separate life from TV, which has had closed captioning for some time. Web video had been left out of the mix,” Wilde says.

RAMP, which currently provides these services to large media publishers including NBC, FOX News, Meredith, Dow Jones, and Thomson Reuters, calls the move by YouTube part of a “focused evolution” for speech recognition. “It’s a game of inches,” Wilde says. “It’s an evolving technology. It’s been around for 20 years, and it’s getting better all the time.”

Google said in a blog post yesterday that it plans to expand the feature to include more languages. It added that the ultimate goals for the transcription option include making YouTube more accessible to the hearing-impaired and to those learning English as a second language.

Beyond that, Wilde says video transcription has lots of uses online. It makes the video content searchable, helps in search engine optimization, and could even be used to generate targeted advertising, something Google has so far been very adept at on its own.
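To illustrate how a transcript can make video searchable, here is a minimal sketch in Python. It is not YouTube's or RAMP's implementation; the caption data, the `build_index` and `search` function names, and the time-stamped format are all assumptions made for the example. The idea is simply that once each spoken phrase is stored with its start time, a text query can point to the moments in the video where the words occur.

```python
import re
from collections import defaultdict

# Hypothetical caption data: (start time in seconds, caption text).
# A real transcript would come from a captions file, not a hard-coded list.
CAPTIONS = [
    (0.0, "Welcome to the lecture on speech recognition"),
    (12.5, "Speech recognition turns audio into text"),
    (27.0, "Captions make video searchable"),
]


def build_index(captions):
    """Map each lowercase word to the sorted start times where it is spoken."""
    index = defaultdict(set)
    for start, text in captions:
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(start)
    return {word: sorted(times) for word, times in index.items()}


def search(index, query):
    """Return start times of captions that contain every word in the query."""
    words = re.findall(r"[a-z']+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


INDEX = build_index(CAPTIONS)
print(search(INDEX, "speech recognition"))  # [0.0, 12.5]
```

A search for "speech recognition" returns the start times of the two captions that contain both words, which a player could use to jump straight to those points in the video. The same word-to-timestamp mapping is the kind of data that search engine optimization or ad targeting could draw on.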

“YouTube’s announcement is a clear indication of the growing importance of a scalable strategy for making video content accessible, and evidence this technology has become a mainstream requirement for publishers seeking to fully leverage the exploding demand for Web video,” Wilde said in a statement.
