Speech Technology Magazine

 

Making the Internet Talk

Text-to-speech (TTS) provider Cepstral is giving a voice to the Web avatars used on the 3D instant messenger site IMVU Virtual World.
By Lauren Shopp - Posted Feb 5, 2008

Text-to-speech (TTS) provider Cepstral is giving a voice to the Web avatars used on the 3D instant messenger site IMVU Virtual World. The hybrid virtual world/social networking site announced today that it will use TTS to offer its users access to Cepstral's 30 different synthesized voices through a service called imVoices.

The emergence of TTS programs for avatars shows that the technology has found a foothold on the Internet, moving beyond interactive voice response (IVR) systems and document reading. In fact, the motto of Cepstral's VoiceForge program is "We will make the Internet talk."

IMVU Virtual World will use VoiceForge, Cepstral's Software as a Service (SaaS) offering, which is delivered over the Internet and does not require users to download a TTS engine or voice database. Patrick Dexter, director of business development at Cepstral, says IMVU Virtual World's service, which resembles that of Second Life, required a different approach to TTS.

"IMVU is a very personal environment. You personalize your avatar to reflect how you look or as a fantasy," Dexter explains. "VoiceForge gave [users] access to not just the typical, corporate-sounding male or female voice, but over 30 different voices with a variety of accents or characters. They can find voices to really match the personality of their avatar."

Dexter says Cepstral will eventually diversify the selection further by letting users change the pitch or tone of their avatar's voice, as well as add sound effects.

IMVU Virtual World's current business model for the service differs from that of similar sites; even Dexter admits it is "kind of odd." Rather than purchasing access to a synthesized voice for a flat fee, users buy special "furniture" that is placed in their avatar's environment; when an avatar sits on this furniture, it activates the TTS chat capabilities. The TTS service itself is sold much like pre-paid mobile phone minutes, in units Dexter calls "chat bites." As users sit on the furniture and chat, they use up their "chat bites" and must purchase more to keep talking.
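For readers who think in code, the pre-paid arrangement described above can be sketched roughly as follows. This is an illustration only, not IMVU's or Cepstral's implementation: the class and method names are hypothetical, and because the article does not say what a single "chat bite" covers, the sketch simply assumes one bite per spoken message.

```python
from dataclasses import dataclass


class OutOfChatBites(Exception):
    """Raised when a seated user tries to speak with an empty balance."""


@dataclass
class ChatBiteAccount:
    # Hypothetical prepaid balance; the real unit of a "chat bite" is not stated.
    balance: int = 0
    # True while the avatar sits on the special TTS-enabled furniture.
    seated_on_voice_furniture: bool = False

    def purchase(self, bites: int) -> None:
        """Top up the balance, like buying more prepaid phone minutes."""
        if bites <= 0:
            raise ValueError("purchase must be a positive number of bites")
        self.balance += bites

    def speak(self, text: str) -> str:
        """Return how the message is delivered: "tts" or plain "text"."""
        if not self.seated_on_voice_furniture:
            return "text"  # no furniture, so ordinary text chat
        if self.balance < 1:
            raise OutOfChatBites("buy more chat bites to keep using TTS")
        self.balance -= 1  # assumption: one bite per message
        return "tts"
```

Under these assumptions, a user who buys two bites and sits on the furniture gets two spoken messages; a third attempt raises OutOfChatBites until more bites are purchased.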

A YouTube video of the imVoices program is online, but it was recorded with a demo version of VoiceForge, and the product has since been improved. Dexter says that Cepstral's technology can now convert text into synthesized speech in less than a second; the video shows a significantly longer processing lag. For TTS to further penetrate the Web, fast processing and a wider choice of voices will remain deciding factors. Dexter says he remains optimistic that TTS programs for avatars, social networking sites, email, and blogs will grow in popularity in the coming years.

"I think as a whole, the Internet has been widely ignored by the TTS industry, and so that's really what VoiceForge is doing," he says.
