Speech Technology Magazine

 

Innovations: Speech Technology with Impact - New Company, New Breakthrough

By Nancy Jamison - Posted Jan 1, 2006

New Company, New Breakthrough
At the Consumer Electronics Show in Las Vegas this week, Sensory, Inc. (http://www.sensoryinc.com/) debuted its new subsidiary, 3Dmsg, which is developing technology and applications for the cell phone and wireless handset markets. Ho hum, you say, but it's not so. 3Dmsg - yes, that is 3D messaging - is about to do for personalization and avatars what advances in intonation, accents, and accuracy did for text-to-speech.

Sensory Inc. and 3Dmsg
3Dmsg (http://www.3dmsg.com/) grew out of Sensory's work with animation technology it acquired when it bought Fluent Speech Technologies in 2000. Sensory is the world leader in embedded speech technology, with a robust customer base in the consumer electronics market, so it made sense to find a fit between this animation technology and the fast-growing enhanced services market for handheld devices. Hence the birth of 3Dmsg.

The result of their efforts is what they claim to be the most accurate lip synchronization technology on the market. 3Dmsg claims its synchronization will be accurate enough for lip readers to understand. The technology uses a hybrid of neural network and hidden Markov model speech technology to analyze spoken data and break it into visemes, or visual phonemes. To create a three-dimensional animation of someone speaking, animators create these visemes, which map to the movements of the mouth. They then use the visemes, along with facial expressions and sound, to make the animation or avatar appear as natural as possible.
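As a rough illustration of how recognized speech might be reduced to visemes, here is a minimal Python sketch. It is not 3Dmsg's implementation: the phoneme labels, viseme names, mapping table, and the phonemes_to_visemes function are all hypothetical, and the sketch assumes that a speech recognizer has already produced phonemes with start and end times.

```python
# Hypothetical many-to-one table: several phonemes share one mouth shape.
# The labels and groupings are illustrative assumptions, not 3Dmsg's data.
PHONEME_TO_VISEME = {
    "p": "lips_closed", "b": "lips_closed", "m": "lips_closed",
    "f": "lip_to_teeth", "v": "lip_to_teeth",
    "th": "tongue_between_teeth",
    "aa": "open_jaw", "ae": "open_jaw",
    "uw": "rounded_lips", "ow": "rounded_lips",
    "sil": "rest",
}


def phonemes_to_visemes(timed_phonemes):
    """Map (phoneme, start_s, end_s) tuples to merged (viseme, start_s, end_s) spans."""
    spans = []
    for phoneme, start, end in timed_phonemes:
        # Unknown phonemes fall back to the neutral mouth shape.
        viseme = PHONEME_TO_VISEME.get(phoneme, "rest")
        if spans and spans[-1][0] == viseme:
            # Same mouth shape as the previous span: extend it instead of repeating it.
            spans[-1] = (viseme, spans[-1][1], end)
        else:
            spans.append((viseme, start, end))
    return spans


timed = [("m", 0.0, 0.08), ("aa", 0.08, 0.25), ("b", 0.25, 0.32), ("p", 0.32, 0.4)]
viseme_track = phonemes_to_visemes(timed)
```

In this example the trailing "b" and "p" share a closed-lips shape, so they merge into one span, which leaves three visemes for four phonemes.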

With current technology, including the avatars used in today's applications, movement and sound can be out of sync, creating a jittery effect. Worse yet, the "synchronization" can be nothing more than "flapping lips" that open and close with changes in amplitude. If all you have is a bobbing head along with a voice, and not tight synchronization, comprehension goes down. You can liken this to the difficulty of participating in a video conference call when the sound arrives slightly ahead of the visual feed. 3Dmsg changes this: their technology takes into account the movement of the teeth, lips, and tongue that speakers use to communicate, and it strings visemes together and morphs between them, creating a better rendering of a talking person or creature - an avatar.
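To make the idea of morphing between visemes concrete, the sketch below linearly interpolates between mouth poses at keyframe times. This is an assumption-laden illustration, not 3Dmsg's method: the pose parameters (jaw_open, lip_round, tongue_up), the VISEME_POSES table, and the morph and pose_at functions are hypothetical, and a production system would likely use smoother blending than straight linear interpolation.

```python
from bisect import bisect_right

# Hypothetical mouth poses; each parameter runs from 0.0 to 1.0.
VISEME_POSES = {
    "rest": {"jaw_open": 0.0, "lip_round": 0.0, "tongue_up": 0.0},
    "open_jaw": {"jaw_open": 1.0, "lip_round": 0.0, "tongue_up": 0.0},
    "rounded_lips": {"jaw_open": 0.25, "lip_round": 1.0, "tongue_up": 0.0},
}


def morph(pose_a, pose_b, t):
    """Blend two poses linearly; t=0.0 gives pose_a and t=1.0 gives pose_b."""
    return {key: pose_a[key] + (pose_b[key] - pose_a[key]) * t for key in pose_a}


def pose_at(time_s, keyframes):
    """Return the mouth pose at time_s.

    keyframes is a list of (time_s, viseme_name) pairs with strictly
    increasing times (an assumption of this sketch).
    """
    times = [t for t, _ in keyframes]
    i = bisect_right(times, time_s)
    if i == 0:
        # Before the first keyframe: hold the first pose.
        return dict(VISEME_POSES[keyframes[0][1]])
    if i == len(keyframes):
        # At or after the last keyframe: hold the last pose.
        return dict(VISEME_POSES[keyframes[-1][1]])
    (t0, v0), (t1, v1) = keyframes[i - 1], keyframes[i]
    return morph(VISEME_POSES[v0], VISEME_POSES[v1], (time_s - t0) / (t1 - t0))


keyframes = [(0.0, "rest"), (0.25, "open_jaw"), (0.5, "rounded_lips")]
halfway_open = pose_at(0.125, keyframes)
closing_to_round = pose_at(0.375, keyframes)
```

Unlike an amplitude-driven "flapping lips" approach, which only opens and closes the jaw, each in-between frame here blends every pose parameter, so the lips can round while the jaw is still closing.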

One benefit of 3Dmsg's avatars is that they can be created not only from live speech but also from input text or a combination of text and speech. In addition, the 3Dmsg technology allows MMS, SMS, instant messaging, and other text or voice messages to be delivered by a talking avatar as ultra-low-bandwidth, 3D video-like messages. 3Dmsg's avatars achieve the quality and accuracy of lip-synchronized video with the bandwidth of text or voice.

The creation of high-quality lip synchronization opens up a wealth of possibilities for users. Initially, 3Dmsg will create products for the handset market. These will include voice messaging applications in which an animated avatar speaks, as well as text and SMS messaging. Beyond that, there are other areas in which this technology will range from useful to just plain fun. For example, users will be able to send customized digital greeting cards using their voice and the avatar of their choice to represent them. The technology can be used in language-learning applications too. Here, seeing the placement of the teeth and tongue is particularly helpful both for comprehension and for training the user to speak. Further development of these language-learning programs might include speech recognition, enabling users to record their practice and have the speech recognizer "grade" their efforts.

Development of the technology is underway at 3Dmsg's facilities in Portland, Oregon. The first products from 3Dmsg are expected to be announced in the first half of 2006. It will be interesting to see how accurate the applications will be and how they will be accepted by users. Stay tuned. To see a demonstration of 3Dmsg's talking avatar technology, go to http://www.3dmsg.com/.

Have a cool or noteworthy announcement? Please email me at nsj@jamison.com.
