October 14, 2008
By Leonard Klie, Editor, Speech Technology and CRM magazines
Speech Technology News

W3C Standard Simplifies Creation of Speech-Enabled Applications

The World Wide Web Consortium (W3C) today published a standard that will simplify the development of Web applications that speak and listen to users.

The Pronunciation Lexicon Specification (PLS) 1.0, developed by the W3C’s Voice Browser and Multimodal Interface working groups, defines a standard format for matching word sounds and spellings in the pronunciation dictionaries used by speech recognition and speech synthesis systems. Until now, each vendor had its own way to specify the pronunciation dictionary that accompanied its applications.

"Developing the pronunciation dictionaries [for speech recognition and text-to-speech] is one of the hardest parts of designing these applications. It’s a lot of work," says Deborah Dahl, chair of the W3C’s Multimodal Interface Working Group. "To be able to agree on a format will make for a lot less work. Where you do not want to recreate dictionaries over and over again is where this will really make a difference."

PLS can reduce the cost of developing these applications by allowing people to share and reuse pronunciation dictionaries. In addition, it will allow companies that use speech technology to incorporate the same dictionaries into systems from different vendors, and it will allow different system vendors to use the same dictionary. PLS can also make it easier to localize applications by separating pronunciation concerns from other parts of the application.

This will be especially helpful for languages like English and Chinese, where the word sounds and spellings don’t always match, Dahl says.

The PLS, she adds, does not create the dictionary itself; it defines the format that others can use to create one. "The one good thing about the standard is that it opens up market opportunities. I can imagine a whole group of people working to create one single dictionary for everyone."
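
To make the idea of a shared dictionary format concrete, here is a minimal sketch of what a PLS document can look like and how an application might read it. The element names (lexicon, lexeme, grapheme, phoneme) and the namespace come from the PLS 1.0 specification; the sample words, their IPA transcriptions, and the load_lexicon helper are illustrative assumptions, not part of the standard or of any vendor's product.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C PLS 1.0 specification.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

# Illustrative lexicon: the words and transcriptions are sample data only.
SAMPLE_LEXICON = """
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>colonel</grapheme>
    <phoneme>ˈkɝnəl</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>read</grapheme>
    <phoneme>riːd</phoneme>
    <phoneme>rɛd</phoneme>
  </lexeme>
</lexicon>
"""


def load_lexicon(xml_text):
    """Hypothetical helper: map each spelling to its list of pronunciations."""
    ns = {"pls": PLS_NS}
    root = ET.fromstring(xml_text.strip())
    entries = {}
    for lexeme in root.findall("pls:lexeme", ns):
        # A lexeme may carry several pronunciations for the same spelling.
        phonemes = [p.text for p in lexeme.findall("pls:phoneme", ns)]
        for grapheme in lexeme.findall("pls:grapheme", ns):
            entries.setdefault(grapheme.text, []).extend(phonemes)
    return entries


lexicon = load_lexicon(SAMPLE_LEXICON)
for spelling, pronunciations in lexicon.items():
    print(spelling, "->", ", ".join(pronunciations))
```

Because the file is plain XML with a published namespace, the same dictionary could in principle be loaded by a recognizer from one vendor and a synthesizer from another, which is the kind of reuse described above.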
