Is Jennifer the Best Voice Ever?
On Friday, Polish text-to-speech firm IVO Software unveiled Jennifer, an English voice in its IVONA TTS line. Jennifer is also available for Expressivo 1.3, a TTS application that can integrate with iPods and read emails, RSS feeds, and Web pages.
According to IVO president Lukasz Osowski, the new Jennifer voice marks an era in TTS technology "when it becomes practically impossible to tell a voice generated by our system from a natural voice."
Osowski cited the improvements made since the company’s high scores at the international 2006 and 2007 Blizzard Challenges, in which vendors had to build a synthetic voice. "This is why the version we have just released is probably the best English language text-to-speech system on the market," said Osowski. "The users of the test version say that this is the best voice they have ever heard."
The ambitions for Jennifer’s voice in Expressivo are considerable. Later versions might even help disabled persons in rehabilitation efforts. "Expressivo also stands out due to its very rich functionality," Osowski said. "It can be used as help with work (e.g. reading out emails, RSS news or planned events from the planner), study (e.g. reading out lectures or facilitating learning of a foreign language), entertainment (e.g. reading out subtitles in the movies), or as help for visually impaired people (e.g. reading out the content of Web pages)."
These goals, however, might be excessive given the current state of the technology. In my own review, Jennifer’s voice is impressive but far from indistinguishable from a human’s. Elocution is crisp, but there are still problems with concatenation. Such an application may work well for reading events from a planner or brief emails, but more performance-oriented readings, such as movie subtitles or books, might not win wide customer acceptance.