The 2013 Star Performers

Google Sees New Opportunities with Speech

Google is creating quite a stir with its Google Glass, the company's much-anticipated augmented reality eyewear that displays information and interacts with the user via natural language voice commands. The product isn't even available to the public yet—Google just closed the selection of beta testers in February—but casinos in Atlantic City have already banned the wearable computer, arguing that Google Glass could lead to a new generation of high-tech cheating at the poker tables.

However, in the consumer market, the positive buzz around Google Glass far exceeds the gambling industry's negative sentiment, and for good reason. By simply speaking to Google Glass, users can conduct Web searches, send messages, get directions, take and share pictures and videos, start Google+ "Hangouts," get the weather, check the status of a flight, have their words translated, and more.

The company's investments in conversational search go far beyond Google Glass. This year Google expanded the capabilities of its popular Google Translate application, which speaks translations aloud and runs on mobile devices. It also increased the number of languages supported by its Google Voice Search app for Android-powered mobile devices, potentially exposing the app to about 100 million new users.

The company further advanced its speech portfolio in April by bringing its Google Now voice search app to Apple's iPhone and iPad, placing it in direct competition with Apple's Siri personal assistant app.

Google's stated goal is to develop fully conversational search, which could signal "the end of search as we know it," according to Amit Singhal, Google's senior vice president.

But for all that Google has done to elevate the speech industry, perhaps its biggest power play was bringing Ray Kurzweil, credited with the invention of text-to-speech technology, on board. Google pulled off a major coup in February when it hired Kurzweil as its director of engineering.

Kurzweil's legendary career already includes countless contributions to science and technology, through research in character and speech recognition and machine learning. At Google, he will focus on machine learning and language processing, but analysts expect his hiring will create the framework for Google's diverse initiatives around predictive search, natural language understanding, mobile assistance, and artificial intelligence, among other areas.

"My focus will be enabling computers to understand the semantic content of natural language and to use that understanding to enhance Google applications, such as search and question answering," Kurzweil told Speech Technology magazine shortly after he was hired.

iSpeech Is Homeward Bound

Apple's Siri, the digital assistant app that is now a standard feature on the iPhone, is often credited with bringing speech technology into the mainstream. However, iSpeech's contributions in this area cannot be ignored. The company's stated goal, according to CEO and founder Heath Ahrens, is to "continue to work hard to eliminate all barriers to adoption of speech technology for developers, such as cost, quality, and difficulty of integration."

The Newark, N.J., company this year achieved several milestones. Its mobile software development kit (SDK) has been used more than 2 billion times in mobile apps by more than 25,000 developers. Applications employing its text-to-speech and speech recognition technologies were downloaded more than 100 million times.

Among the more impressive uses of its technology is an app developed by 12-year-old Eric Zeiberg to help his autistic sister communicate. The app, Handy-Speak, converts handwritten words to text, then uses iSpeech's TTS to turn the printed words to audio.

In the past year, the company released numerous updates to its SDK and application programming interfaces, making speech more accessible, more powerful, and easier to use and integrate for thousands of developers. A new publishing platform, released in March, lets publishers of print and digital content convert books, articles, and other text-based content into audio. The platform launched with two partners, Evernote and Pearson, with additional publishers coming soon.

Still, its most long-awaited and potentially life-changing contribution was its release of iSpeech Home, a solution that would enable users to control their televisions, entertainment systems, lighting, heating, ventilation, irrigation, security systems, and appliances by voice through natural language commands.

The company has already met with dozens of thermostat, security system, and appliance manufacturers about iSpeech Home, and expects to start seeing the first devices trickle into the market soon.

"We have been pleasantly surprised at how many device manufacturers are looking at this," Yaron Oren, chief operating officer at iSpeech, tells Speech Technology magazine. "There are a limitless number of products out there that could be made easier to use with a speech interface. The technology has such broad applicability.

"We believe speech is the user interface of the future," Oren adds. "Siri has done an amazing job of bringing this vision to life on the iPhone, and we are helping bring it to more applications, more platforms, and new markets, such as the connected home."
