March 1, 2008
By Judith Markowitz Principal - J. Markowitz, Consultants
Forward Thinking

Embedded Speech in China

For more than 20 years, Western speech-processing companies have tried to gain a foothold in what has long promised to be a huge market: mainland China and Taiwan. Those efforts are starting to bear fruit, primarily because of economic and global factors in those nations.

Mainland China is the fourth largest economy in the world, and RNCOS, an Asian market-research company, expects that country’s retail industry to see compounded annual growth of 20 percent through 2011. Taiwan, already a dominant producer of low-cost electronic components, is steadily moving its production upscale, and its economy is following suit. Both countries are experiencing strong market internationalization and steady growth in consumer buying power.

As these two nations advance, interest in embedded speech applications is growing in both, particularly on mobile devices. So far, most of the deployments in this area have been network-based, but Chinese and Taiwanese developers and integrators like Cyberworkshop are looking to do more with embedded speech as well. Most are working on adding speech synthesis and speech recognition to high-end mobile phones, PDAs, and toys.

The application that garnered the earliest interest in embedded technology was foreign language learning, something with great appeal among consumers and businesses seeking international trade. For more than 10 years, Chinese consumers have been carrying specialized mobile devices called talking dictionaries. As the name suggests, talking dictionaries use text-to-speech (TTS) technologies to model the correct translation and pronunciation of a foreign word that the language learner has highlighted. Most do not yet include speech recognition, but that is likely to change.

A vibrant tourism industry, the 2008 Summer Olympics in Beijing, the 2010 World Expo in Shanghai, and the rapidly expanding global market for goods from mainland China and Taiwan are accelerating the popularity of mobile language-learning devices. Talking dictionaries, once found only on specialized devices, are now also available on cell phones and PDAs. Those devices now provide instruction in a range of foreign languages and in Mandarin (the official language of mainland China).

There have also been devices that boast embedded machine translation. Most perform phrase-level conversion, but this technology is advancing. In 2006, for example, NEC announced two-way (Chinese-Japanese) continuous-speech translation technology for travel-related conversations. The technology is embedded in a PDA.

Another emerging market for speech recognition and TTS is the automobile. One of the most active players in this market is Nuance Communications, which has established partnerships with Volkswagen China and Chery (a mainland automobile manufacturer) to develop voice-activated command-and-control systems in Mandarin and Cantonese. Cars with speech technology will soon be sold in Taiwan and on the mainland. Dutch GPS manufacturer TomTom is also embedding Nuance’s Mandarin and Cantonese speech recognition engines into its personal navigation systems for cars. Those systems will become available in the near future.

Market Outlook
Taiwan and mainland China represent huge potential markets for embedded speech technologies. Although those markets are just beginning to emerge, market forces in both are quickly translating the potential into reality. Those market forces include the robust economic growth that is giving Chinese consumers greater buying power and latitude, the mobile revolution that is especially strong in China, a growing global business focus, and the availability of high-quality speech technology in Chinese languages.

Today, Chinese companies are partnering with Western companies, notably Sensory and Nuance, in the integration of embedded speech technologies. Similarly, many Chinese researchers are working in foreign universities or in the laboratories of IBM, Microsoft, Toshiba, and other foreign companies. There is also a small but growing body of research on core speech recognition and TTS embedded technologies coming out of Chinese laboratories, such as Beijing’s Tsinghua University.

Editor’s Note: This is the first of two articles that Judith Markowitz will write on speech technologies in China and Taiwan. The next column, which will run in the May issue, will discuss the many network-based speech deployments in mobile devices.

Judith Markowitz, Ph.D., is the technology editor of Speech Technology magazine and an independent analyst in the speech technology and voice biometric fields. She can be reached at judith@jmarkowitz.com.
