Speech Technology Magazine

Mobility Ubiquity

Can speech make it on mobile devices?
By Nancy Jamison - Posted Mar 8, 2010

Voice search, command and control, speech-to-text, translation, navigation, voice-activated dialing, multimodal input/output—it’s becoming rare to find a mobile phone today that doesn’t have at least one of these features. In fact, most come with several. 

By the end of 2009, signs were evident that speech technologies were ubiquitous on handheld devices, helping further the concept of “superphones” or “smartphones.” This does not even include handheld GPS devices, e-readers, or translation devices. 

The stream of announcements spanning a breadth of capabilities on these devices has continued into 2010. At the Consumer Electronics Show in Las Vegas in early January, for example, Google introduced the Nexus One (what the company is calling a superphone), a device that lets a user speak instead of typing in any application or field that requires text. The user can also control features on the phone by voice.

Applications

Speech-enabling mobile devices is one thing, but the recent availability of downloadable speech-enabled applications is also driving the expansive use of speech. Downloadable applications are a multibillion-dollar market, fueled by companies like Apple, Google, and Microsoft. These companies run app stores with thousands of applications (more than 125,000 in the case of Apple’s iPhone), many of which incorporate speech technologies, and the stores make those applications inexpensive (or free) and easy to obtain.

Here are a few examples that demonstrate the breadth of speech applications now available on mobile devices:

  • Toshiba provides a trilingual translation capability (English, Chinese, and Japanese) that can be embedded into a phone rather than delivered as a service.
  • Navigon provides an on-board navigation application for Android phones. 
  • Nuance Dragon NaturallySpeaking lets iPhone users dictate text that can then be sent as text or email messages; it also works with the iPhone’s clipboard so text can be pasted into other applications, such as Twitter or Facebook.
  • Dragon Search allows voice search on the iPhone, bringing up results in categories such as YouTube and regular text Web sites.
  • DriveSafe.ly allows users to listen to and respond to text messages and email. 
  • Kirusa provides a Call-n-Tweet service for users to speak Twitter updates.
  • Vlingo 4.0 lets BlackBerry users control all aspects of their phones, from texts, tweets, and voicemail to filling in fields on Web sites. 
  • ShoutOUT, Dial2Do, Jott (Nuance), and Google Voice allow users to dictate texts and messages. 

Peripherals

Let’s not forget peripherals, such as headsets. In addition to vastly improving the audio quality of headsets with noise-cancelling capabilities, manufacturers have added speech recognition to enable hands-free control of phones via the headsets. A big winner that sports embedded speech recognition is BlueAnt’s Q1 Bluetooth headset, which allows a user to control the headset and many of the phone’s functions with simple voice commands.

Still, some factors are dampening the industry’s growth. For example, just because an application is preloaded onto a phone doesn’t mean the user will try it, and many people download other applications as a novelty and never use them. In addition, with so many incompatible wireless operating systems, developers are forced to write to a specific platform, operating system, or device, and that limits the number of users available for a given application.

That hasn’t stopped developers from adding both speech recognition and text-to-speech to mobile applications. As with speech-to-text applications, in some cases necessity is driving growth. For example, a person might need a mobility application that lets him use his voice to send text messages while driving a car. The same is true for voice-activated dialing.

The key to growth will be finding equally compelling uses for new or existing speech applications, uses that make users see speech as indispensable.


Nancy Jamison is principal analyst at Jamison Consulting. She can be reached at nsj@jamison-consulting.com.
