Speech Technology Magazine

 

The 2015 Speech Industry Luminaries: Andrew Ng

By David Myron - Posted Aug 7, 2015

THE NEURAL NETWORKER 

Andrew Ng

chief scientist, Baidu Research; and director, Stanford University’s Artificial Intelligence Lab

Creating smart robots has been a fascination of Andrew Ng's since he was a boy. Today, thanks to his contributions in neural networking, he is working toward making that dream a reality.

But it hasn't been easy. In fact, for years Ng gave up on creating smart robots, daunted by the complexity of building the many specialized programs they would require.

Then, a key neurological discovery changed his stance. Neuroscientists learned that after rerouting a signal from an animal's eye to its auditory cortex, the section of the brain that processes sound, the auditory cortex could see and process images. This was a major breakthrough for Ng, because he posited that if one area of the brain could process multiple senses, then perhaps the same would apply to machines. It is this idea that has become the cornerstone of deep learning and its ability to aid computer perception.

"If you look at how the human brain does perception, rather than needing tons of algorithms for vision and tons of algorithms for audio, it may be that most of how the brain does it may be a single learning algorithm or a single program.… If this is true, then maybe we don't need to figure out all these different complicated programs. Maybe we need to just figure out one program.... That would let us progress much faster on perception," Ng said in the video presentation "The Future of Robotics and Artificial Intelligence."

This approach would later lay the groundwork for his contributions to the business and technology world. In 2011, he founded the Google Brain project, which developed large-scale artificial neural networks. The project's self-learning capabilities were used in the Android operating system's speech recognition system.

In May 2014, he became the chief scientist at Baidu, a Chinese search engine company. Under his direction, in December 2014, Baidu's California-based research arm developed "Deep Speech." Deep Speech's recognition accuracy outperformed competing systems in noisy environments, such as restaurants, as well as in far-field and reverberant scenarios. Baidu's tests revealed that the technology's word error rate was 10 percent better than that of solutions from Google, Apple, and Bing.
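Word error rate, the metric behind Baidu's comparison, is the word-level edit distance between a recognizer's output and a reference transcript, divided by the number of reference words. A minimal sketch of the standard calculation (not Baidu's evaluation code) looks like this:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words (dynamic programming, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, 1):
            curr[j] = min(prev[j] + 1,             # deletion
                          curr[j - 1] + 1,         # insertion
                          prev[j - 1] + (r != h))  # substitution (0 if match)
        prev = curr
    return prev[len(hyp)] / len(ref)

# One dropped word out of six reference words gives a WER of 1/6.
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

A system with a word error rate of 10 percent makes, on average, one insertion, deletion, or substitution for every ten words spoken.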

With these developments, there's no doubt that, for Ng, computer perception is becoming a reality. 
