The 2015 Speech Industry Luminaries: Andrew Ng


Andrew Ng

chief scientist, Baidu Research; and director, Stanford University’s Artificial Intelligence Lab

Creating smart robots has fascinated Andrew Ng since he was a boy. Today, thanks to his contributions to neural networks, he is working toward making that dream a reality.

But it hasn't been easy. In fact, for years Ng gave up on creating smart robots, bemoaning the difficulty of building such complicated programs.

Then, a key neurological discovery changed his stance. Neuroscientists learned that when the signal from an animal's eye was rerouted to its auditory cortex, the section of the brain that processes sound, the auditory cortex could see and process images. This was a major breakthrough for Ng, who reasoned that if one area of the brain could process multiple senses, then perhaps the same would apply to machines. This idea has become the cornerstone of deep learning and its ability to aid computer perception.

"If you look at how the human brain does perception, rather than needing tons of algorithms for vision and tons of algorithms for audio, it may be that most of how the brain does it may be a single learning algorithm or a single program.… If this is true, then maybe we don't need to figure out all these different complicated programs. Maybe we need to just figure out one program.... That would let us progress much faster on perception," Ng said in the video presentation "The Future of Robotics and Artificial Intelligence."

This approach would later lay the groundwork for his contributions to the business and technology world. In 2011, he founded the Google Brain project at Google, which developed large-scale artificial neural networks. The project's self-learning capabilities were used in the Android operating system's speech recognition system.

In May 2014, he became the chief scientist at Baidu, a Chinese search engine company. Under his direction, in December 2014, Baidu's California-based research arm developed "Deep Speech." Deep Speech recognized speech more accurately than systems from the company's peers in noisy environments, such as restaurants, as well as in far-field and reverberant scenarios. Baidu's tests revealed that the technology's word error rate was 10 percent better than that of solutions from Google, Apple, and Bing.

With these developments, there's no doubt that, for Ng, computer perception is becoming a reality. 
