Video: How Hardware Advances Have Transformed NLP
Learn more about speech-to-text and natural-language processing applications at the next SpeechTEK conference.
Read the complete transcript of this clip:
Paco Nathan: There's been a big generational change in NLP. I teach courses in NLP and AI, and I find that in industry, what's changed over the past two, three years isn't really well-known. There are still a lot of concepts that are floating around that people take as givens that were outdated 10 years ago.
So, I definitely want to put in a plug. For one thing, a lot of people talk about NLP and you might hear mention of "bag of words." I was recently at a research conference with Ph.D. students and they were mentioning this, and it was just amazing. This was an outmoded technique a long time ago.
And so, "bag of words" and other techniques like "stemming" and "n-grams" and, in general, keyword search, like you'll see in Solr and Elastic and others, they were shortcuts because we didn't have enough computational power. And now, we're seeing a lot better cloud resources and a lot better ways to work with NLP. So there's been a kind of generational lag, and even among the vendors, we're still seeing this. Even some of the Google APIs have some fairly old stuff that's hard-coded.
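For readers unfamiliar with the terms, here is a minimal Python sketch of what "bag of words" and "n-grams" compute. It is illustrative only and is not code from the talk; the function names `tokenize`, `bag_of_words`, and `ngrams` are hypothetical, and the whitespace tokenizer is a deliberate simplification.

```python
from collections import Counter

PUNCTUATION = ".,!?;:\"'"

def tokenize(text):
    # Simplified tokenizer (an assumption): split on whitespace,
    # strip surrounding punctuation, and lowercase each token.
    tokens = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [token for token in tokens if token]

def bag_of_words(text):
    # A bag of words keeps only token counts and discards word order.
    return Counter(tokenize(text))

def ngrams(tokens, n):
    # Slide a window of n tokens over the sequence to keep local word order.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

a = "The dog bit the man."
b = "The man bit the dog."

# Both sentences contain the same words, so their bags are identical
# even though the meanings differ.
same_bag = bag_of_words(a) == bag_of_words(b)

# Bigrams retain some ordering, so the two sentences no longer match.
same_bigrams = set(ngrams(tokenize(a), 2)) == set(ngrams(tokenize(b), 2))
```

In this sketch `same_bag` is true while `same_bigrams` is false, which shows the kind of information these shortcut representations give up.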
Another notion that I find with NLP is that it really requires a lot of investment in big data frameworks, like you have to have a Spark cluster running or something like that. And again, that's no longer the case. I'll show that in a moment. But really, the one that stands out to me is a kind of either/or fallacy that either you can fully automate a thing or if you can't automate it, okay, we're not going to do that yet. What we're seeing is a lot of middle ground where problems can be partially automated and then have a human in the loop. And the two of those together augment each other.
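One common way to picture the partial-automation middle ground is confidence-based routing: the system handles the cases it is sure about and hands the rest to a person. The sketch below is an assumption made for illustration, not a method described in the talk; the function name `route_predictions`, the 0.9 threshold, and the sample data are all hypothetical.

```python
def route_predictions(predictions, threshold=0.9):
    """Split model outputs into automated results and a human review queue.

    `predictions` is a list of (item, label, confidence) tuples.
    The threshold value is an illustrative assumption.
    """
    automated, needs_review = [], []
    for item, label, confidence in predictions:
        if confidence >= threshold:
            # Confident enough: accept the model's label without review.
            automated.append((item, label))
        else:
            # Uncertain: send to a person, keeping the model's suggestion.
            needs_review.append((item, label))
    return automated, needs_review

# Made-up sample predictions for illustration only.
sample = [
    ("ticket-1", "billing", 0.97),
    ("ticket-2", "shipping", 0.55),
    ("ticket-3", "billing", 0.90),
]
automated, needs_review = route_predictions(sample)
```

Labels confirmed or corrected by the reviewer can then be fed back as training data, which is one way the automated and human parts augment each other.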
Now, the thing that's really changed on this, as I mentioned, I was doing neural networks back in the '80s and we didn't have enough hardware. And again, when we were doing NLP work on early social media, we ran into the problem of not really having enough hardware but starting to get there.
So what's changed is hardware is vastly different. If I take what was one of the largest Hadoop instances in the cloud from 2008, I can take that same algorithm and the same data and run it faster in Python on my laptop today, and it's only been 10 years.
And so what we've seen is this real big change because of multi-core, because of having a lot of CPUs all in one chip. You don't need a huge cluster anymore. And also, because of large memory spaces and faster interconnects, a lot of workloads that would have required a big cluster before can now all just be put on one machine.
And there are projects like Apache Arrow that are really leveraging this, open source projects that I highly recommend. But hardware is changing. We're seeing not just CPUs. I think that for CPUs, Moore's law, and the advances we'd banked on for the last decades, that's all but dead. We're seeing a lot of advances now with GPUs. Anybody own NVIDIA stock? Anybody? Anybody seen NVIDIA stock lately? It's been this hockey stick.
GPUs are very popular. You can't buy enough of them these days. But even Google is moving beyond that. If you look at Jeff Dean’s talk from NIPS in 2017, Google is moving away from GPUs. They've got TPUs that they've built. They have their second generation of them in use now, and they're working on their third generation. Google is essentially becoming a hardware vendor. And so they're building their own processor and they're seeing substantial performance gains by doing custom work, creating hardware specifically for AI.
Paco Nathan of O'Reilly Media's R & D Group discusses the merits of the emerging Text Rank initiative in leveraging data in NLP in this clip from SpeechTEK 2018.
Paco Nathan of O'Reilly Media's R & D Group discusses the role of big models in the commoditization of AI in this clip from SpeechTEK 2018.
Paco Nathan of O'Reilly Media's R & D Group discusses the role of big compute in the commodification of AI in this clip from SpeechTEK 2018.
Paco Nathan of O'Reilly Media's R & D Group discusses the role of big data in the commoditization of AI in this clip from SpeechTEK 2018.
Omilia's Quinn Agen discusses the value of context and memory in delivering human-like interactions in self-service contact centers in this clip from SpeechTEK 2018.
Omilia's Nikos Kolivas discusses the value of context and memory in delivering human-like interactions in self-service contact centers in this clip from SpeechTEK 2018.
Omilia's Quinn Agen discusses the way an end-to-end conversational approach in machine-to-human interaction helps deliver satisfying contact center experiences in this clip from SpeechTEK 2018.
USAA's Brett Knight discusses how USAA has used machine learning and AI in virtual agent training at the enterprise level in this clip from SpeechTEK 2018.
Omilia's Quinn Agen discusses the advantages of taking a conversational approach in automated contact center customer care in this clip from SpeechTEK 2018.
USAA's Brett Knight describes all the players enterprises should assemble when planning conversational development for virtual agents in this clip from SpeechTEK 2018.
USAA's Brett Knight explains how to train virtual agents to anticipate customer needs more effectively in this clip from SpeechTEK 2018.
Aspect's Andreas Volmer discusses the value of visually representing conversational flow logic in chatbot design, and explores the advantages of a language-based approach over a statistically based approach in this clip from SpeechTEK 2018.
Amazon Connect General Manager Pasquale DeMaio demos Amazon Connect AI-driven Lex chatbot in this clip from his keynote at SpeechTEK 2018.
Aspect's Andreas Volmer discusses key components of a rules-based approach to chatbot design based on a robust language model in Part 1 of this two-part series from SpeechTEK 2018.
AWS Head of Product, Language Tech Vikram Anzabhagan outlines 4 essential strategies for leveraging technology to make contact center interactions more personal, conversational, agile, and engaging in this clip from his SpeechTEK 2018 keynote.