NVIDIA Launches Jarvis Interactive Conversational AI Framework
NVIDIA today released the NVIDIA Jarvis framework, providing developers with pre-trained deep learning models and software tools to create interactive conversational artificial intelligence services that can be adapted for specific industries or domains.
NVIDIA Jarvis models offer automatic speech recognition, as well as language understanding, real-time translations, and text-to-speech capabilities to create expressive conversational AI agents. They can be deployed in the cloud, in the data center, or at the edge, instantly scaling to millions of users.
"Conversational AI is in many ways the ultimate AI," said Jensen Huang, founder and CEO of NVIDIA, in a statement. "Deep learning breakthroughs in speech recognition, language understanding, and speech synthesis have enabled engaging cloud services. NVIDIA Jarvis brings this state-of-the-art conversational AI out of the cloud for customers to host AI services anywhere."/p>
NVIDIA Jarvis has been built using models trained on more than 1 billion pages of text and 60,000 hours of speech data spanning different languages, accents, environments, and lingos. Developers can use NVIDIA TAO, a framework for training, adapting, and optimizing these models for any task, industry, or system.
Developers can select a Jarvis pre-trained model from NVIDIA's NGC catalog, fine-tune it using their own data with the Transfer Learning Toolkit, optimize it for real-time speech services, and then deploy it with just a few lines of code.
Among early users is T-Mobile.
"With NVIDIA Jarvis services, fine-tuned using T-Mobile data, we're building products to help us resolve customer issues in real time," said Matthew Davis, vice president of product and technology at T-Mobile, in a statement. "After evaluating several automatic speech recognition solutions, T-Mobile has found Jarvis to deliver a quality model at extremely low latency, enabling experiences our customers love."
NVIDIA is also partnering with Mozilla Common Voice, an open-source collection of more than 9,000 total hours of contributed voice data in 60 languages. NVIDIA is using Jarvis to develop pre-trained models with the dataset, and will then offer them back to the community for free.
"We launched Common Voice to teach machines how real people speak in their unique languages, accents, and speech patterns," said Mark Surman, executive director of Mozilla, in a statement. "NVIDIA and Mozilla have a common vision of democratizing voice technology and ensuring that it reflects the rich diversity of people and voices that make up the internet."
Jarvis eases creation of interactive conversational artificial intelligence agents, while Merlin speeds up data loading and training time to improve recommendations for online businesses.
NVIDIA's Jarvis application framework enables the creation of custom, language-based AI services.