NVIDIA Launches Jarvis for Building Conversational AI


NVIDIA has released NVIDIA Jarvis, an application framework that allows companies to use video and speech data to build custom conversational artificial intelligence services.

"Conversational AI is central to the future of many industries as applications gain the ability to understand and communicate with nuance and contextual awareness," said Jensen Huang, founder and CEO of NVIDIA, in a statement. "NVIDIA Jarvis can help the healthcare, financial services, education, and retail industries automate their overloaded customer support with speed and accuracy."

Applications built with Jarvis can take advantage of innovations in the new NVIDIA A100 Tensor Core GPU for AI computing and the latest optimizations in NVIDIA TensorRT for inference. Jarvis enables organizations to run an entire multimodal application using NVIDIA's vision and speech models in less than the 300-millisecond threshold for real-time interactions.

Jarvis provides a complete, GPU-accelerated software stack and tools for developers to create, deploy, and run end-to-end, real-time conversational AI applications that can understand terminology unique to each company and its customers. It includes deep learning models, such as NVIDIA's Megatron BERT for natural language understanding. Companies can fine-tune these models on their data using NVIDIA NeMo, optimize for inference using TensorRT, and deploy in the cloud and at the edge using Helm charts available on NGC, NVIDIA's catalog of GPU-optimized software.

Among the first companies to offer Jarvis-based conversational AI products and services to their customers are Voca, which provides AI agents for call center support; Kensho, which provides automatic speech transcription for finance and business; and Square, which offers a virtual assistant for appointment scheduling.

Voca's AI virtual agents understand the full intent of spoken conversation and help agents identify tones and vocal cues to distinguish between what a customer says and what a customer means.

"Low latency is critical in call centers, and with NVIDIA GPUs our agents are able to listen, understand, and respond in under a second with the highest levels of accuracy," said Alan Bekker, co-founder and chief technology officer of Voca, in a statement. "Now our virtual agents are able to successfully handle 70 percent to 80 percent of all calls, ranging from general customer service requests to payment transactions and technical support."

Kensho has used NVIDIA's conversational AI to develop Scribe, a speech recognition solution for finance and business.

"We're working closely with NVIDIA on ways to push end-to-end automatic speech recognition with deep learning even further," said Georg Kucsko, head of AI research at Kensho, in a statement. "By training new models with NVIDIA, we're able to offer higher transcription accuracy for financial jargon compared to traditional approaches that do not use AI, offering our customers timely information in minutes versus days."

Square created an AI virtual assistant for companies to confirm, cancel, or change appointments with customers.

"Square Assistant can understand and provide help for 75 percent of customer questions, along with ensuring that 10 percent more people are showing up to their appointments," said Gabor Angeli, head of conversational AI at Square, in a statement. "With GPUs, we're able to train models 10 times faster versus CPUs to deliver more accurate, human-like interactions, ultimately helping our customers grow their businesses."
