The State of Artificial Intelligence

Speech technology has advanced seemingly at light speed over the past several years. But 2020 proved to be especially pivotal as artificial intelligence (AI) further infiltrated the industry with enhanced capabilities. And these capabilities have certainly been put to the test since the start of the coronavirus crisis, which has forced millions of people worldwide to work, shop, and play from home and increasingly rely on innovative apps, platforms, and online solutions powered by AI-driven speech technology.

John Kane, head of signal processing and machine learning at Cogito, says AI and speech have finally reached a synergistic new high point.

“The industry is experiencing major wins, like being able to accurately recognize text from speech and creating natural-sounding computer voices,” he says. “New products can now analyze nonverbal characteristics of speech. So much meaning in speech comes from how something is said, not just what is said. And voice technology’s ability to read and analyze these characteristics is a major step forward for the industry.”

Bill Rogers, CEO of Orbita, agrees.

“Machine learning algorithms today can predict and make recommendations, which have automated conversations to the human level. Advanced applications of this include the incorporation of biomarkers to understand human emotions and draw conclusions from intonation patterns and user identification based on voice recognition,” he explains.

This latter point is particularly important, others concur.

“The bar has been raised for speech technology in an age when empathy and compassion are crucial to successful interactions,” insists Michael Johnston, director of research and innovation at Interactions. “Modern AI systems not only provide routine automation, but they’ve begun to lean in on conversations between customers and agents and add value in numerous new ways, such as providing personalized suggestions and surfacing relevant information and content.”

The latest statistics underscore the importance of speech and AI:

• Allied Market Research expects the global virtual assistant market to grow 37.7 percent over the next seven years, reaching $44.3 billion by 2027.

• The global speech and voice recognition market is expected to reach $43 billion by 2030, based on insightSLICE data.

• 27 percent of the online population worldwide uses voice search, per Google; eMarketer reports that almost 40 percent of all U.S. internet users and a third of the total U.S. population use voice; and 55 percent of smartphone users now use voice search, according to Perficient.

• More than nine in 10 businesses have an ongoing investment in AI, based on NewVantage research.

Year in Review

It’s impossible to evaluate 2020 without giving priority to COVID-19 and how it forced AI to rise to the challenge.

“Unsurprisingly, the coronavirus was a major catalyst for AI adoption and innovation in speech technology. Widespread social isolation and the need for remote communication and connection pushed conversational AI to center stage,” Rogers says.

Case in point: Traditional contact centers increasingly turned to conversational AI to ensure business continuity.

“This ability to provide consistent, effective service to customers was perhaps more important than ever in an era marked by uncertainty and confusion,” Johnston says.

Healthcare organizations and providers also impressively employed AI and speech technology to clear coronavirus hurdles.

“AI-powered chatbots and virtual assistants were at the forefront of the fight against COVID, helping to screen and triage patients, conduct surveys, share information, and enable telemedicine at a time when people couldn’t leave their homes,” says Kirill Petrov, CEO and founder of Just AI.

Care coordination teams using voice analytics were also able to contact at-risk patients more frequently, which generated real-time data and drove engagement.

“As the pandemic continued and mental health concerns increased, AI technology has been used to increase patient engagement and monitor changes in mental health,” says David Hunt, founder and chief marketing and development officer of Cosán Group.

Overall, 2020 made business leaders realize that consumer habits will continue to evolve rapidly.

“More companies learned that AI can help meet those changes to continue to provide quality customer experiences, thus helping companies foster better relationships with customers,” says Matt Muldoon, president of North America for ReadSpeaker. “Companies have begun developing more emotional, higher-quality voices by leveraging AI, and brands used more interactive voice ads that allow them to talk directly to consumers.”

Other highlights from 2020 were significant as well:

• Automatic speech recognition took another leap forward through wider application of recurrent neural network transducers, which has led to increased accuracy and a reduced computational footprint.

• Facebook AI’s wav2vec 2.0 garnered plenty of attention. “Wav2vec 2.0 is an innovation for audio and speech, without the need for automatic speech recognition, and provides a powerful raw material for downstream audio and speech classification tasks,” Kane says.

• OpenAI’s GPT-3 was introduced to enable prediction and generation of natural language.

• Voice cloning improved, as evidenced by the Localize function from Resemble AI. “Their synthetic voice clones can be trained to speak a half dozen languages. The new function allows for translating digital voices to speak in other languages, which will make localization much easier,” Petrov says.

A Look Ahead

Reading the tea leaves, industry experts are bullish on the rapid expansion of related technologies in the months ahead.

“In 2021, we will see increasing use of conversational AI as an alternative to and augmentation of human intelligence,” Johnston forecasts. “We will also witness an increasing expansion of conversational AI from voice and text channels to rich media and multimodal interactions where intelligent virtual assistants will be able to present information to the customer through a combination of visual media with voice and text.”

Changing consumer habits due to the pandemic will likely accelerate how AI and speech technology will be used, especially in the first half of this year, Muldoon predicts.

“We’ll start to see the expansion of voice-enabled AI’s capabilities,” he states. “As more models continue being built, there will be opportunities to create more robust interactions, and in a few years AI will become a secondary way to complete tasks vs. a supportive role.”

Kane expects significant jumps in accuracy for classification problems like sound event detection and speech emotion recognition due to unsupervised representational learning.

“This might be the year that interactive, conversational data gets the research and development attention it deserves and we see more academic research on interactive speech synthesis to help power applications like voice assistants,” Kane says.

Others are hopeful that AI advancements will result in better voice-to-text capabilities on smartphones.

“Right now, we see lots of errors when we ask our phones to create a message from voice. But as innovation continues and the algorithms grow stronger, we will benefit from increased accuracy in this application, which will also influence smart speakers to expand their understanding of human language and ability to converse more widely with users,” Rogers suggests.

John Langton, director of applied data science at Wolters Kluwer, believes we can expect enhanced synergy between AI and other technologies like facial recognition and voice recognition this year.

“For improved integration, we can now use affective computing as an additional signal to speech inputs to infer user intent when answering consumer questions,” he says.

Count on AI to continue to improve patient care delivery in healthcare as well, many agree.

“We’ll observe more personalized care for patients and more efficient and effective operations for providers. And the rapid adoption of telehealth in combination with widespread consumer use of voice-enabled smart home technology will likely fuel the growth of communicative healthcare AI bots,” Hunt says.

Erik J. Martin is a Chicago area-based freelance writer and public relations expert whose articles have been featured in AARP The Magazine, Reader’s Digest, The Costco Connection, and other publications. He often writes on topics related to real estate, business, technology, healthcare, insurance, and entertainment. He also publishes several blogs, including martinspiration.com and cineversegroup.com.
