The 2022 State of Artificial Intelligence

Speech technology continues to advance by leaps and bounds from year to year. And 2021 was no exception, with innovation coming at a faster pace than anticipated, thanks in large part to advancements in artificial intelligence (AI). This progress has come in handy, especially as the COVID-19 crisis has lingered into 2022 and as consumers and businesses increasingly rely on speech embedded into apps, platforms, and devices powered by AI.

“Speech-based solutions that combine natural language processing (NLP) for language understanding and emotional cognition have been adopted to address mental health issues that arose around the pandemic,” says Narmeen Makhani, executive director of AI labs at ETS. “The increasing need for customer service solutions, combined with labor shortages and supply chain issues, have also accelerated enterprise adoption and reliance on AI-based conversational agents. These agents have helped enhance customer experiences and increase efficiency and customization for customer support and speech recognition tools.”

In fact, conversational AI agents have become mainstream for most companies. And advancements in AI-based speech technologies, such as on-device automatic speech recognition (ASR), textless models, and large language models, have opened new possible use cases in banking, health care, and education, Makhani says.

Matt Muldoon, president, North America, of ReadSpeaker, agrees. “As verticals such as hospitality, automotive, and customer service have adopted speech technology more broadly, other verticals, like smart home controls, experiential marketing, and gaming, are recognizing the benefits of leveraging speech technology powered by AI in their offerings to users and customers,” he notes.

Callan Schebella, executive vice president of product management at Five9, says improvement in AI in the past few years “has been explosive. You can do things with speech recognition now that were only possible in a lab a few years ago.”

“AI is accelerating the capabilities of [voice assistants like Alexa and Siri], and as a result, over time Alexa and other voice assistants will only get smarter,” Puneet Mehta, founder and CEO of Netomi, believes. “Meanwhile, for companies, conversational AI is becoming widely adopted to reduce business costs, boost customer service, and save employees time by automating mundane tasks, empowering them to take on more creative tasks.”

Evan Macmillan, CEO of Gridspace, agrees. “Business customers are becoming increasingly savvy about what is possible with conversational AI and seeking out best-of-breed solutions from full-stack AI-first product companies,” he says. “Businesses are also moving voice workloads to the cloud at record speeds, which opens the door to conversational AI and new process automations.”

Year in Review

Last year was a pivotal one for AI and speech technology, as evidenced by several prominent developments. For one, Meta (Facebook), Google, and NVIDIA all jumped on the metaverse bandwagon, spurring innovations in speech AI, computer vision, natural language understanding (NLU), and virtual reality (VR) to create humanlike avatars that can recognize speech and communicate with users.

McKinsey found that 56 percent of businesses today use AI in at least one business function, up from 50 percent in 2020. Nearly two-thirds plan to increase investments in AI over the next three years.

A Gartner survey found 36.3 percent of customer service leaders plan to deploy AI by 2023.

Market growth, driven by increasing demand for voice-activated systems, voice-enabled virtual assistants, and voice-enabled devices, continues to skyrocket.

“As smart appliances become more common, and companies leverage AI to boost the accuracy of speech technology, more consumers will be apt to use it in their everyday lives. The research also suggests that the market will see significant adoption thanks to the decline in cost of voice and speech devices, a rise in software development, and continuous demand for virtual assistant smart speakers with voice capabilities,” Muldoon says.

Yet adoption of AI-powered speech tech has been difficult for some, mainly due to a lack of infrastructure, technical limitations, and a lack of education.

“Many people do not fully grasp the full benefits of AI and its use in the workplace. Like most emerging technologies...AI can seem complex to many, and it is still very much in its infancy,” Mehta says.

Makhani believes speech technology accuracy continues to be a top challenge, especially with non-native speakers and children and in situations with background noise.

“Furthermore, many mainstream, AI-based speech devices continue to listen in on users. The challenge of always being responsive while also protecting user privacy continues to need to be addressed to ensure ethical standards are met in the widespread use of AI,” Makhani says. “And as AI technology gets more humanlike, fake content is going to proliferate. Since it is possible to create fake content at scale now in some areas, this can have dire results. On the positive side, this technology can be used to identify fake content as well.”

Data privacy concerns continued to dominate headlines in 2021, too.

“Cybersecurity and data privacy remain the top risks for companies when it comes to leveraging AI,” Muldoon warns. “Companies need to be explicit in communicating how they safeguard customers’ privacy and data to help consumers feel more comfortable using the solutions, which can help increase adoption rates and help companies expand AI into other business functions to streamline operations.”

Businesses are also still adapting to changes caused by COVID, an adjustment that speech technology can either help or hinder.

“With remote work and virtual events considered the new normal, companies are still grappling with how to better connect and engage with consumers and clients. Speech-to-text use cases—from interactive notetaking to captioning town halls—allow organizations to maximize the potential of their audio and video files by making information searchable, accessible, and actionable. But ongoing education will be required to maximize this potential,” says Ariel Utnik, chief revenue officer and general manager of Verbit.

Another continuing conundrum? Businesses tend to get lost when it comes to AI.

“It may be tempting to adopt shiny new technologies like AI for the sake of new technologies, but coming up with a cohesive strategy is an issue for many companies,” Mehta says. “Start small, measure, and see if you can start to see an impact in under six months.”

A Look Ahead

Most experts predict exciting things ahead for AI and speech.

“Within the next five years, every major brand will have an AI-powered voice channel for customer service, in the same way that every brand has a website,” Schebella predicts. “Just like in the 1990s and early 2000s, when there was a rush by organizations to establish an online presence, businesses will be expected to have an intelligent, conversational interface to meet customers’ expectations for engagement.”

Mehta sees more companies turning to AI to create efficiencies in the customer journey. “People today demand truly effortless support and interactions, and while that’s been delivered primarily through chat and messaging, as we look forward, there will be tremendous growth for hands-free, voice interactions across support, sales, and marketing,” he says.

Large language-like models, as well as multilingual models, will become more commonplace, easily accessible, and integrated with regular technology, Makhani maintains.

“They will move from primarily research use in the hands of big tech to industry use. Startups and decentralization advocates will actively accelerate the widespread availability and use of large models. ASR and NLU will continue to blur the line between humans and avatars, thus furthering immersive entertainment and accelerating real-world applications, such as education and health diagnosis in the metaverse.”

Voice assistants will play a greater role in education, both inside and outside the classroom, Makhani continues.

“Also, privacy regulations from governments around the world will tighten around AI, including capturing and storing speech, especially regarding minors, thereby accelerating the use of on-device ASRs, on-device model deployments, and textless models,” Makhani adds.

Jim Freeze, chief marketing officer of Interactions, is excited by how, in 2021, AI became an integral part of the conversation around healthcare innovation—a movement he foresees snowballing.

“As hospital systems are increasingly overwhelmed by the consequences of the pandemic, administrators are turning to conversational AI as a solution,” Freeze says. “I expect to see this trend of conversational AI in healthcare continue into 2022.”

Expect AI to become smarter and more aware of its surroundings, too, with the ability to detect the local environment.

“This will have many capabilities—suppressing noise in headsets, warning us of events around us, performing natural language control locally without cloud assistance, and more. Speech AI will eventually be using multiple sensors to make context-dependent complex decisions based on sound detection and processing,” predicts Vikram Shrivastava, senior director of AISonic Edge Processors at Knowles.

Lastly, count on speech to spread its tendrils across gaming and the metaverse.

“With an estimated 3.24 billion gamers worldwide, game developers can leverage speech technology to create better character voices without relying on voice actors and ensure that players at all levels have the best experience by using the technology to enhance accessibility features,” Muldoon says. “Additionally, with new adoption of the metaverse, human and machine interaction will become more seamless and blur the lines between reality and virtual. In this space as individuals, we could be dynamically interacting with people and items that we like. In the metaverse, AI and speech technology will be paramount to ensure people have a positive experience.” 

Erik J. Martin is a Chicago area-based freelance writer and public relations expert whose articles have been featured in AARP The Magazine, Reader’s Digest, The Costco Connection, and other publications. He often writes on topics related to real estate, business, technology, healthcare, insurance, and entertainment. He also publishes several blogs, including martinspiration.com and cineversegroup.com.
