Breakthroughs in Speech Are Just Beginning
As 2019 came to a close and we hung a new 2020 calendar on the wall, we couldn’t help but take stock of where we came from, where we are today, and where we will be going in the year ahead. To do that, our writers consulted with industry experts across the major sectors of the speech technology landscape to determine the state of the industry, which we highlight in our special report. Luckily, based on our reporting, I can say that the industry is in as good shape now as it has ever been. Strong growth is projected across all technology types, and both acceptance and adoption of speech are clearly on the rise.
The entire speech technology market, QY Research says, is poised for a compound annual growth rate of 18.9 percent, rising from a valuation of $8.6 billion in 2018 to $28.9 billion by the end of 2025.
That is great news for an industry that has, to put it mildly, seen its fair share of disappointments over the years.
Today, thanks to Apple’s Siri, Amazon Alexa, Google Assistant, Samsung’s Bixby, and similar technologies, consumers have not only become familiar with and comfortable using voice, but, in many situations, they actually prefer it. Soon it will be nearly impossible to imagine placing a call, searching the web, setting the thermostat, or playing a song without some sort of voice interaction.
None of this would be possible without the influx of artificial intelligence into the speech industry. Because of AI, particularly in the form of natural language processing, machine learning, and advanced analytics, consumers are no longer constrained by set dialogue flows and limited grammars; they can speak naturally and have a high level of confidence that systems will not only recognize what they say but also respond appropriately. Systems today are faster, more accurate, more robust, and capable of feats that would have been considered impossible not long ago.
The emphasis on AI is certainly not surprising. People in the industry have been talking about it for years. The technology took a long time to get here. Unfortunately, that’s the way it is in this industry.
In some cases, it’s because the industry is simply too slow to act. In others, the vision is too far ahead of technological realities. And in still others, the technology fails to live up to the hype, causing a backlash after just a few failed implementations.
But, if the analysts, consultants, and other experts are to be believed, the industry has a rich future ahead. That is not to say the work is done; there are still many hurdles to overcome. Chief among them is the issue of context, as the industry looks beyond technology that merely understands the individual words spoken and applies AI to drill down into the circumstances and emotional states of the people involved. Once that hurdle is cleared, the impacts on speech synthesis, speech recognition, speech analytics, voice biometrics, contact center technologies, translation, transcription, virtual assistants, and so many other technology segments will extend far and wide.
I have no doubt that we will get there, and probably soon. In the meantime, we can all be glad that we’ve moved well past the point where we constantly throw our hands up in the air and say that speech technology just does not work.
Leonard Klie is the editor of Speech Technology. He can be reached at firstname.lastname@example.org.