November 12, 2019
By James A. Larson, program co-chair, SpeechTEK 2021
Q & A

Q&A: Sam Ringer Says the Revolution Is Coming — the Medium-Term Future of AI and ML

Current AI and machine learning (ML) technologies are starting to change the way we build and innovate. However, the power of our current ML technologies is not fixed. Sam Ringer, machine learning engineer at Speechmatics, will explore where ML is at the moment and where it is heading in the next 10 years in his session “The Revolution Is Coming: The Medium-term Future of AI and ML” at SpeechTEK 2020. How are its current use cases different from those we can expect to see in the future? How will increased technological leverage change how we do business? He’ll answer these questions and more, but first, he answered a few of our questions.

Q: How does ML make it possible to develop better speech applications than traditional technology?

A: ML allows you to leverage compute and data to learn patterns instead of having to define the patterns yourself. This scales far better than human heuristics can.

Q: Some developers feel that it is difficult to generate meaningful error messages when an ML system fails. How can this problem be overcome?

A: This is very much still an open problem, and there are currently no standard best practices. There are, however, some early solutions. These include, but are not limited to, constantly checking the following: training vs. validation loss, overfitting ability, gradient norms, deep data visualisation, and recurring failure cases at test time.
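As a rough illustration of two of the checks Ringer lists (training vs. validation loss, and gradient norms), the sketch below turns them into plain-language warnings. The function names and every threshold are assumptions made up for this example, not values from Ringer or from any particular ML framework.

```python
import math

def gradient_norm(grads):
    """Return the L2 norm of a flat list of gradient values."""
    return math.sqrt(sum(g * g for g in grads))

def training_warnings(train_loss, val_loss, grads,
                      gap_ratio=1.5, min_norm=1e-6, max_norm=100.0):
    """Collect human-readable warnings for one training step.

    All thresholds are illustrative assumptions, not recommended values.
    """
    warnings = []
    # Validation loss far above training loss suggests overfitting.
    if val_loss > gap_ratio * train_loss:
        warnings.append("validation loss well above training loss (possible overfitting)")
    norm = gradient_norm(grads)
    # NaN or very large norms suggest exploding gradients;
    # near-zero norms suggest learning has stalled.
    if math.isnan(norm) or norm > max_norm:
        warnings.append("gradient norm is NaN or too large (possible exploding gradients)")
    elif norm < min_norm:
        warnings.append("gradient norm is near zero (possible vanishing gradients)")
    return warnings

# Example: a large loss gap and a very large gradient norm trigger both warnings.
for message in training_warnings(0.10, 1.00, [300.0, 400.0]):
    print(message)
```

In a real system the losses and gradients would come from the training loop, and the thresholds would need tuning for each model.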

Q: ML requires large volumes of sample data to train the ML engine. Where does the training data come from?

A: Training data typically comes in two forms: labelled and unlabelled. Until very recently, only labelled data was used to train ML systems at scale. In the case of ASR, data comes from a variety of sources, but all of it must be hand labelled.

Q: How much training data is required? How do you know when you have enough training data?

A: Normally thousands of hours of labelled data. You can never really have too much training data!

Q: Are there techniques to avoid using large amounts of training data?

A: Yes: meta-learning and self/semi-supervised learning. These areas have only shown promise recently and still have lots of open questions to be solved.

Q: If a trained ML system needs to support a new class of queries, must the ML system be retrained?

A: It depends. If the new query is in a domain similar to the queries your system was trained on, then you can use transfer learning to do a decent chunk of the heavy lifting for you.
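One common form of transfer learning keeps an already-trained feature extractor frozen and trains only a small new output layer for the new task. The toy sketch below shows that idea in plain Python. The `frozen_features` function is a hypothetical stand-in for a real pretrained model, and the data and the simple perceptron update are assumptions chosen to keep the example short.

```python
def frozen_features(x):
    """Hypothetical stand-in for a pretrained feature extractor.

    It is never updated during training on the new task.
    """
    return [1.0, x, x * x]  # bias term plus two fixed features

def train_head(examples, epochs=200, lr=0.1):
    """Train only a new linear head (perceptron rule) on frozen features."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, label in examples:  # label is +1 or -1
            feats = frozen_features(x)
            score = sum(w * f for w, f in zip(weights, feats))
            if label * score <= 0:  # misclassified: adjust the head only
                weights = [w + lr * label * f for w, f in zip(weights, feats)]
    return weights

def predict(weights, x):
    """Classify x using the frozen features and the trained head."""
    score = sum(w * f for w, f in zip(weights, frozen_features(x)))
    return 1 if score > 0 else -1

# Toy "new class of queries": label +1 when |x| > 1, otherwise -1.
NEW_TASK = [(-2.0, 1), (-1.5, 1), (-0.5, -1), (0.0, -1),
            (0.5, -1), (1.5, 1), (2.0, 1)]

head = train_head(NEW_TASK)
print(all(predict(head, x) == y for x, y in NEW_TASK))
```

Only the three head weights are learned here, which is why this kind of adaptation typically needs far less data than training a full system from scratch.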

Q: ML improvements in accuracy often simply depend on supplying larger amounts of training data. What other techniques do developers have for improving the accuracy of their applications?

A: There are hundreds, if not thousands, of ways of boosting accuracy. However, nearly all are hacky solutions. If your learning process is strong enough, then you shouldn’t have to hack your way to better results.

To see presentations by Ringer and other speech technology experts, register to attend the SpeechTEK Conference.
