Beware the Limitations of Machine Learning
The speech world has changed since developers began using AI and machine learning. Before that, developers used “rule-based” techniques, such as VoiceXML or hand-coded instructions, to carefully specify the rules for responding to user requests. These techniques were time-consuming, and the rules or programs were difficult to maintain and modify.
We’ve all seen YouTube videos that demonstrate how easy it is to train a simple AI system by typing a few phrases and responses. No more careful coding in VoiceXML. No more writing detailed specs using a programming or scripting language.
Whoa, hold on! Developing a speech-enabled AI system is not that easy. It may take hundreds or thousands of pairs of requests and responses to train an AI system. Collecting these utterances is time-consuming and expensive. After an AI system is deployed, users may come up with unexpected requests that require additional training.
AI systems using machine learning excel at matching a user request to an appropriate response, but these simpler systems cannot perform other actions that users expect from a system that understands natural language. The following is a short list of limitations of AI systems based solely on machine learning:
- Limited conversational memory: While an AI system can access a database of transactions to obtain, say, historical business information, it may not remember what the user said earlier in the current dialogue, and it may be unable to resolve pronouns to objects mentioned previously.
- Reasoning: Suppose the AI system knows that Fred’s Grocery store sells breakfast cereals and that corn flakes are a breakfast cereal. How can these two facts be combined so that the AI system responds “Fred’s Grocery store sells corn flakes” to the user request “Where can I buy corn flakes?” The AI system needs the ability to reason, deriving new facts in order to respond to user requests.
- Planning: A user wants to fly from Bend, Oregon, to New York City, but there are no direct flights between these two cities. The AI system must develop a plan: for example, fly from Bend to Portland, Oregon, transfer planes, and then fly from Portland to New York City. The AI system requires the powers of induction and deduction, as well as other forms of reasoning and logic.
- Explanations: Given user food preferences and current location, many AI systems can recommend a restaurant. But if the user asks, “Why?” the AI system may not be able to justify or explain its recommendation in a way the user can understand.
- Ignorance: An AI system may not realize when it doesn’t know how to respond to a request. Instead, it simply returns whichever canned response is the closest match, however loose. Users need to understand the limits of what the AI system can do rather than listen to a “guess” in place of a factual response.
- Common sense: A user requests flowers for a funeral, but because it’s Valentine’s Day, the AI system replies with an inappropriately cheerful message: “I’ve ordered the flowers; have an enjoyable evening.” Most people have years of experience dealing with personal situations and use that experience to respond appropriately. AI systems may lack this capability.
- Biases: An AI system may have implicit biases hidden in the data used to train it. For example, if the training data for food preferences contains recommendations from a large number of Italian customers, the system may favor Italian restaurants over French ones. Developers may not realize that such biases exist in their training data, or how harmful those biases may be to users and society.
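The two-fact inference described in the reasoning example above is exactly what a rule engine provides on top of pattern matching. Here is a minimal forward-chaining sketch; the facts and the store name are illustrative, not from any real system:

```python
# Minimal forward-chaining sketch: combine known facts via a rule to
# derive new facts, then answer a query. All data is illustrative.

facts = {
    ("sells", "Fred's Grocery", "breakfast cereal"),
    ("is_a", "corn flakes", "breakfast cereal"),
}

def infer(facts):
    """Rule: if a store sells a category and an item belongs to that
    category, conclude that the store sells the item."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (_, store, category) in [f for f in derived if f[0] == "sells"]:
            for (_, item, cat) in [f for f in derived if f[0] == "is_a"]:
                if cat == category and ("sells", store, item) not in derived:
                    derived.add(("sells", store, item))
                    changed = True
    return derived

def where_to_buy(item, facts):
    return [store for (rel, store, x) in infer(facts)
            if rel == "sells" and x == item]

print(where_to_buy("corn flakes", facts))  # ["Fred's Grocery"]
```

A purely trained intent matcher never stores the derived fact; the rule engine makes it available for any future query.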
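The flight-planning example above is, at its core, a search problem. A breadth-first search over a graph of direct flights is one classic way to build such a plan; the route data below is invented for illustration:

```python
from collections import deque

# Hypothetical direct-flight graph; note there is no
# Bend-to-New York direct flight.
routes = {
    "Bend": ["Portland"],
    "Portland": ["Bend", "New York"],
    "New York": ["Portland"],
}

def plan_trip(origin, destination):
    """Breadth-first search over direct flights; returns the shortest
    sequence of cities, or None if the destination is unreachable."""
    queue = deque([[origin]])
    visited = {origin}
    while queue:
        path = queue.popleft()
        city = path[-1]
        if city == destination:
            return path
        for nxt in routes.get(city, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

print(plan_trip("Bend", "New York"))  # ['Bend', 'Portland', 'New York']
```

Request-to-response matching alone cannot produce this multi-step itinerary; some form of search or planning component has to compose it.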
AI platform developers should strive continuously to improve AI systems, and while machine learning is very useful, a combination of training and rule-based techniques seems an inevitable choice for many developers. All of the participants in Amazon’s recent competition to build a social bot capable of sustaining a 20-minute conversation used a combination of rule specification and training.
Application developers should consider using software like IBM’s AI Fairness 360, an extensible open source toolkit that can help you examine, report, and mitigate discrimination and bias in machine learning models throughout the AI application life cycle. It contains more than 70 fairness metrics and 10 state-of-the-art bias mitigation algorithms developed by the research community.
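To make the restaurant-recommendation bias concrete, one of the standard fairness metrics such toolkits report is disparate impact: the ratio of favorable-outcome rates between two groups. The sketch below computes it directly in plain Python with invented data; it does not use the AI Fairness 360 API itself:

```python
# Disparate impact: ratio of favorable-outcome rates between an
# unprivileged and a privileged group. A value far below 1.0 suggests
# the model disfavors the unprivileged group. All data is invented.

def favorable_rate(outcomes):
    return sum(outcomes) / len(outcomes)

def disparate_impact(unprivileged, privileged):
    return favorable_rate(unprivileged) / favorable_rate(privileged)

# 1 = restaurant type recommended, 0 = not, split by customer group
french_customers = [1, 0, 0, 0, 1, 0, 0, 0]   # 25% favorable
italian_customers = [1, 1, 1, 0, 1, 1, 1, 0]  # 75% favorable

ratio = disparate_impact(french_customers, italian_customers)
print(round(ratio, 2))  # 0.33
```

A commonly cited rule of thumb flags ratios below 0.8; a toolkit like AI Fairness 360 automates this kind of check, at scale, across many metrics.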
Finally, developers must set user expectations for what their systems can actually do, as most AI systems have intelligence equivalent to a typical 5-year-old’s. Modules that intercept user requests that the system cannot process correctly, and inform users about those limitations, can help ensure transparency.
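An intercept module of the kind just described can be as simple as a confidence threshold in front of the intent matcher: when no intent scores high enough, the system admits ignorance instead of guessing. The threshold, intent names, and scores below are illustrative:

```python
# A minimal "intercept" module: if the best intent match falls below a
# confidence threshold, tell the user rather than guess. The threshold
# and intents are illustrative assumptions.

FALLBACK = "Sorry, I don't know how to help with that yet."
THRESHOLD = 0.6

def respond(scored_intents, responses):
    """scored_intents maps intent name -> matcher confidence (0..1)."""
    intent, score = max(scored_intents.items(), key=lambda kv: kv[1])
    if score < THRESHOLD:
        return FALLBACK
    return responses[intent]

responses = {"order_flowers": "Placing your flower order now."}

print(respond({"order_flowers": 0.35}, responses))  # fallback message
print(respond({"order_flowers": 0.92}, responses))  # places the order
```

Returning an honest fallback costs little to implement and goes a long way toward the transparency users need.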
James A. Larson is co-program chair of SpeechTEK 2020. He can be reached at firstname.lastname@example.org.