This article is part of our coverage of the latest in AI research.
Artificial intelligence systems can mimic some aspects of human intelligence with impressive results, including detecting objects, navigating environments, playing chess, and even generating text. But cloning human behavior has its limitations. Without backing actions with thought, AI systems can become brittle and make unpredictable mistakes when faced with novel situations.
One recent project by scientists at the University of British Columbia and Vector Institute shows the benefits of getting AI systems to think like humans. They propose a technique called “Thought Cloning,” which trains the AI on thoughts and actions at the same time.
Thought cloning can enable deep learning models to generate a sort of reasoning process for their actions and convey that reasoning to human operators. The benefits of thought cloning include training efficiency, easier troubleshooting and error fixing, and the prevention of harmful behavior.
Behavior cloning vs thought cloning
Many deep learning systems are trained on data generated by humans. For example, training data can be the list of moves in a chess game or the sequence of actions in a strategy game. It can be real-world actions such as completing tasks in a warehouse. By training on a large enough dataset, the AI agent will be able to create a model of human behavior on that task.
But while the model can learn to mimic human behavior and reach the same results on many tasks, it does not necessarily learn the reasoning behind those actions. Without the thought process, the AI agent will not be able to generalize the learned actions to new settings. Consequently, it will require a much larger training dataset that includes all possible scenarios. And it will still remain unpredictable in the face of unseen edge cases.
The hypothesis behind thought cloning is that if you train a model on actions and their corresponding thoughts, then the model will learn the right associations between behavior and goals. And it will also be able to generate and communicate the reasoning behind its actions.
To achieve thought cloning in ML models, you provide the model with multiple streams of information during training. One is the action observations, such as the moves that a player is performing in a game. The second is the thought stream, such as the explanation behind the action. For example, in a real-time strategy game, the AI observes that the player moved a few units in front of a bridge. At the same time, it receives a text explanation that says something like “prevent enemy forces from crossing the bridge.”
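To make the two streams concrete, a single time step of training data pairs an observation and action with the human's narrated thought. A minimal sketch in Python, using the bridge-defense example above (the field names and schema are hypothetical, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class ThoughtCloningSample:
    """One time step from a human demonstration (hypothetical schema)."""
    observation: dict  # what the agent sees, e.g. unit positions near the bridge
    action: str        # what the human player did at this step
    thought: str       # the human's narrated reasoning for that action

# The real-time strategy example from the text as one training sample:
sample = ThoughtCloningSample(
    observation={"units": ["infantry_1", "infantry_2"], "near": "bridge"},
    action="move_units_to_bridge",
    thought="prevent enemy forces from crossing the bridge",
)
```

A behavior-cloning dataset would contain only the first two fields; thought cloning adds the third, which is what lets the model learn the association between goals and actions.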
There are several benefits to this approach. First, AI agents learn faster because they need fewer examples to figure out why a certain action matters. Second, they perform better, because they can generalize the same reasoning to unseen situations. And third, they improve safety by expressing the reasoning behind each action they take. For example, if the AI agent is pursuing the right goal but intends to take an unsafe action (e.g., driving through a red light to reach the destination on time), it can be stopped before it causes damage. Conversely, if it is taking the right action for the wrong reason, it can be steered in the right direction.
Teaching AI to imitate human thought
The researchers propose a deep learning architecture composed of two parts that try to accomplish a mission. The “upper component” processes a stream of thoughts and environment observations and tries to predict the next thought that will help the model achieve its goal. The “lower component” receives the environment observations and the output of the upper component and tries to predict the correct action to take.
The model repeats this process and uses the results of each stage as input into the next stage. During training, the model has access to the sequence of thoughts and actions produced by humans. It uses this information as ground truth to adjust its parameters and minimize the loss in thought and action predictions. A trained model should be able to generate the right sequence of thoughts and actions for unseen tasks.
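The alternating two-level loop described above can be sketched schematically. In this toy version, `upper` and `lower` are stand-in functions rather than neural networks, and the combined training loss is shown as a weighted sum of the thought loss and the action loss, with the weight `alpha` being an assumed hyperparameter (not a value from the paper):

```python
def upper(observation, thought_history):
    """Stand-in for the upper component: predicts the next thought.
    A real model conditions on the mission, observations, and past thoughts."""
    return f"plan step {len(thought_history) + 1}"

def lower(observation, thought):
    """Stand-in for the lower component: predicts an action
    given the current observation and the upper component's thought."""
    return "move_forward" if "plan" in thought else "wait"

def rollout(observations):
    """Alternate the two components, feeding each predicted thought
    into the action model, step after step."""
    thoughts, actions = [], []
    for obs in observations:
        thought = upper(obs, thoughts)  # upper component proposes a thought
        action = lower(obs, thought)    # lower component acts on that thought
        thoughts.append(thought)
        actions.append(action)
    return thoughts, actions

def joint_loss(thought_loss, action_loss, alpha=1.0):
    """During training, both predictions are penalized against the
    human-produced ground truth; alpha balances the two terms."""
    return action_loss + alpha * thought_loss

thoughts, actions = rollout([{"step": i} for i in range(3)])
```

The key structural point is that the lower component never sees the ground-truth thought at inference time; it only sees what the upper component generates, which is why the thought stream becomes a faithful window into the agent's planning.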
The model uses transformers, long short-term memory (LSTM) networks, and vision-language models to process text commands and visual data, fuse them together, and track embeddings across multiple steps. The researchers released their results on GitHub, including the model weights, the code for training the model, and the code for generating the training and test data. (This is a hopeful development against the backdrop of AI labs sharing less and keeping the details of their models secret.)
For their experiments, the authors used BabyAI, a grid world platform in which an AI agent must accomplish different missions. The agent can perform various actions such as picking up objects, opening doors, and navigating rooms. The advantage of the BabyAI platform is that it can programmatically generate worlds, missions, solutions, and narrations to train the AI system. The researchers created a dataset of one million scenarios to train their thought-cloning model.
To test their technique, the researchers created two different models. The first was trained for pure behavior cloning, which means it only received environment observations. The second was trained for thought cloning, receiving both the behavior data and a stream of plaintext explanations about the reasoning behind each move.
The results show that thought cloning significantly outperforms behavior cloning, and it converges faster because it needs fewer training examples to generalize to unseen tasks. Their experiments also show that thought cloning outperforms behavior cloning on out-of-distribution (OOD) examples (tasks that are very different from the model’s training examples).
Thought cloning also enabled the researchers to better understand the behavior of the AI agent because for each step, it produced its planning and reasoning in natural language. In fact, this interpretability feature enabled the researchers to investigate some of the model’s early errors during training and quickly adjust their training regime to steer it in the right direction.
In terms of safety, the researchers developed a technique called Precrime Intervention that automatically detects and prevents risky behavior by examining the model’s thought stream. They observe that in their experimental environment, Precrime Intervention “almost entirely eliminates all unsafe behaviors, thereby demonstrating the promising potential of TC agents in advancing AI safety.”
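Because the agent declares its intent in natural language before acting, an outer loop can inspect each thought and veto actions whose declared intent looks unsafe. The following is an illustrative approximation of that idea, not the paper's implementation; the rule list and function names are hypothetical:

```python
# Hypothetical list of intent patterns an operator has flagged as unsafe.
UNSAFE_PATTERNS = ["run the red light", "drive through a red light"]

def is_unsafe_thought(thought: str) -> bool:
    """Return True if the agent's declared thought signals unsafe intent."""
    lowered = thought.lower()
    return any(pattern in lowered for pattern in UNSAFE_PATTERNS)

def act_with_intervention(thought: str, action: str) -> str:
    """Check the thought stream before execution; halt instead of
    performing the action if the stated plan is unsafe."""
    if is_unsafe_thought(thought):
        return "HALTED"  # hand control back to a human operator
    return action

# The red-light example from earlier in the article: the goal is fine,
# but the planned action is unsafe, so the agent is stopped before acting.
result = act_with_intervention(
    "drive through a red light to reach the destination on time", "accelerate")
```

The point of the sketch is that the intervention operates on the thought, not the action: the unsafe plan is caught before any action is executed, which is what distinguishes this from after-the-fact monitoring.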
Applying thought cloning to real-world AI
Thought cloning is an interesting and promising direction for AI research and development. It fits in with other efforts to create embodied and multi-modal deep learning models, such as Google’s PaLM-E and DeepMind’s Gato. Part of the reason human intelligence is so much more robust than current AI is our ability to ingest and process different modalities of information at the same time. And experiments show that multi-modal AI systems are more robust and efficient than their single-modality counterparts.
However, thought cloning is not without its challenges. For one thing, the BabyAI environment is simple and deterministic, which makes it much easier for deep learning models to learn its nuances and intricacies. The real world is messier, unpredictable, and much more complex.
Another challenge of this method is creating the training data. People don’t necessarily narrate their every action when performing tasks. Our shared knowledge and similar biology obviate the need to explicitly spell out our every intention. The authors suggest that one solution could be to use YouTube videos in which people narrate tasks as they perform them. However, even then, human behavior is fraught with implicit reasons that can’t necessarily be put into plain text.
It remains to be seen how thought cloning performs on internet-scale data and complex problems. But as the paper’s authors state, it creates new avenues for “scientific investigation in Artificial General Intelligence, AI Safety, and Interpretability.”