This article is part of our reviews of AI research papers, a series of posts that explore the latest findings in artificial intelligence.
Despite tremendous advances over the past decade, artificial intelligence is still sorely lacking in basic areas such as generalization, adaptability, and causal reasoning. Today’s AI systems—mostly centered on machine learning and deep learning—are limited to narrow applications, require large amounts of training data or experience, and are very sensitive to changes in their environment.
Researchers are looking to various fields of science for solutions to the current limits of AI systems. One new concept, proposed by researchers from several organizations and universities, draws inspiration from the two-system thinking framework of Nobel laureate psychologist and economist Daniel Kahneman. Introduced in a paper published online, the technique is called Slow and Fast AI (SOFAI). SOFAI uses meta-cognition to arbitrate between different modes of inference, improving the efficiency with which AI systems use data and compute resources.
Two systems of thinking
In his acclaimed book Thinking, Fast and Slow, Kahneman proposes that the human mind has two systems of decision-making. System 1 is fast, implicit, intuitive, and imprecise. It controls the unconscious decisions we make, such as walking or driving in a familiar neighborhood, climbing stairs, or tying our shoelaces: tasks we can do without conscious thinking and oftentimes in parallel. System 2, on the other hand, is the slow and meticulous type of decision-making that requires logic, rational thinking, and concentration, such as solving complex mathematical equations, playing chess, or walking on a narrow ledge.
The human brain does a great job of dividing decision-making between the two modes of thinking. For example, when you’re learning a new task, such as driving, your System 2 will be more engaged. You’ll need to concentrate to coordinate your different muscles, shifting gears, pressing and releasing pedals, and turning the steering wheel, while at the same time watching the street and listening to the engine. As you gradually repeat the routines, you learn to perform the tasks without concentration and your brain shifts the task to your System 1. This is why an experienced driver can control the car and do something else at the same time, such as talking to the passengers, while a novice driver must concentrate fully on doing all the tasks right.
In mentally demanding tasks, such as calculus or chess, System 2 remains the ultimate controller. But System 1 also shoulders some of the burden over time. For example, experienced chess players who have played thousands of games use System 1 to recognize patterns of moves and formations on the chessboard. It won’t give the player a perfect solution, but it provides intuition about where the game is headed and saves the expensive System 2 crucial time when deciding on the next move.
The division of labor between System 1 and System 2 is nature’s solution to creating a balance between speed and accuracy, learning and execution.
As the researchers note in their paper, “System 1 is able to build models of the world that, although inaccurate and imprecise, can fill knowledge gaps through causal inference, allowing us to respond reasonably well to the many stimuli of our everyday life. When the problem is too complex for System 1, System 2 kicks in and solves it with access to additional computational resources, full attention, and sophisticated logical reasoning.”
System 1 and System 2 processing in AI
Most AI systems use a single architecture to solve problems. For example, machine learning engineers will design a deep neural network to perform a single task and train it until it reaches the desired level of accuracy. Classic deep learning architectures have distinct limitations that have been amply documented in recent years. Among them is the need for large amounts of training data and computational resources. For example, a deep reinforcement learning system that mastered the videogame Dota 2 required thousands of years’ worth of training.
Current AI systems are also very sensitive to edge cases, situations that they haven’t encountered during training. For example, despite having been trained on millions of miles of simulated and real-world driving, autonomous vehicles sometimes make mistakes that the average driver would easily avoid.
Inspired by System 1 and System 2, the SOFAI architecture uses multiple problem-solvers to address some of these limitations. SOFAI is composed of a pair of System 1 (S1) and System 2 (S2) models. The System 1 solver is very fast and automatically processes any new problem or input that SOFAI faces.
SOFAI has a meta-cognitive module (MC) that decides whether the System 1 solution is accurate and reliable enough or if it needs to activate the slower and more resource-intensive System 2 solver. Like the human mind, the system also has models of itself, others, and the world. As it accumulates experience, SOFAI updates these models, which helps it improve the confidence and reliability of fast decision-making with System 1.
The MC module arbitrates between the two systems using the information it gains from the models and the solution proposed by the S1 solver. Sometimes the S1 solution might not be very accurate, but given time constraints, it might still be a better option than spending additional resources on S2. In other cases, the expected gain from activating S2 might not justify the extra resources, so the MC will opt for the S1 solution.
According to the researchers, “This architecture and flow of tasks allows for minimizing time to action when there is no need for S2 processing since S1 solvers act in constant time. It also allows the MC agent to exploit the proposed action and confidence of S1 when deciding whether to activate S2, which leads to more informed and hopefully better decisions by the MC.”
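A rough sketch of how this arbitration could work in code, assuming hypothetical stand-ins for the solvers, a confidence score, and a time budget (none of which are the authors' actual implementation):

```python
# Hypothetical sketch of SOFAI's meta-cognitive (MC) arbitration. Class names,
# the confidence score, and the time budget are illustrative assumptions.

class FastS1:
    """Stand-in for the fast S1 solver: constant-time, low early confidence."""
    def solve(self, problem):
        return "greedy-move", 0.6          # (proposed action, confidence)

class SlowS2:
    """Stand-in for the slow, resource-intensive S2 solver."""
    def estimated_cost(self, problem):
        return 0.5                         # e.g., seconds of compute
    def solve(self, problem):
        return "planned-move"

class SOFAI:
    def __init__(self, s1, s2, confidence_threshold=0.8, time_budget=1.0):
        self.s1 = s1
        self.s2 = s2
        self.confidence_threshold = confidence_threshold
        self.time_budget = time_budget

    def solve(self, problem):
        # S1 always runs first, acting in constant time.
        action, confidence = self.s1.solve(problem)
        # MC: accept the S1 proposal if it is trustworthy enough.
        if confidence >= self.confidence_threshold:
            return action
        # MC: fall back to S1 if there is no time budget left for S2.
        if self.s2.estimated_cost(problem) > self.time_budget:
            return action
        # Otherwise, the expected gain justifies the extra resources.
        return self.s2.solve(problem)

agent = SOFAI(FastS1(), SlowS2())
print(agent.solve("grid-state"))           # S1 confidence 0.6 < 0.8, so S2 runs
```

Lowering the confidence threshold or shrinking the time budget shifts more decisions to S1; as S1’s confidence grows with experience, the same shift emerges on its own.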
Testing SOFAI in the real world
While the researchers present SOFAI as a general concept, they also experimented with a concrete implementation of the system on a grid-navigation problem. The goal of the AI agent was to generate a trajectory from an initial state to a goal state.
The environment provided a reward for reaching the goal and a penalty for each move. There were additional constraints, such as extra penalties for squares marked black, green, or blue. In essence, the AI agent must find the shortest trajectory to the goal while avoiding states that incur penalties. The researchers also added some randomness so the environment would not be fully deterministic.
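A minimal sketch of such an environment, with assumed penalty values, a per-move cost, and a simple "slip" noise model (none of these specifics come from the paper):

```python
import random

# Illustrative penalized grid world. The penalty values, reward magnitudes,
# and slip-based noise model are assumptions; the paper's environment differs
# in its details.

PENALTIES = {"black": -5.0, "green": -3.0, "blue": -1.0}
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

class GridWorld:
    def __init__(self, size, goal, colored_squares, slip_prob=0.1, seed=0):
        self.size = size
        self.goal = goal
        self.colors = colored_squares   # {(x, y): "black" | "green" | "blue"}
        self.slip_prob = slip_prob      # randomness so moves aren't deterministic
        self.rng = random.Random(seed)

    def step(self, state, move):
        # With probability slip_prob, the agent slips in a random direction.
        if self.rng.random() < self.slip_prob:
            move = self.rng.choice(list(MOVES))
        dx, dy = MOVES[move]
        x = min(max(state[0] + dx, 0), self.size - 1)   # clamp to the grid
        y = min(max(state[1] + dy, 0), self.size - 1)
        nxt = (x, y)
        reward = -1.0                                   # per-move penalty
        reward += PENALTIES.get(self.colors.get(nxt), 0.0)
        if nxt == self.goal:
            reward += 10.0                              # goal reward
        return nxt, reward
```

With these assumed values, stepping onto a blue square costs the move penalty plus the blue penalty, while reaching the goal nets the goal reward minus the move penalty.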
The SOFAI agent was composed of a simple System 1 solver that chose the move with the highest expected reward. The S1 solver starts with no knowledge of the environment and gradually improves as it collects experience and SOFAI updates its models of the world (the grid environment) and of itself (trajectories and moves). The System 2 component is based on Multi-alternative Decision Field Theory (MDFT), an inference model that can reason over different choices. MDFT provides better results than the untrained S1 solver, but it is slower and computationally more expensive. In their experiments, the researchers tested three variations of MDFT, each tuned for different trajectory preferences.
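The greedy S1 solver could be sketched as a tabular learner that averages the rewards observed for each state and move, then picks the move with the highest estimate. This averaging scheme is an assumption for illustration, and the MDFT-based S2 solver is not sketched here:

```python
from collections import defaultdict

# Illustrative greedy S1 solver: keeps running averages of observed rewards
# per (state, move) pair. The tabular scheme is an assumption, not the
# authors' implementation.

class GreedyS1:
    def __init__(self, moves):
        self.moves = moves
        self.totals = defaultdict(float)   # sum of rewards per (state, move)
        self.counts = defaultdict(int)     # observations per (state, move)

    def update(self, state, move, reward):
        # Record experience from a completed trajectory step.
        self.totals[(state, move)] += reward
        self.counts[(state, move)] += 1

    def expected_reward(self, state, move):
        n = self.counts[(state, move)]
        return self.totals[(state, move)] / n if n else 0.0

    def solve(self, state):
        # Greedily pick the move with the highest expected reward so far.
        return max(self.moves, key=lambda m: self.expected_reward(state, m))
```

With no experience, every move looks equally good; as rewards accumulate, the solver's choices (and, in SOFAI, its trustworthiness to the MC module) improve.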
At every step, the SOFAI’s meta-cognition unit decides whether it can trust the S1’s solution or if it needs to switch to the S2 solver.
The researchers tested the different variations of SOFAI against standalone S1 and S2 (MDFT) agents. Their experiments show that, used alone, the S1 solver produces poor results in reward, trajectory length, and time. The S2 solver generates good trajectories and rewards but is computationally expensive and slow. SOFAI, in contrast, found the right balance between reward and efficiency.
They then aggregated the results over 1,000 trajectories to see how the SOFAI model evolves its behavior and balances the use of the S1 and S2 agents. The results show that as SOFAI goes through more and more trajectories, its timing decreases, which means it becomes more compute-efficient, and its evolving behavior is very similar to how the human mind distributes cognitive labor between System 1 and System 2.
In the beginning, SOFAI mostly uses S2 because its S1 module does not yet have enough experience and its decisions cannot be trusted. As SOFAI completes more trajectories, it updates its environment and self models, which leads to better decisions by S1. Consequently, the MC module gradually shifts decisions to the faster S1 module instead of relying on the compute-intensive S2; after about 450 trajectories, S1 is used more often than S2. This evolving behavior allows SOFAI to become faster without degrading the quality of the trajectories it generates.
“This behavior is similar to what happens in humans… we first tackle a non-familiar problem with our System 2, until we have enough experience that it becomes familiar and we pass to using System 1,” the researchers write.
SOFAI is one of several directions of research that have been inspired by the System 1 and 2 thinking theory. In 2019, deep learning pioneer Yoshua Bengio discussed System 2 deep learning, an area of research that aims to improve neural networks toward developing symbolic reasoning capabilities. Other related efforts are being made in developing hybrid AI systems that combine neural networks and symbolic AI.
And there are notable efforts in self-supervised learning systems that can develop behavior without the need for large amounts of data. The intersection of self-supervised learning and reinforcement learning is particularly interesting as it aims to develop memory and data-efficient AI systems that can be applied to the real world.
Though SOFAI is not the only game in town, it looks promising. The researchers plan to expand on the idea and create SOFAI systems that have multiple S1 and S2 modules and can tackle several problems with the same architecture.