The Book of Why: Exploring the missing piece of artificial intelligence

December 9, 2019

Thinking robot — Image credit: Depositphotos

Welcome to TechTalks’ AI book reviews, a series of posts that explore the latest literature on AI.

Over the past six decades, the field of artificial intelligence has followed a meandering path, passing through periods of excitement and disenchantment and a longstanding dispute between competing approaches to creating intelligence.

Today, deep learning, the currently dominant AI technique, owes its success in large part to an abundance of data and compute resources. Thanks to deep learning models and their underlying technology, artificial neural networks, we have been able to tackle problems that were impossible to solve with classical AI approaches. There are now AI algorithms that can outperform humans at many complicated tasks, such as playing Go or predicting cancer.

Most current advances in the field come from creating bigger neural networks and training them with more and more data. In the past few years, this approach has yielded AI models that perform more accurately on tasks that require spatial consistency (e.g., image classification) or temporal consistency (e.g., text generation).

But the current excitement about pouring more data and compute into deep learning models has blinded most researchers to one of the fundamental problems that AI technology still suffers from: causality.

The Book of Why: The New Science of Cause and Effect, written by award-winning computer scientist Judea Pearl and science writer Dana Mackenzie, delves into this topic. In the book, Pearl argues for moving past data-centric approaches and giving AI algorithms the capability to find causes. This could be the one thing that stands between current AI and human intelligence: the power to ask “why” and look for answers, hence the name of the book.

“If I could sum up the message of this book in one pithy phrase, it would be that you are smarter than your data. Data do not understand causes and effects; humans do,” Pearl writes.

Causal models

The Book of Why cover — In “The Book of Why,” Judea Pearl discusses why without causal models, AI algorithms will never get us closer to replicating human intelligence.

Consider one of the simplest tasks that every human being learns to do early in life: household chores. Most of us can do things like tidying up the house, doing the dishes, hanging the clothes to dry, making coffee, getting milk from the fridge, etc., without the need for too much instruction. If we move to a new house with a totally new layout and different appliances, we’ll be able to find our way around and do all those things without the need for new instructions and training, just by doing a minimal exploration.

“Our causal intuition alone is usually sufficient for handling the kind of uncertainty we find in household routines or even in our professional lives,” Pearl writes.

Our causal model of the world is what enables us to make assumptions about the relations between objects, draw analogies across experiences, and deal with new environments and problems. Unfortunately, there have been scant efforts to provide AI models with the same kind of tools.

“While awareness of the need for a causal model has grown by leaps and bounds among the sciences, many researchers in artificial intelligence would like to skip the hard step of constructing or acquiring a causal model and rely solely on data for all cognitive tasks,” Pearl notes. “The hope—and at present, it is usually a silent one—is that the data themselves will guide us to the right answers whenever causal questions come up.”

For the time being, the more successful AI systems are deep learning models that leverage bigger datasets with more examples of different possible situations. But data alone won’t provide the answer once the problem moves beyond very narrow situations into open-ended ones, such as driving on public roads. The AI will remain brittle: it won’t be able to generalize its behavior beyond the examples it has seen, and it will continue to fail in the face of corner cases, situations it hasn’t encountered before.

Going back to the household chores scenario, an AI that could handle even basic tasks around the house would need to be trained on all possible kinds of fridges, dishwashers, washing machines, dryers, cupboards, different types of food packaging, etc. And it would need separate training for home navigation, covering different floor types (carpet, tiles, parquet, etc.), different walls (solid paint, wallpapers, etc.), different stair types, different doors, etc.

The possibilities are virtually unlimited, and no amount of data can create a solid model that can solve all these problems in a robust manner. With every house being unique in its own way, you would effectively need to retrain your robot for every home. That’s why we still don’t have robo-butlers.

But why is it that we humans can handle these seemingly simple tasks that perplex the most complicated AI systems?

“Very early in our evolution, we humans realized that the world is not made up only of dry facts (what we might call data today); rather, these facts are glued together by an intricate web of cause-effect relationships,” Pearl writes, adding that causal explanations, not dry facts, make up the bulk of our knowledge, and should be the cornerstone of machine intelligence.

The ladder of causation

In The Book of Why, Pearl introduces the “ladder of causation,” a three-level model to evaluate the intelligence of living or artificial systems. While a lot of the book goes into explaining the ladder of causation with historical and practical examples, I’ll do my best to summarize it here.

The first rung, “seeing,” is everything you can learn from observation alone. These are the kinds of correlations you can find in the data you collect from the world. This is the model we share with animals.

The second rung, “doing,” covers what we learn by going beyond observation and intervening. It involves performing experiments, controlling for specific variables, and drawing conclusions from the results. The second rung is limited to humans and a few animals that have manifested the capacity to use tools.
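The gap between the first two rungs can be made concrete with a toy calculation. The sketch below is an illustration, not an example from the book: it assumes a hypothetical world in which a hidden common cause Z drives both an action X and an outcome Y, while X itself has no effect on Y. All names and probabilities are invented. It compares what observation suggests with what an experiment would reveal.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy world (all numbers are invented for illustration):
# Z = hidden common cause, X = action, Y = outcome.
P_Z = {1: F(1, 2), 0: F(1, 2)}
P_X1_GIVEN_Z = {1: F(9, 10), 0: F(1, 10)}  # Z strongly influences X
P_Y1_GIVEN_Z = {1: F(8, 10), 0: F(2, 10)}  # Y depends on Z only, never on X


def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1 - p_one


def joint(do_x=None):
    """Return {(z, x, y): probability}.

    With do_x set, X is forced to that value and its dependence on Z is cut,
    which is what an intervention (an experiment) does.
    """
    table = {}
    for z, x, y in product((0, 1), repeat=3):
        if do_x is None:
            p_x = bern(P_X1_GIVEN_Z[z], x)
        else:
            p_x = F(1) if x == do_x else F(0)
        table[(z, x, y)] = P_Z[z] * p_x * bern(P_Y1_GIVEN_Z[z], y)
    return table


def prob_y1_given_x(table, x_val):
    """P(Y = 1 | X = x_val) computed from a joint table."""
    num = sum(p for (z, x, y), p in table.items() if x == x_val and y == 1)
    den = sum(p for (z, x, y), p in table.items() if x == x_val)
    return num / den


# Rung one: compare outcomes among cases where X happened to be 1 versus 0.
seeing = prob_y1_given_x(joint(), 1) - prob_y1_given_x(joint(), 0)
# Rung two: compare outcomes when X is set to 1 versus set to 0.
doing = prob_y1_given_x(joint(do_x=1), 1) - prob_y1_given_x(joint(do_x=0), 0)

print("seeing:", seeing)  # seeing: 12/25
print("doing:", doing)    # doing: 0
```

Under these made-up numbers, observation shows Y to be 48 percentage points more likely when X is present, while forcing X changes nothing. The correlation is real, but it comes entirely from the hidden common cause.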

“Yet even tool users do not necessarily possess a ‘theory’ of their tool that tells them why it works and what to do when it doesn’t,” Pearl notes. “For that, you need to have achieved a level of understanding that permits imagining.”

And this brings us to the final rung, “imagining,” the causal model of modern humans, the ability to think about counterfactuals and imagine alternate worlds.

“It was primarily this third level that prepared us for further revolutions in agriculture and science and led to a sudden and drastic change in our species’ impact on the planet,” Pearl writes.

Elsewhere in the book he mentions, “Our ability to conceive of alternative, nonexistent worlds separated us from our protohuman ancestors and indeed from any other creature on the planet. Every other creature can see what is. Our gift, which may sometimes be a curse, is that we can see what might have been.”
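In Pearl’s structural causal model framework, counterfactual reasoning of this kind is commonly described as a three-step recipe: infer the unobserved background factors from what actually happened (abduction), change the variable in question (action), and recompute the outcome (prediction). The following is a minimal sketch under a hypothetical one-line linear equation. The equation, the EFFECT constant, and the numbers are assumptions made for illustration and are not taken from the book.

```python
# Hypothetical structural equation (invented for illustration):
#   outcome = EFFECT * treatment + background
EFFECT = 2.0


def outcome(treatment, background):
    """The assumed mechanism that generates the outcome."""
    return EFFECT * treatment + background


def counterfactual(observed_treatment, observed_outcome, imagined_treatment):
    """Answer: what would the outcome have been under a different treatment?"""
    # 1. Abduction: infer the unobserved background factor that explains the facts.
    background = observed_outcome - EFFECT * observed_treatment
    # 2. Action: swap in the imagined treatment.
    # 3. Prediction: re-run the same mechanism with the background held fixed.
    return outcome(imagined_treatment, background)


# We observed treatment 1.0 and outcome 5.0. What if the treatment had been 3.0?
would_have_been = counterfactual(1.0, 5.0, 3.0)
print(would_have_been)  # 9.0
```

The answer concerns one specific case, not a population average, and it depends on the assumed mechanism: the same observation paired with a different equation would yield a different “what might have been.”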

The mini-Turing test

Pearl has focused The Book of Why on what he calls “the mini-Turing test,” named after the AI evaluation experiment that computer science pioneer Alan Turing proposed in 1950. Pearl describes the mini-Turing test as follows:

“How can machines (and people) represent causal knowledge in a way that would enable them to access the necessary information swiftly, answer questions correctly, and do it with ease, as a three-year-old child can?”

An AI that can pass the mini-Turing test should be able to process a story and correctly answer causal questions that a human can answer. To make the test simple (hence the “mini” prefix), Pearl has excluded aspects of human intelligence such as vision and natural language, and has also allowed the contestant to encode the story in any convenient representation.

The mini-Turing test can help a lot in evaluating the level of intelligence that AI systems manifest. This is important because we’re at a time when AI systems are manifesting behavior that seems very intelligent on the surface. Consider AlphaGo’s move 37, a tactic that awed even the most proficient Go players, or the more recent AI program Aristo, which can score above 90 percent on an 8th-grade science test.

Can any of these AI systems answer simple questions about the tasks they perform? Or are they simply converting statistics and probabilities into decisions? Misunderstanding these concepts sometimes leads to the misinterpretation of the capabilities and limits of current AI technologies.

Where does deep learning currently stand?

“Without the ability to envision alternate realities and contrast them with the currently existing reality, a machine cannot pass the mini-Turing test; it cannot answer the most basic question that makes us human: ‘Why?’” Pearl writes in the final chapter of The Book of Why.

So, where does deep learning stand on the ladder of causation?

There’s no doubt that despite its limits and challenges, deep learning has made great contributions to many domains. The AI technique can solve many problems that are beyond the capacity of the human brain. Also, areas such as computer vision, speech recognition, and natural language processing have seen great leaps thanks to advances in deep learning.

But does that mean that deep learning manifests intelligence?

“Deep learning has succeeded primarily by showing that certain questions or tasks we thought were difficult are in fact not. It has not addressed the truly difficult questions that continue to prevent us from achieving humanlike AI,” Pearl writes.

And he is right. Consider the best chess-, Go-, or StarCraft-playing AI systems. None of these state-of-the-art AI programs manages to solve the complicated problems of its environment in the data- and resource-efficient manner that the human brain does. None of them can answer questions or explain the reasons behind its decisions. But they have proven that there are alternate ways to solve those problems, through the sheer power of search and pattern-matching algorithms.

In many ways, we’ve made great progress in deep learning, but our AI is still stuck at the first rung of the ladder of causation.

“The goal of strong AI is to produce machines with humanlike intelligence, able to converse with and guide humans. Deep learning has instead given us machines with truly impressive abilities but no intelligence. The difference is profound and lies in the absence of a model of reality,” Pearl writes.

Where do we go from here? There are vigorous debates over what course AI should take, and naturally, many don’t agree with Pearl’s views.

A good read in this regard is Architects of Intelligence, a compilation of interviews with 23 AI and computer scientists and philosophers, including Judea Pearl. In his interview with the author Martin Ford, Pearl stresses the need to find ways to integrate causality into AI technologies. “Causal modeling is not at the forefront of the current work in machine learning. Machine learning today is dominated by statisticians and the belief that you can learn everything from data. This data-centric philosophy is limited,” he says.

In The Book of Why, he writes, “Causal questions can never be answered from data alone. They require us to formulate a model of the process that generates the data, or at least some aspects of that process.”
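One way to see why data alone fall short is that two opposite causal stories can produce exactly the same observations. The sketch below assumes a hypothetical joint distribution over two binary variables X and Y (the numbers are invented for illustration) and asks what happens to Y if we force X to 1 under each story.

```python
from fractions import Fraction as F

# One observed joint distribution over two binary variables; keys are (x, y).
# The numbers are invented for illustration.
JOINT = {
    (1, 1): F(9, 20),
    (1, 0): F(1, 20),
    (0, 1): F(1, 20),
    (0, 0): F(9, 20),
}


def p_y1_given_x(x_val):
    """P(Y = 1 | X = x_val), read directly off the observed data."""
    num = JOINT[(x_val, 1)]
    return num / (num + JOINT[(x_val, 0)])


def p_y1():
    """Marginal P(Y = 1), read directly off the observed data."""
    return JOINT[(0, 1)] + JOINT[(1, 1)]


def p_y1_do_x1_if_x_causes_y():
    """Story A (X -> Y): forcing X = 1 makes Y follow its conditional given X = 1."""
    return p_y1_given_x(1)


def p_y1_do_x1_if_y_causes_x():
    """Story B (Y -> X): forcing X cuts the arrow into X, so Y keeps its marginal."""
    return p_y1()


print(p_y1_do_x1_if_x_causes_y())  # 9/10
print(p_y1_do_x1_if_y_causes_x())  # 1/2
```

Both stories fit the same table perfectly, yet they disagree about the result of acting on X. Choosing between them takes an assumption about the process that generated the data, which the table itself does not contain.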

I see dozens of "Data Science Institutes" erected across the country, I read their manifestos and I check their advisory boards. Causality does not seem to be on their agenda. Which makes one doubt whether the Ladder has been internalized and where this hype will end. #Bookofwhy

— Judea Pearl (@yudapearl) November 27, 2019

The Book of Why, a much-recommended read for anyone who’s interested in an alternate view on the current state of AI, is much more than just a discussion about intelligence. It’s a look at the history of causal science and humanity’s path from observing data to developing new sciences.
