Welcome to TechTalks’ AI book reviews, a series of posts that explore the latest literature on AI.
With the hype and excitement surrounding recent advances in deep learning and neural networks, it’s easy to misunderstand the capabilities of artificial intelligence. The media is rife with stories that warn of AI algorithms bringing people back from the dead, AI algorithms developing secret languages, mass technological unemployment, and a looming robot apocalypse. Movies and TV series like Her, The Circle, and Westworld present a mystic portrayal of conscious machines and suggest that human-level AI is just around the corner.
It’s not, as a new book, Rebooting AI, by Robust.AI CEO and cognitive scientist Gary Marcus and New York University professor Ernest Davis, shows. Rebooting AI is a refreshing read and a much-needed reality check on the current confusing state of artificial intelligence.
Why AI can’t understand text
Consider the following text, mentioned in Rebooting AI: “Elsie tried to reach her aunt on the phone, but she didn’t answer.” You don’t need to be a genius to quickly make the following assumptions after reading this sentence:
- By “reach,” we don’t mean Elsie tried to physically reach out to her aunt, but tried to contact her by using the phone.
- “On the phone” means Elsie tried to communicate with her aunt by using the phone as opposed to looking for her on the physical surface of the phone.
- “She didn’t answer” is a reference to Elsie’s aunt, not Elsie herself.
But even the most sophisticated AI algorithm would struggle to draw the same conclusions.
To be clear, deep learning has led to some very interesting advances in natural language processing. Machine translation, search engines, smart reply and autocorrect features, and grammar checkers have become much better thanks to an explosion of innovation in neural networks.
Recently, scientists at the Allen Institute for Artificial Intelligence developed an AI language model that could pass an 8th-grade science test. And OpenAI’s massive language model GPT-2 has caused much concern about the threat of AI-generated fake news.
However, as Marcus and Davis argue, AI’s achievements in processing and generating language are often misleading. Deep neural networks are huge statistical machines, massive mathematical functions that can find complex and intricate correlations between large sets of data. This makes them extremely efficient at classification tasks. But a lot of the things we do when reasoning about language have nothing to do with correlations and statistics.
As the scientists explain in Rebooting AI, there’s a mismatch between what machines are good at doing now—classifying things into categories—and the sort of reasoning and real-world understanding that would be required to perform mundane tasks, such as understanding the sentence we mentioned earlier.
“[Deep learning] struggles when it comes to understanding how objects like sentences relate to their parts (like words and phrases),” Marcus and Davis write. “Why? It’s missing what linguists call compositionality: a way of constructing the meaning of a complex sentence from the meaning of its parts.”
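A toy sketch (not from the book, and deliberately simplified) makes the point about compositionality concrete: a purely statistical bag-of-words representation assigns identical features to sentences with opposite meanings, while even a crude compositional reading keeps track of who does what to whom.

```python
from collections import Counter

# A bag-of-words representation ignores structure entirely: two sentences
# with reversed roles look identical to the model.
def bag_of_words(sentence):
    return Counter(sentence.lower().split())

a = bag_of_words("the aunt called Elsie")
b = bag_of_words("Elsie called the aunt")
assert a == b  # same "meaning" to the model, despite reversed roles

# A compositional reading builds meaning from the parts. This naive
# subject-verb-object split is a placeholder; real parsing is far harder.
def naive_parse(sentence):
    words = sentence.split()
    return {"subject": words[0], "verb": words[1], "object": " ".join(words[2:])}

print(naive_parse("Elsie called the aunt"))
```

The sketch only illustrates the gap: the statistical view collapses structure that the compositional view preserves, which is exactly the information needed to decide who didn’t answer the phone.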
In fact, if you go back to the telephone example, many of the inferences we make depend on extensive background knowledge about communication and how phones work. But a deep learning model doesn’t have that kind of understanding; all it has is correlations learned from training examples. That’s why it has become common wisdom that deep learning models and neural networks are only as good as their training data.
This does not pose much of a problem for simple computer vision tasks such as classifying images and detecting objects. Likewise, in language, deep learning can do things such as answering questions whose answers are directly included in a corpus of text (e.g., who did Elsie try to reach?).
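To see why this kind of question is easy, consider a hedged sketch (my own illustration, not the authors’ code): when the answer appears verbatim in the text, surface pattern matching is enough, with no understanding of phones or aunts required.

```python
import re

text = "Elsie tried to reach her aunt on the phone, but she didn't answer."

# Surface pattern matching can "answer" a question whose answer appears
# verbatim in the text. No model of communication is involved.
def who_did_x_try_to_reach(name, text):
    match = re.search(rf"{name} tried to reach (\w+ \w+)", text)
    return match.group(1) if match else None

print(who_did_x_try_to_reach("Elsie", text))  # her aunt

# By contrast, "who didn't answer?" has no verbatim answer in the text:
# resolving "she" requires background knowledge, not string matching.
```

The contrast in the final comment is the book’s point: extraction succeeds where the answer is literally present, and fails exactly where inference begins.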
But when it comes to understanding hidden and implicit meanings, things that can’t be taught through sheer numbers of examples, deep learning starts to break down and manifest weird behavior. This is especially true of language, one of the most complex functions of the brain.
“Except for a few small sentences, almost every sentence you hear is original. You don’t have any data directly on it. And that means you have a problem that is about inference and understanding,” Marcus told me in September. “The techniques that are good for categorizing things, putting them into bins that you already know, simply aren’t appropriate for that. Understanding language is about connecting what you already know about the world with what other people are trying to do with the words they say.”
For the moment, most AI researchers are trying to fix their models’ errors by throwing bigger datasets and more compute at the problem. Their hope is that with more data, their AI will eventually be able to handle every possible corner and edge case.
But as Marcus and Davis explain in Rebooting AI, “Statistics are no substitute for real-world understanding. The problem is not just that there is a random error here and there, it is that there is a fundamental mismatch between the kind of statistical analysis that suffice for translation and the cognitive model construction that would be required if systems were to actually comprehend what they are trying to read.”
The home: AI’s challenges in dealing with open environments
Interestingly, another challenge that AI has struggled to solve is something most humans learn to handle at a very young age: navigating the home. In Rebooting AI, Marcus and Davis mention the example of Rosie the Robot Maid, featured in the cartoon The Jetsons. Rosie could perform a variety of tasks, including washing the dishes, vacuuming, dusting the shelves, doing the laundry, and more.
The Jetsons launched in the early 1960s, at the dawn of artificial intelligence. At the time, AI scientists genuinely believed they could crack the code of reproducing the human mind in a matter of months. This was before the first AI winter, the period in which overpromising and underdelivering dampened interest and funding in artificial intelligence. If AI was soon going to solve much more complicated problems, then keeping the home tidy would be a walk in the park.
Boy, were they wrong. Five decades later, AI scientists have solved complicated problems such as predicting breast cancer and synthesizing natural-sounding voices. But robo-butlers like Rosie are nowhere in sight.
“For now, the best-selling robot of all time isn’t a driverless car or some sort of primitive version of C-3PO, it’s Roomba, that vacuum-cleaning hockey puck of modest ambition, with no hands, no feet, and remarkably little brain… about as far from Rosie the Robot as we can possibly imagine,” Marcus and Davis observe in Rebooting AI.
As it happens, solving problems in open environments such as the home requires situational awareness and common sense, skills that remain unique to humans. The kinds of decisions we make around the home, subconsciously and without a second thought, are very hard for current AI technologies to replicate. To a degree, the same applies to roads, where self-driving cars still make stupid mistakes even after driving millions of miles.
Most robotics applications and projects use reinforcement learning, a branch of AI that revolves around maximizing reward. Reinforcement learning works well in closed worlds with limited goals, such as game environments. But as soon as the situation becomes complicated and the AI agent has to choose between many conflicting goals, current techniques start to break down.
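A minimal reinforcement-learning sketch shows why closed worlds are the easy case. The environment below (my own toy example, not one from the book) is a five-cell corridor with a single reward at the right end, solved with tabular Q-learning:

```python
import random

# Toy Q-learning in a closed world: a 5-cell corridor where the agent
# earns reward 1.0 for reaching the rightmost cell. One goal, fixed rules.
N, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N)]  # Q[state][action]; actions: 0=left, 1=right
alpha, gamma = 0.5, 0.9             # learning rate, discount factor

random.seed(0)
for _ in range(200):  # episodes of pure trial and error
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        a = random.randrange(2) if random.random() < 0.2 else Q[s].index(max(Q[s]))
        s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
        r = 1.0 if s2 == GOAL else 0.0
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The learned policy marches straight toward the goal in every state.
print(all(Q[s][1] > Q[s][0] for s in range(GOAL)))
```

Trial and error converges quickly here precisely because the world is closed and the goal is singular; the Q-table encodes nothing about why moving right is good, so adding a second, conflicting goal or changing the environment means relearning from scratch, which is the limitation Marcus and Davis describe.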
In Rebooting AI, Marcus and Davis mention many interesting examples of situations in the home that can affect our actions. Every decision we make involves plenty of parameters that we take for granted, and requires lots of background knowledge about the world and experience that we’ve picked up over our lives. Those are the kinds of things that simply can’t be represented in the statistics of neural networks or the pure trial-and-error process of reinforcement learning.
“Roboticists have done an excellent job of getting robots to figure out where they are, and a fairly good job of figuring how to get robots to perform individual behaviors,” Marcus and Davis write. “But the field has made much less progress in three other areas that are essential to coping in the open-ended world: assessing situations, predicting the probable future, and deciding, dynamically, as situations change, which of the many possible actions makes the most sense in a given environment.”
So what do books and homes have in common? They’re both open-ended problems, and as Marcus and Davis note in Rebooting AI, “In a truly open-ended world, there will never be enough data.”
The future of AI: where do we go from here?
In Rebooting AI, Marcus and Davis discuss possible paths toward creating robust artificial intelligence that can solve commonsense problems and doesn’t need millions of training examples. The book gives plenty of detail about how we can learn from the workings of the human brain to change the methods of AI research.
But perhaps one of the most fundamental problems with current AI is that it has become mostly focused on solving problems with bigger neural networks, bigger computers, and larger datasets. AI has become too much about mathematical representations and too little about building cognitive models.
“As long as the dominant approach is focused on narrow AI and bigger and bigger sets of data, the field may be stuck playing whack-a-mole indefinitely, finding short-term data patches for particular problems without ever really addressing the underlying flaws that make these problems so common,” Marcus and Davis write in Rebooting AI.
And without fundamental changes in the field, books and homes will remain off-limits to artificial intelligence. “Just as there can be no reading without rich cognitive models, there can be no safe, reliable domestic robots without rich cognitive models.”