ChatGPT seems to be everywhere. From in-depth reports in highly respected technology publications to gushing reviews in mainstream media, ChatGPT has been hailed as the next big thing in artificial intelligence, and with good reason.
As a developer resource, ChatGPT is simply outstanding, particularly when compared to searching existing resources such as Stack Overflow (which are undoubtedly included in GPT’s data model). Ask ChatGPT a software question and you get a summary of available web solutions and some sample code that can be displayed in the language you need. Not happy with the result? Get a refined answer with just a little added info as the system remembers the context of your previous queries. While the just-released GPT-4 offers some significant new features, its usefulness to a developer hasn’t changed much in my usage.
As a software asset, ChatGPT’s API can be used to give the illusion of intelligence to almost any interactive system. As opposed to typing questions into the web interface, ChatGPT also offers a free API key which enables a program to ask questions and process answers. The API also provides access to features that are not accessible via the web, including options like how long an answer is expected and how creative it should be.
But while ChatGPT has already attracted more than a hundred million users, drawn by its impressive capabilities, it is important to recognize that it only gives the illusion of understanding. In reality, ChatGPT is manipulating symbols and code samples which it has scoured from the web without any understanding of what those symbols and samples mean. If given clear, easy questions, ChatGPT will offer (usually) clear, accurate responses. If asked tricky questions or questions with false or negative premises, the results are far less predictable. ChatGPT can also provide plausible sounding, but incorrect answers and can often be excessively verbose.
So what’s wrong with that? To a developer, not much. Simply cut-and-paste the sample code, compile it, and you’ll know in a few seconds whether or not the answer works properly. This is a different situation than asking a health question, for example, where ChatGPT can report data from dubious sources without citing them, and it is time-consuming to double-check the results.
Further, the new GPT-4 system isn’t very good a working backwards from a desired solution to the steps needed to achieve it. In a programming context, we are often given an existing data set and a desired outcome and need to define the algorithm to get from one to the other. If such an algorithm already exists in GPT’s dataset, it will likely be found and modified to fit the needed capabilities. Great for a majority of instances. If a new algorithm is needed, though, GPT should not be expected to define one.
ChatGPT represents an incredibly powerful tool and a major advance in self-learning AI. It represents a step toward artificial general intelligence (AGI), the hypothetical (though many would argue inevitable) ability of an intelligent agent to understand or learn any intellectual task that a human can. But it makes only a pretense of actual understanding. It simply manipulates words and symbols. In fact, AI systems such as ChatGPT may be slowing the emergence of AGI due to their continued reliance on bigger and more sophisticated datasets and machine learning techniques to predict the next word or phrase in a sequence.
To make the leap from AI to AGI, researchers ultimately must shift their focus to a more biologically plausible system modeled on the human brain, with algorithms that enable it to build abstract “things” with limitless connections and context, rather than the vast arrays, training sets, and computer power today’s AI demands.
For AGI to emerge, it must have the capability to understand that physical objects exist in a physical world and words can be used to represent those objects, as well as various thoughts and concepts. Because concepts such as art and music, and even some physical objects (those for example which have tastes, smells, or textures) don’t easily lend themselves to being expressed in words, however, AGI must also contain multisensory inputs and an underlying data structure which will support the creation of relationships between multiple types of data.
Further, an internal mental model of the AGI’s environment with the AGI at its center is essential. Such a model will enable an artificial entity to have perspective and a point of view with respect to its surroundings that approximates the way in which humans see and interpret the world around them. After all, how could a system have a point of view if it never experienced one?
The AGI must also be able to perceive the passage of time, which will allow it to comprehend how each action it takes now will impact the outcomes it experiences in the future. This goes hand-in-hand with the ability to exhibit imagination. Without the ability to imagine, AGI will be incapable of considering the numerous potential actions it can take, evaluating the impact of each action, and ultimately choosing the option that appears to be most reasonable.
There are certainly other capabilities needed for AGI to emerge, but implementation of just these concepts will allow us to better understand what remains to be done for AGI to be realized. Moreover, none of these concepts are impossible to create. To get there, though, researchers need to abandon the current, widely used model of extending a text-based system like ChatGPT to handle multisensory information, a mental model, cause-and-effect, and the passage of time. Instead, they should start with a data structure and a set of algorithms and then utilize the vision, planning, and decision-making capabilities of an autonomous robot to extend these capabilities to ChatGPT’s text abilities.
Fortunately, a model for doing all these things already exists in an organ which weighs about 3.3 pounds and uses about 12 watts of energy—the human brain. While we know a lot about the brain’s structure, we don’t know what fraction of our DNA defines the brain or even how much DNA defines the structure of its neocortex, the part of the brain we use to think. If we presume that general intelligence is a direct outgrowth of the structure defined by our DNA and that structure could be defined by as little as one percent of that DNA, though, it is clear that the real problem in AGI emergence is not one that requires gigabytes to define, but really one of what to write as the fundamental AGI algorithms.
With that in mind, imagine what could happen if all of today’s AI systems were to be built on a common underlying data structure which would enable them and their algorithms to begin interacting with each other. Gradually, a broader context that can understand and learn would emerge. As these systems become more advanced, they would slowly begin to work together to create a more general intelligence that approaches the threshold for human-level intelligence, then equals it, then surpasses it. Perhaps only then will we humans begin to acknowledge that AGI has emerged. To get there, we simply need to change our approach.
Portions of this article are drawn from Microsoft Research’s just-published paper, “Sparks of Artificial General Intelligence: Early experiments with GPT-4″ By Sebastien Bubeck, et al, https://arxiv.org/pdf/2303.12712.pdf