What to (not) expect from OpenAI’s ChatGPT


This article is part of our coverage of the latest in AI research.

This week, OpenAI released ChatGPT, another fascinating large language model (LLM) based on its flagship GPT series. ChatGPT, which is available as a free demo at the time of this writing, is a model that has been specialized for conversational interactions.

As with most things regarding LLMs, the release of ChatGPT was followed by controversy. Within hours, the new language model became a Twitter sensation, with users posting screenshots of ChatGPT’s impressive achievements and disastrous failures.

However, when looked at from the broad perspective of large language models, ChatGPT is a reflection of the short but rich history of the field, representing how far we have come in just a few years and what fundamental problems remain to be solved.

The dream of unsupervised learning

Unsupervised learning remains one of the most sought-after goals of the AI community. There is a ton of valuable knowledge and information on the internet, but until recently, much of it was unusable for machine learning systems. Most interesting machine learning and deep learning applications are supervised, meaning humans have to take a bunch of data samples and annotate each one to train ML systems.

This changed with the advent of the transformer architecture, the key component of large language models. Transformers can be trained on a large corpus of unlabeled text. Depending on the model, they either randomly mask parts of the text and try to predict the missing pieces or, as in the GPT series, predict the next token from the text that precedes it. By doing this over and over, the transformer tunes its parameters to represent the relations between different words in large sequences.
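To see why no manual labeling is needed, it helps to look at how training examples can be cut out of raw text. The following is a minimal sketch, not any particular model's pipeline: the function names, the `[MASK]` placeholder, and the whitespace tokenization are assumptions made for illustration, and the masked positions are fixed here only to keep the example reproducible (in practice they are chosen at random). It shows both the masking objective and the next-token objective used by GPT-style models.

```python
MASK = "[MASK]"  # placeholder token, chosen for illustration


def masked_example(tokens, positions):
    """Hide the tokens at `positions`; the hidden tokens become the labels."""
    hidden = set(positions)
    inputs = [MASK if i in hidden else tok for i, tok in enumerate(tokens)]
    labels = {i: tokens[i] for i in sorted(hidden)}
    return inputs, labels


def next_token_examples(tokens):
    """Pair every non-empty prefix with the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Whitespace splitting stands in for a real tokenizer.
tokens = "the cat sat on the mat".split()

inputs, labels = masked_example(tokens, positions=[1, 5])
print(inputs)
print(labels)

for context, target in next_token_examples(tokens)[:3]:
    print(context, "->", target)
```

In both cases the "annotation" is just a piece of the original text that was held back, which is what lets the training corpus grow without human labelers.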

This has proven to be a very effective and scalable strategy. Without the need for manual labeling, you can collect very large training corpora, which in turn allow you to create and train larger and larger transformer models. Studies and experiments show that as transformers and LLMs grow larger, they can generate longer sequences of coherent text. LLMs also show emergent abilities at scale.

A return to supervised learning?


LLMs’ window to the world is only text, which means they lack the rich and multi-sensory experience of the humans they try to emulate. Despite their impressive results, LLMs such as GPT-3 suffer from fundamental flaws that make them unpredictable in tasks that require common sense, logic, planning, reasoning, and other knowledge that is often omitted in text. LLMs are notorious for hallucinating responses, generating text that is coherent but factually false, and often misinterpreting the obvious intent of the user’s prompt.

By increasing the size of the model and its training corpus, scientists have been able to reduce the frequency of blatant mistakes in large language models. But the fundamental problems don’t go away, and even the largest LLMs still make stupid mistakes with a little push.

If LLMs were only used in scientific research labs to track performance on benchmarks, this might not have been a big problem. However, with growing interest in using LLMs in real-world applications, addressing these and other issues has become more crucial. Engineers must make sure their machine learning models remain robust under different conditions and meet the needs and demands of their users.

To address this problem, OpenAI used reinforcement learning from human feedback (RLHF), a technique it had previously developed to optimize RL models. Instead of leaving an RL model to explore its environment and actions at random, RLHF uses occasional feedback from human supervisors to steer the agent in the right direction. The benefit of RLHF is that it can improve the training of RL agents with minimal human feedback.

OpenAI later applied RLHF to InstructGPT, a line of LLMs that are designed to better understand and respond to instructions in user prompts. InstructGPT was a GPT-3 model fine-tuned with human feedback.

There is obviously a tradeoff here. Manual annotation can become a bottleneck in the scalable training process. But by finding the right balance between unsupervised and supervised learning, OpenAI was able to gain important benefits, including better response to instructions, reduction of harmful output, and resource optimization. According to OpenAI’s findings, a 1.3-billion-parameter InstructGPT often outperforms a 175-billion-parameter GPT-3 model in instruction-following.

ChatGPT training process (source: OpenAI)

ChatGPT builds on the experience gained from the InstructGPT model. Human annotators create a set of sample conversations that include both the user prompt and the model response. This data is used to fine-tune the GPT-3.5 model on which ChatGPT is built. In the next step, the fine-tuned model is given new prompts, for which it generates several responses. A human labeler ranks these responses. The data generated from these interactions is then used to train a reward model, which helps further fine-tune the LLM in a reinforcement learning pipeline.
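The ranking step can be made concrete with a small sketch. A common way to train a reward model on human rankings is to split each ranking into pairs and penalize the model whenever it scores the rejected response above the preferred one. This is an assumption about the general technique, not a description of OpenAI's exact recipe; the function names and the toy reward values are hypothetical.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs."""
    # combinations() preserves input order, so the first element of each
    # pair is always the one the labeler ranked higher.
    return list(combinations(ranked_responses, 2))


def pairwise_loss(reward_preferred, reward_rejected):
    """Compute -log(sigmoid(r_preferred - r_rejected)) for one comparison."""
    diff = reward_preferred - reward_rejected
    # -log(sigmoid(diff)) is algebraically equal to log(1 + exp(-diff)).
    return math.log1p(math.exp(-diff))


# A labeler ranked four sampled responses from best (A) to worst (D).
pairs = ranking_to_pairs(["A", "B", "C", "D"])
print(len(pairs), "comparisons:", pairs)

# The loss shrinks as the reward model agrees more strongly with the labeler.
print(pairwise_loss(2.0, 0.0))  # preferred response scored higher
print(pairwise_loss(0.0, 0.0))  # no preference learned yet
print(pairwise_loss(0.0, 2.0))  # reward model disagrees with the labeler
```

One ranking of four responses yields six comparisons, which is part of why ranking can be a relatively efficient use of labeler time compared with writing full demonstrations.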

OpenAI has not yet disclosed the full details of the reinforcement learning process. I’m interested to know the “unscalable costs” of the process, i.e., how much human effort was required.

How far can you trust ChatGPT?

The results of ChatGPT are nothing short of impressive. The model has accomplished all kinds of tasks, including providing feedback on code, writing poetry, explaining technical concepts in different tones, generating prompts for generative AI models, and going on philosophical rants.

However, the model is also prone to the kinds of errors that similar LLMs have made, such as making references to non-existing papers and books, misinterpreting intuitive physics, and failing at compositionality.

I’m not surprised by the failures. ChatGPT is not doing magic and is bound to suffer from the same problems as its predecessors. The question I’m interested in, however, is where and how far you can trust it in real-world applications. Obviously, there is something of value in there, and as we have seen with Codex and GitHub Copilot, LLMs can be put to very productive use.

Here, I think what will determine the usefulness of ChatGPT is the kinds of tools and guardrails that are implemented alongside it. For example, ChatGPT might become a very good platform for creating company chatbots, such as a digital companion for coding and graphic design. For one thing, if it follows the example of InstructGPT, it should be able to match the performance of complex models with far fewer parameters, which will make it cost-effective. Also, if OpenAI provides the tools to enable organizations to implement their own RLHF fine-tuning, it can be further optimized for specific applications, which in most cases is more useful than a chatbot that can ramble about anything and everything. Finally, if app developers are provided with tools to integrate ChatGPT with application context and map its inputs and outputs to specific application events and actions, they will be able to put in the right guardrails to prevent the model from taking erratic actions.
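As a rough illustration of that last point, here is one way an application could map model output to a fixed set of actions. Everything in it is hypothetical: the action names, the JSON output format, and the fallback behavior are assumptions made for this sketch and do not reflect any tool OpenAI provides.

```python
import json

# Hypothetical allow-list for a coding-assistant app.
ALLOWED_ACTIONS = {"open_file", "search_docs", "explain_code"}

# What the app does when the model's output cannot be trusted.
FALLBACK = ("ask_user_to_rephrase", "")


def to_app_action(model_output):
    """Return a validated (action, argument) pair, or the safe fallback."""
    try:
        parsed = json.loads(model_output)
    except json.JSONDecodeError:
        return FALLBACK  # free-form text is never executed
    if not isinstance(parsed, dict):
        return FALLBACK
    action = parsed.get("action")
    argument = parsed.get("argument", "")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return FALLBACK  # unknown or malformed action
    if not isinstance(argument, str):
        return FALLBACK
    return (action, argument)


print(to_app_action('{"action": "search_docs", "argument": "RLHF"}'))
print(to_app_action('{"action": "delete_everything"}'))
print(to_app_action("Sure! I will delete the repository for you."))
```

The design choice here is that the model only proposes an action, while the application decides whether that action exists and is permitted, so an erratic response degrades into a harmless clarification request.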

Basically, OpenAI has created a powerful AI tool that has distinct flaws. It now needs to create the right ecosystem of development tools to make sure product teams can harness the power of ChatGPT. GPT-3 opened the way for many unpredictable applications. It will be interesting to see what ChatGPT has in store.


  1. Great article. In my opinion, LLMs should be prohibited by law. Their potential for serious harm outweighs their utility. They will fill the internet with misinformation that looks legit. Worse, future LLMs will be trained on texts scoured from the internet, much of which will have been generated by LLMs, a dangerous feedback loop.

    • This is where humans come into the feedback loop, even if LLMs generate data onto the internet. Making things illegal isn’t the solution to a math problem.
