Why you should be concerned about ChatGPT’s weird behavior

[Image: a confused robot, generated with Bing Image Creator]

On February 20, OpenAI confirmed reports that ChatGPT was providing “unexpected responses” and that the company was looking into it. The statement came after users started posting examples of the large language model (LLM) showing weird behavior. 

The problem has since been fixed, but concerns remain about what to do if it happens again. It is one thing to see ChatGPT churn out gibberish like “the cogs en la tecla might get a bit whimsical.” It’s another to build a critical application on top of the LLM only to see it malfunction for reasons you don’t know.

Cloud outages are nothing new. Web servers become targets of DDoS attacks. API services become unresponsive. Even AWS, the king of cloud infrastructure, suffers the occasional outage. But the problem with the kind of mishap ChatGPT suffered is that we don’t know exactly what happened. And OpenAI’s reluctance to provide details doesn’t help.

How opacity hurts

The lack of transparency makes it hard to understand the reason behind ChatGPT’s weird behavior, and even harder to prepare for similar events in the future. The service was up, which means the compute infrastructure (or at least part of it) was functional. But we don’t know what components run behind the scenes, so one of them may have malfunctioned and caused the model to misbehave.

OpenAI is also not transparent about the architecture of ChatGPT. Unofficial reports indicate that ChatGPT is a mixture-of-experts (MoE) system, which means it is one large machine learning model composed of several smaller models, each specializing in different tasks.

Depending on the input it receives, it uses one or more of those expert models to generate a response. Maybe one or several of these experts stopped working? Again, we don’t know.
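
To make the idea concrete, here is a minimal sketch of how an MoE layer can route an input to a few specialized experts and mix their outputs. The dimensions, the top-k gating scheme, and every name below are illustrative assumptions, not OpenAI’s implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions -- not ChatGPT's actual configuration.
D_MODEL, N_EXPERTS, TOP_K = 16, 4, 2

# Each "expert" is a small specialized block; here, a single weight matrix.
experts = [rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
# The router scores every expert for a given input vector.
router_w = rng.normal(size=(D_MODEL, N_EXPERTS))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route an input vector to its top-k experts and mix their outputs."""
    logits = x @ router_w              # one score per expert
    top = np.argsort(logits)[-TOP_K:]  # indices of the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the selected experts only
    # Weighted sum of the selected experts' outputs; a broken expert here
    # would corrupt every response routed through it.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D_MODEL)
print(moe_layer(token).shape)  # (16,) -- same shape as the input
```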

Finally, large deep learning models, especially LLMs, are big black boxes that are still being studied. Many things about their behavior and their success (or lack thereof) on specific tasks remain unknown. Moreover, the behavior of an LLM can change dramatically when it is retrained: it might give wrong answers to questions that it answered correctly before. We don’t know when and how often the LLMs used in ChatGPT are updated and retrained. Maybe the unusual behavior was caused by some experiment on the model? Again, we don’t know.
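
One way to notice such drift is to periodically re-run prompts with known-good answers against the hosted model. The sketch below assumes a hypothetical call_llm helper and a tiny golden set; it illustrates the idea, not a complete evaluation harness.

```python
# A minimal drift check: re-ask questions with known answers whenever the
# upstream model may have changed. call_llm and GOLDEN_SET are hypothetical.

GOLDEN_SET = [
    ("What is the capital of France?", "paris"),
    ("How many days are in a week?", "seven"),
]

def call_llm(prompt: str) -> str:
    """Placeholder for a request to the hosted model's API."""
    raise NotImplementedError  # wire up your provider's SDK here

def detect_drift() -> list[str]:
    """Return the prompts whose answers no longer contain the expected text."""
    failures = []
    for prompt, expected in GOLDEN_SET:
        answer = call_llm(prompt).lower()
        if expected not in answer:
            failures.append(prompt)
    return failures
```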

What does it mean for LLM applications?

People who use the ChatGPT application directly can immediately spot the problem when the LLM goes off the rails. The real threat is to applications that use ChatGPT or a similar proprietary model as part of a workflow or pipeline. The service is not down, which means the application will continue to work without raising alarms. But the model is not behaving as it should, which means the problems it causes can propagate to other components of the application. And if you have an application that uses several LLM agents built on ChatGPT, the behavior can become even more unpredictable and potentially destructive.
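
This is why a pipeline stage that calls a hosted LLM is worth validating explicitly: the API may return a successful response while the content is unusable. Here is a minimal sketch of that idea; call_llm and the checks in looks_sane are hypothetical placeholders that a real application would tailor to its expected output format.

```python
import json

def call_llm(prompt: str) -> str:
    """Placeholder for an API call to ChatGPT or a similar hosted model."""
    raise NotImplementedError  # hypothetical -- use your provider's SDK here

def looks_sane(raw: str) -> bool:
    """Cheap sanity checks on a response the API delivered successfully."""
    if not raw.strip():
        return False
    try:
        payload = json.loads(raw)   # this pipeline expects structured JSON
    except json.JSONDecodeError:
        return False
    return "summary" in payload     # field required by the next stage

def pipeline_step(prompt: str) -> dict:
    raw = call_llm(prompt)
    if not looks_sane(raw):
        # Fail loudly instead of silently passing gibberish downstream.
        raise ValueError("LLM output failed validation; halting pipeline")
    return json.loads(raw)
```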

All of this is a reminder that we still have much to learn about the risks of building applications on top of LLMs, especially closed-source proprietary models like ChatGPT and GPT-4. We are facing a new class of failures that could turn into security threats if not addressed properly. As LLMs take on bigger roles in critical applications, the industry needs to take measures to protect against such threats.

How to protect LLM applications against model failures
