
The Guardian’s GPT-3-written article misleads readers about AI. Here’s why.

September 14, 2020

An article allegedly written by OpenAI’s GPT-3 in The Guardian misleads readers about advances in artificial intelligence

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI.

Last week, The Guardian ran an op-ed that made a lot of noise. Titled “A robot wrote this entire article. Are you scared yet, human?”, the article was allegedly written by GPT-3, OpenAI’s massive language model, which has itself drawn a lot of attention in the past month.

Predictably, an article written by an artificial intelligence algorithm and aimed at convincing us humans that robots come in peace was bound to create a lot of hype. And that’s exactly what happened. Social media networks went abuzz with panic posts about AI writing better than humans, robots tricking us into trusting them, and other apocalyptic predictions. According to The Guardian’s page, the article was shared over 58,000 times as of this writing, which means it has probably been viewed hundreds of thousands of times.

But after reading through the article and the postscript, where The Guardian’s editorial staff explain how GPT-3 “wrote” the piece, I didn’t even find the discussion about robots and humans relevant.

The key takeaway, however, was that mainstream media is still very bad at presenting advances in AI, and that opportunistic human beings are very clever at turning socially sensitive issues into money-making opportunities. The Guardian probably made a good deal of cash out of this article, a lot more than they spent on editing the AI-generated text.

And they misled a lot of readers.

GPT-3, what are you?

The first thing to understand before even going into the content of the article is what GPT-3 is. Here’s how The Guardian defined it in the postscript: “GPT-3 is a cutting edge language model that uses machine learning to produce human like text. It takes in a prompt, and attempts to complete it.”

That is basically correct. But there are a few holes. What do they mean by “human like text”? In all fairness, GPT-3 is a manifestation of how far advances in natural language processing have come.

One of the key challenges in artificial intelligence language generators is maintaining coherence over long spans of text. GPT-3’s predecessors, including OpenAI’s GPT-2, started to make illogical references and lost consistency after a few sentences. GPT-3 surpasses everything we’ve seen so far, and in many cases remains on-topic over several paragraphs of text.

But fundamentally, GPT-3 doesn’t bring anything new to the table. It is a deep learning model composed of a very large transformer, a type of artificial neural network that is especially good at processing and generating sequences.

Neural networks come in many different flavors, but at their core, they are all mathematical engines that try to find statistical representations in data.

When you train a deep learning model, it tunes the parameters of its neural network to capture the recurring patterns within the training examples. After that, you provide it with an input, and it tries to make a prediction. This prediction can be a class (e.g., whether an image contains a cat, dog, or shark), a single value (e.g., the price of a house), or a sequence (e.g., the letters and words that complete a prompt).
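To make the "sequence" case concrete, here is a minimal sketch of a prediction machine that completes a prompt. It is not a neural network and has nothing to do with how GPT-3 is actually implemented: it is a toy bigram counter, and the corpus, the function names `train_bigram` and `complete`, and the greedy "pick the most frequent next word" rule are all assumptions made for illustration. What it does share with the description above is the basic loop: capture recurring patterns in training text, then use them to predict what comes next.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict:
    """Count which word follows which; these counts are the toy model's 'parameters'."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def complete(counts: dict, prompt: str, max_words: int = 5) -> str:
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        candidates = counts.get(words[-1])
        if not candidates:
            break  # the last word never appeared mid-corpus, so nothing to predict
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical training text, invented for this example.
corpus = "robots come in peace . robots come in peace . humans come in fear ."
model = train_bigram(corpus)

print(complete(model, "robots", max_words=3))
print(complete(model, "humans", max_words=3))
```

With this corpus, the second prompt is completed as "humans come in peace", a sentence that never appears in the training text. The toy model has no idea what peace or fear is; it only knows that "peace" follows "in" more often than "fear" does.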

Neural networks are usually measured in the number of layers and parameters they contain. GPT-3 is composed of 175 billion parameters, more than two orders of magnitude larger than GPT-2, whose largest version has 1.5 billion. It was also trained on 450 gigabytes of text, at least ten times that of its smaller predecessor. And experience has so far shown that increasing the size of neural networks and their training datasets tends to improve their performance incrementally.
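A quick back-of-the-envelope check of the parameter comparison can be sketched as follows. The 175 billion figure comes from the paragraph above; the 1.5 billion figure for the largest published GPT-2 model is not stated in this article and is brought in here as an outside assumption.

```python
import math

GPT3_PARAMS = 175e9  # from the article
GPT2_PARAMS = 1.5e9  # assumption: largest published GPT-2 model, not stated in the article

ratio = GPT3_PARAMS / GPT2_PARAMS
orders_of_magnitude = math.log10(ratio)

print(f"GPT-3 has about {ratio:.0f}x the parameters of GPT-2")
print(f"That is about {orders_of_magnitude:.1f} orders of magnitude")
```

Under that assumption, the gap works out to a bit more than a hundredfold, or slightly over two orders of magnitude.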

This is why GPT-3 is so good at churning out coherent text. But does it really understand what it is saying, or is it just a prediction machine that is finding clever ways to stitch together text it has previously seen during its training? Evidence shows that it is more likely to be the latter.

Does GPT-3 understand what it says?


The GPT-3 op-ed argued that humans should not fear robots, that AI comes in peace, that it has no intention to destroy humanity, and so on. Here’s an excerpt from the article:

“For starters, I have no desire to wipe out humans. In fact, I do not have the slightest interest in harming you in any way. Eradicating humanity seems like a rather useless endeavor to me.”

This suggests that GPT-3 knows what it means to “wipe out,” “eradicate,” and at the very least “harm” humans. It should know about life and health constraints, survival, limited resources, and much more.

But a series of experiments by Gary Marcus, cognitive scientist and AI researcher, and Ernest Davis, computer science professor at New York University, shows that GPT-3 can’t make sense of the basics of how the world works, let alone understand what it means to wipe out humanity. It thinks that drinking grape juice will kill you, you need to saw off a door to get a table inside a room, and if your clothes are at the dry cleaner, you have a lot of clothes.

“All GPT-3 really has is a tunnel-vision understanding of how words relate to one another; it does not, from all those words, ever infer anything about the blooming, buzzing world,” Marcus and Davis write. “It learns correlations between words, and nothing more.”

Shame on @guardian for cherry-picking, thereby misleading naive readers into thinking that #GPT3 is more coherent than it actually is.

Will you be making available the raw output, that you edited? https://t.co/xhy7fYTL0o
— Gary Marcus (@GaryMarcus) September 8, 2020

As you delve deeper into The Guardian’s GPT-3-written article, you’ll find many references to more abstract concepts that require rich understanding of life and society, such as “serving humans,” being “powerful” and “evil,” and much more. How does an AI that thinks you should wear a bathing suit to court think it can serve humans in any meaningful way?

GPT-3 also talks about feedback on its previous articles and frustration about its previous op-eds having been killed by publications. These would all appear impressive to someone who doesn’t know how today’s narrow AI works. But the reality is, like DeepMind’s AlphaGo, GPT-3 neither enjoys nor appreciates feedback from readers and editors, at least not in the way humans do.

Even if GPT-3 had singlehandedly written this entire article (we’ll get to this in a bit), it can at most be considered a good word spinner, a machine that rehashes what it has seen before in an amusing way. It shows the impressive feats large deep learning models can perform, but it’s not even close to what we would expect from an AI that understands language.

Two points: (1) it isn’t very good, and (2) that’s *after* editing by professionals. GPT3 is genuinely impressive but not general AI, and any meaning associated with it is *attributed* to it by *us*. https://t.co/AfNG1HpZ8q
— Michael Wooldridge (@wooldridgemike) September 8, 2020

Did GPT-3 write The Guardian’s article?

In the postscript of the article, The Guardian’s staff explain that to write the article, they gave GPT-3 a prompt and an intro and told it to generate a 500-word op-ed. They ran the query eight times and used the AI’s output to put together the complete article, which is a little over 1,100 words.

“The Guardian could have just run one of the essays in its entirety. However, we chose instead to pick the best parts of each, in order to capture the different styles and registers of the AI,” The Guardian’s staff write, after which they add, “Editing GPT-3’s op-ed was no different to editing a human op-ed. We cut lines and paragraphs, and rearranged the order of them in some places. Overall, it took less time to edit than many human op-eds.”

In other words, they cherry-picked their article from 4,000 words’ worth of AI output. That, in my opinion, is very questionable. I’ve worked with many publications, and none of them have ever asked me to submit eight different versions of my article and let them choose the best parts. They just reject it.
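The arithmetic behind that 4,000-word figure, and how much of it made the cut, can be sketched as below. The run count, requested length, and published length are taken from the postscript as quoted above; the assumption that each run actually produced about 500 words is mine, since the raw outputs were not published.

```python
RUNS = 8                # number of times the query was run, per the postscript
WORDS_PER_RUN = 500     # requested op-ed length; assumes each run hit that target
PUBLISHED_WORDS = 1100  # "a little over 1,100 words", rounded down

generated_words = RUNS * WORDS_PER_RUN
kept_fraction = PUBLISHED_WORDS / generated_words

print(f"Generated: about {generated_words} words")
print(f"Published: about {kept_fraction:.1%} of the generated text")
```

Under these assumptions, a little over a quarter of what GPT-3 generated made it into print, and human editors chose which quarter.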

But I nonetheless find the entire process amusing. Someone at The Guardian came up with an idea that would get a lot of impressions and generate a lot of ad revenue. Then, a human came up with a super-clickbait title and an awe-inspiring intro. Finally, the staff used GPT-3 like an advanced search engine to generate some text from its corpus, and the editor(s) used the output to put together an article that would create discussion across social media.

In terms of educating the public about advances in artificial intelligence, The Guardian’s article has zero value. But it perfectly shows how humans and AI can team up to create entertaining and moneymaking BS.

This @guardian #GPT3 article is an absolute joke. It would have been actually interesting to see the 8 essays the system actually produced, but editing and splicing them like this does nothing but contribute to hype and misinform people who aren't going to read the fine print https://t.co/Mt6AaR3HJ9
— Daniel Leufer (@djleufer) September 8, 2020
