The GPT-3 economy

This article is part of our series that explores the business of artificial intelligence.

Since its release, GPT-3, OpenAI’s massive language model, has been the topic of much discussion among developers, researchers, entrepreneurs, and journalists. Most of those discussions have focused on the capabilities of the AI-powered text generator. Users have been publishing the results of interesting experiments, using the AI to generate anything and everything from articles to website code.

But much about GPT-3 remains obscure. The company has opted to commercialize the deep learning model instead of making it freely available to the public. And though the AI has shown itself to be capable of many interesting feats, it’s not yet clear whether GPT-3 will become a real product or join the endless array of abandoned projects that never found a viable business model.

Earlier this month, OpenAI announced the initial pricing plan for GPT-3, as reported by users who have access to the beta version of the language model. While this new piece of information is not as sensational as some of the other news surrounding AI, it is nonetheless very important for weighing the future of GPT-3 and OpenAI.

This pricing plan will enable us to better assess what it would take for OpenAI to turn GPT-3 into a profitable business, and what kind of organizations might be able to benefit from the AI. GPT-3 is the first AI model of its kind, and much of what I will discuss is speculation because there’s still a lot we don’t know about the hidden costs of running a business on top of the huge deep learning algorithm. But it’s good to have some guidelines to trace the progress of the GPT-3 economy in the coming months.

Commercial artificial intelligence

Ideally, OpenAI would have made GPT-3 available to the public. But we live in the era of commercial AI, and AI labs like OpenAI rely on the deep pockets of wealthy tech companies and VC firms to fund their research. This puts them under pressure to create profitable businesses that can generate a return on investment and secure future funding.

In 2019, OpenAI switched from a non-profit organization to a for-profit company to cover the costs of their long marathon toward artificial general intelligence (AGI). Shortly after, Microsoft invested $1 billion in the company. In a blog post that announced the investment, OpenAI declared they would be commercializing some of their pre-AGI technologies.

So, it was not much of a surprise when, in June, the company announced that it would not release the architecture and pretrained model for GPT-3, but would instead make it available through a commercial API. Beta testers vetted and approved by OpenAI got free early access to GPT-3. But starting in October, the pricing plan will come into effect.

In the blog post where it announced the GPT-3 API, OpenAI stated three key reasons for not open-sourcing the deep learning model. The first was, obviously, to cover the costs of their ongoing research. Second, but equally important, is that running GPT-3 requires vast compute resources that many companies don’t have. Third (which I won’t get into in this post) is to prevent misuse and harmful applications.

Based on this information, we know that to make GPT-3 profitable, OpenAI will need to break even on the costs of research and development, and also find a business model that turns a profit on top of the expenses of running the model.

The costs of training GPT-3

It’s hard to estimate the cost of developing GPT-3 without transparency into the process. But we know one thing: Training large neural networks can be very costly.

GPT-3 is a very large Transformer model, a neural network architecture that is especially good at processing and generating sequential data. It is composed of 96 layers and 175 billion parameters, making it the largest language model yet. To put that in perspective, Microsoft’s Turing-NLG, the previous record-holder, had 17 billion parameters, and GPT-3’s predecessor, GPT-2, was 1.5 billion parameters strong.

Lambda Labs calculated the computing power required to train GPT-3 based on projections from GPT-2. According to the estimate, training the 175-billion-parameter neural network requires 3.114E23 FLOPs (floating-point operations), which would theoretically take 355 years on a V100 GPU server with 28 TFLOPS capacity and would cost $4.6 million at $1.50 per hour.
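
To see where those figures come from, here is a rough back-of-the-envelope version of that estimate in Python. The throughput and hourly price are the assumptions quoted above, not measured numbers.

    # Rough reproduction of Lambda Labs' estimate, using the figures quoted above
    TOTAL_FLOPS = 3.114e23      # estimated floating-point operations to train GPT-3
    V100_FLOPS = 28e12          # assumed sustained throughput of a V100 server (28 TFLOPS)
    PRICE_PER_HOUR = 1.50       # assumed hourly price of that server, in dollars

    seconds = TOTAL_FLOPS / V100_FLOPS
    years = seconds / (3600 * 24 * 365)
    cost = seconds / 3600 * PRICE_PER_HOUR

    print(f"{years:,.0f} years on a single V100 server")  # ~350 years
    print(f"${cost:,.0f} at $1.50 per hour")              # ~$4.6 million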

“Our calculation with a V100 GPU is extremely simplified,” Chuan Li, Lambda Labs’ Chief Science Officer, told me. “In practice you can’t train GPT-3 on a single GPU, but with a distributed system with many GPUs, like the one OpenAI used.”

Adding parallel graphics processors will cut down the time it takes to train the deep learning model. But the scaling is not perfect, and the device-to-device communication between the GPUs will add extra overhead. “So, in practice, it will take more than $4.6 million to finish the training cycle,” Li said.

It is worth noting that specialized hardware, such as the supercomputer Microsoft built in collaboration with OpenAI, might prove to be more cost-efficient than parallel V100 clusters. But we don’t know the details.

Equally important is the process of reaching the right configuration for GPT-3. Training the final deep learning model is just one of several steps in the development of GPT-3. Before that, the AI researchers had to gradually increase layers and parameters, and fiddle with the many hyperparameters of the language model until they reached the right configuration. That trial-and-error gets more and more expensive as the neural network grows. We can’t know the exact cost of the research without more information from OpenAI, but one expert estimated it to be somewhere between 1.5 and five times the cost of training the final model.

This would put the cost of research and development between $11.5 million and $27.6 million, plus the overhead of parallel GPUs.
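
As a quick sanity check, here is the same arithmetic in code, assuming the 1.5x-to-5x research multiplier mentioned above.

    TRAINING_COST = 4.6e6   # estimated cost of the final training run, from above

    # Total R&D = final training run + research experiments (assumed 1.5x to 5x of training)
    low = TRAINING_COST * (1 + 1.5)
    high = TRAINING_COST * (1 + 5)
    print(f"${low / 1e6:.1f}M to ${high / 1e6:.1f}M")  # $11.5M to $27.6M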

Note that in the 75-page GPT-3 whitepaper published in May, OpenAI introduced eight variants of the language model, including a 125-million-parameter “GPT-3 Small.” The costs of researching and developing those models are not included in this calculation. I also haven’t taken into account the stellar salaries OpenAI has to pay the highly coveted AI talent it has hired for the task.

In the GPT-3 whitepaper, OpenAI introduced eight different versions of the language model.

The cost of running GPT-3

Many research labs provide pre-trained versions of their models to save developers the pain and cost of training the neural networks. Developers then only need a server or device that can load and run the model, which is much less compute-intensive than training it from scratch.

But in the case of GPT-3, the sheer size of the neural network makes it very difficult to run. According to OpenAI’s whitepaper, GPT-3 uses half-precision floating-point variables at 16 bits per parameter. This means the model would require at least 350 GB of VRAM just to load its weights and run inference at a decent speed.

This is the equivalent of at least 11 Tesla V100 GPUs with 32 GB of memory each. At approximately $9,000 apiece, this would bring the cost of the GPU cluster to at least $99,000, plus several thousand dollars more for RAM, CPUs, SSD drives, and power supply. A good baseline would be Nvidia’s DGX-1 server, which is specialized for deep learning training and inference. At around $130,000, the DGX-1 is short on VRAM (8×16 GB), but has all the other components for solid performance on GPT-3.
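
The memory and hardware figures above follow from a few lines of arithmetic; the per-GPU price is the approximate list price cited above, not a quote.

    import math

    PARAMS = 175e9           # GPT-3 parameters
    BYTES_PER_PARAM = 2      # FP16 (half precision), per the whitepaper
    V100_VRAM_GB = 32        # Tesla V100 32 GB variant
    V100_PRICE = 9_000       # approximate price per GPU, in dollars

    weights_gb = PARAMS * BYTES_PER_PARAM / 1e9          # 350 GB just for the weights
    gpus_needed = math.ceil(weights_gb / V100_VRAM_GB)   # 11 GPUs
    cluster_cost = gpus_needed * V100_PRICE              # $99,000 in GPUs alone

    print(weights_gb, gpus_needed, cluster_cost)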

Lambda Labs’ Li told me that the memory requirements of running the AI model are not a function of its parameters alone. “We don’t have the numbers for GPT-3, but we can use GPT-2 as a reference. A 345M-parameter GPT-2 model only needs around 1.38 GB to store its weights in FP32. But running inference with it in TensorFlow requires 4.5 GB of VRAM. Similarly, a 774M GPT-2 model only needs 3.09 GB to store weights, but 8.5 GB of VRAM to run inference,” he said. This would possibly put GPT-3’s VRAM requirements north of 400 GB.
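
For what it’s worth, the GPT-2 numbers Li cites imply that inference needs roughly 2.7 to 3.3 times the memory of the FP32 weights alone; how that overhead scales to a 175-billion-parameter FP16 model is speculative, which is why the 400 GB figure is only a rough extrapolation.

    # GPT-2 figures quoted above: (FP32 weights in GB, VRAM needed for inference in GB)
    gpt2 = {"345M": (1.38, 4.5), "774M": (3.09, 8.5)}
    for name, (weights, inference) in gpt2.items():
        print(name, round(inference / weights, 2))  # 345M: ~3.26x, 774M: ~2.75x overhead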

Based on what we know, it would be safe to say the hardware costs of running GPT-3 would be between $100,000 and $150,000 without factoring in other costs (electricity, cooling, backup, etc.).

Alternatively, if run in the cloud, GPT-3 would require something like Amazon’s p3dn.24xlarge instance, which comes packed with 8× Tesla V100 GPUs (32 GB each), 768 GB of RAM, and 96 CPU cores, and costs $10-30 per hour depending on your plan. That would put the yearly cost of running the model at a minimum of $87,000.
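
That annual figure is simply the lower bound of the quoted hourly rate running around the clock:

    HOURLY_RATE = 10.0        # lower bound of the quoted p3dn.24xlarge price, in dollars
    HOURS_PER_YEAR = 24 * 365

    annual_cost = HOURLY_RATE * HOURS_PER_YEAR
    print(f"${annual_cost:,.0f} per year")  # ~$87,600 for an always-on instance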

Again, OpenAI might be getting a much better deal from Microsoft, given their partnership.

The GPT-3 business model?

Until now, we’ve been speculating on the development and running costs of GPT-3. Now let’s see how it can be turned into a profitable business.

In general, machine learning algorithms can perform a single, narrowly defined task. This is especially true for natural language processing, which is much more complicated than other fields of artificial intelligence. To repurpose a machine learning model for a new task, you must retrain it from scratch or fine-tune it with new examples, a process known as transfer learning.

But unlike many other machine learning models, GPT-3 is capable of zero-shot learning, which means it can perform many new tasks without the need for new training. For many other tasks, it can perform one-shot learning: give it a single example and it can generalize to similar cases. Theoretically, this makes it ideal as a general-purpose AI technology that can support many new applications.

Now, let’s see how much OpenAI will be charging customers and who could afford the pricing plan. According to what we know, GPT-3 has a free “Explore” plan, which gives you 100,000 free tokens, and two paid plans:

  1. Create: $100 per month, 2 million tokens + 8 cents per additional 1,000 tokens
  2. Build: $400 per month, 10 million tokens + 6 cents per additional 1,000 tokens

OpenAI also provides a “Scale” tier for those who want customized pricing plans.
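
Based on the reported tiers, a simple helper can estimate what a given monthly token volume would cost. The tier parameters are as reported above; the function itself is just my own illustration, not an official OpenAI calculator.

    def monthly_cost(tokens: int, plan: str = "create") -> float:
        """Estimate the monthly bill for a token volume under the reported paid tiers."""
        plans = {
            "create": {"base": 100, "included": 2_000_000, "per_1k_extra": 0.08},
            "build": {"base": 400, "included": 10_000_000, "per_1k_extra": 0.06},
        }
        p = plans[plan]
        extra_tokens = max(0, tokens - p["included"])
        return p["base"] + extra_tokens / 1_000 * p["per_1k_extra"]

    # Example: a service that consumes 5 million tokens a month
    print(monthly_cost(5_000_000, "create"))  # $340.0
    print(monthly_cost(5_000_000, "build"))   # $400.0 (still within the included 10M tokens)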

Some of the nonprofit, entertainment, and scientific projects built on top of GPT-3 have already announced they will be shutting down due to the high costs they would incur under the declared pricing plan.

PhilosopherAI, a website and mobile app that generates text on different topics, announced it will be switching to a paid model because, under the current pricing plan, the service’s operations would cost at least $4,000 per month.

But it’s not clear how the premium model will affect the traffic and usage of the application.

AI Dungeon has also created a premium plan for the GPT-3-based version of its game, charging players $10 monthly. Again, aside from a niche base of hard-core text-based RPG players and AI geeks, I don’t think other gamers will be willing to dish out $10 a month for AI Dungeon when there are plenty of free games with rich graphics and gameplay.

AI Dungeon pricing plan

One obvious business case for GPT-3 is content generation. According to details published on the GPT-3 subreddit, OpenAI’s FAQ states that the 2 million tokens included in the Create tier are “roughly equivalent to 3,000 pages of text,” or around 1.5 million words.

We’ve already seen GPT-3 spin nice articles. In most cases, users gave the same query to the AI model several times and either edited one of the results or stitched together the best parts of each result. Assuming six samples are enough to get a usable result, a one-page article (~500 words) would require six pages of output, giving you around 500 monthly articles on the Create plan.
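
Under those assumptions (six drafts per 500-word article), the math works out as follows:

    WORDS_PER_MONTH = 1_500_000   # Create tier's 2M tokens, per OpenAI's "3,000 pages" figure
    WORDS_PER_ARTICLE = 500       # a one-page article
    DRAFTS_PER_ARTICLE = 6        # assumed samples needed to get one usable article

    words_per_finished_article = WORDS_PER_ARTICLE * DRAFTS_PER_ARTICLE   # 3,000 generated words
    articles_per_month = WORDS_PER_MONTH // words_per_finished_article    # 500 articles
    print(articles_per_month)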

There are a few caveats, however. As experiments show, GPT-3 is not very good at tasks that require reasoning and logic, so I don’t think news outlets and magazines that depend on original and newsy content would have much use for GPT-3.

But it might be a good option for content farms that write SEO articles for corporate blogs. They might outsource the raw writing to GPT-3 and hire editors to put in the finishing touches. I’m still a bit skeptical, however, because the price of rehashing content from the web is already very low. I’m not sure whether editors would be willing to edit GPT-3’s output at an even lower price.

Aside from that, there is a slate of startups that plan to build on top of GPT-3, such as OthersideAI, which is planning to provide AI-based creativity tools. Matt Shumer, the company’s co-founder and CEO, told me that the pricing plan works for their business model.

I’m also interested to see whether HTML-markup generation, legal document scanning, and other use cases create viable business models.

Overall, OpenAI will have to find a way to turn the eight-figure development costs and the five-figure monthly running costs into profit. That means at least several dozen Build customers with working business models.

According to one estimate, the preliminary pricing plan provides OpenAI with a near-6,000-percent profit margin, so the company has room for much adjustment if the current business plan doesn’t bring in customers.

Fine-tuning, model decay, and other open issues

We will also have to wait and see how GPT-3 performs in domains where other AI algorithms already have a sizeable footprint. An example is customer-service chatbots, where a range of rule-based algorithms and deep learning models are taking care of automatable queries. Does GPT-3’s generalization capability come at the cost of poorer performance in specialized fields, such as healthcare chatbots? If it has better performance, is the improvement significant enough to convince customers to switch to GPT-3? How does the pricing compare to the costs of currently developed technologies?

We’ll probably get the answers to those questions in the coming months.

Another interesting point in the pricing plan disclosed earlier this month is the option to fine-tune the model, which is only available in the Scale tier. This tacitly means that the OpenAI team already acknowledges that GPT-3 will not be suitable for specialized uses out of the box. But we’ll have to see what “fine-tuning” means. Is it a separate model deployed on its own cloud server instance and retrained for the special purpose? That would cost a hefty sum and would probably be out of the question for most businesses. A less costly option would be for OpenAI to retrain one of the smaller GPT-3 models for the new application.

OpenAI will have to consider other business costs too, such as customer service, marketing, product management, ethics and legal issues, security and privacy, and much more. Until now, OpenAI was a research lab with a cool technology. As soon as it starts charging users for GPT-3, it will have a commitment to them.

Finally, one thing that has mostly gone unnoticed is model decay. Unlike the human mind, which is constantly learning and adjusting itself with every interaction it makes, most deep learning models are static. They undergo training once and then do their job on whatever parameter settings they have. Depending on its purpose, every machine learning model decays after a while because its training data no longer represent the real situation of its problem space. After that, it needs to be retrained. For instance, many facial recognition algorithms started to fail after people started wearing face masks because they had not been trained to recognize people with masks.

Model decay is especially critical in general language models because human language is constantly changing with the news and developments around the world.

For instance, in today’s world, everyone knows that when we speak of “the lockdown,” “social distancing rules,” and “the pandemic,” we are talking about the covid-19 outbreak. However, GPT-3, whose training data mostly predates the novel coronavirus, treats those concepts in their abstract and general form. I queried GPT-3 on the above topics without making direct reference to covid-19, and I got interesting but off-topic results.

GPT-3 has many different interpretations of “social distancing rules,” but doesn’t have context on the covid-19 pandemic.

There are online deep learning models that readjust their parameters with new feedback from users. But I think the heavy computational costs of backpropagation and parameter adjustment would make it impractical to deploy GPT-3 as an online deep learning model.

We will have to see how often OpenAI will have to retrain GPT-3 to keep the AI up to date for general purpose tasks, and what the costs of retraining will be.

The beginning of a new AI economy?

Overall, I’m very excited to see how GPT-3 will perform as a business platform. As we’ve seen time and again, there’s a big difference between a shiny new object and one that works. GPT-3 has dazzled everyone, but it will still have to pass the machine learning business test.

If its business model works, GPT-3 could have a huge impact, almost as huge as cloud computing. If it doesn’t, it will be a great setback for OpenAI, which is in dire need of becoming profitable to continue chasing the dream of human-level AI.
