How open-source LLMs are challenging OpenAI, Google, and Microsoft

This article is part of our series that explores the business of artificial intelligence

In the past few years, it seemed that wealthy tech companies would be able to monopolize the growing market for large language models (LLMs). Recent earnings calls from big tech companies suggested they were in control. Microsoft’s announcements, in particular, show that the company has created a billion-dollar business from its AI services, including through Azure OpenAI Services and the workloads OpenAI runs on its cloud infrastructure.

However, a recently leaked internal document from Google indicates that the market share of big tech is not as secure as it seems thanks to advances in open-source LLMs. In short, the document says “We have no moat, and neither does OpenAI.” The dynamics of the market are gradually shifting from “bigger is better” to “cheaper is better,” “more efficient is better,” and “customizable is better.” And while there will always be a market for cloud-based LLM and generative AI products, customers now have open-source options to explore as well.

The moats of large language models

The GPT-3 paper, published in 2020, showed the promise of scale. At 175 billion parameters, the model could do plenty of things that it hadn’t been trained on. The evolution of the GPT models suggested that if you continue to create bigger LLMs and train them on larger datasets, you’ll be able to create more capable models.

The success of GPT-3 amplified interest in creating larger language models. Several research papers explored the fascinating properties of LLMs, including their emergent abilities. At the same time, AI research labs raced to create bigger and bigger models. Gopher (280B params), LaMDA (137B params), PaLM (540B params), and Megatron-Turing (530B params) are some examples.

But at the same time, the LLM community underwent a more unsavory change. With focus shifting toward creating larger LLMs, the costs of research and innovation rose dramatically. Models like GPT-3 cost millions of dollars to train and run. Consequently, work on LLMs became limited to a few wealthy companies and AI labs associated with them.

As AI labs became dependent on the financial backing of for-profit organizations, they came under increasing pressure to monetize their technology. This pushed them to create products around their technology. And at the same time, they needed to build “moats” around their products. Moats are defensibility mechanisms that prevent competitors from copying your product and business.

The key moats for LLMs are 1) training data, 2) model weights, and 3) the costs of training and inference. Big tech companies already had the advantage in (3) because they’re the only ones that can pay for training and running very large LLMs. Even the open-source alternatives to GPT-3, such as BLOOM and OPT-175B, are practically inaccessible to cash-strapped organizations that can’t afford to buy or rent thousands of GPUs.

However, to obtain the advantage in the other two areas, tech companies pushed the field toward more obscurity and less sharing. OpenAI is probably the most representative example. It went from an AI lab that released all its research to a startup that sells API access to its models. It doesn’t even release details about its training data and model architecture anymore.

For a long while, it seemed like a race to the bottom, with big tech companies throwing more money at LLMs and making the field more secretive.

Open-source LLMs

As LLM power became centralized within a few big tech companies, the open-source community responded. Their efforts intensified after the release of ChatGPT showed the growing promise of instruction-following language models in different applications. In the past few months, we’ve seen the release of several open-source LLMs that challenge the entire business model that big tech has established.

These open-source alternatives to ChatGPT prove a few key points. First, LLMs with a few billion parameters can compete with very large models in terms of performance if you train them on very large datasets. Second, you can fine-tune small LLMs to impressive degrees with a very small budget and a modest amount of data. And finally—and this is not a new point—the pace of advances in open-source LLMs is much faster than in the closed ecosystem because different teams can build on top of each other’s work.

Most of these LLMs have between 7 and 13 billion parameters and can run on a strong consumer-grade GPU. Interestingly, the movement was triggered by the release of LLaMA, a family of open-source LLMs developed by Meta. Soon after, researchers at different universities released Alpaca and Vicuna, two models built on top of LLaMA and fine-tuned for instruction-following like ChatGPT.
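A rough back-of-the-envelope calculation shows why models of this size are within reach of a single GPU while a 175-billion-parameter model is not. The sketch below only counts the memory needed to hold the weights. It ignores activations, caches, and framework overhead, and the numeric formats and helper names are assumptions chosen for illustration, not figures from any specific model release.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Bytes per parameter for a few common numeric formats (illustrative assumption).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Model sizes mentioned in the article: 7B and 13B open models, 175B for GPT-3.
MODEL_SIZES = {"7B": 7e9, "13B": 13e9, "175B": 175e9}


def footprint_table() -> dict:
    """Return {model: {format: gigabytes}} for every size/format combination."""
    return {
        name: {fmt: weight_memory_gb(n, b) for fmt, b in BYTES_PER_PARAM.items()}
        for name, n in MODEL_SIZES.items()
    }


if __name__ == "__main__":
    for model, row in footprint_table().items():
        cells = ", ".join(f"{fmt}: {gb:.1f} GB" for fmt, gb in row.items())
        print(f"{model:>5} -> {cells}")
```

Under these assumptions, a 7B model in 16-bit precision needs about 14 GB for its weights, while a 175B model needs about 350 GB, which has to be spread across many accelerators.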

LLaMA’s license prevents using it for commercial purposes. Dolly 2 by Databricks solves this problem by building on top of EleutherAI’s Pythia model. And Open Assistant is a fully open model that offers access to everything, including code, model weights, and training data.

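These models also take advantage of techniques such as low-rank adaptation (LoRA), which freezes the pretrained weights and trains only a pair of small low-rank matrices per adapted layer, reducing the cost of fine-tuning by up to a thousand times.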
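To make the idea behind LoRA concrete, here is a minimal sketch in plain Python. It assumes the commonly described formulation: the pretrained weight matrix W stays frozen, and the layer learns a low-rank update B·A scaled by alpha/rank, with B initialized to zero so that training starts from the original model's behavior. The class and helper names are hypothetical, and real implementations operate on GPU tensors rather than Python lists.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters: full fine-tuning vs. a rank-`rank` LoRA update."""
    full = d_out * d_in
    low_rank = rank * (d_in + d_out)
    return full, low_rank


class LoRALinear:
    """Linear layer y = x W^T + (alpha / rank) * x A^T B^T with W frozen."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen pretrained weights, shape (d_out, d_in)
        # A starts with small random values, B with zeros, so B @ A is zero at first.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matmul(x, transpose(self.weight))  # (batch, d_out)
        down = matmul(x, transpose(self.A))       # (batch, rank)
        up = matmul(down, transpose(self.B))      # (batch, d_out)
        return [
            [b + self.scale * u for b, u in zip(b_row, u_row)]
            for b_row, u_row in zip(base, up)
        ]


if __name__ == "__main__":
    full, low_rank = lora_param_counts(4096, 4096, rank=8)
    print(f"full: {full:,}  lora: {low_rank:,}  ratio: {full / low_rank:.0f}x")
```

For a single 4096 by 4096 layer with rank 8, this sketch trains 65,536 parameters instead of 16,777,216. The overall savings in a real fine-tuning run depend on which layers are adapted and on the chosen rank.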

These models provide alternatives to businesses that want to use LLMs in their applications. Now, they have access to low-cost models that can run on their own servers and can be updated with their own data frequently with a very small budget.

What does this mean for big tech companies? As the Google memo warns, “…holding on to a competitive advantage in technology becomes even harder now that cutting edge research in LLMs is affordable. Research institutions all over the world are building on each other’s work, exploring the solution space in a breadth-first way that far outstrips our own capacity. We can try to hold tightly to our secrets while outside innovation dilutes their value, or we can try to learn from each other.”

What will happen to the market for closed LLMs?

Clearly, big tech companies will not be able to monopolize the market for LLMs. But this does not mean that the market for cloud-based language models will go away. As AI researcher Andrej Karpathy points out, the open-source LLM ecosystem still faces a few problems, including the high costs of pre-training the base models.

Open-source LLMs are also not suitable for everyone. Serverless black-box solutions will still be very attractive to companies that don’t have in-house machine learning talent and want to quickly integrate LLMs into their applications with a few API calls. In addition, companies like Microsoft and Google have very strong distribution channels through their applications and customer base.

However, the efforts of the open-source community will expand the market, making it possible to use LLMs in new environments, such as your own laptop. They will also, to a degree, commoditize the market and force tech giants to offer more competitive prices to their clients. The LLM space is evolving very fast. It will be interesting to see what unfolds in the coming weeks and months.
