This article is part of our series that explores the business of artificial intelligence
We might be at an inflection point in the market for applied artificial intelligence. The jury is still out on whether current techniques will get us to artificial general intelligence (AGI). But there is no doubt that current AI systems create a lot of value.
The introduction of ChatGPT triggered an AI arms race that continues to accelerate. Large tech companies are scrambling to seize their share of the emerging market for generative AI. Every few days, we’re seeing new product launches featuring large language models (LLMs) and diffusion models. At the same time, we’re seeing a change in data-scraping policies as AI changes market dynamics.
And the open-source movement is trying to prevent big tech from becoming too powerful. From Amazon’s new AI products to the merger of Google Brain and DeepMind, here’s a recap of the latest developments in the AI arms race and what they can mean for the future of the field.
Amazon Bedrock ups the ante for Microsoft
Amazon released several important generative AI products last week. The most important is Bedrock, a service that makes it easy to set up, train, and deploy LLMs and diffusion models. Bedrock solves three key problems:
1. It provides access to a variety of generative models from AI21 Labs, Anthropic, Stability AI, and Amazon’s own Titan models.
2. It abstracts the technical complexities of setting up the compute clusters needed to run large machine learning models.
3. It provides tools to easily fine-tune and customize models for specific use cases.
Bedrock can easily be integrated into other AWS tools, such as SageMaker, S3, and CI/CD solutions, which makes it convenient for existing customers.
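To give a sense of what this kind of abstraction looks like in practice, here is a minimal Python sketch in the style of the AWS SDK (boto3). The client name, model ID, and request schema below are assumptions for illustration — each model family on Bedrock defines its own request format, so consult the AWS documentation before relying on any of these names.

```python
import json


def build_request(prompt: str, max_tokens: int = 200) -> str:
    """Build a JSON request body for a text-generation model.

    The field names here are illustrative; each model family
    hosted on Bedrock defines its own request schema.
    """
    return json.dumps({"inputText": prompt, "maxTokens": max_tokens})


def invoke(prompt: str) -> dict:
    """Send the prompt to a hosted model (requires AWS credentials).

    The client and model names follow boto3 conventions but are
    assumptions for this sketch, not verified API details.
    """
    import boto3  # deferred so the sketch runs without AWS configured

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId="amazon.titan-text-express-v1",
        body=build_request(prompt),
    )
    return json.loads(response["body"].read())
```

The point of the sketch is the division of labor: the developer supplies a prompt and a model ID, while provisioning, scaling, and serving the model stay on Amazon’s side of the API.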
Why it matters? Amazon’s cloud business is facing stiff competition from Microsoft Azure. Microsoft has Azure OpenAI Services, which provides serverless access to LLMs and generative models from OpenAI. As more and more companies look for fast and easy integration of LLMs into their products, Amazon could face an exodus of enterprise customers to Azure. With the release of Bedrock, Amazon has brought AWS’s AI offerings on par with Azure’s. Its larger selection of models will also give it greater leverage to attract new businesses to its cloud service.
What are the challenges? One of the important values of generative models is their integration with productivity applications. Here, Microsoft has a clear advantage thanks to the dominance of its Office suite in the enterprise. Microsoft is integrating generative models into applications such as Word, Outlook, and Teams, making them much more useful than competing products. Although this is not an area where Microsoft and Amazon compete, it could provide additional incentives for companies to choose Azure as their preferred cloud provider.
Amazon CodeWhisperer undercuts GitHub Copilot
Amazon also announced the general availability of CodeWhisperer, an AI-powered tool that generates source code for applications. CodeWhisperer is a direct competitor to GitHub Copilot, which is powered by OpenAI’s Codex language model.
Coders can install CodeWhisperer as an extension in their IDE. As you write code and comments, CodeWhisperer suggests full blocks of code or simply completes the current line. There is a lot of value in using LLMs for coding. They help keep developers in the flow, especially on the mundane tasks that make up a large part of software development. CodeWhisperer also includes guardrails that filter out insecure code suggestions.
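Here is an illustration of the comment-to-code pattern these tools follow. The developer writes only the comment and the function signature; the body is the kind of block the assistant would suggest. Note that the completion below is hand-written for illustration, not actual CodeWhisperer output.

```python
from collections import Counter


# Developer writes the comment; the assistant proposes the body.
# (Hand-written completion for illustration, not real tool output.)

# return the n most common words in a text, lowercased
def most_common_words(text: str, n: int) -> list[str]:
    words = text.lower().split()
    return [word for word, _ in Counter(words).most_common(n)]
```

Tasks like this one — boilerplate with a well-known idiomatic solution — are exactly where suggestion tools save the most keystrokes.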
In contrast to GitHub Copilot, which costs $10 per month, CodeWhisperer is available for free for individual users.
Why it matters? Code generation is one of the most important use cases of large language models. LLMs can boost developer productivity considerably and save plenty of coding hours, which directly translates to major cost savings. Code generation can become a very profitable market. Until now, GitHub Copilot was the main player in the space. With the free availability of CodeWhisperer, Amazon could seize a large share of Microsoft’s market.
What are the challenges? It remains unclear how long Amazon will keep CodeWhisperer free. Also, the accuracy of LLM code generation depends largely on the training data. GitHub has access to the largest code repository in the world, putting it at an advantage in the long run, especially as online platforms become more protective of their data.
Google Brain and DeepMind merge
This week, DeepMind and Google Brain merged to create Google DeepMind. Before the merger, DeepMind ran as an independent unit under Alphabet’s “other bets.” It will now be an official Google subsidiary.
“The pace of progress is now faster than ever before. To ensure the bold and responsible development of general AI, we’re creating a unit that will help us build more capable systems more safely and responsibly,” Google CEO Sundar Pichai announced in a blog post.
DeepMind CEO Demis Hassabis will lead the new entity. Jeff Dean, who previously headed Google Brain, is now Google’s Chief Scientist.
“Through Google DeepMind, we are bringing together our world-class talent in AI with the computing power, infrastructure and resources to create the next generation of AI breakthroughs and products across Google and Alphabet, and to do this in a bold and responsible way,” Hassabis declared in a separate statement.
According to Pichai’s statement, the first project Google DeepMind will engage in will be the creation of multimodal AI models.
Why it matters? Google is facing increasing competition from Microsoft, which is threatening its search ads business, one of its main revenue streams. Merging DeepMind and Google Brain will bring together some of the brightest minds in the field. Together, the two entities account for some of the most important innovations in deep learning, including transformers and word2vec.
Bringing DeepMind under the Google umbrella will also solve some of the financial challenges the AI lab has faced. Previously, DeepMind had to file its finances separately and struggled to find a workable business model. Now, its expenses will be a footnote in Google’s massive financial reports.
What are the challenges? DeepMind faces the same challenge as other AI labs, such as OpenAI: how will it balance scientific research with creating marketable products? The lab’s long-term vision is to achieve AGI safely. However, Google shareholders expect technologies that can generate revenue in the short term. Those two goals do not necessarily align.
Quality data is becoming the new battlefield for LLM dominance
Reddit announced this week that it will start charging for access to its API. The decision comes on the heels of Twitter imposing similar restrictions on its API.
“The Reddit corpus of data is really valuable,” Reddit co-founder and CEO Steve Huffman told The New York Times. “More than any other place on the internet, Reddit is a home for authentic conversation. There’s a lot of stuff on the site that you’d only ever say in therapy, or AA, or never at all … But we don’t need to give all of that value to some of the largest companies in the world for free.”
Shortly after, Stack Overflow also declared that it will start charging for its API. “Community platforms that fuel LLMs absolutely should be compensated for their contributions so that companies like us can reinvest back into our communities to continue to make them thrive,” Prashanth Chandrasekar, CEO of Stack Overflow, said. “We’re very supportive of Reddit’s approach.”
Why it matters? In a recent interview, OpenAI CEO Sam Altman said, “I think we’re at the end of the era where it’s gonna be these giant models, and we’ll make them better in other ways.”
What are these “other ways”? One possible path is fine-tuning the models on higher-quality data and creating better training techniques. Manually curated datasets can be very valuable but are expensive and slow to create. Platforms such as Reddit and Stack Overflow provide quick access to valuable data on very specific topics and questions for fine-tuning LLMs. And the owners of these platforms are quickly realizing the value of the data they are sitting on.
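To make the fine-tuning angle concrete, here is a minimal sketch of how forum Q&A pairs might be converted into training records. The prompt/completion JSONL layout below is one common fine-tuning convention, not the schema of any specific platform or API; the example question and answer are invented for illustration.

```python
import json


def to_finetune_records(qa_pairs):
    """Convert (question, answer) pairs into JSONL training lines.

    The prompt/completion field names follow a common fine-tuning
    convention; adjust them to match the target training pipeline.
    """
    lines = []
    for question, answer in qa_pairs:
        record = {
            "prompt": f"Question: {question}\nAnswer:",
            "completion": f" {answer}",
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)


# A hypothetical Stack Overflow-style pair, invented for this sketch.
pairs = [
    ("How do I reverse a list in Python?",
     "Use list.reverse() for in-place reversal, or slicing: my_list[::-1]."),
]
jsonl = to_finetune_records(pairs)
```

The value of platforms like Reddit and Stack Overflow is that they supply these question-answer pairs at scale, already written, voted on, and filtered by real communities.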
What are the challenges? The growing competitiveness of the data market can push the industry toward less sharing and more monetization. How long before Quora and other question-answering forums make similar moves? Unfortunately, aggressive monetization will further empower big tech companies, which can afford the API costs. Small labs and cash-constrained startups, on the other hand, will have to make do with whatever low-quality data remains freely available.