Microsoft and OpenAI get ahead in the LLM competition


The past few weeks have seen major AI announcements by Microsoft, OpenAI, Google, and other organizations. Tech companies are scrambling to solidify their position in the fast-expanding market for large language models (LLMs) and generative AI.

And as big tech continues to pour more money into the field, competition is gradually becoming polarized between Microsoft and Google. So far, Microsoft has proven to be craftier and more capable at getting LLMs and other generative machine learning models to work in its products. But the race is not over, and we might yet see Google (or some other company) take the lead.

Here’s everything you need to know about the recent AI announcements and what they mean for the market for applied machine learning. (Note: I didn’t include GPT-4 here because I covered it last week.)

GitHub Copilot gets an upgrade

GitHub Copilot is one of the most successful applications of LLMs. It has a great product/market fit: it solves the problem of generating code well enough for developers to choose it over alternatives. Instead of going through Stack Overflow and other forums to find solutions, developers give Copilot a description and it generates the code for them.

The success of GitHub Copilot hinges on the right implementation of LLMs. Its base model is Codex, an OpenAI LLM that has been trained on a large corpus of source code. When generating code, it uses the existing source code of your current file as context. This helps improve the accuracy of its output and reduce its errors.
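To make the idea of file context concrete, here is a minimal Python sketch of how an editor plugin might assemble a prompt from the text before the cursor. This is not how Copilot is actually implemented. The function name `build_prompt`, the `# File:` and `# Task:` markers, and the character budget are all assumptions made for illustration.

```python
def build_prompt(file_name: str, file_source: str, cursor_offset: int,
                 instruction: str, max_context_chars: int = 2000) -> str:
    """Assemble a prompt from the code before the cursor (hypothetical format)."""
    # Only the text before the cursor is used as context in this sketch.
    prefix = file_source[:cursor_offset]
    # Keep the most recent characters: code near the cursor is usually most relevant.
    context = prefix[-max_context_chars:] if max_context_chars > 0 else ""
    # The "# File:" and "# Task:" markers are assumptions, not a real Copilot format.
    return f"# File: {file_name}\n{context}\n# Task: {instruction}\n"


source = "import math\n\ndef area(r):\n    "
prompt = build_prompt("geometry.py", source, len(source), "return the area of a circle")
print(prompt)
```

The character budget stands in for the model's limited context window. A real tool would likely count tokens rather than characters and might also pull in code from other open files.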


Copilot provides a great productivity boost and an improved user experience for developers. The added value of GitHub Copilot is enough that developers and companies are willing to pay for it.

Now, GitHub has introduced Copilot X, the updated version of its AI programming assistant. GitHub Copilot X goes beyond auto-completion and adds new features. The main addition is a ChatGPT-like assistant that helps you debug your code, explain error messages, and analyze existing code. The LLM seamlessly uses the existing code and highlighted snippets as context to improve its suggestions and output. GitHub has also added some neat perks such as directly inserting the suggested code into the current file.

GitHub has also added other features such as writing descriptions for pull requests in GitHub. This moves the language model beyond the IDE and into the CI/CD pipeline.

There will be more features to come, such as automatically suggesting “sentences and paragraphs as developers create pull requests by dynamically pulling in information about code changes.”

Another upcoming feature is Copilot for Docs, in which an LLM will help developers navigate documentation for different programming frameworks. Developers can configure Copilot to adjust its output based on their knowledge of the topic in question. This feature currently works with known documentation sources such as React and Azure Docs, but GitHub plans to extend it to custom sources of documentation.

Microsoft 365 gets an AI Copilot

On the back of GitHub Copilot’s success, Microsoft decided to release a Copilot for its 365 applications. Microsoft 365 Copilot is another interesting example of integrating LLMs into existing applications.

It can help compose emails in Outlook, draft and revise content in Word and PowerPoint, and analyze data in Excel. It can also do business-level tasks, such as summarizing discussion points in Teams, and generating insights from company data. Microsoft will also integrate Copilot into the Power Platform, where it helps in automating business tasks. It is also supposed to “learn new skills,” according to Microsoft.

According to the Microsoft blog, Copilot brings together LLMs, Microsoft 365 apps, and Microsoft Graph. Graph serves as a language model augmentation tool, providing an API for information from across the business.
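As a rough illustration of what "augmentation" means here, the Python sketch below looks up business records relevant to a question and places them in the prompt before it would go to a model. It is a toy under stated assumptions: `RECORDS` is a made-up in-memory stand-in for company data, the keyword-overlap scoring is my own simplification, and none of this uses or reflects the real Microsoft Graph API.

```python
import re
from typing import Dict, List

# Hypothetical stand-in for business data; NOT the Microsoft Graph API.
RECORDS: List[Dict[str, str]] = [
    {"source": "email", "text": "Q3 budget review moved to Thursday at 10am."},
    {"source": "calendar", "text": "Thursday 10am: Q3 budget review with finance."},
    {"source": "chat", "text": "Lunch order for Friday is confirmed."},
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, records: List[Dict[str, str]], top_k: int = 2) -> List[Dict[str, str]]:
    """Rank records by how many words they share with the question."""
    query_words = tokenize(question)
    scored = [(len(query_words & tokenize(r["text"])), r) for r in records]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop records with no overlap, then keep the best top_k.
    return [record for score, record in ranked if score > 0][:top_k]


def grounded_prompt(question: str, records: List[Dict[str, str]]) -> str:
    """Build a prompt that puts the retrieved records ahead of the question."""
    hits = retrieve(question, records)
    context = "\n".join(f"[{r['source']}] {r['text']}" for r in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"


print(grounded_prompt("When is the budget review?", RECORDS))
```

A production system would use a proper search or embedding index and enforce access permissions. The point of the sketch is only that the model answers from retrieved business data rather than from its training data alone.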

The wording in the blog post is very careful to not overdo the capabilities of 365 Copilot: “Sometimes Copilot will be right, other times usefully wrong — but it will always put you further ahead. You’re always in control as the author, driving your unique ideas forward, prompting Copilot to shorten, rewrite or give feedback.”

ChatGPT gets external plugins

Until now, the power of LLMs was mostly experienced through integration with applications.

A different approach, however, is to integrate apps into AI.

An example is the new plugin feature for ChatGPT. These plugins augment the LLM with external sources of knowledge, such as Wolfram Alpha, Bing, OpenTable, and Instacart.

When ChatGPT receives a prompt, it uses these applications to get up-to-date information instead of being limited to its training data. The plugins can also enable ChatGPT to take actions in these applications.
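The flow described above can be sketched in a few lines of Python. This is a simplified illustration, not OpenAI's plugin protocol: the two plugins, the `stub_model_route` function that stands in for the model's decision about which tool to call, and the `calc:` and `hours:` prefixes are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple


def calculator(query: str) -> str:
    """Hypothetical plugin: evaluate a simple 'a op b' expression."""
    left, op, right = query.split()
    a, b = float(left), float(right)
    if op == "+":
        return str(a + b)
    if op == "-":
        return str(a - b)
    if op == "*":
        return str(a * b)
    if op == "/":
        return str(a / b) if b != 0 else "undefined"
    return "unsupported operator"


def restaurant_hours(query: str) -> str:
    """Hypothetical plugin: look up opening hours in a made-up table."""
    hours = {"trattoria roma": "17:00-22:00"}
    return hours.get(query.strip().lower(), "unknown")


PLUGINS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "restaurant_hours": restaurant_hours,
}


def stub_model_route(prompt: str) -> Optional[Tuple[str, str]]:
    """Stand-in for the LLM choosing a plugin; a real model would decide this itself."""
    if prompt.startswith("calc:"):
        return "calculator", prompt[len("calc:"):].strip()
    if prompt.startswith("hours:"):
        return "restaurant_hours", prompt[len("hours:"):].strip()
    return None


def answer(prompt: str) -> str:
    """Call a plugin when one applies and attach its result to the prompt."""
    call = stub_model_route(prompt)
    if call is None:
        return f"No plugin used: {prompt}"
    name, argument = call
    result = PLUGINS[name](argument)
    return f"Question: {prompt}\nPlugin {name} returned: {result}"


print(answer("calc: 6 * 7"))
print(answer("hours: Trattoria Roma"))
```

In a real system the text returned by `answer` would be fed back to the model so it can phrase a final reply, and the routing would come from the model rather than from string prefixes.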

Retrieval augmentation of this kind can help with some of the challenges of LLMs, such as their hallucination problem.

The bigger implication of “apps in AI” can be a paradigm shift for applications. Chatbot interfaces such as ChatGPT can become new portals to applications. Instead of opening apps, users will spend more time in ChatGPT to solve their problems. This can trigger a new wave of applications built for integration with LLMs.

It is too early to tell if the field is ready for this kind of shift. It will depend on the robustness, reliability, and convenience of augmented LLMs. According to OpenAI, “there’s a risk that plugins could increase safety challenges by taking harmful or unintended actions, increasing the capabilities of bad actors who would defraud, mislead, or abuse others. By increasing the range of possible applications, plugins may raise the risk of negative consequences from mistaken or misaligned actions taken by the model in new domains.”

We’ll have to see how the community solves these problems in the coming months.

Google releases Bard


Shortly after OpenAI released GPT-4, Google opened access to Bard, its rival to ChatGPT. Bard’s first demo in February failed, but it is now ready for prime time, with waitlist access available in the U.S. and UK.

Bard is designed to provide a seamless experience between LLM chat and Google search. It does not offer code generation as ChatGPT and Bing do. Google is moving very cautiously. Its FAQ is filled with notices such as these:

“Bard is experimental, and some of the responses may be inaccurate, so double-check information in Bard’s responses.”

“LLM experiences (Bard included) can hallucinate and present inaccurate information as factual.”

“Bard responses may also occasionally claim that it uses personal information from Gmail or other private apps and services.”

“Bard’s ability to hold context is purposefully limited for now.”

But Google is also quick to point out that it invented the transformer architecture, the architecture used in LLMs.

At first glance, Google seems to be lagging behind Microsoft/OpenAI. Experiments show that Bard is inferior to ChatGPT and Bing, and it lacks several features. But it is too early to call the race. Google is a very wealthy and resourceful company, and it is making moves to strengthen its front against Microsoft.

What is next for the LLM race?

The growing rivalry between Google and Microsoft is cause for concern. In their haste to grab a bigger share of the market for LLMs and generative AI, they risk pushing the field toward less sharing and transparency. Microsoft, Google, OpenAI, and other companies are becoming more reluctant to release their models to the public. The GPT-4 technical report has even fewer details about the model than those of its predecessors. Openness is being traded off for commercial advantage.

However, not everything is grim. There are also efforts to create open-source LLMs to keep the field democratized. Meta’s LLaMA and OPT-175B, Hugging Face’s BLOOM, and Stanford’s Alpaca are examples.

The pace at which the field is advancing ensures that the year will have a lot more to deliver.
