Navigating the world of commercial open-source large language models

By Aharsh M.S


I was genuinely surprised when I first glimpsed the capabilities of large language models (LLMs). Out of the blue, it felt like the future had arrived at our doorstep. These LLMs, or, more generally, foundational models, can learn almost anything you throw at them. LLMs are game-changers for businesses. They’re not just tools; they’re catalysts, capable of supercharging our teams and redefining the “not so efficient” processes we are used to.

And it isn’t just about what they can do, but how they do it: they get the job done much faster, with greater creativity and heightened efficiency. We’re looking at the prospect of a drastic drop in operating costs, execution time, and effort, the kind of efficiencies businesses dream of. So, you might naturally think, “How can I get my hands on this technology?”

However, there is a challenge. As interest in generative AI exploded, the market was suddenly flooded with LLMs. From startups to the global open-source community and corporate giants, many innovators contributed to the generative AI domain. It was overwhelming, to say the least; every day, there was at least one news story about someone somewhere in the world releasing a new generative AI model, tool, or research paper.

Despite a gazillion models being released in a matter of months, most of them were locked behind hefty paywalls or usage restrictions. At the other end of the spectrum were open-source models. They whispered promises of freedom, experimentation, and innovation. But most of them came with strings attached. One general pattern you can observe in the majority of open-source LLMs is: “Feel free to play, but don’t you dare think about making money with it.”

However, amidst this maze of choices, some visionaries stood out as true “heroes,” releasing open-source LLMs under licenses that permit commercial use. This is a golden ticket for businesses: a green light to experiment, adapt, and, crucially, integrate these models into their commercial applications. Today, I want to guide you through this landscape of open-source LLMs that offer commercial usage rights.

What you need to know about open source licenses

Once you start your research on Generative AI, you realize that there are hundreds of models out there, patiently waiting to confuse you.

For businesses, I believe the key objective is to identify commercially usable open-source models. You can distinguish models by their licenses. While the variety of licenses can seem confusing, there are a few you should familiarize yourself with, and I hope to simplify them for you.

Apache 2.0: A permissive license that lets you freely use, modify, and distribute the software, even for commercial purposes. This license has a provision that modified versions of the software must state the changes made when redistributed. It also provides an express grant of patent rights from contributors to users, but you must provide attribution. This license is like a generous friend who shares their toys and says, “Play as you wish; just remember where you got it!”

MIT License: Another permissive license, it’s known for its simplicity. While it’s permissive, it doesn’t specifically grant any patent rights to the user. You can do virtually anything you want with the software – including using it commercially – as long as you provide attribution. Think of it as a gracious librarian who tells you, “Borrow any book, share its stories, even rewrite them, just keep my name on the first page.”

CC BY-SA 4.0: Creative Commons Attribution-ShareAlike 4.0 is a bit tricky. You can use, share, and even build upon the material for any purpose, including commercially, as long as you give credit. But there’s a catch: any derivative work must be distributed under the same license. Imagine a friend lending you their recipe, saying, “Make it, sell it, but if you tweak it, share your version the same way I shared mine with you.”

OpenRAIL-M v1: A more recent addition, it’s specially crafted for AI models. While it permits commercial use, it also has stipulations about safety and ethics. It’s like the responsible elder sibling in the licensing family ensuring that the tools aren’t just used but used wisely.

BSD Licenses: BSD-2-Clause allows for almost unrestricted use, including redistribution and use in proprietary software, provided that the copyright notice is preserved. Picture a neighbor who lends you their tools, saying, “Use them as you see fit; just don’t forget to mention it was mine if someone asks.” BSD-3-Clause is similar to the 2-Clause license but adds a clause that prevents using the licensor’s name to endorse or promote products derived from the software without permission. Picture a skilled artist who gives you their artwork, saying, “Use this wherever you like; just don’t claim you painted it, and don’t use my name to promote your gallery without asking.”

MPL-2.0: Mozilla Public License 2.0 is a weak copyleft license; it allows you to integrate open-source code into proprietary projects. However, any changes to the MPL-licensed files themselves must remain under the MPL and be made available in source form when you distribute them. It is analogous to a crafty friend who gives you a handcrafted item and says, “Feel free to mix this with your own crafts, but if you tweak my design, let others know how you did it.” MPL is stricter than MIT, BSD, or Apache because it combines aspects of both permissive and copyleft licenses.

Ms-PL: Microsoft Public License is a permissive license specific to Microsoft’s ecosystem. It allows for redistribution and the right to use for any purpose, provided you include the original copyright notice. It is analogous to a mentor who shares their notes, saying, “Learn from this, build upon it, share it as you like, but always respect its origins.” It has conditions about larger works and how the software is bundled and licensed, which can make it less flexible in certain scenarios.

CC0: Creative Commons Zero is not exactly a software license but a public domain dedication. It means the creator waives all their copyright and related rights, allowing others to use, modify, and distribute the work for any purpose without restrictions. It’s like a philosopher who declares, “My thoughts and ideas are for everyone; use them, build upon them; they’re humanity’s to cherish.”

Unlicense: A license that dedicates the work to the public domain, waiving all copyright claims. It grants absolute freedom for using, modifying, and distributing the work. It’s like a gardener who opens their garden to all, proclaiming, “Take the flowers, plant them elsewhere, let beauty flourish without bounds.”

These are some of the most common licenses that allow commercial usage. In contrast, Creative Commons licenses with a NonCommercial clause, such as CC BY-NC, CC BY-NC-SA, and CC BY-NC-ND, restrict commercial use of the content or software. The GNU Affero General Public License (AGPL) presents unique challenges for some commercial operations, because it requires you to offer the source code of modified versions to users who interact with the software over a network. And, of course, there are non-commercial open-source licenses that clearly prohibit commercial use.

Grasping this knowledge on licenses is not just academic; it’s practical. When a particular model piques your interest, this knowledge becomes your compass, guiding you to discern whether the model is merely fascinating or functionally usable for your business pursuits. So, as you venture into the LLM world, model licenses can be your guiding light, ensuring you don’t just find a model but the right one for your needs.
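To make that compass concrete, here is a minimal Python sketch of a license screen based on the groupings described above. The identifiers are SPDX-style labels chosen for illustration (the "openrail-m" key in particular is my own shorthand), the four categories are my simplification of this section, and anything the table does not recognize is deliberately sent to review rather than approved. Treat it as a first-pass filter, not legal advice.

```python
from enum import Enum


class CommercialUse(Enum):
    ALLOWED = "allowed"
    ALLOWED_WITH_CONDITIONS = "allowed with conditions"
    NEEDS_REVIEW = "needs legal review"
    PROHIBITED = "prohibited"


# Keys are lowercase, SPDX-style labels (illustrative, not authoritative).
LICENSE_TABLE = {
    # Permissive licenses and public domain dedications
    "apache-2.0": CommercialUse.ALLOWED,
    "mit": CommercialUse.ALLOWED,
    "bsd-2-clause": CommercialUse.ALLOWED,
    "bsd-3-clause": CommercialUse.ALLOWED,
    "ms-pl": CommercialUse.ALLOWED,
    "cc0-1.0": CommercialUse.ALLOWED,
    "unlicense": CommercialUse.ALLOWED,
    # Commercial use permitted, but with share-alike or usage stipulations
    "cc-by-sa-4.0": CommercialUse.ALLOWED_WITH_CONDITIONS,
    "mpl-2.0": CommercialUse.ALLOWED_WITH_CONDITIONS,
    "openrail-m": CommercialUse.ALLOWED_WITH_CONDITIONS,
    # Challenging for some commercial operations
    "agpl-3.0": CommercialUse.NEEDS_REVIEW,
    # NonCommercial Creative Commons licenses
    "cc-by-nc-4.0": CommercialUse.PROHIBITED,
    "cc-by-nc-sa-4.0": CommercialUse.PROHIBITED,
    "cc-by-nc-nd-4.0": CommercialUse.PROHIBITED,
}


def screen_license(license_id: str) -> CommercialUse:
    """Return the screening category; unknown licenses default to review."""
    key = license_id.strip().lower()
    return LICENSE_TABLE.get(key, CommercialUse.NEEDS_REVIEW)


for name in ("Apache-2.0", "CC-BY-NC-4.0", "some-custom-license"):
    print(f"{name}: {screen_license(name).value}")
```

Defaulting unknown licenses to "needs legal review" is the cautious choice here: many models ship with custom licenses that do not fit neatly into any of the groups above.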

Open source LLMs with commercial usage rights


Considering the license of an LLM is important for your business application, but it shouldn’t be the only factor you rely on. You have to dive deeper and assess each model’s unique strengths. For instance, some LLMs are purely pre-trained, like GPT-2 or BERT. Others have been fine-tuned for specific tasks, such as T5 variants for translation or RoBERTa variants for sentiment analysis. So, a solid rule of thumb is to choose an open-source model that allows commercial usage and is already fine-tuned for your application’s specific use case.

For example, if you’re building a Q&A system, using an LLM that is fine-tuned on question-answer data can provide better results. Of course, researching the capabilities of different open-source LLMs is itself a time-consuming task. But if you need guidance in this area, I have something to help. I recently published a leaderboard for open-source LLMs, spotlighting the cream of the crop. I ranked the models by their downstream task capabilities, as gauged from benchmark results, and their popularity within the open-source community. If you want to evaluate other foundational models like text-to-image, program synthesis, or even instruct eval models, I’ve created separate leaderboards for generative AI models, which could be helpful for you in your research.

So, all LLMs are good to go?

I’d like to share a few open-source LLM recommendations that businesses might find valuable. The recommendations are based on an evaluation that considers three things: the model’s capabilities; its ease of adoption, which accounts for the adoption cost; and its usability, which accounts for the availability of commercial usage rights.

The capability score is calculated using the MMLU (Massive Multitask Language Understanding) benchmark scores. A higher MMLU score indicates that the model is proficient in understanding language across a wide range of tasks. This proficiency suggests that the model has a robust capability that can benefit business applications.

Ease of adoption is calculated based on the model size. The bigger the model, the more computing power it needs for fine-tuning, which means higher costs to use it. So bigger models get lower scores for ease of adoption. The usability score is calculated based on the degree of restrictions present in the model’s license. Apache 2.0 has the highest score as it offers the most flexibility and benefits to the user, followed by MIT and CC licenses.
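To illustrate how three such scores can be combined into one rank, here is a rough Python sketch. It is not the exact formula behind the leaderboard: the equal weights, the size normalization, the numeric usability values, and the fallback for unlisted licenses are all assumptions I’ve made for this example, and the two candidates are placeholders with made-up numbers rather than real benchmark results.

```python
from dataclasses import dataclass

# Assumed usability values: only the ordering (Apache 2.0, then MIT,
# then CC licenses) comes from the text; the numbers are illustrative.
USABILITY = {"apache-2.0": 1.0, "mit": 0.9, "cc-by-sa-4.0": 0.8}
DEFAULT_USABILITY = 0.5  # assumed fallback for licenses not listed above


@dataclass(frozen=True)
class Candidate:
    name: str
    mmlu: float             # MMLU accuracy in percent, 0 to 100
    params_billions: float  # model size in billions of parameters
    license_id: str


def ease_of_adoption(size: float, largest: float) -> float:
    """Smaller models score higher; the largest candidate scores 0."""
    return 1.0 - size / largest


def rank(candidates, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return (score, name) pairs sorted from best to worst."""
    w_cap, w_ease, w_use = weights
    largest = max(c.params_billions for c in candidates)
    scored = []
    for c in candidates:
        capability = c.mmlu / 100.0
        ease = ease_of_adoption(c.params_billions, largest)
        usability = USABILITY.get(c.license_id.lower(), DEFAULT_USABILITY)
        total = w_cap * capability + w_ease * ease + w_use * usability
        scored.append((round(total, 3), c.name))
    return sorted(scored, reverse=True)


# Placeholder candidates with made-up numbers, not real benchmark results.
candidates = [
    Candidate("small-apache", mmlu=50.0, params_billions=7, license_id="Apache-2.0"),
    Candidate("large-custom", mmlu=70.0, params_billions=70, license_id="custom"),
]
ranking = rank(candidates)
for score, name in ranking:
    print(name, score)
```

In this toy setup the smaller Apache-licensed model outranks the larger, more capable one, because the larger model loses points on both ease of adoption and usability. Shifting the weights toward capability would change that ordering, which is why a rank like this is only a relative indicator.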

Note that the ranks below are anchored to the information available as of the publication date of this article. These rankings can evolve over time. To check the most recent ranks, you may refer to this leaderboard of business-friendly LLMs, which will be updated periodically.

The above table shows only a relative comparison. For example, if you want to use an open-source LLM for a commercial project without any usage restrictions and need to fine-tune the model for specific needs, then you can use the above model ranks as an indicator when choosing a model.

Note that the models in the first two positions offer very good capability scores and high scores for ease of adoption. A higher ease of adoption score means it would cost you less money and computing power to fine-tune the model. Moreover, these models are released under the Apache 2.0 license, enabling unrestricted commercial usage.

However, if you do not want additional fine-tuning and do not expect your application to have more than 700 million monthly active users, then LLaMA2 models would be the best choice as they offer far better capability scores. The most capable model in the list, LLaMA2 70B, ranks lower in the above leaderboard because of the higher cost of fine-tuning and the conditional restrictions in its license, which may not be favorable for large-scale business adoption.

Wrapping things up

Even though hundreds of LLMs and foundational models exist in the market, only a select few are truly business-friendly. I hope this article gave you more clarity on LLMs and their usability from a business perspective. As technology advances, I’m sure we’ll see an influx of even more business-friendly LLMs in the near future. Feel free to contact me or my company, Accubits Technologies, if you have any questions regarding this blog or need help embracing Generative AI in your organization.

About the author

Aharsh MS

Aharsh is the CMO of Accubits Technologies, a tech entrepreneur, a growth strategist, and a visionary who believes in, and works towards, empowering 7 billion creative minds. He inspires, motivates, and educates businesses on how to stand out in the crowd and reach their target customers. He is a technology enthusiast focusing on AI and Blockchain technologies. Over the past few years, he has designed 10+ products and solutions that have helped tens of hundreds of people around the world.
