The implications of ChatGPT’s new fine-tuning feature

ChatGPT fine-tuning
Image source: 123RF (with modifications)

OpenAI has introduced a fine-tuning feature for its GPT-3.5 Turbo model, the model powering the free version of ChatGPT, its popular large language model (LLM). The new feature can allow developers and businesses to customize the model for specific use cases. This broadens the scope of applications for ChatGPT and presents a stronger business case for the LLM by reducing operational costs.

Later this year, OpenAI plans to extend similar fine-tuning capabilities to GPT-4, the model behind the stronger ChatGPT Plus. This rollout is indicative of the expanding market for large language models, with new sub-markets continually emerging. As the demand for specialized LLMs grows, we can anticipate a surge in competition, driving further innovation and advancements in the field.

Fine-tuning ChatGPT

Large language models like ChatGPT have been designed to perform a multitude of tasks, including some they have not been explicitly trained for. The original GPT-3 paper posited that scaling a model to hundreds of billions of parameters and training it with the right data could enable it to learn new tasks through few-shot learning, retrieval augmentation, and prompt engineering, without the need for parameter modification.

However, in practice, models like ChatGPT can prove unreliable for certain tasks, even when applying all known prompt-engineering techniques. In such instances, one solution is to fine-tune the LLM.

Until recently, GPT-3.5 Turbo lacked fine-tuning capabilities. However, OpenAI has now added support for fine-tuning GPT-3.5 Turbo with 4k context. According to OpenAI’s blog, fine-tuning support for the 16k model and GPT-4 will be added in the future.

At present, fine-tuning can be achieved through API calls, but OpenAI has plans to introduce a more user-friendly web UI in the future. The introduction of fine-tuning opens up new opportunities for businesses. It can lead to better instruction-following and customized tone, enabling enterprises to launch their own internal chatbots and other applications powered by fine-tuned ChatGPT models.

Is fine-tuned ChatGPT cost-effective?

chatgpt costs
Image source: 123RF (with modifications)

One of the benefits of fine-tuning ChatGPT is reducing the costs of using the LLM by allowing developers to achieve suitable responses with shorter prompts. According to OpenAI, “Early testers have reduced prompt size by up to 90% by fine-tuning instructions into the model itself, speeding up each API call and cutting costs.”

However, calculating costs is not very straightforward. The cost of fine-tuning this model is $0.008 per thousand tokens, about four to five times the costs of inference with GPT-3.5 Turbo 4k. According to OpenAI, “Fine-tuning job with a training file of 100,000 tokens that is trained for 3 epochs would have an expected cost of $2.40.” 

The amount of data and epochs required to fine-tune the model will depend largely on the target application and how much it resembles ChatGPT’s original training data. But fine-tuning is a one-time cost, and if done correctly, it can result in large savings across large numbers of interactions with the model.

OpenAI’s early tests have demonstrated that a fine-tuned version of GPT-3.5 Turbo can match or even surpass the base GPT-4’s capabilities on certain narrow tasks. The usage cost for the fine-tuned ChatGPT is $0.012 per thousand input tokens and $0.016 per thousand output tokens. This pricing places it at half to one-quarter of the price of GPT-4, but still approximately eight times the cost of the standard GPT-3.5 Turbo model with 4k context. 

Therefore, the cost-effectiveness of fine-tuning largely depends on the specific use case. However, the introduction of this feature has undeniably enriched the pricing options available to developers and businesses and filled a gap between GPT-3.5 and GPT-4.

ChatGPT costs
Fine-tuning GPT-3.5 Turbo fills the gap between vanilla GPT-3.5 and GPT-4

GPT-3.5 Turbo 4k is the most affordable ChatGPT model. It’s suitable for simple tasks that can be accomplished with basic prompt engineering and minimal retrieval augmentation. 

GPT-3.5 Turbo 16k costs twice as much as the base model but offers more room for prompt engineering and context. This makes it useful for applications that can be managed with extensive instructions and retrieval augmentation. Although it’s less expensive than the fine-tuned model, the need for more instructions and prompt engineering might result in similar or even higher costs. It can serve as a good starting point to explore ChatGPT’s capabilities for your application and potentially generate data for later fine-tuning.

The fine-tuned GPT-3.5 Turbo 4k model is the new addition to the family. It’s pricier than the base models but requires less instruction and prompt engineering. If you have a high-quality training dataset, this model can be an excellent choice for specific applications, which is often the case for enterprises and businesses. It’s also a viable option if you’re currently using GPT-4 and are looking to switch to a less expensive alternative.

Lastly, GPT-4 8k and 32k are the most powerful and expensive models. They’re a good starting point to explore the potential of large language models and generate data for fine-tuning GPT-3.5 Turbo, ultimately reducing costs.

The growing market for specialized LLMs

closed vs open source language models

The landscape of large language models is in a constant state of flux, with the market continually expanding and evolving. One burgeoning segment is that of fine-tuned models. The era of “one model to rule them all” is giving way to a new paradigm where models can be customized or fine-tuned for specific downstream tasks at a minimal cost.

OpenAI acknowledges this shift in their blog: “developers and businesses have asked for the ability to customize the model to create unique and differentiated experiences for their users.” 

Earlier this year, the release of several open-source LLMs, fine-tuned for instruction-following, provided an alternative to ChatGPT. These models, which could be easily fine-tuned with proprietary data, began to nibble away at ChatGPT’s market share.

With the introduction of fine-tuning capabilities for GPT-3.5 Turbo, OpenAI is responding to these market changes, ensuring it remains a competitive player. The ease of use of ChatGPT, particularly once OpenAI rolls out more user-friendly fine-tuning features, is a significant advantage.

However, OpenAI’s policy of not open-sourcing its models and requiring everything to run on its servers or Microsoft Azure may lead some companies to opt for open-source models. The market is dynamic, and we can anticipate further changes in the available models and tools.

To navigate this evolving landscape, it’s crucial to have a robust data collection pipeline and maintain a comprehensive record of the data used for fine-tuning. This approach will enable you to stay flexible and avoid lock-in with a specific model or vendor, ensuring you can adapt to the ever-changing market for specialized LLMs.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.