How to fine-tune GPT-3.5 or Llama 2 with a single instruction

[Image: superhero llama. Source: 123RF, with modifications through Bing Image Creator]

Large language models (LLMs) like ChatGPT and Llama 2 have gained immense popularity due to their versatility across various tasks. However, some applications require fine-tuning these models with custom data to achieve better performance.

Unfortunately, fine-tuning LLMs for specific applications is often complex and frustrating, and the process largely depends on the application type and the required data. Fortunately, HyperWrite CEO Matt Shumer has developed a very useful tool, gpt-llm-trainer, which streamlines the fine-tuning process for Llama 2 and GPT-3.5 Turbo.

gpt-llm-trainer reduces the intricate task of fine-tuning LLMs to a single, straightforward instruction, making it significantly easier for users to adapt these models to their needs.

How does gpt-llm-trainer work?

gpt-llm-trainer employs a technique known as “model distillation.” This process essentially involves transferring knowledge from a larger machine learning model—the teacher—to a smaller one—the student. In the context of LLMs, model distillation typically involves the teacher model generating task-specific training examples, which are then used to train the smaller model.

gpt-llm-trainer takes a description of your task and uses GPT-4 to automatically generate training examples for the smaller model you aim to train. These examples are then used to fine-tune a model of your choice, currently Llama 2 or GPT-3.5 Turbo.
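As a rough sketch of that example-generation step, the loop below asks GPT-4 for prompt/response pairs and parses them into training examples. The system prompt, the `PROMPT:`/`RESPONSE:` delimiters, and the helper names are my own illustration (using the pre-1.0 `openai` Python library), not gpt-llm-trainer's actual code:

```python
def parse_example(text):
    """Split a 'PROMPT: ... RESPONSE: ...' reply into a training pair."""
    prompt_part, response_part = text.split("RESPONSE:", 1)
    return {
        "prompt": prompt_part.replace("PROMPT:", "", 1).strip(),
        "response": response_part.strip(),
    }

def generate_examples(task_description, n_examples=5, temperature=0.4):
    """Ask GPT-4 to invent training pairs for the described task."""
    import openai  # pre-1.0 API; imported lazily so parse_example has no dependencies
    system = (
        "You generate training data. For the task below, produce one "
        "example formatted as:\nPROMPT: ...\nRESPONSE: ..."
    )
    examples = []
    for _ in range(n_examples):
        reply = openai.ChatCompletion.create(
            model="gpt-4",
            temperature=temperature,
            messages=[
                {"role": "system", "content": system},
                {"role": "user", "content": task_description},
            ],
        )["choices"][0]["message"]["content"]
        examples.append(parse_example(reply))
    return examples
```

The resulting list of prompt/response dictionaries is what gets formatted into a fine-tuning dataset for the student model.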

It’s important to note that model distillation isn’t a one-size-fits-all solution for fine-tuning LLMs. In many instances, you may still need to undergo the laborious process of manually curating your own data. Yet, model distillation proves particularly effective in scenarios where the teacher model surpasses the student in performance.

To determine if distillation is the right approach for your task, you can refer to benchmark performance reports or conduct your own empirical study of the teacher and student models. This will help you make an informed decision and optimize the fine-tuning process.

LLM model distillation

How to use gpt-llm-trainer

You can access the GitHub page for gpt-llm-trainer here. Matt has also prepared two Google Colab notebooks, one for GPT-3.5 Turbo and another for Llama 2, which make it easy to run them without setting up your own Python environment.

To use the gpt-llm-trainer tool, you’ll first need an OpenAI account and a valid API key. This key should be entered in the notebook, where it states “YOUR KEY HERE.” 

In the first cell of the notebook, you’ll input the description of your task, the number of examples you want, and the temperature, which adjusts the model’s creativity level. The next steps are straightforward: run the cells sequentially to generate examples and train the model.
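That first cell boils down to a few settings along these lines. The variable names and values below are assumptions based on the notebook's layout, so verify against your copy before running:

```python
# Hypothetical first-cell settings mirroring the notebook's inputs.
prompt = (
    "A model that takes in a product description and generates "
    "a catchy one-line tagline."
)
temperature = 0.4        # lower values give more consistent examples
number_of_examples = 50  # start small, review quality, then scale up
```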

If you’re using the Llama 2 notebook, the resulting model will be saved to your Google Drive. If you’re using the GPT-3.5 notebook, the model will be stored in your OpenAI account. 

It’s crucial to note that OpenAI’s terms of service prohibit the use of its LLMs to train models for competing products. This means you can’t use models fine-tuned with gpt-llm-trainer for commercial purposes. But you can easily use it to create your own writing or coding assistant or some other tool for your personal daily use.

Also note that the data generation and training process can be time-consuming, depending on the number of examples you wish to generate and fine-tune the model on. As the examples are generated with GPT-4, it’s important to monitor the training costs. You can generate a small batch of around 50 short training examples for less than a dollar. However, if you’re planning to generate a large dataset, be cautious about your costs. You can start by generating a small batch of examples and then assess their quality and adjust your instruction if needed before proceeding to create the entire dataset.
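As a back-of-the-envelope check before generating a large dataset, you can estimate the GPT-4 bill from expected token counts. The per-call token counts and the 2023-era gpt-4 (8K) prices of $0.03/1K input and $0.06/1K output tokens are assumptions; check OpenAI's current pricing page:

```python
def estimate_generation_cost(n_examples,
                             input_tokens_per_call=300,
                             output_tokens_per_call=150,
                             input_price_per_1k=0.03,
                             output_price_per_1k=0.06):
    """Rough cost of generating n_examples, one GPT-4 call per example."""
    input_cost = n_examples * input_tokens_per_call / 1000 * input_price_per_1k
    output_cost = n_examples * output_tokens_per_call / 1000 * output_price_per_1k
    return input_cost + output_cost

print(f"${estimate_generation_cost(50):.2f}")  # roughly $0.90 for 50 short examples
```

Scaling the same assumptions to 1,000 examples puts the estimate around $18, which is why generating a small pilot batch first is worthwhile.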

For those using the Llama 2 notebook, gpt-llm-trainer will default to fine-tuning the “NousResearch/llama-2-7b-chat-hf” model, which is accessible without filling out an access request form. If you wish to fine-tune the original Meta Llama 2, you’ll need to modify the code and provide your Hugging Face key. Also, remember that the fine-tuning will be performed on your Colab instance’s GPU, so ensure your runtime is configured to use one.
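Swapping base models is essentially a one-line change plus authentication. The `model_name` variable below is illustrative (check the actual variable name in the notebook), and the token value is a placeholder:

```python
model_name = "NousResearch/llama-2-7b-chat-hf"  # default: open weights, no access form

# To fine-tune Meta's original weights instead, authenticate first with
# your Hugging Face key, then point at the gated repo:
# from huggingface_hub import login
# login(token="hf_your_token_here")             # placeholder token
# model_name = "meta-llama/Llama-2-7b-chat-hf"  # gated: requires Meta's approval
```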

Improving gpt-llm-trainer

While gpt-llm-trainer is a powerful tool, its interface, based on Google Colab, is not the most user-friendly, given that Colab is typically not designed for production environments. 

Moreover, there are several features that could enhance the tool’s usability. For example, the generated training examples are not persisted and will be discarded once your Colab session ends. However, these examples are stored in a Pandas DataFrame during the session, and with a bit of coding, you can export them to a CSV file for future use.
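A minimal sketch of that export, assuming the notebook keeps the examples in a DataFrame; the column names and the stand-in data below are my own (inspect the DataFrame in your session for the real variable name and schema):

```python
import pandas as pd

# Stand-in for the DataFrame the notebook builds during generation.
df = pd.DataFrame({
    "prompt": ["Name a fruit", "Name a color"],
    "response": ["Mango", "Teal"],
})

# Persist the examples before the Colab session ends...
df.to_csv("training_examples.csv", index=False)

# ...and reload them later to bootstrap a new run.
restored = pd.read_csv("training_examples.csv")
```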

An intriguing idea to consider, which I may delve into soon, is porting gpt-llm-trainer to Streamlit. This would provide a more user-friendly interface for fine-tuning LLMs, allow for bootstrapping with your own training examples, and enable the storage of generated examples for later use. gpt-llm-trainer is a great starting point for LLM distillation, but there are many ways you can improve it. I’m excited to see what you do with it.

