Ollama Chapter 3: Database and Fine-tuning

Although large language models have been trained on massive amounts of data, their knowledge still has limitations. For instance, they cannot access real-time information, and they may lack knowledge in specific domains. To address these limitations, there are two approaches: one is to give the AI additional documents or a database to search before generating a response; the other is to fine-tune the model on the specific knowledge itself.

Method 1: RAG Database

For example, I have a PDF document here. I can upload it to Ollama so that it can provide the information from this document whenever I need it.

Navigate to the Workspace mentioned earlier in the upper-left corner and click the “+” next to Documents to upload the PDF. We can also add a tag to the uploaded document; in this case, the tag is “Math.” This lets us later select specific documents by typing “#” in the dialogue box, so the model can search them and respond accurately. Even if the language model has never been trained on this data, it can still answer our questions correctly.

Method 2: Model Fine-tuning

Although using a knowledge base allows the model to provide accurate responses, there is one drawback: every time the model needs to retrieve content from the knowledge base, it increases the response time. Moreover, users must specify the particular document each time, which adds extra steps and reduces efficiency.

The second method is to have the AI internalize and integrate domain-specific knowledge as part of the model. This process, where the language model undergoes additional training based on its existing knowledge, is called “model fine-tuning.” The process of fine-tuning a model can be complex, but by following my steps, you can also train your own custom model.

Step 1: Turn your data into JSON

One of the formats commonly used for model fine-tuning is JSON. Although it may appear complex, it is just a collection of simple instruction–response pairs: when a question matching the “instruction” field is asked, the AI learns to respond with the corresponding “output” field. For this step, we can use ChatGPT to convert the format: first provide a template of the format, then paste the document data you want the model to learn, and GPT will convert it into the template format.

GPT Prompt:

"""

I want to convert my data into JSON format for fine-tuning a model. Below is an example of the JSON dataset format.

[
  {
    "instruction": "What are some effective stress management techniques?",
    "input": "",
    "output": "Effective stress management techniques include deep breathing exercises, meditation, regular physical activity, and maintaining a healthy work-life balance."
  },
  {
    "instruction": "How can I improve my public speaking skills?",
    "input": "",
    "output": "To improve public speaking skills, practice regularly, know your audience, use visual aids, focus on body language, and seek feedback from others."
  }
]

"""

Afterward, have GPT convert the content you want to train on into the format above.

First, send the above prompt to GPT-4.

Then, send your data in segments to GPT-4 for format conversion.

GPT will directly convert the training content into a specific JSON format.

We will copy the formatted data, paste it into a new text file, and then manually change the file extension to .json.

Step 2: Unsloth and Google Colab

Navigate to https://github.com/unslothai/unsloth, a tool for training and fine-tuning models. Unsloth supports many open-source language models, including popular ones like Llama 3, Mistral, and Gemma 2, all of which can be fine-tuned. We will use Llama 3 for the demonstration. First, click on the “Start for free” link for Llama 3.

After clicking on it, we will be taken to the Google Colab website. Don’t be intimidated by the complex interface; in reality, we only need to make a few changes to start fine-tuning the model. Please follow my steps carefully.

1. Click on the folder icon on the far left.

2. Right-click, select “New folder,” and give it any name you like.

3. Drag the newly created JSON file into the folder.

4. Right-click the folder and select “Copy path.”

5. Scroll down the page to find a section titled “Data Prep.” Under this title, there is a line of code for reading data. Paste the path you just copied into this line.

After pasting, it will look like this.

6. Next, scroll down to the block starting with “GGUF”. This block specifies how the fine-tuned language model is saved. Change the “False” value in this block to “True”.
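To understand what the “Data Prep” cell is doing with your file: it turns each JSON record into one training prompt string. Conceptually it works something like the sketch below, which uses a simplified Alpaca-style template; the exact wording and helper names in the Unsloth notebook may differ.

```python
# Simplified Alpaca-style template; the notebook's actual template may differ
ALPACA_TEMPLATE = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def format_record(record):
    """Render one JSON record into a single training prompt string."""
    return ALPACA_TEMPLATE.format(**record)

example = {"instruction": "What are some effective stress management techniques?",
           "input": "",
           "output": "Deep breathing, meditation, and regular exercise."}
print(format_record(example))
```

This is why the three field names in your JSON file matter: each one is slotted into a fixed position in the prompt the model actually trains on.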

Step 3: Download the model

As the download speed of the fine-tuned language model directly from this page might be slow, we can consider uploading the trained model to Google Drive first and then downloading it from there. This operation requires writing a few simple lines of code.

1. Click the Google Drive icon in the top left corner.

2. You will see a section highlighted in red with the label “import drive”.

3. Click “+Code”.

4. Paste the following code into the new section.

from google.colab import drive
drive.mount('/content/drive')  # mount Google Drive if it is not already mounted

import shutil
local = '/content/model-unsloth.Q4_K_M.gguf'  # model file saved by the notebook
google_drive = '/content/drive/My Drive/model-unsloth.Q4_K_M.gguf'  # destination in Drive
shutil.copy(local, google_drive)
print("Finished!")

In the Runtime menu above, click “Run all”. The model will start training and, once done, be copied to Google Drive. The entire process takes approximately 10 minutes, depending on the amount of training data. When everything is complete, “Finished!” is printed at the bottom.

You will see a .gguf file in your Google Drive after finishing, download this file to your computer.

Step 4: Install the model in your Ollama

Back in Ollama:

1. Go to Admin Panel

2. Select Model

3. Click “Show” next to “Experimental”

4. Upload the GGUF file

After uploading the model, you can select your trained model from the menu in the upper left corner to use it!

Through this detailed guide, we have explored how to leverage the Ollama platform to expand and customize the capabilities of large language models. We first introduced the RAG database method, which allows AI to instantly retrieve information from specific documents. Then, we delved into more advanced model fine-tuning techniques, providing a detailed explanation of each step, from data preparation to model training. Both of these methods have their own advantages and can be flexibly applied according to different needs. By implementing these techniques, you will be able to create an AI assistant that truly understands and adapts to your unique requirements. In today’s rapidly evolving landscape of AI technology, mastering these tools and skills will give you a competitive advantage in personal development or business competition.
