Using LoRA with Granite 3.3

IBM just announced Granite 3.3 with lots of new features!

In this post I am going to focus on the LoRA adapter feature and how to use it.

LoRA stands for “Low-Rank Adaptation”, a way to fine-tune LLMs quickly and cheaply while mitigating the “model collapse” that can happen with normal transfer learning. It does this by freezing the base model’s weights and training only a small set of low-rank matrices injected into selected layers.

Because so few parameters are trainable, you can actually train an adapter on your laptop and get decent results in a relatively short amount of time.
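To make that concrete, here is a minimal sketch of the idea using Hugging Face’s peft library. The rank, alpha, and target modules below are illustrative assumptions, not the exact values from my notebooks; you would tune them for your hardware and data.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model; its weights stay frozen during LoRA training.
model = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-3.3-8b-instruct")

# Illustrative LoRA settings: small low-rank matrices are injected into
# the attention projections, and only those new weights are trained.
config = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # layers to adapt (assumption)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```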

For testing this I used the granite3.3-8b-instruct model. My laptop is a MacBook Pro with an M4 Max and 128 GB of memory. If your Apple silicon Mac has less than 40 GB of memory, you would be better off using the granite3.3-2b-instruct model instead. On a PC you will need an NVIDIA GPU with CUDA.

I used the following steps.

Notebook 1:

  1. For my dataset I took the watsonx Assistant manual [888 pages].
  2. Using docling I converted this to markdown.
  3. I tokenized the markdown and chunked the tokens into segments of 512 tokens with an overlap of 50, then saved the chunks to a JSON list file (see the sketch below).

Total Time: 17 minutes.
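For reference, here is a condensed sketch of those three steps. The file names are placeholders, and the exact chunking code in the GitHub project may differ:

```python
import json

from docling.document_converter import DocumentConverter
from transformers import AutoTokenizer

# Steps 1-2: convert the PDF manual to markdown with docling.
converter = DocumentConverter()
result = converter.convert("watsonx_assistant_manual.pdf")  # placeholder path
markdown = result.document.export_to_markdown()

# Step 3: tokenize, then slice into 512-token chunks with a 50-token overlap.
tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-3.3-8b-instruct")
tokens = tokenizer.encode(markdown)

chunks = []
step = 512 - 50  # advance by chunk size minus overlap
for start in range(0, len(tokens), step):
    chunks.append(tokenizer.decode(tokens[start : start + 512]))

with open("chunks.json", "w") as f:
    json.dump(chunks, f)
```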

Notebook 2:

Using that chunked list I trained a LoRA adapter and saved it.

Total Time: 1 hour 42 minutes.
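Here is a minimal sketch of what that training step looks like with transformers and peft, assuming the chunks.json file from notebook 1. The hyperparameters are illustrative, not the exact ones I used:

```python
import json

from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "ibm-granite/granite-3.3-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

# Load the chunked manual text from notebook 1 and tokenize it.
with open("chunks.json") as f:
    chunks = json.load(f)
dataset = Dataset.from_dict({"text": chunks}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", num_train_epochs=3,
                           per_device_train_batch_size=1, learning_rate=2e-4),
    train_dataset=dataset,
    # Causal LM collator: labels are the input ids, no masking.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Save only the adapter weights (a few megabytes, not the full 8B model).
model.save_pretrained("granite-lora-adapter")
```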

Notebook 3:

At this point the adapter was ready to go. I created a third notebook to compare the results from the base model and the model with the LoRA adapter attached.
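The comparison itself is straightforward: ask the base model first, then attach the adapter and ask again. A sketch, assuming the adapter directory saved in notebook 2:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3.3-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
base = AutoModelForCausalLM.from_pretrained(model_id)

def ask(model, question):
    # Wrap the question in the model's chat template and generate a reply.
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": question}],
        add_generation_prompt=True, return_tensors="pt",
    )
    output = model.generate(inputs, max_new_tokens=200)
    return tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True)

question = "What is the Trigger Words feature in watsonx Assistant?"

# Ask the base model first: PeftModel.from_pretrained modifies it in place.
print("Base:", ask(base, question))

# Attach the trained adapter and ask the same question again.
lora = PeftModel.from_pretrained(base, "granite-lora-adapter")
print("LoRA:", ask(lora, question))
```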


You can download the full project from GitHub if you want to reproduce the results.

The results below are as-is from the LLM (no RAG).

Results

Question

What is the “Trigger Words” feature in watsonx Assistant? [ref]

LoRA Answer

✅ The “Trigger Words” feature in watsonx Assistant is designed to detect and prevent user inputs from triggering sensitive actions or responses that might be harmful or inappropriate.

Base Model Answer

❌ The “Trigger Words” feature in watsonx Assistant allows users to specify certain words or phrases that, when present in a user’s input, will activate a predefined response or action. This enables the assistant to perform specific tasks or provide particular information in response to cues from the user’s input.

Question

How do I stop my action being used as a clarifying question? [ref]

LoRA Answer

✅ You can stop an action from being used as a clarifying question in the following ways:

  1. If you want to stop all actions from being used as clarifying questions, go to Home -> Actions -> Settings -> Clarifying questions and switch the toggle to Off.
  2. If you want to stop a specific action from being used as a clarifying question, open the action and go to Action settings > Clarifying question and switch the toggle to Off.

Base Model Answer

❌ To prevent your action from being interpreted as a clarifying question, be clear and direct in your communication. State your intentions explicitly and avoid ambiguous language. If you’re providing information, present it as a statement rather than a query. For instance, instead of saying “Isn’t it true that…?”, say “The fact is that…”.

Question

What are the response types available to use? [ref]

LoRA Answer

✅ The response types available to use are text, image, video, audio, iframe, and options. (note: correct in relation to UI)

Base Model Answer

❌ The response types available to use are:

  • Plain text
  • Numbered or bulleted lists
  • Yes/No responses
  • Short phrases or words

Note!

The results are somewhat cherry-picked; the model can still fail to give a correct answer without more context. But in conjunction with RAG, your answers are less likely to hallucinate, because the model now understands your domain language and topics.
