Kaggle Granite4.0
To run this, press "Runtime" and press "Run all" on a free Tesla T4 Google Colab instance!
To install Unsloth on your local device, follow our guide. This notebook is licensed LGPL-3.0.
You will learn how to do data prep, how to train, how to run the model, & how to save it
News
Train MoEs - DeepSeek, GLM, Qwen and gpt-oss 12x faster with 35% less VRAM. Blog
You can now train embedding models 1.8-3.3x faster with 20% less VRAM. Blog
Ultra Long-Context Reinforcement Learning is here with 7x more context windows! Blog
3x faster LLM training with 30% less VRAM and 500K context. 3x faster • 500K Context
New in Reinforcement Learning: FP8 RL • Vision RL • Standby • gpt-oss RL
Visit our docs for all our model uploads and notebooks.
Installation
Unsloth
==((====))== Unsloth 2025.9.11: Fast Granitemoehybrid patching. Transformers: 4.55.4. \\ /| Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux. O^O/ \_/ \ Torch: 2.8.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.4.0 \ / Bfloat16 = FALSE. FA [Xformers = 0.0.32.post2. FA2 = False] "-____-" Free license: http://github.com/unslothai/unsloth Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored! Unsloth: QLoRA and full finetuning all not selected. Switching to 16bit LoRA.
The fast path for GraniteMoeHybrid will be used when running the model on a GPU
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
We now add LoRA adapters so we only need to update a small amount of parameters!
Unsloth: Making `model.base_model.model.model` require gradients
Data Prep
📄 Using Google Sheets as Training Data
Our goal is to create a customer support bot that proactively helps and solves issues.
We’re storing examples in a Google Sheet with two columns:
- Snippet: A short customer support interaction
- Recommendation: A suggestion for how the agent should respond
This keeps things simple and collaborative. Anyone can edit the sheet, no database setup required.
🔍 Why This Format?
This setup works well for tasks like:
Input snippet → Suggested replyPrompt → RewriteBug report → DiagnosisText → Label or Category
Just collect examples in a spreadsheet, and you’ve got usable training data.
✅ What You'll Learn
We’ll show how to:
- Load the Google Sheet into your notebook
- Format it into a dataset
- Use it to train or prompt an LLM
The chat template for granite-4 looks like this:
<|start_of_role|>system<|end_of_role|>Knowledge Cutoff Date: April 2024.
Today's Date: June 24, 2025.
You are Granite, developed by IBM. You are a helpful AI assistant.<|end_of_text|>
<|start_of_role|>user<|end_of_role|>How do astronomers determine the original wavelength of light emitted by a celestial body at rest, which is necessary for measuring its speed using the Doppler effect?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>Astronomers make use of the unique spectral fingerprints of elements found in stars...<|end_of_text|>
Downloading data: 0%| | 0.00/1.04M [00:00<?, ?B/s]
Generating train split: 0 examples [00:00, ? examples/s]
We've just loaded the Google Sheet as a csv style Dataset, but we still need to format it into conversational style like below and then apply the chat template.
{"role": "system", "content": "You are an assistant"}
{"role": "user", "content": "What is 2+2?"}
{"role": "assistant", "content": "It's 4."}
We'll use a helper function formatting_prompts_func to do both!
Map: 0%| | 0/504 [00:00<?, ? examples/s]
We now look at the raw input data before formatting.
'User: I\'m getting an error when trying to log in. \nAgent: What error message are you seeing? \nUser: It says "Invalid credentials" even though I\'m sure my password is correct. \nAgent: Have you tried clearing your browser cache? \nUser: Yes, I cleared it already. \nAgent: Let me check your account status. \nUser: I\'ve been using this account for months without issues. \nAgent: I found no issues with your account. \nUser: Maybe there\'s a problem with the login server? \nAgent: Let\'s try resetting your password. \nUser: I just did that, and it\'s not working either. \nAgent: I\'ll need to escalate this to our engineering team. \nUser: Okay, what should I do in the meantime? \nAgent: Try using a different browser or device. \nUser: I\'ll try Chrome on my laptop. \nAgent: Let me know if that resolves the issue.'
'#### Analysis\nThe user is experiencing persistent login issues ("Invalid credentials", password reset failure) despite clearing cache and confirming correct credentials. No account anomalies were detected by the agent. The root cause remains unresolved and potentially related to server-side authentication or user-specific credential handling.\n\n#### Recommendation\n- Step 1: Confirm if using Chrome on the laptop resolved the login issue. (User action)\n- Step 2: If Step 1 was successful, no further immediate action needed. If not, proceed to escalate based on user feedback.\n- *Next best action for the agent*: Report back to the user whether using Chrome on their laptop confirmed or failed to resolve the issue.' And we see how the chat template transformed these conversations.
'<|start_of_role|>user<|end_of_role|>User: I\'m getting an error when trying to log in. \nAgent: What error message are you seeing? \nUser: It says "Invalid credentials" even though I\'m sure my password is correct. \nAgent: Have you tried clearing your browser cache? \nUser: Yes, I cleared it already. \nAgent: Let me check your account status. \nUser: I\'ve been using this account for months without issues. \nAgent: I found no issues with your account. \nUser: Maybe there\'s a problem with the login server? \nAgent: Let\'s try resetting your password. \nUser: I just did that, and it\'s not working either. \nAgent: I\'ll need to escalate this to our engineering team. \nUser: Okay, what should I do in the meantime? \nAgent: Try using a different browser or device. \nUser: I\'ll try Chrome on my laptop. \nAgent: Let me know if that resolves the issue.<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|>#### Analysis\nThe user is experiencing persistent login issues ("Invalid credentials", password reset failure) despite clearing cache and confirming correct credentials. No account anomalies were detected by the agent. The root cause remains unresolved and potentially related to server-side authentication or user-specific credential handling.\n\n#### Recommendation\n- Step 1: Confirm if using Chrome on the laptop resolved the login issue. (User action)\n- Step 2: If Step 1 was successful, no further immediate action needed. If not, proceed to escalate based on user feedback.\n- *Next best action for the agent*: Report back to the user whether using Chrome on their laptop confirmed or failed to resolve the issue.<|end_of_text|>\n' Unsloth: We found double BOS tokens - we shall remove one automatically.
Unsloth: Tokenizing ["text"] (num_proc=6): 0%| | 0/504 [00:00<?, ? examples/s]
We also use Unsloth's train_on_completions method to only train on the assistant outputs and ignore the loss on the user's inputs. This helps increase accuracy of finetunes!
Map (num_proc=2): 0%| | 0/504 [00:00<?, ? examples/s]
Let's verify masking the instruction part is done! Let's print the 100th row again.
'<|start_of_role|>user<|end_of_role|>User: My account is locked. I tried to log in but got an error message saying "Too many failed attempts".\n\nAgent: Can you please try logging in again and enter the security code sent to your email? That should unlock your account temporarily.\n\nUser: I did that already. I received the code and entered it, but my account is still locked.\n\nAgent: I see. Have you tried resetting your password via the \'Forgot Password\' link?\n\nUser: Yes, I clicked on that. It sent me an email with a reset link, but when I tried to reset my password, I got an error message saying "Invalid request".\n\nAgent: Okay, I can\'t access your account to check directly. Could you please provide me with your account ID or email address associated with the account?\n\nUser: My email is user@example.com. Account ID is 123456789.\n\nAgent: Thank you. I\'m looking into this. It seems there might be an issue with the account lockout mechanism or the password reset process. I\'ll need to contact our security team to manually unlock your account and investigate further. I\'ll keep you updated on the progress.<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|>#### Analysis\nThe user is experiencing account lockout due to multiple failed login attempts, and standard troubleshooting steps like password reset and security code entry are failing, indicating a potential issue with the account lockout mechanism or password recovery system.\n\n#### Recommendation\n- Step 1: Attempt to reset the password using the \'Forgot Password\' link and provide the error details received.\n- Step 2: Contact support with the account ID/email and request manual account unlock and investigation.\n- *Next best action for the agent*: Instruct the user to contact support immediately, providing their account details for manual intervention and further investigation.<|end_of_text|>\n'
Now let's print the masked out example - you should see only the answer is present:
" #### Analysis\nThe user is experiencing account lockout due to multiple failed login attempts, and standard troubleshooting steps like password reset and security code entry are failing, indicating a potential issue with the account lockout mechanism or password recovery system.\n\n#### Recommendation\n- Step 1: Attempt to reset the password using the 'Forgot Password' link and provide the error details received.\n- Step 2: Contact support with the account ID/email and request manual account unlock and investigation.\n- *Next best action for the agent*: Instruct the user to contact support immediately, providing their account details for manual intervention and further investigation.<|end_of_text|>\n"
GPU = Tesla T4. Max memory = 14.741 GB. 6.059 GB of memory reserved.
Let's train the model! To resume a training run, set trainer.train(resume_from_checkpoint = True)
Notice you might have to wait ~10 minutes for the Mamba kernels to compile! Please be patient!
==((====))== Unsloth - 2x faster free finetuning | Num GPUs used = 1 \\ /| Num examples = 504 | Num Epochs = 1 | Total steps = 60 O^O/ \_/ \ Batch size per device = 2 | Gradient accumulation steps = 4 \ / Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8 "-____-" Trainable parameters = 1,703,936 of 3,193,100,032 (0.05% trained)
Unsloth: Will smartly offload gradients to save VRAM!
954.0727 seconds used for training. 15.9 minutes used for training. Peak reserved memory = 10.42 GB. Peak reserved memory for training = 4.361 GB. Peak reserved memory % of max memory = 70.687 %. Peak reserved memory for training % of max memory = 29.584 %.
<|start_of_role|>user<|end_of_role|> User: Everyone in my meeting just sees a black screen when I share. Agent: Sorry about that—are you sharing a window or your entire screen? User: Entire screen on macOS Sonoma. Agent: Thanks. Do you have “Enable hardware acceleration” toggled on in Settings → Video? User: Yeah, that switch is on. Agent: Could you try toggling it off and start a quick test share? User: Did that—still black for attendees. Agent: Understood. Are you on the desktop app v5.4.2 or the browser client? User: Desktop v5.4.2—just updated this morning. <|end_of_text|> <|start_of_role|>assistant<|end_of_role|>#### Analysis - The user reports a black screen for attendees when sharing the entire screen on macOS Sonoma. - The user has "Enable hardware acceleration" enabled in Settings → Video. - The user is on the desktop app v5.4.2, which was updated this morning. #### Recommendation - Step 1: Ask the user to restart the desktop app and try sharing again. - Step 2: If the issue persists, ask the user to uninstall the desktop app, restart their computer, and reinstall the latest version of the desktop app. - *Next best action for the agent*: Ask the user to perform the restart and reinstall steps and confirm if the issue is resolved.<|end_of_text|>
<|start_of_role|>user<|end_of_role|> User: I changed the batteries, but the lock app still says 5 % and won’t auto-lock. Agent: Let’s check firmware. In the app, go to Settings → Device Info—what version shows? User: 3.18.0-alpha. Agent: Latest stable is 3.17.5. Did you enroll in the beta program? User: I might have months ago. Agent: Beta builds sometimes misreport battery. Remove one battery, wait ten seconds, reinsert, and watch the LED pattern. User: LED blinks blue twice, then red once. Agent: That blink code means “config mismatch.” Do you still have the old batteries handy? User: Tossed them already. <|end_of_text|> <|start_of_role|>assistant<|end_of_role|>#### Analysis The user's lock app shows 5% battery and fails to auto-lock, despite recent battery replacement. The firmware version is 3.18.0-alpha, which is outdated. The LED blink pattern indicates a configuration mismatch, likely due to the user's previous beta enrollment. #### Recommendation 1. **Verify firmware version** in the app's Device Info section. 2. **Check for beta enrollment** history. 3. **Remove one battery**, wait ten seconds, then reinsert it. 4. **Observe the LED pattern** for confirmation of the configuration mismatch error. 5. **Provide the user with the correct firmware version** (3.17.5) and instructions to update. 6. **Ask the user to re-enroll** in the beta program if applicable, or confirm they have the latest stable firmware.<|end_of_text|>
('lora_model/tokenizer_config.json',
, 'lora_model/special_tokens_map.json',
, 'lora_model/chat_template.jinja',
, 'lora_model/vocab.json',
, 'lora_model/merges.txt',
, 'lora_model/added_tokens.json',
, 'lora_model/tokenizer.json') Now if you want to load the LoRA adapters we just saved for inference, set False to True:
Saving to float16 for VLLM
We also support saving to float16 directly. Select merged_16bit for float16 or merged_4bit for int4. We also allow lora adapters as a fallback. Use push_to_hub_merged to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens. See our docs for more deployment options.
GGUF / llama.cpp Conversion
To save to GGUF / llama.cpp, we support it natively now! We clone llama.cpp and we default save it to q8_0. We allow all methods like q4_k_m. Use save_pretrained_gguf for local saving and push_to_hub_gguf for uploading to HF.
Some supported quant methods (full list on our docs page):
q8_0- Fast conversion. High resource use, but generally acceptable.q4_k_m- Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K.q5_k_m- Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K.
[NEW] To finetune and auto export to Ollama, try our Ollama notebook
Likewise, if you want to instead push to GGUF to your Hugging Face account, set if False to if True and add your Hugging Face token and upload location!
Now, use the granite_finetune.Q8_0.gguf file or granite_finetune.Q4_K_M.gguf file in llama.cpp.
And we're done! If you have any questions on Unsloth, we have a Discord channel! If you find any bugs or want to keep updated with the latest LLM stuff, or need help, join projects etc, feel free to join our Discord!
Some other resources:
- Train your own reasoning model - Llama GRPO notebook Free Colab
- Saving finetunes to Ollama. Free notebook
- Llama 3.2 Vision finetuning - Radiography use case. Free Colab
- See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our documentation!
This notebook and all Unsloth notebooks are licensed LGPL-3.0.



