Synthetic Data Hackathon

Synthetic Data Generation and Unsloth Tutorial

Synthetic Data Generation

In this section, we use the CLI from synthetic-data-kit to generate datasets.

Converting to Fine-Tuning Format

This command uses the save-as function to convert curated Q&A pairs to fine-tuning format:

  • Reads curated JSON files from data/curated/
  • Converts to format ft (fine-tuning format with messages structure)
  • Outputs are saved to data/final/ with proper conversation format
  • The resulting format is compatible with standard fine-tuning pipelines
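The conversion step can be sketched in plain Python (the curated-file schema and field names here are illustrative assumptions, not synthetic-data-kit's exact internal layout):

```python
import json

def to_ft_format(qa_pairs):
    """Convert curated Q&A pairs into the messages-style fine-tuning
    format: one {"conversations": [...]} record per pair."""
    return [
        {
            "conversations": [
                {"role": "user", "content": pair["question"]},
                {"role": "assistant", "content": pair["answer"]},
            ]
        }
        for pair in qa_pairs
    ]

# Illustrative curated input (schema assumed for this sketch)
curated = [
    {"question": "Alex says, 'Ben is a knave.' Who is the knight?",
     "answer": "Ben is the knight; Alex is the knave."},
]
print(json.dumps(to_ft_format(curated), indent=2))
```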

Successfully converted 2 files to fine-tuning format.

[1]

🎯 Total conversations: 29
{
  "conversations": [
    {
      "content": "A person is approached by two islanders, Alex and Ben. Alex says, 'Ben is a knave.' Ben says, 'Alex is a knight.' Who is a knight and who is a knave?",
      "role": "user"
    },
    {
      "content": "First, let's assume Alex is a knight. Then, according to Alex's statement, Ben must be a knave. As a knave, Ben's statement 'Alex is a knight' must be false, which creates a contradiction because we assumed Alex is a knight. Therefore, our initial assumption is wrong, and Alex must be a knave. Since Alex is a knave, his statement 'Ben is a knave' is false, meaning Ben is actually a knight. Thus, Alex is a knave, and Ben is a knight.",
      "role": "assistant"
    }
  ]
}

Loading and Converting Data to HuggingFace Dataset

This cell performs comprehensive data processing:

  1. Finding Files: Locates all JSON files in data/final/ directory
  2. Loading Data: Reads each JSON file containing fine-tuning formatted data
  3. Format Conversion: Extracts user and assistant messages from the fine-tuning format
  4. Structuring Conversations: Creates a standardized conversation format with role-content pairs
  5. Creating Dataset: Converts the processed data into a HuggingFace Dataset object

The output shows 29 total conversations were successfully loaded and formatted. The preview displays a sample conversation: a knights-and-knaves logic puzzle with its solution.
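The loading steps above can be sketched as follows (the per-file schema is assumed from the sample output; the demo uses a temporary directory as a stand-in for data/final/):

```python
import glob
import json
import os
import tempfile

def load_conversations(final_dir):
    """Collect every conversation from the JSON files in `final_dir`
    into a flat list of {"conversations": [...]} records."""
    records = []
    for path in sorted(glob.glob(os.path.join(final_dir, "*.json"))):
        with open(path) as f:
            data = json.load(f)
        records.append({"conversations": data["conversations"]})
    return records

# Self-contained demo with a temporary stand-in for data/final/
with tempfile.TemporaryDirectory() as d:
    sample = {"conversations": [
        {"role": "user", "content": "Who is the knight?"},
        {"role": "assistant", "content": "Ben is the knight."},
    ]}
    with open(os.path.join(d, "sample.json"), "w") as f:
        json.dump(sample, f)
    records = load_conversations(d)
    # Dataset.from_list(records) would then wrap these as a
    # HuggingFace Dataset object.
    print(len(records))
```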

Fine-Tuning

Note: Please remember to shut down the vLLM instance!

See https://unsloth.ai/docs/new/unsloth-amd-pytorch-synthetic-data-hackathon#how-do-i-free-amd-gpu-memory

[2]

Importing Standard Libraries

Imports essential Python libraries for fine-tuning:

  • os, json, glob: File system operations and JSON handling
  • torch: PyTorch deep learning framework
  • shutil: File operations
  • Path: Path manipulation
  • Dataset: HuggingFace datasets library for data handling
[3]
🦄 Unsloth: Will patch your computer to enable 2x faster free finetuning.
#### Unsloth: `hf_xet==1.1.10` and `ipykernel>6.30.1` breaks progress bars. Disabling for now in XET.
#### Unsloth: To re-enable progress bars, please downgrade to `ipykernel==6.30.1` or wait for a fix to
https://github.com/huggingface/xet-core/issues/526
INFO 10-19 03:15:01 [__init__.py:216] Automatically detected platform rocm.
🦄 Unsloth Zoo will now patch everything to make training faster!

Importing Unsloth and Training Libraries

Imports specialized libraries for efficient fine-tuning:

  • FastLanguageModel from Unsloth: Optimized model loading and training
  • get_chat_template, standardize_sharegpt, train_on_responses_only: Chat formatting utilities
  • SFTConfig, SFTTrainer: Supervised fine-tuning configuration and trainer from TRL
  • DataCollatorForSeq2Seq: Handles batching and padding for sequence-to-sequence training

Set up the Unsloth model and tokenizer for ROCm without bitsandbytes

[4]
Unsloth: AMD currently is not stable with 4bit bitsandbytes. Disabling for now.
Unsloth: WARNING `trust_remote_code` is True.
Are you certain you want to do remote code execution?
==((====))==  Unsloth 2025.10.6: Fast Llama patching. Transformers: 4.56.2. vLLM: 0.11.0+rocm631.
   \\   /|    AMD Instinct MI300X VF. Num GPUs = 1. Max memory: 191.688 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.8.0+rocm6.4. ROCm Toolkit: 6.4.43482-0f2d60242. Triton: 3.4.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.32.post2. FA2 = True]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
`torch_dtype` is deprecated! Use `dtype` instead!
INFO:accelerate.utils.modeling: We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
Loading checkpoint shards:   0%|          | 0/30 [00:00<?, ?it/s]
✅ Loaded: Llama-3.3-70B-Instruct (bfloat16, ROCm compatible)
Unsloth 2025.10.6 patched 80 layers with 80 QKV layers, 80 O layers and 80 MLP layers.

Loading Llama-3.3-70B Model with LoRA

This cell sets up the model for efficient fine-tuning on AMD ROCm hardware:

Model Configuration:

  • Model: Llama-3.3-70B-Instruct (70 billion parameters)
  • Data type: bfloat16 for ROCm compatibility
  • No quantization (load_in_4bit=False) to avoid bitsandbytes dependency
  • Max sequence length: 1024 tokens

LoRA (Low-Rank Adaptation) Configuration:

  • Rank (r): 64 - Higher rank for the large 70B model
  • Target modules: All attention and MLP layers (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
  • LoRA alpha: 64
  • Dropout: 0 (no dropout)
  • Gradient checkpointing: "unsloth" for memory efficiency

LoRA enables efficient fine-tuning by only training small adapter layers instead of the entire 70B model, making it feasible to train on a single AMD MI300X GPU with 192GB HBM3 memory.
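The "1.16% trained" figure in the training log can be reproduced by hand: a LoRA adapter on a linear layer of shape (in, out) adds r·(in + out) trainable weights (the A and B matrices). Using Llama-3.3-70B's published dimensions:

```python
r = 64          # LoRA rank used in this notebook
hidden = 8192   # Llama-3.3-70B hidden size
kv = 1024       # k/v projection width (8 KV heads x 128 head dim)
mlp = 28672     # MLP intermediate size
layers = 80

def lora_params(in_dim, out_dim, rank=r):
    # A (in x r) plus B (r x out): rank * (in + out) weights
    return rank * (in_dim + out_dim)

per_layer = (
    lora_params(hidden, hidden)    # q_proj
    + lora_params(hidden, kv)      # k_proj
    + lora_params(hidden, kv)      # v_proj
    + lora_params(hidden, hidden)  # o_proj
    + lora_params(hidden, mlp)     # gate_proj
    + lora_params(hidden, mlp)     # up_proj
    + lora_params(mlp, hidden)     # down_proj
)
total = per_layer * layers
print(total)  # 828375040 -- matches "Trainable parameters" in the log
```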

[5]
🔧 Preparing dataset for training...
Unsloth: Standardizing formats (num_proc=20):   0%|          | 0/29 [00:00<?, ? examples/s]
Map:   0%|          | 0/29 [00:00<?, ? examples/s]
Filter:   0%|          | 0/29 [00:00<?, ? examples/s]
✅ Prepared 29 valid examples for training
📝 Sample formatted text:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 26 July 2024

<|eot_id|><|start_header_id|>user<|end_header_id|>

A person is approached ...

Preparing Dataset with Chat Template

This cell formats the dataset for fine-tuning:

Steps:

  1. Set Chat Template: Applies Llama-3.1 chat template formatting
  2. Configure Padding: Sets pad token to eos token if not already set
  3. Format Conversations: The formatting_prompts_func function:
    • Takes raw conversations from the dataset
    • Applies the chat template to format them properly
    • Validates conversation structure (list of dicts with role/content)
    • Filters out malformed conversations
  4. Standardize Format: Uses standardize_sharegpt to normalize the data structure
  5. Apply Formatting: Maps the formatting function across all examples
  6. Remove Empty: Filters out any empty or invalid formatted texts

The output shows 29 valid examples were successfully prepared. A sample of the formatted text is displayed, showing the proper Llama-3.1 chat template structure with system, user, and assistant headers.
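A stripped-down version of the formatting step can be written by hand. This is only an illustration of the header structure visible in the sample output; in the notebook, get_chat_template configures the tokenizer to do this (including the system block and date headers this sketch omits):

```python
def apply_llama31_template(conversation):
    """Render a role/content conversation with Llama-3.1 header
    tokens. Minimal sketch: no system block, no date headers."""
    parts = ["<|begin_of_text|>"]
    for turn in conversation:
        parts.append(
            f"<|start_header_id|>{turn['role']}<|end_header_id|>\n\n"
            f"{turn['content']}<|eot_id|>"
        )
    return "".join(parts)

convo = [
    {"role": "user", "content": "Who is the knight?"},
    {"role": "assistant", "content": "Ben is the knight."},
]
text = apply_llama31_template(convo)
print(text)
```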

[6]
Unsloth: Tokenizing ["text"] (num_proc=24):   0%|          | 0/29 [00:00<?, ? examples/s]
Map (num_proc=24):   0%|          | 0/29 [00:00<?, ? examples/s]
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 29 | Num Epochs = 1 | Total steps = 1
O^O/ \_/ \    Batch size per device = 64 | Gradient accumulation steps = 1
\        /    Data Parallel GPUs = 1 | Total batch size (64 x 1 x 1) = 64
 "-____-"     Trainable parameters = 828,375,040 of 71,382,081,536 (1.16% trained)
Unsloth: Will smartly offload gradients to save VRAM!

Training the Model with ROCm-Optimized Settings

This cell configures and executes the fine-tuning process:

Training Configuration (SFTConfig):

  • Batch size: 64 per device - leveraging the AMD MI300X's massive 192GB HBM3 memory
  • Gradient accumulation: 1 step
  • Warmup: 5 steps
  • Epochs: 1 full pass through the dataset
  • Learning rate: 1e-4
  • Optimizer: adamw_8bit for memory efficiency
  • Precision: bf16 (bfloat16) for ROCm
  • Gradient checkpointing: Enabled for memory efficiency

Special Training Mode: Uses train_on_responses_only to compute loss only on the assistant's responses, not on the user's questions. This focuses the model on learning to generate accurate answers rather than memorizing the input format.
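The effect of train_on_responses_only can be sketched on a label sequence: every position up to and including the assistant header gets the loss-ignore index -100, so only response tokens contribute to the loss. This toy version operates on string tokens rather than real token ids:

```python
IGNORE = -100
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>\n\n"

def mask_prompt_labels(tokens, labels):
    """Set labels to IGNORE for every position before the assistant
    response begins. Toy sketch over string tokens, not token ids."""
    masked = list(labels)
    try:
        start = tokens.index(ASSISTANT_HEADER) + 1
    except ValueError:
        start = len(tokens)  # no assistant turn: mask everything
    for i in range(start):
        masked[i] = IGNORE
    return masked

tokens = [
    "<|begin_of_text|>",
    "<|start_header_id|>user<|end_header_id|>\n\n",
    "Who is the knight?", "<|eot_id|>",
    ASSISTANT_HEADER,
    "Ben", " is", " the", " knight.", "<|eot_id|>",
]
labels = list(range(len(tokens)))  # stand-in label ids
print(mask_prompt_labels(tokens, labels))
```

Only the five response positions keep their labels; the prompt contributes nothing to the gradient.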

Key Features:

  • DataCollatorForSeq2Seq handles variable-length sequences with proper padding
  • No packing to preserve conversation structure
  • Single dataloader worker for ROCm stability
  • Gradient checkpointing via Unsloth for memory optimization

The model is then trained on the 29 logical reasoning conversations.

[ ]

💾 SAVING ROCM-TRAINED MODEL
✅ LoRA adapters saved to: logical_reasoning_rocm_lora
🔄 Saving merged model...
Found HuggingFace hub cache directory: /root/.cache/huggingface/hub
Checking cache directory for required files...
Unsloth: Copying 30 files from cache to `logical_reasoning_rocm_merged`: 100%|██████████| 30/30 [01:05<00:00,  2.18s/it]
Successfully copied all 30 files from cache to `logical_reasoning_rocm_merged`
Checking cache directory for required files...
Cache check failed: tokenizer.model not found in local cache.
Not all required files found in cache. Will proceed with downloading.
Unsloth: Preparing safetensor model files: 100%|██████████| 30/30 [00:00<00:00, 626015.52it/s]
Unsloth: Merging weights into 16bit:  10%|█         | 3/30 [00:24<03:35,  7.98s/it]

Saving the Fine-Tuned Model

This cell saves the trained model in two formats:

  1. LoRA Adapters (logical_reasoning_rocm_lora/):

    • Saves only the trained LoRA adapter weights (lightweight, ~few hundred MB)
    • Can be loaded later with the base model
    • Useful for sharing or deploying with the original base model
  2. Merged Model (logical_reasoning_rocm_merged/):

    • Merges LoRA adapters back into the base model
    • Creates a standalone model with all weights
    • Saved in 16-bit precision for better quality
    • Ready for immediate inference without loading adapters

Both formats include the tokenizer configuration. The merged model is production-ready and can be used directly for generating answers to logical reasoning questions.

And we're done! If you have any questions about Unsloth, find a bug, want to keep up with the latest LLM news, or need help joining projects, feel free to join our Discord!

Some other resources:

  1. Train your own reasoning model - Llama GRPO notebook Free Colab
  2. Saving finetunes to Ollama. Free notebook
  3. Llama 3.2 Vision finetuning - Radiography use case. Free Colab
  4. See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our documentation!

Join Discord if you need help + ⭐️ Star us on GitHub ⭐️

This notebook and all Unsloth notebooks are licensed LGPL-3.0