Synthetic Data Hackathon
Synthetic Data Generation and Unsloth Tutorial
Synthetic Data Generation
In this section, we use the CLI from synthetic-data-kit to generate datasets.
Converting to Fine-Tuning Format
This command uses the save-as function to convert curated Q&A pairs to fine-tuning format:

- Reads curated JSON files from data/curated/
- Converts to the ft format (fine-tuning format with a messages structure)
- Outputs are saved to data/final/ with proper conversation format
- The resulting format is compatible with standard fine-tuning pipelines
Successfully converted 2 files to fine-tuning format.
🎯 Total conversations: 29
{
"conversations": [
{
"content": "A person is approached by two islanders, Alex and Ben. Alex says, 'Ben is a knave.' Ben says, 'Alex is a knight.' Who is a knight and who is a knave?",
"role": "user"
},
{
"content": "First, let's assume Alex is a knight. Then, according to Alex's statement, Ben must be a knave. As a knave, Ben's statement 'Alex is a knight' must be false, which creates a contradiction because we assumed Alex is a knight. Therefore, our initial assumption is wrong, and Alex must be a knave. Since Alex is a knave, his statement 'Ben is a knave' is false, meaning Ben is actually a knight. Thus, Alex is a knave, and Ben is a knight.",
"role": "assistant"
}
]
}
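The conversion step above can be sketched in plain Python. This is a minimal illustration of the ft structure, not the actual synthetic-data-kit implementation; the `question`/`answer` keys of the curated records are an assumption for this example.

```python
def qa_pairs_to_ft(qa_pairs):
    """Turn each curated Q&A pair into a user/assistant conversation.

    Assumes each curated record has 'question' and 'answer' keys
    (hypothetical schema for illustration).
    """
    return [
        {
            "conversations": [
                {"content": pair["question"], "role": "user"},
                {"content": pair["answer"], "role": "assistant"},
            ]
        }
        for pair in qa_pairs
    ]

# One curated pair becomes one two-turn conversation.
curated = [{"question": "Who is the knight?", "answer": "Ben is the knight."}]
ft_records = qa_pairs_to_ft(curated)
```

The real CLI additionally handles batching across files and writes the results under data/final/.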
Loading and Converting Data to HuggingFace Dataset
This cell performs comprehensive data processing:
- Finding Files: Locates all JSON files in the data/final/ directory
- Loading Data: Reads each JSON file containing fine-tuning formatted data
- Format Conversion: Extracts user and assistant messages from the fine-tuning format
- Structuring Conversations: Creates a standardized conversation format with role-content pairs
- Creating Dataset: Converts the processed data into a HuggingFace Dataset object
The output shows 29 total conversations were successfully loaded and formatted. The preview displays a sample conversation: a knights-and-knaves logic puzzle with its solution.
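The loading and validation steps above can be sketched with the standard library alone. This is an illustrative reconstruction, not the notebook's exact cell; the directory layout matches the tutorial, but the validation rules are assumptions.

```python
import glob
import json

def load_ft_conversations(final_dir="data/final"):
    """Collect role/content conversations from every JSON file in final_dir."""
    conversations = []
    for path in sorted(glob.glob(f"{final_dir}/*.json")):
        with open(path) as f:
            data = json.load(f)
        # A file may hold a single conversation dict or a list of them.
        records = data if isinstance(data, list) else [data]
        for record in records:
            msgs = record.get("conversations", [])
            # Keep only well-formed message lists (dicts with role and content).
            if msgs and all(
                isinstance(m, dict) and {"role", "content"} <= m.keys()
                for m in msgs
            ):
                conversations.append({"conversations": msgs})
    return conversations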
Fine-Tuning
Note: Please remember to shut down the vLLM instance!
See https://unsloth.ai/docs/new/unsloth-amd-pytorch-synthetic-data-hackathon#how-do-i-free-amd-gpu-memory
Importing Standard Libraries
Imports essential Python libraries for fine-tuning:
- os, json, glob: File system operations and JSON handling
- torch: PyTorch deep learning framework
- shutil: File operations
- Path: Path manipulation
- Dataset: HuggingFace datasets library for data handling
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
Unsloth: `hf_xet==1.1.10` and `ipykernel>6.30.1` breaks progress bars. Disabling for now in XET.
Unsloth: To re-enable progress bars, please downgrade to `ipykernel==6.30.1` or wait for a fix to https://github.com/huggingface/xet-core/issues/526
INFO 10-19 03:15:01 [__init__.py:216] Automatically detected platform rocm.
🦥 Unsloth Zoo will now patch everything to make training faster!
Importing Unsloth and Training Libraries
Imports specialized libraries for efficient fine-tuning:
- FastLanguageModel from Unsloth: Optimized model loading and training
- get_chat_template, standardize_sharegpt, train_on_responses_only: Chat formatting utilities
- SFTConfig, SFTTrainer: Supervised fine-tuning configuration and trainer from TRL
- DataCollatorForSeq2Seq: Handles batching and padding for sequence-to-sequence training
Setup Unsloth model and tokenizer for ROCm without bitsandbytes
Unsloth: AMD currently is not stable with 4bit bitsandbytes. Disabling for now.
Unsloth: WARNING `trust_remote_code` is True. Are you certain you want to do remote code execution?
==((====))==  Unsloth 2025.10.6: Fast Llama patching. Transformers: 4.56.2. vLLM: 0.11.0+rocm631.
   \\   /|    AMD Instinct MI300X VF. Num GPUs = 1. Max memory: 191.688 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.8.0+rocm6.4. ROCm Toolkit: 6.4.43482-0f2d60242. Triton: 3.4.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.32.post2. FA2 = True]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
`torch_dtype` is deprecated! Use `dtype` instead! INFO:accelerate.utils.modeling: We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
Loading checkpoint shards: 0%| | 0/30 [00:00<?, ?it/s]
✅ Loaded: Llama-3.3-70B-Instruct (bfloat16, ROCm compatible)
Unsloth 2025.10.6 patched 80 layers with 80 QKV layers, 80 O layers and 80 MLP layers.
Loading Llama-3.3-70B Model with LoRA
This cell sets up the model for efficient fine-tuning on AMD ROCm hardware:
Model Configuration:
- Model: Llama-3.3-70B-Instruct (70 billion parameters)
- Data type: bfloat16 for ROCm compatibility
- No quantization (load_in_4bit=False) to avoid bitsandbytes dependency
- Max sequence length: 1024 tokens
LoRA (Low-Rank Adaptation) Configuration:
- Rank (r): 64 - Higher rank for the large 70B model
- Target modules: All attention and MLP layers (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
- LoRA alpha: 64
- Dropout: 0 (no dropout)
- Gradient checkpointing: "unsloth" for memory efficiency
LoRA enables efficient fine-tuning by only training small adapter layers instead of the entire 70B model, making it feasible to train on a single AMD MI300X GPU with 192GB HBM3 memory.
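The configuration described above can be sketched with Unsloth's API. This is a sketch under the stated settings; the exact `model_name` string is an assumption, and the call requires a ROCm/CUDA GPU with enough memory.

```python
import torch
from unsloth import FastLanguageModel

# Load the base model in bfloat16 -- no bitsandbytes quantization on ROCm.
# model_name is an assumed HuggingFace repo id for illustration.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.3-70B-Instruct",
    max_seq_length=1024,
    dtype=torch.bfloat16,
    load_in_4bit=False,
)

# Attach LoRA adapters: only these small low-rank matrices are trained,
# leaving the 70B base weights frozen.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,                      # higher rank for the large 70B model
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=64,
    lora_dropout=0,
    use_gradient_checkpointing="unsloth",  # Unsloth's memory-saving variant
)
```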
🔧 Preparing dataset for training...
Unsloth: Standardizing formats (num_proc=20): 0%| | 0/29 [00:00<?, ? examples/s]
Map: 0%| | 0/29 [00:00<?, ? examples/s]
Filter: 0%| | 0/29 [00:00<?, ? examples/s]
✅ Prepared 29 valid examples for training
📝 Sample formatted text: <|begin_of_text|><|start_header_id|>system<|end_header_id|> Cutting Knowledge Date: December 2023 Today Date: 26 July 2024 <|eot_id|><|start_header_id|>user<|end_header_id|> A person is approached ...
Preparing Dataset with Chat Template
This cell formats the dataset for fine-tuning:
Steps:
- Set Chat Template: Applies Llama-3.1 chat template formatting
- Configure Padding: Sets pad token to eos token if not already set
- Format Conversations: The formatting_prompts_func function:
  - Takes raw conversations from the dataset
  - Applies the chat template to format them properly
  - Validates conversation structure (list of dicts with role/content)
  - Filters out malformed conversations
- Standardize Format: Uses standardize_sharegpt to normalize the data structure
- Apply Formatting: Maps the formatting function across all examples
- Remove Empty: Filters out any empty or invalid formatted texts
The output shows 29 valid examples were successfully prepared. A sample of the formatted text is displayed, showing the proper Llama-3.1 chat template structure with system, user, and assistant headers.
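The steps above can be sketched as follows. This is an illustrative reconstruction of the formatting cell, assuming `model`, `tokenizer`, and `dataset` already exist from the earlier steps; the validation details are assumptions.

```python
from unsloth.chat_templates import get_chat_template, standardize_sharegpt

# Step 1: apply the Llama-3.1 chat template to the tokenizer.
tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")

# Step 2: fall back to the EOS token for padding if none is set.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Step 3: render each conversation to a single training string.
def formatting_prompts_func(examples):
    texts = []
    for convo in examples["conversations"]:
        # Keep only well-formed role/content message lists.
        if isinstance(convo, list) and all(
            isinstance(m, dict) and "role" in m and "content" in m
            for m in convo
        ):
            texts.append(tokenizer.apply_chat_template(
                convo, tokenize=False, add_generation_prompt=False))
        else:
            texts.append("")  # flagged for removal below
    return {"text": texts}

# Steps 4-6: standardize, format, and drop empty results.
dataset = standardize_sharegpt(dataset)
dataset = dataset.map(formatting_prompts_func, batched=True)
dataset = dataset.filter(lambda ex: len(ex["text"]) > 0)
```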
Unsloth: Tokenizing ["text"] (num_proc=24): 0%| | 0/29 [00:00<?, ? examples/s]
Map (num_proc=24): 0%| | 0/29 [00:00<?, ? examples/s]
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 29 | Num Epochs = 1 | Total steps = 1
O^O/ \_/ \    Batch size per device = 64 | Gradient accumulation steps = 1
\        /    Data Parallel GPUs = 1 | Total batch size (64 x 1 x 1) = 64
 "-____-"     Trainable parameters = 828,375,040 of 71,382,081,536 (1.16% trained)
Unsloth: Will smartly offload gradients to save VRAM!
Training the Model with ROCm-Optimized Settings
This cell configures and executes the fine-tuning process:
Training Configuration (SFTConfig):
- Batch size: 64 per device - leveraging the AMD MI300X's massive 192GB HBM3 memory
- Gradient accumulation: 1 step
- Warmup: 5 steps
- Epochs: 1 full pass through the dataset
- Learning rate: 1e-4
- Optimizer: adamw_8bit for memory efficiency
- Precision: bf16 (bfloat16) for ROCm
- Gradient checkpointing: Enabled for memory efficiency
Special Training Mode:
Uses train_on_responses_only to compute loss only on the assistant's responses, not on the user's questions. This focuses the model on learning to generate accurate answers rather than memorizing the input format.
Key Features:
- DataCollatorForSeq2Seq handles variable-length sequences with proper padding
- No packing to preserve conversation structure
- Single dataloader worker for ROCm stability
- Gradient checkpointing via Unsloth for memory optimization
The model is then trained on the 29 logical reasoning conversations.
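The training setup described above can be sketched as follows. This is a sketch under the stated hyperparameters, assuming `model`, `tokenizer`, and the formatted `dataset` from the previous steps; the instruction/response marker strings follow the Llama-3 header format.

```python
from trl import SFTConfig, SFTTrainer
from transformers import DataCollatorForSeq2Seq
from unsloth.chat_templates import train_on_responses_only

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    # Pads variable-length sequences within each batch.
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer),
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=64,   # fits in MI300X's 192GB HBM3
        gradient_accumulation_steps=1,
        warmup_steps=5,
        num_train_epochs=1,
        learning_rate=1e-4,
        optim="adamw_8bit",               # memory-efficient optimizer states
        bf16=True,                        # bfloat16 precision for ROCm
        packing=False,                    # preserve conversation structure
        dataloader_num_workers=1,         # single worker for ROCm stability
    ),
)

# Mask the user turns so loss is computed only on assistant responses.
trainer = train_on_responses_only(
    trainer,
    instruction_part="<|start_header_id|>user<|end_header_id|>\n\n",
    response_part="<|start_header_id|>assistant<|end_header_id|>\n\n",
)

trainer.train()
```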
💾 SAVING ROCM-TRAINED MODEL
✅ LoRA adapters saved to: logical_reasoning_rocm_lora
Saving merged model...
Found HuggingFace hub cache directory: /root/.cache/huggingface/hub
Checking cache directory for required files...
Unsloth: Copying 30 files from cache to `logical_reasoning_rocm_merged`: 100%|██████████| 30/30 [01:05<00:00, 2.18s/it]
Successfully copied all 30 files from cache to `logical_reasoning_rocm_merged` Checking cache directory for required files... Cache check failed: tokenizer.model not found in local cache. Not all required files found in cache. Will proceed with downloading.
Unsloth: Preparing safetensor model files: 100%|██████████| 30/30 [00:00<00:00, 626015.52it/s]
Unsloth: Merging weights into 16bit: 10%|█ | 3/30 [00:24<03:35, 7.98s/it]
Saving the Fine-Tuned Model
This cell saves the trained model in two formats:
1. LoRA Adapters (logical_reasoning_rocm_lora/):
   - Saves only the trained LoRA adapter weights (lightweight, a few hundred MB)
   - Can be loaded later with the base model
   - Useful for sharing or deploying with the original base model
2. Merged Model (logical_reasoning_rocm_merged/):
   - Merges LoRA adapters back into the base model
   - Creates a standalone model with all weights
   - Saved in 16-bit precision for better quality
   - Ready for immediate inference without loading adapters
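The two save formats can be sketched with Unsloth's saving API. A minimal sketch, assuming `model` and `tokenizer` come from the training steps above; the output directory names match the tutorial.

```python
# Format 1: lightweight LoRA adapters, loaded later on top of the base model.
model.save_pretrained("logical_reasoning_rocm_lora")
tokenizer.save_pretrained("logical_reasoning_rocm_lora")

# Format 2: standalone model with the adapters merged into the base weights,
# saved in 16-bit precision and ready for direct inference.
model.save_pretrained_merged(
    "logical_reasoning_rocm_merged",
    tokenizer,
    save_method="merged_16bit",
)
```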
Both formats include the tokenizer configuration. The merged model is production-ready and can be used directly for generating answers to logical reasoning questions.

And we're done! If you have any questions about Unsloth, find any bugs, want to keep up with the latest LLM news, need help, or would like to join projects, feel free to join our Discord!
Some other resources:
- Train your own reasoning model - Llama GRPO notebook Free Colab
- Saving finetunes to Ollama. Free notebook
- Llama 3.2 Vision finetuning - Radiography use case. Free Colab
- See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our documentation!