Llasa TTS (1B)
Installation
[ ]
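The install cell itself is not captured above. A minimal sketch of what it typically contains in the Unsloth TTS notebooks (exact package list and pins are assumptions):

!pip install unsloth
!pip install xcodec2   # speech codec used by Llasa to turn audio into tokens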
Unsloth
FastModel supports loading nearly any model now! This includes Vision and Text models!
Thank you to Etherl for creating this notebook!
[ ]
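The loading cell is not shown; the sketch below is reconstructed from the log that follows, with max_seq_length and load_in_4bit as assumptions:

from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name = "unsloth/Llasa-1B",   # Llasa TTS 1B checkpoint
    max_seq_length = 2048,             # assumption: long enough for text + speech tokens
    load_in_4bit = True,               # assumption: the save log later mentions merging 4bit weights
)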
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2025.3.19: Fast Llama patching. Transformers: 4.48.0.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
unsloth/Llasa-1B does not have a padding token! Will use pad_token = <|finetune_right_pad_id|>.
We now add LoRA adapters so we only need to update 1 to 10% of all parameters!
[ ]
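A sketch of the LoRA cell, assuming the usual Unsloth defaults (rank, alpha, and target modules are assumptions; the log below confirms that roughly 1% of the parameters become trainable):

model = FastModel.get_peft_model(
    model,
    r = 16,                            # assumption
    lora_alpha = 16,                   # assumption
    lora_dropout = 0,
    bias = "none",
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
)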
Not an error, but Unsloth cannot patch MLP layers with our manual autograd engine since either LoRA adapters are not enabled or a bias term (like in Qwen) is used.
Not an error, but Unsloth cannot patch Attention layers with our manual autograd engine since either LoRA adapters are not enabled or a bias term (like in Qwen) is used.
Not an error, but Unsloth cannot patch O projection layer with our manual autograd engine since either LoRA adapters are not enabled or a bias term (like in Qwen) is used.
Unsloth 2025.3.19 patched 28 layers with 0 QKV layers, 0 O layers and 0 MLP layers.
[ ]
[ ]
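The data-preparation cells are not captured either. Roughly, they load a speech dataset and run every clip through the XCodec2 codec so each waveform becomes a sequence of speech-token ids that the LLM predicts like ordinary text tokens. A sketch under assumptions (dataset name, codec repo id, and column names are placeholders, not the notebook's actual code):

import torch
from datasets import load_dataset
from xcodec2.modeling_xcodec2 import XCodec2Model

codec = XCodec2Model.from_pretrained("HKUSTAudio/xcodec2").eval().cuda()   # repo id assumed
dataset = load_dataset("MrDragonFox/Elise", split = "train")               # hypothetical dataset

def encode_audio(sample):
    # Encode the 16 kHz waveform into XCodec2 code ids; Llasa represents each id
    # as a <|s_N|> token in its tokenizer.
    wav = torch.tensor(sample["audio"]["array"], dtype = torch.float32).unsqueeze(0).cuda()
    with torch.no_grad():
        sample["speech_codes"] = codec.encode_code(input_waveform = wav)[0, 0].tolist()
    return sample

dataset = dataset.map(encode_audio)   # corresponds to the 1195-sample "Processing" bar below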
You are using a model of type xcodec2 to instantiate a model of type xcodec. This is not supported for all configurations of models and can yield errors.
Processing: 100%|██████████| 1195/1195 [09:41<00:00, 2.05it/s]
Dataset loaded for split 'train'. Number of samples: 1195
Moving XCodec2 model to cpu
Train the model
Now let's use the Hugging Face Trainer! More docs here: Transformers Trainer docs. We train for one full epoch over the 1,195 samples (149 steps); if you only want a quick test run, set max_steps to something small like 60 instead.
[ ]
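A sketch of the Trainer setup. Batch size, gradient accumulation, and epoch count are read off the training log below; the remaining hyperparameters are assumptions, and the dataset is assumed to already carry fixed-length input_ids/labels from the data-prep step:

from transformers import Trainer, TrainingArguments

trainer = Trainer(
    model = model,
    train_dataset = dataset,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,
        learning_rate = 2e-4,            # assumption
        fp16 = True,                     # T4 has no bfloat16 support
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",
    ),
)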
[ ]
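The memory-report cell is the standard Unsloth snippet (reconstructed here; it produces the two lines below):

import torch

gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")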
GPU = Tesla T4. Max memory = 14.741 GB.
5.713 GB of memory reserved.
[ ]
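The training cell itself is presumably just the call below; the banner that follows is printed by Unsloth when training starts:

trainer_stats = trainer.train()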
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 1,195 | Num Epochs = 1 | Total steps = 149
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 36,700,160/3,450,801,152 (1.06% trained)
Unsloth: Will smartly offload gradients to save VRAM!
[ ]
[ ]
[ ]
[ ]
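The local-saving cell is presumably the standard Unsloth pair of calls; the tuple printed below is the return value of tokenizer.save_pretrained:

model.save_pretrained("lora_model")       # saves only the LoRA adapters locally
tokenizer.save_pretrained("lora_model")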
('lora_model/tokenizer_config.json',
 'lora_model/special_tokens_map.json',
 'lora_model/tokenizer.json')
Saving to float16
We also support saving to float16 directly. Select merged_16bit for float16 or merged_4bit for int4. We also allow lora adapters as a fallback. Use push_to_hub_merged to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens.
[ ]
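A sketch of the merged-saving cell, matching the log below (the Hub repo name and token are placeholders):

# Merge the LoRA adapters into the base weights and save as float16.
model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit")

# Or upload straight to the Hugging Face Hub:
# model.push_to_hub_merged("your_name/Llasa-1B-finetuned", tokenizer,
#                          save_method = "merged_16bit", token = "hf_...")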
Unsloth: You have 1 CPUs. Using `safe_serialization` is 10x slower. We shall switch to Pytorch saving, which might take 3 minutes and not 30 minutes. To force `safe_serialization`, set it to `None` instead.
Unsloth: Kaggle/Colab has limited disk space. We need to delete the downloaded model which will save 4-16GB of disk space, allowing you to save on Kaggle/Colab.
Unsloth: Will remove a cached repo with size 15.1G
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 3.99 out of 12.67 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...
100%|██████████| 28/28 [00:01<00:00, 27.83it/s]
Unsloth: Saving tokenizer... Done.
Unsloth: Saving model/pytorch_model-00001-of-00002.bin...
Unsloth: Saving model/pytorch_model-00002-of-00002.bin...
Done.