Unsloth Llasa TTS (3B)

Llasa TTS (3B)

unsloth-notebooksunslothoriginal_template

alph-notebooks/unsloth-notebooks / Llasa_TTS_(3B).ipynb

Export

Run Notebooks

Contents

No cells yet

Add cells to see them here

News

Placeholder

Installation

[ ]

Unsloth

FastModel supports loading nearly any model now! This includes Vision and Text models!

Thank you to Etherl for creating this notebook!

[ ]

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2025.3.19: Fast Llama patching. Transformers: 4.48.0.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

unsloth/Llasa-3B does not have a padding token! Will use pad_token = <|finetune_right_pad_id|>.

We now add LoRA adapters so we only need to update 1 to 10% of all parameters!

[ ]

Not an error, but Unsloth cannot patch MLP layers with our manual autograd engine since either LoRA adapters
are not enabled or a bias term (like in Qwen) is used.
Not an error, but Unsloth cannot patch Attention layers with our manual autograd engine since either LoRA adapters
are not enabled or a bias term (like in Qwen) is used.
Not an error, but Unsloth cannot patch O projection layer with our manual autograd engine since either LoRA adapters
are not enabled or a bias term (like in Qwen) is used.
Unsloth 2025.3.19 patched 28 layers with 0 QKV layers, 0 O layers and 0 MLP layers.

Data Prep

We will use the MrDragonFox/Elise, which is designed for training TTS models. Ensure that your dataset follows the required format: text, audio. You can modify this section to accommodate your own dataset, but maintaining the correct structure is essential for optimal training.

[ ]

You are using a model of type xcodec2 to instantiate a model of type xcodec. This is not supported for all configurations of models and can yield errors.
Processing: 100%|██████████| 1195/1195 [09:41<00:00,  2.05it/s]

Dataset loaded for split 'train'. Number of samples: 1195
Moving XCodec2 model to cpu

Train the model

Now let's use Huggingface Trainer! More docs here: Transformers docs. We do 60 steps to speed things up, but you can set num_train_epochs=1 for a full run, and turn off max_steps=None.

[ ]

GPU = Tesla T4. Max memory = 14.741 GB.
5.713 GB of memory reserved.

[ ]

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 1,195 | Num Epochs = 1 | Total steps = 149
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 36,700,160/3,450,801,152 (1.06% trained)

Unsloth: Will smartly offload gradients to save VRAM!

[ ]

Inference

Let's run the model! You can change the prompts

[ ]

Saving, loading finetuned models

To save the final model as LoRA adapters, either use Huggingface's push_to_hub for an online save or save_pretrained for a local save.

[NOTE] This ONLY saves the LoRA adapters, and not the full model. To save to 16bit or GGUF, scroll down!

[ ]

('lora_model/tokenizer_config.json',
, 'lora_model/special_tokens_map.json',
, 'lora_model/tokenizer.json')

Saving to float16

We also support saving to float16 directly. Select merged_16bit for float16 or merged_4bit for int4. We also allow lora adapters as a fallback. Use push_to_hub_merged to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens.

[ ]

Unsloth: You have 1 CPUs. Using `safe_serialization` is 10x slower.
We shall switch to Pytorch saving, which might take 3 minutes and not 30 minutes.
To force `safe_serialization`, set it to `None` instead.
Unsloth: Kaggle/Colab has limited disk space. We need to delete the downloaded
model which will save 4-16GB of disk space, allowing you to save on Kaggle/Colab.
Unsloth: Will remove a cached repo with size 15.1G

Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 3.99 out of 12.67 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...

100%|██████████| 28/28 [00:01<00:00, 27.83it/s]

Unsloth: Saving tokenizer... Done.
Unsloth: Saving model/pytorch_model-00001-of-00002.bin...
Unsloth: Saving model/pytorch_model-00002-of-00002.bin...
Done.