Notebooks
U
Unsloth
Mistral (7B) Text Completion

Mistral (7B) Text Completion

unsloth-notebooksunslothnb

To run this, press "Runtime" and press "Run all" on a free Tesla T4 Google Colab instance!

Join Discord if you need help + ⭐ Star us on Github

To install Unsloth on your local device, follow our guide. This notebook is licensed LGPL-3.0.

You will learn how to do data prep, how to train, how to run the model, & how to save it

News

Train MoEs - DeepSeek, GLM, Qwen and gpt-oss 12x faster with 35% less VRAM. Blog

You can now train embedding models 1.8-3.3x faster with 20% less VRAM. Blog

Ultra Long-Context Reinforcement Learning is here with 7x more context windows! Blog

3x faster LLM training with 30% less VRAM and 500K context. 3x faster500K Context

New in Reinforcement Learning: FP8 RLVision RLStandbygpt-oss RL

Visit our docs for all our model uploads and notebooks.

Installation

[ ]

Unsloth

Text Completion / Raw Text Training

This is a community notebook collaboration with [Mithex].

We train on Tiny Stories (link here) which is a collection of small stories. For example:

	Once upon a time, there was a little car named Beep. Beep loved to go fast and play in the sun.
Beep was a healthy car because he always had good fuel....

Instead of Alpaca's Question Answer format, one only needs 1 column - the "text" column. This means you can finetune on any dataset and let your model act as a text completion model, like for novel writing.

[ ]
[ ]
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
config.json:   0%|          | 0.00/1.15k [00:00<?, ?B/s]
==((====))==  Unsloth: Fast Mistral patching release 2024.5
   \\   /|    GPU: Tesla T4. Max memory: 14.748 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.3.0+cu121. CUDA = 7.5. CUDA Toolkit = 12.1.
\        /    Bfloat16 = FALSE. Xformers = 0.0.26.post1. FA = False.
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
model.safetensors:   0%|          | 0.00/4.14G [00:00<?, ?B/s]
generation_config.json:   0%|          | 0.00/111 [00:00<?, ?B/s]
tokenizer_config.json:   0%|          | 0.00/137k [00:00<?, ?B/s]
tokenizer.model:   0%|          | 0.00/587k [00:00<?, ?B/s]
special_tokens_map.json:   0%|          | 0.00/560 [00:00<?, ?B/s]
tokenizer.json:   0%|          | 0.00/1.96M [00:00<?, ?B/s]

We now add LoRA adapters so we only need to update 1 to 10% of all parameters!

We also add embed_tokens and lm_head to allow the model to learn out of distribution data.

[ ]
Unsloth: Offloading input_embeddings to disk to save VRAM
Unsloth: Offloading output_embeddings to disk to save VRAM
Unsloth 2024.5 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
Unsloth: Casting embed_tokens to float32
Unsloth: Casting lm_head to float32

Data Prep

We now use the Tiny Stories dataset from https://huggingface.co/datasets/roneneldan/TinyStories. We only sample the first 5000 rows to speed training up. We must add EOS_TOKEN or tokenizer.eos_token or else the model's generation will go on forever.

If you want to use the llama-3 template for ShareGPT datasets, try our conversational notebook

[ ]
Downloading readme:   0%|          | 0.00/1.02k [00:00<?, ?B/s]
Repo card metadata block was not found. Setting CardData to empty.
WARNING:huggingface_hub.repocard:Repo card metadata block was not found. Setting CardData to empty.
Downloading data:   0%|          | 0.00/249M [00:00<?, ?B/s]
Downloading data:   0%|          | 0.00/248M [00:00<?, ?B/s]
Downloading data:   0%|          | 0.00/246M [00:00<?, ?B/s]
Downloading data:   0%|          | 0.00/248M [00:00<?, ?B/s]
Downloading data:   0%|          | 0.00/9.99M [00:00<?, ?B/s]
Generating train split:   0%|          | 0/2119719 [00:00<?, ? examples/s]
Generating validation split:   0%|          | 0/21990 [00:00<?, ? examples/s]

Print out 5 stories from Tiny Stories

[ ]
=========================
One day, a little girl named Lily found a needle in her room. She knew it was difficult to play with it because it was sharp. Lily wanted to share the needle with her mom, so she could sew a button on her shirt.

Lily went to her mom and said, "Mom, I found this needle. Can you share it with me and sew my shirt?" Her mom smiled and said, "Yes, Lily, we can share the needle and fix your shirt."

Together, they shared the needle and sewed the button on Lily's shirt. It was not difficult for them because they were sharing and helping each other. After they finished, Lily thanked her mom for sharing the needle and fixing her shirt. They both felt happy because they had shared and worked together.
=========================
Once upon a time, there was a little car named Beep. Beep loved to go fast and play in the sun. Beep was a healthy car because he always had good fuel. Good fuel made Beep happy and strong.

One day, Beep was driving in the park when he saw a big tree. The tree had many leaves that were falling. Beep liked how the leaves fall and wanted to play with them. Beep drove under the tree and watched the leaves fall on him. He laughed and beeped his horn.

Beep played with the falling leaves all day. When it was time to go home, Beep knew he needed more fuel. He went to the fuel place and got more healthy fuel. Now, Beep was ready to go fast and play again the next day. And Beep lived happily ever after.
=========================
One day, a little fish named Fin was swimming near the shore. He saw a big crab and wanted to be friends. "Hi, I am Fin. Do you want to play?" asked the little fish. The crab looked at Fin and said, "No, I don't want to play. I am cold and I don't feel fine."

Fin felt sad but wanted to help the crab feel better. He swam away and thought of a plan. He remembered that the sun could make things warm. So, Fin swam to the top of the water and called to the sun, "Please, sun, help my new friend feel fine and not freeze!"

The sun heard Fin's call and shone its warm light on the shore. The crab started to feel better and not so cold. He saw Fin and said, "Thank you, little fish, for making me feel fine. I don't feel like I will freeze now. Let's play together!" And so, Fin and the crab played and became good friends.
=========================
Once upon a time, in a land full of trees, there was a little cherry tree. The cherry tree was very sad because it did not have any friends. All the other trees were big and strong, but the cherry tree was small and weak. The cherry tree was envious of the big trees.

One day, the cherry tree felt a tickle in its branches. It was a little spring wind. The wind told the cherry tree not to be sad. The wind said, "You are special because you have sweet cherries that everyone loves." The cherry tree started to feel a little better.

As time went on, the cherry tree grew more and more cherries. All the animals in the land came to eat the cherries and play under the cherry tree. The cherry tree was happy because it had many friends now. The cherry tree learned that being different can be a good thing. And they all lived happily ever after.
=========================
Once upon a time, there was a little girl named Lily. Lily liked to pretend she was a popular princess. She lived in a big castle with her best friends, a cat and a dog.

One day, while playing in the castle, Lily found a big cobweb. The cobweb was in the way of her fun game. She wanted to get rid of it, but she was scared of the spider that lived there.

Lily asked her friends, the cat and the dog, to help her. They all worked together to clean the cobweb. The spider was sad, but it found a new home outside. Lily, the cat, and the dog were happy they could play without the cobweb in the way. And they all lived happily ever after.

Continued Pretraining

Now let's use Unsloth's UnslothTrainer! More docs here: TRL SFT docs. We do 20 steps to speed things up, but you can set num_train_epochs=1 for a full run, and turn off max_steps=None.

Also set embedding_learning_rate to be a learning rate at least 2x or 10x smaller than learning_rate to make continual pretraining work!

[ ]
/usr/local/lib/python3.10/dist-packages/multiprocess/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  self.pid = os.fork()
Map (num_proc=8):   0%|          | 0/2500 [00:00<?, ? examples/s]
[ ]
GPU = Tesla T4. Max memory = 14.748 GB.
6.367 GB of memory reserved.
[ ]
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 2,500 | Num Epochs = 1
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 8
\        /    Total batch size = 16 | Total steps = 156
 "-____-"     Number of trainable parameters = 603,979,776
Unsloth: Setting lr = 5.00e-06 instead of 5.00e-05 for embed_tokens.
Unsloth: Setting lr = 5.00e-06 instead of 5.00e-05 for lm_head.
[ ]
2780.9282 seconds used for training.
46.35 minutes used for training.
Peak reserved memory = 11.432 GB.
Peak reserved memory for training = 5.065 GB.
Peak reserved memory % of max memory = 77.516 %.
Peak reserved memory for training % of max memory = 34.344 %.

Inference

Let's run the model!

We first will try to see if the model follows the style and understands to write a story that is within the distribution of "Tiny Stories". Ie a story fit for a bed time story most likely.

We select "Once upon a time, in a galaxy, far far away," since it normally is associated with Star Wars.

[ ]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
<s>Once upon a time, in a galaxy, far faraway, there was a little girl named Lily. She loved to 
play with her toys and explore the universe. One day, she found a big, shiny rock. She picked it up and 
put it in her pocket.

Lily went to play with her friends, but she forgot about the rock. When she 
came back home, she realized that she had lost the rock. She was very sad and started to cry.

Her mom 
saw her crying and asked her what was wrong. Lily told her about the rock and how she lost it. Her mom 
said, "Don't worry, we can find it again." They went back to the place where Lily found the rock and 
searched for it. After a while, they found the rock and Lily was very happy. She learned that it's 
important to take care of her things and not to lose them. The end.</s>

And we're done! If you have any questions on Unsloth, we have a Discord channel! If you find any bugs or want to keep updated with the latest LLM stuff, or need help, join projects etc, feel free to join our Discord!

Some other resources:

  1. Looking to use Unsloth locally? Read our Installation Guide for details on installing Unsloth on Windows, Docker, AMD, Intel GPUs.
  2. Learn how to do Reinforcement Learning with our RL Guide and notebooks.
  3. Read our guides and notebooks for Text-to-speech (TTS) and vision model support.
  4. Explore our LLM Tutorials Directory to find dedicated guides for each model.
  5. Need help with Inference? Read our Inference & Deployment page for details on using vLLM, llama.cpp, Ollama etc.

Join Discord if you need help + ⭐️ Star us on Github ⭐️

This notebook and all Unsloth notebooks are licensed LGPL-3.0