Fine-tune Open-Source LLMs on Nebius Token Factory

Learn how to fine-tune and deploy open models such as Llama 3.1 directly from your dataset using Nebius Token Factory, an all-in-one platform for working with large language models (LLMs).

Before you begin, get your API key from the Dashboard.

You can run this notebook on Google Colab (no Python environment setup needed) or locally.

Open In Colab

Step 1: Fine-Tuning Configuration

Pick the models to fine-tune.

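The configuration cell is not shown above; a minimal sketch might look like this (the model name and hyperparameter values are illustrative assumptions, not the notebook's exact settings):

```python
# Hypothetical fine-tuning configuration (names and values are assumptions).
BASE_MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # model to fine-tune

FINETUNE_CONFIG = {
    "model": BASE_MODEL,
    "n_epochs": 3,     # training epochs (illustrative)
    "lora": True,      # use LoRA instead of full fine-tuning
    "lora_r": 8,       # LoRA rank (illustrative)
    "lora_alpha": 16,  # LoRA scaling factor (illustrative)
}

print(FINETUNE_CONFIG)
```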

Step 2: Installation & Setup

NOT running on Colab

Step 3: Load Configuration

✅ NEBIUS_API_KEY found
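The check above can be sketched as follows, assuming the key is stored either as a Colab secret or as an environment variable (the Colab `userdata` lookup is only attempted when Colab is available):

```python
import os

def get_nebius_api_key():
    """Return the Nebius API key from Colab secrets or the environment."""
    try:
        from google.colab import userdata  # only exists on Colab
        key = userdata.get("NEBIUS_API_KEY")
        if key:
            return key
    except Exception:  # ImportError locally; secret-not-found on Colab
        pass
    return os.environ.get("NEBIUS_API_KEY")

key = get_nebius_api_key()
print("✅ NEBIUS_API_KEY found" if key else "❌ NEBIUS_API_KEY not set")
```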

Step 4: Initialize Client


Before running, store your key in Colab Variables as NEBIUS_API_KEY or export it as an environment variable.
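Nebius Token Factory exposes an OpenAI-compatible API, so initialization boils down to a base URL plus a bearer-token header. A stdlib sketch of those pieces (the base URL below is an assumption; verify the current endpoint in your Dashboard):

```python
import os

NEBIUS_BASE_URL = "https://api.studio.nebius.com/v1"  # assumed endpoint; check the Dashboard

def auth_headers(api_key=None):
    """Build the HTTP headers every Token Factory request needs."""
    key = api_key or os.environ.get("NEBIUS_API_KEY", "")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }

print(auth_headers("sk-demo"))
```

With the `openai` package installed, the same setup is simply `OpenAI(base_url=NEBIUS_BASE_URL, api_key=...)`.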

Step 5: Prepare your dataset

5.1 - Load data

Fine-tuning works best with conversational data (the OpenAI-style format with messages). We’ll use a sample dataset from Hugging Face to keep things simple.

You can learn more about preparing datasets here.

Dataset({
    features: ['messages'],
    num_rows: 1000
})
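Each training record is a JSON object with a `messages` list in the OpenAI chat format. A quick stdlib check of that shape (a sketch; the notebook itself loads the data with the Hugging Face `datasets` library):

```python
import json

def is_valid_record(line):
    """Check one JSONL line for the OpenAI-style chat format."""
    record = json.loads(line)
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    return all(
        m.get("role") in {"system", "user", "assistant"}
        and isinstance(m.get("content"), str)
        for m in messages
    )

sample = json.dumps({"messages": [
    {"role": "system", "content": "Star is a friendly chatbot companion."},
    {"role": "user", "content": "Happy birthday, Lillian."},
    {"role": "assistant", "content": "Thank you, Paul."},
]})
print(is_valid_record(sample))  # True
```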

5.2 - Examine data

[
  {
    "role": "system",
    "content": "Star is a friendly chatbot companion ready to engage in everyday role-playing\u00a0conversations."
  },
  {
    "role": "user",
    "content": "Happy birthday, Lillian. Here's a little present I bought for you."
  },
  {
    "role": "assistant",
    "content": "Thank you, Paul."
  }
]

---

[
  {
    "role": "system",
    "content": "Star is a friendly chatbot companion ready to engage in everyday role-playing\u00a0conversations."
  },
  {
    "role": "user",
    "content": "Hi Isabel! You wanted to see me?"
  },
  {
    "role": "assistant",
    "content": "Yes Anthony, come on in. Have a seat."
  },
  {
    "role": "user",
    "content": "Is everything okay? You seem a bit preoccupied."
  },
  {
    "role": "assistant",
    "content": "Well, Anthony, this is not going to be an easy conversation.The company has made some decisions which will have a direct impact on your employment."
  },
  {
    "role": "user",
    "content": "Okay..."
  },
  {
    "role": "assistant",
    "content": "I don't want to beat around the bush so I'm just gonna come out with it. We are going to have to let you go."
  },
  {
    "role": "user",
    "content": "What? Why? I mean... just like that? I'm fired?"
  },
  {
    "role": "assistant",
    "content": "I'm sorry but, to be honest, you are a terrible employee!"
  },
  {
    "role": "user",
    "content": "What! I resent that!"
  },
  {
    "role": "assistant",
    "content": "Anthony, you were caught making international calls from the office phone, you called in sick in eight times this month and you smell like alcohol!"
  }
]

---

5.3 - Create train / validate datasets

DatasetDict({
    train: Dataset({
        features: ['messages'],
        num_rows: 800
    })
    test: Dataset({
        features: ['messages'],
        num_rows: 200
    })
})
Saved 800 samples to data/training_data.jsonl
Saved 200 samples to data/validation_data.jsonl
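The split-and-save step above can be sketched with the stdlib alone (the 80/20 ratio and file names match the output; the original cell presumably uses `datasets.train_test_split`):

```python
import json
import random
import tempfile
from pathlib import Path

def split_and_save(records, out_dir, test_ratio=0.2, seed=42):
    """Shuffle records, split train/test, and write each split as JSONL."""
    records = list(records)
    random.Random(seed).shuffle(records)
    n_test = int(len(records) * test_ratio)
    splits = {
        "training_data.jsonl": records[n_test:],
        "validation_data.jsonl": records[:n_test],
    }
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for name, rows in splits.items():
        path = Path(out_dir) / name
        with path.open("w") as f:
            for row in rows:
                f.write(json.dumps(row) + "\n")
        print(f"Saved {len(rows)} samples to {path}")
    return splits

# Demo with 10 toy records; the notebook run used 1000.
demo = [{"messages": [{"role": "user", "content": f"hi {i}"}]} for i in range(10)]
splits = split_and_save(demo, out_dir=tempfile.mkdtemp())
```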

Step 6: Upload your dataset to Token Factory

Next, we’ll upload the dataset so Nebius can access it for training.

FileObject(id='file-019b0be3-b432-7392-8f07-9d714bc23224', bytes=657543, created_at=1765431030, filename='training_data.jsonl', object='file', purpose='fine-tune', status=None, expires_at=None, status_details=None)
FileObject(id='file-019b0be3-bfc8-7a03-9796-fba7337b9301', bytes=184319, created_at=1765431033, filename='validation_data.jsonl', object='file', purpose='fine-tune', status=None, expires_at=None, status_details=None)
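Because the API is OpenAI-compatible, each upload is a POST to the `/files` endpoint with `purpose="fine-tune"`. A sketch of the request metadata (the base URL is an assumption; actually sending the file requires your API key):

```python
from pathlib import Path

NEBIUS_BASE_URL = "https://api.studio.nebius.com/v1"  # assumed endpoint

def upload_request(path):
    """Describe the file-upload request for a training or validation file."""
    return {
        "url": f"{NEBIUS_BASE_URL}/files",
        "purpose": "fine-tune",  # required so the file can be used for training
        "filename": Path(path).name,
    }

print(upload_request("data/training_data.jsonl"))
```

With the `openai` SDK this is the one-liner `client.files.create(file=open(path, "rb"), purpose="fine-tune")`, which returns the `FileObject` shown above.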

Step 7: Create and start your fine-tuning job

We’ll fine-tune Llama 3.2 1B Instruct using LoRA, which is efficient and much faster than full fine-tuning.

Job created: ftjob-49b4625687fe42d1a981c9f9bbdfd569 | status: running
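The job-creation call mirrors the OpenAI fine-tuning API: a base model, the uploaded file IDs, and LoRA hyperparameters. A payload sketch (the hyperparameter names and values are assumptions; consult the Token Factory docs for the exact schema):

```python
def build_finetune_job(model, training_file, validation_file=None, lora=True):
    """Assemble the request body for creating a fine-tuning job."""
    payload = {
        "model": model,
        "training_file": training_file,  # file ID returned by the upload step
        "hyperparameters": {"n_epochs": 3, "lora": lora},  # illustrative values
    }
    if validation_file:
        payload["validation_file"] = validation_file
    return payload

job = build_finetune_job(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    training_file="file-019b0be3-b432-7392-8f07-9d714bc23224",
    validation_file="file-019b0be3-bfc8-7a03-9796-fba7337b9301",
)
print(job)
```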

Step 8: Monitor job progress

When you create a fine-tuning job, its initial status is usually running. The script below polls the status every 15 seconds to check for updates.

If the job fails, Nebius returns an error message explaining what went wrong and how to fix it. If you get a 500 error, simply resubmit the job.

The training is complete when you see either Dataset processed successfully or Training completed successfully in the event logs.
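The polling loop can be sketched as below; `fetch_status` is injected so the loop runs without network access (in the notebook it would wrap `client.fine_tuning.jobs.retrieve(job_id).status`):

```python
import time

def wait_for_job(fetch_status, poll_seconds=15, timeout_seconds=3600):
    """Poll a fine-tuning job until it leaves the 'running' state."""
    start = time.monotonic()
    while True:
        status = fetch_status()
        elapsed = time.monotonic() - start
        print(f"Elapsed: {elapsed:.0f}s : current status: {status}")
        if status not in ("queued", "validating_files", "running"):
            return status
        if elapsed > timeout_seconds:
            raise TimeoutError("fine-tuning job did not finish in time")
        time.sleep(poll_seconds)

# Simulated run: two polls return 'running', then the job succeeds.
statuses = iter(["running", "running", "succeeded"])
final = wait_for_job(lambda: next(statuses), poll_seconds=0)
print("Final status:", final)
```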

Fine tuning job ftjob-49b4625687fe42d1a981c9f9bbdfd569 created, waiting for completion...
Elapsed: 30s (0.5 min) : current status: running
Elapsed: 61s (1.0 min) : current status: running
Elapsed: 92s (1.5 min) : current status: running
Elapsed: 122s (2.0 min) : current status: running
Elapsed: 153s (2.6 min) : current status: running
Elapsed: 183s (3.1 min) : current status: running
Elapsed: 213s (3.6 min) : current status: running
Elapsed: 244s (4.1 min) : current status: running
Elapsed: 274s (4.6 min) : current status: running
Elapsed: 305s (5.1 min) : current status: succeeded
Final status: succeeded
CPU times: user 99.6 ms, sys: 38.1 ms, total: 138 ms
Wall time: 5min 5s

Check job events:

[1765431035] info: Job is submitted
[1765431063] info: Dataset 'validation' processed successfully
[1765431065] info: Dataset 'training' processed successfully
[1765431268] info: Training completed successfully

Step 9: Inspect the training job

Examine loss over epochs

(Output: plot of training loss over epochs)

Step 10: Download your checkpoints

After every epoch, Nebius saves a checkpoint (a snapshot of the model at that stage). You'll get all of them; for the final model, just grab the last checkpoint.

The code below creates a folder for each checkpoint and saves all the files there.

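The folder-per-checkpoint layout can be sketched like this (a hypothetical helper; the actual file download goes through the fine-tuning job's checkpoint files endpoint with your API key):

```python
import tempfile
from pathlib import Path

def checkpoint_dir(root, checkpoint_id):
    """Create (if needed) and return the folder for one checkpoint's files."""
    path = Path(root) / checkpoint_id
    path.mkdir(parents=True, exist_ok=True)
    return path

# Each downloaded file would then be written into its checkpoint's folder:
target = checkpoint_dir(tempfile.mkdtemp(),
                        "ftckpt_98a07a09-9fb7-4009-8e18-0f72beade81c")
print(target)
```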

Step 11: How much did our training cost?

The price for fine-tuning a model under 20B parameters is $0.4 per 1M tokens. Let's calculate the total cost of this run.

Check the pricing guide

Fine-tuning price: $0.5
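The arithmetic is simply tokens / 1M × $0.4. A sketch (the 1.25M figure below is inferred from the $0.5 total above; the actual token count comes from the completed job object):

```python
PRICE_PER_MILLION_TOKENS = 0.4  # USD, for models under 20B parameters

def finetune_cost(trained_tokens):
    """Cost in USD for a fine-tuning run, given total trained tokens."""
    return trained_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

print(f"Fine-tuning price: ${finetune_cost(1_250_000)}")  # $0.5
```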

Step 12: Deploy Your LoRA Adapter

Now that your fine-tune is complete, you can deploy the LoRA adapter directly on Nebius Token Factory for inference.
This lets you use your fine-tuned model as a hosted endpoint, ready for API calls, experiments, or integration into your own applications.

{'name': 'meta-llama/Meta-Llama-3.1-8B-Instruct-LoRa:my-finetuned-model-JuNN',
 'base_model': 'meta-llama/Meta-Llama-3.1-8B-Instruct',
 'source': 'ftjob-49b4625687fe42d1a981c9f9bbdfd569:ftckpt_98a07a09-9fb7-4009-8e18-0f72beade81c',
 'description': 'Fine tuned model',
 'created_at': 1765431401,
 'status': 'validating'}
deployed model name: meta-llama/Meta-Llama-3.1-8B-Instruct-LoRa:my-finetuned-model-JuNN
Waiting for validation of LoRA model 'meta-llama/Meta-Llama-3.1-8B-Instruct-LoRa:my-finetuned-model-JuNN'...
Current status for 'meta-llama/Meta-Llama-3.1-8B-Instruct-LoRa:my-finetuned-model-JuNN': unknown
Current status for 'meta-llama/Meta-Llama-3.1-8B-Instruct-LoRa:my-finetuned-model-JuNN': active
{'type': 'text2text', 'name': 'meta-llama/Meta-Llama-3.1-8B-Instruct-LoRa:my-finetuned-model-JuNN', 'status': 'active', 'status_reason': None, 'checkpoint_id': 'ftckpt_98a07a09-9fb7-4009-8e18-0f72beade81c', 'job_id': 'ftjob-49b4625687fe42d1a981c9f9bbdfd569', 'file_id': None, 'url': None, 'created_at': 1765431401, 'description': 'Fine tuned model', 'vendor': 'meta', 'tags': ['128K context', 'small', 'JSON mode', 'lora'], 'use_cases': ['lora'], 'quality': 73, 'context_window_k': 128, 'size_b': 8.03}
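The deployment request's fields can be read off the response above: a display name, the base model, and a `source` that joins the job ID and checkpoint ID with a colon. A payload sketch:

```python
def build_lora_deployment(name, base_model, job_id, checkpoint_id, description=""):
    """Assemble the request body for deploying a LoRA adapter."""
    return {
        "name": f"{base_model}-LoRa:{name}",
        "base_model": base_model,
        "source": f"{job_id}:{checkpoint_id}",  # job and checkpoint, colon-separated
        "description": description,
    }

payload = build_lora_deployment(
    "my-finetuned-model-JuNN",
    base_model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    job_id="ftjob-49b4625687fe42d1a981c9f9bbdfd569",
    checkpoint_id="ftckpt_98a07a09-9fb7-4009-8e18-0f72beade81c",
    description="Fine tuned model",
)
print(payload["name"])
```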

Step 13: Test the newly deployed model

Once the model status becomes active, you can send chat completions just like any OpenAI-compatible model.
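Since the deployed adapter behaves like any OpenAI-compatible model, inference is a standard chat-completions request addressed to the deployed model name. Shown here as the raw request body (with the `openai` SDK, pass the same arguments to `client.chat.completions.create(...)`):

```python
import json

def chat_request(model, user_message, system_prompt=None):
    """Build a chat.completions request body for the deployed model."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_message})
    return {"model": model, "messages": messages, "max_tokens": 100}

body = chat_request(
    "meta-llama/Meta-Llama-3.1-8B-Instruct-LoRa:my-finetuned-model-JuNN",
    "Happy birthday! Here's a little present I bought for you.",
    system_prompt="Star is a friendly chatbot companion.",
)
print(json.dumps(body, indent=2))
```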

{
  "id": "chatcmpl-649d3439e2094d8384ab2e941c408aff",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Hello",
        "refusal": null,
        "role": "assistant",
        "annotations": null,
        "audio": null,
        "function_call": null,
        "tool_calls": [],
        "reasoning_content": null
      },
      "stop_reason": null
    }
  ],
  "created": 1765431413,
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-LoRa:my-finetuned-model-JuNN",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 2,
    "prompt_tokens": 41,
    "total_tokens": 43,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null
}

DONE!

And that’s it! You’ve just fine-tuned, deployed, and run inference with your own LoRA model, all using Nebius Token Factory.

Ready to go further? Start tracking and deploying your fine-tuned models today at Nebius Token Factory.