Fine-Tuning Phi-3-mini with QLoRA for Python Code Generation
A guide to fine-tuning a Phi-3-mini model for Python code generation using QLoRA and the Hugging Face ecosystem.
Installing and loading the libraries
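The exact package set depends on your environment; the list below is an assumption covering the libraries this guide uses:

```shell
pip install -q -U transformers datasets accelerate peft trl bitsandbytes
pip install -q -U huggingface_hub wandb evaluate python-dotenv
```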
Importing the libraries
Setting Global Parameters
Connect to the Hugging Face Hub
IMPORTANT: How this section runs depends on your execution environment and on how your API keys are configured.
You can log in to the Hugging Face Hub interactively.
Alternatively, you can supply a .env file containing your Hugging Face token.
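A minimal sketch of the token lookup, assuming the token is stored under the name `HF_TOKEN` (the variable name and .env layout are assumptions; in a notebook, `huggingface_hub.notebook_login()` is the interactive alternative):

```python
import os

def get_hf_token(env_path=".env"):
    """Return the Hugging Face token from a .env file, or fall back to the environment."""
    if os.path.exists(env_path):
        with open(env_path) as f:
            for line in f:
                line = line.strip()
                if line.startswith("HF_TOKEN="):
                    return line.split("=", 1)[1]
    return os.environ.get("HF_TOKEN")

token = get_hf_token()
if token:
    # imported lazily so the token lookup works even without huggingface_hub installed
    from huggingface_hub import login
    login(token=token)
```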
Load the dataset with the instruction set
Load the tokenizer to prepare the dataset
A function to convert samples into the format our model expects.
Apply the ChatML format to our dataset.
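A sketch of such a formatting function. The column names ("instruction", "output") and the exact Phi-3 chat tokens are assumptions about the dataset and model; in practice, `tokenizer.apply_chat_template` can build this string for you:

```python
def format_chatml(sample):
    """Render one dataset row into a Phi-3-style chat string under a 'text' key."""
    sample["text"] = (
        f"<|user|>\n{sample['instruction']}<|end|>\n"
        f"<|assistant|>\n{sample['output']}<|end|>\n"
    )
    return sample

# With a Hugging Face dataset this would be applied as:
# dataset = dataset.map(format_chatml)
```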
Instruction fine-tune a Phi-3-mini model using QLoRA and trl
First, we detect the GPU and pick the appropriate compute settings.
Load the tokenizer and the model to fine-tune
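A common way to do this is to check the CUDA compute capability: bfloat16 and Flash Attention need Ampere (capability 8.0) or newer, while older GPUs such as the T4 fall back to float16. The variable names here are a sketch:

```python
import torch

# Choose compute dtype and attention implementation from the GPU generation
if torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8:
    compute_dtype = torch.bfloat16
    attn_implementation = "flash_attention_2"
else:
    compute_dtype = torch.float16
    attn_implementation = "eager"
```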
Set up the QLoRA parameters.
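A sketch of the two configs QLoRA needs: 4-bit NF4 quantization for the frozen base model and a LoRA adapter config. The rank, alpha, and target module names are typical starting values for Phi-3, not tuned settings:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization for the frozen base model (the QLoRA recipe)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# LoRA adapter configuration; target_modules assumes Phi-3's layer names
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
)
```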
The SFTTrainer has a built-in integration with peft, which makes it straightforward to fine-tune LLMs efficiently. All we need to do is define our LoraConfig and pass it to the trainer. Before starting training, though, we need to specify the hyperparameters (TrainingArguments) we want to use.
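The hyperparameters below are illustrative starting values, not tuned settings; the output directory name is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi-3-mini-qlora-python",   # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,          # effective batch size of 8
    gradient_checkpointing=True,            # trade compute for memory
    optim="paged_adamw_32bit",              # paged optimizer from the QLoRA paper
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    logging_steps=10,
    save_strategy="epoch",
    bf16=True,                              # use fp16=True instead on pre-Ampere GPUs
    report_to="wandb",
)
```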
Connect to Weights & Biases (wandb) and register the project and experiment.
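A minimal setup sketch; the project and run names are placeholders, and `wandb.login()` picks up `WANDB_API_KEY` from the environment if it is set:

```python
import wandb

wandb.login()  # interactive prompt, or reads WANDB_API_KEY from the environment
wandb.init(
    project="phi-3-mini-qlora-python",  # placeholder project name
    name="qlora-run-1",                 # placeholder experiment name
)
```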
We now have all the building blocks to construct our SFTTrainer and start training the model.
Start training by calling the train() method on our Trainer instance.
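Putting the pieces together: `model`, `tokenizer`, `train_dataset`, `peft_config`, and `training_args` come from the previous steps. Note that argument names can differ between trl versions (newer releases use `SFTConfig` and `processing_class`), so treat this as a sketch:

```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,   # SFTTrainer wraps the model with the LoRA adapters
    tokenizer=tokenizer,
)

trainer.train()        # runs the fine-tuning loop
trainer.save_model()   # saves the adapter weights to training_args.output_dir
```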
Merge the adapters into the model, then save it.
Note: when running on a T4 instance, you need to free GPU memory first.
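A typical cleanup sketch before reloading the model for merging (the variable names assume the training step above):

```python
import gc
import torch

# Drop the training-time references so their VRAM can be reclaimed
del model, trainer
gc.collect()
torch.cuda.empty_cache()
```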
Reload the previously trained and saved adapter checkpoint, merge it into the base model, and save the full model.
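A sketch of the merge using peft's `AutoPeftModelForCausalLM`, which reloads the base model together with the adapters; the checkpoint and output paths are placeholders:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Reload the adapter checkpoint saved by the trainer (placeholder path)
model = AutoPeftModelForCausalLM.from_pretrained(
    "phi-3-mini-qlora-python",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# Fold the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("phi-3-mini-python-merged", safe_serialization=True)

tokenizer = AutoTokenizer.from_pretrained("phi-3-mini-qlora-python")
tokenizer.save_pretrained("phi-3-mini-python-merged")
```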
Model Inference and evaluation
Finally, we download the model we created from the Hub and test that it works as expected.
Retrieve the model and tokenizer from the Hub.
We format the dataset the same way as before.
Establish a text generation pipeline to execute the inference.
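A sketch of the inference pipeline; the repository id is a placeholder for wherever you pushed the merged model, and the prompt assumes the Phi-3 chat format used earlier:

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="your-username/phi-3-mini-python-merged",  # placeholder repo id
    device_map="auto",
)

prompt = "<|user|>\nWrite a function that reverses a string.<|end|>\n<|assistant|>\n"
out = pipe(prompt, max_new_tokens=256, do_sample=False, return_full_text=False)
print(out[0]["generated_text"])
```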
Evaluate the performance
We'll use the ROUGE metric for performance evaluation. It may not be the best metric for code generation, but it is simple and convenient to compute.
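In practice the `evaluate` library's "rouge" metric is the convenient choice; as an illustration of what it measures, a hand-rolled ROUGE-1 F1 (unigram overlap) looks like this:

```python
from collections import Counter

def rouge1_f1(prediction, reference):
    """Unigram-overlap F1 score — a simplified ROUGE-1, for illustration only."""
    pred, ref = prediction.split(), reference.split()
    if not pred or not ref:
        return 0.0
    # Count unigrams appearing in both, respecting multiplicity
    overlap = sum((Counter(pred) & Counter(ref)).values())
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With the `evaluate` library the equivalent call would be `evaluate.load("rouge").compute(predictions=..., references=...)`.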
Establish a function for inference and evaluation of a sample.
At this point, we can compute the metric for the sample.
Inference in batches
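A sketch of batched inference: a small batching helper plus how it would drive the pipeline from the previous step (the batch size of 8 is an arbitrary choice):

```python
def batched(items, batch_size):
    """Yield successive fixed-size batches from a list (last batch may be shorter)."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# With the text-generation pipeline from before, batched inference would look like:
# predictions = []
# for batch in batched(prompts, 8):
#     outputs = pipe(batch, max_new_tokens=256, batch_size=len(batch))
#     predictions.extend(o[0]["generated_text"] for o in outputs)
```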
Finally, we compute the metric over the full set of batched predictions.