Notebooks
M
Microsoft
E2E Phi 3 MLflow TransformerPipeline

E2E Phi 3 MLflow TransformerPipeline

codemicrosoft-phi-cookbook06.E2E

Running Phi-3 model in MLFlow


This notebook describes the use of Transformer pipeline on the example of Phi-3 mini 4K Instruct.

[1]

Selecting Phi-3 Mini 4K from HuggingFace

[2]
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.

Generating MLFlow artifact

[3]
[ ]

Running Phi-3 as MLFlow model

[5]
Loading checkpoint shards:   0%|          | 0/34 [00:00<?, ?it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[6]
inputs: 
,  ['messages': Array({content: string (required), name: string (optional), role: string (required)}) (required), 'temperature': double (optional), 'max_tokens': long (optional), 'stop': Array(string) (optional), 'n': long (optional), 'stream': boolean (optional)]
,outputs: 
,  ['id': string (required), 'object': string (required), 'created': long (required), 'model': string (required), 'choices': Array({finish_reason: string (required), index: long (required), message: {content: string (required), name: string (optional), role: string (required)} (required)}) (required), 'usage': {completion_tokens: long (required), prompt_tokens: long (required), total_tokens: long (required)} (required)]
,params: 
,  None
[7]
You are not running the flash-attention implementation, expect numerical differences.
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
[{'id': 'f9c2ca5c-b684-4a77-8a43-356b671fd797', 'object': 'chat.completion', 'created': 1718709547, 'model': 'microsoft/Phi-3-mini-4k-instruct', 'usage': {'prompt_tokens': 11, 'completion_tokens': 73, 'total_tokens': 84}, 'choices': [{'index': 0, 'finish_reason': 'stop', 'message': {'role': 'assistant', 'content': 'The capital of Spain is Madrid. It is the largest city in Spain and serves as the political, economic, and cultural center of the country. Madrid is located in the center of the Iberian Peninsula and is known for its rich history, art, and architecture, including the Royal Palace, the Prado Museum, and the Plaza Mayor.'}}]}]
[8]
Question: What is the capital of Spain?

Response: The capital of Spain is Madrid. It is the largest city in Spain and serves as the political, economic, and cultural center of the country. Madrid is located in the center of the Iberian Peninsula and is known for its rich history, art, and architecture, including the Royal Palace, the Prado Museum, and the Plaza Mayor.

Usage: {'prompt_tokens': 11, 'completion_tokens': 73, 'total_tokens': 84}