Phi3 Moe Demo
code09.UpdateSamplesmicrosoft-phi-cookbookAug
Export
[1]
[2]
[3]
[4]
[5]
[6]
[7]
Loading checkpoint shards: 0%| | 0/17 [00:00<?, ?it/s]
WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu.
[8]
PhiMoEForCausalLM(
(model): PhiMoEModel(
(embed_tokens): Embedding(32064, 4096)
(layers): ModuleList(
(0-31): 32 x PhiMoEDecoderLayer(
(self_attn): PhiMoESdpaAttention(
(q_proj): Linear(in_features=4096, out_features=4096, bias=True)
(k_proj): Linear(in_features=4096, out_features=1024, bias=True)
(v_proj): Linear(in_features=4096, out_features=1024, bias=True)
(o_proj): Linear(in_features=4096, out_features=4096, bias=True)
(rotary_emb): Phi3LongRoPEScaledRotaryEmbedding()
)
(block_sparse_moe): PhiMoESparseMoeBlock(
(gate): Linear(in_features=4096, out_features=16, bias=False)
(experts): ModuleList(
(0-15): 16 x PhiMoEBlockSparseTop2MLP(
(w1): Linear(in_features=4096, out_features=6400, bias=False)
(w2): Linear(in_features=6400, out_features=4096, bias=False)
(w3): Linear(in_features=4096, out_features=6400, bias=False)
(act_fn): SiLU()
)
)
)
(input_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
(post_attention_layernorm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
)
)
(norm): LayerNorm((4096,), eps=1e-05, elementwise_affine=True)
)
(lm_head): Linear(in_features=4096, out_features=32064, bias=True)
) [9]
[10]
2024-08-20 04:51:54.431510: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory 2024-08-20 04:51:54.431543: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine. The model 'PhiMoEForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'LlamaForCausalLM', 'CodeGenForCausalLM', 'CohereForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'DbrxForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'FalconForCausalLM', 'FuyuForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'JambaForCausalLM', 'JetMoeForCausalLM', 'LlamaForCausalLM', 'MambaForCausalLM', 'Mamba2ForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'MptForCausalLM', 'MusicgenForCausalLM', 'MusicgenMelodyForCausalLM', 'MvpForCausalLM', 'NemotronForCausalLM', 'OlmoForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PersimmonForCausalLM', 'PhiForCausalLM', 'Phi3ForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'RecurrentGemmaForCausalLM', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForCausalLM', 'RoCBertForCausalLM', 'RoFormerForCausalLM', 'RwkvForCausalLM', 'Speech2Text2ForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'TransfoXLLMHeadModel', 'TrOCRForCausalLM', 'WhisperForCausalLM', 'XGLMForCausalLM', 'XLMWithLMHeadModel', 'XLMProphetNetForCausalLM', 'XLMRobertaForCausalLM', 'XLMRobertaXLForCausalLM', 'XLNetLMHeadModel', 'XmodForCausalLM'].
[11]
[12]
[13]
[14]
[15]
[16]
'<|system|> You are a helpful AI assistant, you are an agent capable of using a variety of tools to answer a question. Here are a few of the tools available to you:\n\n- Blog: This tool helps you describe a certain knowledge point and content, and finally write it into Twitter or Facebook style content\n- Translate: This is a tool that helps you translate into any language, using plain language as required\n\nTo use these tools you must always respond in JSON format containing `"tool_name"` and `"input"` key-value pairs. For example, to answer the question, "Build Muliti Agents with MOE models" you must use the calculator tool like so:\n\n```json\n\n{\n "tool_name": "Blog",\n "input": "Build Muliti Agents with MOE models"\n}\n\n```\n\nOr to translate the question "can you introduce yourself in Chinese" you must respond:\n\n```json\n\n{\n "tool_name": "Search",\n "input": "can you introduce yourself in Chinese"\n}\n\n```\n\nRemember just output the final result, ouput in JSON format containing `"agentid"`,`"tool_name"` , `"input"` and `"output"` key-value pairs .:\n\n```json\n\n[\n\n\n{ "agentid": "step1",\n "tool_name": "Blog",\n "input": "Build Muliti Agents with MOE models",\n "output": "........."\n},\n\n{ "agentid": "step2",\n "tool_name": "Search",\n "input": "can you introduce yourself in Chinese",\n "output": "........."\n},\n{\n "agentid": "final"\n "tool_name": "Result",\n "output": "........."\n}\n]\n\n```\n\nThe users answer is as follows.\n <|end|>\n<|user|> Write something about Generative AI with MOE , translate it to Chinese <|end|>\n<|assistant|>' [17]
[18]
/anaconda/envs/azureml_py38/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:567: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.3` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`. warnings.warn( The `seen_tokens` attribute is deprecated and will be removed in v4.41. Use the `cache_position` model input instead.
[19]
' ```json\n\n[\n\n{ "agentid": "step1",\n "tool_name": "Blog",\n "input": "Generative AI with MOE",\n "output": "Generative AI with MOE (Mixture of Experts) is a powerful approach that combines the strengths of generative models and the flexibility of MOE architecture. This hybrid model can generate high-quality, diverse, and contextually relevant content, making it suitable for various applications such as content creation, data augmentation, and more."\n},\n\n{ "agentid": "step2",\n "tool_name": "Translate",\n "input": "Generative AI with MOE is a powerful approach that combines the strengths of generative models and the flexibility of MOE architecture. This hybrid model can generate high-quality, diverse, and contextually relevant content, making it suitable for various applications such as content creation, data augmentation, and more.",\n "output": "基于生成AI的MOE(Mixture of Experts)是一种强大的方法,它结合了生成模型的优势和MOE架构的灵活性。这种混合模型可以生成高质量、多样化且上下文相关的内容,使其适用于各种应用,如内容创建、数据增强等。"\n},\n{\n "agentid": "final",\n "tool_name": "Result",\n "output": "基于生成AI的MOE(Mixture of Experts)是一种强大的方法,它结合了生成模型的优势和MOE架构的灵活性。这种混合模型可以生成高质量、多样化且上下文相关的内容,使其适用于各种应用,如内容创建、数据增强等。"\n}\n]\n```'