Streaming model explorer for Haystack

Notebook by Tilde Thurium: Mastodon || Twitter || LinkedIn

Problem: there are so many LLMs these days! Which model is the best for my use case?

This notebook uses Haystack to compare the results of sending the same prompt to several different models.

This is a very basic demo where you can only compare a few models that support streaming responses. I'd like to support more models in the future, so watch this space for updates.
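
To make the idea concrete, here is a minimal sketch of the fan-out pattern with no API calls at all. It assumes one thread per model, which may not be how the notebook's own cells do it. The `fake_stream` helper and the `model-a` / `model-b` names are hypothetical stand-ins for real streaming generators.

```python
import threading
from typing import Callable, Dict, Iterable, List

Streamer = Callable[[str], Iterable[str]]


def fake_stream(reply: str) -> Streamer:
    """Hypothetical stand-in for a streaming model: yields a reply word by word."""
    def run(prompt: str) -> Iterable[str]:
        for word in f"{reply} (prompt: {prompt})".split():
            yield word
    return run


def compare(prompt: str, models: Dict[str, Streamer]) -> Dict[str, List[str]]:
    """Send the same prompt to every model and collect the streamed pieces."""
    results: Dict[str, List[str]] = {name: [] for name in models}

    def worker(name: str) -> None:
        # Each thread appends only to its own list, so no lock is needed.
        for piece in models[name](prompt):
            results[name].append(piece)

    threads = [threading.Thread(target=worker, args=(name,)) for name in models]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


results = compare(
    "Tell me a joke",
    {
        "model-a": fake_stream("Knock knock"),
        "model-b": fake_stream("Why did the chicken cross the road?"),
    },
)
for name, pieces in results.items():
    print(f"{name}: {' '.join(pieces)}")
```

Swapping a real generator in for `fake_stream` keeps the shape the same: one prompt goes in, and each model fills its own output area as pieces arrive.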

Models

Haystack's OpenAIGenerator and CohereGenerator support streaming out of the box.

The other models use the HuggingFaceAPIGenerator.
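
As a rough sketch of how the three generator types might be wired up with a streaming callback: the code below assumes a recent Haystack 2.x install plus the Cohere integration package (`pip install haystack-ai cohere-haystack`). The OpenAI and Cohere model names, the `build_generators` helper, and the `make_callback` factory are illustrative assumptions, not taken from the notebook. Constructor arguments can differ between Haystack versions, so check them against the version you have installed.

```python
from typing import Any, Callable, Dict

# Model names are assumptions for illustration, except Mistral-7B-v0.1,
# which the prerequisites mention.
MODEL_NAMES = {
    "openai": "gpt-4o-mini",
    "cohere": "command-r",
    "huggingface": "mistralai/Mistral-7B-v0.1",
}


def build_generators(
    keys: Dict[str, str],
    make_callback: Callable[[str], Callable[[Any], None]],
) -> Dict[str, Any]:
    """Create one streaming generator per provider.

    `make_callback(name)` should return a callable that receives each
    streamed chunk for that provider.
    """
    # Imported here so this sketch can be loaded without Haystack installed.
    from haystack.components.generators import HuggingFaceAPIGenerator, OpenAIGenerator
    from haystack.utils import Secret
    from haystack_integrations.components.generators.cohere import CohereGenerator

    return {
        "openai": OpenAIGenerator(
            api_key=Secret.from_token(keys["openai"]),
            model=MODEL_NAMES["openai"],
            streaming_callback=make_callback("openai"),
        ),
        "cohere": CohereGenerator(
            api_key=Secret.from_token(keys["cohere"]),
            model=MODEL_NAMES["cohere"],
            streaming_callback=make_callback("cohere"),
        ),
        "huggingface": HuggingFaceAPIGenerator(
            api_type="serverless_inference_api",
            api_params={"model": MODEL_NAMES["huggingface"]},
            token=Secret.from_token(keys["huggingface"]),
            streaming_callback=make_callback("huggingface"),
        ),
    }
```

Each generator is then invoked with `generator.run(prompt="...")`, and the callback fires once per streamed chunk.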

Prerequisites

You need HuggingFace, Cohere, and OpenAI API keys. Save them as secrets in your Colab. Click on the key icon in the left menu or see detailed instructions here.
To use Mistral-7B-v0.1, you also need to accept Mistral's terms on the model page: https://huggingface.co/mistralai/Mistral-7B-v0.1

For userdata.get to work, the keys must be saved as secrets in your Colab, as described in the prerequisites.
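
One possible way to read the keys is sketched below. The `load_api_key` helper and the fallback to environment variables outside Colab are assumptions added for illustration, and `DEMO_API_KEY` is a placeholder name rather than a real secret.

```python
import os


def load_api_key(name: str) -> str:
    """Return a secret from Colab if available, else from the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        userdata = None

    value = None
    if userdata is not None:
        try:
            value = userdata.get(name)
        except Exception:
            # The secret is missing or notebook access was not granted.
            value = None

    value = value or os.environ.get(name)
    if not value:
        raise KeyError(f"No secret or environment variable named {name!r}")
    return value


# Placeholder value so the sketch runs anywhere; use your real secret names.
os.environ.setdefault("DEMO_API_KEY", "not-a-real-key")
demo_key = load_api_key("DEMO_API_KEY")
print(f"Loaded a key of length {len(demo_key)}")
```

Raising early on a missing key gives a clearer error than a failed API call later in the notebook.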

The AppendToken dataclass formats the output so that the model name is printed, and the text follows in chunks of 5 tokens.
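
A minimal sketch of what such a callback could look like follows; it is not the notebook's exact code. It assumes each streamed chunk exposes its text as a `content` attribute, and it writes to a plain callable (`write`, a hypothetical parameter) instead of a notebook widget so it can run anywhere.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any, Callable, List


@dataclass
class AppendToken:
    model_name: str
    write: Callable[[str], None] = print  # hypothetical sink; a widget could go here
    chunk_size: int = 5
    _buffer: List[str] = field(default_factory=list)
    _header_written: bool = False

    def __call__(self, chunk: Any) -> None:
        # Print the model name once, before the first piece of text.
        if not self._header_written:
            self.write(f"{self.model_name}:")
            self._header_written = True
        self._buffer.append(getattr(chunk, "content", str(chunk)))
        # Emit the buffered text once chunk_size tokens have arrived.
        if len(self._buffer) >= self.chunk_size:
            self.flush()

    def flush(self) -> None:
        """Write out whatever is left in the buffer (call at end of stream)."""
        if self._buffer:
            self.write("".join(self._buffer))
            self._buffer.clear()


lines: List[str] = []
callback = AppendToken("demo-model", write=lines.append)
for token in ["The ", "quick ", "brown ", "fox ", "jumps ", "over ", "it"]:
    callback(SimpleNamespace(content=token))
callback.flush()
print(lines)  # ['demo-model:', 'The quick brown fox jumps ', 'over it']
```

Buffering a few tokens before writing reduces display updates, at the cost of needing a final `flush()` so a short tail is not lost.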

This was a very silly example prompt. If you found this demo useful, let me know the kinds of prompts you tested it with!

Mastodon || Twitter || LinkedIn

Thanks for following along.