Rag Fastembed

RAG pipeline using FastEmbed for embeddings generation

FastEmbed is a lightweight Python library for embedding generation, maintained by Qdrant. It is designed to generate embeddings quickly and efficiently, even on CPU-only machines.

In this notebook, we use the FastEmbed-Haystack integration to generate embeddings for indexing and for retrieval-augmented generation (RAG).

Install dependencies

Download contents and create docs

Clean, split and index documents on Qdrant
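The cleaning, splitting, and store-creation steps for this section could look roughly like the sketch below. It is not the notebook's original code: the word-based splitting, the `split_length` and `split_overlap` values, the in-memory Qdrant location, and the helper names `preprocess` and `make_document_store` are all assumptions chosen for illustration. The embedding dimension of 384 matches BAAI/bge-small-en-v1.5.

```python
EMBEDDING_DIM = 384  # output size of BAAI/bge-small-en-v1.5


def preprocess(raw_docs, split_length=200, split_overlap=20):
    """Clean and split Haystack Documents (illustrative settings)."""
    # Imported lazily so the sketch loads even without Haystack installed.
    from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter

    cleaner = DocumentCleaner()
    splitter = DocumentSplitter(
        split_by="word",  # assumption: the original split unit is not shown
        split_length=split_length,
        split_overlap=split_overlap,
    )
    cleaned = cleaner.run(documents=raw_docs)["documents"]
    return splitter.run(documents=cleaned)["documents"]


def make_document_store():
    """Create an in-memory Qdrant store sized for the embedding model."""
    # Requires the qdrant-haystack package.
    from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

    return QdrantDocumentStore(
        ":memory:",  # assumption: local in-memory instance, no server needed
        embedding_dim=EMBEDDING_DIM,
        recreate_index=True,
    )
```

Calling `preprocess` on a list of Haystack `Document` objects returns the smaller chunks that are later embedded and written to the store.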

493

FastEmbed Document Embedder

Here we initialize the FastEmbed Document Embedder and use it to generate embeddings for the documents. We use a small but capable model, BAAI/bge-small-en-v1.5, and set the parallel parameter to 0 so that all available CPU cores are used for embedding generation.

⚠️ If you are running this notebook on Google Colab, note that Colab provides only 2 CPU cores, so embedding generation may be slower than on a standard machine.
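A quick way to see how many cores the current machine exposes is the standard library's `os.cpu_count()`. This is only a rough check: it reports the logical CPUs in the system, which can be more than the process is actually allowed to use.

```python
import os

# os.cpu_count() can return None when the count cannot be determined.
available_cores = os.cpu_count() or 1
print(f"CPU cores visible to Python: {available_cores}")
```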

For more information on the FastEmbed-Haystack integration, please refer to the documentation and API reference.
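A minimal sketch of the embedding step described above, assuming the `fastembed-haystack` package is installed. The helper name `embed_documents` is hypothetical, and the original cell may pass additional arguments that are not shown here.

```python
EMBEDDING_MODEL = "BAAI/bge-small-en-v1.5"


def embed_documents(docs, parallel=0):
    """Return the documents with their `embedding` field filled in."""
    # Imported lazily so the sketch loads without the integration installed.
    from haystack_integrations.components.embedders.fastembed import (
        FastembedDocumentEmbedder,
    )

    embedder = FastembedDocumentEmbedder(
        model=EMBEDDING_MODEL,
        parallel=parallel,  # 0 means: use all available CPU cores
    )
    embedder.warm_up()  # downloads and loads the model on first use
    return embedder.run(documents=docs)["documents"]
```

The embedded documents can then be stored by passing them to the `write_documents` method of a Qdrant document store.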

Fetching 9 files: 100%|██████████| 9/9 [00:00<00:00, 36900.04it/s]
Calculating embeddings: 100%|██████████| 493/493 [00:35<00:00, 13.73it/s]
500it [00:00, 4262.26it/s]             
493

RAG Pipeline using Qwen 2.5 7B
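One possible way to wire such a pipeline in a recent Haystack 2.x release is sketched below. Everything specific here is an assumption rather than the notebook's original code: the prompt wording, the model id `Qwen/Qwen2.5-7B-Instruct`, the use of the Hugging Face serverless API (which expects a token in the environment), the `top_k` value, and the helper names `build_rag_pipeline` and `ask`.

```python
from textwrap import dedent

GENERATOR_MODEL = "Qwen/Qwen2.5-7B-Instruct"  # assumed Hugging Face model id

# Jinja2 template: retrieved documents first, then the question.
PROMPT_TEMPLATE = dedent("""\
    Using only the information in the documents below, answer the question.

    Documents:
    {% for doc in documents %}
    {{ doc.content }}
    {% endfor %}

    Question: {{ question }}
    Answer:""")


def build_rag_pipeline(document_store, top_k=5):
    """Wire query embedder -> retriever -> prompt builder -> generator."""
    # Imported lazily so the sketch loads without Haystack installed.
    from haystack import Pipeline
    from haystack.components.builders import ChatPromptBuilder
    from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
    from haystack.dataclasses import ChatMessage
    from haystack_integrations.components.embedders.fastembed import (
        FastembedTextEmbedder,
    )
    from haystack_integrations.components.retrievers.qdrant import (
        QdrantEmbeddingRetriever,
    )

    pipeline = Pipeline()
    # The query must be embedded with the same model as the documents.
    pipeline.add_component(
        "text_embedder", FastembedTextEmbedder(model="BAAI/bge-small-en-v1.5")
    )
    pipeline.add_component(
        "retriever",
        QdrantEmbeddingRetriever(document_store=document_store, top_k=top_k),
    )
    pipeline.add_component(
        "prompt_builder",
        ChatPromptBuilder(
            template=[ChatMessage.from_user(PROMPT_TEMPLATE)],
            required_variables=["documents", "question"],
        ),
    )
    pipeline.add_component(
        "generator",
        HuggingFaceAPIChatGenerator(
            api_type="serverless_inference_api",
            api_params={"model": GENERATOR_MODEL},
        ),
    )
    pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
    pipeline.connect("retriever.documents", "prompt_builder.documents")
    pipeline.connect("prompt_builder.prompt", "generator.messages")
    return pipeline


def ask(pipeline, question):
    """Run the pipeline for one question and return the first reply's text."""
    result = pipeline.run(
        {
            "text_embedder": {"text": question},
            "prompt_builder": {"question": question},
        }
    )
    return result["generator"]["replies"][0].text
```

The question is passed twice because both the query embedder and the prompt builder need it: once to retrieve relevant chunks, once to fill the prompt.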

Try the pipeline

Calculating embeddings: 100%|██████████| 1/1 [00:00<00:00, 24.62it/s]
(' Dave Grohl is the founder and lead vocalist of the American rock band Foo '
 'Fighters, which he formed in 1994 after the breakup of Nirvana, in which he '
 'was the drummer.')