Haystack MongoDB Atlas RAG


Haystack and MongoDB Atlas RAG notebook

Install dependencies:

[1]
Successfully installed backoff-2.2.1 boilerpy3-1.0.7 dnspython-2.6.1 h11-0.14.0 haystack-ai-2.1.2 haystack-bm25-1.0.2 httpcore-1.0.5 httpx-0.27.0 lazy-imports-0.3.1 mongodb-atlas-haystack-0.3.0 monotonic-1.6 openai-1.30.5 posthog-3.5.0 pymongo-4.7.2 tiktoken-0.7.0
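
The install output lists haystack-ai, mongodb-atlas-haystack, and tiktoken among the installed packages. Assuming those three are the ones requested directly (the others look like dependencies), the install step could be sketched as below. The helper names `pip_install_command` and `install` are hypothetical; in Colab a single `!pip install` line would do the same job.

```python
import subprocess
import sys

# Top-level packages assumed from the install output above.
PACKAGES = ["haystack-ai", "mongodb-atlas-haystack", "tiktoken"]


def pip_install_command(packages):
    """Build a pip command that targets the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "--quiet", *packages]


def install(packages=PACKAGES):
    """Run pip; raises subprocess.CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(packages))
```

Calling `install()` once at the top of the notebook is enough. Going through `sys.executable` keeps the packages in the same environment as the running kernel.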

Set up the MongoDB Atlas connection and OpenAI

  • Set the MongoDB connection string. Follow the steps here to get the connection string from the Atlas UI.

  • Set the OpenAI API key. The steps to obtain an API key are described here.

[2]
[3]
Enter your MongoDB connection string:··········
[4]
Enter your Open AI Key:··········
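
The two prompts above suggest the credentials are read interactively and kept out of the notebook source. A minimal sketch of that step, assuming the values are stored in the environment variables `MONGO_CONNECTION_STRING` and `OPENAI_API_KEY` (the names the MongoDB Atlas document store and the OpenAI components read by default). The helpers `set_secret` and `configure_credentials` are hypothetical, and unlike the prompts shown, this version skips the prompt when a value is already set.

```python
import getpass
import os


def set_secret(env_var, prompt, ask=getpass.getpass):
    """Store a secret in os.environ, prompting only if it is not already set."""
    if not os.environ.get(env_var):
        os.environ[env_var] = ask(prompt)
    return os.environ[env_var]


def configure_credentials(ask=getpass.getpass):
    """Collect both secrets used by the rest of the notebook."""
    set_secret("MONGO_CONNECTION_STRING", "Enter your MongoDB connection string:", ask)
    set_secret("OPENAI_API_KEY", "Enter your Open AI Key:", ask)
```

`getpass.getpass` hides the typed value, which is why the prompts above show only dots. The `ask` parameter exists so the prompt can be replaced in automated runs.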

Create a vector search index on the collection

Follow this tutorial to create a vector index on the test_collection collection in the haystack_test database.

Verify that the index name is vector_index and that the index definition specifies:

{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}
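
The same index can also be created from code instead of the Atlas UI. The sketch below is an assumption rather than part of the tutorial: it relies on PyMongo 4.7 or later (for the `type` argument of `SearchIndexModel`), on the collection already existing, and on hypothetical helper names. Note that `numDimensions` has to match the embedding size; 1536 is the size produced by text-embedding-ada-002.

```python
import json

INDEX_NAME = "vector_index"


def vector_index_definition(path="embedding", num_dimensions=1536, similarity="cosine"):
    """Return the index definition shown above as a Python dict."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


def create_vector_index(connection_string, database="haystack_test",
                        collection="test_collection"):
    """Create the vector search index; assumes the collection already exists."""
    # Imported here so the definition helper works without pymongo installed.
    from pymongo import MongoClient
    from pymongo.operations import SearchIndexModel

    client = MongoClient(connection_string)
    model = SearchIndexModel(
        definition=vector_index_definition(),
        name=INDEX_NAME,
        type="vectorSearch",
    )
    return client[database][collection].create_search_index(model)


# Print the definition to compare it with the JSON above.
print(json.dumps(vector_index_definition(), indent=2))
```

Index creation on Atlas is asynchronous, so the index may take a short while to become queryable after the call returns.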

Set up the vector store to load documents:

[5]
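
A minimal sketch of the vector store setup, assuming the `MongoDBAtlasDocumentStore` class from the mongodb-atlas-haystack package and the database, collection, and index names given earlier. The constructor reads the connection string from the `MONGO_CONNECTION_STRING` environment variable by default. `STORE_CONFIG` and `build_document_store` are hypothetical names, and later releases of the integration may require additional arguments.

```python
# Names taken from the index setup section of this notebook.
STORE_CONFIG = {
    "database_name": "haystack_test",
    "collection_name": "test_collection",
    "vector_search_index": "vector_index",
}


def build_document_store(config=STORE_CONFIG):
    """Create the Atlas-backed document store (needs MONGO_CONNECTION_STRING)."""
    # Imported lazily: requires the mongodb-atlas-haystack package.
    from haystack_integrations.document_stores.mongodb_atlas import (
        MongoDBAtlasDocumentStore,
    )

    return MongoDBAtlasDocumentStore(**config)
```

The returned store can be shared by the writer pipeline and the retriever, so both operate on the same collection and index.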

Build the writer pipeline to load documents

[6]
Calculating embeddings: 100%|██████████| 1/1 [00:00<00:00,  4.16it/s]
{'doc_embedder': {'meta': {'model': 'text-embedding-ada-002',
                           'usage': {'prompt_tokens': 32, 'total_tokens': 32}}},
 'doc_writer': {'documents_written': 0}}
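
The output above names two components, doc_embedder and doc_writer. A sketch of a pipeline with that shape follows, assuming Haystack 2.x's `OpenAIDocumentEmbedder` and `DocumentWriter`. The sample texts are hypothetical stand-ins, and the skip-duplicates policy is an assumption: with it, re-running the cell against a collection that already holds the documents would report `documents_written: 0`, as seen above.

```python
# Hypothetical sample texts; the notebook's actual documents are not shown.
SAMPLE_TEXTS = [
    "My name is Jean and I live in Paris.",
    "My name is Mark and I live in Berlin.",
    "My name is Giorgio and I live in Rome.",
]


def build_indexing_pipeline(document_store):
    """Embed documents with OpenAI, then write them to the document store."""
    # Imported lazily: requires haystack-ai and an OPENAI_API_KEY.
    from haystack import Pipeline
    from haystack.components.embedders import OpenAIDocumentEmbedder
    from haystack.components.writers import DocumentWriter
    from haystack.document_stores.types import DuplicatePolicy

    pipeline = Pipeline()
    pipeline.add_component("doc_embedder", OpenAIDocumentEmbedder())
    pipeline.add_component(
        "doc_writer",
        DocumentWriter(document_store=document_store, policy=DuplicatePolicy.SKIP),
    )
    # The embedder's output documents carry the vectors the writer stores.
    pipeline.connect("doc_embedder.documents", "doc_writer.documents")
    return pipeline


def index_texts(pipeline, texts=SAMPLE_TEXTS):
    """Wrap raw strings in Document objects and run the pipeline."""
    from haystack import Document

    documents = [Document(content=text) for text in texts]
    return pipeline.run({"doc_embedder": {"documents": documents}})
```

The dictionary returned by `index_texts` has one entry per component, which is the structure printed in the cell output above.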

Build a RAG Pipeline

Let's create a pipeline that will retrieve, augment, and generate a response to user questions.

[9]
<haystack.core.pipeline.pipeline.Pipeline object at 0x7fc98d95bdf0>
🚅 Components
  - text_embedder: OpenAITextEmbedder
  - retriever: MongoDBAtlasEmbeddingRetriever
  - prompt_builder: PromptBuilder
  - llm: OpenAIGenerator
🛤️ Connections
  - text_embedder.embedding -> retriever.query_embedding (List[float])
  - retriever.documents -> prompt_builder.documents (List[Document])
  - prompt_builder.prompt -> llm.prompt (str)
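
The component names and connections printed above are enough to sketch how such a pipeline could be assembled. The prompt template, the `top_k` value, and the function name below are assumptions for illustration; only the four components and three connections come from the output.

```python
# Hypothetical prompt; PromptBuilder derives its inputs (documents, query)
# from the Jinja variables used in the template.
PROMPT_TEMPLATE = """
Answer the question using only the context documents below.

Documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}

Question: {{ query }}
Answer:
"""


def build_rag_pipeline(document_store, top_k=5):
    """Wire embedder -> retriever -> prompt builder -> LLM."""
    # Imported lazily: requires haystack-ai and mongodb-atlas-haystack.
    from haystack import Pipeline
    from haystack.components.builders import PromptBuilder
    from haystack.components.embedders import OpenAITextEmbedder
    from haystack.components.generators import OpenAIGenerator
    from haystack_integrations.components.retrievers.mongodb_atlas import (
        MongoDBAtlasEmbeddingRetriever,
    )

    pipeline = Pipeline()
    pipeline.add_component("text_embedder", OpenAITextEmbedder())
    pipeline.add_component(
        "retriever",
        MongoDBAtlasEmbeddingRetriever(document_store=document_store, top_k=top_k),
    )
    pipeline.add_component("prompt_builder", PromptBuilder(template=PROMPT_TEMPLATE))
    pipeline.add_component("llm", OpenAIGenerator())

    # Same three connections as in the printed pipeline summary.
    pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
    pipeline.connect("retriever.documents", "prompt_builder.documents")
    pipeline.connect("prompt_builder.prompt", "llm.prompt")
    return pipeline
```

The query embedder should use the same embedding model as the document embedder, otherwise the query vector and the stored vectors are not comparable.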

Let's test the pipeline

[12]
Mark lives in Berlin.
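
A reply like the one above could be produced by a call along the following lines. This is a sketch under assumptions: the question text is a guess inferred from the answer, the helper name `ask` is hypothetical, and the `query` key assumes the prompt template exposes a variable of that name. The same question is passed twice because the embedder needs it to search and the prompt builder needs it to fill the prompt.

```python
def ask(rag_pipeline, query):
    """Run the RAG pipeline and return the first generated reply."""
    result = rag_pipeline.run(
        {
            "text_embedder": {"text": query},     # used to retrieve documents
            "prompt_builder": {"query": query},   # used to fill the prompt
        }
    )
    # OpenAIGenerator returns its completions under the "replies" key.
    return result["llm"]["replies"][0]


# Hypothetical question, inferred from the answer shown above.
EXAMPLE_QUERY = "Where does Mark live?"
```

With a pipeline built as described earlier, `print(ask(rag_pipeline, EXAMPLE_QUERY))` would print the model's answer.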