MongoDB as a Toolbox for LlamaIndex Agents
This notebook demonstrates how to leverage MongoDB Atlas as a "toolbox" for LlamaIndex agents. The application showcases the integration of MongoDB's capabilities, specifically its Vector Search feature, with LlamaIndex for building intelligent agents capable of performing various tasks by calling relevant tools stored and managed within MongoDB.
Key Features:
- MongoDB as a Tool Registry: Instead of hardcoding tool definitions within the agent, this application stores tool metadata (name, description, parameters) directly in a MongoDB collection.
- MongoDB Atlas Vector Search for Tool Discovery: LlamaIndex uses the vector embeddings of tool descriptions stored in MongoDB to perform semantic searches based on user queries. This allows the agent to dynamically discover and select the most relevant tools for a given task.
- LlamaIndex Agent with Function Calling: The LlamaIndex agent is configured to use the retrieved tool definitions from MongoDB to enable function calling. This means the agent can understand the user's intent and execute the appropriate Python function (tool) stored in the application.
- Data Storage in MongoDB: Besides tool definitions, the application also uses separate MongoDB collections to store operational data like customer orders, return requests, and policy documents.
- Integration with External Services: The tools defined and managed in MongoDB can interact with external services (e.g., fetching real-time data, processing requests) or perform operations on the data stored within MongoDB itself (e.g., looking up order details, creating return requests).
This approach provides a flexible and scalable way to manage and expand the agent's capabilities. New tools can be added to the MongoDB collection dynamically, and the agent can discover and utilize them without requiring code changes to the agent itself.
Environment Setup and Configuration
This section covers the installation of necessary libraries, setting up API keys, and configuring the database connection to MongoDB Atlas.
Install required libraries
This cell installs the necessary Python libraries using uv pip install. These libraries include:
- pymongo: A Python driver for MongoDB.
- llama-index-core: The core LlamaIndex library.
- llama-index-llms-openai: LlamaIndex integration with OpenAI LLMs.
- llama-index-embeddings-voyageai: LlamaIndex integration with VoyageAI embeddings.
- llama-index-vector-stores-mongodb: LlamaIndex integration with MongoDB Atlas Vector Search.
- llama-index-readers-file: LlamaIndex file readers.
Get and store API keys
This cell retrieves API keys for OpenAI, MongoDB, and VoyageAI from Google Colab's user data secrets and sets them as environment variables.
Please obtain your own API keys for OpenAI, MongoDB Atlas, and VoyageAI.
- OpenAI: You can get an API key from the OpenAI website.
- MongoDB Atlas: Get your connection string from your MongoDB Atlas cluster.
- VoyageAI: Obtain an API key from the VoyageAI website.
Once you have your keys, add them to Google Colab's user data secrets by clicking on the "🔑" icon in the left sidebar. Name the secrets OPENAI_API_KEY, MONGODB_URI, and VOYAGE_API_KEY respectively.
The cell also defines the GPT model to be used.
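As a rough illustration of this step, the sketch below copies the three named secrets into environment variables and fails early if one is missing. The helper `load_secrets`, the constant `REQUIRED_SECRETS`, and the model name are assumptions made for this example, not the notebook's own code.

```python
import os

# Secret names taken from the text above.
REQUIRED_SECRETS = ("OPENAI_API_KEY", "MONGODB_URI", "VOYAGE_API_KEY")

# Assumption: the actual model name used by the notebook is not shown here.
GPT_MODEL = "gpt-4o"


def load_secrets(get_secret, names=REQUIRED_SECRETS):
    """Copy each named secret into os.environ; raise if any is missing.

    `get_secret` is any callable that maps a secret name to its value
    (hypothetical helper signature, chosen so the sketch runs anywhere).
    """
    missing = []
    for name in names:
        value = get_secret(name)
        if value:
            os.environ[name] = value
        else:
            missing.append(name)
    if missing:
        raise KeyError(f"Missing secrets: {', '.join(missing)}")


# In Google Colab, the getter would typically be userdata.get:
#   from google.colab import userdata
#   load_secrets(userdata.get)
```

Passing the getter as an argument keeps the same function usable outside Colab, for example with `os.environ.get` or a dictionary of test values.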
Set up the database
This cell establishes a connection to the MongoDB Atlas database using the provided URI. It then defines the database name and the names of the collections that will be used in this notebook for storing tools, orders, returns, and policies. Finally, it creates client objects for each of these collections.
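A minimal sketch of the collection setup described above might look like the following. The database name and the helper `get_collections` are hypothetical; only the four collection roles (tools, orders, returns, policies) come from the text.

```python
DB_NAME = "agent_toolbox_demo"  # hypothetical database name
COLLECTION_NAMES = ("tools", "orders", "returns", "policies")


def get_collections(client, db_name=DB_NAME, names=COLLECTION_NAMES):
    """Return a {name: collection} mapping from a MongoClient-like object.

    Both MongoClient and Database support item access by name, so the same
    code works with a real client or with nested dicts in a test.
    """
    db = client[db_name]
    return {name: db[name] for name in names}


# With pymongo installed, the client would be created roughly like this:
#   import os
#   from pymongo import MongoClient
#   client = MongoClient(os.environ["MONGODB_URI"])
#   collections = get_collections(client)
#   tools_collection = collections["tools"]
```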
Loading Demo Data
Download and store policy documents into MongoDB Atlas vector store
This cell downloads policy documents and stores them in a MongoDB Atlas vector store. Its steps are:
- Initialize the vector store and check whether the collection is empty.
- Download the PDF documents, load them, and add metadata.
- Initialize the embedding model and the node parser, then parse the documents into nodes.
- Create a storage context and a vector index, and ingest the documents.
Create and Store Dummy Order Data
This cell generates a list of fake order data with details like order ID, date, status, total amount, shipping address, payment method, and items. It then checks if the orders collection in MongoDB is empty. If it is, the fake order data is inserted into the orders collection. This is done to populate the database with sample data for testing and demonstrating the order lookup functionality later in the notebook.
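The seed-if-empty pattern described above can be sketched as follows. The field names mirror the details listed in the text, but the exact schema, the value ranges, and the helper names `make_fake_orders` and `seed_if_empty` are assumptions for illustration.

```python
import random
from datetime import datetime, timedelta


def make_fake_orders(n=5, seed=0):
    """Generate n fake order documents (hypothetical schema)."""
    rng = random.Random(seed)
    orders = []
    for i in range(n):
        items = [
            {
                "name": f"Item {j}",
                "quantity": rng.randint(1, 3),
                "price": round(rng.uniform(5, 100), 2),
            }
            for j in range(rng.randint(1, 3))
        ]
        orders.append(
            {
                "order_id": 1001 + i,
                "order_date": datetime(2024, 1, 1) + timedelta(days=i),
                "status": rng.choice(["processing", "shipped", "delivered"]),
                "total_amount": round(
                    sum(it["quantity"] * it["price"] for it in items), 2
                ),
                "shipping_address": f"{100 + i} Example Street",
                "payment_method": rng.choice(["credit_card", "paypal"]),
                "items": items,
            }
        )
    return orders


def seed_if_empty(collection, docs):
    """Insert docs only when the collection is empty; return how many were inserted."""
    if collection.count_documents({}) == 0:
        collection.insert_many(docs)
        return len(docs)
    return 0
```

Checking `count_documents({})` first makes the cell safe to re-run: a second execution leaves the existing sample data untouched instead of inserting duplicates.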
Application Setup and Configuration
Define MongoDB Tool Decorator
This cell defines the mongodb_toolbox decorator. This decorator is used to register functions as tools that can be discovered and used by the LlamaIndex agent. It also handles generating embeddings for the tool descriptions and storing them in the MongoDB 'tools' collection for vector search.
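One way such a decorator could work is sketched below. This is not the notebook's implementation: here the target collection and an embedding callable are passed in explicitly (an assumption made so the sketch is self-contained), and the stored document layout is hypothetical.

```python
import inspect

# Maps tool name -> Python callable, so a stored definition can later be
# resolved back to real code.
decorated_tools_registry = {}


def mongodb_toolbox(collection, embed):
    """Decorator factory: register fn locally and upsert its metadata.

    `collection` needs an update_one method (as pymongo collections have);
    `embed` maps a description string to a vector. Both are assumptions.
    """

    def decorator(fn):
        description = inspect.getdoc(fn) or ""
        parameters = {
            name: (
                "any"
                if param.annotation is inspect.Parameter.empty
                else getattr(param.annotation, "__name__", str(param.annotation))
            )
            for name, param in inspect.signature(fn).parameters.items()
        }
        tool_doc = {
            "name": fn.__name__,
            "description": description,
            "parameters": parameters,
            "embedding": embed(description),
        }
        # Upsert so that re-running the cell updates the tool instead of
        # creating a duplicate document.
        collection.update_one(
            {"name": fn.__name__}, {"$set": tool_doc}, upsert=True
        )
        decorated_tools_registry[fn.__name__] = fn
        return fn

    return decorator
```

The decorator returns the original function unchanged, so decorated tools can still be called directly in ordinary Python code.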
Set up indexes
This cell checks for and creates vector search indexes on the specified MongoDB collections if they don't already exist. These indexes are crucial for performing efficient vector searches on the data stored in these collections.
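The check-then-create logic could be sketched roughly as below, assuming a recent pymongo release that supports the `vectorSearch` index type. The index name, the `embedding` field path, and the 1024 dimensions are assumptions; the dimension count must match the output size of whichever embedding model is used.

```python
def vector_index_definition(path="embedding", dimensions=1024, similarity="cosine"):
    """Build an Atlas Vector Search index definition (values are assumptions)."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dimensions,
                "similarity": similarity,
            }
        ]
    }


def ensure_vector_index(collection, index_name="vector_index", definition=None):
    """Create the index only if no search index with that name exists.

    Returns True when an index was created, False when it already existed.
    """
    existing = {idx["name"] for idx in collection.list_search_indexes()}
    if index_name in existing:
        return False
    model = {
        "name": index_name,
        "type": "vectorSearch",
        "definition": definition or vector_index_definition(),
    }
    # pymongo also accepts a pymongo.operations.SearchIndexModel instance here.
    collection.create_search_index(model)
    return True
```

Atlas builds search indexes asynchronously, so a newly created index may take a short while before it can serve queries.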
Define Vector Search Function
This cell defines the vector_search_tools function, which performs a vector search on a given LlamaIndex vector store based on a user query. It uses the specified vector store and embedding model to find the most relevant documents (in this case, tool definitions) and returns a list of their metadata.
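To make the ranking idea concrete, here is an in-memory stand-in that scores tool documents by cosine similarity and returns their metadata. It deliberately replaces the LlamaIndex vector store and Atlas Vector Search with a plain Python list, so the signature differs from the notebook's function; the `embedding` field name and the `embed` callable are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search_tools(query, tool_docs, embed, top_k=3):
    """Return metadata of the top_k tool documents most similar to the query.

    tool_docs: list of dicts, each with an "embedding" vector plus metadata.
    embed: callable mapping text to a vector (stand-in for the embedding model).
    """
    query_vec = embed(query)
    ranked = sorted(
        tool_docs,
        key=lambda doc: cosine_similarity(query_vec, doc["embedding"]),
        reverse=True,
    )
    # Strip the embedding so callers receive only the tool metadata.
    return [
        {key: value for key, value in doc.items() if key != "embedding"}
        for doc in ranked[:top_k]
    ]
```

In the real setup the database performs this ranking with an approximate nearest-neighbor index, which avoids scanning every tool document.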
Define MongoDB Tools
This cell defines several Python functions that will serve as tools for the LlamaIndex agent. Each function is decorated with the @mongodb_toolbox decorator, which registers the function and stores its definition and embedding in the 'tools' collection in MongoDB. These tools include functions for shouting, getting weather, getting stock price, getting current time, looking up orders, responding in Spanish, checking return policy, and creating a return request.
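As an example of what one of these tools might look like, the sketch below implements an order lookup against a collection. The field names (`order_id`, `status`, `total_amount`) and the explicit `orders_collection` argument are assumptions; in the notebook the function would carry the `@mongodb_toolbox` decorator and would probably reference the orders collection directly.

```python
def lookup_order(order_id, orders_collection):
    """Look up an order by its ID and return a short status summary."""
    # Exclude MongoDB's internal _id so the result holds only order fields.
    order = orders_collection.find_one({"order_id": order_id}, {"_id": 0})
    if order is None:
        return f"No order found with ID {order_id}."
    return (
        f"Order {order['order_id']} is {order['status']}; "
        f"total {order['total_amount']}."
    )
```

The docstring matters more than usual here: it is the tool description that gets embedded, so it determines which user queries retrieve this tool.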
Populate Tools Function
This cell defines the populate_tools function. This function takes the results from a vector search (which are tool definitions) and converts them into a list of LlamaIndex FunctionTool objects. It looks up the actual function object in the decorated_tools_registry based on the tool name found in the search results and creates a FunctionTool with the corresponding function and its description.
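The lookup-and-wrap step can be sketched as below. To keep the example runnable without LlamaIndex, the tool constructor is passed in as `make_tool`; that parameter, and the `name` and `description` keys on each search result, are assumptions for this sketch.

```python
def populate_tools(search_results, registry, make_tool):
    """Turn vector-search results into tool objects.

    search_results: list of metadata dicts with "name" and "description".
    registry: mapping of tool name -> Python callable.
    make_tool: factory called as make_tool(fn=..., name=..., description=...).
    """
    tools = []
    for result in search_results:
        name = result.get("name")
        fn = registry.get(name)
        if fn is None:
            # The definition exists in MongoDB, but no matching function is
            # loaded in this process, so the tool cannot be called.
            continue
        tools.append(
            make_tool(fn=fn, name=name, description=result.get("description", ""))
        )
    return tools


# With LlamaIndex installed, the factory would be FunctionTool.from_defaults:
#   from llama_index.core.tools import FunctionTool
#   tools = populate_tools(results, decorated_tools_registry,
#                          FunctionTool.from_defaults)
```

Skipping unknown names rather than raising keeps the agent usable when the tools collection contains entries registered by other code.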
Running the Agent
Test Tool Retrieval
This cell demonstrates how to use the vector_search_tools function to find relevant tools based on a user query and then uses the populate_tools function to convert the search results into LlamaIndex FunctionTool objects. Finally, it prints the names of the retrieved tools to verify the process.