Pinecone 00 Langgraph Intro

alph-notebooks/pinecone-examples / 00-langgraph-intro.ipynb

Understanding LangGraph

LangGraph is a library built by the LangChain team for building AI agents as graphs, i.e. agentic state machines.

The system packages below are prerequisites for pygraphviz, a graph visualization library. We will use it throughout this notebook to inspect the structure of our graphs, but it is not required to use LangGraph.

[1]

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
pkg-config is already the newest version (0.29.2-1ubuntu3).
graphviz is already the newest version (2.42.2-6).
libgraphviz-dev is already the newest version (2.42.2-6).
python3-dev is already the newest version (3.10.6-1~22.04).
0 upgraded, 0 newly installed, 0 to remove and 45 not upgraded.

We need a few libraries from LangChain:

[2]

Graph State

We will define a custom graph state to support our agent-oriented decision making. The state holds:

the user input (i.e. the most recent message from the user)
agent_out, which the graph uses to pass agent outputs between nodes and which also holds our final output
intermediate_steps, a list maintained over the graph run to keep track of the results of previous steps

During each step in our graph we will be able to add to, modify, or extract these values from our state object.
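As a rough sketch of what such a state could look like, the block below declares the three fields as a TypedDict using only the standard library. The key name `input`, the field types (`Any` stands in for the agent action classes), and the `apply_update` helper are assumptions made for illustration; the helper only mimics the idea that a list field annotated with a merge function is appended to while other fields are overwritten.

```python
import operator
from typing import Annotated, Any, Optional, TypedDict, get_type_hints


class AgentState(TypedDict):
    # Most recent message from the user (key name is an assumption).
    input: str
    # Latest agent output; None before the agent has run.
    agent_out: Optional[Any]
    # (action, result) pairs; the Annotated metadata names the merge function.
    intermediate_steps: Annotated[list[tuple[Any, str]], operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: merge a node's partial update into the state."""
    hints = get_type_hints(AgentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducers = getattr(hints[key], "__metadata__", ())
        if reducers and key in merged:
            # Annotated fields are combined, e.g. list + list.
            merged[key] = reducers[0](merged[key], value)
        else:
            # Plain fields are simply overwritten.
            merged[key] = value
    return merged


state = {"input": "what is AI?", "agent_out": None, "intermediate_steps": []}
state = apply_update(state, {"intermediate_steps": [("search", "AI is ...")]})
state = apply_update(
    state, {"agent_out": "done", "intermediate_steps": [("final_answer", "")]}
)
print(state["intermediate_steps"])
```

Each node returns only the keys it wants to change, so the history in intermediate_steps grows across steps without any node having to copy the earlier entries.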

[3]

Emulate Search

To test a RAG-like agent we'll provide a tool that returns information in the way we would expect a search tool in a RAG agent to do.
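One simple way to emulate search is a stub that ignores the query and always returns the same passage. The sketch below assumes that approach; the function name `mock_search` and the constants are hypothetical, and the passage text is borrowed from the answer shown later in this notebook.

```python
# Canned passage and source, taken from the EHI answer later in this notebook.
SOURCE = "https://arxiv.org/pdf/2310.08891.pdf"
PASSAGE = (
    "EHI uses a standard dual encoder model to embed queries and documents "
    "while learning an inverted file index (IVF) style tree structure for "
    "efficient ANNS."
)


def mock_search(query: str) -> str:
    """Return the same canned passage for any query, like a stubbed retriever."""
    return f"Source: {SOURCE}\n\n{PASSAGE}"


print(mock_search("what are EHI embeddings?"))
```

Because the stub is deterministic, any change in the agent's final answer can be attributed to the LLM rather than to retrieval.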

[4]

Custom Tools

We will define two tools for this agent: a search tool (which emulates our RAG component) and a final_answer tool, which provides output in a specific format, i.e.:

{
    "answer": "<LLM generated answer here>",
    "source": "<LLM generated citation here>"
}

We define both using the @tool decorator from LangChain.

[5]

These tools will be triggered via OpenAI Tools (i.e. function calling). The LLM is given the schema (i.e. the structure) of each function it can call, as we can see here:

[6]

StructuredTool(name='search', description='search(query: str) - Searches for information on the topic of artificial intelligence (AI).\n    Cannot be used to research any other topics. Search query must be provided\n    in natural language and be verbose.', args_schema=<class 'pydantic.v1.main.searchSchema'>, func=<function search_tool at 0x7c2e2c357520>)
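To make the idea of a function schema concrete, the sketch below builds an OpenAI-style tool description by hand from a Python function signature. This is not how LangChain performs the conversion internally; `to_tool_schema` and `PY_TO_JSON` are hypothetical helpers, and the stub `search` body is a placeholder.

```python
import inspect
import json


def search(query: str) -> str:
    """Searches for information on the topic of artificial intelligence (AI)."""
    return ""  # placeholder body; only the signature and docstring matter here


# Minimal mapping from Python annotations to JSON Schema type names.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_tool_schema(func) -> dict:
    """Hypothetical helper: describe a function in the OpenAI tools format."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the argument is required
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


schema = to_tool_schema(search)
print(json.dumps(schema, indent=2))
```

The description matters as much as the parameter types, since it is the only guidance the LLM has about when to call the tool.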

Initialize Agent

[7]

Test the agent quickly to confirm it is functional:

[8]

[ToolAgentAction(tool='search', tool_input={'query': 'EHI embeddings'}, log="\nInvoking: `search` with `{'query': 'EHI embeddings'}`\n\n\n", message_log=[AIMessage(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_lnb9cUYfYUD2IXfjd5aaBM6z', 'function': {'arguments': '{"query":"EHI embeddings"}', 'name': 'search'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-bdfd0e63-4422-4360-ad79-36b0df05cc35-0', tool_calls=[{'name': 'search', 'args': {'query': 'EHI embeddings'}, 'id': 'call_lnb9cUYfYUD2IXfjd5aaBM6z'}])], tool_call_id='call_lnb9cUYfYUD2IXfjd5aaBM6z')]

[9]

{'index': 0,
 'id': 'call_lnb9cUYfYUD2IXfjd5aaBM6z',
 'function': {'arguments': '{"query":"EHI embeddings"}', 'name': 'search'},
 'type': 'function'}

The agent won't perform the function calls itself; that is up to us, and we will handle it in downstream nodes of our agent graph.
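Executing a tool call ourselves amounts to reading the function name, decoding the JSON-encoded arguments, and calling the matching Python function. The sketch below does this for the tool call shown above; the `TOOLS` registry, the stub `search`, and `execute_tool_call` are hypothetical names used only for illustration.

```python
import json


def search(query: str) -> str:
    """Stub standing in for the real search tool."""
    return f"(stub) results for {query!r}"


# Registry mapping tool names to callables.
TOOLS = {"search": search}

# The tool call emitted by the agent, copied from the output above.
tool_call = {
    "index": 0,
    "id": "call_lnb9cUYfYUD2IXfjd5aaBM6z",
    "function": {"arguments": '{"query":"EHI embeddings"}', "name": "search"},
    "type": "function",
}


def execute_tool_call(call: dict) -> str:
    """Look up the requested tool and call it with the decoded arguments."""
    name = call["function"]["name"]
    kwargs = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


result = execute_tool_call(tool_call)
print(result)
```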

The information provided by agent_out will be used to decide whether we move to the search or END nodes of our graph. We'll also add an error handler node in case our agent fails to produce the output we need.
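A routing function of this kind can be sketched as a plain function that inspects agent_out and returns the name of the next step. In the block below, `ToolCall` is a hypothetical stand-in for the agent's action objects (which expose a `tool` attribute in the output above), and the returned labels "search", "final_answer", and "error" are assumed names.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """Hypothetical stand-in for the action objects returned by the agent."""
    tool: str
    tool_input: dict = field(default_factory=dict)


def router(state: dict) -> str:
    """Pick the next step from the most recent agent output."""
    out = state.get("agent_out")
    # The agent returns a list of actions when it decides to call a tool.
    if isinstance(out, list) and out:
        tool_name = getattr(out[-1], "tool", None)
        if tool_name in ("search", "final_answer"):
            return tool_name
    # Anything else (plain text, empty output) goes to the error handler.
    return "error"


print(router({"agent_out": [ToolCall("search", {"query": "EHI embeddings"})]}))
print(router({"agent_out": "Hello! How can I assist you today?"}))
```

Routing unexpected output to an error handler keeps the graph from stalling when the LLM replies in plain text instead of calling a tool.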

Define Nodes for Graph

[10]

Define Graph

Our graph is constructed of nodes and edges. A node represents a function (one of those we just defined above) whereas an edge allows us to travel from one node to another.

Let's start by initializing our graph using our AgentState object and adding our first set of nodes and the graph entry point (ie where the graph begins once called).

[ ]

In addition to our nodes we have our "one-way" edges — that is, once node X is called the state must continue to node Y as defined by these edges. We define these using:

graph.add_edge(X, Y)

If X or Y are defined nodes in our graph we pass the name of that node in string format. So, if we want to add an edge that navigates from our "search" node to our "rag_final_answer" node, we do:

graph.add_edge("search", "rag_final_answer")

We will also have an end node in our graph. We do not define this node ourselves; it is imported as the special graph object END. To use it, we must add edges between our final nodes and the END object, like so:

graph.add_edge("rag_final_answer", END)

When the END node is called, our graph completes.
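The node, edge, and END semantics described above can be illustrated with a toy graph runner written in plain Python. This is not LangGraph's implementation: `ToyGraph`, its `END` sentinel, and every method other than the `add_edge(X, Y)` shape are simplified placeholders, and the node functions are stubs.

```python
END = "__end__"  # toy sentinel marking the end of the graph


class ToyGraph:
    """Toy state machine: nodes are functions, edges say where to go next."""

    def __init__(self):
        self.nodes = {}
        self.edges = {}
        self.entry = None

    def add_node(self, name, fn):
        self.nodes[name] = fn

    def set_entry_point(self, name):
        self.entry = name

    def add_edge(self, src, dst):
        # One-way edge: after src runs, always continue to dst.
        self.edges[src] = dst

    def add_router_edge(self, src, chooser):
        # Conditional edge: chooser(state) returns the name of the next node.
        self.edges[src] = chooser

    def invoke(self, state):
        current, trace = self.entry, []
        while current != END:
            trace.append(current)
            state = {**state, **self.nodes[current](state)}  # merge the node's update
            nxt = self.edges[current]
            current = nxt(state) if callable(nxt) else nxt
        return state, trace


graph = ToyGraph()
graph.add_node("query_agent", lambda s: {"agent_out": "search"})
graph.add_node("search", lambda s: {"context": "stub context"})
graph.add_node("rag_final_answer", lambda s: {"answer": f"answer using {s['context']}"})
graph.add_node("error", lambda s: {"answer": "could not answer"})
graph.set_entry_point("query_agent")
graph.add_router_edge(
    "query_agent", lambda s: "search" if s["agent_out"] == "search" else "error"
)
graph.add_edge("search", "rag_final_answer")
graph.add_edge("rag_final_answer", END)
graph.add_edge("error", END)

final_state, trace = graph.invoke({"input": "what are EHI embeddings?"})
print(trace)
print(final_state["answer"])
```

The recorded trace plays the same role as the "> node" lines printed in the runs below: it shows which path through the graph a given input took.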

[11]

[12]

[13]

> run_query_agent
> router
> execute_search
> final_answer

[14]

{"answer":"AI stands for Artificial Intelligence. It refers to the simulation of human intelligence processes by machines, especially computer systems. AI encompasses various technologies and approaches that enable machines to perform tasks that typically require human intelligence, such as learning, problem-solving, perception, and decision-making.","source":"https://en.wikipedia.org/wiki/Artificial_intelligence"}

[15]

> run_query_agent
> router
> execute_search
> final_answer
{"answer":"EHI embeddings refer to the embeddings generated by the End-to-end Hierarchical Indexing (EHI) model. These embeddings are learned jointly with the ANNS (Approximate Nearest Neighbor Search) structure to optimize retrieval performance. EHI uses a standard dual encoder model to embed queries and documents while learning an inverted file index (IVF) style tree structure for efficient ANNS. The EHI embeddings are designed to capture the position of a query/document in the tree, ensuring stable and efficient learning of the discrete tree-based ANNS structure.","source":"https://arxiv.org/pdf/2310.08891.pdf"}

[16]

> run_query_agent
> router
> execute_search
> final_answer
{"answer":"EHI embeddings refer to the embeddings generated by the End-to-end Hierarchical Indexing (EHI) model. The EHI model jointly learns both the embeddings and the Approximate Nearest Neighbor Search (ANNS) structure to optimize retrieval performance. It uses a standard dual encoder model for embedding queries and documents while learning an inverted file index (IVF) style tree structure for efficient ANNS. EHI introduces the notion of dense path embedding to capture the position of a query/document in the tree, ensuring stable and efficient learning of the discrete tree-based ANNS structure.","source":"https://arxiv.org/pdf/2310.08891.pdf"}

[17]

> run_query_agent
> router
> handle_error
{"answer":"Hello! How can I assist you today?","source":"Assistant's response"}

[18]

> run_query_agent
> router
> handle_error
{"answer":"Hello! How can I assist you today?","source":"N/A"}