Mistral AI Property Graph Extractors Retrievers

Extractors and Retrievers in Property Graph

In this notebook, we will explore how to define extractors and retrievers for the PropertyGraphIndex.

A property graph is a structured collection of labeled nodes (the labels can be entity categories or text labels), each carrying properties (metadata), and connected by relationships to form structured paths (triplets).
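
To make the definition concrete, here is a minimal, library-free sketch of a property graph. The class names (Node, Relation, PropertyGraph) and the sample data are hypothetical and are not LlamaIndex classes; they only illustrate labeled nodes, properties, and triplets.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                                        # unique identifier of the node
    label: str                                       # entity category, e.g. "PERSON"
    properties: dict = field(default_factory=dict)   # arbitrary metadata


@dataclass
class Relation:
    source: str                                      # name of the source node
    label: str                                       # relationship type, e.g. "WORKS_AT"
    target: str                                      # name of the target node
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory property graph, for illustration only."""

    def __init__(self):
        self.nodes = {}
        self.relations = []

    def add_node(self, node):
        self.nodes[node.name] = node

    def add_relation(self, relation):
        # A relation may only connect nodes that are already in the graph.
        if relation.source not in self.nodes or relation.target not in self.nodes:
            raise KeyError("both endpoints must be added before the relation")
        self.relations.append(relation)

    def triplets(self):
        # Each relation yields one path of the form (source, relation, target).
        return [(r.source, r.label, r.target) for r in self.relations]


graph = PropertyGraph()
graph.add_node(Node("Alice", "PERSON", {"source_chunk": "chunk-1"}))
graph.add_node(Node("Acme", "COMPANY"))
graph.add_relation(Relation("Alice", "WORKS_AT", "Acme", {"since": 2020}))
print(graph.triplets())
```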

In LlamaIndex, the PropertyGraphIndex plays a crucial role in:

• Constructing a graph

• Querying a graph

Building and Using PropertyGraph

Property graph construction involves executing a series of knowledge graph extractors on each chunk, and attaching entities and relations as metadata to each node.

You can use as many extractors as needed, and all will be applied.
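
The idea of running every extractor on every chunk can be sketched without any library. In the sketch below, the two regex-based extractors, the chunk dictionaries, and the apply_extractors helper are all hypothetical stand-ins; real extractors would typically call an LLM instead.

```python
import re


def is_a_extractor(text):
    """Toy extractor: turns 'X is a Y' into (X, IS_A, Y)."""
    return [(m.group(1), "IS_A", m.group(2))
            for m in re.finditer(r"(\w+) is a (\w+)", text)]


def works_at_extractor(text):
    """Toy extractor: turns 'X works at Y' into (X, WORKS_AT, Y)."""
    return [(m.group(1), "WORKS_AT", m.group(2))
            for m in re.finditer(r"(\w+) works at (\w+)", text)]


def apply_extractors(chunks, extractors):
    """Run every extractor on every chunk and store the triplets as metadata."""
    for chunk in chunks:
        triplets = chunk["metadata"].setdefault("triplets", [])
        for extract in extractors:
            for triplet in extract(chunk["text"]):
                if triplet not in triplets:   # skip duplicates across extractors
                    triplets.append(triplet)
    return chunks


chunks = [
    {"text": "Alice works at Acme. Acme is a company.", "metadata": {}},
    {"text": "Bob is a developer.", "metadata": {}},
]
apply_extractors(chunks, [is_a_extractor, works_at_extractor])
for chunk in chunks:
    print(chunk["metadata"]["triplets"])
```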

If no extractors are provided, the defaults are SimpleLLMPathExtractor and ImplicitPathExtractor.

SimpleLLMPathExtractor

Use an LLM to extract short statements and parse single-hop paths in the format (entity1, relation, entity2).

If desired, you can also customize both the prompt and the function used for parsing the paths.

Here’s a straightforward (though simplistic) example:
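
A minimal sketch of a custom parsing function, assuming the LLM was prompted to return one triplet per line as comma-separated values. The function name parse_triplets and the sample output are hypothetical; how the function is passed to SimpleLLMPathExtractor should be checked against the library documentation.

```python
from typing import List, Tuple


def parse_triplets(response_str: str) -> List[Tuple[str, str, str]]:
    """Parse lines such as 'entity1, relation, entity2' into triplets."""
    triplets = []
    for line in response_str.splitlines():
        # Tolerate optional surrounding parentheses and stray whitespace.
        parts = [part.strip() for part in line.strip().strip("()").split(",")]
        # Keep only well-formed lines with exactly three non-empty fields.
        if len(parts) == 3 and all(parts):
            triplets.append((parts[0], parts[1], parts[2]))
    return triplets


# Hypothetical LLM output, used only to exercise the parser.
sample_output = """Alice, works at, Acme
(Acme, located in, Paris)
this line is not a triplet
"""
print(parse_triplets(sample_output))
```

Being simplistic, this parser breaks on entity names that contain commas; a delimiter that is unlikely to appear in the text would be more robust.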

ImplicitPathExtractor

Extract paths using the node.relationships attribute on each LlamaIndex node object.

This extraction process does not require an LLM or embedding model, as it simply parses properties that already exist on the node objects.
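
The following sketch shows the idea with plain dictionaries rather than LlamaIndex node objects. The node layout and the implicit_paths helper are hypothetical; the relationship names (SOURCE, PREVIOUS, NEXT) mirror the kinds of links that LlamaIndex nodes usually carry.

```python
def implicit_paths(nodes):
    """Turn relationships already stored on each node into triplets."""
    paths = []
    for node in nodes:
        for relation_type, target_id in node["relationships"].items():
            paths.append((node["id"], relation_type, target_id))
    return paths


# Hypothetical nodes: two chunks of the same document, linked to each other.
nodes = [
    {"id": "chunk-1", "relationships": {"SOURCE": "doc-1", "NEXT": "chunk-2"}},
    {"id": "chunk-2", "relationships": {"SOURCE": "doc-1", "PREVIOUS": "chunk-1"}},
]
for path in implicit_paths(nodes):
    print(path)
```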

SchemaLLMPathExtractor

Extract paths by adhering to a strict schema that specifies allowed entities, relationships, and the connections between them.

Using Pydantic, structured outputs from LLMs, and validation logic, we can dynamically define a schema and verify the extractions for each path (triplet).
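
The validation step can be sketched in plain Python. The entity types, relation types, schema dictionary, and validate function below are hypothetical examples, and the sketch uses typing.Literal directly instead of Pydantic; it only illustrates how a strict schema filters out extracted triplets.

```python
from typing import Literal, get_args

# Hypothetical schema: allowed entity types and relation types.
Entities = Literal["PERSON", "COMPANY", "CITY"]
Relations = Literal["WORKS_AT", "LOCATED_IN"]

# Which relations each entity type is allowed to start.
VALIDATION_SCHEMA = {
    "PERSON": ["WORKS_AT"],
    "COMPANY": ["LOCATED_IN"],
    "CITY": [],
}


def validate(triplets):
    """Keep only triplets of the form ((name, type), relation, (name, type))
    that respect the schema."""
    valid = []
    for (subj, subj_type), relation, (obj, obj_type) in triplets:
        if subj_type not in get_args(Entities) or obj_type not in get_args(Entities):
            continue  # unknown entity type
        if relation not in get_args(Relations):
            continue  # unknown relation type
        if relation not in VALIDATION_SCHEMA[subj_type]:
            continue  # relation not allowed for this subject type
        valid.append(((subj, subj_type), relation, (obj, obj_type)))
    return valid


candidates = [
    (("Alice", "PERSON"), "WORKS_AT", ("Acme", "COMPANY")),    # valid
    (("Acme", "COMPANY"), "WORKS_AT", ("Alice", "PERSON")),    # wrong subject type
    (("Alice", "PERSON"), "LIKES", ("Acme", "COMPANY")),       # unknown relation
    (("Acme", "COMPANY"), "LOCATED_IN", ("Paris", "CITY")),    # valid
    (("Acme", "COMPANY"), "LOCATED_IN", ("Mars", "PLANET")),   # unknown entity type
]
print(validate(candidates))
```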

Retrieval and Querying

Labeled property graphs can be queried in several ways to retrieve nodes and paths. In LlamaIndex, you can combine several node retrieval methods at once.
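
Combining retrieval methods amounts to running each sub-retriever and merging the results. The sketch below is library-free: both toy retrievers, the combined_retrieve helper, and the sample triplets are hypothetical and do not reflect how LlamaIndex implements this internally.

```python
def keyword_retriever(query, triplets):
    """Toy retriever: match triplets whose subject or object appears in the query."""
    words = set(query.lower().split())
    return [t for t in triplets if t[0].lower() in words or t[2].lower() in words]


def relation_retriever(query, triplets):
    """Toy retriever: match triplets whose relation is spelled out in the query."""
    return [t for t in triplets if t[1].lower().replace("_", " ") in query.lower()]


def combined_retrieve(query, triplets, sub_retrievers):
    """Run every sub-retriever and merge the results, dropping duplicates."""
    seen, results = set(), []
    for retrieve in sub_retrievers:
        for triplet in retrieve(query, triplets):
            if triplet not in seen:
                seen.add(triplet)
                results.append(triplet)
    return results


TRIPLETS = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Bob", "WORKS_AT", "Globex"),
    ("Acme", "LOCATED_IN", "Paris"),
]
results = combined_retrieve(
    "who works at acme", TRIPLETS, [keyword_retriever, relation_retriever]
)
print(results)
```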

If no sub-retrievers are specified, the default retrievers used are the LLMSynonymRetriever and VectorContextRetriever (if embeddings are enabled).

Currently, the following retrievers are included:

• LLMSynonymRetriever: Retrieves nodes based on keywords and synonyms generated by an LLM.

• VectorContextRetriever: Retrieves nodes based on embedded graph nodes.

• TextToCypherRetriever: Directs the LLM to generate Cypher queries based on the schema of the property graph.

• CypherTemplateRetriever: Utilizes a Cypher template with parameters inferred by the LLM.

• CustomPGRetriever: Easily subclassed to implement custom retrieval logic.

LLMSynonymRetriever

This retriever takes the input query and attempts to generate relevant keywords and synonyms. These are used to retrieve nodes and consequently, the paths connected to those nodes.
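
A rough sketch of the two steps, parsing the generated keywords and then looking up matching paths. It assumes the LLM returns keywords separated by a "^" character, which is an assumption about the prompt format; the helper names and the hard-coded LLM output are hypothetical.

```python
def parse_keywords(llm_output, max_keywords=10):
    """Split '^'-separated LLM output into unique, lower-cased keywords."""
    keywords = []
    for raw in llm_output.split("^"):
        keyword = raw.strip().lower()
        if keyword and keyword not in keywords:
            keywords.append(keyword)
    return keywords[:max_keywords]


def paths_for_keywords(keywords, triplets):
    """Return the paths connected to any node whose name matches a keyword."""
    wanted = set(keywords)
    return [t for t in triplets if t[0].lower() in wanted or t[2].lower() in wanted]


TRIPLETS = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Bob", "WORKS_AT", "Globex"),
]

# Stand-in for what an LLM might return for the query "Who works at ACME?"
llm_output = "Acme^ACME Corp^acme"
keywords = parse_keywords(llm_output)
print(keywords)
print(paths_for_keywords(keywords, TRIPLETS))
```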

Declaring this retriever explicitly lets you customize several of its options.

VectorContextRetriever

This retriever identifies nodes based on their vector similarity, subsequently fetching the paths connected to those nodes.
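
The two steps (rank nodes by similarity, then fetch their paths) can be sketched with plain lists. The two-dimensional vectors, node names, and helper functions below are hypothetical; a real setup would use an embedding model and a vector-capable store.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_paths(query_vec, node_vecs, triplets, top_k=1):
    """Pick the top_k most similar nodes, then return the paths touching them."""
    ranked = sorted(
        node_vecs,
        key=lambda name: cosine(query_vec, node_vecs[name]),
        reverse=True,
    )
    top = set(ranked[:top_k])
    return [t for t in triplets if t[0] in top or t[2] in top]


# Toy 2-dimensional "embeddings" of graph nodes, made up for illustration.
NODE_VECS = {"Alice": [1.0, 0.0], "Acme": [0.0, 1.0], "Paris": [0.7, 0.7]}
TRIPLETS = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Acme", "LOCATED_IN", "Paris"),
]

print(retrieve_paths([0.1, 0.9], NODE_VECS, TRIPLETS, top_k=1))
```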

If your graph store supports vectors natively, that graph store is all you need for storage. If it does not, you need to pair the graph store with a vector store. By default, the in-memory SimpleVectorStore is used.

TextToCypherRetriever

This retriever uses the graph store schema, your query, and a text-to-Cypher prompt template to generate and execute a Cypher query.

Note: Since the SimplePropertyGraphStore is not a full-fledged graph database, it does not support Cypher queries.

To inspect the schema, you can use the method: index.property_graph_store.get_schema_str().
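
The prompt-assembly step can be sketched as follows. The template wording, the schema string, and the fake_llm stand-in are hypothetical; the placeholder names {schema} and {question} are an assumption, and executing the generated query against a graph database is left out.

```python
# Hypothetical prompt template with two placeholders.
TEXT_TO_CYPHER_TEMPLATE = (
    "Schema:\n{schema}\n\n"
    "Question: {question}\n"
    "Write a Cypher query that answers the question.\n"
)


def fake_llm(prompt):
    """Stand-in for an LLM call; always returns the same canned query."""
    return "MATCH (p:PERSON)-[:WORKS_AT]->(c:COMPANY {name: 'Acme'}) RETURN p.name\n"


def text_to_cypher(question, schema, llm):
    """Fill the template with the schema and question, then ask the LLM."""
    prompt = TEXT_TO_CYPHER_TEMPLATE.format(schema=schema, question=question)
    return prompt, llm(prompt).strip()


# Made-up schema string, in the spirit of what a graph store might report.
schema = "(:PERSON)-[:WORKS_AT]->(:COMPANY)"
prompt, cypher = text_to_cypher("Who works at Acme?", schema, fake_llm)
print(prompt)
print(cypher)
```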

CypherTemplateRetriever

This is a more constrained version of the TextToCypherRetriever. Instead of allowing the LLM free rein to generate any Cypher statement, we can provide a Cypher template and have the LLM fill in the blanks.
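
A minimal sketch of the fill-in-the-blanks idea. The Cypher template, the TemplateParams dataclass, and the fake_llm_fill stand-in are hypothetical; the node labels in the query are made up, and a dataclass is used here in place of whatever structured-output model the library expects.

```python
from dataclasses import dataclass
from typing import List

# The query text is fixed; only the $names parameter varies per question.
CYPHER_TEMPLATE = """
MATCH (c:Chunk)-[:MENTIONS]->(o)
WHERE o.name IN $names
RETURN c.text, o.name
"""


@dataclass
class TemplateParams:
    """The only values the LLM is allowed to supply."""
    names: List[str]


def fake_llm_fill(question):
    """Stand-in for an LLM that infers the template parameters."""
    return {"names": ["Acme", "Alice"]}


def build_query(question, fill):
    """Return the fixed template together with validated parameters."""
    raw = fill(question)
    params = TemplateParams(**raw)  # unexpected keys raise TypeError
    if not all(isinstance(name, str) for name in params.names):
        raise ValueError("every name must be a string")
    return CYPHER_TEMPLATE.strip(), {"names": params.names}


query, params = build_query("What do we know about Alice and Acme?", fake_llm_fill)
print(query)
print(params)
```

Because the LLM never writes the query itself, it cannot produce an arbitrary or destructive statement; it can only influence the parameter values.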
