Agent-Powered Retrieval with Haystack
Notebook by Bilge Yücel
In this notebook, you'll build an intelligent movie recommendation assistant powered by Haystack and Qdrant. You'll learn how to combine sparse vector search, metadata filtering (payload), and LLM-based agents to create a system that can understand natural language queries and recommend relevant movies from a curated dataset.
By the end of this notebook, you’ll have a fully working assistant that can answer queries like:
- “Find me a highly-rated action movie about car racing.”
- “Can you recommend five Japanese thrillers?”
- “What can I watch with my kids, about animals?”
This assistant will be implemented as a tool-calling agent with access to a retrieval_tool that retrieves information from the database based on the generated query and filters.
🧱 Setting up the Development Environment
Install required dependencies:
- haystack-ai: For building pipelines and agents
- datasets: For loading the movie dataset from Hugging Face
- qdrant-haystack: For vector database access
- fastembed-haystack: For using the FastEmbed library for sparse embedding generation
Add credentials for OpenAI API and Qdrant Cloud
🎬 Loading Movie Dataset
Load the Pablinho/movies-dataset from Hugging Face
{'Release_Date': '2022-01-28',
 'Title': 'The Ice Age Adventures of Buck Wild',
 'Overview': "The fearless one-eyed weasel Buck teams up with mischievous possum brothers Crash & Eddie as they head off on a new adventure into Buck's home: The Dinosaur World.",
 'Popularity': 1431.307,
 'Vote_Count': '737',
 'Vote_Average': '7.1',
 'Original_Language': 'en',
 'Genre': 'Animation, Comedy, Adventure, Family',
 'Poster_Url': 'https://image.tmdb.org/t/p/original/zzXFM4FKDG7l1ufrAkwQYv2xvnh.jpg'}
Converting to Haystack Documents
Convert all movies into Haystack Document objects. The Document dataclass holds text, tables, or binary data as content, alongside metadata.
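The conversion step can be sketched in plain Python. This is only an illustration, assuming the raw field names from the sample row above and the meta layout visible in the Document output (title, rating, language, genre). The helper name row_to_record is hypothetical; in the notebook, the resulting content and meta would be passed to a Haystack Document.

```python
def row_to_record(row: dict) -> dict:
    """Normalize one raw dataset row into content plus filterable metadata."""
    return {
        "content": row["Overview"],
        "meta": {
            "title": row["Title"],
            # Ratings are stored as strings in the sample row, so cast them
            # to float to allow numeric comparisons such as rating >= 7.
            "rating": float(row["Vote_Average"]),
            "language": row["Original_Language"],
            # Split the comma-separated genre string into lowercase tags.
            "genre": [g.strip().lower() for g in row["Genre"].split(",")],
        },
    }


# Shortened copy of the sample row (overview truncated for brevity).
sample = {
    "Title": "The Ice Age Adventures of Buck Wild",
    "Overview": "The fearless one-eyed weasel Buck teams up with ...",
    "Vote_Average": "7.1",
    "Original_Language": "en",
    "Genre": "Animation, Comedy, Adventure, Family",
}

record = row_to_record(sample)
print(record["meta"])
```

Rows with missing or non-numeric values may exist in the dataset, so a more defensive version would guard the float cast and skip rows it cannot parse.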
Document(id=1f5260fdf65717876901d115502f2d3286cc76ba582a05ca38249b16fb7423d7, content: 'The fearless one-eyed weasel Buck teams up with mischievous possum brothers Crash & Eddie as they he...', meta: {'title': 'The Ice Age Adventures of Buck Wild', 'rating': 7.1, 'language': 'en', 'genre': ['animation', 'comedy', 'adventure', 'family']})
📄 Creating Sparse Embeddings
Use the Qdrant/minicoil-v1 sparse neural embedding model with the FastembedSparseDocumentEmbedder component. We'll use the meta_fields_to_embed parameter to embed the title metadata together with the content.
After initializing FastembedSparseDocumentEmbedder, run the component with documents to create an embedding for each document.
Document(id=1f5260fdf65717876901d115502f2d3286cc76ba582a05ca38249b16fb7423d7, content: 'The fearless one-eyed weasel Buck teams up with mischievous possum brothers Crash & Eddie as they he...', meta: {'title': 'The Ice Age Adventures of Buck Wild', 'rating': 7.1, 'language': 'en', 'genre': ['animation', 'comedy', 'adventure', 'family']}, sparse_embedding: vector with 71 non-zero elements)
☁️ Writing Movies into Qdrant Cloud
Initialize the QdrantDocumentStore to connect to your Qdrant Cloud cluster.
You'll need to provide the following:
- url: The endpoint of your Qdrant Cloud instance
- index: The name of the index where documents will be stored
- api_key: Your Qdrant API key (set this securely via environment variables)
Make sure to set:
- use_sparse_embeddings=True to enable sparse vector support
- payload_fields_to_index to define which metadata fields (like genre, rating, language) are indexed for filtering
This setup ensures your vector store supports both sparse retrieval and metadata filtering, which are essential for advanced search scenarios.
Now, call .write_documents() with the Document list.
🔍 Creating a Sparse Retrieval Pipeline
Build a sparse retrieval pipeline that accepts a user query describing the type of movie they're looking for and retrieves relevant movies from our database.
First, initialize and add a FastembedSparseTextEmbedder with the Qdrant/minicoil-v1 model to our pipeline. This component will convert user queries into sparse vector representations. We then add a QdrantSparseEmbeddingRetriever, which connects to our QdrantDocumentStore and uses those vectors to retrieve the top 5 most relevant movies.
Finally, we connect the two components by passing the output from the embedder into the retriever's input. This creates a functional query pipeline that transforms a natural language input into a sparse embedding and returns matching results from Qdrant.
<haystack.core.pipeline.pipeline.Pipeline object at 0x7d5e78335e80>
🚅 Components
  - sparse_text_embedder: FastembedSparseTextEmbedder
  - sparse_retriever: QdrantSparseEmbeddingRetriever
🛤️ Connections
  - sparse_text_embedder.sparse_embedding -> sparse_retriever.query_sparse_embedding (SparseEmbedding)
Run the pipeline with a query
Calculating sparse embeddings: 100%|██████████| 1/1 [00:00<00:00, 172.81it/s]
See the retrieved movies:
Title: Cannonball Run II
Content: The original characters from the first Cannonball Run movie compete in an illegal race across the country once more in various cars and trucks.
Rating: 5.5
Genres: ['action', 'comedy']
---
Title: Ford v Ferrari
Content: American car designer Carroll Shelby and the British-born driver Ken Miles work together to battle corporate interference, the laws of physics, and their own personal demons to build a revolutionary race car for Ford Motor Company and take on the dominating race cars of Enzo Ferrari at the 24 Hours of Le Mans in France in 1966.
Rating: 8.0
Genres: ['drama', 'action', 'history']
---
Title: Driven
Content: Talented rookie race-car driver Jimmy Bly has started losing his focus and begins to slip in the race rankings. It's no wonder, with the immense pressure being shoveled on him by his overly ambitious promoter brother as well as Bly's romance with his arch rival's girlfriend Sophia. With much riding on Bly, car owner Carl Henry brings former racing star Joe Tanto on board to help Bly. To drive Bly back to the top of the rankings, Tanto must first deal with the emotional scars left over from a tragic racing accident which nearly took his life.
Rating: 5.1
Genres: ['action']
---
Title: New Initial D the Movie - Legend 2: Racer
Content: The planned film trilogy retells the beginning of the story from Shuuichi Shigeno's original car-racing manga. High school student Takumi Fujiwara works as a gas station attendant during the day and a delivery boy for his father's tofu shop during late nights. Little does he know that his precise driving skills and his father's modified Toyota Sprinter AE86 Trueno make him the best amateur road racer on Mt. Akina's highway. Because of this, racing groups from all over the Gunma prefecture issue challenges to Takumi to see if he really has what it takes to be a road legend.
Rating: 8.1
Genres: ['animation', 'drama', 'action']
---
Title: The Art of Racing in the Rain
Content: A family dog - with a near-human soul and a philosopher's mind - evaluates his life through the lessons learned by his human owner, a race-car driver.
Rating: 8.3
Genres: ['comedy', 'drama', 'romance']
---
🗑️ Adding Filters
Let's enrich the user query by combining semantic search with filters.
The text query "A movie about car race" is first converted into a sparse embedding and used to search the movie descriptions. In addition, we apply two filters: the movie must have a rating of at least 7, and its genre must include "action". By combining semantic similarity with structured filtering, we ensure that the results are not only topically relevant but also meet specific quality and genre constraints.
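The two constraints can be written as a nested filter dictionary. The sketch below mirrors the structure that appears in the agent's tool call later in this notebook (an operator plus a list of conditions on meta.* fields); the helper name build_filters is hypothetical and only there to keep the example reusable.

```python
def build_filters(min_rating: float, genre: str) -> dict:
    """Combine a minimum rating and a genre into one AND filter."""
    return {
        "operator": "AND",
        "conditions": [
            # Numeric comparison on the rating stored in the metadata.
            {"field": "meta.rating", "operator": ">=", "value": min_rating},
            # Equality test against the genre field, which holds a list of tags.
            {"field": "meta.genre", "operator": "==", "value": genre},
        ],
    }


filters = build_filters(7, "action")
print(filters)
```

Judging by the results in this notebook, an == condition on the list-valued genre field matches when any element of the list equals the value, which is why movies tagged ['drama', 'action', 'history'] are returned.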
Calculating sparse embeddings: 100%|██████████| 1/1 [00:00<00:00, 236.82it/s]
Title: Ford v Ferrari
Content: American car designer Carroll Shelby and the British-born driver Ken Miles work together to battle corporate interference, the laws of physics, and their own personal demons to build a revolutionary race car for Ford Motor Company and take on the dominating race cars of Enzo Ferrari at the 24 Hours of Le Mans in France in 1966.
Rating: 8.0
Genres: ['drama', 'action', 'history']
---
Title: New Initial D the Movie - Legend 2: Racer
Content: The planned film trilogy retells the beginning of the story from Shuuichi Shigeno's original car-racing manga. High school student Takumi Fujiwara works as a gas station attendant during the day and a delivery boy for his father's tofu shop during late nights. Little does he know that his precise driving skills and his father's modified Toyota Sprinter AE86 Trueno make him the best amateur road racer on Mt. Akina's highway. Because of this, racing groups from all over the Gunma prefecture issue challenges to Takumi to see if he really has what it takes to be a road legend.
Rating: 8.1
Genres: ['animation', 'drama', 'action']
---
Title: Watch Out, We're Mad
Content: After a tied 1st place in a local stunt race, two drivers start a contest to decide who of them will own the prize, a dune buggy. But when a mobster destroys the car, they are determined to get it back.
Rating: 7.5
Genres: ['action', 'comedy']
---
Title: New Initial D the Movie - Legend 1: Awakening
Content: The first movie in a trilogy, focusing on the battle against the Takahashi brothers. High school student Takumi Fujiwara works as a gas station attendant during the day and a delivery boy for his father's tofu shop during late nights. Little does he know that his precise driving skills and his father's modified Toyota Sprinter AE86 Trueno make him the best amateur road racer on Mt. Akina's highway. Because of this, racing groups from all over the Gunma prefecture issue challenges to Takumi to see if he really has what it takes to be a road legend.
Rating: 8.3
Genres: ['animation', 'action']
---
Title: New Initial D the Movie - Legend 3: Dream
Content: Mt. Akina's new downhill racing hero Fujiwara Takumi prepares for the final showdown against Red Sun's unbeaten leader and Akagi's fastest driver, Takahashi Ryosuke.
Rating: 7.4
Genres: ['action', 'animation', 'drama']
---
📐 Filtering without Query
Initialize a metadata-only retriever (FilterRetriever) that allows filtering movies based on fields like rating, language, or genre without semantic matching.
Run FilterRetriever to get all movies rated 9.0 and above:
🤖 Building the Movie Recommendation Agent
In this section, we bring everything together by building a movie recommendation agent that leverages tools, a system prompt, and an LLM.
🛠️ Retrieval Tool
To interact with the data, the agent uses a retrieval tool. Every tool in Haystack requires a name, a description, a function to call, and a parameter schema.
We start by defining a retrieval function that dynamically chooses between semantic search and metadata-based filtering, depending on the input. If the user provides a query, the tool uses the sparse retrieval pipeline. If only filters are provided, it falls back to a metadata filter retriever. This logic gives the agent the flexibility to handle a wide range of user queries—from simple genre filters to more nuanced, semantic prompts.
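The branching described above can be sketched without any Haystack objects. In this illustration the two search backends are passed in as plain callables; retrieve_movies, semantic_search, and filter_search are hypothetical names, and truncating the filter-only results to top_k is an assumption rather than something the notebook states.

```python
from typing import Callable, List, Optional


def retrieve_movies(
    semantic_search: Callable[[str, Optional[dict], int], List],
    filter_search: Callable[[dict], List],
    query: Optional[str] = None,
    metadata_filters: Optional[dict] = None,
    top_k: int = 5,
) -> List:
    """Pick a retrieval strategy based on which inputs are present."""
    if query:
        # A text query is available: run sparse retrieval, optionally filtered.
        return semantic_search(query, metadata_filters, top_k)
    if metadata_filters:
        # No query, only filters: fall back to metadata-only retrieval.
        return filter_search(metadata_filters)[:top_k]
    # Nothing to search with.
    return []


# Stand-in backends so the sketch runs on its own.
def fake_semantic(query, filters, top_k):
    return [f"semantic:{query}"][:top_k]


def fake_filter(filters):
    return ["filtered-1", "filtered-2", "filtered-3"]


with_query = retrieve_movies(fake_semantic, fake_filter, query="car racing")
filters_only = retrieve_movies(
    fake_semantic,
    fake_filter,
    metadata_filters={"field": "meta.rating", "operator": ">=", "value": 9.0},
    top_k=2,
)
print(with_query, filters_only)
```

Injecting the backends as arguments keeps the decision logic easy to test in isolation; in the notebook, the two callables would wrap the sparse retrieval pipeline and the FilterRetriever.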
Define the JSON schema for your tool parameters, such as query, top_k, and metadata_filters, to ensure tool calls are properly structured and validated.
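One possible shape for such a schema is shown below. This is an assumed schema written for illustration, not the one used in the original notebook: the descriptions, the default for top_k, and the nested structure of metadata_filters are all guesses based on the parameter names mentioned above and the filter format seen in the tool call output.

```python
import json

# Hypothetical JSON schema for the retrieval tool's parameters.
parameters = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "Free-text description of the movie to look for.",
        },
        "top_k": {
            "type": "integer",
            "description": "Maximum number of movies to return.",
            "default": 5,
        },
        "metadata_filters": {
            "type": "object",
            "description": "Structured filters on rating, genre, or language.",
            "properties": {
                "operator": {"type": "string", "enum": ["AND", "OR"]},
                "conditions": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "field": {"type": "string"},
                            "operator": {"type": "string"},
                            "value": {},
                        },
                        "required": ["field", "operator", "value"],
                    },
                },
            },
        },
    },
    # Every parameter is optional so the agent can send a query, filters, or both.
    "required": [],
}

# Round-trip through JSON to confirm the schema is serializable.
schema_json = json.dumps(parameters, indent=2)
print(schema_json)
```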
In most cases, you don’t need to define parameters manually. When you use the @tool decorator or create_tool_from_function, Haystack automatically infers the tool’s name, description, and parameters from the function and generates a matching JSON schema for you.
Format the output of retrieved movies into readable text, including the title, content, rating, genre, and language. Then, wrap the retrieval function as a Haystack Tool, specifying how it should be invoked by the agent and how outputs should be displayed.
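A formatter of this kind might look like the sketch below. It works on plain dictionaries with content and meta keys instead of Haystack Document objects, and the function name format_movies is hypothetical. The layout follows the tool result shown in the streaming output of this notebook.

```python
from typing import Dict, List


def format_movies(movies: List[Dict]) -> str:
    """Render retrieved movies as readable text blocks separated by '---'."""
    blocks = []
    for movie in movies:
        meta = movie["meta"]
        blocks.append(
            f"Movie details for {meta['title']}:\n"
            f"{movie['content']}\n"
            f"Rating:{meta['rating']}\n"
            f"Genres:{meta['genre']}\n"
            f"Language:{meta['language']}\n"
            "---"
        )
    return "\n".join(blocks)


# Example record with a shortened description.
movies = [
    {
        "content": "Carroll Shelby and Ken Miles build a race car for Ford.",
        "meta": {
            "title": "Ford v Ferrari",
            "rating": 8.0,
            "genre": ["drama", "action", "history"],
            "language": "en",
        },
    }
]

formatted = format_movies(movies)
print(formatted)
```

Returning one string rather than a list of objects keeps the tool result compact and easy for the LLM to read when it composes the final answer.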
🧠 Agent
First, you need to define a system prompt for the agent. This prompt tells the agent how to interpret user queries, generate tool inputs, and handle fallback strategies.
Initialize the Haystack Agent with the retrieval tool and the system prompt. This Agent can understand user queries and respond with movie recommendations.
Send a sample query to the Agent. The Agent interprets the intent, invokes the retrieval tool, and returns suitable movie recommendations.
Calculating sparse embeddings: 100%|██████████| 1/1 [00:00<00:00, 198.07it/s]
Print messages["last_message"].text to see the Agent's final response
Here are some highly-rated action movies about car racing:

1. **Ford v Ferrari**
   - **Rating:** 8.0
   - **Genres:** Drama, Action, History
   - **Description:** American car designer Carroll Shelby and the British-born driver Ken Miles work together to build a revolutionary race car for Ford Motor Company and take on the dominating race cars of Enzo Ferrari at the 24 Hours of Le Mans in France in 1966.
   - **Language:** English

2. **New Initial D the Movie - Legend 2: Racer**
   - **Rating:** 8.1
   - **Genres:** Animation, Drama, Action
   - **Description:** High school student Takumi Fujiwara discovers his precise driving skills while working as a gas station attendant and becomes the best amateur road racer on Mt. Akina's highway.
   - **Language:** Japanese

3. **Watch Out, We're Mad**
   - **Rating:** 7.5
   - **Genres:** Action, Comedy
   - **Description:** After a tied 1st place in a local stunt race, two drivers start a contest to decide who will own a dune buggy. When a mobster destroys the car, they are determined to get it back.
   - **Language:** Italian

4. **New Initial D the Movie - Legend 1: Awakening**
   - **Rating:** 8.3
   - **Genres:** Animation, Action
   - **Description:** Focusing on the battle against the Takahashi brothers, this movie follows Takumi Fujiwara as he rises as the best amateur road racer on Mt. Akina's highway.
   - **Language:** Japanese

5. **Italian Spiderman**
   - **Rating:** 7.1
   - **Genres:** Comedy, Action
   - **Description:** A parody of Italian action–adventure films with an Italian take on the comic book superhero Spider-Man.
   - **Language:** Italian

Let me know if you need more information or recommendations!
🔉 Enable Streaming
Enable streaming output by passing print_streaming_chunk as the streaming_callback. This shows real-time updates, including all tool invocations, as the agent processes the query.
[TOOL CALL]
Tool: retrieval_tool
Arguments: {"query":"highly-rated action movie about car racing","metadata_filters":{"operator":"AND","conditions":[{"field":"meta.genre","operator":"==","value":"action"},{"field":"meta.rating","operator":">=","value":7}]}}
Calculating sparse embeddings: 100%|██████████| 1/1 [00:00<00:00, 190.17it/s]
[TOOL RESULT]
Movie details for New Initial D the Movie - Legend 2: Racer:
The planned film trilogy retells the beginning of the story from Shuuichi Shigeno's original car-racing manga. High school student Takumi Fujiwara works as a gas station attendant during the day and a delivery boy for his father's tofu shop during late nights. Little does he know that his precise driving skills and his father's modified Toyota Sprinter AE86 Trueno make him the best amateur road racer on Mt. Akina's highway. Because of this, racing groups from all over the Gunma prefecture issue challenges to Takumi to see if he really has what it takes to be a road legend.
Rating:8.1
Genres:['animation', 'drama', 'action']
Language:ja
---
Movie details for New Initial D the Movie - Legend 1: Awakening:
The first movie in a trilogy, focusing on the battle against the Takahashi brothers. High school student Takumi Fujiwara works as a gas station attendant during the day and a delivery boy for his father's tofu shop during late nights. Little does he know that his precise driving skills and his father's modified Toyota Sprinter AE86 Trueno make him the best amateur road racer on Mt. Akina's highway. Because of this, racing groups from all over the Gunma prefecture issue challenges to Takumi to see if he really has what it takes to be a road legend.
Rating:8.3
Genres:['animation', 'action']
Language:ja
---
Movie details for Ford v Ferrari:
American car designer Carroll Shelby and the British-born driver Ken Miles work together to battle corporate interference, the laws of physics, and their own personal demons to build a revolutionary race car for Ford Motor Company and take on the dominating race cars of Enzo Ferrari at the 24 Hours of Le Mans in France in 1966.
Rating:8.0
Genres:['drama', 'action', 'history']
Language:en
---
Movie details for Watch Out, We're Mad:
After a tied 1st place in a local stunt race, two drivers start a contest to decide who of them will own the prize, a dune buggy. But when a mobster destroys the car, they are determined to get it back.
Rating:7.5
Genres:['action', 'comedy']
Language:it
---
Movie details for Italian Spiderman:
This is an Australian-made parody of Italian action–adventure films of the 60s and 70s. and foreign movies that misappropriated popular American superheroes such as the Japanese TV series “Spider-Man”. (It should be noted that the Japanese Spider-Man was officially sanctioned by Marvel and is considered canon in the Marvel universe) This is an Italian take on the comic book superhero Spider-Man.
Rating:7.1
Genres:['comedy', 'action']
Language:it
---

[ASSISTANT]
Here are some highly-rated action movies related to car racing:

1. **New Initial D the Movie - Legend 1: Awakening**
   - **Rating:** 8.3
   - **Genres:** Animation, Action
   - **Language:** Japanese
   - **Description:** The first movie in a trilogy, focusing on the battle against the Takahashi brothers. High school student Takumi Fujiwara discovers his precise driving skills make him the best amateur road racer on Mt. Akina's highway.

2. **New Initial D the Movie - Legend 2: Racer**
   - **Rating:** 8.1
   - **Genres:** Animation, Drama, Action
   - **Language:** Japanese
   - **Description:** Continuing from the first film, Takumi Fujiwara's journey in car racing challenges from groups all over the Gunma prefecture unfolds.

3. **Ford v Ferrari**
   - **Rating:** 8.0
   - **Genres:** Drama, Action, History
   - **Language:** English
   - **Description:** Carroll Shelby and driver Ken Miles work to build a revolutionary race car for Ford to compete against Ferrari's dominating race cars at the 1966 24 Hours of Le Mans.

4. **Watch Out, We're Mad**
   - **Rating:** 7.5
   - **Genres:** Action, Comedy
   - **Language:** Italian
   - **Description:** After a tie in a local stunt race, two drivers contest for ownership of a dune buggy, but when a mobster destroys the car, they are determined to retrieve it.

5. **Italian Spiderman**
   - **Rating:** 7.1
   - **Genres:** Comedy, Action
   - **Language:** Italian
   - **Description:** A parody of Italian action/adventure films, this takes a humorous spin on superheroes and foreign adaptations.

Let me know if you need more information or further assistance!