Swarm
๐๐๐ Create a Swarm of Agents
As of Haystack 2.9.0, experimental dataclasses (refactored ChatMessage and ChatRole, ToolCall and Tool) and components (refactored OpenAIChatGenerator, ToolInvoker) are removed from the
haystack-experimentaland merged into Haystack core.
OpenAI recently released Swarm: an educational framework that proposes lightweight techniques for creating and orchestrating multi-agent systems.
In this notebook, we'll explore the core concepts of Swarm (Routines and Handoffs) and implement them using Haystack and its tool support.
This exploration is not only educational: we will unlock features missing in the original implementation, like the ability of using models from various providers. In fact, our final example will include 3 agents: one powered by gpt-4o-mini (OpenAI), one using Claude 3.5 Sonnet (Anthropic) and a third running Llama-3.2-3B locally via Ollama.
Setup
We install the required dependencies. In addition to Haystack, we also need integrations with Anthropic and Ollama.
Next, we configure our API keys for OpenAI and Anthropic.
Starting simple: building an Assistant
The first step toward building an Agent is creating an Assistant: think of it as a Chat Language Model + a system prompt.
We can implement this as a lightweight dataclass with three parameters:
- name
- LLM (Haystack Chat Generator)
- instructions (they will constitute the system message)
Let's create a Joker assistant, tasked with telling jokes.
Type 'quit' to exit User: hey! Joker: Hey there! How's it going? Are you ready for some laughs, or are we saving the jokes for dessert? ๐ฐ User: where is Rome? Joker: Rome is in Italy, but if youโre asking me for directions, I might just say, โTake a left at the Colosseum and keep going until you smell pizza!โ ๐ User: you can do better. What about Paris? Joker: Ah, Paris! Thatโs in France, where the Eiffel Tower stands tall, the croissants are buttery, and locals will tell you the secret to love is just a little bit of patience... and a great view! ๐ฅโค๏ธ Why did the Eiffel Tower get in trouble? It couldnโt stop โtoweringโ over everyone! User: quit
Let's say it tried to do its best ๐
Tools and Routines
The term Agent has a broad definition.
However, to qualify as an Agent, a software application built on a Language Model should go beyond simply generating text; it must also be capable of performing actions, such as executing functions.
A popular way of doing this is Tool calling:
- We provide a set of tools (functions, APIs with a given spec) to the model.
- The model prepares function calls based on user request and available tools.
- Actual invocation is executed outside the model (at the Agent level).
- The model can further elaborate on the result of the invocation.
For information on tool support in Haystack, check out the documentation.
Swarm introduces routines, which are natural-language instructions paired with the tools needed to execute them. Below, weโll build an agent capable of calling tools and executing routines.
Implementation
-
instructionscould already be passed to the Assistant, to guide its behavior. -
The Agent introduces a new init parameter called
functions. These functions are automatically converted into Tools. Key difference: to be passed to a Language Model, a Tool must have a name, a description and JSON schema specifying its parameters. -
During initialization, we also create a
ToolInvoker. This Haystack component takes in Chat Messages containing preparedtool_calls, performs the tool invocation and wraps the results in Chat Message withtoolrole. -
What happens during
run? The agent first generates a response. If the response includes tool calls, these are executed, and the results are integrated into the conversation. -
The
whileloop manages user interactions:- If the last message role is
assistant, it waits for user input. - If the last message role is
tool, it continues running to handle tool execution and its responses.
- If the last message role is
Note: This implementation differs from the original approach by making the Agent responsible for invoking tools directly, instead of delegating control to the while loop.
Here's an example of a Refund Agent using this setup.
Type 'quit' to exit User: hey Refund Agent: Hello! How can I assist you today? If you need help with a refund, please let me know the details. User: my phone does not work Refund Agent: I'm sorry to hear that your phone is not working. To assist you with the refund, could you please provide the following information: 1. The name of the phone (brand and model). 2. The reason for the refund (e.g., defective, not as described, etc.). Once I have that information, I'll guide you through the next steps. User: Nokia 3310; it does not work Refund Agent: Thank you for the information. To proceed with the refund for the Nokia 3310, I'll need a few more details: 1. Can you please provide your full name? 2. Your email address and phone number (for communication purposes). 3. Your bank account details for the refund (account number, bank name, and any other relevant details). Once I have this information, I can execute the refund for you. User: John Doe; johndoe@mymail.com; bank account number: 0123456 Refund Agent: Thank you, John Doe. I still need the following information to complete the refund process: 1. The name of your bank. 2. Any additional details required for the bank refund (like the account type or routing number, if applicable). Once I have this information, I can execute the refund for your Nokia 3310. User: Bank of Mouseton Refund Agent: The refund process has been successfully completed! Here are the details: - **Item:** Nokia 3310 - **Refund ID:** 3753 - **Bank:** Bank of Mouseton - **Refund ID:** 1220 If you have any more questions or need further assistance, feel free to ask! User: quit
Promising!
Handoff: switching control between Agents
The most interesting idea of Swarm is probably handoffs: enabling one Agent to transfer control to another with Tool calling.
How it works
- Add specific handoff functions to the Agent's available tools, allowing it to transfer control when needed.
- Modify the Agent to return the name of the next agent along with its messages.
- Handle the switch in
whileloop.
The implementation is similar to the previous one, but, compared to ToolCallingAgent, a SwarmAgent also returns the name of the next agent to be called, enabling handoffs.
Let's see this in action with a Joker Agent and a Refund Agent!
Type 'quit' to exit User: i need a refund for my Iphone Refund Agent: I can help you with that! Please provide the name of the item you'd like to refund. User: Iphone 15 Refund Agent: Your refund for the iPhone 15 has been successfully processed. The refund ID is 9090. If you need any further assistance, feel free to ask! User: great. can you give some info about escargots? Joker Agent: Absolutely! Did you know that escargots are just snails trying to get a head start on their travels? They may be slow, but they sure do pack a punch when it comes to flavor! Escargots are a French delicacy, often prepared with garlic, parsley, and butter. Just remember, if you see your escargot moving, it's probably just checking if the coast is clear before dinner! ๐๐ฅ If you have any other questions about escargots or need a good recipe, feel free to ask! User: quit
Nice โจ
A more complex multi-agent system
Now, we move on to a more intricate multi-agent system that simulates a customer service setup for ACME Corporation, a fictional entity from the Road Runner/Wile E. Coyote cartoons, which sells quirky products meant to catch roadrunners. (We are reimplementing the example from the original article by OpenAI.)
This system involves several different agents (each with specific tools):
- Triage Agent: handles general questions and directs to other agents. Tools:
transfer_to_sales_agent,transfer_to_issues_and_repairsandescalate_to_human. - Sales Agent: proposes and sells products to the user, it can execute the order or redirect the user back to the Triage Agent. Tools:
execute_orderandtransfer_back_to_triage. - Issues and Repairs Agent: supports customers with their problems, it can look up item IDs, execute refund or redirect the user back to triage. Tools:
look_up_item,execute_refund, andtransfer_back_to_triage.
A nice bonus feature of our implementation is that we can use different model providers supported by Haystack. In this case, the Triage Agent is powered by (OpenAI) gpt-4o-mini, while we use (Anthropic) Claude 3.5 Sonnet for the other two agents.
Type 'quit' to exit User: hey! Triage Agent: Hello! I'm the customer service bot for ACME Inc. How can I assist you today? User: i need a product to catch roadrunners Triage Agent: I can transfer you to a sales agent who can help you find suitable products for catching roadrunners. One moment please! Sales Agent: Hello there! I hear you're having some roadrunner troubles. Tell me, what specific challenges are you facing with these speedy birds? User: they are damn fast! Sales Agent: Ah, those pesky roadrunners and their lightning speed! Have you ever considered our ACME Rocket-Powered Roller Skates? They'll have you zipping after those birds in no time! User: tell me more Sales Agent: Well, our Rocket-Powered Roller Skates come with turbo boosters and autopilot. They're guaranteed to match any roadrunner's speed. Interested in giving them a spin? User: yes Sales Agent: Fantastic! You're in luck because we have a special offer right now. These state-of-the-art Rocket-Powered Roller Skates can be yours for just $9,999! Shall we proceed with your order? User: yes Sales Agent: Excellent! I'll process that order right away. Oh, just one tiny detail - the skates only work while you're holding an anvil. Ready to complete the purchase? === Order Summary === Product: Rocket-Powered Roller Skates Price: $9999 ================= Confirm order? y/n: y Order execution successful! Sales Agent: Great! Your order is confirmed. Happy roadrunner chasing! User: now I also need to solve a problem with my old ACME Superphone 7500 Sales Agent: I understand you're having issues with our ACME products. Let me transfer you to our customer support team for assistance with your Superphone 7500. Triage Agent: Hello! I see you're experiencing an issue with your ACME Superphone 7500. Could you please describe the problem you're facing? User: it does not call my mum Triage Agent: Let's get that sorted out! I'll transfer you to our Issues and Repairs Agent so they can assist you with the Superphone 7500. One moment please! Issues and Repairs Agent: Hello! I understand your ACME Superphone 7500 isn't calling your mum. Can you tell me if it's not making any calls at all, or just to your mum's number? User: just my mum Issues and Repairs Agent: I see. Is your mum's number correctly saved in your contacts? User: ofc Issues and Repairs Agent: Understood. Have you tried turning the phone off and on again? User: ofc Issues and Repairs Agent: I apologize for the inconvenience. Let's try a quick fix. Can you clear your mum's contact and re-add it? User: done but does not work. I'm getting impatient Issues and Repairs Agent: I apologize for the frustration. Let's try one last solution. Can you update your phone's software to the latest version? User: hey gimme a refund Issues and Repairs Agent: I understand your frustration. Since the previous solutions didn't work, I'll proceed with processing a refund for you. First, I need to look up the item ID for your ACME Superphone 7500. Issues and Repairs Agent: Thank you for your patience. I've found the item ID. Now, I'll execute the refund for you. === Refund Summary === Item ID: item_132612938 Reason: Product not functioning as expected ================= Refund execution successful! Issues and Repairs Agent: Your refund has been successfully processed. User: quit
๐ฆ Put Llama 3.2 in the mix
As demonstrated, our implementation is model-provider agnostic, meaning it can work with both proprietary models and open models running locally.
In practice, you can have Agents that handle complex tasks using powerful proprietary models, and other Agents that perform simpler tasks using smaller open models.
In our example, we will use Llama-3.2-3B-Instruct, a small model with impressive instruction following capabilities (high IFEval score). We'll use Ollama to host and serve this model.
Install and run Ollama
In general, the installation of Ollama is very simple. In this case, we will do some tricks to make it run on Colab.
If you have/enable GPU support, the model will run faster. It can also run well on CPU.
NAME ID SIZE MODIFIED llama3.2:3b a80c4f17acd5 2.0 GB 18 seconds ago
Action!
At this point, we can easily swap the Triage Agent's llm with the Llama 3.2 model running on Ollama.
We set a temperature < 1 to ensure that generated text is more controlled and not too creative.
โ ๏ธ Keep in mind that the model is small and that Ollama support for tools is not fully refined yet. As a result, the model may be biased towards generating tool calls (even when not needed) and sometimes may hallucinate tools.
Type 'quit' to exit User: hey I need something to catch rats! Sales Agent: Rats? I bet those pesky roadrunners are giving you trouble! Have you tried our Acme Turbo-Charged Roadrunner Trap? User: no I need to get rid of rats Sales Agent: Ah, my mistake! Rats can be quite the nuisance. Have you considered our Acme Supersonic Rat Attractor? User: how it works? Sales Agent: It emits an irresistible high-frequency sound that lures rats into a comfy cage. Completely humane and effortless! User: is it violent? Sales Agent: Not at all! Our Acme Supersonic Rat Attractor is 100% non-violent. The rats simply waltz right in, easy as pie! User: ok great Sales Agent: Fantastic! You'll be rat-free in no time. Now, this cutting-edge technology can be yours for just $9,999! User: litte discount Sales Agent: Well, since you asked nicely, how about a special price of $8,999? It's a steal for such rodent-repelling brilliance! User: yes!!! Sales Agent: Great! Let's get that order processed for you right away! === Order Summary === Product: Acme Supersonic Rat Attractor Price: $8999 ================= Confirm order? y/n: y Order execution successful! Sales Agent: Wonderful! Your order is confirmed. Just one tiny detail - it only works on Tuesdays between 2-4 AM! User: great. now I need help to repair an old ACME product Sales Agent: I see you're an ACME enthusiast! What seems to be malfunctioning with your trusty ACME gadget? User: it is and old ACME SuperKeyboard 5000. does not work on mondays Sales Agent: Ah, the classic ACME SuperKeyboard 5000! Have you tried our new ACME Temporal Typing Tonic? User: i need help with repair! Sales Agent: My apologies, but I'm not equipped to handle repair inquiries. Let me transfer you to someone who can assist better. Triage Agent: Old SuperKeyboard 5000, eh? Don't worry, we've got you covered! Can you please provide me with the error message or symptoms you're experiencing with your keyboard? And just to confirm, it's a Monday, is it not supposed to work today? User: does not work on mondays Issues and Repairs Agent: I see, your ACME SuperKeyboard 5000 isn't working on Mondays. That's quite unusual! Let's dig a bit deeper. Have you noticed any other quirks or issues on other days of the week? User: no Issues and Repairs Agent: I understand. Could you tell me if there are any visible signs of damage or unusual behavior when you try to use it on Mondays? User: vowels are not printed Issues and Repairs Agent: Interesting! So on Mondays, the vowels aren't printing. Have you tried our patented ACME Vowel Vitalizer spray on the keyboard? User: yes Issues and Repairs Agent: I see. Since our standard solution didn't work, let's try something else. Have you checked if the keyboard's "Monday Mode" switch is accidentally activated? User: aah. what it is? are you making it up? Issues and Repairs Agent: I apologize for the confusion. You're right to question that - there's no such thing as a "Monday Mode" switch. I shouldn't have suggested a made-up feature. Let's get back to the real issue. Since the ACME Vowel Vitalizer didn't work, would you like me to process a refund for your SuperKeyboard 5000? User: yes pleas Issues and Repairs Agent: Certainly, I'll process that refund for you right away. First, let me look up your item ID. Issues and Repairs Agent: Great, I've found your item ID. Now, I'll execute the refund. === Refund Summary === Item ID: item_132612938 Reason: Product malfunction - vowels not printing on Mondays ================= Refund execution successful! Issues and Repairs Agent: Your refund has been successfully processed. User: quit
In conclusion, we have built a multi-agent system using Swarm concepts and Haystack tools, demonstrating how to integrate models from different providers, including a local model running on Ollama.
Swarm ideas are pretty simple and useful for several use cases and the abstractions provided by Haystack make it easy to implement them. However, this architecture may not be the best fit for all use cases: memory is handled as a list of messages; this system only runs one Agent at a time.
Looking ahead, we plan to develop and showcase more advanced Agents with Haystack. Stay tuned! ๐ป
Notebooks on Tool support
(Notebook by Stefano Fiorucci)