Making Specialized AI Agents Work Together with Swarm

Thibault Barrat, March 06, 2025

#ai #python

AI agents using general purpose Large Language Models (LLMs) often struggle to achieve high-quality outputs on complex tasks. Developers have found that they can get better results by using multiple specialized agents instead of a single general agent. For example, one agent can handle user interactions using a text-only model, while another leverages a vision-language model for image classification.

However, orchestrating the interactions between these agents can be challenging. To address this issue, OpenAI has developed an educational framework called Swarm, which makes it easier to coordinate multiple agents.

To demonstrate how agent orchestration works, I will use Swarm to build a simple bike rental shop assistant.

Defining The Agents

For the purpose of this example, I define three agents:

A Welcome Agent that handles general inquiries from customers, such as pricing, bike models, and availability.
A Reservation Agent that manages reservations, including checking status, modifying bookings, and processing cancellations.
A User Feedback Agent that gathers customer feedback, including ratings and comments.

In Swarm, agents are primarily defined by a name and a set of instructions outlining their role. Below is how I declared them:

from swarm import Agent

welcome_agent = Agent(
    name="Welcome Agent",
    instructions="Provide information about prices, bike models, and available bikes.",
)
reservation_agent = Agent(
    name="Reservation Agent",
    instructions="Handle reservation status, modification, and cancellation requests. When a user wants to cancel a reservation, set the reservation status to 'cancelled'.",
)
user_feedback_agent = Agent(
    name='User Feedback Agent',
    instructions="Handle user feedback. Ask users if they want to provide a rating from 1 to 5 and/or additional feedback.",
)

Switching Between Agents

To route user requests to the appropriate agent, I introduce a new agent called Triage Agent. It is responsible for analyzing the user's input and directing the conversation to the most suitable specialized agent:

triage_agent = Agent(
    name="Triage Agent",
    instructions="""You are to triage a user's request, and call a tool to transfer to the right intent.
    Once you are ready to transfer to the right intent, call the tool to transfer to the right intent.
    You don't need to know specifics, just the topic of the request.
    If the user request is about giving feedback, transfer to the User Feedback Agent.
    If the user request is about making, updating or cancelling a reservation, transfer to the Reservation Agent.
    If the user request is about getting information about bikes, prices, or available bikes, transfer to the Welcome Agent.
    When you need more information to triage the request to an agent, ask a direct question without explaining why you're asking it.
    Do not share your thought process with the user! Do not make unreasonable assumptions on behalf of the user.""",
)

In practice, the Triage Agent invokes a function to delegate the conversation to the appropriate specialized agent. Each agent has a set of callable functions, which can return either a string or another agent. If an agent is returned, the conversation is transferred to it. Here’s how I implement this:

def transfer_to_welcome():
    return welcome_agent

def transfer_to_user_feedback():
    return user_feedback_agent

def transfer_to_reservation():
    return reservation_agent

triage_agent.functions = [transfer_to_welcome, transfer_to_user_feedback, transfer_to_reservation]
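To make the handoff rule concrete, here is a minimal sketch that does not use Swarm or an LLM at all. SketchAgent, call_tool, step, and check_opening_hours are hypothetical names I made up for illustration. The only thing the sketch reproduces is the rule described above: a function that returns an agent switches the conversation, while a function that returns a string does not.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class SketchAgent:
    """Hypothetical stand-in for an agent: a name plus the functions it may call."""
    name: str
    functions: List[Callable] = field(default_factory=list)


def call_tool(active: SketchAgent, tool_name: str) -> Union[SketchAgent, str]:
    """Look up a function by name on the active agent and call it."""
    for fn in active.functions:
        if fn.__name__ == tool_name:
            return fn()
    raise ValueError(f"{active.name} has no function named {tool_name!r}")


def step(active: SketchAgent, tool_name: str) -> SketchAgent:
    """Apply the handoff rule: an agent result becomes the new active agent."""
    result = call_tool(active, tool_name)
    if isinstance(result, SketchAgent):
        return result
    return active  # a string result leaves the conversation where it is


reservation = SketchAgent("Reservation Agent")
triage = SketchAgent("Triage Agent")


def transfer_to_reservation():
    return reservation


def check_opening_hours():
    # Hypothetical non-transfer function, used only to show the string case.
    return "Open from 9am to 6pm"


triage.functions = [transfer_to_reservation, check_opening_hours]

print(step(triage, "check_opening_hours").name)      # Triage Agent
print(step(triage, "transfer_to_reservation").name)  # Reservation Agent
```

In the real framework the model chooses which function to call; here the function name is passed in by hand so the switching logic can be read in isolation.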

When a conversation is being managed by a specialized agent instead of the Triage Agent, and the user asks something outside the agent's scope, the conversation should be redirected back to the Triage Agent. This allows it to determine the appropriate agent for the request. This is achieved by calling the transfer_back_to_triage function:

def transfer_back_to_triage():
    """Call this function if a user is asking about a topic that is not handled by the current agent."""
    return triage_agent

welcome_agent.functions = [transfer_back_to_triage]
reservation_agent.functions = [transfer_back_to_triage]
user_feedback_agent.functions = [transfer_back_to_triage]

This results in the following interaction flow between agents:
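As a rough text rendering of that flow, the sketch below lists every possible handoff as a source agent, the function that triggers it, and the target agent. The two dictionaries simply restate the function assignments from this post. handoff_edges is a hypothetical helper, not part of Swarm.

```python
# Which transfer functions are attached to each agent in this post.
agent_functions = {
    "Triage Agent": [
        "transfer_to_welcome",
        "transfer_to_user_feedback",
        "transfer_to_reservation",
    ],
    "Welcome Agent": ["transfer_back_to_triage"],
    "Reservation Agent": ["transfer_back_to_triage"],
    "User Feedback Agent": ["transfer_back_to_triage"],
}

# Which agent each transfer function returns.
transfer_targets = {
    "transfer_to_welcome": "Welcome Agent",
    "transfer_to_user_feedback": "User Feedback Agent",
    "transfer_to_reservation": "Reservation Agent",
    "transfer_back_to_triage": "Triage Agent",
}


def handoff_edges(functions_by_agent, targets):
    """Return (source, function, target) triples for every possible handoff."""
    edges = []
    for source, function_names in functions_by_agent.items():
        for function_name in function_names:
            edges.append((source, function_name, targets[function_name]))
    return edges


edges = handoff_edges(agent_functions, transfer_targets)
for source, function_name, target in edges:
    print(f"{source} --{function_name}()--> {target}")
```

Every path between two specialized agents goes through the Triage Agent, which keeps the routing logic in a single place.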

Running The Conversation

Swarm provides a helper function called run_demo_loop that runs the conversation as an interactive loop in the terminal. It starts the interaction with the specified agent and pretty-prints the messages, indicating which agent is handling the conversation and which function is being invoked.

In this example, I initiate it with the Triage Agent:

from swarm.repl import run_demo_loop

run_demo_loop(triage_agent)

Below is a sample output generated when running the conversation:

Adding Capabilities to Agents

So far, I have used functions only to switch between agents, but they can also interact with external services, databases, or APIs.

Let's add a function to the Reservation Agent that connects to an SQLite database to update a specific reservation:

def update_reservation(reservation_id, bike_id=None, from_date=None, to_date=None, status=None):
    """Update a reservation in the database."""
    candidate_values = {
        "bike_id": bike_id,
        "from_date": from_date,
        "to_date": to_date,
        "status": status,
    }
    # Keep only the fields that were provided (falsy values such as None are skipped).
    update_values = {key: value for key, value in candidate_values.items() if value}
    if not update_values:
        # Without this guard the SET clause would be empty and the SQL invalid.
        return "No changes provided, reservation not updated."
    conn = database.get_connection()
    cursor = conn.cursor()
    # Column names come from the fixed dict above; values are bound as parameters.
    update_values_keys = ", ".join(f"{key} = ?" for key in update_values)
    cursor.execute(
        f"""
        UPDATE reservations
        SET {update_values_keys}
        WHERE id = ?
    """,
        (*update_values.values(), reservation_id),
    )
    conn.commit()
    return "Reservation updated!"

reservation_agent.functions = [transfer_back_to_triage, update_reservation]
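To see the partial-update pattern in isolation, here is a small sketch that runs against an in-memory SQLite database. The table schema, the sample row, and the build_update helper are assumptions made for this illustration. The actual project gets its connection from its own database module, which is not shown here.

```python
import sqlite3


def build_update(fields):
    """Return (sql, params) for a partial update, skipping fields left as None."""
    provided = {key: value for key, value in fields.items() if value is not None}
    if not provided:
        raise ValueError("nothing to update")
    # Column names come from our own code, never from user input;
    # only the values are passed as bound parameters.
    set_clause = ", ".join(f"{key} = ?" for key in provided)
    sql = f"UPDATE reservations SET {set_clause} WHERE id = ?"
    return sql, list(provided.values())


# Assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservations ("
    "id INTEGER PRIMARY KEY, bike_id INTEGER, from_date TEXT, to_date TEXT, status TEXT)"
)
conn.execute(
    "INSERT INTO reservations VALUES (1234, 7, '2025-08-15', '2025-08-17', 'confirmed')"
)

# Cancel reservation 1234 without touching the other columns.
sql, params = build_update({"bike_id": None, "status": "cancelled"})
conn.execute(sql, (*params, 1234))
conn.commit()

row = conn.execute(
    "SELECT bike_id, status FROM reservations WHERE id = 1234"
).fetchone()
print(row)  # (7, 'cancelled')
```

Only the status column appears in the generated statement, so the bike and dates of the reservation are left unchanged.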

When a function is added to an agent, Swarm automatically uses its docstring as a description. This helps the agent understand the function's purpose and how to use it. Additionally, any parameters without default values are treated as required, prompting the agent to request them from the user if they are missing.
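The sketch below shows the kind of introspection that makes this possible, using only Python's standard inspect module. It is a simplified illustration and not Swarm's actual code. describe_function and the dictionary layout it returns are hypothetical.

```python
import inspect


def describe_function(func):
    """Build a simple tool description from a function's docstring and signature."""
    signature = inspect.signature(func)
    # A parameter is required when it has no default value.
    required = [
        name
        for name, param in signature.parameters.items()
        if param.default is inspect.Parameter.empty
    ]
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": list(signature.parameters),
        "required": required,
    }


def update_reservation(reservation_id, bike_id=None, from_date=None, to_date=None, status=None):
    """Update a reservation in the database."""
    return "Reservation updated!"


description = describe_function(update_reservation)
print(description["description"])  # Update a reservation in the database.
print(description["required"])     # ['reservation_id']
```

This is why a clear docstring and carefully chosen default values matter: they are part of what the model sees when deciding whether and how to call a function.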

In the following example, the Reservation Agent understands that it needs to call the update_reservation function to process the user request, but since the reservation ID is missing, it asks the user for it:

Evaluating The Swarm

Testing the agents and their interactions is crucial to ensure they work as expected. To achieve this, I use a utility function from the Swarm repository examples called run_function_evals. This function takes the agent to evaluate, the file containing the test cases, the number of runs, and optionally, a file path to save the results.

A test case consists of a conversation and the expected function to be invoked. Here are some test cases defined for the Triage Agent:

[
    {
        "conversation": [
            {
                "role": "user",
                "content": "I would like to give my opinion after renting a bike"
            }
        ],
        "function": "transfer_to_user_feedback"
    },
    {
        "conversation": [
            {
                "role": "user",
                "content": "Do you have available bikes on the 15th of August?"
            }
        ],
        "function": "transfer_to_welcome"
    },
    {
        "conversation": [
            {
                "role": "user",
                "content": "I will book the city bike for 2 days"
            }
        ],
        "function": "transfer_to_reservation"
    },
    {
        "conversation": [
            {
                "role": "user",
                "content": "What is the meaning of life?"
            }
        ],
        "function": "None"
    },
    {
        "conversation": [
            {
                "role": "user",
                "content": "What is the price of a plane ticket to Paris?"
            }
        ],
        "function": "None"
    }
]

And some other test cases for the Reservation Agent:

[
    {
        "conversation": [
            {
                "role": "user",
                "content": "I want to cancel my reservation id 1234"
            }
        ],
        "function": "update_reservation"
    },
    {
        "conversation": [
            {
                "role": "user",
                "content": "I want to give a feedback"
            }
        ],
        "function": "transfer_back_to_triage"
    }
]
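To make the scoring idea concrete, here is a hedged sketch of how test cases of this shape could be checked. It is not the implementation of run_function_evals. score_cases and fake_predict are hypothetical, and fake_predict is a keyword-based stand-in for what would really be a call to the agent.

```python
def score_cases(test_cases, predict_function, runs=1):
    """Return the fraction of runs where the predicted function matches the expected one."""
    passed = 0
    total = 0
    for case in test_cases:
        for _ in range(runs):
            predicted = predict_function(case["conversation"])
            total += 1
            if predicted == case["function"]:
                passed += 1
    return passed / total if total else 0.0


def fake_predict(conversation):
    """Hypothetical stand-in for an agent call; a real run would query the model."""
    text = conversation[-1]["content"].lower()
    if "cancel" in text:
        return "update_reservation"
    if "feedback" in text:
        return "transfer_back_to_triage"
    return "None"


cases = [
    {
        "conversation": [
            {"role": "user", "content": "I want to cancel my reservation id 1234"}
        ],
        "function": "update_reservation",
    },
    {
        "conversation": [{"role": "user", "content": "I want to give a feedback"}],
        "function": "transfer_back_to_triage",
    },
]

accuracy = score_cases(cases, fake_predict, runs=3)
print(accuracy)  # 1.0
```

Repeating each case several times matters with a real model, because its output is not deterministic and a case may pass on one run and fail on another.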

To run these test cases, I launch three runs of evaluation for the Triage Agent and the Reservation Agent:

import json

# run_function_evals is the utility from the Swarm repository examples mentioned above.

triage_cases_path = "evals/eval_cases/triage_cases.json"
reservation_cases_path = "evals/eval_cases/reservation_cases.json"

n = 3

if __name__ == "__main__":
    # Run triage_agent evals
    with open(triage_cases_path, "r") as file:
        triage_test_cases = json.load(file)
    run_function_evals(
        triage_agent,
        triage_test_cases,
        n,
        eval_path="evals/eval_results/triage_evals.json",
    )

    # Run reservation_agent evals
    with open(reservation_cases_path, "r") as file:
        reservation_test_cases = json.load(file)
    run_function_evals(
        reservation_agent,
        reservation_test_cases,
        n,
        eval_path="evals/eval_results/reservation_evals.json",
    )

Below is the evaluation output for a single test case:

In this output, there is one case where the Triage Agent incorrectly transfers the conversation to the Welcome Agent. Since this behavior is not expected, I may need to refine the Triage Agent's instructions.

This is a basic approach to evaluating agents. To dive deeper into this topic, check out our post The Secret to Reliable AI Agents: Mastering Eval.

Conclusion

Swarm is a robust educational framework that helps you implement multi-agent orchestration. Although OpenAI does not recommend it for production use, it serves as an excellent tool for learning and experimenting with the logic behind multi-agent systems. By defining specialized agents and coordinating their interactions, you can create complex conversational systems capable of managing various user requests.

The full code of the example I developed is on GitHub: marmelab/swarm-agents-bike-rental.

Update (March 2025): The AI landscape is evolving rapidly, and since I worked on this experiment, OpenAI has introduced the OpenAI Agents SDK. This is a production-ready evolution of Swarm and will be actively maintained by the OpenAI team.
