UnblockDevs

Multi-Agent AI Systems — Complete Guide: Patterns, Frameworks, and Production Deployment

Multi-agent AI systems use multiple AI models working together — each with specialized roles, tools, and capabilities — to accomplish complex tasks that a single LLM cannot handle alone. This guide covers architectures, frameworks, real-world patterns, and what it takes to deploy multi-agent systems reliably in production.

Orchestrator: agent that coordinates other agents
Tool use: agents call APIs, databases, code executors
AutoGen: Microsoft's multi-agent framework
CrewAI: role-based multi-agent framework

1. What is a Multi-Agent System?

Multi-agent architecture overview

A multi-agent system is a network of AI agents, each with a specific role, set of tools, and memory. Agents communicate through messages. An orchestrator agent breaks down complex tasks and delegates to specialist agents — researcher, writer, code executor, reviewer. The result is greater capability than any single agent, with each agent optimized for its role.
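
The description above (a role, a set of tools, memory, and message passing) can be sketched as a small data structure. This is a minimal illustration, not any framework's API: the `Message` and `Agent` classes, the `receive` method, and the placeholder `web_search` tool are all hypothetical names, and the reply is a canned string where a real agent would call an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    """One message passed between agents (hypothetical structure)."""
    sender: str
    recipient: str
    content: str


@dataclass
class Agent:
    """An agent with a role, named tools, and a memory of received messages."""
    name: str
    role: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    memory: list[Message] = field(default_factory=list)

    def receive(self, message: Message) -> Message:
        # Remember the incoming message, then reply to whoever sent it.
        # A real agent would call an LLM (and possibly its tools) here.
        self.memory.append(message)
        reply = f"[{self.role}] handled: {message.content}"
        return Message(sender=self.name, recipient=message.sender, content=reply)


researcher = Agent(
    name="researcher",
    role="research specialist",
    tools={"web_search": lambda query: f"results for {query}"},  # placeholder tool
)
request = Message(sender="orchestrator", recipient="researcher",
                  content="find competitor pricing")
reply = researcher.receive(request)
print(reply.content)
```

Keeping memory per agent, rather than in one shared transcript, is one possible design; frameworks differ on this point.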

Step 1. User submits complex task
The task is too large or multi-dimensional for one agent: "Research competitor pricing, write a report, validate the data, and format for the board."

Step 2. Orchestrator breaks it down
The orchestrator agent decomposes the task into subtasks and assigns each to the appropriate specialist agent with the right tools.

Step 3. Research agent gathers data
The research agent uses web search, database queries, or API calls to gather relevant information. It returns structured findings to the orchestrator.

Step 4. Writer agent composes content
The writer agent takes the research output and composes a structured document, report, or response following the required format.

Step 5. Reviewer agent validates
The reviewer agent checks the output for accuracy, completeness, and quality. It either approves or returns specific revision requests.

Step 6. Final output delivered
The orchestrator collects all outputs, resolves any conflicts, and delivers the final result to the user.
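
The flow above can be sketched in plain Python with stub functions in place of LLM-backed agents. Everything here is an assumption made for illustration: the function names, the `max_revisions` parameter, and the reviewer's toy approval rule (it approves only drafts that have been revised once) are not taken from any framework.

```python
def research_agent(task: str) -> str:
    """Stub researcher: a real one would search the web or query a database."""
    return f"findings for: {task}"


def writer_agent(task: str, research: str, feedback: str = "") -> str:
    """Stub writer: composes a draft, noting any feedback it was asked to address."""
    suffix = f" (revised: {feedback})" if feedback else ""
    return f"report on {task} using [{research}]{suffix}"


def reviewer_agent(draft: str) -> tuple[bool, str]:
    """Stub reviewer: returns (approved, feedback). Toy rule for the demo only."""
    if "revised" in draft:
        return True, ""
    return False, "add sources"


def orchestrate(task: str, max_revisions: int = 2) -> str:
    """Delegate to each specialist in turn, with a bounded review loop."""
    research = research_agent(task)
    draft = writer_agent(task, research)
    for _ in range(max_revisions):
        approved, feedback = reviewer_agent(draft)
        if approved:
            break
        draft = writer_agent(task, research, feedback)
    return draft


final = orchestrate("competitor pricing")
print(final)
```

Bounding the review loop with `max_revisions` means the orchestrator always returns something, even if the reviewer never approves.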

2. Agent Architecture Patterns

| Pattern | How it works | Use case |
| --- | --- | --- |
| Pipeline (sequential) | Chain: A → B → C → D, each step feeds the next | Document processing, content pipelines, ETL workflows where order matters |
| Supervisor | Orchestrator delegates to specialist sub-agents | Research + writing + coding tasks, complex multi-domain workflows |
| Peer-to-peer (debate) | Agents discuss, critique, and vote on decisions | Code review, fact-checking, consensus tasks requiring adversarial review |
| Hierarchical | Tree of orchestrators managing sub-orchestrators | Enterprise-scale tasks simulating departments of agents |
| Parallel fan-out | Orchestrator spawns multiple agents simultaneously | Tasks that can be parallelized, such as analyzing multiple documents at once |
| Map-reduce | Fan out to process N items, then aggregate the results | Summarizing 100 articles, processing large datasets in parallel |
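
As an illustration of the parallel fan-out and map-reduce patterns, the sketch below fans a list of documents out to worker threads and then aggregates the results. The `summarize_agent` and `aggregate_agent` functions are hypothetical stand-ins for LLM calls (here they keep the first sentence and join with a separator), and a thread pool is just one reasonable choice for I/O-bound API calls.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize_agent(document: str) -> str:
    """Map step: stand-in for an LLM call that summarizes one document."""
    return document.split(".")[0].strip()


def aggregate_agent(summaries: list[str]) -> str:
    """Reduce step: stand-in for an LLM call that merges partial results."""
    return " | ".join(summaries)


def map_reduce(documents: list[str], max_workers: int = 4) -> str:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the calls concurrently but yields results in input order.
        summaries = list(pool.map(summarize_agent, documents))
    return aggregate_agent(summaries)


docs = [
    "Alpha raised prices. Details follow.",
    "Beta cut prices. More text.",
]
combined = map_reduce(docs)
print(combined)
```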

3. Building a Multi-Agent System with LangGraph

LangGraph multi-agent pipeline (Python)
from langchain_anthropic import ChatAnthropic
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator

# Define shared state passed between all agents
class AgentState(TypedDict):
    task: str
    research: str
    draft: str
    review: str
    final: str
    messages: Annotated[list, operator.add]

llm = ChatAnthropic(model="claude-sonnet-4-6")

def research_agent(state: AgentState):
    """Agent 1: Research the topic"""
    response = llm.invoke([
        {"role": "system", "content": "You are a research expert. Gather key facts and cite sources."},
        {"role": "user", "content": f"Research thoroughly: {state['task']}"}
    ])
    return {"research": response.content, "messages": [response]}

def writer_agent(state: AgentState):
    """Agent 2: Write based on research"""
    response = llm.invoke([
        {"role": "system", "content": "You are a professional writer. Be clear and structured."},
        {"role": "user", "content": f"Write about: {state['task']}\n\nResearch: {state['research']}"}
    ])
    return {"draft": response.content}

def reviewer_agent(state: AgentState):
    """Agent 3: Review and provide specific feedback"""
    response = llm.invoke([
        {"role": "system", "content": "You are a critical editor. Be specific about improvements."},
        {"role": "user", "content": f"Review this draft:\n\n{state['draft']}\n\nList specific improvements needed."}
    ])
    return {"review": response.content}

def finalize_agent(state: AgentState):
    """Agent 4: Incorporate review feedback"""
    response = llm.invoke([
        {"role": "user", "content": f"Revise based on review:\n\nDraft: {state['draft']}\n\nReview: {state['review']}"}
    ])
    return {"final": response.content}

# Build the graph
workflow = StateGraph(AgentState)
workflow.add_node("research", research_agent)
workflow.add_node("writer", writer_agent)
workflow.add_node("reviewer", reviewer_agent)
workflow.add_node("finalize", finalize_agent)

# Sequential pipeline
workflow.set_entry_point("research")
workflow.add_edge("research", "writer")
workflow.add_edge("writer", "reviewer")
workflow.add_edge("reviewer", "finalize")
workflow.add_edge("finalize", END)

app = workflow.compile()
result = app.invoke({"task": "Explain quantum computing for developers", "messages": []})
print(result["final"])

4. Key Frameworks Comparison

LangGraph (LangChain)

Graph-based workflow with explicit state management. Best for complex conditional flows, human-in-the-loop, and long-running agents. Production-ready with LangSmith observability and checkpoint/resume support.

AutoGen (Microsoft)

Conversational agent framework where agents talk to each other via messages. Best for research tasks and code generation with code execution. Easy to prototype, with built-in code executors that can run generated Python locally or inside a Docker container.

CrewAI

Role-based agents organized into crews with tasks. High-level abstraction — define agents, tasks, and process type (sequential or hierarchical). Best for structured team-like workflows that mirror human org structures.

Claude Agent SDK (Anthropic)

Native Anthropic SDK for building agents with tool use, computer use, and multi-turn conversations. Best when building production agents specifically with Claude that need tight integration with Anthropic features.

Swarm (OpenAI)

Lightweight, experimental framework for agent handoffs and multi-agent coordination. Simple API: agents hand off to each other based on function return values. Good for exploring agent patterns without framework overhead; OpenAI positions it as educational and points production users to its Agents SDK.

Semantic Kernel (Microsoft)

Enterprise-focused agent framework with .NET and Python support. Plugins, planners, and memory. Best for enterprises already invested in the Microsoft Azure AI ecosystem.

5. Tool Use: Extending Agent Capabilities

Agent with tool use: web search + database (Python)
import anthropic
import json

client = anthropic.Anthropic()

# Define tools the agent can call
tools = [
    {
        "name": "web_search",
        "description": "Search the web for current information",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    },
    {
        "name": "get_database_record",
        "description": "Fetch a record from the product database",
        "input_schema": {
            "type": "object",
            "properties": {
                "product_id": {"type": "string"}
            },
            "required": ["product_id"]
        }
    }
]

def execute_tool(name: str, inputs: dict) -> str:
    """Execute the tool and return result as string."""
    if name == "web_search":
        # Real implementation would call a search API
        return f"Search results for '{inputs['query']}': [placeholder results]"
    elif name == "get_database_record":
        return json.dumps({"id": inputs["product_id"], "name": "Widget", "price": 29.99})
    # Unknown tool: return an error string instead of None so the model can recover
    return f"Error: unknown tool '{name}'"

def run_agent(user_message: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]

    # Cap the number of model calls so a confused agent cannot loop forever
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )

        # Anything other than a tool request (end_turn, max_tokens, ...) is final
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")

        # Process tool calls
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })

        # Add agent response + tool results to conversation
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

    raise RuntimeError(f"Agent did not finish within {max_turns} turns")

result = run_agent("What are the current reviews for product ID P123?")
print(result)
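
As the number of tools grows, an if/elif dispatcher like the one above gets harder to maintain. One possible alternative, sketched here under assumptions, is a registry that maps tool names to handlers and reports failures back to the model instead of raising. The `TOOL_REGISTRY`, `tool`, and `run_tool` names are hypothetical; the `is_error` field on a `tool_result` block is part of the Anthropic Messages API, but check the current documentation before relying on it.

```python
import json
from typing import Callable

# Hypothetical registry: tool name -> handler taking the tool's input dict.
TOOL_REGISTRY: dict[str, Callable[[dict], str]] = {}


def tool(name: str):
    """Decorator that registers a handler under the given tool name."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        TOOL_REGISTRY[name] = fn
        return fn
    return register


@tool("get_database_record")
def get_database_record(inputs: dict) -> str:
    # Placeholder data, mirroring the stub in the example above.
    return json.dumps({"id": inputs["product_id"], "name": "Widget", "price": 29.99})


def run_tool(tool_use_id: str, name: str, inputs: dict) -> dict:
    """Build a tool_result block; failures are reported, not raised."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        content, is_error = f"Unknown tool: {name}", True
    else:
        try:
            content, is_error = handler(inputs), False
        except Exception as exc:  # report any handler failure back to the model
            content, is_error = f"Tool failed: {exc!r}", True
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": content,
        "is_error": is_error,
    }


ok = run_tool("t1", "get_database_record", {"product_id": "P123"})
bad = run_tool("t2", "delete_everything", {})
missing = run_tool("t3", "get_database_record", {})
print(ok["content"], bad["content"], missing["is_error"])
```

Returning the error as content lets the model see what went wrong and try a different call, rather than crashing the agent loop.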

Start with a single agent, not multi-agent

Multi-agent systems are significantly more complex to debug, monitor, and reason about. Start with a single capable agent with good tool use. Add additional agents only when you hit specific bottlenecks: context limits that prevent a single agent from handling the full task, parallelism needs, or conflicting objectives that benefit from adversarial review. Most tasks don't need more than 2-3 agents.

6. Production Considerations

| Issue | Challenge | Solution |
| --- | --- | --- |
| Mid-pipeline failures | Any agent can fail, losing all upstream work | Checkpoint state after each step. LangGraph supports resumable workflows. |
| Infinite loops | Agents can get stuck in retry cycles | Set a maximum iteration count on every loop. Use timeout limits per agent step. |
| Cost runaway | Costs multiply with every agent in the pipeline | Use Claude Haiku for simple steps, Sonnet for complex reasoning. Cache prompts. |
| Observability | Hard to debug what went wrong in a 5-agent pipeline | Use LangSmith, Langfuse, or Weave. Log all intermediate state. |
| Prompt injection | External content can inject instructions into agents | Sanitize all external inputs. Use system prompt separation. See Claude safety docs. |
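
To make the checkpointing row concrete, here is a minimal sketch of checkpoint-and-resume using a JSON file. It is an illustration under stated assumptions, not how LangGraph implements checkpoints: `run_pipeline`, the `_done` bookkeeping key, and the `fail_once` flag that simulates an outage are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def run_pipeline(steps, state: dict, checkpoint: Path) -> dict:
    """Run steps in order, saving state after each so a rerun skips finished work."""
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())  # resume from the last save
    done = state.setdefault("_done", [])
    for name, step in steps:
        if name in done:
            continue  # completed in an earlier run
        state[name] = step(state)
        done.append(name)
        checkpoint.write_text(json.dumps(state))
    return state


calls = []  # records which steps actually executed


def research(state: dict) -> str:
    calls.append("research")
    return f"facts about {state['task']}"


def write(state: dict) -> str:
    calls.append("write")
    if state.get("fail_once") and calls.count("write") == 1:
        raise RuntimeError("simulated outage")
    return f"draft from {state['research']}"


steps = [("research", research), ("write", write)]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state.json"
    try:
        run_pipeline(steps, {"task": "pricing", "fail_once": True}, path)
    except RuntimeError:
        pass  # the first run fails in the write step
    result = run_pipeline(steps, {"task": "pricing", "fail_once": True}, path)

print(calls, result["write"])
```

On the second run the research step is skipped because its output was already saved, so only the failed step is repeated.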
