Chapter 13: Orchestration Fundamentals

Multi-Agent orchestration isn't about having a bunch of Agents each doing their own thing—it's about making them collaborate like a symphony orchestra—with a conductor, division of labor, and coordination. But no matter how talented the conductor is, if the musicians are incompetent, it's all for nothing.

Quick Track (Master the core in 5 minutes)

Three hard limitations of single Agent: serial execution is slow, shallow depth, single point of failure

Three elements of orchestration: task decomposition, Agent assignment, result synthesis

Automatic vs configuration: auto-match for simple tasks, explicit configuration for complex tasks

Coordination overhead cannot be ignored: for simple tasks, a single Agent is actually faster

Orchestration is an architectural decision that requires balancing parallel benefits against coordination overhead

10-minute path: 13.1-13.3 → 13.5 → Shannon Lab

13.1 Why Isn't a Single Agent Enough?

This chapter addresses one core question: When a single Agent can't efficiently complete a task, how do you get multiple Agents to collaborate?

Imagine you're managing a small research project—you need to analyze the EV strategies of three competitors (Tesla, BYD, Rivian). If you're working alone, what would you do?

You'd process them serially: research Tesla today, BYD tomorrow, Rivian the day after. Three days later, you finally have all the information collected, and you can start writing the comparative analysis.

But what if you had three assistants? You'd have them work simultaneously: Alice researches Tesla, Bob researches BYD, Carol researches Rivian. One day later, all three reports arrive at once, and you just need to synthesize and compare.

3x efficiency improvement.

A single Agent is like working solo—it can complete tasks, but it's inefficient and shallow. Multi-Agent orchestration is about building a team with division of labor and collaboration.

But building a team isn't as simple as "hiring more people." You need to: assign tasks, coordinate progress, integrate results, and handle conflicts. The Orchestrator is what does this.

Three Hard Limitations of a Single Agent

Let me cut to the chase: a single Agent has three hard limitations.

Limitation One: Serial Execution, Too Inefficient

Searching three companies is completely independent—there are no dependencies. But a single Agent can only do them one after another. What if we parallelize? The difference is obvious:

Serial vs Parallel Execution Comparison

Saved 40 seconds. The more tasks, the bigger the gap.

Limitation Two: Generalist Doing Specialist Work, Lacks Depth

"Design a business plan for an AI startup"—what does this task require?

Market analysis: industry size, growth trends, competitive landscape
Technical architecture: technology selection, cost estimation, feasibility assessment
Financial projections: revenue model, cost structure, profit/loss analysis
Marketing strategy: target users, customer acquisition channels, brand positioning

Have one "generalist" Agent handle all four at once? It might know a bit about each, but not enough depth in any.

A better approach: 4 specialist Agents, each focused on their area.

Limitation Three: Single Point of Failure, No Redundancy

When an Agent goes down—network timeout, LLM error, tool call failure—the entire task fails.

Multi-Agent systems can implement fault tolerance: if one fails, the others continue; critical tasks can have backups.

Multi-Agent vs Single Agent

Capability	Single Agent	Multi-Agent
Parallel capability	Serial execution	Concurrent execution
Professional depth	Generalist, knows a bit of everything	Specialist division, each with strengths
Fault tolerance	Single point of failure	Redundant fault tolerance
Cost control	Unified model	Select model per task (cheaper models for simple tasks)

Note: Multi-Agent isn't a silver bullet. Coordinating multiple Agents has overhead itself—communication, synchronization, result integration. When tasks are simple, a single Agent is actually faster. Only when tasks are complex enough do multi-Agent benefits outweigh coordination costs. I've seen people split "check the weather" into 3 Agents—completely unnecessary, and actually slower.

13.2 The Orchestrator: Conductor of Multi-Agent Systems

Multi-Agent systems need a "conductor"—the Orchestrator.

It doesn't do the actual work, but it decides:

How to decompose tasks
Who does what
Execution order
How to integrate results

Four Core Responsibilities

Orchestrator's Four Responsibilities

Analogy: The orchestrator is like a restaurant's head chef.

A customer says "I want a steak dinner." The head chef won't do everything alone—they will:

Decompose: steak, sides, sauce, dessert
Dispatch: steak to the grill station, sides to the cold kitchen, sauce to the sauce chef
Coordinate: sauce goes on after the steak is ready, sides and steak come out together
Synthesize: plating, ensuring temperature and presentation are right

The head chef doesn't need to know how to make everything, but they need to know: who's good at what, what order makes sense, how to combine it all into one dish.

Execution Flow

Orchestration Four-Step Process

13.3 Routing Decisions: Which Strategy to Use?

Not all tasks need multi-Agent. The orchestrator's first decision is: What path should this task take?

Shannon's Routing Logic

Shannon's OrchestratorWorkflow makes this judgment:

// Determine if it's a simple task
simpleByShape := len(decomp.Subtasks) == 0 ||
                 (len(decomp.Subtasks) == 1 && !needsTools)
isSimple := decomp.ComplexityScore < simpleThreshold && simpleByShape

// Check for dependencies
hasDeps := false
for _, st := range decomp.Subtasks {
    if len(st.Dependencies) > 0 || len(st.Consumes) > 0 {
        hasDeps = true
        break
    }
}

switch {
case isSimple:
    // Simple task → Single Agent direct execution
    return SimpleTaskWorkflow(input)

case len(decomp.Subtasks) > 5 || hasDeps:
    // Complex task or has dependencies → Supervisor pattern
    return SupervisorWorkflow(input)

default:
    // Standard task → DAG workflow
    return DAGWorkflow(input)
}

Implementation reference (Shannon): go/orchestrator/internal/workflows/orchestrator_router.go - OrchestratorWorkflow function

Decision Tree

Task enters
    │
    ▼
Complexity < 0.3 AND single subtask AND no tools? ──Yes──► SimpleTaskWorkflow
    │                                                      (Single Agent direct execution)
    No
    │
    ▼
Subtasks > 5 OR has dependencies? ──Yes──► SupervisorWorkflow
    │                                      (Complex multi-Agent coordination)
    No
    │
    ▼
DAGWorkflow (default)
(Standard multi-Agent parallel/serial)

Three Strategy Comparison

Strategy	Use Case	Characteristics
SimpleTask	Simple Q&A, single-step tasks	Lightest weight, single Agent
DAGWorkflow	2-5 subtasks, possibly simple dependencies	Parallel/serial/hybrid execution
Supervisor	6+ subtasks, complex dependencies, needs dynamic coordination	Team management, mailbox communication

These three strategies will be covered in detail in the following chapters. For now, remember: The orchestrator automatically selects a strategy based on task complexity.

13.4 Three Execution Modes

Regardless of which strategy is chosen, Agents ultimately need to be executed. There are three execution modes:

Mode One: Parallel Execution

Use case: Subtasks are mutually independent with no dependencies.

Parallel Task Execution

The core is semaphore control—limiting the number of Agents executing simultaneously to prevent resource exhaustion.

type ParallelConfig struct {
    MaxConcurrency int  // Maximum concurrency, default 5
}

func ExecuteParallel(ctx workflow.Context, tasks []ParallelTask, config ParallelConfig) {
    // Semaphore to control concurrency
    semaphore := workflow.NewSemaphore(ctx, int64(config.MaxConcurrency))

    for i, task := range tasks {
        workflow.Go(ctx, func(ctx workflow.Context) {
            // Acquire semaphore (blocks if over concurrency limit)
            semaphore.Acquire(ctx, 1)
            defer semaphore.Release(1)

            // Execute task
            executeTask(task)
        })
    }
}

Implementation reference (Shannon): go/orchestrator/internal/workflows/patterns/execution/parallel.go - ExecuteParallel function

Why limit concurrency?

Say you have 10 search tasks, MaxConcurrency = 3:

t0: [Task 1] [Task 2] [Task 3]  ← 3 start simultaneously
t1: [1 done] [Task 4 starts]    ← 1 completes, 4 immediately fills in
t2: [2 done] [Task 5 starts]    ← 2 completes, 5 fills in
...

Without limits, 10 Agents calling the LLM API simultaneously would likely trigger rate limiting, making it slower instead.

Mode Two: Sequential Execution

Use case: Tasks have implicit dependencies where the next one needs the previous one's result.

Sequential Task Execution

type SequentialConfig struct {
    PassPreviousResults bool  // Whether to pass previous results to the next task
}

func ExecuteSequential(ctx workflow.Context, tasks []Task, config SequentialConfig) {
    var results []Result

    for i, task := range tasks {
        // Inject previous results into context
        if config.PassPreviousResults && len(results) > 0 {
            task.Context["previous_results"] = results
        }

        result := executeTask(task)
        results = append(results, result)
    }
}

The key is result passing. For example:

Task 1: "Get Tesla stock price"
        → Response: "$250"
        ↓
Task 2: "Calculate year-over-year growth"
        Context: {
          previous_results: [
            { response: "$250", numeric_value: 250 }
          ]
        }
        → Can directly use 250 for calculation

Mode Three: Hybrid Execution (Hybrid/DAG)

Use case: Some tasks can run in parallel, some have dependencies.

Task Dependency Tree

The core is dependency waiting—tasks can only start after all their dependencies are complete.

func waitForDependencies(
    ctx workflow.Context,
    dependencies []string,
    completedTasks map[string]bool,
    timeout time.Duration,
) bool {
    startTime := workflow.Now(ctx)
    deadline := startTime.Add(timeout)

    for workflow.Now(ctx).Before(deadline) {
        // Check if all dependencies are complete
        allDone := true
        for _, depID := range dependencies {
            if !completedTasks[depID] {
                allDone = false
                break
            }
        }
        if allDone {
            return true
        }

        // Wait 30 seconds before checking again
        workflow.AwaitWithTimeout(ctx, 30*time.Second, func() bool {
            // Condition check
            for _, depID := range dependencies {
                if !completedTasks[depID] {
                    return false
                }
            }
            return true
        })
    }

    return false  // Timeout
}

Implementation reference (Shannon): go/orchestrator/internal/workflows/patterns/execution/hybrid.go - waitForDependencies function

13.5 Result Synthesis

Multiple Agents have finished running—how do you integrate the results?

The Problem

Raw Agent output is typically:

Redundant: Different Agents may provide similar information
Inconsistent format: Each Agent has its own output style
Variable quality: Some succeed, some fail, some are half-baked

Users expect: a unified, complete, high-quality answer.

Preprocessing: Deduplication and Filtering

func preprocessResults(results []AgentResult) []AgentResult {
    // 1. Exact deduplication (Hash)
    seen := make(map[string]bool)
    exact := []AgentResult{}
    for _, r := range results {
        hash := computeHash(r.Response)
        if !seen[hash] {
            seen[hash] = true
            exact = append(exact, r)
        }
    }

    // 2. Similarity deduplication (Jaccard > 0.85)
    similar := []AgentResult{}
    for _, r := range exact {
        isDuplicate := false
        for _, s := range similar {
            if jaccardSimilarity(r.Response, s.Response) > 0.85 {
                isDuplicate = true
                break
            }
        }
        if !isDuplicate {
            similar = append(similar, r)
        }
    }

    // 3. Quality filtering
    filtered := []AgentResult{}
    noInfoPatterns := []string{
        "unable to retrieve",
        "failed to fetch",
        "no information available",
        "unable to access",
        "not found",
    }
    for _, r := range similar {
        if r.Success && !containsAny(r.Response, noInfoPatterns) {
            filtered = append(filtered, r)
        }
    }

    return filtered
}

Synthesis Methods

Simple synthesis: Direct concatenation

Suitable when results are already well-formatted:

func simpleSynthesis(results []AgentResult) string {
    var parts []string
    for _, r := range results {
        parts = append(parts, r.Response)
    }
    return strings.Join(parts, "\n\n")
}

LLM synthesis: Intelligent integration

Suitable when you need unified perspective, conflict resolution, and insight generation:

func llmSynthesis(query string, results []AgentResult) string {
    prompt := fmt.Sprintf(`Synthesize the following research results to answer the question: %s

Requirements:
1. Eliminate duplicate information
2. Resolve contradictions (if any)
3. Highlight key insights
4. Present in a unified format

`, query)

    for i, r := range results {
        prompt += fmt.Sprintf("=== Source %d ===\n%s\n\n", i+1, r.Response)
    }

    return callLLM(prompt)
}

13.6 Token Budget Allocation

Cost control is even more important in multi-Agent scenarios.

Why?

A single Agent burns 1000 tokens, multi-Agent might burn 5000. Without control, one complex task could use up an entire day's budget.

Budget Allocation Strategies

Simple strategy: Equal distribution

func allocateBudgetSimple(totalBudget int, numAgents int) int {
    return totalBudget / numAgents
}

// Example: Total budget 10000, 5 Agents → 2000 each

Advanced strategy: Allocate by complexity

func allocateBudgetByComplexity(totalBudget int, subtasks []Subtask) map[string]int {
    budgets := make(map[string]int)

    // Calculate total complexity
    totalComplexity := 0.0
    for _, st := range subtasks {
        totalComplexity += st.Complexity
    }

    // Allocate proportionally
    for _, st := range subtasks {
        budgets[st.ID] = int(float64(totalBudget) * st.Complexity / totalComplexity)
    }

    return budgets
}

// Example: Total budget 10000
//     Task A (complexity 0.5) → 5000
//     Task B (complexity 0.3) → 3000
//     Task C (complexity 0.2) → 2000

Shannon's implementation:

// Pass budget from router
n := len(decomp.Subtasks)
if n == 0 {
    n = 1
}
agentMax := res.RemainingTaskBudget / n

// Can set cap via environment variable or request context
if v := os.Getenv("TOKEN_BUDGET_PER_AGENT"); v != "" {
    if cap, err := strconv.Atoi(v); err == nil && cap > 0 && cap < agentMax {
        agentMax = cap
    }
}

input.Context["budget_agent_max"] = agentMax

13.7 Control Signals

During orchestration, users might want to: pause, resume, or cancel.

Temporal's Signal Mechanism

Shannon uses Temporal's Signal mechanism to implement control:

// Set up control signal handler
controlHandler := &ControlSignalHandler{
    WorkflowID: workflowID,
    AgentID:    "orchestrator",
}
controlHandler.Setup(ctx)

// Check signals at key points
checkpoints := []string{
    "pre_routing",         // Before routing decision
    "post_decomposition",  // After task decomposition
    "pre_dag_workflow",    // Before entering DAG
}

for _, checkpoint := range checkpoints {
    if err := controlHandler.CheckPausePoint(ctx, checkpoint); err != nil {
        return TaskResult{Success: false, ErrorMessage: err.Error()}, err
    }
}

Child Workflow Registration

When the orchestrator launches child workflows, they need to be registered to propagate signals:

// Launch child workflow
childFuture := workflow.ExecuteChildWorkflow(ctx, DAGWorkflow, input)

// Get child workflow ID
var childExec workflow.Execution
childFuture.GetChildWorkflowExecution().Get(ctx, &childExec)

// Register (so pause/cancel signals propagate to child workflows)
controlHandler.RegisterChildWorkflow(childExec.ID)

// Unregister after completion
defer controlHandler.UnregisterChildWorkflow(childExec.ID)

13.8 Complete Example

Let's tie together everything from above with a complete multi-Agent research task:

func CompanyResearchWorkflow(ctx workflow.Context, query string) (string, error) {
    companies := []string{"Tesla", "BYD", "Rivian"}

    // 1. Build parallel tasks
    tasks := make([]ParallelTask, len(companies))
    for i, company := range companies {
        tasks[i] = ParallelTask{
            ID:          fmt.Sprintf("research-%s", strings.ToLower(company)),
            Description: fmt.Sprintf("Research %s's 2024 EV strategy", company),
            SuggestedTools: []string{"web_search"},
            Role:        "researcher",
        }
    }

    // 2. Execute in parallel
    config := ParallelConfig{
        MaxConcurrency: 3,
        EmitEvents:     true,
    }
    result, err := ExecuteParallel(ctx, tasks, sessionID, history, config, budgetPerAgent, userID, modelTier)
    if err != nil {
        return "", err
    }

    // 3. Preprocess results
    processed := preprocessResults(result.Results)

    // 4. LLM synthesis
    synthesis := llmSynthesis(query, processed)

    return synthesis, nil
}

Execution timeline:

0s   ┌─ Orchestrator starts
     ├─ Task decomposition: 3 research tasks + 1 synthesis task
     └─ Routing decision: DAGWorkflow

1s   ├─ Launch 3 research Agents in parallel
     │   ├─ Agent A (Tesla):  Searching...
     │   ├─ Agent B (BYD):    Searching...
     │   └─ Agent C (Rivian): Searching...

15s  ├─ Agent B completes
20s  ├─ Agent C completes
25s  ├─ Agent A completes (Tesla has the most info)

26s  ├─ Start result synthesis
     │   ├─ Dedup: Removed 2 duplicate items
     │   ├─ Filter: Removed 1 failed result
     │   └─ LLM synthesis analysis

45s  └─ Output final report

Total time: ~45 seconds (serial would take ~75 seconds)

13.9 Common Pitfalls

Pitfall 1: Over-parallelization

// Dangerous: 100 concurrent, API will rate limit
config := ParallelConfig{MaxConcurrency: 100}

// Reasonable: Set based on API limits
config := ParallelConfig{MaxConcurrency: 5}

I've seen people set concurrency to 50, only to get a bunch of 429 Too Many Requests from the LLM API. Worse than serial execution.

Pitfall 2: Ignoring Failed Tasks

// Problem: Only process successes, failures are ignored
for _, r := range results {
    if r.Success {
        process(r)
    }
}

// Improvement: Monitor success rate
successRate := float64(successCount) / float64(total)
if successRate < 0.7 {
    logger.Warn("Low success rate", "rate", successRate)
    // May need retry or alerting
}

Pitfall 3: Result Synthesis Loses Information

Simple concatenation can lead to:

Duplicate information (two Agents both mention "Tesla market cap is $800 billion")
Contradictory information (one says 15% growth, another says 12%)
Lack of insights (just listing, no comparative analysis)

When using LLM synthesis, the prompt should explicitly require:

synthesisPrompt := `Synthesize the following research results:

Requirements:
1. Eliminate duplicates
2. Mark contradictions (if any)
3. Generate a comparative analysis table
4. Summarize key insights (3-5 items)

...
`

Pitfall 4: Poor Budget Allocation

// Problem: Same budget for simple and complex tasks
budgetPerAgent := totalBudget / numAgents

// Improvement: Allocate based on estimated tokens per task
for _, st := range subtasks {
    budgets[st.ID] = int(float64(totalBudget) * float64(st.EstimatedTokens) / float64(totalEstimated))
}

13.10 Other Framework Implementations

Orchestration is the core problem of multi-Agent systems, and each framework has its own approach:

Framework	Orchestration Method	Characteristics
LangGraph	Graph definition + node execution	Flexible, requires manual graph definition
AutoGen	GroupChat + Manager	Conversation-driven, automatic speaker selection
CrewAI	Crew + Process	Clear role definitions, supports sequential/hierarchical
OpenAI Swarm	handoff()	Lightweight, direct Agent-to-Agent handoff

LangGraph example:

from langgraph.graph import StateGraph

# Define state
class ResearchState(TypedDict):
    query: str
    tesla_data: str
    byd_data: str
    synthesis: str

# Define graph
graph = StateGraph(ResearchState)
graph.add_node("research_tesla", research_tesla_node)
graph.add_node("research_byd", research_byd_node)
graph.add_node("synthesize", synthesize_node)

# Define edges (dependencies)
graph.add_edge(START, "research_tesla")
graph.add_edge(START, "research_byd")
graph.add_edge("research_tesla", "synthesize")
graph.add_edge("research_byd", "synthesize")

That's This Chapter

The core message is one sentence: The Orchestrator is the conductor of multi-Agent systems—decomposing tasks, dispatching execution, coordinating dependencies, synthesizing results.

Summary

Three limitations of single Agent: Serial inefficiency, generalist not specialist, single point of failure
Orchestrator's four responsibilities: Decompose → Dispatch → Coordinate → Synthesize
Routing decisions: Simple tasks use SimpleTask, complex tasks use DAG or Supervisor
Three execution modes: Parallel (independent tasks), Sequential (chain dependencies), Hybrid (DAG)
Result synthesis: Dedup → Filter → LLM integration

Shannon Lab (10-minute hands-on)

This section helps you map this chapter's concepts to Shannon source code in 10 minutes.

Required Reading (1 file)

orchestrator_router.go: Find the OrchestratorWorkflow function's routing switch statement, understand how it determines "simple task", "needs Supervisor", and how it delegates to child workflows

Optional Deep Dives (2 files, pick based on interest)

execution/parallel.go: Understand how semaphore control is implemented (workflow.NewSemaphore), why futuresChan + Selector is used to collect results
execution/hybrid.go: Understand waitForDependencies' incremental timeout checking, why workflow.AwaitWithTimeout is used instead of blocking indefinitely

Exercises

Exercise 1: Routing Decision Analysis

Analyze which path these tasks would take:

"What's the weather in Beijing today"
"Compare iPhone and Android market share"
"Design a complete e-commerce system architecture including frontend, backend, database, cache, and message queue"

For each task, explain:

Expected complexity score range
Which workflow it would use (SimpleTask / DAG / Supervisor)
Why

Exercise 2: Concurrency Settings

Assume your LLM API limit is 10 requests per second, each task requires 3 LLM calls, averaging 5 seconds each.

Questions:

If you have 20 subtasks, what should MaxConcurrency be?
What happens if you set it too high?
What happens if you set it too low?

Exercise 3 (Advanced): Design a Synthesis Prompt

Design an LLM synthesis prompt for a "multi-company financial report comparison analysis" task.

Requirements:

How to handle duplicate information
How to handle data contradictions
Output format requirements (table + insights)
Citation annotation requirements

Want to Go Deeper?

Temporal Workflows - Understanding the infrastructure for workflow orchestration
LangGraph Multi-Agent - Python ecosystem's graph orchestration solution
AutoGen GroupChat - Microsoft's conversational multi-Agent framework

Next Chapter Preview

The orchestrator decides "who does what," but "how to do it" is still unresolved.

When tasks have complex dependencies—A waits for B, B waits for C, C can run in parallel with D—simple serial or parallel execution won't cut it.

The next chapter covers DAG Workflows: using directed acyclic graphs to model task dependencies and implement intelligent scheduling.

See you in the next chapter.