Appendix A: Glossary
This appendix lets you quickly look up key terms and understand the technical concepts behind them. Entries are organized alphabetically, each with a definition, related chapters, and a practical example.
A - Agent Core Concepts
Agent
Definition: A software entity that can perceive its environment, make autonomous decisions, and take actions to achieve goals. In the LLM era, Agents use large language models for reasoning and tools for environment interaction.
Related Chapters: Chapter 1, Chapter 2, Chapter 14
Example: A customer service Agent receives user questions, queries a knowledge base, calls an order system API, and generates a response
Related Terms: ReAct, Tool Use, Multi-Agent
Agentic Coding
Definition: A programming paradigm where AI Agents autonomously complete code generation, debugging, testing, and deployment. The Agent understands requirements, plans implementation, writes code, and iteratively optimizes.
Related Chapters: Chapter 28
Example: A developer describes "add user authentication," and the Agent automatically generates code, writes tests, and submits a PR
Related Terms: Computer Use, Reflection
API Gateway
Definition: A unified entry point that handles routing, authentication, rate limiting, and protocol conversion for all external requests. In Agent systems, it exposes Agent capabilities as standard APIs.
Related Chapters: Chapter 21, Chapter 22
Example: Kong gateway handles all HTTP requests, routes /v1/chat to the Orchestrator, and applies Token quotas
Related Terms: Orchestrator, Guardrails
Asynchronous Workflow
Definition: A workflow pattern where task execution doesn't block the caller, with results obtained via callbacks, message queues, or polling. Suitable for long-running Agent tasks.
Related Chapters: Chapter 21, Chapter 22
Example: User submits a research task, receives a task_id, Agent executes in background, client polls for progress
Related Terms: Temporal, Background Agent
B - Backpressure and Budget
Background Agent
Definition: An Agent that runs continuously in the background, executing scheduled tasks, monitoring events, or processing async requests. Doesn't require real-time user interaction.
Related Chapters: Chapter 29
Example: Automatically scrapes competitor information hourly, analyzes trends, generates weekly reports, and sends emails
Related Terms: Asynchronous Workflow, Temporal
Backpressure
Definition: A flow control mechanism where downstream systems signal upstream to "slow down." Prevents producer speed from exceeding consumer processing capacity, avoiding system crashes.
Related Chapters: Chapter 22, Chapter 23
Example: When LLM Service is overwhelmed, it returns 429 errors; Orchestrator pauses sending new requests
Related Terms: Circuit Breaker, Rate Limiting
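A minimal sketch of the backpressure reaction described above, assuming a requests-style response object with status_code and headers; call_llm_service is a hypothetical client function, not a specific SDK call.

```python
import time

def call_with_backpressure(request, call_llm_service, max_attempts=5):
    """Back off when the downstream LLM service signals overload (HTTP 429)."""
    delay = 1.0
    for attempt in range(max_attempts):
        response = call_llm_service(request)       # hypothetical client call
        if response.status_code != 429:
            return response
        # Downstream asked us to slow down: pause before sending more work.
        retry_after = float(response.headers.get("Retry-After", delay))
        time.sleep(retry_after)
        delay = min(delay * 2, 30)                 # exponential backoff, capped
    raise RuntimeError("LLM service still overloaded after backoff")
```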
Batch Processing
Definition: Aggregating multiple requests for unified processing, improving throughput and resource utilization. Commonly used for embedding generation, batch inference, etc.
Related Chapters: Chapter 23
Example: Accumulate 10 documents before calling Embedding API once, rather than calling for each document
Related Terms: Token Budget, Cost Optimization
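A minimal sketch of the accumulate-then-flush idea from the example above; embed_batch stands in for whatever embedding client you use and is hypothetical.

```python
class EmbeddingBatcher:
    """Accumulate documents and embed them in one call instead of one call each."""

    def __init__(self, embed_batch, batch_size=10):
        self.embed_batch = embed_batch      # hypothetical: embeds a list of texts
        self.batch_size = batch_size
        self.pending = []
        self.results = []

    def add(self, text):
        self.pending.append(text)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            # One API call for the whole batch improves throughput and cost.
            self.results.extend(self.embed_batch(self.pending))
            self.pending = []
```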
C - Chain-of-Thought and Cost
Caching
Definition: Storing computation results to avoid redundant calculations, speeding up responses and reducing costs. In Agent systems, can cache LLM responses, tool call results, or vector retrievals.
Related Chapters: Chapter 11, Chapter 23
Example: Cache the answer to "what's the company address?" for 24 hours, avoiding redundant LLM calls
Related Terms: Memory, Session
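A minimal sketch of response caching with a 24-hour TTL, as in the example above; the in-memory dict and call_llm are illustrative placeholders, not a specific framework API.

```python
import hashlib
import time

CACHE = {}                      # prompt hash -> (answer, expiry timestamp)
TTL_SECONDS = 24 * 3600         # cache answers for 24 hours

def cached_answer(question, call_llm):
    """Return a cached answer when available; otherwise call the LLM and store it."""
    key = hashlib.sha256(question.strip().lower().encode()).hexdigest()
    hit = CACHE.get(key)
    if hit and hit[1] > time.time():
        return hit[0]                       # cache hit: no LLM call, no cost
    answer = call_llm(question)             # hypothetical LLM client call
    CACHE[key] = (answer, time.time() + TTL_SECONDS)
    return answer
```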
Chain-of-Thought (CoT)
Definition: A prompting technique that has LLMs generate intermediate reasoning steps, solving complex problems step by step. Improves reasoning accuracy and explainability.
Related Chapters: Chapter 12, Chapter 13, Chapter 17
Example: When computing 23 × 47, the LLM outputs "20×47=940, 3×47=141, 940+141=1081"
Related Terms: Tree-of-Thoughts, Reflection
Circuit Breaker
Definition: When downstream service failure rate exceeds a threshold, automatically cuts off requests to prevent cascading failures. Attempts recovery after a period.
Related Chapters: Chapter 22, Chapter 23
Example: After 10 consecutive vector database timeouts, circuit breaker opens, all queries fail fast, half-open recovery attempted after 30 seconds
Related Terms: Backpressure, Guardrails
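A minimal sketch of the open / fail-fast / half-open cycle described above, with thresholds mirroring the example (10 failures, 30-second recovery); the class is illustrative, not a specific library.

```python
import time

class CircuitBreaker:
    """Open after too many consecutive failures; probe again after a cooldown."""

    def __init__(self, failure_threshold=10, recovery_timeout=30):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.recovery_timeout:
                raise RuntimeError("circuit open: failing fast")   # skip the call entirely
            self.opened_at = None          # half-open: allow one trial request
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()
            raise
        self.failures = 0                  # success resets the counter
        return result
```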
Computer Use
Definition: An Agent's ability to interact with computer environments by operating browsers, desktop applications, or command-line tools. Enables RPA automation and complex task execution.
Related Chapters: Chapter 27
Example: Agent opens browser, logs into system, fills forms, uploads files, takes screenshots to confirm results
Related Terms: Tool Use, Agentic Coding
Context Window
Definition: The maximum number of tokens an LLM can process at once. Includes input prompts, conversation history, and output content.
Related Chapters: Chapter 9, Chapter 10
Example: GPT-4 Turbo's context window is 128K tokens, approximately 96,000 English words
Related Terms: Token, Summarization
Cost Optimization
Definition: Strategies to reduce LLM usage costs through model selection, caching, batch processing, prompt compression, etc.
Related Chapters: Chapter 23
Example: Use Claude Haiku ($0.25/M tokens) for simple Q&A, Opus ($15/M tokens) for complex reasoning
Related Terms: Token Budget, Caching
D - DAG and Debate
DAG (Directed Acyclic Graph)
Definition: A graph structure where nodes represent tasks and edges represent dependencies. Used to define execution order of multi-Agent workflows.
Related Chapters: Chapter 14, Chapter 21
Example: Research task: search (Node A) -> analyze (Nodes B, C in parallel) -> synthesize (Node D depends on B, C)
Related Terms: Orchestrator, Temporal
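A minimal sketch of dependency-ordered execution over the research example above (A -> B, C in parallel -> D); the scheduler here runs ready tasks sequentially and is illustrative only.

```python
def run_dag(tasks, dependencies):
    """Run tasks in dependency order, executing tasks as their dependencies finish.

    tasks: dict of name -> zero-argument callable
    dependencies: dict of name -> list of names that must finish first
    """
    done, results = set(), {}
    while len(done) < len(tasks):
        ready = [n for n in tasks
                 if n not in done and all(d in done for d in dependencies.get(n, []))]
        if not ready:
            raise ValueError("cycle detected: a DAG must be acyclic")
        for name in ready:                 # independent tasks could be dispatched in parallel
            results[name] = tasks[name]()
            done.add(name)
    return results

# The research example: A (search) -> B, C (analyze) -> D (synthesize)
results = run_dag(
    tasks={"A": lambda: "search", "B": lambda: "analyze-1",
           "C": lambda: "analyze-2", "D": lambda: "synthesize"},
    dependencies={"B": ["A"], "C": ["A"], "D": ["B", "C"]},
)
```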
Debate Pattern
Definition: A reasoning pattern where multiple Agents with different viewpoints engage in multi-round debate, improving decision quality through adversarial discussion.
Related Chapters: Chapter 18
Example: Legal Agents playing plaintiff, defendant, and judge roles debate contract clause reasonability, reaching consensus
Related Terms: Multi-Agent, Reflection
Deterministic Replay
Definition: A mechanism where workflow engines can re-execute from checkpoints after failure, guaranteeing identical input produces identical results. Core feature of Temporal.
Related Chapters: Chapter 21
Example: Agent task crashes at step 5, Temporal replays from checkpoint, skipping completed steps 1-4
Related Terms: Temporal, Asynchronous Workflow
E - Embedding and Error Handling
Embedding
Definition: Converting text into high-dimensional vector numerical representations that capture semantic information. Used for similarity search, clustering, and RAG.
Related Chapters: Chapter 11, Chapter 19
Example: Text "machine learning" converted to 1536-dimensional vector [0.23, -0.45, ...]
Related Terms: RAG, Vector Database
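A minimal sketch of how embeddings are compared in practice: cosine similarity between two vectors. The toy 4-dimensional vectors are illustrative; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Semantic closeness of two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query_vec = [0.23, -0.45, 0.10, 0.88]
doc_vec   = [0.20, -0.40, 0.15, 0.90]
print(cosine_similarity(query_vec, doc_vec))   # close to 1.0 -> semantically similar
```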
Error Handling
Definition: Mechanisms for identifying, catching, and recovering from system exceptions. In Agent systems, includes retries, degradation, circuit breaking, and user-friendly error messages.
Related Chapters: Chapter 22, Chapter 23
Example: After LLM call timeout, auto-retry 3 times; if still failing, degrade to "Service busy, please try later"
Related Terms: Circuit Breaker, Guardrails
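A minimal sketch of the retry-then-degrade behavior in the example above; call_llm is a hypothetical client function, and the timeout exception type depends on your client.

```python
import time

def ask_llm_with_retry(prompt, call_llm, retries=3, timeout_errors=(TimeoutError,)):
    """Retry transient failures with backoff, then degrade to a friendly message."""
    for attempt in range(retries):
        try:
            return call_llm(prompt)            # hypothetical LLM client call
        except timeout_errors:
            time.sleep(2 ** attempt)           # 1s, 2s, 4s between attempts
    return "Service busy, please try later"    # graceful degradation, not a stack trace
```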
F - Function Calling and Failure Recovery
Fallback Strategy
Definition: A fault tolerance mechanism that automatically switches to alternatives when the primary service is unavailable. Ensures basic functionality during partial failures.
Related Chapters: Chapter 22, Chapter 23
Example: When vector search fails, degrade to keyword search; when LLM unavailable, return predefined templates
Related Terms: Circuit Breaker, Error Handling
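A minimal sketch of the two fallbacks from the example above; vector_search, keyword_search, and call_llm are hypothetical callables supplied by the caller.

```python
def search_with_fallback(query, vector_search, keyword_search):
    """Prefer semantic search; degrade to keyword search when it fails."""
    try:
        return vector_search(query)      # primary retriever
    except Exception:
        return keyword_search(query)     # fallback retriever

def answer_with_fallback(question, call_llm, default_reply):
    """Return a predefined template when the LLM is unavailable."""
    try:
        return call_llm(question)
    except Exception:
        return default_reply             # e.g. "The assistant is temporarily unavailable."
```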
Function Calling
Definition: The LLM generates structured function-call requests from the conversation; the system executes them and returns the results. The standardized implementation of Tool Use.
Related Chapters: Chapter 3, Chapter 4
Example: User asks "Beijing weather," LLM returns get_weather(city="Beijing"), system calls weather API
Related Terms: Tool Use, ReAct
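A minimal sketch of the weather example above: a tool schema advertised to the LLM and a dispatcher that executes the structured call it returns. The schema shape follows common function-calling APIs but exact field names vary by provider; get_weather here is a stand-in for a real weather API.

```python
import json

WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def dispatch_tool_call(tool_call_json, tools):
    """Execute the structured call the LLM produced and return the result."""
    call = json.loads(tool_call_json)
    return tools[call["name"]](**call["arguments"])

tools = {"get_weather": lambda city: f"{city}: 15°C, sunny"}   # stand-in for a real API
print(dispatch_tool_call('{"name": "get_weather", "arguments": {"city": "Beijing"}}', tools))
```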
G - Guardrails and Governance
Guardrails
Definition: Security mechanisms that constrain and validate Agent behavior. Includes input validation, output filtering, permission checking, and policy enforcement.
Related Chapters: Chapter 24, Chapter 25
Example: Intercept outputs containing sensitive words, block Agent access to unauthorized databases, limit Token usage
H - Handoff and Hooks
Handoff
Definition: A collaboration pattern where one Agent transfers a task to another more specialized Agent. Common in multi-Agent system responsibility division.
Related Chapters: Chapter 15, Chapter 16
Example: General Agent identifies user needs a refund, hands off conversation to Refund Specialist Agent
Related Terms: Multi-Agent, Orchestrator
Hooks
Definition: Event callback mechanisms triggered at specific points in Agent execution flow. Used for logging, monitoring, auditing, or custom logic injection.
Related Chapters: Chapter 6, Chapter 29
Example: before_tool_call hook logs tool name and parameters, after_llm_response hook filters sensitive information
Related Terms: Plugins, Observability
I - Integration and Idempotency
Idempotency
Definition: The property where the same operation produces the same result when executed multiple times. Used for safe retries and deduplication in distributed systems.
Related Chapters: Chapter 21, Chapter 22
Example: Using request_id ensures payment request retries don't result in duplicate charges
Related Terms: Deterministic Replay, Temporal
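A minimal sketch of request_id deduplication as in the payment example above; the in-memory dict and do_charge are illustrative, and a production system would use durable storage.

```python
PROCESSED = {}      # request_id -> result (durable storage in production)

def charge(request_id, amount, do_charge):
    """Retrying the same request_id returns the original result instead of charging twice."""
    if request_id in PROCESSED:
        return PROCESSED[request_id]       # duplicate delivery or retry: no second charge
    result = do_charge(amount)             # hypothetical payment call
    PROCESSED[request_id] = result
    return result
```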
L - LLM and Logging
LLM Service
Definition: A service layer that encapsulates multiple LLM provider calls, tool selection, and execution. Provides unified API, handles authentication, retries, caching, and other common logic.
Related Chapters: Chapter 3, Chapter 21
Example: Shannon's Python service layer supporting OpenAI, Anthropic, Azure with unified response format
Related Terms: Agent, Function Calling
Logging
Definition: Mechanisms for recording system runtime events, errors, and state changes. In Agent systems, used for debugging, auditing, and performance analysis.
Related Chapters: Chapter 22
Example: Structured logging records each LLM call's model, tokens, latency, and cost
Related Terms: Observability, Tracing
M - MCP and Memory
MCP (Model Context Protocol)
Definition: An open protocol that standardizes LLM connections to external data sources and tools. Defines discovery and invocation specifications for resources, tools, and prompts.
Related Chapters: Chapter 4, Chapter 5
Example: Expose Google Drive files via MCP Server, allowing Agent to list, read, and search documents
Related Terms: Tool Use, Skills
Timeliness Note (2026-01): MCP specification is still rapidly evolving. Check the latest documentation for transport layer and capability updates.
Memory
Definition: Mechanisms for Agents to store and retrieve historical information. Includes working memory (current session), long-term memory (persistent storage), and semantic memory (knowledge graphs).
Related Chapters: Chapter 7, Chapter 8, Chapter 9
Example: User says "the project I mentioned last time," Agent retrieves context from Session to understand the reference
Metrics
Definition: Mechanisms for collecting, aggregating, and visualizing system performance data. In Agent systems, monitors key metrics like Token usage, latency, and success rates.
Related Chapters: Chapter 22
Example: Prometheus collects average response time, Token consumption, and tool call counts per Agent
Related Terms: Observability, Logging
Multi-Agent System
Definition: A system architecture where multiple Agents collaborate to complete complex tasks. Agents can interact in parallel, sequential, or dynamic patterns.
Related Chapters: Chapter 13, Chapter 14, Chapter 16
Example: In e-commerce, Search Agent, Recommendation Agent, and Customer Service Agent collaboratively handle user shopping flow
Related Terms: Orchestrator, DAG
O - Orchestration and Observability
Observability
Definition: The ability to understand system internal state through logs, metrics, and traces. In Agent systems, used for debugging complex reasoning chains and tool calls.
Related Chapters: Chapter 22
Example: Trace ID tracks a request's complete path through Gateway(20ms) -> Orchestrator(50ms) -> Agent(300ms) -> LLM(2s)
Related Terms: Logging, Tracing, Metrics
OPA (Open Policy Agent)
Definition: A general-purpose policy engine that uses Rego language to define and execute authorization, validation, and compliance rules.
Related Chapters: Chapter 24
Example: Define policy "financial data can only be accessed by finance department," intercept unauthorized Agent queries
Related Terms: Guardrails, WASI
Orchestrator
Definition: A control layer that coordinates multiple Agents executing complex workflows. Responsible for routing, scheduling, result aggregation, and error handling.
Related Chapters: Chapter 13, Chapter 21
Example: Shannon's Go Orchestrator executes 3 Agents in parallel based on DAG definition, aggregates results
Related Terms: DAG, Multi-Agent
P - Planning and Plugins
P2P (Peer-to-Peer)
Definition: A collaboration pattern where Agents communicate directly without a centralized orchestrator. Suitable for dynamic, self-organizing multi-Agent scenarios.
Related Chapters: Chapter 16
Example: Research Agent autonomously discovers and requests Analysis Agent to provide data, without Orchestrator involvement
Related Terms: Multi-Agent, Handoff
Planning
Definition: A reasoning pattern where Agents formulate step-by-step plans before execution. Decomposes complex goals into executable subtask sequences.
Related Chapters: Chapter 10
Example: User requests "organize team building," Agent generates plan: 1. Determine date 2. Book venue 3. Send invitations
Related Terms: ReAct, Tree-of-Thoughts
Plugins
Definition: Pluggable modules encapsulating specific capabilities, extending Agent functionality through standard interfaces. Supports hot loading and version management.
Related Chapters: Chapter 29
Example: After installing Slack plugin, Agent can send messages, create channels, get history
Related Terms: Skills, Hooks, MCP
Prompt
Definition: Input text sent to LLMs, including instructions, examples, context, and questions. The core interface for controlling Agent behavior.
Related Chapters: Chapter 2, Chapter 12
Example: "You are a professional customer service Agent. User question: {question}. Please refer to knowledge base: {context}"
Related Terms: Chain-of-Thought, ReAct
R - RAG and Reflection
RAG (Retrieval-Augmented Generation)
Definition: A technique that retrieves relevant documents before generating responses, providing retrieval results as context to LLMs.
Related Chapters: Chapter 8, Chapter 19
Example: User asks "return policy," vector search finds relevant documents, LLM generates answer based on documents
Related Terms: Embedding, Vector Database
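A minimal sketch of retrieve-then-generate as in the return-policy example above; retrieve and call_llm are hypothetical callables for your vector store and LLM client.

```python
def answer_with_rag(question, retrieve, call_llm, top_k=3):
    """Retrieve relevant documents first, then generate an answer grounded in them."""
    docs = retrieve(question, top_k=top_k)         # hypothetical vector-search client
    context = "\n\n".join(docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)                        # hypothetical LLM client call
```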
Rate Limiting
Definition: A flow control mechanism limiting request count per time unit. Prevents abuse, protects downstream services, and controls costs.
Related Chapters: Chapter 23, Chapter 24
Example: Each user limited to 10 LLM calls per minute, exceeding returns 429 Too Many Requests
Related Terms: Token Budget, Backpressure
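A minimal sketch of the per-user limit from the example above (10 calls per minute, reject with 429), using a sliding window kept in memory; production systems typically back this with Redis or the gateway itself.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_CALLS = 10
calls = defaultdict(deque)     # user_id -> timestamps of recent calls

def check_rate_limit(user_id):
    """Sliding-window limiter: allow at most MAX_CALLS per user per minute."""
    now = time.time()
    window = calls[user_id]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()                    # drop calls outside the window
    if len(window) >= MAX_CALLS:
        return 429, "Too Many Requests"     # caller should back off and retry later
    window.append(now)
    return 200, "OK"
```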
ReAct (Reason-Act Loop)
Definition: A decision pattern where Agents cyclically execute "reason -> act -> observe." The LLM generates its reasoning and the next action, then observes the result after execution.
Related Chapters: Chapter 2, Chapter 3
Example: Think: need to check weather -> Act: call weather_api -> Observe: Beijing 15°C -> Think: answer user
Related Terms: Agent, Tool Use
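A minimal sketch of the reason-act-observe loop. It assumes the LLM has been prompted to reply either "ACT <tool> <arg>" or "ANSWER <text>"; that toy protocol is an assumption for illustration, and real implementations use structured function calling instead.

```python
def react_loop(question, call_llm, tools, max_steps=5):
    """Reason -> act -> observe until the LLM emits a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = call_llm(transcript)            # hypothetical LLM client call
        transcript += step + "\n"
        if step.startswith("ANSWER"):
            return step.removeprefix("ANSWER").strip()
        if step.startswith("ACT"):
            _, tool, arg = step.split(maxsplit=2)
            observation = tools[tool](arg)     # execute the chosen tool
            transcript += f"Observation: {observation}\n"
    return "Could not reach an answer within the step budget"
```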
Reflection
Definition: A self-optimization pattern where Agents evaluate their own output quality, identify errors, and improve. Increases result accuracy through multiple iterations.
Related Chapters: Chapter 11
Example: Agent generates code, then self-checks "any syntax errors? What's test coverage?", fixes issues
Related Terms: Chain-of-Thought, Debate
S - Session and Skills
Sandbox
Definition: An isolated execution environment that limits code access to system resources. In Agent systems, used to safely run untrusted tool code.
Related Chapters: Chapter 25
Example: Run user-defined Python tool in WASI sandbox, unable to access filesystem or network
Related Terms: WASI, Guardrails
Session
Definition: A complete interaction process between user and Agent, containing multiple conversation turns and related context. Sessions can be persisted for cross-session dialogue.
Related Chapters: Chapter 7, Chapter 9
Example: User's conversation history, preferences, and unfinished tasks after login are all bound to the same session_id
Related Terms: Memory, Context Window
Skills
Definition: Reusable Agent capability modules encapsulating domain-specific prompts, tools, and workflows. Can be shared and composed across Agents.
Related Chapters: Chapter 5
Example: code_review skill includes code analysis prompts, static checking tools, and formatting tools
Streaming
Definition: A pattern where LLM generates content incrementally and clients receive partial results in real-time. Improves user experience by reducing time-to-first-token.
Related Chapters: Chapter 3, Chapter 22
Example: ChatGPT-style character-by-character output, users see progress without waiting for complete response
Related Terms: Asynchronous Workflow
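A minimal sketch of consuming a streaming response chunk by chunk; the fake_stream iterator stands in for whatever chunk iterator your LLM client yields.

```python
def stream_response(chunks, on_token):
    """Consume a streaming LLM response chunk by chunk instead of waiting for the full text."""
    parts = []
    for chunk in chunks:                      # chunks: iterator of partial text from the LLM client
        on_token(chunk)                       # surface progress immediately (low time-to-first-token)
        parts.append(chunk)
    return "".join(parts)

# Toy usage with a fake stream; a real client yields tokens as the model generates them.
fake_stream = iter(["Hel", "lo ", "wor", "ld"])
full_text = stream_response(fake_stream, on_token=lambda t: print(t, end="", flush=True))
```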
Summarization
Definition: A technique for compressing long text into brief summaries. In Agent systems, used to handle context window limits and reduce Token costs.
Related Chapters: Chapter 7
Example: Compress 1000 turns of conversation history into summary "User inquired about refund process, resolved"
Related Terms: Context Window, Memory
Supervisor Pattern
Definition: A multi-Agent architecture pattern where one Agent acts as coordinator, assigning tasks to specialist Agents and aggregating results.
Related Chapters: Chapter 15
Example: Supervisor Agent breaks down "write article" into research, writing, and proofreading tasks, assigns to three specialists
Related Terms: Orchestrator, Handoff
T - Token and Tools
Temporal (Workflow Engine)
Definition: A distributed, durable workflow engine supporting long-running tasks, deterministic replay, and automatic retries. Suitable for complex Agent orchestration.
Related Chapters: Chapter 21
Example: Define "daily report" workflow that can resume from breakpoint even after service restart
Related Terms: DAG, Orchestrator
Token
Definition: The smallest text unit LLMs process, approximately 0.75 English words or 0.5 Chinese characters. LLM billing and context windows are measured in Tokens.
Related Chapters: Chapter 2, Chapter 23
Example: Text "Hello World" equals approximately 2 Tokens
Related Terms: Context Window, Token Budget
Token Budget
Definition: Mechanisms for managing and allocating Token usage limits per request, session, or user. Controls costs and prevents abuse.
Related Chapters: Chapter 23, Chapter 24
Example: Free users get 10K Tokens/day, paid users get 1M Tokens; exceeding triggers degradation or denial
Related Terms: Rate Limiting, Cost Optimization
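A minimal sketch of the tiered daily budgets from the example above; the in-memory usage dict is illustrative, and a production system would track usage in durable, shared storage.

```python
DAILY_LIMIT = {"free": 10_000, "paid": 1_000_000}
usage = {}      # (user_id, date) -> tokens used today

def reserve_tokens(user_id, tier, date, tokens_requested):
    """Refuse (or degrade) requests that would push the user over today's budget."""
    used = usage.get((user_id, date), 0)
    if used + tokens_requested > DAILY_LIMIT[tier]:
        return False                                 # caller degrades or denies the request
    usage[(user_id, date)] = used + tokens_requested
    return True
```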
Tool Use
Definition: Mechanisms for Agents to extend capabilities by calling external tools (APIs, databases, calculators, etc.). The core way Agents interact with their environment.
Related Chapters: Chapter 3, Chapter 4
Example: Call search_api("Claude 3.5") for latest info, call calculator(23*47) for computation
Related Terms: Function Calling, MCP
Tracing (Distributed Tracing)
Definition: Tracking a request's complete path through distributed systems, recording each service's latency and status. In Agent systems, used for debugging complex call chains.
Related Chapters: Chapter 22
Example: Jaeger shows request path: Gateway(20ms) -> Orchestrator(50ms) -> Agent(300ms) -> LLM(2s)
Related Terms: Observability, Logging
Tree-of-Thoughts (ToT)
Definition: An advanced reasoning pattern that models the problem solution space as a tree, explores multiple reasoning paths, and finds optimal solutions through backtracking and pruning.
Related Chapters: Chapter 17
Example: When solving a math problem, try 3 methods, expand 2 steps for each, evaluate and select the best path to continue
Related Terms: Chain-of-Thought, Planning
V - Vector Database
Vector Database
Definition: A database specialized for storing and retrieving high-dimensional vectors. Supports similarity search, commonly used for RAG and semantic retrieval.
Related Chapters: Chapter 8, Chapter 19
Example: Pinecone, Qdrant store document embeddings, return most similar Top-K results in milliseconds
W - WASI and Workflow
WASI (WASM Sandbox)
Definition: WebAssembly System Interface, a standard interface allowing WASM modules to safely access system resources like files and network. Used for isolated execution of untrusted code.
Related Chapters: Chapter 25
Example: Shannon uses WASI to run user-defined tools, limited to only accessing specified directories and APIs
Related Terms: Sandbox, Guardrails
Workflow
Definition: Process orchestration that defines task execution order, dependencies, and error handling. In Agent systems, used to organize multi-step tasks.
Related Chapters: Chapter 14, Chapter 21
Example: E-commerce order workflow: verify inventory -> charge -> ship -> send notification
How to Use This Glossary
Usage Guide
- Quick lookup: Terms are organized alphabetically for easy location
- Understand concepts: Each term includes a concise definition and a practical example
- Go deeper: Jump to detailed content via the "Related Chapters" links
- Extended reading: Explore relationships between terms via the "Related Terms" links
Update Notes
The AI Agent field is evolving rapidly; some term definitions and best practices may change over time. This glossary is written based on early 2026 industry consensus. We recommend using it alongside "Timeliness Notes" in chapters and external references.
Feedback and Contributions
If you find inaccurate definitions or need to add new terms, please provide feedback through GitHub Issues.
Index Reference
By Topic:
- Agent Basics: Agent, ReAct, Tool Use, Function Calling, Prompt
- Reasoning Patterns: CoT, ToT, Planning, Reflection, Debate
- Multi-Agent: Multi-Agent, Orchestrator, DAG, Supervisor, Handoff, P2P
- Extension Mechanisms: MCP, Skills, Hooks, Plugins
- Context Memory: Context Window, Memory, Session, Summarization, RAG
- Production Architecture: Temporal, Observability, Logging, Tracing, Metrics
- Security Compliance: OPA, WASI, Guardrails, Sandbox
- Cost Optimization: Token Budget, Rate Limiting, Caching, Batch Processing
- Fault Tolerance: Circuit Breaker, Backpressure, Fallback Strategy, Error Handling
- Frontier Practices: Computer Use, Agentic Coding, Background Agent
By Technology Stack:
- LLM Related: Token, Context Window, Prompt, Streaming, Function Calling
- Data Storage: Vector Database, Embedding, Caching
- Workflow Engines: Temporal, DAG, Workflow, Deterministic Replay
- Observability: Logging, Tracing, Metrics, Observability
- Security Isolation: WASI, Sandbox, OPA, Guardrails
This glossary covers core concepts from all 30 chapters, with 60+ key terms. We recommend using it alongside chapter content and code examples for deeper understanding.