**Domain**: AI/ML Architecture
**Inheritance**: inheritable
**Version**: 1.0.0
**Last Updated**: 2026-02-01
Comprehensive patterns for designing AI agents—autonomous systems that use LLMs to reason, plan, and execute multi-step tasks. Covers single-agent architectures, multi-agent orchestration, tool use, memory systems, and production deployment patterns.
┌─────────────────────────────────────────────────────────────┐
│ AI AGENT │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Perceive│ → │ Plan │ → │ Act │ → │ Learn │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
│ ↑ │ │
│ └──────────────────────────────────────────┘ │
│ Feedback Loop │
└─────────────────────────────────────────────────────────────┘
Core Components:
| Aspect | Chatbot | Workflow | Agent |
|---|---|---|---|
| Autonomy | Low | None | High |
| Planning | None | Predefined | Dynamic |
| Tool Use | Limited | Fixed sequence | Flexible |
| Memory | Session only | None | Persistent |
| Error Recovery | Retry/fail | Fail | Reason & adapt |
The foundation of most modern agents:
┌──────────────────────────────────────────┐
│ ReAct Loop │
├──────────────────────────────────────────┤
│ 1. Thought: Reason about the task │
│ 2. Action: Choose and execute a tool │
│ 3. Observation: Process tool output │
│ 4. Repeat until task complete │
└──────────────────────────────────────────┘
Example Trace:
User: What's the weather in Seattle and should I bring an umbrella?
Thought: I need to check Seattle weather to answer this question.
Action: weather_api(location="Seattle, WA")
Observation: {"temp": 52, "condition": "rain", "precipitation": "80%"}
Thought: It's raining with an 80% chance of precipitation. The user should bring an umbrella.
Action: respond("It's 52°F and raining in Seattle with 80% chance of
precipitation. Yes, definitely bring an umbrella!")
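The loop above can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: `scripted_llm`, the canned `weather_api` tool, and the `(thought, action, args)` step format are hypothetical stand-ins for a real model call and a real tool, and the scripted policy simply replays the trace shown above.

```python
from typing import Callable

# Hypothetical tool: returns canned data instead of calling a real weather service.
def weather_api(location: str) -> dict:
    return {"temp": 52, "condition": "rain", "precipitation": "80%"}

TOOLS: dict[str, Callable[..., dict]] = {"weather_api": weather_api}

def scripted_llm(history: list[str]) -> tuple[str, str, dict]:
    """Stand-in for an LLM call: returns (thought, action, arguments)."""
    if not any(line.startswith("Observation:") for line in history):
        return ("I need to check Seattle weather.", "weather_api",
                {"location": "Seattle, WA"})
    return ("It is raining, so an umbrella is needed.", "respond",
            {"text": "Rain in Seattle: bring an umbrella."})

def react_loop(question: str, llm=scripted_llm, max_steps: int = 5) -> str:
    history = [f"User: {question}"]
    for _ in range(max_steps):               # step limit guards against runaway loops
        thought, action, args = llm(history)
        history.append(f"Thought: {thought}")
        if action == "respond":              # terminal action ends the loop
            return args["text"]
        observation = TOOLS[action](**args)  # execute the chosen tool
        history.append(f"Action: {action}({args})")
        history.append(f"Observation: {observation}")
    return "Stopped: step limit reached."
```

Keeping the step limit inside the loop means a model that never emits `respond` still terminates.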
For complex, multi-step tasks:
┌─────────────────────────────────────────────────────────────┐
│ Plan-and-Execute │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ │
│ │ Planner │ Create high-level plan │
│ └──────┬──────┘ │
│ ↓ │
│ ┌─────────────┐ │
│ │ Executor │ Execute each step │
│ └──────┬──────┘ │
│ ↓ │
│ ┌─────────────┐ │
│ │ Replanner │ Adjust plan based on results │
│ └─────────────┘ │
└─────────────────────────────────────────────────────────────┘
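A rough sketch of the planner, executor, and replanner hand-off shown in the diagram. The three callables and the `demo_*` stubs are assumptions for illustration; in practice each would wrap an LLM call or tool invocation.

```python
def plan_and_execute(goal, planner, executor, replanner, max_steps=10):
    """Run planner -> executor -> replanner until the plan is empty."""
    plan = list(planner(goal))
    results = []
    while plan and len(results) < max_steps:
        step = plan.pop(0)
        results.append((step, executor(step)))
        # The replanner sees the outcome and may add, drop, or reorder steps.
        plan = list(replanner(goal, plan, results))
    return results

# Hypothetical stubs: the first "draft" attempt fails and is retried once.
def demo_planner(goal):
    return ["research", "draft"]

def demo_executor(step):
    return "failed" if step == "draft" else "ok"

def demo_replanner(goal, remaining, results):
    last_step, outcome = results[-1]
    if outcome == "failed" and not last_step.startswith("retry"):
        return [f"retry {last_step}"] + remaining
    return remaining
```

Calling `plan_and_execute("publish report", demo_planner, demo_executor, demo_replanner)` runs three steps: `research`, the failing `draft`, and the inserted `retry draft`.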
When to Use:
Self-improvement through reflection:
┌─────────────────────────────────────────────────────────────┐
│ Reflexion │
├─────────────────────────────────────────────────────────────┤
│ 1. Attempt task │
│ 2. Evaluate outcome (success/failure) │
│ 3. Generate reflection on what went wrong │
│ 4. Store reflection in memory │
│ 5. Retry with reflection context │
└─────────────────────────────────────────────────────────────┘
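The five steps can be expressed as a short retry loop. This is a sketch under assumptions: `attempt`, `evaluate`, and `reflect` are hypothetical callables (normally LLM calls), and memory is a plain list rather than a persistent store.

```python
def reflexion(task, attempt, evaluate, reflect, max_trials=3):
    reflections = []                                # step 4: reflection memory
    for trial in range(1, max_trials + 1):
        output = attempt(task, reflections)         # steps 1 and 5
        if evaluate(task, output):                  # step 2
            return output, trial, reflections
        reflections.append(reflect(task, output))   # steps 3 and 4
    return None, max_trials, reflections

# Hypothetical stubs: the first answer lacks sources, the second includes them.
def demo_attempt(task, reflections):
    return "answer with sources" if "Cite sources." in reflections else "answer"

def demo_evaluate(task, output):
    return "sources" in output

def demo_reflect(task, output):
    return "Cite sources."
```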
Central coordinator delegates to specialized agents:
┌─────────────────────────────────────────────────────────────┐
│ │
│ ┌────────────┐ │
│ │ Supervisor │ │
│ └─────┬──────┘ │
│ ┌─────────────┼─────────────┐ │
│ ↓ ↓ ↓ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Research │ │ Writer │ │ Reviewer │ │
│ │ Agent │ │ Agent │ │ Agent │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
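A minimal sketch of the delegation loop in the diagram. The worker lambdas and `choose_next` are hypothetical: a real supervisor would ask an LLM which worker to call next, whereas this stub walks a fixed order so the example stays runnable.

```python
WORKERS = {
    "research": lambda task: f"notes on {task}",
    "writer": lambda task: f"draft from {task}",
    "reviewer": lambda task: f"approved: {task}",
}

def choose_next(done):
    """Stand-in for the supervisor's decision: next worker name or FINISH."""
    for name in ("research", "writer", "reviewer"):
        if name not in done:
            return name
    return "FINISH"

def run_supervisor(request, workers=WORKERS, max_turns=6):
    artifact, done = request, []
    for _ in range(max_turns):            # bound the number of delegations
        nxt = choose_next(done)
        if nxt == "FINISH":
            break
        artifact = workers[nxt](artifact)  # delegate and collect the result
        done.append(nxt)
    return artifact, done
```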
Use Cases:
Nested supervisor structure for complex organizations:
┌─────────────────────────────────────────────────────────────┐
│ Top Supervisor │
│ ┌─────────────┴─────────────┐ │
│ ↓ ↓ │
│ ┌───────────────┐ ┌───────────────┐ │
│ │ Research Lead │ │ Writing Lead │ │
│ └───────┬───────┘ └───────┬───────┘ │
│ ┌────┴────┐ ┌────┴────┐ │
│ ↓ ↓ ↓ ↓ │
│ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐ │
│ │Web │ │Paper │ │Draft │ │Edit │ │
│ │Search │ │Review │ │Writer │ │Writer │ │
│ └───────┘ └───────┘ └───────┘ └───────┘ │
└─────────────────────────────────────────────────────────────┘
Multiple agents argue to reach better conclusions:
┌─────────────────────────────────────────────────────────────┐
│ │
│ ┌──────────┐ Argue ┌──────────┐ │
│ │ Agent A │ ◄──────────────► │ Agent B │ │
│ │ (Pro) │ │ (Con) │ │
│ └────┬─────┘ └────┬─────┘ │
│ │ │ │
│ └──────────┬──────────────────┘ │
│ ↓ │
│ ┌────────────┐ │
│ │ Judge │ Synthesize best answer │
│ └────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
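The exchange in the diagram might be wired up as below. This is only a sketch: `pro`, `con`, and `judge` are hypothetical callables standing in for three separately prompted models, and the `demo_*` stubs return placeholder strings.

```python
def debate(question, pro, con, judge, rounds=2):
    transcript = []
    for _ in range(rounds):
        # Each side sees the full transcript so far before responding.
        transcript.append(("pro", pro(question, transcript)))
        transcript.append(("con", con(question, transcript)))
    return judge(question, transcript), transcript

def demo_pro(question, transcript):
    return f"pro point {len(transcript) // 2 + 1}"

def demo_con(question, transcript):
    return f"con point {len(transcript) // 2 + 1}"

def demo_judge(question, transcript):
    return f"verdict after {len(transcript)} arguments"
```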
Benefits:
{
"name": "search_database",
"description": "Search the product database. Returns matching products with prices. Use when user asks about product availability or pricing.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search terms (product name, category, or SKU)"
},
"max_results": {
"type": "integer",
"default": 10,
"description": "Maximum results to return (1-100)"
},
"filters": {
"type": "object",
"properties": {
"min_price": { "type": "number" },
"max_price": { "type": "number" },
"in_stock": { "type": "boolean" }
}
}
},
"required": ["query"]
}
}
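Arguments produced by the model should be checked against the tool definition before the tool runs. The sketch below hand-rolls a checker for only the subset of JSON Schema used above (`type`, `required`, `default`, with nested `filters` checked only as an object); `validate_args` and `PY_TYPES` are illustrative names, and a production system would more likely use a full JSON Schema validator.

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "max_results": {"type": "integer", "default": 10},
        "filters": {"type": "object"},
    },
    "required": ["query"],
}

PY_TYPES = {"string": str, "integer": int, "number": (int, float), "object": dict}

def validate_args(args, schema=SCHEMA):
    """Return args with defaults applied, or raise ValueError."""
    props = schema["properties"]
    missing = [n for n in schema.get("required", []) if n not in args]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    clean = {n: spec["default"] for n, spec in props.items() if "default" in spec}
    for name, value in args.items():
        if name not in props:
            raise ValueError(f"unknown parameter: {name}")
        type_name = props[name]["type"]
        is_bool = isinstance(value, bool)
        # bool is a subclass of int in Python, so it is rejected for numeric types.
        ok = is_bool if type_name == "boolean" else (
            not is_bool and isinstance(value, PY_TYPES[type_name]))
        if not ok:
            raise ValueError(f"{name} must be of type {type_name}")
        clean[name] = value
    return clean
```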
Tool Design Principles:
| Strategy | Description | When to Use |
|---|---|---|
| Direct | LLM chooses from all tools | < 10 tools |
| Categorized | Group tools, select category first | 10-50 tools |
| Retrieval | Embed tool descriptions, retrieve relevant | 50+ tools |
| Routing | Specialized selector model | Production scale |
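The Retrieval strategy can be illustrated with a toy example. Assumption to note: `embed` below uses lowercase word counts with cosine similarity in place of a real embedding model, and the tool names and descriptions are made up for the example.

```python
import math
from collections import Counter

TOOL_DESCRIPTIONS = {
    "search_database": "search the product database for availability or pricing",
    "weather_api": "get the current weather forecast for a location",
    "send_email": "send an email message to a recipient",
}

def embed(text):
    """Toy embedding (assumption): word counts, not a learned vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve_tools(query, k=2):
    """Return the k tool names whose descriptions best match the query."""
    q = embed(query)
    ranked = sorted(TOOL_DESCRIPTIONS,
                    key=lambda name: cosine(q, embed(TOOL_DESCRIPTIONS[name])),
                    reverse=True)
    return ranked[:k]
```

Only the retrieved subset is then placed in the prompt, which keeps the tool list short even when the full catalog is large.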
┌─────────────────────────────────────────────────────────────┐
│ Human-in-the-Loop Pattern │
├─────────────────────────────────────────────────────────────┤
│ │
│ Agent Action Request │
│ │ │
│ ↓ │
│ ┌───────────────┐ │
│ │ Risk Check │ │
│ └───────┬───────┘ │
│ │ │
│ Low ──┴── High │
│ │ │ │
│ ↓ ↓ │
│ Execute ┌──────────┐ │
│ Directly │ Human │ │
│ │ Approval │ │
│ └────┬─────┘ │
│ │ │
│ Approve/Reject/Modify │
│ │
└─────────────────────────────────────────────────────────────┘
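The gate in the diagram could look like the sketch below. The action names in `HIGH_RISK` and the `run` / `ask_human` callables are hypothetical; `ask_human` is assumed to return a `(decision, new_params)` pair.

```python
HIGH_RISK = {"delete_records", "send_payment"}  # hypothetical action names

def execute_with_approval(action, params, run, ask_human, high_risk=HIGH_RISK):
    if action not in high_risk:
        return run(action, params)            # low risk: execute directly
    decision, new_params = ask_human(action, params)
    if decision == "approve":
        return run(action, params)
    if decision == "modify":
        return run(action, new_params)        # human-edited parameters
    return {"status": "rejected", "action": action}
```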
High-Risk Actions Requiring Approval:
┌─────────────────────────────────────────────────────────────┐
│ Agent Memory │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Working Memory │ │
│ │ Current conversation + recent context (in prompt) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Short-Term Memory │ │
│ │ Session state, intermediate results (key-value) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Long-Term Memory │ │
│ │ Facts, preferences, history (vector DB + graph) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
| Type | Storage | Retrieval | Use Case |
|---|---|---|---|
| Episodic | Vector DB | Semantic search | Past conversations, experiences |
| Semantic | Graph DB | Structured query | Facts, relationships, knowledge |
| Procedural | Code/prompts | Direct lookup | How to perform tasks |
| Working | Prompt context | Always present | Current task state |
Summarization: Compress old conversations
Full History → Summarize → Store Summary → Discard Full
Forgetting: Remove low-value memories
Memories → Score by (recency × importance × access_count) → Prune lowest
Consolidation: Merge related memories
Similar Memories → Cluster → Create consolidated memory → Archive originals
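A small sketch of the forgetting rule. Two assumptions are made here that the text does not specify: recency is modeled as exponential decay with a 24-hour half-life, and `1 + access_count` is used so that a memory that has never been read does not score zero.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    age_hours: float
    importance: float   # 0.0 to 1.0
    access_count: int

def score(m, half_life_hours=24.0):
    recency = 0.5 ** (m.age_hours / half_life_hours)  # assumed decay curve
    return recency * m.importance * (1 + m.access_count)

def prune(memories, keep):
    """Keep the `keep` highest-scoring memories, dropping the rest."""
    return sorted(memories, key=score, reverse=True)[:keep]
```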
Complex Task: "Build a marketing campaign for our new product"
                         │
         ┌───────────────┼───────────────┐
         ↓               ↓               ↓
    ┌──────────┐    ┌──────────┐    ┌──────────┐
    │ Research │    │ Content  │    │  Launch  │
    │  Phase   │    │  Phase   │    │  Phase   │
    └────┬─────┘    └────┬─────┘    └────┬─────┘
         │               │               │
  ┌──────┴──────┐    ┌───┴───┐       ┌───┴───┐
  ↓             ↓    ↓       ↓       ↓       ↓
Analyze      Survey Create Write  Schedule Monitor
Competitors  Users  Assets Copy   Posts    Results
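The same decomposition can be held as a tree whose leaves are the executable steps. A minimal sketch; the `Task` class and `leaves` helper are illustrative names, not part of any framework.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    subtasks: list["Task"] = field(default_factory=list)

def leaves(task):
    """Return the executable leaf tasks in left-to-right order."""
    if not task.subtasks:
        return [task.name]
    return [name for sub in task.subtasks for name in leaves(sub)]

CAMPAIGN = Task("Build a marketing campaign", [
    Task("Research Phase", [Task("Analyze Competitors"), Task("Survey Users")]),
    Task("Content Phase", [Task("Create Assets"), Task("Write Copy")]),
    Task("Launch Phase", [Task("Schedule Posts"), Task("Monitor Results")]),
])
```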
Current State: No marketing campaign
Goal State: Campaign live with 10K impressions
│
↓
┌─────────────────────┐
│ Gap Analysis │
│ What's missing? │
└──────────┬──────────┘
↓
┌─────────────────────┐
│ Action Generation │
│ What can close gap? │
└──────────┬──────────┘
↓
┌─────────────────────┐
│ Action Selection │
│ Best next step? │
└─────────────────────┘
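One way to sketch the gap analysis, action generation, and action selection stages is with states as sets of conditions. Everything named here is an assumption for illustration: the condition names, the `ACTIONS` table of `(preconditions, effects)`, and the greedy rule of picking the applicable action that closes the most missing conditions.

```python
ACTIONS = {  # hypothetical: name -> (preconditions, effects)
    "write_copy": (set(), {"copy_ready"}),
    "create_assets": (set(), {"assets_ready"}),
    "launch": ({"copy_ready", "assets_ready"}, {"campaign_live"}),
}

def next_action(current, goal, actions=ACTIONS):
    gap = goal - current                              # gap analysis
    if not gap:
        return None
    candidates = [name for name in sorted(actions)    # action generation
                  if actions[name][0] <= current and actions[name][1] & gap]
    if not candidates:
        return None
    # Action selection: prefer the action that closes the most of the gap.
    return max(candidates, key=lambda name: len(actions[name][1] & gap))

def plan(current, goal, actions=ACTIONS):
    current, steps = set(current), []
    while (action := next_action(current, goal, actions)) is not None:
        steps.append(action)
        current |= actions[action][1]                 # apply the action's effects
    return steps
```

Because every selected action closes at least one missing condition, the loop in `plan` always terminates.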
┌─────────────────────────────────────────────────────────────┐
│ Error Recovery Ladder │
├─────────────────────────────────────────────────────────────┤
│ │
│ Level 1: Retry │
│ └── Same action, maybe with backoff │
│ │
│ Level 2: Rephrase │
│ └── Reformulate the action (different query) │
│ │
│ Level 3: Alternative │
│ └── Use different tool for same goal │
│ │
│ Level 4: Partial │
│ └── Return partial results, note limitations │
│ │
│ Level 5: Escalate │
│ └── Ask human for help │
│ │
│ Level 6: Abort │
│ └── Cannot complete, explain why │
│ │
└─────────────────────────────────────────────────────────────┘
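The ladder can be approximated as an ordered list of fallbacks. In this sketch, `rungs` is a hypothetical list of `(level, callable)` pairs supplied by the caller; only the `retry` level is attempted more than once, and falling off the end of the list corresponds to Level 6 (Abort).

```python
import time

def run_ladder(rungs, retries=2, backoff_s=0.0):
    """Try each rung in order; return the first success or an abort report."""
    errors = []
    for level, attempt in rungs:
        tries = retries if level == "retry" else 1
        for i in range(tries):
            try:
                return {"level": level, "result": attempt()}
            except Exception as exc:
                errors.append(f"{level}: {exc}")
                if i < tries - 1:
                    time.sleep(backoff_s * (2 ** i))  # exponential backoff between retries
    return {"level": "abort", "result": None, "errors": errors}
```

The collected `errors` list gives the abort message something concrete to explain.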
Agents can get stuck. Detect and break loops:
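def calculate_similarity(recent, previous):
    """Placeholder metric (an assumption): fraction of positions with identical actions."""
    matches = sum(a == b for a, b in zip(recent, previous))
    return matches / max(len(recent), 1)


def detect_loop(action_history, window=5, threshold=0.8):
    """Detect if agent is repeating similar actions."""
    # Need two full windows of history before comparing.
    if len(action_history) < window * 2:
        return False
    recent = action_history[-window:]
    previous = action_history[-window * 2:-window]
    # Compare action patterns
    similarity = calculate_similarity(recent, previous)
    return similarity > threshold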
Recovery Actions:
What to Log:
Trace Structure:
Trace: user_request_abc123
├── parse_intent (50ms)
├── plan_generation (200ms)
├── step_1_research
│ ├── tool_call: search_web (150ms)
│ └── tool_call: summarize (100ms)
├── step_2_write
│ └── llm_call: generate_draft (300ms)
└── step_3_review
└── llm_call: critique (200ms)
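A nested trace like this can be collected with a context manager. The `Tracer` class below is an illustrative sketch using only the standard library, not the API of any particular tracing product.

```python
import time
from contextlib import contextmanager

class Tracer:
    def __init__(self, trace_id):
        self.trace_id = trace_id
        self.spans = []          # span records in start order
        self._depth = 0

    @contextmanager
    def span(self, name):
        record = {"depth": self._depth, "name": name, "ms": None}
        self.spans.append(record)
        self._depth += 1
        start = time.perf_counter()
        try:
            yield record
        finally:
            # Record duration even if the wrapped step raises.
            record["ms"] = (time.perf_counter() - start) * 1000
            self._depth -= 1

    def render(self):
        """Indented text view; call after all spans have closed."""
        lines = [f"Trace: {self.trace_id}"]
        for s in self.spans:
            lines.append(f"{'  ' * (s['depth'] + 1)}{s['name']} ({s['ms']:.0f}ms)")
        return "\n".join(lines)
```

Usage follows the tree shape directly: wrap each step in `with tracer.span("step_1_research"):` and nest tool or LLM calls inside it.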
| Strategy | Implementation |
|---|---|
| Token budgets | Set max tokens per task |
| Step limits | Maximum N actions per request |
| Tiered models | Stronger model for planning, cheaper model for execution (e.g., GPT-4 and GPT-3.5) |
| Caching | Cache tool results, LLM responses |
| Early termination | Stop when "good enough" |
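Token budgets, step limits, and caching from the table can be combined in a few lines. This is a sketch: `Budget`, `BudgetExceeded`, and `cached_tool` are hypothetical names, and token counts are assumed to be reported by the caller.

```python
from functools import lru_cache

class BudgetExceeded(RuntimeError):
    pass

class Budget:
    def __init__(self, max_tokens, max_steps):
        self.max_tokens, self.max_steps = max_tokens, max_steps
        self.tokens = 0
        self.steps = 0

    def charge(self, tokens):
        """Record one step; raise before the limit is crossed."""
        if self.steps + 1 > self.max_steps:
            raise BudgetExceeded("step limit reached")
        if self.tokens + tokens > self.max_tokens:
            raise BudgetExceeded("token budget exhausted")
        self.steps += 1
        self.tokens += tokens

CALLS = {"count": 0}  # counts real (uncached) tool executions

@lru_cache(maxsize=256)
def cached_tool(query):
    CALLS["count"] += 1
    return f"result for {query}"
```

Checking the limit before incrementing means a rejected call leaves the counters unchanged, so the agent can still report accurate usage.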
┌─────────────────────────────────────────────────────────────┐
│ Safety Layer │
├─────────────────────────────────────────────────────────────┤
│ │
│ Input Validation │
│ ├── Prompt injection detection │
│ ├── PII/sensitive data filtering │
│ └── Request rate limiting │
│ │
│ Action Validation │
│ ├── Tool parameter sanitization │
│ ├── Scope/permission checks │
│ └── Dangerous action blocking │
│ │
│ Output Validation │
│ ├── Content policy compliance │
│ ├── Hallucination detection │
│ └── Sensitive data redaction │
│ │
└─────────────────────────────────────────────────────────────┘
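Two of the checks above, scope/permission checks and sensitive data redaction, are simple enough to sketch. The role table, tool names, and email pattern are assumptions for illustration; the regex is deliberately loose and would not catch every address format or other kinds of PII.

```python
import re

# Hypothetical role-to-tool permissions.
ALLOWED_TOOLS = {
    "viewer": {"search_database"},
    "admin": {"search_database", "delete_records"},
}

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def check_permission(role, tool):
    """Action validation: unknown roles get no tools."""
    return tool in ALLOWED_TOOLS.get(role, set())

def redact(text):
    """Output validation: mask anything that looks like an email address."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)
```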
| Framework | Strengths | Best For |
|---|---|---|
| LangChain | Comprehensive, many integrations | Rapid prototyping |
| LangGraph | Stateful, graph-based flows | Complex multi-agent |
| AutoGen | Multi-agent conversations | Research, code gen |
| CrewAI | Role-based teams | Business workflows |
| Semantic Kernel | Enterprise, .NET/Python | Microsoft stack |
| Agents SDK (OpenAI) | Simple, hosted | Quick single-agent |
| Problem | Solution |
|---|---|
| Agent makes too many decisions without checkpoints | Add approval gates for significant actions |
| No termination conditions | Set max iterations, cost limits, time bounds |
| Too many tools confuse the agent | Curate tools, use retrieval for large toolsets |
| Accumulating context without pruning | Summarize, forget, consolidate |
| One agent does everything | Decompose into specialized sub-agents |
✅ Good Fit:
❌ Poor Fit:
AI Agent Design skill — Building autonomous, reliable AI systems