Name: Model Fallback
Author: openclaw

{
  "fallback_chain": [
    {
      "provider": "minimax-portal",
      "model": "MiniMax-M2.5",
      "priority": 1,
      "timeout": 30,
      "max_retries": 3
    },
    {
      "provider": "moonshot",
      "model": "kimi-k2.5",
      "priority": 2,
      "timeout": 30,
      "max_retries": 2
    },
    {
      "provider": "zhipu",
      "model": "glm-4-air",
      "priority": 3,
      "timeout": 20,
      "max_retries": 2
    }
  ]
}
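
As an illustration of how a client might consume this chain, the sketch below parses the JSON and orders the entries so that the lowest `priority` number is tried first. The function name `load_chain` and the set of required keys are assumptions made for this example, not part of the skill itself.

```python
import json

# The default chain from above, deliberately shuffled to show the sort.
CONFIG_TEXT = """
{
  "fallback_chain": [
    {"provider": "moonshot", "model": "kimi-k2.5", "priority": 2, "timeout": 30, "max_retries": 2},
    {"provider": "zhipu", "model": "glm-4-air", "priority": 3, "timeout": 20, "max_retries": 2},
    {"provider": "minimax-portal", "model": "MiniMax-M2.5", "priority": 1, "timeout": 30, "max_retries": 3}
  ]
}
"""

# Assumption: these three keys are the minimum an entry needs.
REQUIRED_KEYS = {"provider", "model", "priority"}


def load_chain(config_text: str) -> list[dict]:
    """Parse the config and return entries ordered by ascending priority."""
    chain = json.loads(config_text)["fallback_chain"]
    for entry in chain:
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"fallback entry {entry!r} is missing {sorted(missing)}")
    return sorted(chain, key=lambda entry: entry["priority"])


chain = load_chain(CONFIG_TEXT)
print([entry["model"] for entry in chain])
```

Sorting on load means the order of entries in the file does not matter; only the `priority` values do.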

# Trigger a model call (fallback happens automatically on failure)

# Force fallback to next model
/scripts/model-fallback.sh --force-next

# Check current model status
/scripts/model-fallback.sh --status

# Reset to primary model
/scripts/model-fallback.sh --reset

{
  "fallback_chain": [
    {"provider": "...", "model": "...", "priority": 1}
  ],
  "health_check": {
    "enabled": true,
    "interval_seconds": 300
  }
}
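
One way to read the `health_check` block is as a simple "is a check due yet?" rule. The sketch below assumes timestamps are plain seconds and that a missing `health_check` section means checks are off; `health_check_due` is a hypothetical helper, not a function the skill is known to expose.

```python
def health_check_due(config: dict, last_checked: float, now: float) -> bool:
    """Return True when health checks are enabled and the interval has elapsed."""
    health = config.get("health_check", {})
    if not health.get("enabled", False):
        return False
    # Assumption: fall back to the 300 s shown in the example config.
    interval = health.get("interval_seconds", 300)
    return now - last_checked >= interval


config = {"health_check": {"enabled": True, "interval_seconds": 300}}
print(health_check_due(config, last_checked=0.0, now=300.0))
```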

1. User makes request with primary model
2. Model call fails (error, timeout, rate limit)
3. Skill detects failure
4. Wait 3 seconds (debounce)
5. Switch to next model in chain
6. Retry request with new model
7. If successful, return result
8. If failed, repeat steps 4-7
9. If all models fail, return error with details
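
The steps above can be sketched as a small loop. This is an illustration under stated assumptions rather than the skill's actual implementation: `call_model` is a hypothetical stand-in for whatever client talks to a provider, per-model `timeout` and `max_retries` are left out for brevity, and `sleep` is injectable so the 3-second debounce can be skipped in tests.

```python
import time
from typing import Callable


class AllModelsFailed(Exception):
    """Raised when every model in the chain failed; keeps per-model errors."""

    def __init__(self, errors: dict[str, str]):
        super().__init__(f"all models failed: {errors}")
        self.errors = errors


def call_with_fallback(
    chain: list[dict],
    call_model: Callable[[dict, str], str],  # hypothetical provider client
    prompt: str,
    debounce_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, str]:
    """Try each model in order; return (model name, reply) from the first success."""
    errors: dict[str, str] = {}
    for index, entry in enumerate(chain):
        if index > 0:
            sleep(debounce_seconds)  # step 4: debounce before switching
        try:
            return entry["model"], call_model(entry, prompt)  # steps 1, 6, 7
        except Exception as exc:  # steps 2-3: any failure counts in this sketch
            errors[entry["model"]] = str(exc)
    raise AllModelsFailed(errors)  # step 9: report every failure


def fake_call(entry: dict, prompt: str) -> str:
    """Simulated provider: the primary model is rate limited."""
    if entry["model"] == "MiniMax-M2.5":
        raise RuntimeError("rate limit exceeded")
    return f"reply from {entry['model']}"


chain = [{"model": "MiniMax-M2.5"}, {"model": "kimi-k2.5"}, {"model": "glm-4-air"}]
waits: list[float] = []
model, reply = call_with_fallback(chain, fake_call, "Hello", sleep=waits.append)
print(model, reply, waits)
```

Collecting the error from each model before raising is what makes the final "error with details" possible. A fuller version would also stop immediately on authentication errors instead of moving down the chain.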

| Trigger | Condition | Action |
|---------|-----------|--------|
| API Unavailable | Connection timeout | Fallback |
| Rate Limit | 429 response | Fallback + wait |
| Slow Response | > timeout seconds | Fallback |
| Invalid Response | Parse error | Fallback |
| Auth Error | 401/403 response | Log + stop |
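
The trigger table maps naturally onto a small classification function. The sketch below is one possible encoding; the `Action` enum, the `classify_failure` name, and its parameters are all assumptions introduced for illustration.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    FALLBACK = "fallback"
    FALLBACK_AND_WAIT = "fallback + wait"
    STOP = "log + stop"


def classify_failure(
    status: Optional[int] = None,
    timed_out: bool = False,
    parse_error: bool = False,
) -> Optional[Action]:
    """Map a failed call onto the action from the trigger table.

    Returns None when no trigger matches, i.e. the call is not a failure
    this table knows about.
    """
    if status in (401, 403):
        return Action.STOP  # auth errors will not be fixed by switching models
    if status == 429:
        return Action.FALLBACK_AND_WAIT
    if timed_out or parse_error:
        # Covers connection timeouts, slow responses, and invalid responses.
        return Action.FALLBACK
    return None


print(classify_failure(status=429))
```

Auth errors are checked first because the table treats them as terminal: a bad credential is a configuration problem, not a provider outage.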

[2026-02-27 14:00:00] [INFO] Primary model MiniMax-M2.5 called
[2026-02-27 14:00:05] [WARN] Model failed: rate limit exceeded
[2026-02-27 14:00:05] [INFO] Falling back to Kimi K2.5
[2026-02-27 14:00:10] [INFO] Fallback successful
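
For tooling that needs to write or read lines in this format, a sketch along the following lines may help. It assumes the format is exactly `[timestamp] [LEVEL] message` as in the sample above; `format_line` and `parse_line` are hypothetical helpers.

```python
import re
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] \[(?P<level>[A-Z]+)\] (?P<message>.*)$"
)


def format_line(when: datetime, level: str, message: str) -> str:
    """Render one log line in the bracketed format shown above."""
    return f"[{when.strftime(TIMESTAMP_FORMAT)}] [{level}] {message}"


def parse_line(line: str) -> tuple[datetime, str, str]:
    """Split a log line back into (timestamp, level, message)."""
    match = LOG_LINE.match(line)
    if match is None:
        raise ValueError(f"not a recognised log line: {line!r}")
    when = datetime.strptime(match["ts"], TIMESTAMP_FORMAT)
    return when, match["level"], match["message"]


line = format_line(datetime(2026, 2, 27, 14, 0, 5), "WARN", "Model failed: rate limit exceeded")
print(line)
print(parse_line(line))
```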

{
  "task_routing": {
    "simple_query": ["glm-4-air", "glm-4-flash"],
    "complex_reasoning": ["MiniMax-M2.5", "kimi-k2.5"],
    "long_context": ["kimi-k2.5", "MiniMax-M2.1"]
  }
}
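
A routing table like this could be consulted as follows. The sketch assumes each list is ordered by preference and that the caller already knows the task type; `pick_model` and the `unavailable` parameter are hypothetical names for this example.

```python
from collections.abc import Collection

TASK_ROUTING = {
    "simple_query": ["glm-4-air", "glm-4-flash"],
    "complex_reasoning": ["MiniMax-M2.5", "kimi-k2.5"],
    "long_context": ["kimi-k2.5", "MiniMax-M2.1"],
}


def pick_model(task_type: str, unavailable: Collection[str] = ()) -> str:
    """Return the first preferred model for the task that is not marked unavailable."""
    for model in TASK_ROUTING.get(task_type, []):
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for task type {task_type!r}")


print(pick_model("simple_query"))
print(pick_model("simple_query", unavailable={"glm-4-air"}))
```

Keeping the cheaper model first in `simple_query` is what yields the cost saving: the more expensive models are only reached when the preferred one is unavailable.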

{
  "models": {
    "mode": "merge",
    "fallback": {
      "enabled": true,
      "config": "~/.openclaw/skills/model-fallback/config.json"
    }
  }
}

# Check model health
curl http://localhost:18789/api/models/health

User: "Hello"
System: Using MiniMax-M2.5...
System: Rate limited, switching to Kimi K2.5...
System: Response from Kimi K2.5: "Hello! How can I help?"

User: "What is 2+2?"
System: Routing to glm-4-air (low cost)...
System: Response: "2+2=4"

User: "Summarize this 100-page PDF"
System: Detected long context requirement
System: Routing to Kimi K2.5 (256K context)...
System: Processing...

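| Provider | Model | Context | Use Case |
|----------|-------|---------|----------|
| MiniMax | M2.5 | 200K | Primary (reasoning) |
| MiniMax | M2.1 | 200K | Backup |
| Kimi | K2.5 | 256K | Long documents |
| Kimi | K2 | 128K | Standard |
| Zhipu | GLM-4-Air | 128K | Low cost |
| Zhipu | GLM-4-Flash | 1M | High volume |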
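
The context column can also drive model selection directly. The sketch below copies the window sizes from the table, treating 1K as 1,000 tokens (an assumption), and lists the models that can hold a request of a given size, smallest window first. The dictionary keys are display names built from the table, not confirmed model identifiers.

```python
# Context windows from the table above; 1K is assumed to mean 1,000 tokens.
CONTEXT_WINDOWS = {
    "MiniMax M2.5": 200_000,
    "MiniMax M2.1": 200_000,
    "Kimi K2.5": 256_000,
    "Kimi K2": 128_000,
    "Zhipu GLM-4-Air": 128_000,
    "Zhipu GLM-4-Flash": 1_000_000,
}


def models_that_fit(required_tokens: int) -> list[str]:
    """Return models whose window covers the request, smallest window first."""
    fitting = [
        (window, name)
        for name, window in CONTEXT_WINDOWS.items()
        if window >= required_tokens
    ]
    return [name for window, name in sorted(fitting)]


print(models_that_fit(220_000))
```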

| Variable | Required | Description |
|----------|----------|-------------|
| `MODEL_FALLBACK_ENABLED` | No | Enable/disable fallback (default: true) |
| `MODEL_FALLBACK_LOG_LEVEL` | No | Log level: debug, info, warn, error |
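
A sketch of how these two variables might be read is shown below. Several details are assumptions, since the table does not specify them: which spellings count as "disabled", that values are case-insensitive, and that the log level defaults to `info` when unset. `read_settings` is a hypothetical helper.

```python
import os
from collections.abc import Mapping

VALID_LEVELS = ("debug", "info", "warn", "error")
# Assumption: these spellings all mean "disabled".
FALSE_VALUES = ("false", "0", "no", "off")


def read_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Read the fallback settings from an environment-like mapping."""
    raw_enabled = environ.get("MODEL_FALLBACK_ENABLED", "true").strip().lower()
    # Assumption: "info" is used when no level is set.
    level = environ.get("MODEL_FALLBACK_LOG_LEVEL", "info").strip().lower()
    if level not in VALID_LEVELS:
        raise ValueError(f"MODEL_FALLBACK_LOG_LEVEL must be one of {VALID_LEVELS}, got {level!r}")
    return {"enabled": raw_enabled not in FALSE_VALUES, "log_level": level}


print(read_settings({"MODEL_FALLBACK_ENABLED": "false", "MODEL_FALLBACK_LOG_LEVEL": "DEBUG"}))
```

Passing the environment as a mapping, rather than reading `os.environ` inside the function, keeps the helper easy to test.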

Model Fallback

Model Fallback Skill

Overview

Features

Supported Models

Configuration

Default Fallback Chain

Environment Variables

Usage

Basic Usage

Manual Fallback

Configuration

How It Works

Fallback Triggers

Logging

Log Format

Cost Optimization

Integration

OpenClaw Configuration

Health Check

Troubleshooting

Fallback Not Working

Models Always Failing

Examples

Example 1: Simple Fallback

Example 2: Cost Optimization

Example 3: Long Document

License

Author

Version
