Token-level cost analysis per model with tier optimization recommendations and budget projections
Cost Analysis provides detailed token-level cost breakdowns per model tier, identifies optimization opportunities to reduce cost without sacrificing quality, and projects future costs based on usage trends. It enables informed decisions about model tier selection, thinking budget allocation, and task routing to balance cost against capability. For Vetinari, where every agent invocation has a token cost, understanding and optimizing cost is essential for sustainable operation.
| Parameter |
|---|
| Type |
|---|
| Required |
|---|
| Description |
|---|
| task | string | Yes | What to analyze and the analysis objective |
| usage_data | list[dict] | No | Token usage records (model, input_tokens, output_tokens, mode) |
| model_pricing | dict | No | Per-model pricing (input_cost_per_1k, output_cost_per_1k) |
| time_range | dict | No | Analysis period: {start: "2025-01-01", end: "2025-01-31"} |
| budget | dict | No | Budget constraints (daily_limit, monthly_limit) |
| context | dict | No | System context (task types, model configuration) |
Usage data collection -- Gather token usage records from episode memory, execution logs, or provided usage data. For each invocation, record: model tier, input tokens, output tokens, mode, task type, and timestamp.
Per-model cost calculation -- Apply pricing to each invocation:
Cost distribution analysis -- Identify where costs concentrate:
Efficiency metrics -- Calculate efficiency ratios:
Tier optimization analysis -- For each task type, evaluate if a cheaper model tier could handle it:
Thinking budget analysis -- Analyze thinking budget usage:
Trend projection -- Based on historical data, project future costs:
Optimization recommendations -- Produce actionable recommendations:
Report assembly -- Compile the cost analysis report with: summary metrics, per-model breakdown, optimization opportunities, trend projections, and recommendations.
The skill produces a cost analysis report:
{
"success": true,
"output": {
"summary": {
"total_cost_usd": 12.45,
"period": "2025-01-01 to 2025-01-31",
"total_invocations": 342,
"avg_cost_per_task": 0.036
},
"by_model": {
"claude-opus": {"invocations": 45, "cost": 8.20, "pct": 65.9},
"claude-sonnet": {"invocations": 180, "cost": 3.50, "pct": 28.1},
"claude-haiku": {"invocations": 117, "cost": 0.75, "pct": 6.0}
},
"by_mode": {
"build": {"invocations": 89, "cost": 5.60},
"code_discovery": {"invocations": 120, "cost": 2.10},
"code_review": {"invocations": 65, "cost": 3.80},
"plan": {"invocations": 68, "cost": 0.95}
},
"optimization_opportunities": [
{
"opportunity": "Route code_discovery tasks to haiku tier",
"current_cost": 2.10,
"projected_cost": 0.45,
"savings_usd": 1.65,
"risk": "low -- discovery tasks are pattern-matching, not reasoning-intensive"
}
],
"projection": {
"next_month_estimate": 14.20,
"growth_rate": "14% month-over-month",
"budget_exhaustion": "Not at risk (budget: $50/month)"
}
},
"metadata": {
"data_source": "episode_memory",
"records_analyzed": 342
}
}
This skill is governed by the following standards from the skill registry:
Input: