Skill File

Cost Analysis

Name: Cost Analysis
Author: StrategicMilk

Token-level cost analysis per model with tier optimization recommendations and budget projections

StrategicMilk | 0 stars | Mar 19, 2026

Occupation
Categories: Machine Learning

Skill Content

Purpose

Cost Analysis provides detailed token-level cost breakdowns per model tier, identifies optimization opportunities to reduce cost without sacrificing quality, and projects future costs based on usage trends. It enables informed decisions about model tier selection, thinking budget allocation, and task routing to balance cost against capability. For Vetinari, where every agent invocation has a token cost, understanding and optimizing cost is essential for sustainable operation.

When to Use

After completing a plan, to compare the actual cost against the estimate
When evaluating whether a task can be routed to a cheaper model tier
When budgets are constrained and cost reduction is needed
During capacity planning to project monthly costs
When comparing the cost-effectiveness of different decomposition strategies
When the Foreman needs cost data to inform effort estimation
After a spike in costs, to identify the cause

Process Steps

Usage data collection -- Gather token usage records from episode memory, execution logs, or provided usage data. For each invocation, record: model tier, input tokens, output tokens, mode, task type, and timestamp.
Per-model cost calculation -- Apply pricing to each invocation:
- Input cost = (input_tokens / 1000) * input_cost_per_1k
- Output cost = (output_tokens / 1000) * output_cost_per_1k
- Total cost = input_cost + output_cost
- Aggregate by model tier, mode, and task type
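
The per-model calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not Vetinari's implementation: the record fields follow the usage_data input, the pricing keys follow model_pricing (input_cost_per_1k, output_cost_per_1k), and the tier names, prices, and function names are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical per-1k-token prices in USD; placeholders, not real rates.
MODEL_PRICING = {
    "capable": {"input_cost_per_1k": 0.015, "output_cost_per_1k": 0.075},
    "efficient": {"input_cost_per_1k": 0.001, "output_cost_per_1k": 0.005},
}


def invocation_cost(record, pricing):
    """Return (input_cost, output_cost, total_cost) in USD for one usage record."""
    price = pricing[record["model"]]
    input_cost = record["input_tokens"] / 1000 * price["input_cost_per_1k"]
    output_cost = record["output_tokens"] / 1000 * price["output_cost_per_1k"]
    return input_cost, output_cost, input_cost + output_cost


def aggregate_cost(records, pricing, key):
    """Sum total cost per distinct value of `key` (for example "model" or "mode")."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += invocation_cost(record, pricing)[2]
    return dict(totals)


# Made-up usage records for illustration only.
usage_data = [
    {"model": "capable", "mode": "build", "input_tokens": 2000, "output_tokens": 1000},
    {"model": "efficient", "mode": "research", "input_tokens": 4000, "output_tokens": 2000},
    {"model": "efficient", "mode": "build", "input_tokens": 1000, "output_tokens": 1000},
]

by_model = aggregate_cost(usage_data, MODEL_PRICING, "model")
by_mode = aggregate_cost(usage_data, MODEL_PRICING, "mode")
```

The same aggregate_cost helper can be called with "task_type" as the key when records carry that field.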
Cost distribution analysis -- Identify where costs concentrate:
- Which model tier accounts for the most cost?
- Which mode (build, research, review) is most expensive?
- Which task types have the highest per-task cost?
- What percentage of cost is input tokens vs output tokens?
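
One way to answer the concentration questions above is to convert aggregated costs into percentage shares and rank them. The sketch below is illustrative only; the function names are hypothetical, and the per-model figures are the ones used in this skill's example report.

```python
def cost_shares(cost_by_group):
    """Return each group's share of total cost as a percentage, rounded to 0.1."""
    total = sum(cost_by_group.values())
    if total == 0:
        return {name: 0.0 for name in cost_by_group}
    return {name: round(100 * cost / total, 1) for name, cost in cost_by_group.items()}


def rank_by_cost(cost_by_group):
    """Return group names ordered from most to least expensive."""
    return sorted(cost_by_group, key=cost_by_group.get, reverse=True)


# Per-model costs taken from the example report in this skill.
by_model = {"claude-opus": 8.20, "claude-sonnet": 3.50, "claude-haiku": 0.75}

shares = cost_shares(by_model)
ranking = rank_by_cost(by_model)
```

The same two helpers apply to costs grouped by mode or task type, and to an {"input": ..., "output": ...} split for the input versus output question.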
Efficiency metrics -- Calculate efficiency ratios:
- Cost per successful task completion
- Cost per line of code produced (for build tasks)
- Cost per review (for inspector tasks)
- Retry cost (wasted tokens on failed attempts)
- Token waste ratio (tokens spent on tasks that were later replanned)
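
Three of the ratios above can be computed from per-attempt records, as in the sketch below. The fields cost, tokens, succeeded, and replanned are assumptions made for this illustration; the source does not specify how attempts or replans are recorded.

```python
def efficiency_metrics(attempts):
    """Compute cost per success, retry cost, and token waste ratio.

    Each attempt is a dict with hypothetical fields:
    cost (USD), tokens (int), succeeded (bool), replanned (bool).
    """
    total_cost = sum(a["cost"] for a in attempts)
    total_tokens = sum(a["tokens"] for a in attempts)
    successes = sum(1 for a in attempts if a["succeeded"])
    # Retry cost: money spent on attempts that failed.
    retry_cost = sum(a["cost"] for a in attempts if not a["succeeded"])
    # Waste: tokens spent on work that was later replanned.
    replanned_tokens = sum(a["tokens"] for a in attempts if a["replanned"])
    return {
        "cost_per_success": total_cost / successes if successes else None,
        "retry_cost": retry_cost,
        "token_waste_ratio": replanned_tokens / total_tokens if total_tokens else 0.0,
    }


# Made-up attempts: task t1 failed once and then succeeded on retry.
attempts = [
    {"task_id": "t1", "cost": 0.10, "tokens": 4000, "succeeded": False, "replanned": False},
    {"task_id": "t1", "cost": 0.12, "tokens": 5000, "succeeded": True, "replanned": False},
    {"task_id": "t2", "cost": 0.08, "tokens": 3000, "succeeded": True, "replanned": True},
    {"task_id": "t3", "cost": 0.10, "tokens": 8000, "succeeded": True, "replanned": False},
]

metrics = efficiency_metrics(attempts)
```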
Tier optimization analysis -- For each task type, evaluate if a cheaper model tier could handle it:
- Research tasks: could the efficient tier replace the capable tier?
- Simple build tasks: could the thinking budget be reduced?
- Standard reviews: could a lighter review mode be used?
- Estimate cost savings for each optimization
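
A rough savings estimate for a tier downgrade can be obtained by re-pricing the same token counts at the cheaper tier's rates. The sketch below assumes token usage would stay unchanged after the downgrade, which may not hold in practice, and it says nothing about quality risk. The tier names, prices, and function name are hypothetical.

```python
def estimate_downgrade(records, pricing, target_tier):
    """Re-price the given records at `target_tier` and report the difference in USD."""

    def cost(record, tier):
        price = pricing[tier]
        return (record["input_tokens"] / 1000 * price["input_cost_per_1k"]
                + record["output_tokens"] / 1000 * price["output_cost_per_1k"])

    current = sum(cost(r, r["model"]) for r in records)
    projected = sum(cost(r, target_tier) for r in records)
    return {
        "current_cost": round(current, 4),
        "projected_cost": round(projected, 4),
        "savings_usd": round(current - projected, 4),
    }


# Hypothetical prices and usage records, for illustration only.
pricing = {
    "capable": {"input_cost_per_1k": 0.003, "output_cost_per_1k": 0.015},
    "efficient": {"input_cost_per_1k": 0.001, "output_cost_per_1k": 0.005},
}
research_records = [
    {"model": "capable", "input_tokens": 10000, "output_tokens": 2000},
    {"model": "capable", "input_tokens": 5000, "output_tokens": 1000},
]

estimate = estimate_downgrade(research_records, pricing, "efficient")
```

The returned keys mirror the current_cost, projected_cost, and savings_usd fields of an optimization opportunity in the report.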
Thinking budget analysis -- Analyze thinking budget usage:
- Distribution of thinking modes (low/medium/high/xhigh) by task type
- Correlation between thinking budget and task success rate
- Identify tasks where high thinking was used but low would suffice
- Identify tasks where low thinking led to failures (under-invested)
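
The comparison between thinking budget and success rate can be sketched by grouping records per task type and thinking mode. The fields task_type, thinking, and succeeded, the function names, and the 0.05 tolerance are assumptions for this example. With small samples the rates are noisy, so the output is a prompt for review rather than a verdict.

```python
from collections import defaultdict


def success_rate_by_thinking(records):
    """Return {(task_type, thinking): success_rate} from per-task records."""
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, total]
    for record in records:
        entry = counts[(record["task_type"], record["thinking"])]
        entry[0] += int(record["succeeded"])
        entry[1] += 1
    return {key: successes / total for key, (successes, total) in counts.items()}


def over_invested(rates, tolerance=0.05):
    """Task types where "low" thinking succeeds within `tolerance` of "high"."""
    task_types = {task_type for task_type, _ in rates}
    return sorted(
        t for t in task_types
        if (t, "low") in rates and (t, "high") in rates
        and rates[(t, "high")] - rates[(t, "low")] <= tolerance
    )


# Made-up records for illustration only.
records = [
    {"task_type": "build", "thinking": "low", "succeeded": True},
    {"task_type": "build", "thinking": "low", "succeeded": False},
    {"task_type": "build", "thinking": "high", "succeeded": True},
    {"task_type": "build", "thinking": "high", "succeeded": True},
    {"task_type": "research", "thinking": "low", "succeeded": True},
    {"task_type": "research", "thinking": "low", "succeeded": True},
    {"task_type": "research", "thinking": "high", "succeeded": True},
    {"task_type": "research", "thinking": "high", "succeeded": True},
]

rates = success_rate_by_thinking(records)
candidates = over_invested(rates)
```

Here build tasks fail half the time at low thinking, so only research is flagged as a candidate for a lower budget.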
Trend projection -- Based on historical data, project future costs:
- Daily/weekly/monthly cost trends
- Growth rate of token usage
- Projected cost for the next period
- When the budget will be exhausted at current rate
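
A simple projection can be sketched with compound growth. This is one possible interpretation, not a prescribed method: growth is taken as the geometric mean of period-over-period change, and budget exhaustion is read as the first future period whose projected cost exceeds a per-period budget. All function names are hypothetical.

```python
def growth_rate(period_costs):
    """Average period-over-period growth rate (geometric mean) of a cost series."""
    if len(period_costs) < 2 or period_costs[0] <= 0:
        raise ValueError("need at least two periods and a positive first cost")
    periods = len(period_costs) - 1
    return (period_costs[-1] / period_costs[0]) ** (1 / periods) - 1


def project_next(last_cost, rate):
    """Projected cost of the next period at a constant growth rate."""
    return last_cost * (1 + rate)


def periods_until_over_budget(last_cost, rate, budget_per_period, max_periods=120):
    """First future period whose projected cost exceeds the budget, or None."""
    cost = last_cost
    for n in range(1, max_periods + 1):
        cost *= 1 + rate
        if cost > budget_per_period:
            return n
    return None


# Made-up monthly history showing 10% growth per month.
rate = growth_rate([100.0, 110.0, 121.0])

# Figures from this skill's example report: $12.45 last month, 14% growth, $50 budget.
next_month = project_next(12.45, 0.14)
months_left = periods_until_over_budget(12.45, 0.14, 50.0)
```

With the example report's figures, the projected monthly cost first exceeds the $50 budget in the eleventh month, which is consistent with the report marking the budget as not at risk in the near term.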
Optimization recommendations -- Produce actionable recommendations:
- Model tier downgrade opportunities with risk assessment
- Thinking budget adjustments with expected savings
- Task batching opportunities (reduce per-invocation overhead)
- Caching opportunities (avoid re-computing similar tasks)
Report assembly -- Compile the cost analysis report with: summary metrics, per-model breakdown, optimization opportunities, trend projections, and recommendations.

Output Format

{
  "success": true,
  "output": {
    "summary": {
      "total_cost_usd": 12.45,
      "period": "2025-01-01 to 2025-01-31",
      "total_invocations": 342,
      "avg_cost_per_task": 0.036
    },
    "by_model": {
      "claude-opus": {"invocations": 45, "cost": 8.20, "pct": 65.9},
      "claude-sonnet": {"invocations": 180, "cost": 3.50, "pct": 28.1},
      "claude-haiku": {"invocations": 117, "cost": 0.75, "pct": 6.0}
    },
    "by_mode": {
      "build": {"invocations": 89, "cost": 5.60},
      "code_discovery": {"invocations": 120, "cost": 2.10},
      "code_review": {"invocations": 65, "cost": 3.80},
      "plan": {"invocations": 68, "cost": 0.95}
    },
    "optimization_opportunities": [
      {
        "opportunity": "Route code_discovery tasks to haiku tier",
        "current_cost": 2.10,
        "projected_cost": 0.45,
        "savings_usd": 1.65,
        "risk": "low -- discovery tasks are pattern-matching, not reasoning-intensive"
      }
    ],
    "projection": {
      "next_month_estimate": 14.20,
      "growth_rate": "14% month-over-month",
      "budget_exhaustion": "Not at risk (budget: $50/month)"
    }
  },
  "metadata": {
    "data_source": "episode_memory",
    "records_analyzed": 342
  }
}
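
Before a report in this shape is handed on, its totals can be cross-checked. The sketch below is a hypothetical helper, not part of the skill: it verifies that the by_model and by_mode sections each add up to the summary totals and that every savings figure equals current cost minus projected cost. The sample data repeats the numbers from the report above.

```python
import math


def check_report(output, tolerance=0.01):
    """Return a list of inconsistencies found in a cost report's `output` object."""
    problems = []
    total_cost = output["summary"]["total_cost_usd"]
    total_invocations = output["summary"]["total_invocations"]
    for section in ("by_model", "by_mode"):
        groups = list(output[section].values())
        cost_sum = sum(group["cost"] for group in groups)
        invocation_sum = sum(group["invocations"] for group in groups)
        if not math.isclose(cost_sum, total_cost, abs_tol=tolerance):
            problems.append(f"{section}: costs sum to {cost_sum:.2f}, expected {total_cost:.2f}")
        if invocation_sum != total_invocations:
            problems.append(f"{section}: {invocation_sum} invocations, expected {total_invocations}")
    for item in output["optimization_opportunities"]:
        expected = item["current_cost"] - item["projected_cost"]
        if not math.isclose(expected, item["savings_usd"], abs_tol=tolerance):
            problems.append(f"savings mismatch for: {item['opportunity']}")
    return problems


report_output = {
    "summary": {"total_cost_usd": 12.45, "total_invocations": 342},
    "by_model": {
        "claude-opus": {"invocations": 45, "cost": 8.20},
        "claude-sonnet": {"invocations": 180, "cost": 3.50},
        "claude-haiku": {"invocations": 117, "cost": 0.75},
    },
    "by_mode": {
        "build": {"invocations": 89, "cost": 5.60},
        "code_discovery": {"invocations": 120, "cost": 2.10},
        "code_review": {"invocations": 65, "cost": 3.80},
        "plan": {"invocations": 68, "cost": 0.95},
    },
    "optimization_opportunities": [
        {"opportunity": "Route code_discovery tasks to haiku tier",
         "current_cost": 2.10, "projected_cost": 0.45, "savings_usd": 1.65},
    ],
}

problems = check_report(report_output)
```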

Inputs

Parameter	Type	Required	Description
task	string	Yes	What to analyze and the analysis objective
usage_data	list[dict]	No	Token usage records (model, input_tokens, output_tokens, mode)
model_pricing	dict	No	Per-model pricing (input_cost_per_1k, output_cost_per_1k)
time_range	dict	No	Analysis period: {start: "2025-01-01", end: "2025-01-31"}
budget	dict	No	Budget constraints (daily_limit, monthly_limit)
context	dict	No	System context (task types, model configuration)
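
A caller might check a request against the parameters above before running the analysis. The sketch below is a hypothetical validator: the parameter names and types come from the table, while the function name, the error messages, and the sample request are illustrative assumptions.

```python
# Optional parameters and the Python types they are expected to have.
OPTIONAL_TYPES = {
    "usage_data": list,
    "model_pricing": dict,
    "time_range": dict,
    "budget": dict,
    "context": dict,
}


def validate_inputs(inputs):
    """Return a list of error messages; an empty list means the inputs look valid."""
    errors = []
    task = inputs.get("task")
    if not isinstance(task, str) or not task.strip():
        errors.append("task is required and must be a non-empty string")
    for name, expected_type in OPTIONAL_TYPES.items():
        if name in inputs and not isinstance(inputs[name], expected_type):
            errors.append(f"{name} must be a {expected_type.__name__}")
    return errors


# Sample request; the values are made up for illustration.
request = {
    "task": "Review January costs and suggest tier optimizations",
    "time_range": {"start": "2025-01-01", "end": "2025-01-31"},
    "budget": {"monthly_limit": 50},
}

errors = validate_inputs(request)
```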
