Skill file

Model Cost Compare

Name: Model Cost Compare
Author: mergisi

Trigger when the user asks which model to use, wants to compare model costs, says "what's cheapest for this task", "should I use Opus or Sonnet", "can a smaller model handle this", or "/model-cost-compare". Estimates token cost across Opus 4.6, Sonnet 4.6, GLM-5.1, Minimax M2.7, and local Gemma 4, then recommends the cheapest model capable of the task.

mergisi, 2,956 stars, April 13, 2026

Category: Machine Learning

Skill content

Given a task description (and optionally a rough prompt / input size), estimate the cost of running it on each available model tier and recommend the cheapest one that can actually do the job.

When to use

- "Which model should I use for X?"
- "Is it worth running this on Opus or will Sonnet do?"
- "Can I offload this to a local model?"
- "/model-cost-compare — classify 10k support tickets"

Pricing table (indicative — always flag as "check provider docs")

Use these rough figures. They are not exact; confirm before quoting real numbers to the user.

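| Model | Tier | Input ($/1M tok) | Output ($/1M tok) | Context | Strengths |
|---------------|----------|-----:|-----:|----|---------------------------------------|
| Opus 4.6 (1M) | Frontier | ~$15 | ~$75 | 1M | Agentic, long-context, hard reasoning |
| Sonnet 4.6 | Mid | ~$3 | | | |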

Instructions

1. Parse the user's task. Extract:
   - Task type: reasoning, extraction, classification, drafting, translation, agentic tool use, long-context synthesis.
   - Input size estimate: in tokens. If the user says "10k tickets averaging 500 tokens", that's 5M input tokens. If unknown, ask for a rough size.
   - Output size estimate: short label? full essay? JSON record?
   - Volume: one-off or batch?
2. Rule out incapable models. Use this capability floor:
   - Agentic multi-tool flows with long reasoning → Opus or Sonnet only.
   - Structured extraction / classification with clear schema → any tier, including Gemma 4 local.
   - Long-context synthesis (>400k tokens) → Opus only.
   - Privacy-sensitive data that cannot leave the machine → Gemma 4 local only.
3. For each surviving model, compute:
   ```
   cost = (input_tokens / 1_000_000) * input_price
        + (output_tokens / 1_000_000) * output_price
   ```
   Multiply by volume. Show your arithmetic so the user can sanity-check.
4. Print the comparison as a Markdown table sorted cheapest first. Bold the recommended row.
5. End with a one-line recommendation: `Recommended: <model> — <1-sentence reason>`.
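
The capability floor and the cost formula above can be sketched in Python roughly as follows. This is a minimal illustration, not part of the skill: the function names `surviving_models` and `estimate_cost`, the task-type strings, and the `context_tokens` parameter (read here as the size of a single request, not the batch total) are all assumptions. The only prices used are the indicative Opus 4.6 figures from the pricing table.

```python
ALL_MODELS = ["Opus 4.6", "Sonnet 4.6", "GLM-5.1", "Minimax M2.7", "Gemma 4 (local)"]


def surviving_models(task_type, context_tokens=0, must_stay_local=False):
    """Apply the capability floor. Task-type strings are illustrative labels."""
    candidates = list(ALL_MODELS)
    if must_stay_local:
        # Privacy-sensitive data cannot leave the machine: local model only.
        candidates = [m for m in candidates if m == "Gemma 4 (local)"]
    if task_type == "agentic":
        # Agentic multi-tool flows with long reasoning: Opus or Sonnet only.
        candidates = [m for m in candidates if m in ("Opus 4.6", "Sonnet 4.6")]
    if context_tokens > 400_000:
        # Assumption: the >400k threshold applies to one request's context.
        candidates = [m for m in candidates if m == "Opus 4.6"]
    # An empty list means no single tier satisfies every constraint.
    return candidates


def estimate_cost(input_tokens, output_tokens, input_price, output_price, volume=1):
    """Cost in dollars; prices are per 1M tokens, token counts are per run."""
    per_run = (input_tokens / 1_000_000) * input_price \
            + (output_tokens / 1_000_000) * output_price
    return per_run * volume


# Indicative Opus 4.6 prices from the pricing table (~$15 in, ~$75 out).
total = estimate_cost(4_000_000, 200_000, input_price=15, output_price=75)
print(f"Opus 4.6: 4 x $15 + 0.2 x $75 = ${total:.2f}")
```

Taking prices as arguments rather than hard-coding them keeps the sketch in line with the "check provider docs" caveat: the caller supplies whatever figures have been confirmed.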

Output example

Total tokens: 4M input, 200k output

| Model       | Input cost | Output cost | Total   | Capable? |
|-------------|-----------:|------------:|--------:|---------:|
| **Gemma 4** |     $0.00  |      $0.00  |  $0.00  |   yes    |
| Minimax M2.7|     $1.60  |      $0.36  |  $1.96  |   yes    |
| GLM-5.1     |     $2.40  |      $0.44  |  $2.84  |   yes    |
| Sonnet 4.6  |    $12.00  |      $3.00  | $15.00  |   yes    |
| Opus 4.6    |    $60.00  |     $15.00  | $75.00  |   overkill |

Recommended: Gemma 4 local — classification with a fixed 5-label schema is trivial for on-device models and costs nothing.
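
One way to produce a table in this shape is sketched below. The helper name `render_comparison` and the tuple layout are hypothetical; the cost figures are copied from the example table above rather than computed from live prices, and the "recommended" rule (cheapest row marked `yes`) is an assumption about how the bolded row is chosen.

```python
def render_comparison(rows):
    """rows: list of (model, input_cost, output_cost, capable) tuples."""
    # Sort cheapest first by total cost.
    ranked = sorted(rows, key=lambda r: r[1] + r[2])
    # Assumed rule: recommend the cheapest row whose capability flag is "yes".
    recommended = next((r[0] for r in ranked if r[3] == "yes"), None)
    lines = [
        "| Model | Input cost | Output cost | Total | Capable? |",
        "|---|---:|---:|---:|---:|",
    ]
    for model, cost_in, cost_out, capable in ranked:
        name = f"**{model}**" if model == recommended else model
        total = cost_in + cost_out
        lines.append(
            f"| {name} | ${cost_in:.2f} | ${cost_out:.2f} | ${total:.2f} | {capable} |"
        )
    return "\n".join(lines)


# Figures taken from the example table (4M input, 200k output tokens).
rows = [
    ("Opus 4.6", 60.00, 15.00, "overkill"),
    ("Sonnet 4.6", 12.00, 3.00, "yes"),
    ("GLM-5.1", 2.40, 0.44, "yes"),
    ("Minimax M2.7", 1.60, 0.36, "yes"),
    ("Gemma 4", 0.00, 0.00, "yes"),
]
table = render_comparison(rows)
print(table)
```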
