Token Saver — Claude Code Cost Optimization

Overview

A routing and habits guide to cut Claude API costs. The core idea: match the tool to the task — use free/cheap tools for simple work, and reserve expensive models for genuinely hard problems.

Quick Reference

FREE    → Ollama  (commits, summaries, translations, explanations)
CHEAP   → Haiku   (quick lookups, simple one-liners)
DEFAULT → Sonnet  (daily coding, features, bugs)
DEEP    → Opus    (architecture, security, complex refactors only)
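
The routing table above can be expressed as a small lookup. This is only a sketch: the function name `pick_tier` and the task category keywords are hypothetical labels chosen for illustration, not part of any tool.

```shell
# Hypothetical helper: map a task category to the tier in the quick reference.
# The category keywords are assumptions; adapt them to your own workflow.
pick_tier() {
  case "$1" in
    commit|summary|translation|explanation) echo "ollama" ;;  # FREE
    lookup|one-liner)                       echo "haiku"  ;;  # CHEAP
    architecture|security|refactor)         echo "opus"   ;;  # DEEP
    *)                                      echo "sonnet" ;;  # DEFAULT
  esac
}
```

Anything that does not match a known category falls through to Sonnet, which mirrors its role as the default tier.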

Tier 0 — Ollama (FREE, runs locally)

Install once, zero API cost forever.

Setup:

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2:3b

Task	Command
Commit message	`ollama run llama3.2:3b "write a conventional commit for: <diff>"`
Summarize text	`ollama run llama3.2:3b "summarize: <text>"`
Translate PT↔EN	`ollama run llama3.2:3b "translate to english: <text>"`
Explain a function	`ollama run llama3.2:3b "explain this code: <snippet>"`
PR description	`ollama run llama3.2:3b "write a PR description for: <summary>"`
Classify text	`ollama run llama3.2:3b "classify as bug/feature/chore: <text>"`
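
The commands above take their input inline, whereas a commit message usually starts from a git diff. A minimal sketch of wiring the two together, assuming a hypothetical helper `build_commit_prompt` and an arbitrary 8000-byte cap on the diff (neither comes from Ollama or from this guide):

```shell
# Hypothetical helper (not part of Ollama): read a diff on stdin and emit
# the commit-message prompt from the table above.
# MAX_BYTES is an assumed cap that keeps the prompt small for a 3B model.
build_commit_prompt() {
  printf 'write a conventional commit for:\n'
  head -c "${MAX_BYTES:-8000}"
}

# Usage (needs a git repo with staged changes and the model already pulled):
#   ollama run llama3.2:3b "$(git diff --staged | build_commit_prompt)"
```

Capping the diff is a design choice rather than a requirement: very large diffs tend to produce vague messages from a small local model, so truncating keeps the prompt focused.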

Tier 1 — Haiku (Cheapest Claude)

/model claude-haiku-4-5-20251001

Tier 2 — Sonnet (Default)

/model claude-sonnet-4-6

Tier 3 — Opus (Use Sparingly)

/model claude-opus-4-6

Read Code Surgically

Instead of	Do this
"Read this file and find the auth logic"	"Find the `authenticate` function in `auth.service.ts`"
"Look at the whole component"	"Show me just the `useEffect` hooks in `Dashboard.tsx`"
Opening a 500-line file	Grep for the symbol first, then read only that block
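
The "grep first, then read only that block" habit can be sketched as a one-line wrapper around `grep`. The function name `show_symbol` and the default of 20 trailing context lines are assumptions made for illustration.

```shell
# Hypothetical helper: print only the lines around a symbol, not the whole file.
# Args: file, symbol, optional count of trailing context lines (assumed default: 20).
# -n prefixes line numbers, -F treats the symbol as a literal string,
# -A prints that many lines after each match.
show_symbol() {
  grep -n -F -A "${3:-20}" -- "$2" "$1"
}

# Usage:
#   show_symbol auth.service.ts authenticate 30
```

The line numbers in the output make it easy to ask for a precise follow-up range instead of the full file.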

