Skill file

Token Optimization

Name: Token Optimization
Author: cwinvestments

Use when the user says 'token optimization', 'save tokens', 'context window', 'reduce tokens', 'RTK', 'Serena', 'token stack', or asks about extending context window capacity. Covers the 3-layer token optimization stack: Headroom (API compression), RTK (CLI output compression), and Serena (LSP-backed code navigation). Do NOT use for Headroom-only troubleshooting; use the Compress skill for that.

cwinvestments · 283 stars · 2026/04/01

Occupation
Category: Containers

Skill content

Token Optimization Guide — Full Stack Setup

Three complementary tools that reduce token consumption by 50-80% across different layers of the Claude Code pipeline.

Activation

When this skill activates, output:

Token Optimization Guide — Configuring the 3-layer token stack...

Then execute the protocol below.

Context Guard

Context	Status
User asks about token savings, context optimization	ACTIVE — full guide
User says "RTK", "Serena", "token stack"	ACTIVE — relevant section
User wants to install or configure any layer	ACTIVE — install steps
User asks about context window limits	ACTIVE — explain stack

How They Stack

                    Claude Code Context Window
                    ==========================

  Layer 3: Serena (MCP)         Prevents token waste at the SOURCE
  ─────────────────────         Instead of reading entire files,
                                 use LSP to fetch only the symbols
                                 and references you need.
                                 Savings: variable (avoids 1000s of
                                 tokens per file read)
                                        │
                                        ▼
  Layer 2: RTK (CLI proxy)      Compresses tool OUTPUT
  ────────────────────────      git diff, npm install, build logs —
                                 all compressed 60-90% before they
                                 enter the context window.
                                        │
                                        ▼
  Layer 1: Headroom (API proxy) Compresses API TRAFFIC
  ──────────────────────────    Compresses the full conversation
                                 payload between CC and the Anthropic
                                 API. ~34% reduction on wire traffic.
                                        │
                                        ▼
                              Anthropic API

Layer 1: Headroom (API Compression)

Install

pip install "headroom-ai[code]"

Run

headroom proxy --llmlingua-device cpu --port 8787

# Windows (cmd)
set ANTHROPIC_BASE_URL=http://127.0.0.1:8787
claude

# macOS/Linux
ANTHROPIC_BASE_URL=http://127.0.0.1:8787 claude

Verify

# Health check
curl http://127.0.0.1:8787/health

# Token savings stats
curl http://127.0.0.1:8787/stats

Troubleshooting

Issue	Fix
Compression at 0%	Install with `[code]` extra: `pip install headroom-ai[code]`
Proxy not reachable	Check `curl http://127.0.0.1:8787/health` — restart if needed
API errors in CC	Headroom may have crashed — unset `ANTHROPIC_BASE_URL` to bypass
Slow first request	Model weights downloading (~500MB) — one-time cost
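
The bypass advice in the table can be scripted so that a crashed proxy does not block a session. The following is a minimal sketch, not part of Headroom: `headroom_up` and `pick_base_url` are hypothetical helper names, and it assumes `curl` is installed and that the proxy exposes the `/health` endpoint used above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Headroom).

# Succeeds only if the proxy's /health endpoint answers within 2 seconds.
headroom_up() {
  curl -fsS --max-time 2 "$1/health" >/dev/null 2>&1
}

# Prints the proxy URL when it is healthy, or an empty line to signal
# "go direct" (the caller then leaves ANTHROPIC_BASE_URL unset).
pick_base_url() {
  url="${1:-http://127.0.0.1:8787}"
  if headroom_up "$url"; then
    echo "$url"
  else
    echo ""
  fi
}
```

A caller could store the result with `url=$(pick_base_url)`, then run `ANTHROPIC_BASE_URL="$url" claude` when `$url` is non-empty, or `unset ANTHROPIC_BASE_URL` followed by `claude` when it is empty.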

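Layer 2: RTK (CLI Output Compression)

Install

# Windows: download the prebuilt binary from GitHub releases
gh release download --repo rtk-ai/rtk --pattern "rtk-x86_64-pc-windows-msvc.zip" --dir /tmp
unzip /tmp/rtk-x86_64-pc-windows-msvc.zip -d /tmp/rtk-extract
mkdir -p ~/.local/bin
cp /tmp/rtk-extract/rtk.exe ~/.local/bin/rtk.exe

# macOS/Linux: install script
curl -fsSL https://raw.githubusercontent.com/rtk-ai/rtk/main/install.sh | sh

# Any platform with a Rust toolchain: build from source
cargo install --git https://github.com/rtk-ai/rtk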

Configure for Claude Code

# Global (all projects, recommended)
rtk init -g

# Per-project only
rtk init

Usage

rtk git status          # Compact status (62% savings)
rtk git diff            # Ultra-condensed diff
rtk git log             # Compact log
rtk npm install         # Filtered install output (70-90%)
rtk npm run build       # Compressed build output
rtk ls -la              # Token-optimized directory listing
rtk docker ps           # Compact container list
rtk kubectl get pods    # Compressed k8s output

Verify

rtk --version    # Should show version number
rtk gain         # Show cumulative token savings

Typical Savings

Command Category	Compression
Git (status, log, diff)	59-80%
GitHub CLI (pr, run, issue)	26-87%
Package managers (npm, pnpm)	70-90%
File operations (ls, read)	60-75%
Infrastructure (docker, k8s)	85%
Network (curl, wget)	65-70%
Average	60-90%
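
To make the percentages concrete, here is a small worked calculation. The 2,000-token input is a made-up figure for illustration, and `tokens_after` is a hypothetical helper, not an RTK command; `rtk gain` is what reports real savings.

```shell
#!/bin/sh
# tokens_after RAW PCT: tokens left after PCT percent compression (integer math).
tokens_after() {
  echo $(( $1 * (100 - $2) / 100 ))
}

tokens_after 2000 70   # prints 600
tokens_after 2000 90   # prints 200
```

Under those assumptions, an `npm install` log that would have cost 2,000 tokens enters the context window as 200 to 600 tokens at the 70-90% range listed for package managers.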

Layer 3: Serena (LSP Code Navigation)

Install & Configure

# Add to Claude Code as a global MCP server
claude mcp add --scope user serena -- \
  uvx --from git+https://github.com/oraios/serena \
  serena start-mcp-server \
  --context=claude-code \
  --project-from-cwd

Verify

claude mcp list 2>&1 | grep serena
# Should show: serena: ... ✓ Connected

Key Tools (28 total in claude-code context)

Tool	Purpose
`find_symbol`	Global symbol search via LSP (functions, classes, variables)
`find_referencing_symbols`	Find all references to a symbol across the codebase
`get_symbols_overview`	List top-level symbols in a file (like an IDE outline)
`rename_symbol`	Refactor-safe rename across the entire codebase
`replace_symbol_body`	Replace a function/class definition by name
`insert_before_symbol`	Insert code before a symbol definition
`insert_after_symbol`	Insert code after a symbol definition

Why LSP Matters for Tokens

Without Serena: read the entire 500-line file → find the one function → 3,000 tokens consumed

With Serena: find_symbol("handleAuth") → returns only that function → 200 tokens consumed
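
The two figures in this comparison work out to roughly a 93% reduction for that single lookup. A quick check of the arithmetic, using only the numbers from the example (`savings_pct` is a hypothetical helper name, and the result is rounded down by integer division):

```shell
#!/bin/sh
# savings_pct BEFORE AFTER: percentage of tokens avoided (integer division).
savings_pct() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

savings_pct 3000 200   # prints 93
```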

Quick Start Checklist

# 1. Headroom (API compression)
pip install "headroom-ai[code]"

# 2. RTK (CLI compression): Windows binary shown; on macOS/Linux use the install.sh script instead
gh release download --repo rtk-ai/rtk --pattern "rtk-x86_64-pc-windows-msvc.zip" --dir /tmp
unzip /tmp/rtk-x86_64-pc-windows-msvc.zip -d /tmp/rtk-extract
cp /tmp/rtk-extract/rtk.exe ~/.local/bin/rtk.exe
rtk init -g

# 3. Serena (LSP navigation)
pip install uv
claude mcp add --scope user serena -- \
  uvx --from git+https://github.com/oraios/serena \
  serena start-mcp-server \
  --context=claude-code \
  --project-from-cwd

# Terminal 1
headroom proxy --llmlingua-device cpu --port 8787

# Terminal 2 (Windows cmd shown; on macOS/Linux use: export ANTHROPIC_BASE_URL=http://127.0.0.1:8787)
set ANTHROPIC_BASE_URL=http://127.0.0.1:8787
claude

Verify All Layers

# Headroom
curl http://127.0.0.1:8787/health

# RTK
rtk --version
rtk gain

# Serena
claude mcp list 2>&1 | grep serena
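
The three checks can be wrapped into one status report. This is a sketch under stated assumptions: `report` and `check_stack` are hypothetical names, the commands inside `check_stack` are the ones listed above, and a layer is treated as ready simply when its command exits successfully.

```shell
#!/bin/sh
# report NAME CMD...: run CMD quietly and print a one-line status for NAME.
report() {
  name="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "$name: ok"
  else
    echo "$name: not ready"
  fi
}

# Run the verification command for each layer of the stack.
check_stack() {
  report "Headroom" curl -fsS --max-time 2 http://127.0.0.1:8787/health
  report "RTK" rtk --version
  report "Serena" sh -c 'claude mcp list 2>&1 | grep -q serena'
}
```

Calling `check_stack` prints one line per layer, which makes it easy to see which of the three still needs attention.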

Windows-Specific Notes

Component	Windows Behavior
Headroom	Use `set ANTHROPIC_BASE_URL=...` (not `export`)
RTK	Uses CLAUDE.md injection instead of CC hooks. Download `.zip` from releases, not the install script. Binary goes in `~/.local/bin/rtk.exe`
Serena	Works identically — `uv`/`uvx` handle Windows natively
PATH	Ensure `~/.local/bin` is in your PATH for RTK
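
For shells that read a POSIX-style profile (Git Bash on Windows, or bash/zsh elsewhere), the PATH requirement can be handled with an idempotent snippet. This is a sketch only: `ensure_local_bin` is a hypothetical function name, and which profile file it belongs in depends on your shell.

```shell
#!/bin/sh
# Prepend ~/.local/bin to PATH only if it is not already present.
ensure_local_bin() {
  case ":$PATH:" in
    *":$HOME/.local/bin:"*) ;;   # already on PATH, nothing to do
    *) PATH="$HOME/.local/bin:$PATH"; export PATH ;;
  esac
}

ensure_local_bin
```

Because of the `case` guard, sourcing the profile repeatedly does not keep growing PATH.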

Relationship to Other Skills

Skill	Scope	When to Use
Token Optimization (this)	Full 3-layer stack setup and reference	Installing, configuring, or understanding the optimization stack
Compress	Headroom-only troubleshooting	Proxy crashes, health checks, stats monitoring
Context DB	SQLite fact store	Reducing token waste from repeatedly reading project context

Typical Savings (Layer 1: Headroom)

Content Type	Compression
Code files	30-46%
Conversation text	25-35%
Tool output	30-40%
Average	~34%

Tool	Purpose
`find_file`	Find files by name/pattern
`read_file`	Read file contents
`search_for_pattern`	Regex search across project
`list_dir`	Directory listing

Tool	Purpose
`write_memory`	Store project facts for future sessions
`read_memory`	Retrieve stored project knowledge
`list_memories`	List all stored memory files

Tool	Purpose
`onboarding`	Auto-discover project structure
`activate_project`	Switch active project
`get_current_config`	Show current Serena configuration

Token Optimization

Token Optimization Guide — Full Stack Setup

Activation

Context Guard

How They Stack

Layer 1: Headroom (API Compression)

Prerequisites

Install

Run

Verify

Typical Savings

Troubleshooting

Layer 2: RTK (CLI Output Compression)

Prerequisites

Install

Configure for Claude Code

Usage

Verify

Typical Savings

Layer 3: Serena (LSP Code Navigation)

Prerequisites

Install & Configure

Verify

Key Tools (28 total in claude-code context)

Why LSP Matters for Tokens

Language Support

Quick Start Checklist

Verify All Layers

Windows-Specific Notes

Relationship to Other Skills
