Guides caching strategy selection and implementation across the full stack including HTTP caching, application-level caching (Redis, in-memory), frontend data caching (SWR, TanStack Query), LLM response caching (prompt caching, semantic caching), database query caching, cache invalidation patterns, and distributed cache architectures. Covers cache-aside, read-through, write-through, write-behind patterns, eviction policies (LRU/LFU), and agentic workflow caching considerations. Use when adding caching to an application, choosing a caching strategy, debugging stale data, optimizing API response times, reducing LLM costs, or designing distributed cache topologies. Triggers: cache, caching, Redis, CDN, TTL, cache invalidation, stale data, Cache-Control, ETag, SWR, TanStack Query, prompt caching, semantic cache, LRU, write-through, cache-aside, materialized view.
Expert guidance for caching strategy selection, implementation, and invalidation across HTTP, application, frontend, database, and LLM layers.
| Request type | Load reference |
|---|---|
| HTTP headers, CDN, Cache-Control, ETags, browser caching | references/http-and-cdn-caching.md |
| Redis, in-memory, LRU/LFU, eviction, two-tier cache | references/application-level-caching.md |
| SWR, TanStack Query, service workers, browser storage | references/frontend-data-caching.md |
| Prompt caching, semantic caching, LLM cost reduction, agentic caching | references/llm-and-agentic-caching.md |
| Materialized views, query caching, database read optimization | references/database-query-caching.md |
| TTL, event-driven, tag-based, versioned invalidation strategies | references/cache-invalidation.md |
Not everything should be cached. Cache data that is read frequently, expensive to compute, tolerant of brief staleness, and unlikely to change between reads. Caching mutable, low-read data adds complexity without benefit.
Decision test: If the data is read 10x more than it is written and a few seconds of staleness is acceptable, it is a strong caching candidate.
| Pattern | How it works | Best for |
|---|---|---|
| Cache-aside | App checks cache, on miss reads DB, writes to cache | General purpose; simple, widely understood |
| Read-through | Cache itself loads from DB on miss | Frameworks with cache-provider abstraction |
| Write-through | Writes go to cache and DB synchronously | Strong consistency requirements |
| Write-behind | Writes go to cache, async flush to DB | Write-heavy workloads; eventual consistency OK |
| Refresh-ahead | Cache proactively refreshes before expiry | Predictable access patterns; low-latency reads |
Default recommendation: Start with cache-aside. It is the simplest and most portable pattern, and it gives the application full control over caching behavior.
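For contrast with cache-aside, here is a minimal write-through sketch. Plain `Map`s stand in for the cache and the backing store; all names are illustrative, not from the source.

```typescript
// Write-through sketch: Maps stand in for the cache and the backing store.
const store = new Map<string, string>();
const cache = new Map<string, string>();

// Every write lands in both places synchronously, so a later read
// from the cache can never observe a value older than the store's.
function writeThrough(key: string, value: string): void {
  store.set(key, value);
  cache.set(key, value);
}

function read(key: string): string | undefined {
  return cache.get(key) ?? store.get(key);
}
```

The synchronous double write is what buys the consistency listed in the table; write-behind would instead queue the `store.set` for an async flush.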
| Policy | Evicts | Best when |
|---|---|---|
| LRU (Least Recently Used) | Least recently accessed entries | Recency predicts future access |
| LFU (Least Frequently Used) | Least frequently accessed entries | Popularity predicts future access (skewed workloads) |
| TTL (Time-To-Live) | Expired entries | Data has a known freshness window |
| Random | Random entries | Uniform access distribution |
Default recommendation: Use allkeys-lru for Redis. It handles the common case well, where a small hot set of keys receives most reads (a Pareto-like distribution). Add a TTL as a safety net on all entries.
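On the Redis side, this recommendation comes down to two settings; the values below are illustrative, and `maxmemory` should be sized to the instance:

```conf
# redis.conf -- illustrative values
maxmemory 2gb
maxmemory-policy allkeys-lru
```

`allkeys-lru` lets Redis evict any key under memory pressure, not just those with a TTL; the per-entry TTL then acts as the staleness backstop rather than the eviction mechanism.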
Apply caching at multiple layers, each serving a different purpose:
Browser Cache -> CDN Edge -> API Gateway -> App In-Memory -> Redis -> Database
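The application-side portion of that read path (in-process memory, then a shared cache, then the database) can be sketched as a single lookup function. `redisGet` and `dbGet` are hypothetical async stand-ins for real clients:

```typescript
// L1: process-local cache; fastest, but per-instance and lost on restart.
const local = new Map<string, string>();

async function layeredGet(
  key: string,
  redisGet: (k: string) => Promise<string | null>, // L2: shared cache
  dbGet: (k: string) => Promise<string | null>,    // L3: source of truth
): Promise<string | null> {
  const l1 = local.get(key);
  if (l1 !== undefined) return l1;

  const l2 = await redisGet(key);
  if (l2 !== null) {
    local.set(key, l2); // backfill L1 on an L2 hit
    return l2;
  }

  const value = await dbGet(key);
  if (value !== null) {
    local.set(key, value); // a fuller version would also populate L2 here
    return value;
  }
  return null;
}
```

Each layer backfills the one above it, so repeated reads get progressively cheaper.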
"There are only two hard things in Computer Science: cache invalidation and naming things." Start with TTL-based expiration and layer on event-driven invalidation for critical data paths. Never rely solely on manual invalidation.
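A minimal sketch of layering event-driven invalidation on TTL expiry, using Node's `EventEmitter` as a stand-in for a real pub/sub channel (Redis pub/sub, a message queue, etc.); all names here are illustrative:

```typescript
import { EventEmitter } from "node:events";

// Stand-ins: `bus` for a pub/sub channel, `cache` for Redis.
const bus = new EventEmitter();
const cache = new Map<string, { value: string; expiresAt: number }>();

function cacheSet(key: string, value: string, ttlMs: number): void {
  // TTL is the safety net: even a missed event cannot leave the entry stale forever.
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
}

function cacheGet(key: string): string | undefined {
  const entry = cache.get(key);
  if (!entry) return undefined;
  if (Date.now() > entry.expiresAt) {
    cache.delete(key); // lazy TTL expiry on read
    return undefined;
  }
  return entry.value;
}

// Event-driven layer: the write path announces changes, the cache evicts eagerly.
bus.on("user.updated", (userId: string) => cache.delete(`user:${userId}`));

cacheSet("user:42", '{"name":"Ada"}', 300_000);
bus.emit("user.updated", "42"); // emitted by whatever code updates the user
```

The event gives fast invalidation on the critical path; the TTL bounds staleness if an event is ever dropped.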
When an AI agent is implementing or reasoning about caching:
async function getUser(userId: string): Promise<User> {
// 1. Check cache
const cached = await redis.get(`user:${userId}`);
if (cached) return JSON.parse(cached);
// 2. Cache miss -- read from database
const user = await db.users.findById(userId);
if (!user) throw new NotFoundError('User not found');
// 3. Populate cache with TTL
await redis.set(`user:${userId}`, JSON.stringify(user), 'EX', 300); // 5 min TTL
return user;
}
async function updateUser(userId: string, data: Partial<User>): Promise<User> {
// 1. Update database (source of truth)
const user = await db.users.update(userId, data);
// 2. Invalidate cache
await redis.del(`user:${userId}`);
return user;
}
# Static assets (fingerprinted filenames)
Cache-Control: public, max-age=31536000, immutable
# HTML pages
Cache-Control: no-cache
ETag: "abc123"
# API responses (short-lived, revalidatable)
Cache-Control: private, max-age=0, must-revalidate
ETag: "v2-abc123"
# API responses (CDN-friendly)
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=60
# Sensitive data (never cache)
Cache-Control: no-store
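Rather than scattering these directives across handlers, they can be centralized in one helper; this mapping is a sketch that simply mirrors the header values above (the type and function names are assumptions):

```typescript
// Response classes from the header recipes above.
type ResponseClass =
  | "static-asset" // fingerprinted files
  | "html"         // revalidate via ETag
  | "api-private"  // per-user, always revalidate
  | "api-cdn"      // shared, short-lived, SWR-friendly
  | "sensitive";   // never cache

function cacheControlFor(kind: ResponseClass): string {
  switch (kind) {
    case "static-asset":
      return "public, max-age=31536000, immutable";
    case "html":
      return "no-cache";
    case "api-private":
      return "private, max-age=0, must-revalidate";
    case "api-cdn":
      return "public, max-age=60, s-maxage=300, stale-while-revalidate=60";
    case "sensitive":
      return "no-store";
  }
}
```

In a framework handler this would be applied as, e.g., `res.setHeader("Cache-Control", cacheControlFor("api-cdn"))`.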
| Data type | Suggested TTL | Rationale |
|---|---|---|
| Static assets (versioned) | 1 year (immutable) | Filename changes on content change |
| User profile | 5-15 minutes | Moderate change frequency |
| Product catalog | 1-5 minutes | Balances freshness and performance |
| Session data | Match session timeout | Must not outlive session |
| API rate limit counters | Window duration | Must be exact |
| LLM prompt cache | 5 min (Anthropic default) | Cost vs. freshness tradeoff |
| Search results | 30-60 seconds | High change frequency |
| Configuration/feature flags | 30-60 seconds | Must propagate quickly |
# Pattern: {entity}:{identifier}:{optional-variant}
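For illustration, a tiny helper that follows this pattern (the function name is an assumption, not from the source):

```typescript
// Builds keys of the form {entity}:{identifier}:{optional-variant}.
function cacheKey(entity: string, id: string, variant?: string): string {
  return variant ? `${entity}:${id}:${variant}` : `${entity}:${id}`;
}

// cacheKey("user", "42")            -> "user:42"
// cacheKey("user", "42", "profile") -> "user:42:profile"
```

Consistent key construction also makes tag- or prefix-based invalidation (e.g. deleting every `user:42:*` key) straightforward.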