Redis patterns, cache invalidation, CDN caching, service workers, stampede prevention, consistent hashing, Redis Cluster vs Sentinel
Caching is one of the most powerful performance levers available: a well-designed cache can cut API latency by 10-100x.
CACHE HIERARCHY
CPU L1 Cache (KB, ~1ns latency)
↓ (miss)
CPU L2 Cache (MB, ~10ns latency)
↓ (miss)
CPU L3 Cache (MB, ~40ns latency)
↓ (miss)
RAM (GB, ~100ns latency)
↓ (miss)
Redis (distributed memory, ~1-5ms network + lookup)
↓ (miss)
SSD/Database (ms-100ms latency)
↓ (miss)
Network/API call (100-1000ms latency)
For web applications, the important layers:
1. Client-side caching (browser, service worker): ~0ms
2. CDN/Edge: ~10-100ms
3. Application cache (Redis): ~1-5ms
4. Database (with indexes): ~10-50ms
5. Database (slow query): ~100-5000ms
STRATEGY: Cache at multiple levels to minimize slow queries
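The multi-level idea can be sketched in a few lines: a per-process dict in front of a shared store. This is a minimal illustration, with a plain dict standing in for Redis and a loader function standing in for the database; all names are made up for the sketch.

```python
import time

class TwoLevelCache:
    """Per-process cache (level 1) in front of a shared cache (level 2)."""
    def __init__(self, shared, local_ttl=5):
        self.local = {}          # level 1: in-process, ~0ms
        self.shared = shared     # level 2: dict stand-in for Redis, ~1-5ms
        self.local_ttl = local_ttl

    def get(self, key, loader):
        entry = self.local.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                       # level-1 hit
        value = self.shared.get(key)              # level-2 lookup
        if value is None:
            value = loader(key)                   # miss everywhere: hit the DB
            self.shared[key] = value
        self.local[key] = (value, time.time() + self.local_ttl)
        return value

shared = {}
cache = TwoLevelCache(shared)
calls = []

def load_from_db(key):                # stand-in for the slow database query
    calls.append(key)
    return f"row-for-{key}"

cache.get("user:123", load_from_db)   # miss at both levels: loader runs once
cache.get("user:123", load_from_db)   # level-1 hit: loader is not called again
```

Each level absorbs misses from the one above it, so the database only sees traffic that fell through every cache.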
CACHE-ASIDE (Lazy Loading):
GET user:123
1. Check Redis for key "user:123"
2. If exists (cache hit): Return immediately (~2ms)
3. If not exists (cache miss):
- Query database: SELECT * FROM users WHERE id = 123
- Store in Redis: SET user:123 <json_data> EX 3600
- Return data
4. Next request within 1 hour: Cache hit
Code (Python):
import json

def get_user(user_id):
    # Try cache first
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    # Cache miss: query database
    user = db.query(User).filter_by(id=user_id).first()
    if user is None:
        return None
    # Store in cache for 1 hour
    data = user.to_dict()
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))
    return data  # same shape as the cache-hit path
TRADEOFF:
Pros:
- Simple to implement
- Automatic expiration (TTL)
- Memory efficient (only cache what's accessed)
Cons:
- Stale data (old value returned until cache expires)
- Cache miss on first access (slower)
WRITE-THROUGH (Update Cache When Writing):
POST /users (create user)
1. Write to database: INSERT INTO users ...
2. Write to cache: SET user:123 <new_data> EX 3600
3. Return response
GET /users/123
1. Check Redis: cache hit in the common case (the entry was written at creation; it can still miss after eviction or TTL expiry)
2. Return immediately
TRADEOFF:
Pros:
- Cache always fresh (written when data changes)
- Consistent (cache and DB in sync)
Cons:
- Extra write on every change (slower creation/update)
- Data written before the cache existed still misses on first read
- Cache stores unused data (if object is never read again)
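The write-through flow above can be sketched with plain dicts standing in for Redis and the users table (create_user/get_user are illustrative names for this sketch, not a fixed API):

```python
import json

cache = {}   # stand-in for Redis
db = {}      # stand-in for the users table

def create_user(user_id, data):
    db[user_id] = data                              # 1. write the source of truth
    cache[f"user:{user_id}"] = json.dumps(data)     # 2. write through to the cache
    return data                                     # 3. respond

def get_user(user_id):
    cached = cache.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)                   # warm after every write
    return db.get(user_id)

create_user(123, {"name": "Ada"})
get_user(123)   # served from cache, no DB read
```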
WRITE-BEHIND (Write-Back: Cache First, Async Database Write):
POST /users (create user)
1. Write to cache immediately: SET user:123 <new_data> EX 3600
2. Queue database write job (async)
3. Return response immediately (fast!)
Background job:
1. Insert into database: INSERT INTO users ...
2. On failure, retry; if retries are exhausted, delete the cache entry so it stops serving unpersisted data
TRADEOFF:
Pros:
- Fastest writes (return immediately)
- Cache always fresh
Cons:
- Risk of data loss (if cache fails before DB write)
- Complexity (need reliable job queue)
- Eventual consistency (not immediately in DB)
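The write-behind flow can be sketched with a background worker draining a queue. Plain dicts stand in for Redis and the database, and an in-process queue.Queue stands in for the durable job queue a real system would need.

```python
import json
import queue
import threading

cache = {}                   # stand-in for Redis
db = {}                      # stand-in for the database
write_queue = queue.Queue()  # stand-in for a durable job queue

def create_user(user_id, data):
    cache[f"user:{user_id}"] = json.dumps(data)   # 1. write cache immediately
    write_queue.put((user_id, data))              # 2. queue the DB write
    return data                                   # 3. return right away

def db_writer():
    while True:
        item = write_queue.get()
        if item is None:                          # shutdown sentinel
            break
        user_id, data = item
        db[user_id] = data                        # background INSERT
        write_queue.task_done()

threading.Thread(target=db_writer, daemon=True).start()
create_user(123, {"name": "Ada"})
write_queue.join()   # demo only: real callers never wait for the flush
```

The data-loss risk is visible here: anything still sitting in write_queue when the process dies never reaches db, which is why the job queue must be durable.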
The hardest problem in caching is invalidating stale data.
STRATEGY 1: TTL (Time-To-Live)
Data expires after fixed time
SET user:123 <data> EX 3600 # 1 hour expiration
When to use:
- Data that changes infrequently (hourly or daily)
- Can tolerate staleness (up to 1 hour old is okay)
Tradeoff:
- Pro: Simple
- Con: Stale data for up to TTL duration
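Redis handles expiry for you with SET ... EX, but the mechanics are easy to sketch. The class below is an in-memory stand-in that mimics setex/get with lazy expiration, not full Redis semantics:

```python
import time

class TTLCache:
    """Dict-based stand-in for Redis string keys with TTL."""
    def __init__(self):
        self.store = {}

    def setex(self, key, ttl, value):      # mirrors redis-py setex(key, ttl, value)
        self.store[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self.store[key]            # lazy expiration on read
            return None
        return value

c = TTLCache()
c.setex("user:123", 3600, "<data>")
c.get("user:123")                    # fresh: returns "<data>"
c.setex("user:456", -1, "<data>")    # negative TTL only for the demo; Redis requires EX > 0
c.get("user:456")                    # already expired: returns None
```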
STRATEGY 2: Event-Driven Invalidation
Delete cache when data changes
POST /users/123 (update)
1. Update database
2. Delete cache: DEL user:123
3. Next GET will cache miss and reload
Code:
def update_user(user_id, changes):
    db.execute("UPDATE users SET ... WHERE id = ?", user_id)
    redis.delete(f"user:{user_id}")  # Invalidate cache
    return get_user(user_id)         # Reload from DB
When to use:
- Data that changes frequently
- Need fresh data immediately after update
Tradeoff:
- Pro: Fresh data always
- Con: Cache misses after every update (slower)
STRATEGY 3: Tag-Based Invalidation
Group related cache keys, invalidate by tag
SET user:123 <data>
SADD tag:company:456:users user:123
SADD tag:email:[email protected] user:123
When the user updates their email, look up every key under the tag and delete them:
SMEMBERS tag:email:[email protected]  # -> user:123, ...
DEL user:123 ...                           # delete each member, then the tag set itself
When to use:
- Complex data relationships (user → company → permissions)
- Need to invalidate multiple keys at once
Tools:
- Redis doesn't natively support tags
- Maintain the mapping manually: one Redis set (or sorted set) per tag, SADD on write, SMEMBERS + DEL on invalidation
- Or use a caching library/framework that implements tagging on top of this pattern
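The manual approach is a few lines: keep one set per tag listing its member keys. In this sketch plain dicts and Python sets stand in for the Redis key space and for SADD/SMEMBERS/DEL.

```python
cache = {}   # stand-in for the Redis key space
tags = {}    # tag -> set of cache keys (one Redis set per tag)

def set_with_tags(key, value, key_tags):
    cache[key] = value
    for tag in key_tags:
        tags.setdefault(tag, set()).add(key)   # SADD tag key

def invalidate_tag(tag):
    for key in tags.pop(tag, set()):           # SMEMBERS tag, then DEL the set
        cache.pop(key, None)                   # DEL each member key

set_with_tags("user:123", "<json>", ["company:456:users"])
set_with_tags("user:789", "<json>", ["company:456:users"])
invalidate_tag("company:456:users")   # both users' cache entries evicted at once
```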
STRATEGY 4: Versioned Keys
Keep multiple versions, switch when needed