Redis patterns, cache invalidation, CDN caching, service workers, stampede prevention, consistent hashing, Redis Cluster vs Sentinel
Caching is one of the most powerful performance levers available: a well-designed cache can cut API latency by 10-100x.
CACHE HIERARCHY
CPU L1 Cache (KB, ~1ns latency)
↓ (miss)
CPU L2 Cache (MB, ~10ns latency)
↓ (miss)
CPU L3 Cache (MB, ~40ns latency)
↓ (miss)
RAM (GB, ~100ns latency)
↓ (miss)
Redis (distributed memory, ~1-5ms network + lookup)
↓ (miss)
SSD/Database (ms-100ms latency)
↓ (miss)
Network/API call (100-1000ms latency)
For web applications, the important layers:
1. Client-side caching (browser, service worker): ~0ms
2. CDN/Edge: ~10-100ms
3. Application cache (Redis): ~1-5ms
4. Database (with indexes): ~10-50ms
5. Database (slow query): ~100-5000ms
STRATEGY: Cache at multiple levels to minimize slow queries
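The multi-level idea can be sketched in a few lines: a per-process dict in front of a shared store. This is a minimal illustration, with a plain dict standing in for Redis and a loader function standing in for the database; all names are made up for the sketch.

```python
import time

class TwoLevelCache:
    """Per-process cache (level 1) in front of a shared cache (level 2)."""
    def __init__(self, shared, local_ttl=5):
        self.local = {}          # level 1: in-process, ~0ms
        self.shared = shared     # level 2: dict stand-in for Redis, ~1-5ms
        self.local_ttl = local_ttl

    def get(self, key, loader):
        entry = self.local.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                       # level-1 hit
        value = self.shared.get(key)              # level-2 lookup
        if value is None:
            value = loader(key)                   # miss everywhere: hit the DB
            self.shared[key] = value
        self.local[key] = (value, time.time() + self.local_ttl)
        return value

shared = {}
cache = TwoLevelCache(shared)
calls = []

def load_from_db(key):                # stand-in for the slow database query
    calls.append(key)
    return f"row-for-{key}"

cache.get("user:123", load_from_db)   # miss at both levels: loader runs once
cache.get("user:123", load_from_db)   # level-1 hit: loader is not called again
```

Each level absorbs misses from the one above it, so the database only sees traffic that fell through every cache.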
CACHE-ASIDE (Lazy Loading):
GET user:123
1. Check Redis for key "user:123"
2. If exists (cache hit): Return immediately (~2ms)
3. If not exists (cache miss):
- Query database: SELECT * FROM users WHERE id = 123
- Store in Redis: SET user:123 <json_data> EX 3600
- Return data
4. Next request within 1 hour: Cache hit
Code (Python):
import json

def get_user(user_id):
    # Try cache first
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    # Cache miss: query database
    user = db.query(User).filter_by(id=user_id).first()
    if user is None:
        return None
    # Store in cache for 1 hour
    data = user.to_dict()
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))
    return data  # same shape as the cache-hit path
TRADEOFF:
Pros:
- Simple to implement
- Automatic expiration (TTL)
- Memory efficient (only cache what's accessed)
Cons:
- Stale data (old value returned until cache expires)
- Cache miss on first access (slower)
WRITE-THROUGH (Update Cache When Writing):
POST /users (create user)
1. Write to database: INSERT INTO users ...
2. Write to cache: SET user:123 <new_data> EX 3600
3. Return response
GET /users/123
1. Check Redis: cache hit in the common case (the entry was written at creation; it can still miss after eviction or TTL expiry)
2. Return immediately
TRADEOFF:
Pros:
- Cache always fresh (written when data changes)
- Consistent (cache and DB in sync)
Cons:
- Extra write on every change (slower creation/update)
- Data written before the cache existed still misses on first read
- Cache stores unused data (if object is never read again)
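The write-through flow above can be sketched with plain dicts standing in for Redis and the users table (create_user/get_user are illustrative names for this sketch, not a fixed API):

```python
import json

cache = {}   # stand-in for Redis
db = {}      # stand-in for the users table

def create_user(user_id, data):
    db[user_id] = data                              # 1. write the source of truth
    cache[f"user:{user_id}"] = json.dumps(data)     # 2. write through to the cache
    return data                                     # 3. respond

def get_user(user_id):
    cached = cache.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)                   # warm after every write
    return db.get(user_id)

create_user(123, {"name": "Ada"})
get_user(123)   # served from cache, no DB read
```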
WRITE-BEHIND (Write-Back: Cache First, Async Database Write):
POST /users (create user)
1. Write to cache immediately: SET user:123 <new_data> EX 3600
2. Queue database write job (async)
3. Return response immediately (fast!)
Background job:
1. Insert into database: INSERT INTO users ...
2. On failure, retry; if retries are exhausted, delete the cache entry so it stops serving unpersisted data
TRADEOFF:
Pros:
- Fastest writes (return immediately)
- Cache always fresh
Cons:
- Risk of data loss (if cache fails before DB write)
- Complexity (need reliable job queue)
- Eventual consistency (not immediately in DB)
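The write-behind flow can be sketched with a background worker draining a queue. Plain dicts stand in for Redis and the database, and an in-process queue.Queue stands in for the durable job queue a real system would need.

```python
import json
import queue
import threading

cache = {}                   # stand-in for Redis
db = {}                      # stand-in for the database
write_queue = queue.Queue()  # stand-in for a durable job queue

def create_user(user_id, data):
    cache[f"user:{user_id}"] = json.dumps(data)   # 1. write cache immediately
    write_queue.put((user_id, data))              # 2. queue the DB write
    return data                                   # 3. return right away

def db_writer():
    while True:
        item = write_queue.get()
        if item is None:                          # shutdown sentinel
            break
        user_id, data = item
        db[user_id] = data                        # background INSERT
        write_queue.task_done()

threading.Thread(target=db_writer, daemon=True).start()
create_user(123, {"name": "Ada"})
write_queue.join()   # demo only: real callers never wait for the flush
```

The data-loss risk is visible here: anything still sitting in write_queue when the process dies never reaches db, which is why the job queue must be durable.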
The hardest problem in caching is invalidating stale data.
STRATEGY 1: TTL (Time-To-Live)
Data expires after fixed time
SET user:123 <data> EX 3600 # 1 hour expiration
When to use:
- Data that changes infrequently (hourly or daily)
- Can tolerate staleness (up to 1 hour old is okay)
Tradeoff:
- Pro: Simple
- Con: Stale data for up to TTL duration
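Redis handles expiry for you with SET ... EX, but the mechanics are easy to sketch. The class below is an in-memory stand-in that mimics setex/get with lazy expiration, not full Redis semantics:

```python
import time

class TTLCache:
    """Dict-based stand-in for Redis string keys with TTL."""
    def __init__(self):
        self.store = {}

    def setex(self, key, ttl, value):      # mirrors redis-py setex(key, ttl, value)
        self.store[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self.store[key]            # lazy expiration on read
            return None
        return value

c = TTLCache()
c.setex("user:123", 3600, "<data>")
c.get("user:123")                    # fresh: returns "<data>"
c.setex("user:456", -1, "<data>")    # negative TTL only for the demo; Redis requires EX > 0
c.get("user:456")                    # already expired: returns None
```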
STRATEGY 2: Event-Driven Invalidation
Delete cache when data changes
POST /users/123 (update)
1. Update database
2. Delete cache: DEL user:123
3. Next GET will cache miss and reload
Code:
def update_user(user_id, changes):
    db.execute("UPDATE users SET ... WHERE id = ?", user_id)
    redis.delete(f"user:{user_id}")  # Invalidate cache
    return get_user(user_id)         # Reload from DB
When to use:
- Data that changes frequently
- Need fresh data immediately after update
Tradeoff:
- Pro: Fresh data always
- Con: Cache misses after every update (slower)
STRATEGY 3: Tag-Based Invalidation
Group related cache keys, invalidate by tag
SET user:123 <data>
SADD tag:company:456:users user:123
SADD tag:email:[email protected] user:123
When the user updates their email, look up every key under the tag and delete them:
SMEMBERS tag:email:[email protected]  # -> user:123, ...
DEL user:123 ...                           # delete each member, then the tag set itself
When to use:
- Complex data relationships (user → company → permissions)
- Need to invalidate multiple keys at once
Tools:
- Redis doesn't natively support tags
- Maintain the mapping manually: one Redis set (or sorted set) per tag, SADD on write, SMEMBERS + DEL on invalidation
- Or use a caching library/framework that implements tagging on top of this pattern
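The manual approach is a few lines: keep one set per tag listing its member keys. In this sketch plain dicts and Python sets stand in for the Redis key space and for SADD/SMEMBERS/DEL.

```python
cache = {}   # stand-in for the Redis key space
tags = {}    # tag -> set of cache keys (one Redis set per tag)

def set_with_tags(key, value, key_tags):
    cache[key] = value
    for tag in key_tags:
        tags.setdefault(tag, set()).add(key)   # SADD tag key

def invalidate_tag(tag):
    for key in tags.pop(tag, set()):           # SMEMBERS tag, then DEL the set
        cache.pop(key, None)                   # DEL each member key

set_with_tags("user:123", "<json>", ["company:456:users"])
set_with_tags("user:789", "<json>", ["company:456:users"])
invalidate_tag("company:456:users")   # both users' cache entries evicted at once
```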
STRATEGY 4: Versioned Keys
Keep multiple versions, switch when needed