Cache Me If You Can: Design Patterns for Performance

Sibasish Mohanty

In part 3 of our System Design series, we’re tackling caching and load balancing: the unsung heroes of performance. Without them, systems crumble under scale.

We’ll cover:

  1. Caching – app/DB/CDN caches; write-through vs. write-back; TTLs
  2. Cache Invalidation – TTLs, versioning, stampede protection
  3. Load Balancing – L4/L7; round-robin, least-connections, hashing

1. Caching


TL;DR: Caching is your first lever for scale. Use it everywhere, but know the trade-offs.

  • App cache: In-memory (Redis, Memcached). Ultra-fast but volatile.
  • DB cache: Query or object cache to offload hot queries.
  • CDN cache: Push static assets near users.

Strategies:

  • Write-through: Write to cache + DB simultaneously (safe, consistent, slower writes)
  • Write-back: Write to cache first, sync to DB later (fast, risky if cache crashes)
  • TTL (Time To Live): Expire stale data automatically

👉 Example: A news homepage caches top stories for 30s, saving thousands of requests.

👉 Interview tie-in: “How would you scale a read-heavy service?” Caching is the first answer.
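To make the write-through and TTL ideas concrete, here’s a minimal Python sketch. It’s a toy: the “DB” is just an in-process dict standing in for a real database, and `WriteThroughCache` is a hypothetical class name, not a library API.

```python
import time

class WriteThroughCache:
    """Toy write-through cache with TTL expiry.

    Every write goes to the cache AND the backing store in the same
    step, so reads never see the cache disagree with the DB.
    """

    def __init__(self, db, ttl_seconds=30):
        self.db = db                      # backing store (a dict here)
        self.ttl = ttl_seconds
        self._cache = {}                  # key -> (value, expires_at)

    def set(self, key, value):
        # Write-through: update the DB and the cache together.
        self.db[key] = value
        self._cache[key] = (value, time.time() + self.ttl)

    def get(self, key):
        entry = self._cache.get(key)
        if entry:
            value, expires_at = entry
            if time.time() < expires_at:  # still within TTL
                return value
            del self._cache[key]          # TTL elapsed: drop stale entry
        # Cache miss: fall back to the DB and repopulate.
        value = self.db.get(key)
        if value is not None:
            self._cache[key] = (value, time.time() + self.ttl)
        return value
```

Write-back would instead buffer the `set` and flush to `db` later, which is faster but loses data if the cache process dies before the flush.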

2. Cache Invalidation


TL;DR: The hardest part of caching isn’t caching; it’s invalidation.

  • TTL: Safe default, but may serve stale data.
  • Versioning: Change cache key when data updates (e.g., user:v2:123)
  • Stampede protection: Use locking or request coalescing so multiple clients don’t hammer the DB when cache expires.

👉 Example: If 1M users refresh when a cache expires, that’s a cache stampede. Use jittered TTLs or async refresh.

👉 Interview tie-in: They’ll ask “What’s the hardest part about caching?” Answer: invalidation and consistency.
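The two stampede defenses above can be sketched in a few lines of Python. Both names (`jittered_ttl`, `CoalescingLoader`) are hypothetical, and the in-memory dict stands in for a real cache like Redis.

```python
import random
import threading

def jittered_ttl(base_ttl, jitter_fraction=0.1):
    """Spread expirations out so a million entries written at the
    same moment don't all expire at the same moment."""
    jitter = base_ttl * jitter_fraction
    return base_ttl + random.uniform(-jitter, jitter)

class CoalescingLoader:
    """Request coalescing: on a cache miss, only one thread runs the
    expensive loader; concurrent callers for the same key wait on a
    per-key lock and then reuse the freshly cached result."""

    def __init__(self, loader):
        self.loader = loader              # e.g. a slow DB query
        self._cache = {}
        self._locks = {}                  # key -> per-key lock
        self._guard = threading.Lock()    # protects _locks itself

    def get(self, key):
        if key in self._cache:            # fast path: cache hit
            return self._cache[key]
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:                        # one loader per key at a time
            if key not in self._cache:    # re-check after the wait
                self._cache[key] = self.loader(key)
        return self._cache[key]
```

With locking in place, even if 1M clients miss simultaneously, the DB sees one query per key instead of one per client.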

3. Load Balancing


TL;DR: Load balancers spread requests across servers and hide failures.

  • L4 (Transport): Balances based on IP/port. Simple, fast.
  • L7 (Application): Smarter; routes based on headers, cookies, paths.

Algorithms:

  • Round Robin: Even distribution
  • Least Connections: Send to the server with fewest active requests
  • Hashing: Sticky sessions (e.g., same user → same server)

👉 Example: An e-commerce app uses an L7 LB to route /images → CDN and /checkout → payment cluster.

👉 Interview tie-in: “How do you handle uneven traffic across servers?” Least-connections or weighted load balancing.
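The three algorithms above fit in a short Python sketch. The server names and function names are made up for illustration; a real LB would also track health checks and connection lifecycles.

```python
import hashlib
import itertools

servers = ["srv-a", "srv-b", "srv-c"]    # hypothetical backend pool

# Round robin: cycle through the pool in order, one request each.
_rotation = itertools.cycle(servers)
def round_robin():
    return next(_rotation)

# Least connections: track active requests per server, pick the
# lightest one (the LB would increment/decrement this per request).
active = {s: 0 for s in servers}
def least_connections():
    return min(active, key=active.get)

# Hashing: a stable hash of the client key maps the same user to the
# same server every time (sticky sessions).
def hash_route(client_id):
    digest = hashlib.md5(client_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note the trade-off: round robin ignores load, least-connections adapts to it, and hashing trades even distribution for session affinity.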

✅ Takeaways

  • Cache where it hurts most: hot queries, static assets, read-heavy endpoints
  • Invalidation is the real challenge; plan strategies upfront
  • Load balancing is critical for fairness, resilience, and routing logic

💡 Practice Question:

"Design the caching strategy for a Twitter timeline. How would you avoid cache stampede during trending events?"
