Cache Me If You Can: Design Patterns for Performance

Sibasish Mohanty

In part 3 of our System Design series, we’re tackling caching and load balancing: the unsung heroes of performance. Without them, systems crumble under scale.

We’ll cover:

  1. Caching – app/DB/CDN caches; write-through vs. write-back; TTLs
  2. Cache Invalidation – TTLs, versioning, stampede protection
  3. Load Balancing – L4/L7; round-robin, least-connections, hashing

1. Caching


TL;DR: Caching is your first lever for scale. Use it everywhere, but know the trade-offs.

  • App cache: In-memory (Redis, Memcached). Ultra-fast but volatile.
  • DB cache: Query or object cache to offload hot queries.
  • CDN cache: Push static assets near users.

Strategies:

  • Write-through: Write to cache + DB simultaneously (safe, consistent, slower writes)
  • Write-back: Write to cache first, sync to DB later (fast, risky if cache crashes)
  • TTL (Time To Live): Expire stale data automatically

👉 Example: A news homepage caches top stories for 30s, saving thousands of requests.

👉 Interview tie-in: “How would you scale a read-heavy service?” Caching is the first answer.
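To make the write-through and TTL ideas concrete, here’s a minimal Python sketch. It’s a toy: the “DB” is just an in-process dict standing in for a real database, and `WriteThroughCache` is a hypothetical class name, not a library API.

```python
import time

class WriteThroughCache:
    """Toy write-through cache with TTL expiry.

    Every write goes to the cache AND the backing store in the same
    step, so reads never see the cache disagree with the DB.
    """

    def __init__(self, db, ttl_seconds=30):
        self.db = db                      # backing store (a dict here)
        self.ttl = ttl_seconds
        self._cache = {}                  # key -> (value, expires_at)

    def set(self, key, value):
        # Write-through: update the DB and the cache together.
        self.db[key] = value
        self._cache[key] = (value, time.time() + self.ttl)

    def get(self, key):
        entry = self._cache.get(key)
        if entry:
            value, expires_at = entry
            if time.time() < expires_at:  # still within TTL
                return value
            del self._cache[key]          # TTL elapsed: drop stale entry
        # Cache miss: fall back to the DB and repopulate.
        value = self.db.get(key)
        if value is not None:
            self._cache[key] = (value, time.time() + self.ttl)
        return value
```

Write-back would instead buffer the `set` and flush to `db` later, which is faster but loses data if the cache process dies before the flush.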

2. Cache Invalidation


TL;DR: The hardest part of caching isn’t caching; it’s invalidation.

  • TTL: Safe default, but may serve stale data.
  • Versioning: Change cache key when data updates (e.g., user:v2:123)
  • Stampede protection: Use locking or request coalescing so multiple clients don’t hammer the DB when cache expires.

👉 Example: If 1M users refresh when a cache expires, that’s a cache stampede. Use jittered TTLs or async refresh.

👉 Interview tie-in: They’ll ask “What’s the hardest part about caching?” Answer: invalidation and consistency.
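The two stampede defenses above can be sketched in a few lines of Python. Both names (`jittered_ttl`, `CoalescingLoader`) are hypothetical, and the in-memory dict stands in for a real cache like Redis.

```python
import random
import threading

def jittered_ttl(base_ttl, jitter_fraction=0.1):
    """Spread expirations out so a million entries written at the
    same moment don't all expire at the same moment."""
    jitter = base_ttl * jitter_fraction
    return base_ttl + random.uniform(-jitter, jitter)

class CoalescingLoader:
    """Request coalescing: on a cache miss, only one thread runs the
    expensive loader; concurrent callers for the same key wait on a
    per-key lock and then reuse the freshly cached result."""

    def __init__(self, loader):
        self.loader = loader              # e.g. a slow DB query
        self._cache = {}
        self._locks = {}                  # key -> per-key lock
        self._guard = threading.Lock()    # protects _locks itself

    def get(self, key):
        if key in self._cache:            # fast path: cache hit
            return self._cache[key]
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:                        # one loader per key at a time
            if key not in self._cache:    # re-check after the wait
                self._cache[key] = self.loader(key)
        return self._cache[key]
```

With locking in place, even if 1M clients miss simultaneously, the DB sees one query per key instead of one per client.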

3. Load Balancing


TL;DR: Load balancers spread requests across servers and hide failures.

  • L4 (Transport): Balances based on IP/port. Simple, fast.
  • L7 (Application): Smarter; routes based on headers, cookies, paths.

Algorithms:

  • Round Robin: Even distribution
  • Least Connections: Send to the server with fewest active requests
  • Hashing: Sticky sessions (e.g., same user → same server)

👉 Example: An e-commerce app uses an L7 LB to route /images → CDN and /checkout → payment cluster.

👉 Interview tie-in: “How do you handle uneven traffic across servers?” Least-connections or weighted load balancing.
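The three algorithms above fit in a short Python sketch. The server names and function names are made up for illustration; a real LB would also track health checks and connection lifecycles.

```python
import hashlib
import itertools

servers = ["srv-a", "srv-b", "srv-c"]    # hypothetical backend pool

# Round robin: cycle through the pool in order, one request each.
_rotation = itertools.cycle(servers)
def round_robin():
    return next(_rotation)

# Least connections: track active requests per server, pick the
# lightest one (the LB would increment/decrement this per request).
active = {s: 0 for s in servers}
def least_connections():
    return min(active, key=active.get)

# Hashing: a stable hash of the client key maps the same user to the
# same server every time (sticky sessions).
def hash_route(client_id):
    digest = hashlib.md5(client_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note the trade-off: round robin ignores load, least-connections adapts to it, and hashing trades even distribution for session affinity.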

✅ Takeaways

  • Cache where it hurts most: hot queries, static assets, read-heavy endpoints
  • Invalidation is the real challenge; plan strategies upfront
  • Load balancing is critical for fairness, resilience, and routing logic

💡 Practice Question:

"Design the caching strategy for a Twitter timeline. How would you avoid cache stampede during trending events?"
