Sibasish Mohanty
Guest
In part 3 of our System Design series, we're tackling caching and load balancing, the unsung heroes of performance. Without them, systems crumble under scale.
Weβll cover:
- Caching: app/DB/CDN layers; write-through/write-back, TTLs
- Cache Invalidation: TTLs, versioning, stampede protection
- Load Balancing: L4/L7, round-robin, least-connections, hashing
1. Caching
TL;DR: Caching is your first lever for scale. Use it everywhere, but know the trade-offs.
- App cache: In-memory (Redis, Memcached). Ultra-fast but volatile.
- DB cache: Query or object cache to offload hot queries.
- CDN cache: Push static assets near users.
Strategies:
- Write-through: Write to cache + DB simultaneously (safe, consistent, slower writes)
- Write-back: Write to cache first, sync to DB later (fast, risky if cache crashes)
- TTL (Time To Live): Expire stale data automatically
Example: A news homepage caches top stories for 30s, saving thousands of requests to the backend.
Interview tie-in: "How would you scale a read-heavy service?" → caching is the first answer.
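The write-through vs. write-back trade-off above can be sketched in a few lines. This is a minimal illustration, not a production cache; the dict-backed `db` stands in for a real database:

```python
class WriteThroughCache:
    """Write-through: every write hits both the cache and the backing store."""
    def __init__(self, db):
        self.db = db          # backing store: any dict-like object
        self.cache = {}

    def put(self, key, value):
        self.cache[key] = value
        self.db[key] = value  # synchronous DB write: consistent, but slower writes

    def get(self, key):
        if key not in self.cache:           # cache miss: fall back to the DB
            self.cache[key] = self.db[key]
        return self.cache[key]


class WriteBackCache:
    """Write-back: writes land in the cache; dirty keys are flushed to the DB later."""
    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.dirty = set()

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)   # fast, but lost if the cache dies before a flush

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

The risk the section mentions is visible in the sketch: until `flush()` runs, a write-back cache is the only copy of the data.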
2. Cache Invalidation
TL;DR: The hardest part of caching isn't caching; it's invalidation.
- TTL: Safe default, but may serve stale data.
- Versioning: Change the cache key when data updates (e.g., user:v2:123)
- Stampede protection: Use locking or request coalescing so multiple clients don't hammer the DB when a cache entry expires
Example: If 1M users refresh when a cache expires, that's a cache stampede. Use jittered TTLs or async refresh.
Interview tie-in: They'll ask "What's the hardest part about caching?" → answer: invalidation and consistency.
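The two stampede defenses above, jittered TTLs and request coalescing, can be combined in one cache-aside wrapper. A minimal sketch, assuming a `loader` function that fetches a fresh value (e.g., a DB query); the class name and parameters are illustrative:

```python
import random
import threading
import time

class StampedeSafeCache:
    """Cache-aside with jittered TTLs plus a per-key lock, so only one
    caller recomputes an expired entry (request coalescing)."""
    def __init__(self, loader, ttl=30.0, jitter=5.0):
        self.loader = loader     # function: key -> fresh value
        self.ttl, self.jitter = ttl, jitter
        self.store = {}          # key -> (value, expires_at)
        self.locks = {}          # key -> per-key lock
        self.meta_lock = threading.Lock()

    def _lock_for(self, key):
        with self.meta_lock:
            return self.locks.setdefault(key, threading.Lock())

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]               # fresh hit: no locking needed
        with self._lock_for(key):         # one thread per key recomputes
            entry = self.store.get(key)
            if entry and entry[1] > time.monotonic():
                return entry[0]           # someone refilled it while we waited
            value = self.loader(key)
            # jitter spreads expirations so hot keys don't all expire at once
            expires = time.monotonic() + self.ttl + random.uniform(0, self.jitter)
            self.store[key] = (value, expires)
            return value
```

Concurrent callers for the same expired key block on the lock, then return the value the first caller computed, so the DB sees one query instead of a million.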
3. Load Balancing
TL;DR: Load balancers spread requests across servers and hide failures.
- L4 (Transport): Balances based on IP/port. Simple, fast.
- L7 (Application): Smarter; routes based on headers, cookies, and paths.
Algorithms:
- Round Robin: Even distribution
- Least Connections: Send to the server with fewest active requests
- Hashing: Sticky sessions (e.g., same user → same server)
Example: An e-commerce app uses an L7 LB to route /images → CDN and /checkout → the payment cluster.
Interview tie-in: "How do you handle uneven traffic across servers?" → least-connections or weighted load balancing.
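The three algorithms above fit in a few lines each. A toy sketch with a hypothetical three-server pool; real load balancers (NGINX, HAProxy, Envoy) implement these natively:

```python
import hashlib
import itertools

servers = ["app-1", "app-2", "app-3"]   # hypothetical backend pool

# Round Robin: cycle through servers in order.
rr = itertools.cycle(servers)
def round_robin():
    return next(rr)

# Least Connections: pick the server with the fewest active requests.
active = {s: 0 for s in servers}        # updated as requests start/finish
def least_connections():
    return min(active, key=active.get)

# Hashing: the same client always maps to the same server (sticky sessions).
def by_hash(client_id):
    digest = int(hashlib.md5(client_id.encode()).hexdigest(), 16)
    return servers[digest % len(servers)]
```

Note the modulo hash remaps most clients when the pool size changes; consistent hashing is the usual fix for that in practice.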
Takeaways
- Cache where it hurts most: hot queries, static assets, read-heavy endpoints
- Invalidation is the real challenge; plan strategies upfront
- Load balancing is critical for fairness, resilience, and routing logic

"Design the caching strategy for a Twitter timeline. How would you avoid cache stampede during trending events?"