
Caching Fundamentals

Speed is a feature - caching makes it possible

Imagine you’re a librarian. Every time someone asks for a book, you could walk to the massive warehouse (database) to find it. Or, you could keep the 100 most popular books on a cart right next to you (cache). When someone asks for a popular book, you grab it instantly. That’s caching.

Caching is storing frequently accessed data in fast storage (usually memory) to avoid slow operations like database queries or external API calls.

| Problem | How Caching Solves It |
| --- | --- |
| Slow Database Queries | Cache stores results, avoiding repeated queries |
| High Database Load | Reduces database requests by 90%+ |
| Expensive External APIs | Cache API responses, avoid rate limits |
| Repeated Computations | Cache expensive calculation results |
| Geographic Latency | Cache data closer to users (CDN) |


There are four main ways to integrate caching into your application: cache-aside, read-through, write-through, and write-behind. Each has different trade-offs.

Cache-Aside

The most common pattern. Your application manages the cache directly.


How it works:

  1. Application checks cache first
  2. If cache hit → return data immediately
  3. If cache miss → fetch from database
  4. Store result in cache for next time
  5. Return data to user
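
A minimal cache-aside sketch in Python, assuming a local Redis instance and a hypothetical `db.fetch_user` helper standing in for the real database call:

```python
import json

import redis  # assumes the redis-py client is installed

cache = redis.Redis(host="localhost", port=6379)

def get_user(user_id: int, db) -> dict:
    """Cache-aside read: check the cache first, fall back to the database."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                 # cache hit: return immediately

    user = db.fetch_user(user_id)                 # cache miss: query the database (hypothetical helper)
    cache.set(key, json.dumps(user), ex=300)      # store for next time, with a 5-minute TTL
    return user
```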

When to use:

  • Most use cases - it is the sensible default for general-purpose caching
  • You want full control over cache logic - the application decides what to cache and when
  • Different data needs different caching strategies - keys, TTLs, and invalidation can vary per data type
  • Cache and database can be separate systems - no tight coupling required

Trade-offs:

  • Simple to understand and implement - the check/miss/populate logic is explicit
  • Flexible - you control keys, TTLs, and invalidation
  • Application code must handle cache logic - more code to write and maintain
  • Risk of cache stampede - on a miss for a popular key, many concurrent requests can hit the database at once (see the sketch below)
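
A common stampede mitigation is a per-key lock, so only one request recomputes a missing value while the rest wait. A minimal in-process sketch (across multiple servers you would need a distributed lock instead):

```python
import threading
from collections import defaultdict

cache: dict = {}
locks = defaultdict(threading.Lock)  # one lock per cache key

def get_with_stampede_protection(key: str, load_from_db) -> object:
    value = cache.get(key)
    if value is not None:
        return value
    with locks[key]:                 # only one thread loads per key
        value = cache.get(key)       # re-check: another thread may have filled it
        if value is None:
            value = load_from_db(key)
            cache[key] = value
    return value
```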

Read-Through

The cache acts as a proxy. Your application only talks to the cache; the cache handles database access.


How it works:

  1. Application requests data from cache
  2. Cache checks if data exists
  3. If miss, cache automatically fetches from database
  4. Cache stores the result
  5. Cache returns data to application
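
A minimal read-through sketch, assuming an in-memory store and a caller-supplied loader function that knows how to reach the database:

```python
from typing import Any, Callable

class ReadThroughCache:
    """The application only talks to this object; on a miss, the cache
    itself fetches from the backing store via the loader."""

    def __init__(self, loader: Callable[[str], Any]):
        self._loader = loader                      # e.g., a database fetch function
        self._store: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        if key not in self._store:                 # miss: the cache, not the app, loads
            self._store[key] = self._loader(key)
        return self._store[key]

# Usage: the application never queries the database directly.
products = ReadThroughCache(loader=lambda sku: {"sku": sku, "price": 9.99})  # stand-in loader
print(products.get("sku-123"))  # first call loads; later calls hit the cache
```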

When to use:

  • You want simpler application code - miss handling moves out of the application
  • The cache library handles database access - less code for you to write
  • Consistent caching behavior across the application - one standardized loading path
  • Read-heavy workloads - the pattern is optimized for frequent reads

Trade-offs:

  • Simpler application code - less boilerplate to write
  • The cache absorbs all loading complexity - fetching, storing, and expiry live in one place
  • Less flexible than cache-aside - limited room to customize per data type
  • The cache library must support your database - you depend on its integrations

Write-Through

Writes go to both cache and database simultaneously, so they stay in sync.


How it works:

  1. Application writes data
  2. Cache updates immediately
  3. Cache writes to database simultaneously
  4. Wait for both to complete
  5. Return success
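
A minimal write-through sketch, where `db` is a hypothetical object with a blocking `save` method; the write only returns after both stores are updated:

```python
class WriteThroughCache:
    """Every write updates the database and the cache before returning."""

    def __init__(self, db):
        self._db = db
        self._store: dict = {}

    def put(self, key: str, value) -> None:
        self._db.save(key, value)    # write the source of truth first (blocks)
        self._store[key] = value     # then update the cache; both now match

    def get(self, key: str):
        return self._store.get(key)  # reads are always fresh
```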

When to use:

  • Strong consistency required - Cache and database must always match
  • Can’t afford stale cache data - Data accuracy is critical
  • Higher write latency is acceptable - every write waits for both cache and database
  • Critical data that must be accurate - Financial, inventory, pricing data

Trade-offs:

  • Strong consistency - reads from the cache never return stale data
  • Cache and database always match - the database stays the source of truth
  • Higher write latency - must wait for both cache and database writes
  • Database becomes a bottleneck for writes - every write still reaches it synchronously

Write-Behind

Writes go to the cache immediately; the database write happens later. Fastest writes, but risky.


How it works:

  1. Application writes data
  2. Cache updates immediately
  3. Return success to application (fast!)
  4. Queue database write for later
  5. Background process writes to database
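
A minimal write-behind sketch using an in-process queue and a background thread as the asynchronous writer; `db.save` is a hypothetical blocking database call, and queued writes are lost if the process dies before they are flushed:

```python
import queue
import threading

class WriteBehindCache:
    """Writes are acknowledged from the cache immediately and
    flushed to the database by a background thread."""

    def __init__(self, db):
        self._db = db
        self._store: dict = {}
        self._pending: queue.Queue = queue.Queue()
        threading.Thread(target=self._flush_loop, daemon=True).start()

    def put(self, key: str, value) -> None:
        self._store[key] = value           # cache updated: caller returns fast
        self._pending.put((key, value))    # database write deferred

    def _flush_loop(self) -> None:
        while True:
            key, value = self._pending.get()  # blocks until a write is queued
            self._db.save(key, value)         # lost if the process dies before this runs
```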

When to use:

  • Write performance is critical - Need maximum write speed
  • Can tolerate eventual consistency - Temporary inconsistency acceptable
  • Non-critical data - Analytics, logs, metrics where some loss is acceptable
  • High write volume - Need to handle many writes efficiently

Trade-offs:

  • Lowest write latency - Writes complete immediately
  • High write throughput - Can handle many writes per second
  • Risk of data loss - Data lost if cache fails before database write
  • Eventual consistency - Cache and database may differ temporarily

LLD Connection: Implementing Cache Patterns


At the code level, caching patterns translate to the decorator pattern and repository abstractions.

The decorator pattern is perfect for adding caching to existing repositories:
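
(A sketch in Python; the `db` object and its `fetch_user` method are hypothetical.) The decorator exposes the same interface as the repository it wraps, so callers need no changes:

```python
class UserRepository:
    """The existing repository; it knows nothing about caching."""

    def __init__(self, db):
        self._db = db

    def find_by_id(self, user_id: int) -> dict:
        return self._db.fetch_user(user_id)   # hypothetical database call

class CachingUserRepository:
    """Decorator: same interface, with a cache added in front."""

    def __init__(self, inner: UserRepository):
        self._inner = inner
        self._cache: dict[int, dict] = {}

    def find_by_id(self, user_id: int) -> dict:
        if user_id not in self._cache:
            self._cache[user_id] = self._inner.find_by_id(user_id)
        return self._cache[user_id]

# Drop-in replacement: callers are unaware that caching was added.
# repo = CachingUserRepository(UserRepository(db))
```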

A more complete example showing cache-aside in a repository:
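
(A sketch assuming a local Redis instance; the `db` object with `fetch_product` and `save_product` methods is hypothetical.) Writes invalidate the cached entry so the next read repopulates it:

```python
import json

import redis  # assumes the redis-py client

class ProductRepository:
    """Cache-aside repository with a TTL and write invalidation."""

    def __init__(self, db, ttl_seconds: int = 300):
        self._db = db
        self._cache = redis.Redis(host="localhost", port=6379)
        self._ttl = ttl_seconds

    def get(self, product_id: str) -> dict:
        key = f"product:{product_id}"
        cached = self._cache.get(key)
        if cached is not None:
            return json.loads(cached)                    # hit
        product = self._db.fetch_product(product_id)     # miss: query the database
        self._cache.set(key, json.dumps(product), ex=self._ttl)
        return product

    def save(self, product_id: str, data: dict) -> None:
        self._db.save_product(product_id, data)          # write the source of truth
        self._cache.delete(f"product:{product_id}")      # invalidate the stale entry
```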


Real-World Examples

Understanding how major companies implement caching patterns helps illustrate when to use each approach:

Facebook: News Feed (Cache-Aside)

The Challenge: Facebook’s news feed serves personalized content to billions of users. Each user’s feed is unique, requiring complex queries across multiple data sources.

The Solution: Facebook uses cache-aside pattern extensively:

  • Application checks Memcached first for pre-computed feed data
  • On cache miss, queries multiple services (posts, likes, comments, friends)
  • Stores computed feed in cache for future requests
  • Cache TTL: 2-5 minutes (balance between freshness and load)

Why Cache-Aside? Different users need different caching strategies. Some users have high engagement (shorter TTL), others are casual (longer TTL). Cache-aside gives Facebook flexibility to customize per user.

Impact: Reduces database load by 90%+. A typical feed request that would take 500ms from database takes 5ms from cache.

Amazon: Product Catalog (Read-Through)

The Challenge: Amazon’s product catalog is accessed millions of times per second. Product data changes infrequently but needs to be fast.

The Solution: Amazon uses read-through caching:

  • Application only talks to the cache layer (for example, DynamoDB fronted by DAX, its managed read-through cache)
  • Cache automatically fetches from database on miss
  • Cache library handles all database complexity
  • Products cached for hours (they change rarely)

Why Read-Through? Simpler application code. Product service doesn’t need to know about cache - it just reads products. Cache layer handles everything.

Impact: Product pages load in 50ms instead of 200ms. During Prime Day, caching handles 10x normal traffic without database overload.

Trading Platforms: Real-Time Prices (Write-Through)

The Challenge: Trading platforms need real-time, accurate prices. Stale data means wrong trades, which costs money.

The Solution: Trading systems use write-through:

  • Price update goes to both cache and database simultaneously
  • Cache always has latest price
  • Database is source of truth
  • No stale data risk

Why Write-Through? Strong consistency is critical. A 1-second delay showing wrong price could mean millions in losses. Write-through ensures cache and database always match.

Example: A stock price update from $100 to $105:

  • Write-through: Cache = $105, DB = $105 (consistent)
  • Cache-aside: Cache might still show $100 until next read (stale)

Twitter: Tweet Analytics (Write-Behind)

The Challenge: Twitter generates billions of tweets per day. Each tweet needs analytics (views, likes, retweets) tracked, but write performance is critical.

The Solution: Twitter uses write-behind for analytics:

  • Tweet creation writes to cache immediately (fast response)
  • Analytics (view counts, engagement) written to cache first
  • Database writes happen asynchronously in batches
  • Some data loss acceptable (analytics are approximate anyway)

Why Write-Behind? Write performance is critical. Users expect instant tweet posting. Analytics can be eventually consistent - losing a few view counts is acceptable.

Impact: Tweet creation latency: 10ms (cache write) vs 100ms (database write). During viral events, write-behind handles 100x normal write volume.

Netflix: Video Metadata (Multiple Patterns)

The Challenge: Netflix serves video metadata (titles, descriptions, ratings) globally. Data changes rarely but needs to be fast worldwide.

The Solution: Netflix uses multiple patterns:

  • Read-Through for video metadata (rarely changes, cache for days)
  • Cache-Aside for user recommendations (personalized, cache per user)
  • Write-Through for watch history (critical data, must be consistent)

Why Multiple Patterns? Different data has different requirements. Metadata can be stale, recommendations are personalized, watch history must be accurate.

Impact: 95% of requests served from cache. Global latency reduced from 200ms to 20ms average.



Pattern Comparison

| Pattern | Read Latency | Write Latency | Consistency | Complexity | Use Case |
| --- | --- | --- | --- | --- | --- |
| Cache-Aside | Low (cache hit) | Low | Eventual | Medium | Most applications |
| Read-Through | Low (cache hit) | Low | Eventual | Low | Read-heavy apps |
| Write-Through | Low (cache hit) | High (waits for DB) | Strong | Medium | Critical data |
| Write-Behind | Low (cache hit) | Very Low | Eventual | High | High write volume |

🎯 Cache-Aside is King

Most applications use cache-aside. It’s flexible, understandable, and gives you control.

⚡ Speed Matters

Cache lookups are typically 100x faster than database queries - microseconds instead of milliseconds. At scale, this difference is massive.

🔄 Consistency Trade-offs

Faster writes (write-behind) = weaker consistency. Stronger consistency (write-through) = slower writes.

🏗️ Decorator Pattern

Use the decorator pattern in code to add caching transparently to existing repositories.