
Distributed Caching

Caching that scales across machines

When your application runs on one server, caching is simple - just use local memory. But what happens when you scale to multiple servers?


The problem: Each server has its own cache. Data cached on Server 1 isn’t available on Server 2. You’re wasting memory and getting inconsistent results.

The solution: Distributed caching - a shared cache accessible by all servers.


Distributed caching means using a cache that runs on separate servers and is shared by all your application servers.

| Benefit | Description |
| --- | --- |
| Shared Cache | All servers see the same cached data |
| Larger Capacity | Sum of all cache servers, not limited to one machine |
| High Availability | Cache survives individual server failures |
| Consistency | Updates visible to all servers immediately |
| Memory Efficiency | One copy instead of N copies |

The two most popular distributed caching solutions are Redis and Memcached.

Redis

Redis is a feature-rich in-memory data store - more than just a cache.


Redis Strengths:

  • Rich data structures (lists, sets, sorted sets, hashes)
  • Persistence options (RDB, AOF)
  • Advanced features (pub/sub, transactions, Lua scripting)
  • Atomic operations
  • Built-in replication and clustering

Redis Use Cases:

  • Caching
  • Session storage
  • Real-time leaderboards
  • Message queues
  • Rate limiting
  • Distributed locks
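
Two of these use cases in a short redis-py sketch - a leaderboard backed by a sorted set and a fixed-window rate limiter backed by atomic INCR (the key names are illustrative, not from any real system):

```python
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Real-time leaderboard with a sorted set
r.zincrby("leaderboard:game1", 50, "alice")  # alice gains 50 points
top3 = r.zrevrange("leaderboard:game1", 0, 2, withscores=True)

# Fixed-window rate limiting with atomic INCR
def allow_request(user_id: str, limit: int = 100) -> bool:
    key = f"ratelimit:{user_id}"
    count = r.incr(key)      # atomic even with many app servers
    if count == 1:
        r.expire(key, 60)    # window resets after 60 seconds
    return count <= limit
```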

Memcached

Memcached is a simple, fast key-value store - a pure caching solution.


Memcached Strengths:

  • Simpler than Redis
  • Lower memory overhead
  • Faster for simple key-value operations
  • Good for pure caching needs

Memcached Use Cases:

  • Simple caching
  • Session storage (if persistence not needed)
  • When you only need key-value storage
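
For example, session storage with the python-memcached client used later in this lesson (a sketch; the session ID and payload are made up, and python-memcached pickles Python objects automatically):

```python
import memcache

mc = memcache.Client(['127.0.0.1:11211'])

# Store a session for 30 minutes; if Memcached restarts, the user re-logs-in
session_id = "abc123"
mc.set(f"session:{session_id}", {"user_id": 42, "role": "member"}, time=1800)

session = mc.get(f"session:{session_id}")  # None on miss or after expiry
```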
Redis vs Memcached at a glance:

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data Structures | Rich (strings, lists, sets, etc.) | Key-value only |
| Persistence | Yes (RDB, AOF) | No |
| Performance | Fast | Faster (simpler) |
| Memory Efficiency | Higher overhead | Lower overhead |
| Replication | Built-in | Client-side sharding |
| Use Case | Feature-rich caching | Simple caching |

Major companies use distributed caching to scale their applications:

Twitter: Timelines with Redis

The Challenge: Twitter serves billions of timeline requests per day. Each user’s timeline is unique and requires complex queries.

The Solution: Twitter uses Redis clusters:

  • Scale: 100+ Redis clusters, millions of keys
  • Data: User timelines, tweet metadata, follower lists
  • Pattern: Cache-aside with Redis
  • TTL: 5 minutes for timelines, 1 hour for user profiles

Why Redis? Twitter needs rich data structures (lists for timelines, sets for followers, sorted sets for trending). Redis provides these natively.
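
A hypothetical sketch of this pattern - caching a timeline as a Redis list with the five-minute TTL mentioned above (fetch_timeline_from_db and the key scheme are made up for illustration):

```python
import json
import redis

r = redis.Redis(decode_responses=True)

def get_timeline(user_id: int) -> list:
    key = f"timeline:{user_id}"
    cached = r.lrange(key, 0, -1)             # full cached timeline
    if cached:
        return [json.loads(t) for t in cached]
    tweets = fetch_timeline_from_db(user_id)  # hypothetical DB query
    if tweets:
        pipe = r.pipeline()                   # batch the writes
        pipe.rpush(key, *[json.dumps(t) for t in tweets])
        pipe.expire(key, 300)                 # the 5-minute TTL from above
        pipe.execute()
    return tweets
```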

Architecture:

  • Application servers → Redis cluster (sharded by user ID)
  • Cache miss → Query database → Store in Redis
  • Replication: Each Redis node has replicas for high availability

Impact: 70% of timeline requests served from Redis. Reduced database load by 80%. Timeline load time: 50ms (Redis) vs 500ms (database).

Facebook: News Feed with Memcached

The Challenge: Facebook’s news feed serves personalized content to billions of users and needs only simple, fast key-value caching.

The Solution: Facebook uses Memcached extensively:

  • Scale: Thousands of Memcached servers, petabytes of cached data
  • Data: Feed content, user profiles, friend lists
  • Pattern: Cache-aside with Memcached
  • Sharding: Consistent hashing across Memcached servers

Why Memcached? Facebook needs simple, fast key-value storage. Memcached is optimized for this use case - simpler than Redis, faster for pure caching.

Architecture:

  • Application servers → Memcached pool (consistent hashing)
  • Cache miss → Query database → Store in Memcached
  • No persistence (acceptable - data can be recomputed)

Impact: 99% of feed requests served from Memcached. Database queries reduced by 99%. Feed load time: 10ms (Memcached) vs 200ms (database).

Redis vs Memcached: Instagram’s Use Case


The Challenge: Instagram needs both simple caching and complex data structures for different features.

The Solution: Instagram uses both:

  • Memcached: Simple caching (user sessions, simple counters)
  • Redis: Complex data structures (feed timelines, follower lists, real-time features)

Why Both? Different features have different needs:

  • User sessions: Simple key-value → Memcached (faster, simpler)
  • Feed timelines: Need lists, sorted sets → Redis (richer features)
  • Real-time features: Need pub/sub → Redis (Memcached doesn’t support pub/sub)

Example:

  • User login: Session stored in Memcached (simple, fast)
  • Feed generation: Timeline stored in Redis (needs list operations)
  • Real-time notifications: Pub/sub via Redis (Memcached can’t do this)
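
The pub/sub piece, which Memcached has no equivalent for, might look like this with redis-py (the channel name and message format are illustrative):

```python
import redis

r = redis.Redis(decode_responses=True)

# Publisher: fires a notification event for user 42
r.publish("notifications:42", "new_follower:alice")

# Subscriber: a worker process listening for that user's events
p = r.pubsub()
p.subscribe("notifications:42")
for message in p.listen():          # blocks, yielding messages as they arrive
    if message["type"] == "message":
        print(message["data"])
```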

Impact: Right tool for each job. Optimized performance and cost.

E-commerce: Managed Caching with Amazon ElastiCache

The Challenge: An e-commerce platform needs distributed caching but doesn’t want to manage infrastructure.

The Solution: The platform uses Amazon ElastiCache (managed Redis):

  • Scale: Auto-scales based on load
  • Data: Product catalog, shopping carts, session data
  • Pattern: Read-through with ElastiCache
  • High Availability: Multi-AZ deployment with automatic failover

Why Managed Service? Focus on application logic, not infrastructure. ElastiCache handles scaling, backups, and monitoring.

Architecture:

  • Application → ElastiCache (managed Redis cluster)
  • Automatic failover if primary fails
  • Automatic backups and point-in-time recovery
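
Connecting is the same redis-py code as self-managed Redis, just pointed at the cluster endpoint (the hostname below is a made-up placeholder):

```python
import redis

# ElastiCache exposes a primary endpoint; TLS depends on cluster settings
r = redis.Redis(
    host="my-cache.xxxxxx.ng.0001.use1.cache.amazonaws.com",  # placeholder
    port=6379,
    ssl=True,  # if in-transit encryption is enabled
    decode_responses=True,
)
r.ping()  # verify connectivity
```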

Impact: Reduced operational overhead by 90%. Product pages load 5x faster. Handles Black Friday traffic spikes automatically.

Netflix: Multi-Region Caching

The Challenge: Netflix serves content globally. Users in different regions need low latency.

The Solution: Netflix uses distributed caching across regions:

  • Scale: Redis clusters in each AWS region
  • Data: Video metadata, user preferences, recommendations
  • Pattern: Cache-aside with regional Redis clusters
  • Consistency: Eventual consistency across regions (acceptable for metadata)

Why Multi-Region? Latency matters for video streaming. Users in Asia get data from Asian Redis cluster (20ms) instead of US cluster (200ms).

Architecture:

  • Regional application servers → Regional Redis cluster
  • Cache miss → Regional database → Store in regional Redis
  • Cross-region replication for popular content

Impact: Global latency reduced from 200ms to 20ms average. 95% cache hit rate. Reduced bandwidth costs by 80%.



The Consistency Challenge

Problem: When Server 1 updates cache, how do Servers 2 and 3 know?


Solutions:

  1. Write-Through to Shared Cache

    • All writes go to distributed cache
    • All servers read from same cache
    • Ensures consistency
  2. Cache Invalidation

    • When data updated, invalidate cache
    • Next read fetches fresh data
    • More on this in the next lesson
  3. Short TTL

    • Use short expiration times
    • Accepts eventual consistency
    • Simple but may have stale data
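
A minimal sketch of options 1 and 2 side by side, using redis-py directly (update_user_in_db is a hypothetical database helper):

```python
import redis

cache = redis.Redis(host='localhost', port=6379, decode_responses=True)

def update_user_email(user_id: int, email: str) -> None:
    update_user_in_db(user_id, email)  # hypothetical database write
    key = f"user:{user_id}:email"
    # Option 1 - write-through: put the fresh value in the shared cache
    cache.set(key, email, ex=300)
    # Option 2 - invalidation: delete instead; the next read refetches
    # cache.delete(key)
```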

At the code level, you need to design cache client wrappers that abstract Redis/Memcached:

cache_client.py

```python
from abc import ABC, abstractmethod
from typing import Optional, Any

import redis
import memcache  # python-memcached


class CacheClient(ABC):
    """Abstract interface so application code stays backend-agnostic."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]:
        pass

    @abstractmethod
    def set(self, key: str, value: Any, ttl: int = 300) -> bool:
        pass

    @abstractmethod
    def delete(self, key: str) -> bool:
        pass


class RedisCacheClient(CacheClient):
    def __init__(self, host: str = 'localhost', port: int = 6379):
        self.client = redis.Redis(host=host, port=port, decode_responses=True)

    def get(self, key: str) -> Optional[Any]:
        try:
            return self.client.get(key)
        except redis.RedisError:
            # Handle failure gracefully - treat an unreachable cache as a miss
            return None

    def set(self, key: str, value: Any, ttl: int = 300) -> bool:
        # Note: Redis stores strings/bytes; serialize complex values
        # (e.g. json.dumps) before caching them.
        try:
            return bool(self.client.setex(key, ttl, value))
        except redis.RedisError:
            return False

    def delete(self, key: str) -> bool:
        try:
            return bool(self.client.delete(key))
        except redis.RedisError:
            return False


class MemcachedCacheClient(CacheClient):
    def __init__(self, servers: Optional[list] = None):
        self.client = memcache.Client(servers or ['127.0.0.1:11211'])

    def get(self, key: str) -> Optional[Any]:
        try:
            return self.client.get(key)
        except Exception:
            return None

    def set(self, key: str, value: Any, ttl: int = 300) -> bool:
        try:
            return bool(self.client.set(key, value, time=ttl))
        except Exception:
            return False

    def delete(self, key: str) -> bool:
        try:
            return bool(self.client.delete(key))
        except Exception:
            return False


# Usage - application code doesn't care about the implementation
class UserService:
    def __init__(self, cache: CacheClient):
        self.cache = cache

    def get_user(self, user_id: int):
        # Cache-aside pattern: check cache first, fall back to the database
        cache_key = f"user:{user_id}"
        user = self.cache.get(cache_key)
        if user:
            return user
        # Cache miss - fetch from DB
        user = self._fetch_from_db(user_id)
        # Store in cache for subsequent reads
        if user:
            self.cache.set(cache_key, user, ttl=300)
        return user

    def _fetch_from_db(self, user_id: int):
        # Placeholder for the real database query
        raise NotImplementedError
```
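
Swapping backends is then a constructor-level decision; a quick usage sketch of the classes above:

```python
# The service code is identical for either backend
service = UserService(RedisCacheClient(host='localhost'))
# service = UserService(MemcachedCacheClient(['127.0.0.1:11211']))
user = service.get_user(42)
```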

Important: Don’t create new connections for each request. Use connection pooling:

connection_pool.py

```python
import redis
from redis.connection import ConnectionPool

# Create one connection pool for the whole process
pool = ConnectionPool(
    host='localhost',
    port=6379,
    max_connections=50,  # Max connections in pool
    decode_responses=True
)


# Reuse pool across requests
class CacheService:
    def __init__(self):
        self.redis = redis.Redis(connection_pool=pool)

    def get(self, key: str):
        return self.redis.get(key)
```
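
One caveat worth knowing: when all pooled connections are checked out, redis-py’s default ConnectionPool raises a ConnectionError; its BlockingConnectionPool variant instead makes callers wait (up to a timeout) for a free connection, which often degrades more gracefully under bursty traffic.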

Handle cache failures gracefully:

cache_with_retry.py

```python
import time
from typing import Optional, Any

from cache_client import CacheClient  # the interface defined above


class CacheWithRetry:
    def __init__(self, cache: CacheClient, max_retries: int = 3):
        self.cache = cache
        self.max_retries = max_retries

    def get_with_retry(self, key: str) -> Optional[Any]:
        for attempt in range(self.max_retries):
            try:
                return self.cache.get(key)
            except Exception:
                if attempt == self.max_retries - 1:
                    # Last attempt failed - return None (cache miss)
                    return None
                # Exponential backoff: 1s, 2s, 4s, ...
                time.sleep(2 ** attempt)
        return None
```

For very large caches, shard data across multiple cache nodes:


Sharding Strategy:

  • Hash key to determine which shard
  • Distribute load across nodes
  • Each shard handles subset of keys
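
To make the “hash key to determine which shard” idea concrete, here is a minimal consistent-hash ring sketch (illustrative only; in practice many cache client libraries handle sharding for you):

```python
import bisect
import hashlib

class HashRing:
    """Maps keys to nodes; adding/removing a node only remaps nearby keys."""

    def __init__(self, nodes, vnodes: int = 100):
        self.ring = []                 # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):    # virtual nodes smooth the distribution
                h = self._hash(f"{node}#{i}")
                self.ring.append((h, node))
        self.ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key: str) -> str:
        # First ring position clockwise from the key's hash
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]

ring = HashRing(["cache-1:6379", "cache-2:6379", "cache-3:6379"])
print(ring.get_node("user:42"))  # the same key always routes to the same shard
```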

🌐 Shared Cache

Distributed caching provides a shared cache accessible by all application servers.

🔴 Redis vs Memcached

Redis = feature-rich, Memcached = simple and fast. Choose based on needs.

🏗️ Abstract Implementation

Design cache interfaces that abstract Redis/Memcached. Makes switching easier.

🔌 Connection Pooling

Always use connection pooling. Don’t create connections per request.