
Distributed Caching

Caching that scales across machines

When your application runs on one server, caching is simple - just use local memory. But what happens when you scale to multiple servers?


The problem: Each server has its own cache. Data cached on Server 1 isn’t available on Server 2. You’re wasting memory and getting inconsistent results.

The solution: Distributed caching - a shared cache accessible by all servers.


Distributed caching means using a cache that runs on separate servers and is shared by all your application servers.

| Benefit | Description |
| --- | --- |
| Shared Cache | All servers see the same cached data |
| Larger Capacity | Sum of all cache servers, not limited to one machine |
| High Availability | Cache survives individual server failures |
| Consistency | Updates visible to all servers immediately |
| Memory Efficiency | One copy instead of N copies |

The two most popular distributed caching solutions are Redis and Memcached.

Redis

Redis is a feature-rich in-memory data store - more than just a cache.


Redis Strengths:

  • Rich data structures (lists, sets, sorted sets, hashes)
  • Persistence options (RDB, AOF)
  • Advanced features (pub/sub, transactions, Lua scripting)
  • Atomic operations
  • Built-in replication and clustering

Redis Use Cases:

  • Caching
  • Session storage
  • Real-time leaderboards
  • Message queues
  • Rate limiting
  • Distributed locks
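
Two of these use cases in a short redis-py sketch - a leaderboard backed by a sorted set and a fixed-window rate limiter backed by atomic INCR (the key names are illustrative, not from any real system):

```python
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Real-time leaderboard with a sorted set
r.zincrby("leaderboard:game1", 50, "alice")  # alice gains 50 points
top3 = r.zrevrange("leaderboard:game1", 0, 2, withscores=True)

# Fixed-window rate limiting with atomic INCR
def allow_request(user_id: str, limit: int = 100) -> bool:
    key = f"ratelimit:{user_id}"
    count = r.incr(key)      # atomic even with many app servers
    if count == 1:
        r.expire(key, 60)    # window resets after 60 seconds
    return count <= limit
```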

Memcached

Memcached is a simple, fast key-value store - a pure caching solution.


Memcached Strengths:

  • Simpler than Redis
  • Lower memory overhead
  • Faster for simple key-value operations
  • Good for pure caching needs

Memcached Use Cases:

  • Simple caching
  • Session storage (if persistence not needed)
  • When you only need key-value storage
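
For example, session storage with the python-memcached client used later in this lesson (a sketch; the session ID and payload are made up, and python-memcached pickles Python objects automatically):

```python
import memcache

mc = memcache.Client(['127.0.0.1:11211'])

# Store a session for 30 minutes; if Memcached restarts, the user re-logs-in
session_id = "abc123"
mc.set(f"session:{session_id}", {"user_id": 42, "role": "member"}, time=1800)

session = mc.get(f"session:{session_id}")  # None on miss or after expiry
```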
Redis vs Memcached at a glance:

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data Structures | Rich (strings, lists, sets, etc.) | Key-value only |
| Persistence | Yes (RDB, AOF) | No |
| Performance | Fast | Faster (simpler) |
| Memory Efficiency | Higher overhead | Lower overhead |
| Replication | Built-in | Client-side sharding |
| Use Case | Feature-rich caching | Simple caching |

Major companies use distributed caching to scale their applications:

Twitter: Timelines with Redis

The Challenge: Twitter serves billions of timeline requests per day. Each user’s timeline is unique and requires complex queries.

The Solution: Twitter uses Redis clusters:

  • Scale: 100+ Redis clusters, millions of keys
  • Data: User timelines, tweet metadata, follower lists
  • Pattern: Cache-aside with Redis
  • TTL: 5 minutes for timelines, 1 hour for user profiles

Why Redis? Twitter needs rich data structures (lists for timelines, sets for followers, sorted sets for trending). Redis provides these natively.
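
A hypothetical sketch of this pattern - caching a timeline as a Redis list with the five-minute TTL mentioned above (fetch_timeline_from_db and the key scheme are made up for illustration):

```python
import json
import redis

r = redis.Redis(decode_responses=True)

def get_timeline(user_id: int) -> list:
    key = f"timeline:{user_id}"
    cached = r.lrange(key, 0, -1)             # full cached timeline
    if cached:
        return [json.loads(t) for t in cached]
    tweets = fetch_timeline_from_db(user_id)  # hypothetical DB query
    if tweets:
        pipe = r.pipeline()                   # batch the writes
        pipe.rpush(key, *[json.dumps(t) for t in tweets])
        pipe.expire(key, 300)                 # the 5-minute TTL from above
        pipe.execute()
    return tweets
```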

Architecture:

  • Application servers → Redis cluster (sharded by user ID)
  • Cache miss → Query database → Store in Redis
  • Replication: Each Redis node has replicas for high availability

Impact: 70% of timeline requests served from Redis. Reduced database load by 80%. Timeline load time: 50ms (Redis) vs 500ms (database).

Facebook: News Feed with Memcached

The Challenge: Facebook’s news feed serves personalized content to billions of users and needs only simple, fast key-value caching.

The Solution: Facebook uses Memcached extensively:

  • Scale: Thousands of Memcached servers, petabytes of cached data
  • Data: Feed content, user profiles, friend lists
  • Pattern: Cache-aside with Memcached
  • Sharding: Consistent hashing across Memcached servers

Why Memcached? Facebook needs simple, fast key-value storage. Memcached is optimized for this use case - simpler than Redis, faster for pure caching.

Architecture:

  • Application servers → Memcached pool (consistent hashing)
  • Cache miss → Query database → Store in Memcached
  • No persistence (acceptable - data can be recomputed)

Impact: 99% of feed requests served from Memcached. Database queries reduced by 99%. Feed load time: 10ms (Memcached) vs 200ms (database).

Redis vs Memcached: Instagram’s Use Case


The Challenge: Instagram needs both simple caching and complex data structures for different features.

The Solution: Instagram uses both:

  • Memcached: Simple caching (user sessions, simple counters)
  • Redis: Complex data structures (feed timelines, follower lists, real-time features)

Why Both? Different features have different needs:

  • User sessions: Simple key-value → Memcached (faster, simpler)
  • Feed timelines: Need lists, sorted sets → Redis (richer features)
  • Real-time features: Need pub/sub → Redis (Memcached doesn’t support pub/sub)

Example:

  • User login: Session stored in Memcached (simple, fast)
  • Feed generation: Timeline stored in Redis (needs list operations)
  • Real-time notifications: Pub/sub via Redis (Memcached can’t do this)
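
The pub/sub piece, which Memcached has no equivalent for, might look like this with redis-py (the channel name and message format are illustrative):

```python
import redis

r = redis.Redis(decode_responses=True)

# Publisher: fires a notification event for user 42
r.publish("notifications:42", "new_follower:alice")

# Subscriber: a worker process listening for that user's events
p = r.pubsub()
p.subscribe("notifications:42")
for message in p.listen():          # blocks, yielding messages as they arrive
    if message["type"] == "message":
        print(message["data"])
```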

Impact: Right tool for each job. Optimized performance and cost.

E-commerce: Managed Caching with Amazon ElastiCache

The Challenge: An e-commerce platform needs distributed caching but doesn’t want to manage infrastructure.

The Solution: The platform uses Amazon ElastiCache (managed Redis):

  • Scale: Auto-scales based on load
  • Data: Product catalog, shopping carts, session data
  • Pattern: Read-through with ElastiCache
  • High Availability: Multi-AZ deployment with automatic failover

Why Managed Service? Focus on application logic, not infrastructure. ElastiCache handles scaling, backups, and monitoring.

Architecture:

  • Application → ElastiCache (managed Redis cluster)
  • Automatic failover if primary fails
  • Automatic backups and point-in-time recovery
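
Connecting is the same redis-py code as self-managed Redis, just pointed at the cluster endpoint (the hostname below is a made-up placeholder):

```python
import redis

# ElastiCache exposes a primary endpoint; TLS depends on cluster settings
r = redis.Redis(
    host="my-cache.xxxxxx.ng.0001.use1.cache.amazonaws.com",  # placeholder
    port=6379,
    ssl=True,  # if in-transit encryption is enabled
    decode_responses=True,
)
r.ping()  # verify connectivity
```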

Impact: Reduced operational overhead by 90%. Product pages load 5x faster. Handles Black Friday traffic spikes automatically.

Netflix: Multi-Region Caching

The Challenge: Netflix serves content globally. Users in different regions need low latency.

The Solution: Netflix uses distributed caching across regions:

  • Scale: Redis clusters in each AWS region
  • Data: Video metadata, user preferences, recommendations
  • Pattern: Cache-aside with regional Redis clusters
  • Consistency: Eventual consistency across regions (acceptable for metadata)

Why Multi-Region? Latency matters for video streaming. Users in Asia get data from Asian Redis cluster (20ms) instead of US cluster (200ms).

Architecture:

  • Regional application servers → Regional Redis cluster
  • Cache miss → Regional database → Store in regional Redis
  • Cross-region replication for popular content

Impact: Global latency reduced from 200ms to 20ms average. 95% cache hit rate. Reduced bandwidth costs by 80%.



The Consistency Challenge

Problem: When Server 1 updates cache, how do Servers 2 and 3 know?


Solutions:

  1. Write-Through to Shared Cache

    • All writes go to distributed cache
    • All servers read from same cache
    • Ensures consistency
  2. Cache Invalidation

    • When data updated, invalidate cache
    • Next read fetches fresh data
    • More on this in the next lesson
  3. Short TTL

    • Use short expiration times
    • Accepts eventual consistency
    • Simple but may have stale data
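
A minimal sketch of options 1 and 2 side by side, using redis-py directly (update_user_in_db is a hypothetical database helper):

```python
import redis

cache = redis.Redis(host='localhost', port=6379, decode_responses=True)

def update_user_email(user_id: int, email: str) -> None:
    update_user_in_db(user_id, email)  # hypothetical database write
    key = f"user:{user_id}:email"
    # Option 1 - write-through: put the fresh value in the shared cache
    cache.set(key, email, ex=300)
    # Option 2 - invalidation: delete instead; the next read refetches
    # cache.delete(key)
```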

At the code level, you need to design cache client wrappers that abstract Redis/Memcached:

cache_client.py

```python
from abc import ABC, abstractmethod
from typing import Optional, Any

import redis
import memcache  # python-memcached


class CacheClient(ABC):
    """Abstract interface so application code stays backend-agnostic."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]:
        pass

    @abstractmethod
    def set(self, key: str, value: Any, ttl: int = 300) -> bool:
        pass

    @abstractmethod
    def delete(self, key: str) -> bool:
        pass


class RedisCacheClient(CacheClient):
    def __init__(self, host: str = 'localhost', port: int = 6379):
        self.client = redis.Redis(host=host, port=port, decode_responses=True)

    def get(self, key: str) -> Optional[Any]:
        try:
            return self.client.get(key)
        except redis.RedisError:
            # Handle failure gracefully - treat an unreachable cache as a miss
            return None

    def set(self, key: str, value: Any, ttl: int = 300) -> bool:
        # Note: Redis stores strings/bytes; serialize complex values
        # (e.g. json.dumps) before caching them.
        try:
            return bool(self.client.setex(key, ttl, value))
        except redis.RedisError:
            return False

    def delete(self, key: str) -> bool:
        try:
            return bool(self.client.delete(key))
        except redis.RedisError:
            return False


class MemcachedCacheClient(CacheClient):
    def __init__(self, servers: Optional[list] = None):
        self.client = memcache.Client(servers or ['127.0.0.1:11211'])

    def get(self, key: str) -> Optional[Any]:
        try:
            return self.client.get(key)
        except Exception:
            return None

    def set(self, key: str, value: Any, ttl: int = 300) -> bool:
        try:
            return bool(self.client.set(key, value, time=ttl))
        except Exception:
            return False

    def delete(self, key: str) -> bool:
        try:
            return bool(self.client.delete(key))
        except Exception:
            return False


# Usage - application code doesn't care about the implementation
class UserService:
    def __init__(self, cache: CacheClient):
        self.cache = cache

    def get_user(self, user_id: int):
        # Cache-aside pattern: check cache first, fall back to the database
        cache_key = f"user:{user_id}"
        user = self.cache.get(cache_key)
        if user:
            return user
        # Cache miss - fetch from DB
        user = self._fetch_from_db(user_id)
        # Store in cache for subsequent reads
        if user:
            self.cache.set(cache_key, user, ttl=300)
        return user

    def _fetch_from_db(self, user_id: int):
        # Placeholder for the real database query
        raise NotImplementedError
```
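
Swapping backends is then a constructor-level decision; a quick usage sketch of the classes above:

```python
# The service code is identical for either backend
service = UserService(RedisCacheClient(host='localhost'))
# service = UserService(MemcachedCacheClient(['127.0.0.1:11211']))
user = service.get_user(42)
```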

Important: Don’t create new connections for each request. Use connection pooling:

connection_pool.py

```python
import redis
from redis.connection import ConnectionPool

# Create one connection pool for the whole process
pool = ConnectionPool(
    host='localhost',
    port=6379,
    max_connections=50,  # Max connections in pool
    decode_responses=True
)


# Reuse pool across requests
class CacheService:
    def __init__(self):
        self.redis = redis.Redis(connection_pool=pool)

    def get(self, key: str):
        return self.redis.get(key)
```
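
One caveat worth knowing: when all pooled connections are checked out, redis-py’s default ConnectionPool raises a ConnectionError; its BlockingConnectionPool variant instead makes callers wait (up to a timeout) for a free connection, which often degrades more gracefully under bursty traffic.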

Handle cache failures gracefully:

cache_with_retry.py

```python
import time
from typing import Optional, Any

from cache_client import CacheClient  # the interface defined above


class CacheWithRetry:
    def __init__(self, cache: CacheClient, max_retries: int = 3):
        self.cache = cache
        self.max_retries = max_retries

    def get_with_retry(self, key: str) -> Optional[Any]:
        for attempt in range(self.max_retries):
            try:
                return self.cache.get(key)
            except Exception:
                if attempt == self.max_retries - 1:
                    # Last attempt failed - return None (cache miss)
                    return None
                # Exponential backoff: 1s, 2s, 4s, ...
                time.sleep(2 ** attempt)
        return None
```

For very large caches, shard data across multiple cache nodes:


Sharding Strategy:

  • Hash key to determine which shard
  • Distribute load across nodes
  • Each shard handles subset of keys
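
To make the “hash key to determine which shard” idea concrete, here is a minimal consistent-hash ring sketch (illustrative only; in practice many cache client libraries handle sharding for you):

```python
import bisect
import hashlib

class HashRing:
    """Maps keys to nodes; adding/removing a node only remaps nearby keys."""

    def __init__(self, nodes, vnodes: int = 100):
        self.ring = []                 # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):    # virtual nodes smooth the distribution
                h = self._hash(f"{node}#{i}")
                self.ring.append((h, node))
        self.ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key: str) -> str:
        # First ring position clockwise from the key's hash
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]

ring = HashRing(["cache-1:6379", "cache-2:6379", "cache-3:6379"])
print(ring.get_node("user:42"))  # the same key always routes to the same shard
```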

🌐 Shared Cache

Distributed caching provides a shared cache accessible by all application servers.

🔴 Redis vs Memcached

Redis = feature-rich, Memcached = simple and fast. Choose based on needs.

🏗️ Abstract Implementation

Design cache interfaces that abstract Redis/Memcached. Makes switching easier.

🔌 Connection Pooling

Always use connection pooling. Don’t create connections per request.