
Cache Eviction Policies

When the cache is full, what gets kicked out?

Imagine your cache is a parking lot with 100 spaces. When it’s full and a new car arrives, you need to decide: which car leaves?

That’s cache eviction: deciding what to remove when the cache is full.


The goal is to maximize the cache hit rate: keep the data you’ll access soon, and evict the data you won’t need.


LRU (Least Recently Used)

The most popular policy. LRU evicts the item that hasn’t been accessed for the longest time.

Think of it like a stack of books on your desk. When you use a book, you put it on top. When your desk is full, you remove books from the bottom (least recently used).


Algorithm:

  1. When item accessed → move to front (most recent)
  2. When cache full → remove from back (least recent)
  3. New items added to front

When to use:

  • Most common use case
  • Temporal locality (recently used = likely to be used again)
  • Web applications, API responses
  • General-purpose caching

Trade-offs:

  • Simple to understand
  • Works well for most access patterns
  • O(1) operations with proper implementation
  • Doesn’t consider frequency (one-hit wonders stay)
  • Can evict frequently used items if not accessed recently

This is a classic interview problem. Here’s how to implement it efficiently:

lru_cache.py
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        # OrderedDict maintains insertion order:
        # most recent at the end, least recent at the beginning
        self.cache = OrderedDict()

    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        # Move to end (most recent)
        self.cache.move_to_end(key)
        return self.cache[key]

    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            # Update existing
            self.cache.move_to_end(key)
        else:
            # Check if full
            if len(self.cache) >= self.capacity:
                # Evict least recent (first item)
                self.cache.popitem(last=False)
        self.cache[key] = value
        # Ensure it's at the end (most recent)
        self.cache.move_to_end(key)

Explicit implementation (HashMap + Doubly Linked List, the classic way to get O(1) operations without relying on OrderedDict):

lru_cache_optimized.py
class Node:
    def __init__(self, key=0, value=0):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = {}  # key -> node
        # Dummy head and tail for easier operations
        self.head = Node()
        self.tail = Node()
        self.head.next = self.tail
        self.tail.prev = self.head

    def _add_node(self, node: Node):
        # Add right after head (most recent position)
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node

    def _remove_node(self, node: Node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _move_to_head(self, node: Node):
        self._remove_node(node)
        self._add_node(node)

    def _pop_tail(self) -> Node:
        # The node just before the dummy tail is the least recently used
        last = self.tail.prev
        self._remove_node(last)
        return last

    def get(self, key: int) -> int:
        node = self.cache.get(key)
        if not node:
            return -1
        self._move_to_head(node)
        return node.value

    def put(self, key: int, value: int):
        node = self.cache.get(key)
        if not node:
            if len(self.cache) >= self.capacity:
                tail = self._pop_tail()
                del self.cache[tail.key]
            node = Node(key, value)
            self.cache[key] = node
            self._add_node(node)      # link the new node at the head
        else:
            node.value = value
            self._move_to_head(node)  # refresh recency of the existing node

LFU (Least Frequently Used)

Frequency-based eviction. LFU evicts the item with the lowest access count.

Think of it like a popularity contest. Items with more “votes” (accesses) stay longer. When cache is full, the least popular item leaves.


Algorithm:

  1. Track access count for each item
  2. When item accessed → increment count
  3. When cache full → evict item with lowest count
  4. On tie, use LRU as tiebreaker

When to use:

  • Items accessed many times (popular products, trending content)
  • Frequency matters more than recency
  • E-commerce product catalogs
  • Content recommendation systems

Trade-offs:

  • Keeps frequently accessed items
  • Good for stable access patterns
  • New items start with low counts, so they can be evicted before they have a chance to become popular
  • More complex than LRU
  • Frequency tracking adds memory and bookkeeping overhead
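
Mirroring the LRU code above, here is a minimal LFU sketch (illustrative, not from the article; the class and file name are made up). Access counts live in a plain dict, and ties on the lowest count are broken toward the least recently used key via OrderedDict iteration order.

lfu_cache.py
from collections import OrderedDict


class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()  # key -> value, kept in recency order
        self.counts = {}            # key -> access count

    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        self.counts[key] += 1
        self.cache.move_to_end(key)  # recency is only used to break ties
        return self.cache[key]

    def put(self, key: int, value: int) -> None:
        if self.capacity <= 0:
            return
        if key not in self.cache and len(self.cache) >= self.capacity:
            # Evict the lowest count; iteration order breaks ties by least recent use
            victim = min(self.cache, key=lambda k: self.counts[k])
            del self.cache[victim]
            del self.counts[victim]
        self.cache[key] = value
        self.cache.move_to_end(key)
        self.counts[key] = self.counts.get(key, 0) + 1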

FIFO (First In, First Out)

Simple queue-based eviction. FIFO evicts the oldest item, regardless of usage.

Like a queue at a store - first in, first out. When cache is full, the oldest item leaves, even if it was just accessed.


Algorithm:

  1. Items added to end of queue
  2. When cache full → remove from front (oldest)
  3. Access doesn’t change position

When to use:

  • Simple implementation needed
  • Access patterns are random
  • Items have equal importance
  • Rarely used in practice (ignores access patterns)

Trade-offs:

  • Very simple to implement
  • O(1) operations
  • Ignores access patterns
  • Can evict frequently used items
  • Poor cache hit rate
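
A minimal FIFO sketch, assuming an OrderedDict is acceptable (illustrative, not from the article): keys stay in insertion order, reads never reorder anything, and eviction pops the oldest entry.

fifo_cache.py
from collections import OrderedDict


class FIFOCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()  # keys kept in insertion order

    def get(self, key: int) -> int:
        # Unlike LRU, a read does not change the item's position
        return self.cache.get(key, -1)

    def put(self, key: int, value: int) -> None:
        if key not in self.cache and len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)  # evict the oldest insertion
        self.cache[key] = value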

TTL (Time To Live)

Time-based expiration. Items expire after a fixed time period, regardless of usage.

Like milk with an expiration date. After a certain time, items are automatically removed, regardless of usage.


Algorithm:

  1. Each item has expiration timestamp
  2. Background process checks for expired items
  3. Expired items removed automatically
  4. Often combined with LRU/LFU for eviction when full

When to use:

  • Data has natural expiration (API responses, sessions)
  • Staleness matters (news, stock prices)
  • Combined with other policies
  • Simple freshness guarantee

Trade-offs:

  • Automatic freshness
  • Simple concept
  • Good for time-sensitive data
  • Doesn’t consider access patterns
  • Background cleanup overhead
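
A minimal TTL sketch (illustrative, not from the article). Instead of a background cleanup process, this version expires entries lazily when they are read; a production cache would typically pair it with LRU or LFU for capacity eviction, as noted above.

ttl_cache.py
import time


class TTLCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.cache = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.cache.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.cache[key]  # lazy expiration: drop the stale entry on access
            return None
        return value

    def put(self, key, value):
        self.cache[key] = (value, time.monotonic() + self.ttl)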

Different eviction policies are used by major companies based on their specific access patterns:

The Challenge: Google caches billions of search results. Recent searches are more likely to be repeated than old ones.

The Solution: Google uses LRU for search result caching:

  • Most recent searches stay in cache
  • Old searches evicted when cache is full
  • Perfect for temporal locality (users often repeat recent searches)

Why LRU? Search patterns show strong temporal locality. If someone searched “weather” 5 minutes ago, they’re likely to search it again soon. LRU keeps recent results hot.

Example: Cache size: 1 million results. User searches “Python tutorial” → cached. 10 minutes later, searches again → instant from cache. After 1 hour of no access → evicted to make room for newer searches.

Impact: 60% of searches served from cache. Average response time: 10ms (cache) vs 100ms (database).

The Challenge: Amazon has millions of products, but only thousands are popular. Popular products (iPhone, bestsellers) are accessed constantly.

The Solution: Amazon uses LFU for product catalog:

  • Frequently accessed products stay in cache
  • One-hit wonder products evicted quickly
  • Keeps bestsellers and trending items hot

Why LFU? Product access patterns show frequency matters more than recency. iPhone 15 is accessed thousands of times per day - it should stay in cache. A random obscure product accessed once should be evicted.

Example: Cache size: 100,000 products. iPhone 15 accessed 10,000 times/day → stays in cache. Random product accessed once → evicted quickly.

Impact: 80% cache hit rate for popular products. Product pages load instantly for bestsellers.

The Challenge: Tweets are time-sensitive. A tweet from 1 hour ago is less relevant than a tweet from 1 minute ago.

The Solution: Twitter uses TTL-based eviction:

  • Recent tweets cached for 5 minutes
  • Older tweets expire automatically
  • Combined with LRU for eviction when full

Why TTL? Tweets have natural expiration. A tweet from yesterday is less likely to be accessed than a tweet from 5 minutes ago. TTL ensures freshness.

Example: Tweet created at 10:00 AM. Cached until 10:05 AM. After 10:05 AM, expires. If cache is full before expiration, LRU evicts least recently accessed tweets.

Impact: 70% of timeline requests served from cache. Reduces database load during peak events (viral tweets).

The Challenge: Netflix has different content types: popular shows (accessed frequently), new releases (accessed recently), old content (rarely accessed).

The Solution: Netflix uses hybrid policy:

  • LFU for popular shows (Stranger Things, The Crown) - accessed frequently
  • LRU for new releases - accessed recently
  • TTL for trending content - time-sensitive

Why Hybrid? Different content has different access patterns. Popular shows benefit from LFU, new releases from LRU, trending from TTL.

Example:

  • Stranger Things (popular): LFU keeps it in cache (accessed 1000x/day)
  • New movie release: LRU keeps it in cache (accessed recently)
  • Trending show: TTL expires after 1 hour (trends change quickly)

Impact: 85% cache hit rate. Content loads instantly for popular shows, reducing bandwidth costs.

The Challenge: Redis needs to evict keys when memory is full, but checking all keys for LRU is expensive.

The Solution: Redis uses approximate LRU:

  • Samples random keys (5 keys by default)
  • Evicts least recently used from sample
  • Fast and efficient for large caches

Why Sampling? Maintaining an exact LRU ordering over millions of keys costs extra memory and bookkeeping, and scanning every key at eviction time would be O(n). Sampling a handful of keys is O(1) per eviction, and the approximation is good enough for most use cases.

Example: Cache has 1 million keys. Memory full. Redis samples 5 random keys, evicts the least recently used one. Fast and efficient.

Impact: Eviction overhead: O(1) instead of O(n). Handles millions of keys efficiently.
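
A toy sketch in the spirit of this sampling idea (this is not Redis code; the class and field names are made up): when the cache is full, it inspects a small random sample of keys and evicts the least recently used key within that sample.

approx_lru.py
import random
import time


class SampledLRUCache:
    def __init__(self, capacity: int, sample_size: int = 5):
        self.capacity = capacity
        self.sample_size = sample_size
        self.data = {}         # key -> value
        self.last_access = {}  # key -> time of last access

    def get(self, key):
        if key not in self.data:
            return None
        self.last_access[key] = time.monotonic()
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            # Instead of scanning every key, inspect a small random sample
            # and evict the least recently used key within that sample.
            sample = random.sample(list(self.data), min(self.sample_size, len(self.data)))
            victim = min(sample, key=lambda k: self.last_access[k])
            del self.data[victim]
            del self.last_access[victim]
        self.data[key] = value
        self.last_access[key] = time.monotonic()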



Policy   Complexity   Hit Rate   Use Case           Implementation
LRU      Medium       High       General purpose    HashMap + Doubly Linked List
LFU      High         High       Frequency-based    HashMap + frequency tracking
FIFO     Low          Low        Simple cases       Queue
TTL      Low          Medium     Time-sensitive     Timestamp tracking

At the code level, eviction policies are strategies you can swap:

eviction_strategy.py
from abc import ABC, abstractmethod
from collections import OrderedDict
from typing import Any

class EvictionStrategy(ABC):
    def record_access(self, key: Any) -> None:
        """Hook called on every access; default is a no-op."""

    @abstractmethod
    def evict(self, cache: OrderedDict) -> Any:
        """Return the key to evict."""

class LRUStrategy(EvictionStrategy):
    def evict(self, cache: OrderedDict) -> Any:
        # Least recently used key is first (Cache moves keys to the end on access)
        return next(iter(cache))

class LFUStrategy(EvictionStrategy):
    def __init__(self):
        self.access_counts = {}

    def record_access(self, key: Any) -> None:
        self.access_counts[key] = self.access_counts.get(key, 0) + 1

    def evict(self, cache: OrderedDict) -> Any:
        # Least frequently used key has the smallest access count
        key = min(cache, key=lambda k: self.access_counts.get(k, 0))
        self.access_counts.pop(key, None)
        return key

class Cache:
    def __init__(self, capacity: int, strategy: EvictionStrategy):
        self.capacity = capacity
        self.strategy = strategy
        self.cache = OrderedDict()

    def get(self, key: Any) -> Any:
        if key not in self.cache:
            return None
        self.cache.move_to_end(key)       # keep recency order for LRUStrategy
        self.strategy.record_access(key)  # keep counts for LFUStrategy
        return self.cache[key]

    def put(self, key: Any, value: Any) -> None:
        if key not in self.cache and len(self.cache) >= self.capacity:
            del self.cache[self.strategy.evict(self.cache)]
        self.cache[key] = value
        self.cache.move_to_end(key)
        self.strategy.record_access(key)

Most production systems use combinations:

  • LRU + TTL: Evict by recency, but also expire old items
  • LFU + LRU: Use frequency, tiebreak by recency
  • Adaptive: Switch policies based on access patterns
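
As a concrete example of the LRU + TTL combination, here is a hedged sketch (illustrative, not from the article): every entry carries an expiration timestamp, expired entries are dropped lazily on read, and the least recently used entry is evicted when the cache is full.

lru_ttl_cache.py
import time
from collections import OrderedDict


class LRUTTLCache:
    def __init__(self, capacity: int, ttl_seconds: float):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.cache = OrderedDict()  # key -> (value, expires_at), recency-ordered

    def get(self, key):
        entry = self.cache.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.cache[key]        # TTL: drop the stale entry
            return None
        self.cache.move_to_end(key)    # LRU: mark as most recently used
        return value

    def put(self, key, value):
        if key not in self.cache and len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)  # LRU eviction when full
        self.cache[key] = (value, time.monotonic() + self.ttl)
        self.cache.move_to_end(key)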

Cache warming: pre-populate the cache with data that is likely to be accessed:

  • Popular products
  • User profiles
  • Frequently accessed content
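
A minimal warming sketch (illustrative; the cache object, the list of popular keys, and the load callable are assumptions standing in for your own data layer):

cache_warming.py
from typing import Any, Callable, Iterable


def warm_cache(cache, popular_keys: Iterable[Any], load: Callable[[Any], Any]) -> None:
    # Pre-populate the cache before traffic arrives (e.g. at service startup)
    for key in popular_keys:
        cache.put(key, load(key))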

Monitoring: track these metrics to understand how well your cache is working:

  • Hit rate: % of requests served from cache
  • Eviction rate: How often eviction happens
  • Access patterns: Understand your workload
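
A minimal counter sketch for tracking these numbers (illustrative, not from the article):

cache_metrics.py
class CacheMetrics:
    """Tracks hits, misses, and evictions so hit rate can be monitored."""

    def __init__(self):
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def record_hit(self) -> None:
        self.hits += 1

    def record_miss(self) -> None:
        self.misses += 1

    def record_eviction(self) -> None:
        self.evictions += 1

    @property
    def hit_rate(self) -> float:
        # Fraction of lookups served from the cache
        total = self.hits + self.misses
        return self.hits / total if total else 0.0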

🎯 LRU is the Default

Use LRU for most cases. It’s simple, effective, and handles temporal locality well.

📊 Frequency vs Recency

LRU = recency, LFU = frequency. Choose based on your access patterns.

⚡ O(1) Operations

A proper LRU implementation uses a HashMap + Doubly Linked List for O(1) get/put.

🔄 Strategy Pattern

Make the eviction policy swappable by using the Strategy pattern in your code.