Scalability Fundamentals
What is Scalability?
Scalability is a system’s ability to handle increased load by adding resources. A scalable system can grow to accommodate more users, more data, or more transactions without degrading performance.
Two Approaches to Scaling
Vertical Scaling (Scale Up)
Add more power to existing machines: bigger CPU, more RAM, faster disks.
Pros:
- ✅ Simple - usually no code changes needed
- ✅ No distributed system complexity
- ✅ Strong consistency is easy
Cons:
- ❌ Hardware limits (can’t add infinite CPU)
- ❌ Expensive at high end
- ❌ Single point of failure
- ❌ Downtime during upgrades
Horizontal Scaling (Scale Out)
Add more machines and distribute the load across multiple servers.
Pros:
- ✅ No single-machine hardware ceiling (keep adding machines as load grows)
- ✅ Cost-effective (use commodity hardware)
- ✅ Built-in redundancy
- ✅ Gradual scaling
Cons:
- ❌ Distributed system complexity
- ❌ Data consistency challenges
- ❌ Code must be designed for it
- ❌ Network overhead
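To make "distribute the load across multiple servers" concrete, here is a minimal sketch of round-robin distribution, one common way to spread requests. The `RoundRobinBalancer` class and the server names are hypothetical and exist only for illustration; a real deployment would typically use a dedicated load balancer rather than application code.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hypothetical helper: hands out servers in a fixed rotation."""

    def __init__(self, servers: list[str]):
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def next_server(self) -> str:
        # Each call returns the next server in the rotation.
        return next(self._cycle)


# Illustrative server names, not from any real system.
balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])

# Route six incoming requests and count how many each server received.
assignments = [balancer.next_server() for _ in range(6)]
per_server = Counter(assignments)
print(per_server)
```

With three servers and six requests, each server receives two requests. Adding a fourth server to the list is all it takes to spread the same traffic more thinly, which is the "gradual scaling" advantage listed above.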
Comparison: Vertical vs Horizontal
| Aspect | Vertical Scaling | Horizontal Scaling |
|---|---|---|
| Approach | Bigger machine | More machines |
| Limit | Hardware ceiling | Theoretically unlimited |
| Cost | Expensive at scale | Cost-effective |
| Complexity | Simple | Complex |
| Downtime | Required for upgrades | Zero-downtime possible |
| Failure | Single point of failure | Redundancy built-in |
| Code changes | Usually none | May require redesign |
What Makes Code Horizontally Scalable?
This is where low-level design (LLD) meets high-level design (HLD). Your class design determines whether your system can scale horizontally.
The Problem: Stateful Services
Section titled “The Problem: Stateful Services”1class ShoppingCartService:2 """❌ NOT horizontally scalable - stores state in memory"""3
4 def __init__(self):5 self.carts = {} # user_id -> cart items6
7 def add_item(self, user_id: str, item: str):8 if user_id not in self.carts:9 self.carts[user_id] = []10 self.carts[user_id].append(item)11
12 def get_cart(self, user_id: str) -> list:13 return self.carts.get(user_id, [])14
15# Problem: If user's next request goes to a different server,16# their cart is empty!1import java.util.*;2
3public class ShoppingCartService {4 // ❌ NOT horizontally scalable - stores state in memory5 private Map<String, List<String>> carts = new HashMap<>();6
7 public void addItem(String userId, String item) {8 carts.computeIfAbsent(userId, k -> new ArrayList<>()).add(item);9 }10
11 public List<String> getCart(String userId) {12 return carts.getOrDefault(userId, Collections.emptyList());13 }14}15
16// Problem: If user's next request goes to a different server,17// their cart is empty!The Solution: Stateless Services
The solution is to externalize all state to a shared store (Redis, database, etc.):
- No instance variables holding user/session data
- All state lives externally in Redis, database, or similar
- Any server can handle any request because they all access the same shared state
```python
class ShoppingCartService:
    """✅ Horizontally scalable - no local state"""

    def __init__(self, redis_client):
        self.redis = redis_client  # External storage

    def add_item(self, user_id: str, item: str) -> None:
        # Append the item to the user's cart list in Redis
        self.redis.rpush(f"cart:{user_id}", item)

    def get_cart(self, user_id: str) -> list:
        # Read the whole list (index 0 through the last element)
        return self.redis.lrange(f"cart:{user_id}", 0, -1)

# Any server can handle any request!
```

```java
import java.util.List;
import redis.clients.jedis.Jedis;

public class ShoppingCartService {
    private final Jedis redis; // External storage

    public ShoppingCartService(Jedis redis) { this.redis = redis; }

    public void addItem(String userId, String item) {
        // Append the item to the user's cart list in Redis
        redis.rpush("cart:" + userId, item);
    }

    public List<String> getCart(String userId) {
        // Read the whole list (index 0 through the last element)
        return redis.lrange("cart:" + userId, 0, -1);
    }
}
// Any server can handle any request!
```

Scalability Dimensions
Systems can be scaled along different dimensions:
1. Load Scalability
Handle more requests per second
2. Data Scalability
Handle more data
3. Geographic Scalability
Serve users globally with low latency
Design Patterns for Scalability
Pattern 1: Stateless Services
Key Principle: All state lives externally (database, cache, message queue). The service itself stores nothing between requests.
This allows you to run any number of service instances and route requests to any of them.
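The following sketch illustrates that claim with two service instances backed by one shared store. `InMemoryStore` is a hypothetical stand-in for an external system such as Redis, used here only so the example runs without a server; `CartService` and the user and item names are likewise illustrative.

```python
class InMemoryStore:
    """Hypothetical stand-in for an external store such as Redis."""

    def __init__(self):
        self._lists: dict[str, list[str]] = {}

    def rpush(self, key: str, value: str) -> None:
        # Append a value to the list stored under the key.
        self._lists.setdefault(key, []).append(value)

    def read_all(self, key: str) -> list[str]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._lists.get(key, []))


class CartService:
    """Holds only a reference to shared storage, never per-user data."""

    def __init__(self, store: InMemoryStore):
        self._store = store

    def add_item(self, user_id: str, item: str) -> None:
        self._store.rpush(f"cart:{user_id}", item)

    def get_cart(self, user_id: str) -> list[str]:
        return self._store.read_all(f"cart:{user_id}")


# Two instances, as if running on two different servers.
shared_store = InMemoryStore()
instance_1 = CartService(shared_store)
instance_2 = CartService(shared_store)

# Consecutive requests from the same user land on different instances.
instance_1.add_item("alice", "book")
instance_2.add_item("alice", "pen")

cart_seen_by_1 = instance_1.get_cart("alice")
cart_seen_by_2 = instance_2.get_cart("alice")
print(cart_seen_by_1, cart_seen_by_2)
```

Both instances see the same two-item cart, because neither one holds the cart itself. In a real deployment the store would live in a separate process or cluster so that instances on different machines can reach it.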
Pattern 2: Idempotent Operations
Key Principle: Design operations so they can be safely retried: repeating a request has the same effect as performing it once. This is critical in distributed systems, where network failures cause retries.
Implementation: Store results keyed by a unique idempotency key. Before processing, check whether the key exists; if it does, return the stored result instead of processing again.
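A minimal sketch of this check-then-store flow might look as follows. The `IdempotentExecutor` class, the `charge_card` function, and the `"order-42"` key are all hypothetical. The results live in a local dictionary purely to keep the example self-contained; across multiple servers they would need to live in a shared store, in line with the stateless pattern above.

```python
import threading
from typing import Callable


class IdempotentExecutor:
    """Hypothetical helper: runs an operation at most once per key."""

    def __init__(self):
        # Assumption: a local dict stands in for a shared result store.
        self._results: dict[str, object] = {}
        self._lock = threading.Lock()

    def execute(self, idempotency_key: str, operation: Callable[[], object]) -> object:
        # The lock makes check-and-store atomic within this process.
        # It also serializes all operations, which is acceptable for a sketch.
        with self._lock:
            if idempotency_key in self._results:
                # Retry detected: return the stored result, skip the work.
                return self._results[idempotency_key]
            result = operation()
            self._results[idempotency_key] = result
            return result


charges: list[int] = []  # records every time the card is actually charged


def charge_card() -> str:
    charges.append(500)
    return f"receipt-{len(charges)}"


executor = IdempotentExecutor()
first = executor.execute("order-42", charge_card)
retry = executor.execute("order-42", charge_card)  # e.g. client retried after a timeout
print(first, retry, charges)
```

The retry returns the same receipt as the first call, and the card is charged only once. This sketch does not store failures: if `operation` raises, nothing is recorded and a retry runs it again, which may or may not be the behavior a given system wants.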
Pattern 3: Async Processing
Key Principle: Move slow, non-critical operations out of the request path using message queues.
Result: Users get fast responses. Slow operations (email, analytics, notifications) happen in the background without blocking.
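As a rough illustration, the sketch below uses Python's standard `queue.Queue` and a background thread as a stand-in for a real message queue and worker process. The `handle_signup` and `email_worker` functions and the email address are hypothetical; appending to a list stands in for the slow work of actually sending an email.

```python
import queue
import threading

# Assumption: an in-process queue stands in for a real message broker.
email_queue: queue.Queue = queue.Queue()
sent_emails: list[str] = []


def email_worker() -> None:
    """Background worker: drains the queue until it sees the None sentinel."""
    while True:
        address = email_queue.get()
        if address is None:
            email_queue.task_done()
            break
        sent_emails.append(address)  # stands in for slow email delivery
        email_queue.task_done()


def handle_signup(address: str) -> str:
    """Request handler: enqueue the slow work and respond immediately."""
    email_queue.put(address)
    return "signup accepted"


worker = threading.Thread(target=email_worker, daemon=True)
worker.start()

response = handle_signup("user@example.com")

# Demo only: wait for the background work, then shut the worker down.
email_queue.join()
email_queue.put(None)
worker.join()
print(response, sent_emails)
```

The handler returns as soon as the message is enqueued, so its response time does not depend on how long the email takes. An in-process queue loses its contents if the process dies, which is one reason production systems tend to use a durable external queue instead.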
Scalability Checklist for LLD
When designing classes, ask yourself:
| Question | Why It Matters |
|---|---|
| Does this class store state in instance variables? | Prevents horizontal scaling |
| Can multiple instances run simultaneously? | Required for scaling out |
| Are operations idempotent? | Enables safe retries |
| What happens if this operation is slow? | May need async processing |
| Does this depend on local resources (files, memory)? | Won’t work across servers |
| How does this handle concurrent requests? | Thread safety concerns |
Key Takeaways
What’s Next?
Understanding scalability is just the beginning. Next, we’ll dive into measuring system performance:
Next up: Latency and Throughput - Learn the key metrics that define system performance.