LowLevelDesign Mastery

Why System Design Matters

From single class to global scale

You’ve written a beautiful class. It’s well designed, follows SOLID principles, and has great test coverage. But software doesn’t run in isolation: it runs on servers, serves thousands of users, and must stay up 24/7.


System design is the process of defining the architecture, components, and data flow of a system to meet specific requirements. It’s about making decisions that affect:

  • How your code runs - On one server or thousands?
  • How data flows - Synchronous or asynchronous?
  • How failures are handled - What happens when things break?
  • How the system scales - Can it handle 10x more users?

| Aspect | High-Level Design (HLD) | Low-Level Design (LLD) |
| --- | --- | --- |
| Focus | System architecture | Class structure |
| Scope | Multiple services | Single service/module |
| Artifacts | Architecture diagrams | Class diagrams |
| Decisions | Which database? How many servers? | Which pattern? What interface? |
| Scale | Millions of users | Thousands of objects |

Every class you write will eventually run in a system with:

2. Design Decisions Have System Implications


Every LLD decision affects the system:

| LLD Decision | System Implication |
| --- | --- |
| Using the Singleton pattern | Won’t work across multiple servers (each process gets its own instance) |
| Storing state in instance variables | Can’t scale horizontally |
| Synchronous method calls | Create coupling and block resources |
| In-memory caching | Each server has a different cache |
| Auto-increment IDs | Conflict in distributed databases |
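
To make the auto-increment ID problem concrete, here is a minimal sketch. `AutoIncrementNode` is a hypothetical stand-in for a database node with its own local counter, not a real database API. Two such nodes hand out the same IDs, while random UUIDs need no coordination.

```python
import itertools
import uuid


class AutoIncrementNode:
    """Hypothetical database node that issues IDs from a local counter."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)

    def new_id(self) -> int:
        return next(self._next_id)


# Two independent nodes, each unaware of the other.
node_a, node_b = AutoIncrementNode(), AutoIncrementNode()
ids_a = [node_a.new_id() for _ in range(3)]
ids_b = [node_b.new_id() for _ in range(3)]
collisions = set(ids_a) & set(ids_b)  # both nodes issued 1, 2 and 3

# Random UUIDs can be generated on any node without coordination.
uuids = [uuid.uuid4() for _ in range(6)]
print("colliding IDs:", sorted(collisions))
print("unique UUIDs:", len(set(uuids)))
```

UUIDs are only one option; the point is that ID generation must not depend on state local to one server.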

In senior engineering interviews, expect questions like:

Diagram

Every system design discussion involves these key concerns:

Can the system handle growth?


LLD Impact: Design classes that can work in a distributed environment. Avoid global state, use dependency injection, make components stateless where possible.

Does the system work correctly, even when things fail?

  • Hardware fails (servers crash, disks die)
  • Software has bugs
  • Networks are unreliable
  • Users make mistakes

LLD Impact: Implement proper error handling, use retry patterns, design for idempotency.
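
As an illustration of retries combined with idempotency, the sketch below uses a hypothetical `PaymentService` whose first response is lost after the work has already been done. The `retry` helper and the idempotency-key scheme are assumptions made for this example, not a specific library's API.

```python
import time
import uuid
from typing import Callable, Dict


class PaymentService:
    """Hypothetical service that remembers which requests it has already applied."""

    def __init__(self) -> None:
        self.balance = 0
        self._processed: Dict[str, int] = {}
        self._calls = 0

    def charge(self, idempotency_key: str, amount: int) -> int:
        self._calls += 1
        if idempotency_key in self._processed:
            # Duplicate request: return the stored result, do not charge again.
            return self._processed[idempotency_key]
        self.balance += amount
        self._processed[idempotency_key] = self.balance
        if self._calls == 1:
            # Simulate a lost response: the charge happened, the caller never hears back.
            raise ConnectionError("response lost")
        return self.balance


def retry(operation: Callable[[], int], attempts: int = 3, base_delay: float = 0.01) -> int:
    """Retry on ConnectionError with exponential backoff between attempts."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("attempts must be at least 1")


service = PaymentService()
key = str(uuid.uuid4())  # one key per logical request, reused on every retry
result = retry(lambda: service.charge(key, 100))
print(result, service.balance)
```

Without the idempotency key, the retry would charge the account twice; with it, the retry is safe even though the first attempt partially succeeded.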

Is the system accessible when users need it?

  • 99.9% uptime = 8.76 hours downtime/year
  • 99.99% uptime = 52.6 minutes downtime/year
  • 99.999% uptime = 5.26 minutes downtime/year
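
The downtime figures above follow directly from the availability percentage. A short sketch of the arithmetic, assuming a 365-day year:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Allowed downtime per year for a given availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```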

LLD Impact: Design classes with fallback behaviors, implement circuit breakers, handle graceful degradation.
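
A circuit breaker with a fallback can be sketched in a few lines. This is a simplified, single-threaded illustration; the class name, thresholds, and the `flaky_recommendations` function are assumptions made for the example.

```python
import time
from typing import Callable, List, Optional


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; allows a trial call after `reset_after` seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, operation: Callable[[], str], fallback: Callable[[], str]) -> str:
        if self._opened_at is not None:
            if time.monotonic() - self._opened_at < self.reset_after:
                return fallback()  # open: skip the failing dependency entirely
            self._opened_at = None  # half-open: let one trial call through
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = time.monotonic()
            return fallback()  # degrade gracefully instead of propagating the error
        self._failures = 0
        return result


attempts: List[int] = []


def flaky_recommendations() -> str:
    attempts.append(1)
    raise TimeoutError("recommendation service down")


breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
responses = [breaker.call(flaky_recommendations, lambda: "popular items") for _ in range(5)]
print(responses, "real calls made:", len(attempts))
```

After two failures the breaker opens, so the remaining three requests are served from the fallback without touching the failing dependency.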

Can the system be easily modified and operated?

  • New features can be added
  • Bugs can be fixed quickly
  • Operations are simple
  • System is observable

LLD Impact: Follow SOLID principles, write clean code, use design patterns appropriately.

Does the system respond quickly and efficiently?

  • Low latency (fast responses)
  • High throughput (many requests)
  • Efficient resource usage

LLD Impact: Choose appropriate data structures, optimize algorithms, minimize unnecessary operations.


Example 1: Twitter’s Tweet Counter Evolution


Company: Twitter (now X)

Scenario: Twitter needs to display view counts, like counts, and retweet counts for billions of tweets. Initially, they used in-memory counters, but this failed at scale.

Implementation: The design evolved from naive in-memory counters to Redis-based distributed counters.

Why This Matters:

  • Scale: Billions of tweets, millions of interactions per second
  • Consistency: Users expect accurate counts
  • Performance: Counts must load instantly
  • Result: Redis-based distributed counters handle millions of increments per second

Real-World Impact:

  • Throughput: Millions of counter increments per second
  • Latency: Sub-millisecond counter updates
  • Consistency: All users see the same counts globally

Example 2: Instagram’s Photo View Counter


Company: Instagram (Meta)

Scenario: Instagram displays view counts on photos and videos. With billions of photos and millions of views per second, they need a scalable counting system.

Implementation: Distributed counters sharded across multiple Redis instances.

Why Sharding?

  • Scale: Distributes load across multiple Redis instances
  • Capacity: Each shard handles a subset of photos
  • Performance: Parallel processing increases throughput
  • Result: Handles billions of views with low latency
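
The routing idea behind sharding can be sketched as follows. This is an illustration only, not Instagram's implementation: plain dictionaries stand in for separate Redis instances, and `ShardedCounter` is a hypothetical name.

```python
import hashlib
from typing import Dict, List


class ShardedCounter:
    """Routes each key to one of several stores; dicts stand in for Redis instances."""

    def __init__(self, num_shards: int) -> None:
        self._shards: List[Dict[str, int]] = [{} for _ in range(num_shards)]

    def _shard_for(self, key: str) -> int:
        # Use a stable hash: the built-in hash() of a str is salted per process,
        # so different servers would disagree about where a key lives.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def increment(self, key: str) -> int:
        shard = self._shards[self._shard_for(key)]
        shard[key] = shard.get(key, 0) + 1
        return shard[key]

    def get(self, key: str) -> int:
        return self._shards[self._shard_for(key)].get(key, 0)

    def keys_per_shard(self) -> List[int]:
        return [len(shard) for shard in self._shards]


counter = ShardedCounter(num_shards=4)
for photo_id in range(1000):
    counter.increment(f"photo:{photo_id}")
counter.increment("photo:42")
print(counter.get("photo:42"), counter.keys_per_shard())
```

Simple modulo routing moves most keys when the number of shards changes; consistent hashing is a common refinement for that case.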

Real-World Impact:

  • Scale: Billions of photos, trillions of views
  • Performance: < 1ms counter increment latency
  • Availability: 99.99% uptime despite massive scale

Example 3: YouTube’s View Counter System


Company: Google (YouTube)

Scenario: YouTube tracks view counts for billions of videos. The system must handle massive spikes during viral videos while maintaining accuracy.

Implementation: A hybrid approach that batches view events before writing them to the database.

Why Batching?

  • Efficiency: Reduces database writes by 100x
  • Performance: Handles traffic spikes gracefully
  • Accuracy: Eventually consistent, acceptable for views
  • Result: Handles viral video traffic spikes
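
A minimal sketch of write batching, assuming a hypothetical `BatchedViewCounter` that aggregates views in memory and hands each batch to a caller-supplied write function. It illustrates the technique and is not YouTube's actual pipeline.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class BatchedViewCounter:
    """Buffers view events in memory and writes aggregated totals in batches."""

    def __init__(self, write: Callable[[Dict[str, int]], None], batch_size: int = 100) -> None:
        self._write = write
        self._batch_size = batch_size
        self._pending: DefaultDict[str, int] = defaultdict(int)
        self._buffered = 0

    def record_view(self, video_id: str) -> None:
        self._pending[video_id] += 1
        self._buffered += 1
        if self._buffered >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if not self._pending:
            return
        batch = dict(self._pending)
        self._pending = defaultdict(int)
        self._buffered = 0
        self._write(batch)  # one aggregated write instead of one write per view


database: DefaultDict[str, int] = defaultdict(int)
writes: List[Dict[str, int]] = []


def write_batch(batch: Dict[str, int]) -> None:
    writes.append(batch)
    for video_id, delta in batch.items():
        database[video_id] += delta


counter = BatchedViewCounter(write_batch, batch_size=100)
for _ in range(1000):
    counter.record_view("viral-video")
counter.flush()  # write whatever is still buffered
print(len(writes), database["viral-video"])
```

Here 1,000 views result in 10 database writes. The trade-off is that buffered views are not yet visible in the database and are lost if the process crashes before a flush, which is why this suits counts that only need to be eventually consistent.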

Real-World Impact:

  • Scale: Billions of videos, trillions of views
  • Spike Handling: Handles 10x traffic spikes during viral events
  • Efficiency: 100x reduction in database writes through batching

Let’s see how system thinking changes a simple class design:
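
A minimal sketch of such a class, assuming a hypothetical `ViewCounter` that keeps all counts in a dictionary inside the process:

```python
from collections import defaultdict
from typing import DefaultDict


class ViewCounter:
    """Naive counter: all state lives in this process's memory."""

    def __init__(self) -> None:
        self._counts: DefaultDict[str, int] = defaultdict(int)

    def increment(self, item_id: str) -> int:
        self._counts[item_id] += 1
        return self._counts[item_id]

    def get(self, item_id: str) -> int:
        return self._counts.get(item_id, 0)


# Two servers behind a load balancer each hold their own instance.
server_a, server_b = ViewCounter(), ViewCounter()
server_a.increment("post:1")
server_a.increment("post:1")
server_b.increment("post:1")
print(server_a.get("post:1"), server_b.get("post:1"))  # the two servers disagree
```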

Problems with this design:

  • Data lost if server restarts
  • Different counts on each server
  • No persistence
  • Memory grows unbounded

The key insight is to externalize state to a shared store that all servers can access. This requires:

  1. Abstraction - Define an interface for storage (Dependency Inversion Principle)
  2. Shared State - Use Redis, a database, or similar shared storage
  3. Atomic Operations - Use Redis’s INCR command, which is atomic
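
These three steps can be sketched as follows. `CounterStorage` is the storage interface; the class names `RedisCounterStorage`, `InMemoryCounterStorage`, and `ViewCounter`, and the `views:` key prefix, are assumptions for illustration. The Redis-backed class expects a redis-py client to be passed in, so the example itself runs without a Redis server.

```python
from typing import Dict, Protocol


class CounterStorage(Protocol):
    """Storage abstraction the counter depends on (Dependency Inversion)."""

    def increment(self, key: str) -> int: ...

    def get(self, key: str) -> int: ...


class RedisCounterStorage:
    """Shared storage; `client` is assumed to be a redis-py `redis.Redis` instance."""

    def __init__(self, client) -> None:
        self._client = client

    def increment(self, key: str) -> int:
        return int(self._client.incr(key))  # INCR is executed atomically by Redis

    def get(self, key: str) -> int:
        value = self._client.get(key)  # None when the key does not exist
        return int(value) if value is not None else 0


class InMemoryCounterStorage:
    """Single-process stand-in, useful in tests."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def increment(self, key: str) -> int:
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]

    def get(self, key: str) -> int:
        return self._data.get(key, 0)


class ViewCounter:
    """Holds no counts itself; every instance sharing a storage sees the same values."""

    def __init__(self, storage: CounterStorage) -> None:
        self._storage = storage

    def increment(self, item_id: str) -> int:
        return self._storage.increment(f"views:{item_id}")

    def get(self, item_id: str) -> int:
        return self._storage.get(f"views:{item_id}")


# In production the shared storage would be RedisCounterStorage(redis.Redis(...)).
shared = InMemoryCounterStorage()
server_a, server_b = ViewCounter(shared), ViewCounter(shared)
server_a.increment("post:1")
server_b.increment("post:1")
print(server_a.get("post:1"), server_b.get("post:1"))  # both servers agree
```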

What changed and why:

| Change | System Design Reason |
| --- | --- |
| Added CounterStorage interface | Decouples from specific storage (DIP) |
| Used Redis instead of in-memory | Shared state across servers |
| Dependency injection | Testable, flexible, swappable |
| Atomic operations (INCR) | Handles concurrent requests |


Now that you understand why system design matters, let’s dive into the first fundamental concept:

Next up: Scalability Fundamentals - Learn how systems grow and the strategies to handle that growth.