Distributed Transactions

Coordinating operations across distributed systems

In a single database, transactions are straightforward: all operations succeed or all fail (atomicity). But what happens when your transaction spans multiple services or databases?

A distributed transaction is a transaction that spans multiple services or databases. It requires coordination to ensure atomicity: all operations succeed or all fail.

The Challenge: How do you ensure all succeed or all fail when services are distributed?


Two-Phase Commit (2PC) is a protocol that coordinates distributed transactions through two phases.

Phase 1: Prepare (voting)

  1. Coordinator sends “prepare” message to all participants
  2. Each participant:
    • Locks resources
    • Performs validation
    • Votes YES (ready) or NO (abort)
  3. After voting YES, a participant can no longer abort on its own; it must hold its locks until the coordinator's decision arrives

Phase 2: Commit or abort

If all participants vote YES:

  • Coordinator sends “commit” to all
  • Participants commit and release locks

If any votes NO:

  • Coordinator sends “abort” to all
  • Participants rollback and release locks
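
The two phases above can be sketched in a few lines. This is a minimal in-memory illustration, not a production protocol: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names introduced here, and a real implementation would also need durable logs, timeouts, and handling for participants that never answer.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical in-memory participant; a real one would lock rows and log durably."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # stands in for "validation passed"
        self.state = "idle"

    def prepare(self):
        # Phase 1: lock resources, validate, then vote.
        if self.can_commit:
            self.state = "prepared"  # locks are held until commit() or abort()
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self):
        self.state = "committed"  # apply changes and release locks

    def abort(self):
        self.state = "aborted"  # roll back and release locks


def two_phase_commit(participants):
    """Return True if the transaction committed, False if it aborted."""
    # Phase 1: the coordinator asks every participant to prepare and vote.
    votes = [p.prepare() for p in participants]

    # Phase 2: commit only on a unanimous YES, otherwise abort everyone.
    if all(vote is Vote.YES for vote in votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

A single NO vote is enough to abort every participant, including those that voted YES and were already holding locks.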
| Problem | Description | Impact |
|---|---|---|
| Blocking | Nodes wait if coordinator fails | Resources locked indefinitely |
| SPOF | Coordinator is critical | System fails if coordinator dies |
| Performance | Synchronous, blocking | Slow, doesn’t scale |
| Partitions | Doesn’t handle network splits | Can’t make progress during partitions |
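
To see why blocking is the most serious of these problems, consider what a participant may do when it stops hearing from the coordinator. The sketch below is a simplified state model; `TimeoutAwareParticipant`, `BlockedError`, and the state names are assumptions made for illustration.

```python
class BlockedError(Exception):
    """Raised when a participant cannot decide without the coordinator."""


class TimeoutAwareParticipant:
    def __init__(self):
        self.state = "working"  # has not voted yet

    def vote_yes(self):
        self.state = "prepared"  # promised to commit if told to; locks are held

    def on_coordinator_timeout(self):
        if self.state == "working":
            # No promise was made, so aborting unilaterally is safe.
            self.state = "aborted"
            return self.state
        if self.state == "prepared":
            # The coordinator may already have told other participants to
            # commit, so neither committing nor aborting alone is safe.
            raise BlockedError("must keep locks and wait for the coordinator")
        return self.state  # already committed or aborted: nothing to do
```

Before voting, a timeout is harmless. After voting YES, the participant is stuck holding its locks until the coordinator recovers, which is the "resources locked indefinitely" impact in the table.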

The Saga pattern breaks a distributed transaction into a sequence of local transactions. Each step has a compensating action that undoes it.

Key Difference from 2PC: Saga uses compensation (undo operations) instead of rollback (database transaction rollback).


Compensation is an operation that undoes the effects of a previous operation. Unlike database rollback, compensation may not perfectly reverse the operation, but makes the system consistent.

Example: You can’t “unsend” an email, but you can send an apology email. That’s compensation!


Orchestration (Central Coordinator):

Characteristics:

  • Centralized control (easier to understand)
  • Easier to monitor and debug
  • Single point of failure (orchestrator)
  • Tighter coupling

Choreography (Event-Driven):

Characteristics:

  • No single point of failure
  • Loosely coupled services
  • Harder to understand flow
  • Harder to monitor
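
A choreographed saga has no coordinator: each service subscribes to events and publishes its own. The sketch below uses a tiny synchronous in-memory bus to show the idea. `EventBus`, `wire_services`, and the event names are hypothetical, and a real system would use a message broker with asynchronous, at-least-once delivery.

```python
from collections import defaultdict


class EventBus:
    """Minimal synchronous publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self.history = []  # every event type published, in order

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.history.append(event_type)
        for handler in list(self._handlers[event_type]):
            handler(payload)


def wire_services(bus, payment_succeeds=True):
    """Each 'service' is a handler that reacts to one event and emits the next."""

    def reserve_inventory(order):
        bus.publish("InventoryReserved", order)

    def charge_payment(order):
        event = "PaymentCharged" if payment_succeeds else "PaymentFailed"
        bus.publish(event, order)

    def create_shipment(order):
        bus.publish("ShipmentCreated", order)

    def release_inventory(order):
        # Compensation: the inventory service reacts to the failure event.
        bus.publish("InventoryReleased", order)

    bus.subscribe("OrderPlaced", reserve_inventory)
    bus.subscribe("InventoryReserved", charge_payment)
    bus.subscribe("PaymentCharged", create_shipment)
    bus.subscribe("PaymentFailed", release_inventory)
```

Publishing `OrderPlaced` drives the whole flow. Note that the only record of the saga's progress is the event history, which is why choreography is harder to monitor than orchestration.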

If Step 2 (Charge Payment) fails:

  1. Compensate Step 1: Release reserved inventory
  2. Transaction ends (no further steps)

If Step 4 (Send Notification) fails:

  1. Steps 1-3 already succeeded (can’t undo shipping!)
  2. Send apology notification (compensation)
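
The first failure case above (payment fails after inventory was reserved) can be sketched as a small orchestrated saga. This is an illustrative sketch under simple assumptions: `SagaStep` and `run_saga` are hypothetical names, steps run in-process, and compensations are assumed not to fail.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # the local transaction
    compensation: Callable[[], None]  # undoes the action's effects


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            for done in reversed(completed):
                done.compensation()
            return False
        completed.append(step)
    return True


log: List[str] = []


def charge_payment() -> None:
    raise RuntimeError("card declined")  # simulate Step 2 failing


order_saga = [
    SagaStep("reserve_inventory", lambda: log.append("reserve"), lambda: log.append("release")),
    SagaStep("charge_payment", charge_payment, lambda: log.append("refund")),
    SagaStep("create_shipment", lambda: log.append("ship"), lambda: log.append("cancel_shipment")),
]
succeeded = run_saga(order_saga)
```

Only the inventory reservation is compensated: the failed payment step never completed, so it has nothing to undo, and the shipment step never ran.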

How distributed transactions affect your class design:


| Aspect | 2PC | Saga |
|---|---|---|
| Consistency | Strong (ACID) | Eventual |
| Performance | Slow (blocking) | Fast (non-blocking) |
| Availability | Low (blocking) | High (non-blocking) |
| Complexity | Medium | High (need compensation) |
| Use Cases | Critical, short transactions | Long-running, distributed |

Example 1: E-Commerce Order Processing (Saga Pattern)

Company: Amazon, eBay, Shopify

Scenario: When a customer places an order, multiple services must coordinate:

  1. Inventory Service: Reserve items
  2. Payment Service: Charge the customer
  3. Shipping Service: Create shipment
  4. Notification Service: Send confirmation email

Implementation: Uses the Saga orchestration pattern with compensation.

Why Saga? Order processing is long-running (can take minutes), involves multiple services, and requires compensation if any step fails. 2PC would block resources for too long.

Real-World Impact:

  • Amazon: Processes millions of orders daily using Saga pattern
  • Failure Rate: <0.1% of orders require compensation
  • Performance: Average order processing time: 2-5 seconds

Example 2: Banking Money Transfer (2PC Pattern)

Company: Traditional Banks, Financial Institutions

Scenario: Transferring money between accounts requires atomicity across multiple databases:

  1. Debit from source account
  2. Credit to destination account
  3. Log transaction in audit database

Implementation: Uses 2PC for strong ACID guarantees.

Why 2PC? Financial transactions require strong consistency and ACID guarantees. The blocking nature is acceptable for critical financial operations.

Real-World Impact:

  • Banks: Use 2PC for inter-account transfers
  • Latency: 100-300ms per transfer (acceptable for financial operations)
  • Reliability: 99.99% success rate with proper error handling

Example 3: Hotel Booking System (Saga Choreography)

Company: Booking.com, Expedia, Airbnb

Scenario: Booking a hotel room involves:

  1. Availability Service: Check and reserve room
  2. Payment Service: Process payment
  3. Confirmation Service: Send booking confirmation
  4. Loyalty Service: Update loyalty points

Implementation: Uses Saga choreography (event-driven).

Why Choreography? Hotel bookings are high-volume, services are loosely coupled, and the system needs to scale horizontally. No single coordinator bottleneck.

Real-World Impact:

  • Booking.com: Processes 1.5+ million bookings daily
  • Throughput: 1.5 million bookings per day averages about 17 per second, so capacity is sized for peaks well above that
  • Scalability: Each service scales independently

Example 4: Microservices Order Fulfillment (Saga with Timeouts)

Company: Netflix (content delivery), Uber (ride booking)

Scenario: Complex workflows that span hours or days:

  1. Order Service: Create order
  2. Inventory Service: Reserve items
  3. Payment Service: Authorize payment
  4. Fulfillment Service: Process order (takes hours)
  5. Shipping Service: Ship order (takes days)

Implementation: Uses a Saga with persistent state and timeouts.

Why Persistent Saga? Long-running workflows must survive server restarts, handle timeouts, and support idempotent retries.

Real-World Impact:

  • Netflix: Content delivery workflows span days
  • Uber: Ride booking workflows handle cancellations hours later
  • Reliability: 99.9% completion rate with proper timeout handling

Deep Dive: Production Considerations for Distributed Transactions

2PC Performance:

  • Latency: 100-500ms per transaction (synchronous coordination)
  • Throughput: 100-1K transactions/sec (limited by coordinator)
  • Blocking: Participants hold locks from the prepare phase until the commit/abort decision arrives
  • Failure Recovery: 10-60 seconds (must query all participants)

Saga Performance:

  • Latency: 50-200ms per step (asynchronous execution)
  • Throughput: 1K-10K transactions/sec (no blocking)
  • Non-blocking: Steps execute independently
  • Failure Recovery: 1-5 seconds (compensate only completed steps)

Real-World Benchmark:

  • E-commerce checkout (2PC): ~300ms, 500 TPS
  • E-commerce checkout (Saga): ~150ms, 3K TPS
  • Improvement: Saga has 2x lower latency and 6x higher throughput

Problem: What if compensation itself fails?

Example:

  1. Step 1: Reserve inventory
  2. Step 2: Charge payment
  3. Step 3: Ship order (fails)
  4. Compensate Step 2: Refund (fails!)

Solutions:

Solution 1: Retry Compensation

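import time

class SagaOrchestrator:
    def compensate_with_retry(self, step, max_retries=3):
        """Run step.compensate(), retrying with exponential backoff."""
        for attempt in range(max_retries):
            try:
                step.compensate()
                return True
            except Exception as e:
                if attempt == max_retries - 1:
                    # Out of retries: log for manual intervention
                    # (alert_manual_intervention is assumed to be defined elsewhere)
                    self.alert_manual_intervention(step, e)
                    return False
                time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...
        return False  # Only reached when max_retries <= 0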

Solution 2: Compensating Transaction Pattern

  • Store compensation intent in database
  • Background job retries failed compensations
  • Benefit: Eventually consistent compensation
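
One way to sketch Solution 2 is a table of compensation intents that a background job drains. The example below uses an in-memory SQLite database so it is self-contained. The table name `compensation_intents`, its columns, and the functions `create_store`, `record_intent`, and `retry_pending` are all assumptions made for illustration; a production version would add backoff, a maximum attempt count, and escalation to the manual queue.

```python
import sqlite3


def create_store():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compensation_intents ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " saga_id TEXT NOT NULL,"
        " action TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " attempts INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record_intent(conn, saga_id, action):
    """Persist the intent BEFORE attempting the compensation."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO compensation_intents (saga_id, action) VALUES (?, ?)",
            (saga_id, action),
        )


def retry_pending(conn, handlers):
    """One pass of the background job; returns how many intents completed."""
    rows = conn.execute(
        "SELECT id, saga_id, action FROM compensation_intents"
        " WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    completed = 0
    for intent_id, saga_id, action in rows:
        try:
            handlers[action](saga_id)  # handlers must be idempotent
        except Exception:
            with conn:  # still pending; the next pass will try again
                conn.execute(
                    "UPDATE compensation_intents SET attempts = attempts + 1"
                    " WHERE id = ?",
                    (intent_id,),
                )
            continue
        with conn:
            conn.execute(
                "UPDATE compensation_intents"
                " SET status = 'done', attempts = attempts + 1 WHERE id = ?",
                (intent_id,),
            )
        completed += 1
    return completed
```

Because the intent is stored durably, a crash between "refund failed" and "retry refund" loses nothing: the next pass of the job picks the row up again.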

Solution 3: Manual Intervention Queue

  • Failed compensations go to queue
  • Operations team handles manually
  • Use case: Critical financial transactions

Problem: Sagas that take hours/days (e.g., order fulfillment).

Challenges:

  • State persistence: Must survive server restarts
  • Timeout handling: Steps may time out
  • Idempotency: Steps may be retried

Solution: Persistent Saga State

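class PersistentSagaOrchestrator:
    def __init__(self, db_connection):
        self.db = db_connection

    def execute_saga(self, saga_id, steps):
        # Resume from saved state if it exists, so a restart does not wipe progress.
        # (Assumes load_saga_state returns None for an unknown saga_id.)
        state = self.db.load_saga_state(saga_id)
        if state is None:
            state = {
                'status': 'running',
                'current_step': 0,
                'completed_steps': [],
                'compensated_steps': []
            }
            self.db.save_saga_state(saga_id, state)
        try:
            for i, step in enumerate(steps):
                if i in state['completed_steps']:
                    continue  # Already completed before a restart (idempotent)
                state['current_step'] = i
                # Execute step
                if not step.execute():
                    raise Exception(f"Step {i} failed")
                state['completed_steps'].append(i)
                self.db.save_saga_state(saga_id, state)
            state['status'] = 'completed'
            self.db.save_saga_state(saga_id, state)
        except Exception:
            # Compensate completed steps, most recent first
            # (compensate_saga is assumed to be defined elsewhere)
            state['status'] = 'compensating'
            self.db.save_saga_state(saga_id, state)
            self.compensate_saga(saga_id, list(reversed(state['completed_steps'])))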

Scenario 3: Saga Orchestration vs Choreography: Production Comparison

Orchestration (Central Coordinator):

Pros:

  • Easier to understand: Centralized control flow
  • Better monitoring: Single point to observe
  • Easier debugging: All logic in one place
  • Transaction visibility: Can see entire saga state

Cons:

  • Single point of failure: Coordinator is critical
  • Scalability bottleneck: Coordinator can become overloaded
  • Tighter coupling: Services depend on orchestrator

Production Example: Netflix Conductor

  • Centralized orchestration engine
  • Handles millions of workflows
  • Scaling: Multiple coordinator instances with shared state

Choreography (Event-Driven):

Pros:

  • No single point of failure: Distributed control
  • Better scalability: No coordinator bottleneck
  • Looser coupling: Services communicate via events

Cons:

  • Harder to understand: Distributed control flow
  • Harder to monitor: Must trace events across services
  • Harder to debug: No single place to see state
  • Transaction visibility: Hard to see saga state

Production Example: Event Sourcing + CQRS

  • Events stored in event log
  • Services react to events
  • Monitoring: Event log provides audit trail

Hybrid Approach:

  • Use orchestration for critical, complex workflows
  • Use choreography for simple, high-volume workflows
  • Best of both worlds: Flexibility + visibility

Despite its problems, 2PC is still used in production:

Use Case 1: Database Clusters

  • PostgreSQL: Provides 2PC primitives (PREPARE TRANSACTION, COMMIT PREPARED) for use by external transaction managers
  • MySQL NDB Cluster: Uses 2PC internally to replicate synchronously across data nodes
  • Why: Database-level coordination, not application-level

Use Case 2: XA Transactions

  • Java EE: JTA (Java Transaction API) uses 2PC
  • Spring: @Transactional with XA datasources
  • Why: Standard protocol, well-understood

Use Case 3: Short-Lived Transactions

  • Millisecond-scale transactions: Locks are held only briefly, so 2PC overhead is acceptable
  • Low concurrency: Blocking not a problem
  • Why: Simpler than Saga for simple cases

Production Pattern:

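import java.math.BigDecimal;
import org.springframework.transaction.annotation.Transactional;

// Use 2PC for simple, fast transactions.
// This runs as a 2PC (XA) transaction only when a JTA transaction manager and
// XA datasources are configured; otherwise it is an ordinary local transaction.
@Transactional(rollbackFor = Exception.class)
public void transferMoney(Account from, Account to, BigDecimal amount) {
    from.debit(amount);  // BigDecimal avoids floating-point rounding errors with money
    to.credit(amount);
    // The transaction manager coordinates prepare/commit across resources
}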

Pattern: Retry failed step instead of compensating.

When to use:

  • Transient failures: Network hiccups, temporary unavailability
  • Idempotent operations: Safe to retry

Example:

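import time

class TransientError(Exception):
    """Temporary failure (e.g. a network timeout) that is safe to retry."""

class RetrySagaStep:
    def execute(self, max_retries=3):
        for attempt in range(max_retries):
            try:
                return self.do_work()  # do_work() is supplied by the concrete step
            except TransientError:
                if attempt == max_retries - 1:
                    raise  # Give up after retries
                time.sleep(2 ** attempt)  # Exponential backoff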

Pattern 2: Backward Recovery (Compensation)

Pattern: Compensate completed steps (standard Saga).

When to use:

  • Permanent failures: Business logic errors
  • Non-idempotent operations: Can’t safely retry

Example: Already covered in main content


Pattern: Compensations expire after time limit.

Use case: Prevent stale compensations from running.

Example:

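import time

class CompensationExpired(Exception):
    """Raised when a compensation is attempted after its TTL has passed."""

class TimedCompensation:
    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self.created_at = time.time()  # Wall-clock time, so it can be persisted

    def compensate(self):
        if time.time() - self.created_at > self.ttl:
            raise CompensationExpired("Too much time has passed")
        # Proceed with compensation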

Pattern: Use idempotency keys to prevent duplicate execution.

Example:

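class IdempotentSagaStep:
    def __init__(self, db):
        self.db = db  # Store offering exists(), get_result() and store()

    def execute(self, idempotency_key):
        # Check if already executed
        if self.db.exists(idempotency_key):
            return self.db.get_result(idempotency_key)
        # Execute and store result. Note: check-then-store is not atomic, so
        # concurrent callers need a unique constraint or a lock on the key.
        result = self.do_work()
        self.db.store(idempotency_key, result)
        return result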

Benefits:

  • Safe to retry
  • Prevents duplicate execution
  • Handles network retries
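
A check followed by a separate store leaves a window in which two concurrent retries can both run the step. One hedged way to close that window is to claim the key with a unique constraint before doing the work. The sketch below uses in-memory SQLite; the table `step_results` and the functions `open_store` and `run_once` are hypothetical, and results are assumed to be strings.

```python
import sqlite3


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE step_results (key TEXT PRIMARY KEY, result TEXT)")
    return conn


def run_once(conn, idempotency_key, work):
    """Run work() at most once per key; later calls return the stored result."""
    try:
        with conn:
            # Claim the key. The PRIMARY KEY constraint makes this atomic:
            # a second claim for the same key raises IntegrityError.
            conn.execute(
                "INSERT INTO step_results (key, result) VALUES (?, NULL)",
                (idempotency_key,),
            )
    except sqlite3.IntegrityError:
        row = conn.execute(
            "SELECT result FROM step_results WHERE key = ?", (idempotency_key,)
        ).fetchone()
        return row[0]  # None means another caller is still running the step

    try:
        result = work()
    except Exception:
        with conn:  # release the claim so a later retry can run the step
            conn.execute("DELETE FROM step_results WHERE key = ?", (idempotency_key,))
        raise
    with conn:
        conn.execute(
            "UPDATE step_results SET result = ? WHERE key = ?",
            (result, idempotency_key),
        )
    return result
```

A call such as `run_once(conn, "order-42:charge", charge_card)` can then be retried freely. If the process crashes between the claim and the update, the key stays claimed with no result, so a real system would also need a way to expire stale claims.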

Pattern: Set timeouts for saga execution.

Example:

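import time

class SagaTimeout(Exception): pass

class TimedSaga:
    def __init__(self, steps):
        self.steps = steps

    def execute(self, timeout_seconds=300):
        start_time = time.monotonic()  # Monotonic clock ignores wall-clock jumps
        for step in self.steps:
            if time.monotonic() - start_time > timeout_seconds:  # Checked between steps only
                raise SagaTimeout("Saga exceeded timeout")
            step.execute()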

Production Settings:

  • Short sagas: 30-60 seconds (API calls)
  • Medium sagas: 5-10 minutes (order processing)
  • Long sagas: Hours/days (fulfillment workflows)

What to Monitor:

  • Saga completion rate: Should be >99%
  • Average saga duration: Track P50, P95, P99
  • Compensation rate: High rate indicates problems
  • Failed sagas: Alert on failures

Production Metrics:

class SagaMetrics:
    def __init__(self, metrics):
        self.metrics = metrics  # Client exposing increment() and histogram()

    def record_saga(self, saga_id, duration, success):
        self.metrics.increment('saga.total')
        self.metrics.histogram('saga.duration', duration)
        if success:
            self.metrics.increment('saga.success')
        else:
            self.metrics.increment('saga.failure')
            self.alert_on_failure(saga_id)  # Alerting hook, assumed defined elsewhere
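
If the metrics backend does not compute percentiles itself, the P50/P95/P99 figures and the completion rate listed under "What to Monitor" could be derived from raw samples roughly as follows. The function names are hypothetical; `statistics.quantiles` is from the Python standard library (3.8+).

```python
import statistics


def duration_percentiles(durations_ms):
    """Return P50, P95 and P99 of saga durations (needs at least two samples)."""
    cuts = statistics.quantiles(durations_ms, n=100)  # 99 cut points: P1..P99
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def completion_rate(succeeded, total):
    """Fraction of sagas that completed without failing; 0.0 when there is no data."""
    return succeeded / total if total else 0.0
```

Tracking P95 and P99 alongside the median matters because a handful of slow or stuck sagas barely moves the average.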

Performance Benchmarks: Real-World Numbers

| Pattern | Latency | Throughput | Failure Recovery | Use Case |
|---|---|---|---|---|
| 2PC | 100-500ms | 100-1K TPS | 10-60s | Database clusters |
| Saga (Orchestration) | 50-200ms | 1K-10K TPS | 1-5s | Microservices |
| Saga (Choreography) | 30-150ms | 5K-50K TPS | 1-3s | Event-driven |
| Local Transaction | 5-20ms | 10K-100K TPS | N/A | Single service |

Key Insights:

  • Sagas have roughly 2-3x lower latency than 2PC and about 10x the throughput
  • Choreography is faster than orchestration (no coordinator overhead)
  • Local transactions are about 10x faster than orchestrated sagas (no coordination needed)


Now that you understand distributed transactions, let’s explore how to handle conflicts when multiple operations happen simultaneously:

Next up: Conflict Resolution Strategies — Learn last-write-wins, vector clocks, and CRDTs for handling concurrent updates.