Distributed Transactions
The Challenge of Distributed Transactions
Section titled “The Challenge of Distributed Transactions”In a single database, transactions are straightforward: all operations succeed or all fail (atomicity). But what happens when your transaction spans multiple services or databases?
What is a Distributed Transaction?
Section titled “What is a Distributed Transaction?”A distributed transaction is a transaction that spans multiple services or databases. It requires coordination to ensure atomicity: all operations succeed or all fail.
The Challenge: How do you ensure all succeed or all fail when services are distributed?
Solution 1: Two-Phase Commit (2PC)
Section titled “Solution 1: Two-Phase Commit (2PC)”Two-Phase Commit (2PC) is a protocol that coordinates distributed transactions through two phases.
How 2PC Works
Section titled “How 2PC Works”Phase 1: Prepare (Voting)
Section titled “Phase 1: Prepare (Voting)”- Coordinator sends “prepare” message to all participants
- Each participant:
- Locks resources
- Performs validation
- Votes YES (ready) or NO (abort)
- Participants cannot abort after voting YES (they’re locked)
Phase 2: Commit or Abort
Section titled “Phase 2: Commit or Abort”If all vote YES:
- Coordinator sends “commit” to all
- Participants commit and release locks
If any votes NO:
- Coordinator sends “abort” to all
- Participants rollback and release locks
2PC Flow Diagram
Section titled “2PC Flow Diagram”Problems with 2PC
Section titled “Problems with 2PC”| Problem | Description | Impact |
|---|---|---|
| Blocking | Nodes wait if coordinator fails | Resources locked indefinitely |
| SPOF | Coordinator is critical | System fails if coordinator dies |
| Performance | Synchronous, blocking | Slow, doesn’t scale |
| Partitions | Doesn’t handle network splits | Can’t make progress during partitions |
Solution 2: Saga Pattern
Section titled “Solution 2: Saga Pattern”The Saga pattern breaks a distributed transaction into a sequence of local transactions. Each step has a compensating action that undoes it.
How Saga Works
Section titled “How Saga Works”Key Difference from 2PC: Saga uses compensation (undo operations) instead of rollback (database transaction rollback).
What is Compensation?
Section titled “What is Compensation?”Compensation is an operation that undoes the effects of a previous operation. Unlike database rollback, compensation may not perfectly reverse the operation, but makes the system consistent.
Example: You can’t “unsend” an email, but you can send an apology email. That’s compensation!
Saga Orchestration vs Choreography
Section titled “Saga Orchestration vs Choreography”Orchestration: Central Coordinator
Section titled “Orchestration: Central Coordinator”Characteristics:
- Centralized control (easier to understand)
- Easier to monitor and debug
- Single point of failure (orchestrator)
- Tighter coupling
Choreography: Event-Driven
Section titled “Choreography: Event-Driven”Characteristics:
- No single point of failure
- Loosely coupled services
- Harder to understand flow
- Harder to monitor
Saga Example: E-Commerce Order
Section titled “Saga Example: E-Commerce Order”If Step 2 (Charge Payment) fails:
- Compensate Step 1: Release reserved inventory
- Transaction ends (no further steps)
If Step 4 (Send Notification) fails:
- Steps 1-3 already succeeded (can’t undo shipping!)
- Send apology notification (compensation)
LLD ↔ HLD Connection
Section titled “LLD ↔ HLD Connection”How distributed transactions affect your class design:
Saga Orchestrator Implementation
Section titled “Saga Orchestrator Implementation”Compensation Pattern
Section titled “Compensation Pattern”2PC vs Saga: When to Use What?
Section titled “2PC vs Saga: When to Use What?”| Aspect | 2PC | Saga |
|---|---|---|
| Consistency | Strong (ACID) | Eventual |
| Performance | Slow (blocking) | Fast (non-blocking) |
| Availability | Low (blocking) | High (non-blocking) |
| Complexity | Medium | High (need compensation) |
| Use Cases | Critical, short transactions | Long-running, distributed |
Real-World Examples
Section titled “Real-World Examples”Example 1: E-Commerce Order Processing (Saga Pattern)
Section titled “Example 1: E-Commerce Order Processing (Saga Pattern)”Company: Amazon, eBay, Shopify
Scenario: When a customer places an order, multiple services must coordinate:
- Inventory Service: Reserve items
- Payment Service: Charge the customer
- Shipping Service: Create shipment
- Notification Service: Send confirmation email
Implementation: Uses Saga Orchestration pattern with compensation:
Why Saga? Order processing is long-running (can take minutes), involves multiple services, and requires compensation if any step fails. 2PC would block resources for too long.
Real-World Impact:
- Amazon: Processes millions of orders daily using Saga pattern
- Failure Rate: <0.1% of orders require compensation
- Performance: Average order processing time: 2-5 seconds
Example 2: Banking Money Transfer (2PC Pattern)
Section titled “Example 2: Banking Money Transfer (2PC Pattern)”Company: Traditional Banks, Financial Institutions
Scenario: Transferring money between accounts requires atomicity across multiple databases:
- Debit from source account
- Credit to destination account
- Log transaction in audit database
Implementation: Uses 2PC for strong ACID guarantees:
Why 2PC? Financial transactions require strong consistency and ACID guarantees. The blocking nature is acceptable for critical financial operations.
Real-World Impact:
- Banks: Use 2PC for inter-account transfers
- Latency: 100-300ms per transfer (acceptable for financial operations)
- Reliability: 99.99% success rate with proper error handling
Example 3: Hotel Booking System (Saga Choreography)
Section titled “Example 3: Hotel Booking System (Saga Choreography)”Company: Booking.com, Expedia, Airbnb
Scenario: Booking a hotel room involves:
- Availability Service: Check and reserve room
- Payment Service: Process payment
- Confirmation Service: Send booking confirmation
- Loyalty Service: Update loyalty points
Implementation: Uses Saga Choreography (event-driven):
Why Choreography? Hotel bookings are high-volume, services are loosely coupled, and the system needs to scale horizontally. No single coordinator bottleneck.
Real-World Impact:
- Booking.com: Processes 1.5+ million bookings daily
- Throughput: 10K+ bookings per second during peak
- Scalability: Each service scales independently
Example 4: Microservices Order Fulfillment (Saga with Timeouts)
Section titled “Example 4: Microservices Order Fulfillment (Saga with Timeouts)”Company: Netflix (content delivery), Uber (ride booking)
Scenario: Complex workflows that span hours or days:
- Order Service: Create order
- Inventory Service: Reserve items
- Payment Service: Authorize payment
- Fulfillment Service: Process order (takes hours)
- Shipping Service: Ship order (takes days)
Implementation: Uses Saga with Persistent State and timeouts:
Why Persistent Saga? Long-running workflows must survive server restarts, handle timeouts, and support idempotent retries.
Real-World Impact:
- Netflix: Content delivery workflows span days
- Uber: Ride booking workflows handle cancellations hours later
- Reliability: 99.9% completion rate with proper timeout handling
Deep Dive: Production Considerations for Distributed Transactions
Section titled “Deep Dive: Production Considerations for Distributed Transactions”2PC vs Saga: Production Trade-offs
Section titled “2PC vs Saga: Production Trade-offs”Performance Comparison
Section titled “Performance Comparison”2PC Performance:
- Latency: 100-500ms per transaction (synchronous coordination)
- Throughput: 100-1K transactions/sec (limited by coordinator)
- Blocking: All participants blocked during prepare phase
- Failure Recovery: 10-60 seconds (must query all participants)
Saga Performance:
- Latency: 50-200ms per step (asynchronous execution)
- Throughput: 1K-10K transactions/sec (no blocking)
- Non-blocking: Steps execute independently
- Failure Recovery: 1-5 seconds (compensate only completed steps)
Real-World Benchmark:
- E-commerce checkout (2PC): ~300ms, 500 TPS
- E-commerce checkout (Saga): ~150ms, 3K TPS
- Improvement: Saga is 2x faster, 6x more throughput
Saga Pattern: Advanced Scenarios
Section titled “Saga Pattern: Advanced Scenarios”Scenario 1: Partial Failure Recovery
Section titled “Scenario 1: Partial Failure Recovery”Problem: What if compensation itself fails?
Example:
- Step 1: Reserve inventory
- Step 2: Charge payment
- Step 3: Ship order (fails)
- Compensate Step 2: Refund (fails!)
Solutions:
Solution 1: Retry Compensation
class SagaOrchestrator: def compensate_with_retry(self, step, max_retries=3): for attempt in range(max_retries): try: step.compensate() return True except Exception as e: if attempt == max_retries - 1: # Log for manual intervention self.alert_manual_intervention(step, e) return False time.sleep(2 ** attempt) # Exponential backoffSolution 2: Compensating Transaction Pattern
- Store compensation intent in database
- Background job retries failed compensations
- Benefit: Eventually consistent compensation
Solution 3: Manual Intervention Queue
- Failed compensations go to queue
- Operations team handles manually
- Use case: Critical financial transactions
Scenario 2: Long-Running Sagas
Section titled “Scenario 2: Long-Running Sagas”Problem: Sagas that take hours/days (e.g., order fulfillment).
Challenges:
- State persistence: Must survive server restarts
- Timeout handling: Steps may timeout
- Idempotency: Steps may be retried
Solution: Persistent Saga State
class PersistentSagaOrchestrator: def __init__(self, db_connection): self.db = db_connection
def execute_saga(self, saga_id, steps): # Save saga state self.db.save_saga_state(saga_id, { 'status': 'running', 'current_step': 0, 'completed_steps': [], 'compensated_steps': [] })
try: for i, step in enumerate(steps): # Load state (survives restarts) state = self.db.load_saga_state(saga_id)
if i in state['completed_steps']: continue # Already completed (idempotent)
# Execute step if step.execute(): state['completed_steps'].append(i) self.db.save_saga_state(saga_id, state) else: raise Exception(f"Step {i} failed") except Exception as e: # Compensate completed steps self.compensate_saga(saga_id, state['completed_steps'])Scenario 3: Saga Orchestration vs Choreography: Production Comparison
Section titled “Scenario 3: Saga Orchestration vs Choreography: Production Comparison”Orchestration (Central Coordinator):
Pros:
- Easier to understand: Centralized control flow
- Better monitoring: Single point to observe
- Easier debugging: All logic in one place
- Transaction visibility: Can see entire saga state
Cons:
- Single point of failure: Coordinator is critical
- Scalability bottleneck: Coordinator can become overloaded
- Tighter coupling: Services depend on orchestrator
Production Example: Netflix Conductor
- Centralized orchestration engine
- Handles millions of workflows
- Scaling: Multiple coordinator instances with shared state
Choreography (Event-Driven):
Pros:
- No single point of failure: Distributed control
- Better scalability: No coordinator bottleneck
- Looser coupling: Services communicate via events
Cons:
- Harder to understand: Distributed control flow
- Harder to monitor: Must trace events across services
- Harder to debug: No single place to see state
- Transaction visibility: Hard to see saga state
Production Example: Event Sourcing + CQRS
- Events stored in event log
- Services react to events
- Monitoring: Event log provides audit trail
Hybrid Approach:
- Use orchestration for critical, complex workflows
- Use choreography for simple, high-volume workflows
- Best of both worlds: Flexibility + visibility
2PC: When It’s Still Used
Section titled “2PC: When It’s Still Used”Despite its problems, 2PC is still used in production:
Use Case 1: Database Clusters
- PostgreSQL: Uses 2PC for distributed transactions
- MySQL Cluster: Uses 2PC for multi-master replication
- Why: Database-level coordination, not application-level
Use Case 2: XA Transactions
- Java EE: JTA (Java Transaction API) uses 2PC
- Spring: @Transactional with XA datasources
- Why: Standard protocol, well-understood
Use Case 3: Short-Lived Transactions
- Microsecond transactions: 2PC overhead acceptable
- Low concurrency: Blocking not a problem
- Why: Simpler than Saga for simple cases
Production Pattern:
// Use 2PC for simple, fast transactions@Transactional(rollbackFor = Exception.class)public void transferMoney(Account from, Account to, double amount) { from.debit(amount); to.credit(amount); // 2PC handles coordination}Compensation Design Patterns
Section titled “Compensation Design Patterns”Pattern 1: Forward Recovery (Retry)
Section titled “Pattern 1: Forward Recovery (Retry)”Pattern: Retry failed step instead of compensating.
When to use:
- Transient failures: Network hiccups, temporary unavailability
- Idempotent operations: Safe to retry
Example:
class RetrySagaStep: def execute(self, max_retries=3): for attempt in range(max_retries): try: return self.do_work() except TransientError as e: if attempt == max_retries - 1: raise # Give up after retries time.sleep(2 ** attempt) # Exponential backoffPattern 2: Backward Recovery (Compensation)
Section titled “Pattern 2: Backward Recovery (Compensation)”Pattern: Compensate completed steps (standard Saga).
When to use:
- Permanent failures: Business logic errors
- Non-idempotent operations: Can’t safely retry
Example: Already covered in main content
Pattern 3: Compensation with TTL
Section titled “Pattern 3: Compensation with TTL”Pattern: Compensations expire after time limit.
Use case: Prevent stale compensations from running.
Example:
class TimedCompensation: def __init__(self, ttl_seconds=3600): self.ttl = ttl_seconds self.created_at = time.time()
def compensate(self): if time.time() - self.created_at > self.ttl: raise CompensationExpired("Too much time has passed") # Proceed with compensationProduction Best Practices
Section titled “Production Best Practices”Practice 1: Idempotency Keys
Section titled “Practice 1: Idempotency Keys”Pattern: Use idempotency keys to prevent duplicate execution.
Example:
class IdempotentSagaStep: def execute(self, idempotency_key): # Check if already executed if self.db.exists(idempotency_key): return self.db.get_result(idempotency_key)
# Execute and store result result = self.do_work() self.db.store(idempotency_key, result) return resultBenefits:
- Safe to retry
- Prevents duplicate execution
- Handles network retries
Practice 2: Saga Timeout Management
Section titled “Practice 2: Saga Timeout Management”Pattern: Set timeouts for saga execution.
Example:
class TimedSaga: def execute(self, timeout_seconds=300): start_time = time.time()
for step in self.steps: if time.time() - start_time > timeout_seconds: raise SagaTimeout("Saga exceeded timeout") step.execute()Production Settings:
- Short sagas: 30-60 seconds (API calls)
- Medium sagas: 5-10 minutes (order processing)
- Long sagas: Hours/days (fulfillment workflows)
Practice 3: Saga Monitoring and Alerting
Section titled “Practice 3: Saga Monitoring and Alerting”What to Monitor:
- Saga completion rate: Should be >99%
- Average saga duration: Track P50, P95, P99
- Compensation rate: High rate indicates problems
- Failed sagas: Alert on failures
Production Metrics:
class SagaMetrics: def record_saga(self, saga_id, duration, success): self.metrics.increment('saga.total') self.metrics.histogram('saga.duration', duration)
if success: self.metrics.increment('saga.success') else: self.metrics.increment('saga.failure') self.alert_on_failure(saga_id)Performance Benchmarks: Real-World Numbers
Section titled “Performance Benchmarks: Real-World Numbers”| Pattern | Latency | Throughput | Failure Recovery | Use Case |
|---|---|---|---|---|
| 2PC | 100-500ms | 100-1K TPS | 10-60s | Database clusters |
| Saga (Orchestration) | 50-200ms | 1K-10K TPS | 1-5s | Microservices |
| Saga (Choreography) | 30-150ms | 5K-50K TPS | 1-3s | Event-driven |
| Local Transaction | 5-20ms | 10K-100K TPS | N/A | Single service |
Key Insights:
- Saga is 2-5x faster than 2PC
- Choreography is faster than orchestration (no coordinator overhead)
- Local transactions are 10x faster (no coordination needed)
Key Takeaways
Section titled “Key Takeaways”What’s Next?
Section titled “What’s Next?”Now that you understand distributed transactions, let’s explore how to handle conflicts when multiple operations happen simultaneously:
Next up: Conflict Resolution Strategies — Learn last-write-wins, vector clocks, and CRDTs for handling concurrent updates.