Distributed Transactions
The Challenge of Distributed Transactions
Section titled “The Challenge of Distributed Transactions”In a single database, transactions are straightforward: all operations succeed or all fail (atomicity). But what happens when your transaction spans multiple services or databases?
What is a Distributed Transaction?
Section titled “What is a Distributed Transaction?”A distributed transaction is a transaction that spans multiple services or databases. It requires coordination to ensure atomicity: all operations succeed or all fail.
The Challenge: How do you ensure all succeed or all fail when services are distributed?
Solution 1: Two-Phase Commit (2PC)
Section titled “Solution 1: Two-Phase Commit (2PC)”Two-Phase Commit (2PC) is a protocol that coordinates distributed transactions through two phases.
How 2PC Works
Section titled “How 2PC Works”Phase 1: Prepare (Voting)
Section titled “Phase 1: Prepare (Voting)”- Coordinator sends “prepare” message to all participants
- Each participant:
- Locks resources
- Performs validation
- Votes YES (ready) or NO (abort)
- Participants cannot abort after voting YES (they’re locked)
Phase 2: Commit or Abort
Section titled “Phase 2: Commit or Abort”If all vote YES:
- Coordinator sends “commit” to all
- Participants commit and release locks
If any votes NO:
- Coordinator sends “abort” to all
- Participants rollback and release locks
2PC Flow Diagram
Section titled “2PC Flow Diagram”Problems with 2PC
Section titled “Problems with 2PC”| Problem | Description | Impact |
|---|---|---|
| Blocking | Nodes wait if coordinator fails | Resources locked indefinitely |
| SPOF | Coordinator is critical | System fails if coordinator dies |
| Performance | Synchronous, blocking | Slow, doesn’t scale |
| Partitions | Doesn’t handle network splits | Can’t make progress during partitions |
Solution 2: Saga Pattern
Section titled “Solution 2: Saga Pattern”The Saga pattern breaks a distributed transaction into a sequence of local transactions. Each step has a compensating action that undoes it.
How Saga Works
Section titled “How Saga Works”Key Difference from 2PC: Saga uses compensation (undo operations) instead of rollback (database transaction rollback).
What is Compensation?
Section titled “What is Compensation?”Compensation is an operation that undoes the effects of a previous operation. Unlike database rollback, compensation may not perfectly reverse the operation, but makes the system consistent.
Example: You can’t “unsend” an email, but you can send an apology email. That’s compensation!
Saga Orchestration vs Choreography
Section titled “Saga Orchestration vs Choreography”Orchestration: Central Coordinator
Section titled “Orchestration: Central Coordinator”Characteristics:
- ✅ Centralized control (easier to understand)
- ✅ Easier to monitor and debug
- ❌ Single point of failure (orchestrator)
- ❌ Tighter coupling
Choreography: Event-Driven
Section titled “Choreography: Event-Driven”Characteristics:
- ✅ No single point of failure
- ✅ Loosely coupled services
- ❌ Harder to understand flow
- ❌ Harder to monitor
Saga Example: E-Commerce Order
Section titled “Saga Example: E-Commerce Order”If Step 2 (Charge Payment) fails:
- Compensate Step 1: Release reserved inventory
- Transaction ends (no further steps)
If Step 4 (Send Notification) fails:
- Steps 1-3 already succeeded (can’t undo shipping!)
- Send apology notification (compensation)
LLD ↔ HLD Connection
Section titled “LLD ↔ HLD Connection”How distributed transactions affect your class design:
Saga Orchestrator Implementation
Section titled “Saga Orchestrator Implementation”1from enum import Enum2from typing import List, Callable, Optional3
4class SagaStep:5 def __init__(self, name: str, execute: Callable, compensate: Callable):6 self.name = name7 self.execute = execute8 self.compensate = compensate9 self.completed = False10
11class SagaOrchestrator:12 def __init__(self):13 self.steps: List[SagaStep] = []14
15 def add_step(self, step: SagaStep):16 self.steps.append(step)17
18 def execute(self) -> bool:19 completed_steps = []20
21 try:22 for step in self.steps:23 # Execute step24 if not step.execute():25 raise Exception(f"Step {step.name} failed")26
27 step.completed = True28 completed_steps.append(step)29
30 return True # All succeeded31
32 except Exception as e:33 # Compensate in reverse order34 for step in reversed(completed_steps):35 step.compensate()36
37 return False # Transaction failed1import java.util.*;2
3public class SagaStep {4 private String name;5 private Runnable execute;6 private Runnable compensate;7 private boolean completed = false;8
9 public SagaStep(String name, Runnable execute, Runnable compensate) {10 this.name = name;11 this.execute = execute;12 this.compensate = compensate;13 }14
15 public boolean execute() {16 try {17 execute.run();18 completed = true;19 return true;20 } catch (Exception e) {21 return false;22 }23 }24
25 public void compensate() {26 compensate.run();27 }28}29
30public class SagaOrchestrator {31 private List<SagaStep> steps = new ArrayList<>();32
33 public void addStep(SagaStep step) {34 steps.add(step);35 }36
37 public boolean execute() {38 List<SagaStep> completedSteps = new ArrayList<>();39
40 try {41 for (SagaStep step : steps) {42 if (!step.execute()) {43 throw new RuntimeException("Step " + step.name + " failed");44 }45 completedSteps.add(step);46 }47 return true; // All succeeded48 } catch (Exception e) {49 // Compensate in reverse order50 Collections.reverse(completedSteps);51 for (SagaStep step : completedSteps) {52 step.compensate();53 }54 return false; // Transaction failed55 }56 }57}Compensation Pattern
Section titled “Compensation Pattern”1class InventoryService:2 def reserve(self, product_id: str, quantity: int) -> bool:3 # Reserve inventory4 # Returns True if successful5 pass6
7 def release(self, product_id: str, quantity: int):8 # Compensate: Release reserved inventory9 pass10
11class PaymentService:12 def charge(self, user_id: str, amount: float) -> bool:13 # Charge payment14 # Returns True if successful15 pass16
17 def refund(self, user_id: str, amount: float):18 # Compensate: Refund payment19 pass20
21# Usage in Saga22def create_order_saga():23 orchestrator = SagaOrchestrator()24
25 inventory = InventoryService()26 payment = PaymentService()27
28 orchestrator.add_step(SagaStep(29 "reserve_inventory",30 lambda: inventory.reserve("product_123", 1),31 lambda: inventory.release("product_123", 1)32 ))33
34 orchestrator.add_step(SagaStep(35 "charge_payment",36 lambda: payment.charge("user_456", 99.99),37 lambda: payment.refund("user_456", 99.99)38 ))39
40 return orchestrator.execute()1public class InventoryService {2 public boolean reserve(String productId, int quantity) {3 // Reserve inventory4 // Returns true if successful5 return true;6 }7
8 public void release(String productId, int quantity) {9 // Compensate: Release reserved inventory10 }11}12
13public class PaymentService {14 public boolean charge(String userId, double amount) {15 // Charge payment16 // Returns true if successful17 return true;18 }19
20 public void refund(String userId, double amount) {21 // Compensate: Refund payment22 }23}24
25// Usage in Saga26public class OrderSaga {27 public boolean createOrder() {28 SagaOrchestrator orchestrator = new SagaOrchestrator();29 InventoryService inventory = new InventoryService();30 PaymentService payment = new PaymentService();31
32 orchestrator.addStep(new SagaStep(33 "reserve_inventory",34 () -> inventory.reserve("product_123", 1),35 () -> inventory.release("product_123", 1)36 ));37
38 orchestrator.addStep(new SagaStep(39 "charge_payment",40 () -> payment.charge("user_456", 99.99),41 () -> payment.refund("user_456", 99.99)42 ));43
44 return orchestrator.execute();45 }46}2PC vs Saga: When to Use What?
Section titled “2PC vs Saga: When to Use What?”| Aspect | 2PC | Saga |
|---|---|---|
| Consistency | Strong (ACID) | Eventual |
| Performance | Slow (blocking) | Fast (non-blocking) |
| Availability | Low (blocking) | High (non-blocking) |
| Complexity | Medium | High (need compensation) |
| Use Cases | Critical, short transactions | Long-running, distributed |
Deep Dive: Production Considerations for Distributed Transactions
Section titled “Deep Dive: Production Considerations for Distributed Transactions”2PC vs Saga: Production Trade-offs
Section titled “2PC vs Saga: Production Trade-offs”Performance Comparison
Section titled “Performance Comparison”2PC Performance:
- Latency: 100-500ms per transaction (synchronous coordination)
- Throughput: 100-1K transactions/sec (limited by coordinator)
- Blocking: All participants blocked during prepare phase
- Failure Recovery: 10-60 seconds (must query all participants)
Saga Performance:
- Latency: 50-200ms per step (asynchronous execution)
- Throughput: 1K-10K transactions/sec (no blocking)
- Non-blocking: Steps execute independently
- Failure Recovery: 1-5 seconds (compensate only completed steps)
Real-World Benchmark:
- E-commerce checkout (2PC): ~300ms, 500 TPS
- E-commerce checkout (Saga): ~150ms, 3K TPS
- Improvement: Saga is 2x faster, 6x more throughput
Saga Pattern: Advanced Scenarios
Section titled “Saga Pattern: Advanced Scenarios”Scenario 1: Partial Failure Recovery
Section titled “Scenario 1: Partial Failure Recovery”Problem: What if compensation itself fails?
Example:
- Step 1: Reserve inventory ✅
- Step 2: Charge payment ✅
- Step 3: Ship order ❌ (fails)
- Compensate Step 2: Refund ❌ (fails!)
Solutions:
Solution 1: Retry Compensation
1class SagaOrchestrator:2 def compensate_with_retry(self, step, max_retries=3):3 for attempt in range(max_retries):4 try:5 step.compensate()6 return True7 except Exception as e:8 if attempt == max_retries - 1:9 # Log for manual intervention10 self.alert_manual_intervention(step, e)11 return False12 time.sleep(2 ** attempt) # Exponential backoffSolution 2: Compensating Transaction Pattern
- Store compensation intent in database
- Background job retries failed compensations
- Benefit: Eventually consistent compensation
Solution 3: Manual Intervention Queue
- Failed compensations go to queue
- Operations team handles manually
- Use case: Critical financial transactions
Scenario 2: Long-Running Sagas
Section titled “Scenario 2: Long-Running Sagas”Problem: Sagas that take hours/days (e.g., order fulfillment).
Challenges:
- State persistence: Must survive server restarts
- Timeout handling: Steps may timeout
- Idempotency: Steps may be retried
Solution: Persistent Saga State
1class PersistentSagaOrchestrator:2 def __init__(self, db_connection):3 self.db = db_connection4
5 def execute_saga(self, saga_id, steps):6 # Save saga state7 self.db.save_saga_state(saga_id, {8 'status': 'running',9 'current_step': 0,10 'completed_steps': [],11 'compensated_steps': []12 })13
14 try:15 for i, step in enumerate(steps):16 # Load state (survives restarts)17 state = self.db.load_saga_state(saga_id)18
19 if i in state['completed_steps']:20 continue # Already completed (idempotent)21
22 # Execute step23 if step.execute():24 state['completed_steps'].append(i)25 self.db.save_saga_state(saga_id, state)26 else:27 raise Exception(f"Step {i} failed")28 except Exception as e:29 # Compensate completed steps30 self.compensate_saga(saga_id, state['completed_steps'])Scenario 3: Saga Orchestration vs Choreography: Production Comparison
Section titled “Scenario 3: Saga Orchestration vs Choreography: Production Comparison”Orchestration (Central Coordinator):
Pros:
- ✅ Easier to understand: Centralized control flow
- ✅ Better monitoring: Single point to observe
- ✅ Easier debugging: All logic in one place
- ✅ Transaction visibility: Can see entire saga state
Cons:
- ❌ Single point of failure: Coordinator is critical
- ❌ Scalability bottleneck: Coordinator can become overloaded
- ❌ Tighter coupling: Services depend on orchestrator
Production Example: Netflix Conductor
- Centralized orchestration engine
- Handles millions of workflows
- Scaling: Multiple coordinator instances with shared state
Choreography (Event-Driven):
Pros:
- ✅ No single point of failure: Distributed control
- ✅ Better scalability: No coordinator bottleneck
- ✅ Looser coupling: Services communicate via events
Cons:
- ❌ Harder to understand: Distributed control flow
- ❌ Harder to monitor: Must trace events across services
- ❌ Harder to debug: No single place to see state
- ❌ Transaction visibility: Hard to see saga state
Production Example: Event Sourcing + CQRS
- Events stored in event log
- Services react to events
- Monitoring: Event log provides audit trail
Hybrid Approach:
- Use orchestration for critical, complex workflows
- Use choreography for simple, high-volume workflows
- Best of both worlds: Flexibility + visibility
2PC: When It’s Still Used
Section titled “2PC: When It’s Still Used”Despite its problems, 2PC is still used in production:
Use Case 1: Database Clusters
- PostgreSQL: Uses 2PC for distributed transactions
- MySQL Cluster: Uses 2PC for multi-master replication
- Why: Database-level coordination, not application-level
Use Case 2: XA Transactions
- Java EE: JTA (Java Transaction API) uses 2PC
- Spring: @Transactional with XA datasources
- Why: Standard protocol, well-understood
Use Case 3: Short-Lived Transactions
- Microsecond transactions: 2PC overhead acceptable
- Low concurrency: Blocking not a problem
- Why: Simpler than Saga for simple cases
Production Pattern:
1// Use 2PC for simple, fast transactions2@Transactional(rollbackFor = Exception.class)3public void transferMoney(Account from, Account to, double amount) {4 from.debit(amount);5 to.credit(amount);6 // 2PC handles coordination7}Compensation Design Patterns
Section titled “Compensation Design Patterns”Pattern 1: Forward Recovery (Retry)
Section titled “Pattern 1: Forward Recovery (Retry)”Pattern: Retry failed step instead of compensating.
When to use:
- Transient failures: Network hiccups, temporary unavailability
- Idempotent operations: Safe to retry
Example:
1class RetrySagaStep:2 def execute(self, max_retries=3):3 for attempt in range(max_retries):4 try:5 return self.do_work()6 except TransientError as e:7 if attempt == max_retries - 1:8 raise # Give up after retries9 time.sleep(2 ** attempt) # Exponential backoffPattern 2: Backward Recovery (Compensation)
Section titled “Pattern 2: Backward Recovery (Compensation)”Pattern: Compensate completed steps (standard Saga).
When to use:
- Permanent failures: Business logic errors
- Non-idempotent operations: Can’t safely retry
Example: Already covered in main content
Pattern 3: Compensation with TTL
Section titled “Pattern 3: Compensation with TTL”Pattern: Compensations expire after time limit.
Use case: Prevent stale compensations from running.
Example:
1class TimedCompensation:2 def __init__(self, ttl_seconds=3600):3 self.ttl = ttl_seconds4 self.created_at = time.time()5
6 def compensate(self):7 if time.time() - self.created_at > self.ttl:8 raise CompensationExpired("Too much time has passed")9 # Proceed with compensationProduction Best Practices
Section titled “Production Best Practices”Practice 1: Idempotency Keys
Section titled “Practice 1: Idempotency Keys”Pattern: Use idempotency keys to prevent duplicate execution.
Example:
1class IdempotentSagaStep:2 def execute(self, idempotency_key):3 # Check if already executed4 if self.db.exists(idempotency_key):5 return self.db.get_result(idempotency_key)6
7 # Execute and store result8 result = self.do_work()9 self.db.store(idempotency_key, result)10 return resultBenefits:
- ✅ Safe to retry
- ✅ Prevents duplicate execution
- ✅ Handles network retries
Practice 2: Saga Timeout Management
Section titled “Practice 2: Saga Timeout Management”Pattern: Set timeouts for saga execution.
Example:
1class TimedSaga:2 def execute(self, timeout_seconds=300):3 start_time = time.time()4
5 for step in self.steps:6 if time.time() - start_time > timeout_seconds:7 raise SagaTimeout("Saga exceeded timeout")8 step.execute()Production Settings:
- Short sagas: 30-60 seconds (API calls)
- Medium sagas: 5-10 minutes (order processing)
- Long sagas: Hours/days (fulfillment workflows)
Practice 3: Saga Monitoring and Alerting
Section titled “Practice 3: Saga Monitoring and Alerting”What to Monitor:
- Saga completion rate: Should be >99%
- Average saga duration: Track P50, P95, P99
- Compensation rate: High rate indicates problems
- Failed sagas: Alert on failures
Production Metrics:
1class SagaMetrics:2 def record_saga(self, saga_id, duration, success):3 self.metrics.increment('saga.total')4 self.metrics.histogram('saga.duration', duration)5
6 if success:7 self.metrics.increment('saga.success')8 else:9 self.metrics.increment('saga.failure')10 self.alert_on_failure(saga_id)Performance Benchmarks: Real-World Numbers
Section titled “Performance Benchmarks: Real-World Numbers”| Pattern | Latency | Throughput | Failure Recovery | Use Case |
|---|---|---|---|---|
| 2PC | 100-500ms | 100-1K TPS | 10-60s | Database clusters |
| Saga (Orchestration) | 50-200ms | 1K-10K TPS | 1-5s | Microservices |
| Saga (Choreography) | 30-150ms | 5K-50K TPS | 1-3s | Event-driven |
| Local Transaction | 5-20ms | 10K-100K TPS | N/A | Single service |
Key Insights:
- Saga is 2-5x faster than 2PC
- Choreography is faster than orchestration (no coordinator overhead)
- Local transactions are 10x faster (no coordination needed)
Key Takeaways
Section titled “Key Takeaways”What’s Next?
Section titled “What’s Next?”Now that you understand distributed transactions, let’s explore how to handle conflicts when multiple operations happen simultaneously:
Next up: Conflict Resolution Strategies — Learn last-write-wins, vector clocks, and CRDTs for handling concurrent updates.