Distributed Transactions

Coordinating operations across distributed systems

The Challenge of Distributed Transactions

In a single database, transactions are straightforward: all operations succeed or all fail (atomicity). But what happens when your transaction spans multiple services or databases?

What is a Distributed Transaction?

A distributed transaction is a transaction that spans multiple services or databases. It requires coordination to ensure atomicity: all operations succeed or all fail.

The Challenge: How do you ensure all succeed or all fail when services are distributed?

Solution 1: Two-Phase Commit (2PC)

Two-Phase Commit (2PC) is a protocol that coordinates distributed transactions through two phases.

How 2PC Works

Phase 1: Prepare (Voting)

Coordinator sends “prepare” message to all participants
Each participant:
- Locks resources
- Performs validation
- Votes YES (ready) or NO (abort)
Participants cannot abort after voting YES (they’re locked)

Phase 2: Commit or Abort

If all vote YES:

Coordinator sends “commit” to all
Participants commit and release locks

If any votes NO:

Coordinator sends “abort” to all
Participants rollback and release locks

2PC Flow Diagram

Problems with 2PC

Problem	Description	Impact
Blocking	Nodes wait if coordinator fails	Resources locked indefinitely
SPOF	Coordinator is critical	System fails if coordinator dies
Performance	Synchronous, blocking	Slow, doesn’t scale
Partitions	Doesn’t handle network splits	Can’t make progress during partitions

Solution 2: Saga Pattern

The Saga pattern breaks a distributed transaction into a sequence of local transactions. Each step has a compensating action that undoes it.

How Saga Works

Key Difference from 2PC: Saga uses compensation (undo operations) instead of rollback (database transaction rollback).

What is Compensation?

Compensation is an operation that undoes the effects of a previous operation. Unlike database rollback, compensation may not perfectly reverse the operation, but makes the system consistent.

Example: You can’t “unsend” an email, but you can send an apology email. That’s compensation!

Saga Orchestration vs Choreography

Orchestration: Central Coordinator

Characteristics:

Centralized control (easier to understand)
Easier to monitor and debug
Single point of failure (orchestrator)
Tighter coupling

Choreography: Event-Driven

Characteristics:

No single point of failure
Loosely coupled services
Harder to understand flow
Harder to monitor

Saga Example: E-Commerce Order

If Step 2 (Charge Payment) fails:

Compensate Step 1: Release reserved inventory
Transaction ends (no further steps)

If Step 4 (Send Notification) fails:

Steps 1-3 already succeeded (can’t undo shipping!)
Send apology notification (compensation)

LLD ↔ HLD Connection

How distributed transactions affect your class design:

Saga Orchestrator Implementation

Compensation Pattern

2PC vs Saga: When to Use What?

Aspect	2PC	Saga
Consistency	Strong (ACID)	Eventual
Performance	Slow (blocking)	Fast (non-blocking)
Availability	Low (blocking)	High (non-blocking)
Complexity	Medium	High (need compensation)
Use Cases	Critical, short transactions	Long-running, distributed

Real-World Examples

Example 1: E-Commerce Order Processing (Saga Pattern)

Company: Amazon, eBay, Shopify

Scenario: When a customer places an order, multiple services must coordinate:

Inventory Service: Reserve items
Payment Service: Charge the customer
Shipping Service: Create shipment
Notification Service: Send confirmation email

Implementation: Uses Saga Orchestration pattern with compensation:

Why Saga? Order processing is long-running (can take minutes), involves multiple services, and requires compensation if any step fails. 2PC would block resources for too long.

Real-World Impact:

Amazon: Processes millions of orders daily using Saga pattern
Failure Rate: <0.1% of orders require compensation
Performance: Average order processing time: 2-5 seconds

Example 2: Banking Money Transfer (2PC Pattern)

Company: Traditional Banks, Financial Institutions

Scenario: Transferring money between accounts requires atomicity across multiple databases:

Debit from source account
Credit to destination account
Log transaction in audit database

Implementation: Uses 2PC for strong ACID guarantees:

Why 2PC? Financial transactions require strong consistency and ACID guarantees. The blocking nature is acceptable for critical financial operations.

Real-World Impact:

Banks: Use 2PC for inter-account transfers
Latency: 100-300ms per transfer (acceptable for financial operations)
Reliability: 99.99% success rate with proper error handling

Example 3: Hotel Booking System (Saga Choreography)

Company: Booking.com, Expedia, Airbnb

Scenario: Booking a hotel room involves:

Availability Service: Check and reserve room
Payment Service: Process payment
Confirmation Service: Send booking confirmation
Loyalty Service: Update loyalty points

Implementation: Uses Saga Choreography (event-driven):

Why Choreography? Hotel bookings are high-volume, services are loosely coupled, and the system needs to scale horizontally. No single coordinator bottleneck.

Real-World Impact:

Booking.com: Processes 1.5+ million bookings daily
Throughput: 10K+ bookings per second during peak
Scalability: Each service scales independently

Example 4: Microservices Order Fulfillment (Saga with Timeouts)

Company: Netflix (content delivery), Uber (ride booking)

Scenario: Complex workflows that span hours or days:

Order Service: Create order
Inventory Service: Reserve items
Payment Service: Authorize payment
Fulfillment Service: Process order (takes hours)
Shipping Service: Ship order (takes days)

Implementation: Uses Saga with Persistent State and timeouts:

Why Persistent Saga? Long-running workflows must survive server restarts, handle timeouts, and support idempotent retries.

Real-World Impact:

Netflix: Content delivery workflows span days
Uber: Ride booking workflows handle cancellations hours later
Reliability: 99.9% completion rate with proper timeout handling

Deep Dive: Production Considerations for Distributed Transactions

2PC vs Saga: Production Trade-offs

Performance Comparison

2PC Performance:

Latency: 100-500ms per transaction (synchronous coordination)
Throughput: 100-1K transactions/sec (limited by coordinator)
Blocking: All participants blocked during prepare phase
Failure Recovery: 10-60 seconds (must query all participants)

Saga Performance:

Latency: 50-200ms per step (asynchronous execution)
Throughput: 1K-10K transactions/sec (no blocking)
Non-blocking: Steps execute independently
Failure Recovery: 1-5 seconds (compensate only completed steps)

Real-World Benchmark:

E-commerce checkout (2PC): ~300ms, 500 TPS
E-commerce checkout (Saga): ~150ms, 3K TPS
Improvement: Saga is 2x faster, 6x more throughput

Saga Pattern: Advanced Scenarios

Scenario 1: Partial Failure Recovery

Problem: What if compensation itself fails?

Example:

Step 1: Reserve inventory
Step 2: Charge payment
Step 3: Ship order (fails)
Compensate Step 2: Refund (fails!)

Solutions:

Solution 1: Retry Compensation

1
class SagaOrchestrator:
2
    def compensate_with_retry(self, step, max_retries=3):
3
        for attempt in range(max_retries):
4
            try:
5
                step.compensate()
6
                return True
7
            except Exception as e:
8
                if attempt == max_retries - 1:
9
                    # Log for manual intervention
10
                    self.alert_manual_intervention(step, e)
11
                    return False
12
                time.sleep(2 ** attempt)  # Exponential backoff

Solution 2: Compensating Transaction Pattern

Store compensation intent in database
Background job retries failed compensations
Benefit: Eventually consistent compensation

Solution 3: Manual Intervention Queue

Failed compensations go to queue
Operations team handles manually
Use case: Critical financial transactions

Scenario 2: Long-Running Sagas

Problem: Sagas that take hours/days (e.g., order fulfillment).

Challenges:

State persistence: Must survive server restarts
Timeout handling: Steps may timeout
Idempotency: Steps may be retried

Solution: Persistent Saga State

1
class PersistentSagaOrchestrator:
2
    def __init__(self, db_connection):
3
        self.db = db_connection
4

5
    def execute_saga(self, saga_id, steps):
6
        # Save saga state
7
        self.db.save_saga_state(saga_id, {
8
            'status': 'running',
9
            'current_step': 0,
10
            'completed_steps': [],
11
            'compensated_steps': []
12
        })
13

14
        try:
15
            for i, step in enumerate(steps):
16
                # Load state (survives restarts)
17
                state = self.db.load_saga_state(saga_id)
18

19
                if i in state['completed_steps']:
20
                    continue  # Already completed (idempotent)
21

22
                # Execute step
23
                if step.execute():
24
                    state['completed_steps'].append(i)
25
                    self.db.save_saga_state(saga_id, state)
26
                else:
27
                    raise Exception(f"Step {i} failed")
28
        except Exception as e:
29
            # Compensate completed steps
30
            self.compensate_saga(saga_id, state['completed_steps'])

Scenario 3: Saga Orchestration vs Choreography: Production Comparison

Orchestration (Central Coordinator):

Pros:

Easier to understand: Centralized control flow
Better monitoring: Single point to observe
Easier debugging: All logic in one place
Transaction visibility: Can see entire saga state

Cons:

Single point of failure: Coordinator is critical
Scalability bottleneck: Coordinator can become overloaded
Tighter coupling: Services depend on orchestrator

Production Example: Netflix Conductor

Centralized orchestration engine
Handles millions of workflows
Scaling: Multiple coordinator instances with shared state

Choreography (Event-Driven):

Pros:

No single point of failure: Distributed control
Better scalability: No coordinator bottleneck
Looser coupling: Services communicate via events

Cons:

Harder to understand: Distributed control flow
Harder to monitor: Must trace events across services
Harder to debug: No single place to see state
Transaction visibility: Hard to see saga state

Production Example: Event Sourcing + CQRS

Events stored in event log
Services react to events
Monitoring: Event log provides audit trail

Hybrid Approach:

Use orchestration for critical, complex workflows
Use choreography for simple, high-volume workflows
Best of both worlds: Flexibility + visibility

2PC: When It’s Still Used

Despite its problems, 2PC is still used in production:

Use Case 1: Database Clusters

PostgreSQL: Uses 2PC for distributed transactions
MySQL Cluster: Uses 2PC for multi-master replication
Why: Database-level coordination, not application-level

Use Case 2: XA Transactions

Java EE: JTA (Java Transaction API) uses 2PC
Spring: @Transactional with XA datasources
Why: Standard protocol, well-understood

Use Case 3: Short-Lived Transactions

Microsecond transactions: 2PC overhead acceptable
Low concurrency: Blocking not a problem
Why: Simpler than Saga for simple cases

Production Pattern:

1
// Use 2PC for simple, fast transactions
2
@Transactional(rollbackFor = Exception.class)
3
public void transferMoney(Account from, Account to, double amount) {
4
    from.debit(amount);
5
    to.credit(amount);
6
    // 2PC handles coordination
7
}

Compensation Design Patterns

Pattern 1: Forward Recovery (Retry)

Pattern: Retry failed step instead of compensating.

When to use:

Transient failures: Network hiccups, temporary unavailability
Idempotent operations: Safe to retry

Example:

1
class RetrySagaStep:
2
    def execute(self, max_retries=3):
3
        for attempt in range(max_retries):
4
            try:
5
                return self.do_work()
6
            except TransientError as e:
7
                if attempt == max_retries - 1:
8
                    raise  # Give up after retries
9
                time.sleep(2 ** attempt)  # Exponential backoff

Pattern 2: Backward Recovery (Compensation)

Pattern: Compensate completed steps (standard Saga).

When to use:

Permanent failures: Business logic errors
Non-idempotent operations: Can’t safely retry

Example: Already covered in main content

Pattern 3: Compensation with TTL

Pattern: Compensations expire after time limit.

Use case: Prevent stale compensations from running.

Example:

1
class TimedCompensation:
2
    def __init__(self, ttl_seconds=3600):
3
        self.ttl = ttl_seconds
4
        self.created_at = time.time()
5

6
    def compensate(self):
7
        if time.time() - self.created_at > self.ttl:
8
            raise CompensationExpired("Too much time has passed")
9
        # Proceed with compensation

Production Best Practices

Practice 1: Idempotency Keys

Pattern: Use idempotency keys to prevent duplicate execution.

Example:

1
class IdempotentSagaStep:
2
    def execute(self, idempotency_key):
3
        # Check if already executed
4
        if self.db.exists(idempotency_key):
5
            return self.db.get_result(idempotency_key)
6

7
        # Execute and store result
8
        result = self.do_work()
9
        self.db.store(idempotency_key, result)
10
        return result

Benefits:

Safe to retry
Prevents duplicate execution
Handles network retries

Practice 2: Saga Timeout Management

Pattern: Set timeouts for saga execution.

Example:

1
class TimedSaga:
2
    def execute(self, timeout_seconds=300):
3
        start_time = time.time()
4

5
        for step in self.steps:
6
            if time.time() - start_time > timeout_seconds:
7
                raise SagaTimeout("Saga exceeded timeout")
8
            step.execute()

Production Settings:

Short sagas: 30-60 seconds (API calls)
Medium sagas: 5-10 minutes (order processing)
Long sagas: Hours/days (fulfillment workflows)

Practice 3: Saga Monitoring and Alerting

What to Monitor:

Saga completion rate: Should be >99%
Average saga duration: Track P50, P95, P99
Compensation rate: High rate indicates problems
Failed sagas: Alert on failures

Production Metrics:

1
class SagaMetrics:
2
    def record_saga(self, saga_id, duration, success):
3
        self.metrics.increment('saga.total')
4
        self.metrics.histogram('saga.duration', duration)
5

6
        if success:
7
            self.metrics.increment('saga.success')
8
        else:
9
            self.metrics.increment('saga.failure')
10
            self.alert_on_failure(saga_id)

Performance Benchmarks: Real-World Numbers

Pattern	Latency	Throughput	Failure Recovery	Use Case
2PC	100-500ms	100-1K TPS	10-60s	Database clusters
Saga (Orchestration)	50-200ms	1K-10K TPS	1-5s	Microservices
Saga (Choreography)	30-150ms	5K-50K TPS	1-3s	Event-driven
Local Transaction	5-20ms	10K-100K TPS	N/A	Single service

Key Insights:

Saga is 2-5x faster than 2PC
Choreography is faster than orchestration (no coordinator overhead)
Local transactions are 10x faster (no coordination needed)

Key Takeaways

What’s Next?

Now that you understand distributed transactions, let’s explore how to handle conflicts when multiple operations happen simultaneously:

Next up: Conflict Resolution Strategies — Learn last-write-wins, vector clocks, and CRDTs for handling concurrent updates.

Request a feature or report an issue