
Distributed Locks

Coordinating actions across distributed nodes

In distributed systems, multiple nodes often need to coordinate access to shared resources:

  • Database updates: Only one node should update a record at a time
  • Cache invalidation: Prevent multiple nodes from invalidating cache simultaneously
  • Scheduled tasks: Ensure only one node runs a scheduled job
  • Resource allocation: Coordinate access to limited resources

The Challenge: Traditional locks (like mutexes) only work within a single process. In distributed systems, we need locks that work across multiple nodes, handle network failures, and prevent deadlocks.


A distributed lock is a coordination mechanism that ensures only one process or node can hold a lock at a time across a distributed system. It provides mutual exclusion across network boundaries.

  1. Mutual Exclusion: Only one holder at a time
  2. Deadlock Free: Locks are eventually released (via timeout/lease)
  3. Fault Tolerant: Survives node failures
  • High Availability: The lock service itself must remain available
  5. Performance: Low latency, high throughput

Think of a distributed lock like a bathroom key at a restaurant:

  • Only one person can have the key at a time
  • If someone forgets to return the key, there’s a timeout mechanism (staff has a master key)
  • Multiple people can request the key, but only one gets it
  • The key must be returned for others to use it

Lease-based locks automatically expire after a timeout period. This prevents deadlocks from crashed nodes.


How It Works:

  1. Node acquires lock with a lease time (e.g., 10 seconds)
  2. Node must renew lock before lease expires
  3. If node crashes, lock automatically expires
  4. Other nodes can acquire lock after expiration

Benefits:

  • Prevents deadlocks from crashed nodes
  • Automatic cleanup
  • No manual lock release needed if node crashes
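The four steps above can be sketched with a small in-memory simulation. This is illustrative only (the class and method names are invented for this sketch, not a real library API); in production the lease state would live in a shared store such as Redis, etcd, or ZooKeeper:

```python
import time
import uuid

class LeaseLock:
    """In-memory sketch of a lease-based lock. A stand-in for a
    distributed store: the names here are illustrative, not a real API."""

    def __init__(self):
        self._holder = None      # (owner_id, expiry_timestamp) or None

    def acquire(self, owner_id, lease_seconds, now=None):
        now = time.monotonic() if now is None else now
        # An expired lease counts as released: a crashed holder loses the lock.
        if self._holder is None or self._holder[1] <= now:
            self._holder = (owner_id, now + lease_seconds)
            return True
        return False

    def renew(self, owner_id, lease_seconds, now=None):
        now = time.monotonic() if now is None else now
        # Only the current, unexpired holder may extend its lease.
        if self._holder and self._holder[0] == owner_id and self._holder[1] > now:
            self._holder = (owner_id, now + lease_seconds)
            return True
        return False

lock = LeaseLock()
a, b = str(uuid.uuid4()), str(uuid.uuid4())
assert lock.acquire(a, lease_seconds=10, now=0.0)      # A gets the lock
assert not lock.acquire(b, lease_seconds=10, now=5.0)  # B refused while lease is live
assert lock.renew(a, lease_seconds=10, now=9.0)        # A extends its lease to t=19
assert lock.acquire(b, lease_seconds=10, now=20.0)     # lease expired: B takes over
```

Passing `now` explicitly makes the expiry behaviour easy to test; a real implementation would rely on the store's own clock for expiration.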

Redis provides a simple way to implement distributed locks using the SET command with NX (only if not exists) and EX (expiration) options.


Key Points:

  • Use unique value (like UUID) to verify ownership
  • Set expiration to prevent deadlocks
  • Check the return value: OK means the lock was acquired; nil means it is already held
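The acquire step can be sketched as follows. To keep the example self-contained, `FakeRedis` is a tiny in-memory stand-in implementing just the `SET key value NX EX` semantics described above; with the real redis-py client the call has the same shape (`client.set(key, value, nx=True, ex=ttl)`):

```python
import time
import uuid

class FakeRedis:
    """In-memory stand-in for a Redis client, implementing only the
    SET-with-NX-and-EX behaviour used for locking (illustrative only)."""

    def __init__(self):
        self._data = {}          # key -> (value, expiry_timestamp)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        live = key in self._data and self._data[key][1] > now
        if nx and live:
            return None          # Redis returns nil when NX fails
        expiry = (now + ex) if ex is not None else float("inf")
        self._data[key] = (value, expiry)
        return True              # Redis returns OK

    def get(self, key):
        entry = self._data.get(key)
        return entry[0] if entry and entry[1] > time.monotonic() else None

def acquire_lock(client, key, ttl_seconds):
    token = str(uuid.uuid4())    # unique value proves ownership at release time
    if client.set(key, token, nx=True, ex=ttl_seconds):
        return token
    return None

r = FakeRedis()
t1 = acquire_lock(r, "lock:orders", ttl_seconds=10)
t2 = acquire_lock(r, "lock:orders", ttl_seconds=10)
print(t1 is not None, t2)  # True None
```

The returned token is kept by the caller and presented again at release, which is what makes the ownership check in the next section possible.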

Problem: A network partition can cause split-brain, where two nodes each believe they hold the lock.

Solution: Use majority consensus (such as the Redlock algorithm) or accept that locks are best-effort.
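The majority rule can be sketched as below. This is a deliberately simplified illustration of the quorum idea only: the `Node` class is an invented stand-in for an independent Redis instance, and real Redlock additionally accounts for lease validity time, acquisition latency, and clock drift:

```python
import uuid

class Node:
    """One independent lock store (illustrative stand-in for a Redis instance)."""
    def __init__(self):
        self.holder = None

    def try_lock(self, token):
        if self.holder is None:
            self.holder = token
            return True
        return False

def acquire_majority(nodes, token):
    # Quorum rule: the lock is considered held only if a strict
    # majority of independent nodes granted it.
    granted = sum(1 for n in nodes if n.try_lock(token))
    return granted > len(nodes) // 2

nodes = [Node() for _ in range(5)]
nodes[0].holder = "someone-else"          # one node is already taken
token = str(uuid.uuid4())
print(acquire_majority(nodes, token))     # True: 4 of 5 nodes granted it
```

Because two clients cannot both win a majority of the same node set, a partition can at worst prevent anyone from acquiring the lock, rather than producing two holders.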

Problem: Clocks drift between nodes, so a lease may appear to expire at different times on different machines.

Solution: Use logical clocks or ensure clock synchronization (NTP).

Problem: If lock holder crashes, lock might never be released.

Solution: Use lease-based locks with automatic expiration.

Problem: Lock acquisition adds latency.

Solution: Use local locks when possible, minimize lock hold time, use optimistic locking when appropriate.
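Optimistic locking avoids holding a lock at all: each writer records the version it read and the update is rejected if the version has since changed. A minimal sketch of that version-check pattern (the class is invented for illustration; databases typically express this as a `WHERE version = ?` clause):

```python
class VersionedRecord:
    """Optimistic locking sketch: writers submit the version they read,
    and the update is rejected if another writer got there first."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def update(self, expected_version, new_value):
        if self.version != expected_version:
            return False         # stale read: caller must re-read and retry
        self.value = new_value
        self.version += 1
        return True

rec = VersionedRecord("a")
v = rec.version                  # both writers read version 0
assert rec.update(v, "b")        # first writer wins, version becomes 1
assert not rec.update(v, "c")    # second writer's stale version is rejected
```

This trades lock-acquisition latency for occasional retries, which pays off when conflicts are rare.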

Ensure only one node updates a record at a time. Prevents race conditions and data corruption.

Ensure only one node runs a scheduled job. Prevents duplicate execution across multiple nodes.

Coordinate cache invalidation across nodes. Prevents multiple nodes from invalidating cache simultaneously.

Coordinate access to limited resources (like API rate limits, connection pools).
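The scheduled-task use case reduces to a try-lock guard around the job: every node attempts the lock, and only the winner runs the job. A toy sketch of that pattern (the shared dict is an illustrative stand-in for the lock service):

```python
# Shared "lock table" standing in for the distributed lock service
# (illustrative only: in production this would be Redis, etcd, or ZooKeeper).
held = {}

def run_if_leader(job_name, node_id, job):
    # setdefault is our stand-in for an atomic try-lock:
    # only the node that wins the entry executes the job.
    if held.setdefault(job_name, node_id) == node_id:
        return job()
    return None

results = [run_if_leader("nightly-report", n, lambda: "ran")
           for n in ("node-1", "node-2", "node-3")]
print(results)  # ['ran', None, None]
```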

Advantages:

  • Provides mutual exclusion across nodes
  • Prevents race conditions
  • Coordinates distributed operations

Disadvantages:

  • Adds latency
  • Single point of failure (lock service)
  • Complex to implement correctly
  • Network partitions can cause issues

Mutual Exclusion

Ensures only one node holds the lock at a time. Provides coordination across distributed systems.

Lease-Based

Locks automatically expire after lease time. Prevents deadlocks from crashed nodes. Requires renewal.

Ownership Verification

Use a unique value (such as a UUID) to verify lock ownership before release. Prevents releasing a lock held by another node.

Atomic Operations

Use Lua scripts or atomic Redis commands for lock acquisition and release. Ensures correctness.
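The release step illustrates why atomicity matters: checking the token and deleting the key must happen as one operation, or the lock could expire and be re-acquired between the two steps. In Redis this is the well-known check-and-delete Lua script run via EVAL; below it is shown alongside a plain-Python simulation of its logic (the dict store is a stand-in, not a real client):

```python
import uuid

# The canonical Redis release script, executed atomically server-side via EVAL:
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""

def release(store, key, token):
    """Python simulation of the script above: delete the lock only if the
    stored token matches, so we never release a lock held by someone else."""
    if store.get(key) == token:
        del store[key]
        return 1
    return 0

store = {}
mine, theirs = str(uuid.uuid4()), str(uuid.uuid4())
store["lock:job"] = theirs
print(release(store, "lock:job", mine))    # 0: token mismatch, lock kept
print(release(store, "lock:job", theirs))  # 1: owner releases successfully
```

A plain GET followed by DEL from the client would race with lease expiry; pushing the check into a single server-side script closes that window.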