Design a Rate Limiter

Design a rate limiter system that restricts the number of requests a user or client can make to an API within a specified time window.

What is the Rate Limiter Problem?

Design a rate limiter system that restricts the number of requests a user or client can make to an API within a specified time window. The system should support multiple rate limiting algorithms, handle concurrent requests efficiently, and be scalable across distributed environments.

In this problem, you’ll design a middleware-style component that intercepts requests and decides whether to “Allow” or “Drop” them based on configured limits.

Problem Overview

Design a service that tracks request counts for various identifiers and enforces throughput limits to ensure system stability.

Core Requirements

Functional Requirements:

Throughput Control: Limit requests within defined windows (e.g., 100 req/min).
Multiple Algorithms: Support Fixed Window Counter, Sliding Window Counter, and Token Bucket.
Flexible Identification: Allow configuration per user, per IP address, or per API key.
Error Feedback: Return HTTP 429 when limits are exceeded, with reset time information.
Precise Tracking: Maintain accurate counts and timestamps for enforcement.
Tiered Limits: Support different limits for different user categories (e.g., Premium vs. Free).
Admission Control: Check if a request is allowed before any processing occurs.

Non-Functional Requirements:

Low Latency: Checks must be fast enough to minimize impact on API response times.
Thread Safety: Handle high-concurrency updates to counters without race conditions.
Scalability: Designed for distributed environments (e.g., Redis integration).
Extensibility: Use the Strategy Pattern to easily add new limiting algorithms.
Testability: Core limiting logic must be easy to verify and maintain.

What’s Expected?

1. System Architecture

The rate limiter typically sits between the API Gateway and the Backend Service.

2. Key Classes to Design

System Flow

Token Bucket Logic Flow

Key Design Challenges

1. Race Conditions in Counters

If two threads read a counter as 9 (limit 10) at the same time, they both might increment it to 10 and allow the request, effectively letting 11 requests through.

Solution: Use Atomic Operations. In Java, use AtomicLong or LongAdder. In a distributed system, use Redis’ INCR or Lua scripts to ensure the “check-and-increment” step is atomic.

2. Algorithm Tradeoffs

Fixed Window: Simple but suffers from “spikes” at the edge of windows.
Token Bucket: Great for handling bursts, but requires tracking timestamps for every user.
Sliding Window: Most accurate, but memory-intensive as it stores timestamps of every request.

Solution: Use the Strategy Pattern. Encapsulate each algorithm in a strategy class, allowing the user to choose the best one for their specific resource constraints.

3. Memory Management for Dormant Users

If you have 1 million users but only 1000 are active, you shouldn’t keep buckets in memory for everyone.

Solution: Use an Expiring Cache (TTL). Use a structure like a LinkedHashMap (for LRU eviction) or a specialized cache library (Caffeine/Guava) to automatically remove rate limit data for users who haven’t made a request in a long time.

What You’ll Learn

By solving this problem, you’ll master:

✅ Algorithm Implementation - Turning math formulas into production code.
✅ High-Concurrency Patterns - Managing shared state without bottlenecks.
✅ Middleware Design - Building components that wrap other services.
✅ Distributed Thinking - Moving from in-memory logic to cross-server state.

View Complete Solution & Practice

Ready to see the full implementation? Open the interactive playground to access:

🎯 Step-by-step guidance through the 8-step LLD approach
📊 Interactive UML builder to visualize your design
💻 Complete Code Solutions in Python, Java, C++, TypeScript, JavaScript, C#
🤖 AI-powered review of your design and code

🚀 View Solution in Playground

After mastering the Rate Limiter, try these similar problems:

Order Matching Engine - Another high-throughput logic problem.
Notification Service - Handling message bursts and routing.
URL Shortener - Focus on unique ID generation and storage.

Request a feature or report an issue