Master the algorithms that protect your system from overload. Token buckets, sliding windows, and more.
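Tokens are added to a bucket at a fixed rate, up to a maximum capacity. Each request consumes a token and is rejected when none are left. Allows bursts of traffic up to the bucket's capacity.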
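A minimal sketch of the token bucket idea, assuming a single process and no locking. The class name TokenBucket, its parameters (rate, capacity, cost), and the optional now argument (used here so the clock can be supplied by the caller) are illustrative choices, not part of any particular library.

```python
import time


class TokenBucket:
    """Hypothetical token bucket: refills `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # assumption: start full so an initial burst is allowed
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None, cost=1):
        """Return True and consume `cost` tokens if enough are available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill lazily based on elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage: 1 token per second, bursts of up to 3 requests.
demo = TokenBucket(rate=1.0, capacity=3, now=0.0)
burst = [demo.allow(now=0.0) for _ in range(4)]  # [True, True, True, False]
later = demo.allow(now=1.0)                      # True: one token was refilled
```

Refilling lazily on each call avoids a background timer; a shared or multi-threaded limiter would additionally need a lock or an atomic store.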
Requests enter a queue (bucket) and are processed (leaked) at a constant rate. Smooths out bursts.
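The queue-based leaky bucket can be sketched as follows. This is an illustration under stated assumptions: the names LeakyBucket, offer, and leak are hypothetical, the caller passes the current time explicitly, and requests arriving at a full queue are simply rejected.

```python
from collections import deque


class LeakyBucket:
    """Hypothetical leaky bucket: a bounded queue drained at `leak_rate` requests/second."""

    def __init__(self, leak_rate, capacity):
        self.interval = 1.0 / leak_rate  # seconds between processed requests
        self.capacity = capacity
        self.queue = deque()
        self.last_leak = 0.0

    def offer(self, request, now):
        """Enqueue a request; return False if the bucket is full."""
        if len(self.queue) >= self.capacity:
            return False
        if not self.queue:
            # Assumption: idle time earns no credit, the clock restarts here.
            self.last_leak = now
        self.queue.append(request)
        return True

    def leak(self, now):
        """Return the requests that are due for processing by time `now`."""
        released = []
        while self.queue and now - self.last_leak >= self.interval:
            released.append(self.queue.popleft())
            self.last_leak += self.interval
        return released


# Usage: a burst of 4 arrives at t=0, but output is paced at 2 per second.
demo = LeakyBucket(leak_rate=2, capacity=3)
accepted = [demo.offer(r, now=0.0) for r in "abcd"]  # [True, True, True, False]
first = demo.leak(now=1.0)                           # ['a', 'b']
second = demo.leak(now=1.5)                          # ['c']
```

However large the incoming burst, the output never exceeds the leak rate, which is the smoothing effect described above.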
Counts requests in fixed time windows (e.g., 1 min). Vulnerable to spikes at window edges: a burst that straddles a boundary can let through up to twice the limit.
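A small sketch of a fixed window counter, which also reproduces the edge spike. The class name FixedWindowCounter and its parameters are assumptions made for illustration, and time is passed in explicitly as seconds.

```python
class FixedWindowCounter:
    """Hypothetical fixed window limiter: at most `limit` requests per window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.current_window = None
        self.count = 0

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # A new window has started: reset the counter.
            self.current_window = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


# Usage: limit of 3 per 60 s, yet 6 requests pass within about one second
# because they fall on either side of the boundary at t=60.
demo = FixedWindowCounter(limit=3, window_seconds=60)
passed = sum(demo.allow(t) for t in [59, 59, 59, 60, 60, 60])  # 6
```

Only one counter and one window id are stored per client, which is why this approach is cheap despite the boundary problem.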
Tracks the timestamp of each request and removes timestamps older than the window. Accurate, but memory cost is high because every request is stored.
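The timestamp log can be sketched with a deque. The name SlidingWindowLog is hypothetical, and this version assumes that rejected requests are not recorded (some variants log them as well).

```python
from collections import deque


class SlidingWindowLog:
    """Hypothetical sliding window log: at most `limit` requests in any `window_seconds` span."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now):
        cutoff = now - self.window
        # Remove timestamps that have fallen out of the window.
        while self.log and self.log[0] <= cutoff:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False


# Usage: the boundary burst that slips past a fixed window is rejected here.
demo = SlidingWindowLog(limit=3, window_seconds=60)
results = [demo.allow(t) for t in [59, 59, 59, 60, 60, 60]]
# [True, True, True, False, False, False]
```

Memory grows with the limit, since up to `limit` timestamps are kept per client.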
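Combines the Fixed Window and Log approaches. Approximates the sliding-window count as the current window's count plus the previous window's count, weighted by the fraction of the previous window that still overlaps.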
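A sketch of the weighted estimate, assuming the common formulation: estimate = previous_count * (1 - elapsed_fraction) + current_count, where elapsed_fraction is how far the current window has progressed. The class name SlidingWindowCounter and its fields are illustrative.

```python
class SlidingWindowCounter:
    """Hypothetical sliding window counter using two fixed-window counts."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.current_window = 0
        self.current_count = 0
        self.previous_count = 0

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # Carry the old count over only if the windows are adjacent.
            adjacent = window_id == self.current_window + 1
            self.previous_count = self.current_count if adjacent else 0
            self.current_count = 0
            self.current_window = window_id
        elapsed_fraction = (now % self.window) / self.window
        # Weight the previous window by the part that still overlaps.
        estimate = self.previous_count * (1 - elapsed_fraction) + self.current_count
        if estimate < self.limit:
            self.current_count += 1
            return True
        return False


# Usage: limit of 4 per 60 s. After 4 requests at t=59, a request at t=60
# is rejected, while one at t=75 passes (estimate 4 * 0.75 = 3.0).
demo = SlidingWindowCounter(limit=4, window_seconds=60)
filled = [demo.allow(59) for _ in range(4)]  # [True, True, True, True]
at_edge = demo.allow(60)                     # False
later = demo.allow(75)                       # True
```

Only two counters are stored per client, at the price of assuming that requests in the previous window were evenly spread.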
Where to place the limiter? API Gateway vs Load Balancer vs alongside the application (in application code or in a sidecar).