---
title: "Rate Limiting Pattern in Java: Controlling System Overload Gracefully"
shortTitle: Rate Limiting
description: "Explore multiple rate limiting strategies in Java: Token Bucket, Fixed Window, and Adaptive Limiting. Learn with diagrams, programmatic examples, and real-world simulation."
category: Behavioral
language: en
tag:
  - Resilience
  - System Overload Protection
  - API Throttling
  - Concurrency
  - Cloud Patterns
---

## Also known as

- Throttling
- Request Limiting
- API Rate Limiting

---

## Intent of Rate Limiting Design Pattern

Regulate the number of requests sent to a service within a specific time window to avoid resource exhaustion and keep the system stable. This is especially useful in distributed and cloud-native architectures.

---

## Detailed Explanation of Rate Limiting with Real-World Examples

### Real-world example

Imagine a concert hall that admits only 50 people per minute. If too many fans arrive at once, the gate staff slows entry and lets in only a few at a time, which prevents overcrowding and keeps everyone safe. Similarly, a rate limiter controls how many requests are processed so that the server is not overloaded.

### In plain words

Regulate the number of requests a system handles within a time frame to protect availability and performance.

### AWS says

> "API Gateway limits the steady-state rate and burst rate of requests that it allows for each method in your REST APIs. When request rates exceed these limits, API Gateway begins to throttle requests."

Source: [API Gateway quotas and important notes - AWS Documentation](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html)

---

## Architecture Diagram

The UML diagram shows the key components:

- `RateLimiter` interface
- `TokenBucketRateLimiter`, `FixedWindowRateLimiter`, `AdaptiveRateLimiter`
- Supporting exception classes
- `FindCustomerRequest` as a rate-limited operation

---

## Flowcharts

### 1. Token Bucket Strategy

### 2. Fixed Window Strategy

### 3. Adaptive Rate Limiting Strategy

---

### Programmatic Example of Rate Limiter Pattern in Java

The **Rate Limiter** design pattern helps protect systems from overload by restricting the number of operations that can be performed in a given time window. It is especially useful when accessing shared resources, APIs, or services that are sensitive to spikes in traffic.

This implementation demonstrates three strategies for rate limiting:

- **Token Bucket Rate Limiter**
- **Fixed Window Rate Limiter**
- **Adaptive Rate Limiter**

Let's walk through the key components.

---
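The limiter listings in this section rely on a `RateLimiter` interface and two exception types that the component list names but the text does not show. The following is a minimal sketch of what they might look like, inferred only from how the listings call them. The `check(serviceName, operationName)` signature and the constructor argument orders come from those call sites. The checked-exception base class, the `getRetryAfterMillis()` accessor, and the message format are assumptions for illustration. The types are package-private here so the sketch fits in one file.

```java
// Hypothetical reconstruction: shapes inferred from the call sites in the limiter listings.
interface RateLimiter {
  /** Throws when the call identified by service and operation exceeds its limit. */
  void check(String serviceName, String operationName) throws RateLimitException;
}

/** Assumed base exception carrying a suggested wait time before retrying. */
class RateLimitException extends Exception {
  private final long retryAfterMillis;

  RateLimitException(String message, long retryAfterMillis) {
    super(message);
    this.retryAfterMillis = retryAfterMillis;
  }

  long getRetryAfterMillis() { // assumed accessor name
    return retryAfterMillis;
  }
}

/** Assumed subtype that records which service and operation were throttled. */
class ThrottlingException extends RateLimitException {
  private final String serviceName;
  private final String operationName;

  ThrottlingException(String serviceName, String operationName, long retryAfterMillis) {
    super("Throttled: " + serviceName + ":" + operationName, retryAfterMillis);
    this.serviceName = serviceName;
    this.operationName = operationName;
  }

  String getServiceName() {
    return serviceName;
  }

  String getOperationName() {
    return operationName;
  }
}
```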

#### 1. Token Bucket Rate Limiter

The token bucket allows short bursts up to the bucket capacity, followed by a steady rate set by the refill. Tokens are added periodically, and a request is allowed only if a token is available.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TokenBucketRateLimiter implements RateLimiter {
  private final int capacity;   // maximum burst size per key
  private final int refillRate; // tokens added to each bucket per second
  private final ConcurrentHashMap<String, TokenBucket> buckets = new ConcurrentHashMap<>();
  // Daemon thread so the refill task does not keep the JVM alive on its own.
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "token-bucket-refill");
        t.setDaemon(true);
        return t;
      });

  public TokenBucketRateLimiter(int capacity, int refillRate) {
    this.capacity = capacity;
    this.refillRate = refillRate;
    // Refill every bucket once per second.
    scheduler.scheduleAtFixedRate(this::refillBuckets, 1, 1, TimeUnit.SECONDS);
  }

  @Override
  public void check(String serviceName, String operationName) throws RateLimitException {
    // One bucket per service/operation pair, created full on first use.
    String key = serviceName + ":" + operationName;
    TokenBucket bucket = buckets.computeIfAbsent(key, k -> new TokenBucket(capacity));
    if (!bucket.tryConsume()) {
      // No token left: reject the call (1000 equals the refill interval in milliseconds).
      throw new ThrottlingException(serviceName, operationName, 1000);
    }
  }

  private void refillBuckets() {
    buckets.forEach((k, b) -> b.refill(refillRate));
  }

  private static class TokenBucket {
    private final int capacity;
    private final AtomicInteger tokens;

    TokenBucket(int capacity) {
      this.capacity = capacity;
      this.tokens = new AtomicInteger(capacity);
    }

    // Lock-free consume: retry the compare-and-set until it succeeds or the bucket is empty.
    boolean tryConsume() {
      while (true) {
        int current = tokens.get();
        if (current <= 0) {
          return false;
        }
        if (tokens.compareAndSet(current, current - 1)) {
          return true;
        }
      }
    }

    // Add tokens without exceeding the bucket capacity.
    void refill(int amount) {
      tokens.getAndUpdate(current -> Math.min(current + amount, capacity));
    }
  }
}
```
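The token bucket listing refills every bucket from a background thread once per second. A common alternative computes the refill lazily from the elapsed time on each call, which avoids the scheduler altogether. The sketch below illustrates that variant for a single bucket. `LazyTokenBucket`, its constructor parameters, and the injected `LongSupplier` clock are hypothetical names chosen for illustration and are not part of the example project.

```java
import java.util.function.LongSupplier;

/** Hypothetical scheduler-free token bucket: tokens accrue from elapsed time on each call. */
class LazyTokenBucket {
  private final long capacity;
  private final long refillPerSecond;
  private final LongSupplier clockMillis; // injected so callers and tests can control time
  private long tokens;
  private long lastRefillMillis;

  LazyTokenBucket(long capacity, long refillPerSecond, LongSupplier clockMillis) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.clockMillis = clockMillis;
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMillis = clockMillis.getAsLong();
  }

  synchronized boolean tryConsume() {
    long now = clockMillis.getAsLong();
    long wholeSeconds = (now - lastRefillMillis) / 1000;
    if (wholeSeconds > 0) {
      // Credit the tokens earned since the last refill, capped at capacity.
      tokens = Math.min(capacity, tokens + wholeSeconds * refillPerSecond);
      // Advance by whole seconds only, so the fractional remainder is not lost.
      lastRefillMillis += wholeSeconds * 1000;
    }
    if (tokens == 0) {
      return false;
    }
    tokens--;
    return true;
  }
}
```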

---

#### 2. Fixed Window Rate Limiter

This strategy counts requests within a fixed time window and rejects any request beyond the limit until the window elapses and the counter resets.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

public class FixedWindowRateLimiter implements RateLimiter {
  private final int limit;         // maximum requests per window, per key
  private final long windowMillis; // window length in milliseconds
  private final ConcurrentHashMap<String, WindowCounter> counters = new ConcurrentHashMap<>();

  public FixedWindowRateLimiter(int limit, long windowSeconds) {
    this.limit = limit;
    this.windowMillis = TimeUnit.SECONDS.toMillis(windowSeconds);
  }

  // No method-level lock: each WindowCounter synchronizes itself,
  // so calls for different keys do not block each other.
  @Override
  public void check(String serviceName, String operationName) throws RateLimitException {
    String key = serviceName + ":" + operationName;
    WindowCounter counter = counters.computeIfAbsent(key, k -> new WindowCounter());

    if (!counter.tryIncrement()) {
      throw new RateLimitException("Rate limit exceeded for " + key, windowMillis);
    }
  }

  private class WindowCounter {
    // Both fields are guarded by this counter's monitor.
    private int count = 0;
    private long windowStart = System.currentTimeMillis();

    synchronized boolean tryIncrement() {
      long now = System.currentTimeMillis();
      if (now - windowStart >= windowMillis) {
        // The window has elapsed: start a new one.
        count = 0;
        windowStart = now;
      }
      if (count >= limit) {
        return false; // do not count rejected requests
      }
      count++;
      return true;
    }
  }
}
```
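One consequence of resetting the counter at window boundaries is that a client can get up to twice the limit admitted in a short span that straddles two windows. The sketch below makes this visible with a controllable clock. `FixedWindowCounter`, `BoundaryBurstDemo`, and the `LongSupplier` clock are hypothetical stand-ins written for this illustration, covering a single key only.

```java
import java.util.function.LongSupplier;

/** Hypothetical single-key fixed-window counter with an injectable clock. */
class FixedWindowCounter {
  private final int limit;
  private final long windowMillis;
  private final LongSupplier clockMillis;
  private int count;
  private long windowStart;

  FixedWindowCounter(int limit, long windowMillis, LongSupplier clockMillis) {
    this.limit = limit;
    this.windowMillis = windowMillis;
    this.clockMillis = clockMillis;
    this.windowStart = clockMillis.getAsLong();
  }

  synchronized boolean tryAcquire() {
    long now = clockMillis.getAsLong();
    if (now - windowStart >= windowMillis) {
      count = 0; // window elapsed: reset the counter
      windowStart = now;
    }
    if (count >= limit) {
      return false;
    }
    count++;
    return true;
  }
}

class BoundaryBurstDemo {
  /** Returns how many of the given number of attempts are admitted at the current clock time. */
  static int admitted(FixedWindowCounter counter, int attempts) {
    int ok = 0;
    for (int i = 0; i < attempts; i++) {
      if (counter.tryAcquire()) {
        ok++;
      }
    }
    return ok;
  }

  public static void main(String[] args) {
    long[] now = {0};
    FixedWindowCounter counter = new FixedWindowCounter(3, 1000, () -> now[0]);

    now[0] = 900; // late in the first window
    int lateInWindow = admitted(counter, 5);
    now[0] = 1000; // the next window has just started
    int earlyInNext = admitted(counter, 5);

    // With a limit of 3 per second, 6 requests pass within 100 ms.
    System.out.println((lateInWindow + earlyInNext) + " requests admitted within 100 ms");
  }
}
```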

---

#### 3. Adaptive Rate Limiter

This version adjusts a shared limit based on observed throttling: it halves the limit (never below the initial value) whenever a request is throttled, and raises it every 10 seconds up to a maximum. In this listing the adjusted limit applies to per-key limiters created after the change; a limiter that already exists keeps the capacity it was created with.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AdaptiveRateLimiter implements RateLimiter {
  private final int initialLimit; // starting limit and floor for decreases
  private final int maxLimit;     // ceiling for periodic increases
  private final AtomicInteger currentLimit;
  private final ConcurrentHashMap<String, RateLimiter> limiters = new ConcurrentHashMap<>();
  private final ScheduledExecutorService healthChecker = Executors.newScheduledThreadPool(1);

  public AdaptiveRateLimiter(int initialLimit, int maxLimit) {
    this.initialLimit = initialLimit;
    this.maxLimit = maxLimit;
    this.currentLimit = new AtomicInteger(initialLimit);
    // Raise the limit every 10 seconds.
    healthChecker.scheduleAtFixedRate(this::adjustLimits, 10, 10, TimeUnit.SECONDS);
  }

  @Override
  public void check(String serviceName, String operationName) throws RateLimitException {
    String key = serviceName + ":" + operationName;
    int current = currentLimit.get();
    // The delegate is created once per key with the limit in effect at first use;
    // later changes to currentLimit do not resize an existing delegate.
    RateLimiter limiter = limiters.computeIfAbsent(key, k -> new TokenBucketRateLimiter(current, current));

    try {
      limiter.check(serviceName, operationName);
    } catch (RateLimitException e) {
      // Throttled: halve the shared limit, but never go below the initial limit.
      currentLimit.updateAndGet(curr -> Math.max(initialLimit, curr / 2));
      throw e;
    }
  }

  // Periodic recovery: add half the initial limit, capped at maxLimit.
  private void adjustLimits() {
    currentLimit.updateAndGet(curr -> Math.min(maxLimit, curr + (initialLimit / 2)));
  }
}
```
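The limit arithmetic in the adaptive listing is easier to follow in isolation: halve on throttling with a floor at the initial limit, and add half the initial limit on each recovery tick with a cap at the maximum. The sketch below traces it for an initial limit of 2 and a maximum of 6. `AdaptiveLimit`, `onThrottled`, and `onRecoveryTick` are hypothetical names for this illustration. Note that with integer division an initial limit of 1 gives a recovery step of 0, so the limit would never grow.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper isolating the limit arithmetic of the adaptive listing. */
class AdaptiveLimit {
  private final int initialLimit;
  private final int maxLimit;
  private final AtomicInteger current;

  AdaptiveLimit(int initialLimit, int maxLimit) {
    this.initialLimit = initialLimit;
    this.maxLimit = maxLimit;
    this.current = new AtomicInteger(initialLimit);
  }

  /** Multiplicative decrease: halve on throttling, never below the initial limit. */
  int onThrottled() {
    return current.updateAndGet(c -> Math.max(initialLimit, c / 2));
  }

  /** Additive increase: grow by half the initial limit, never above the maximum. */
  int onRecoveryTick() {
    return current.updateAndGet(c -> Math.min(maxLimit, c + initialLimit / 2));
  }

  int get() {
    return current.get();
  }
}

class AdaptiveLimitDemo {
  public static void main(String[] args) {
    AdaptiveLimit limit = new AdaptiveLimit(2, 6);
    StringBuilder trace = new StringBuilder();
    for (int i = 0; i < 5; i++) {
      trace.append(limit.onRecoveryTick()).append(' ');
    }
    System.out.println(trace.toString().trim()); // prints "3 4 5 6 6"
    System.out.println(limit.onThrottled());     // prints 3
    System.out.println(limit.onThrottled());     // prints 2 (floored at the initial limit)
  }
}
```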

---

#### 4. Simulated Demo Using All Limiters

```java
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class App {
  public static void main(String[] args) {
    TokenBucketRateLimiter tb = new TokenBucketRateLimiter(2, 1); // burst of 2, refill 1 token/s
    FixedWindowRateLimiter fw = new FixedWindowRateLimiter(3, 1); // 3 requests per 1 s window
    AdaptiveRateLimiter ar = new AdaptiveRateLimiter(2, 6);       // starts at 2, grows up to 6

    // Three client threads that run until the process is stopped.
    ExecutorService executor = Executors.newFixedThreadPool(3);
    for (int i = 1; i <= 3; i++) {
      executor.submit(createClientTask(i, tb, fw, ar));
    }
  }

  private static Runnable createClientTask(int clientId, RateLimiter tb, RateLimiter fw, RateLimiter ar) {
    return () -> {
      String[] services = {"s3", "dynamodb", "lambda"};
      String[] operations = {"GetObject", "PutObject", "Query", "Scan", "PutItem", "Invoke", "ListFunctions"};
      Random random = new Random();

      // Stop once the thread is interrupted instead of spinning on a failed sleep.
      while (!Thread.currentThread().isInterrupted()) {
        String service = services[random.nextInt(services.length)];
        String operation = operations[random.nextInt(operations.length)];
        try {
          // Each service is guarded by a different limiter strategy.
          switch (service) {
            case "s3" -> tb.check(service, operation);
            case "dynamodb" -> fw.check(service, operation);
            case "lambda" -> ar.check(service, operation);
          }
          System.out.printf("Client %d: %s.%s - ALLOWED%n", clientId, service, operation);
        } catch (RateLimitException e) {
          System.out.printf("Client %d: %s.%s - THROTTLED%n", clientId, service, operation);
        }

        try {
          // Pause 30 to 79 ms between calls.
          Thread.sleep(30 + random.nextInt(50));
        } catch (InterruptedException e) {
          // Restore the flag so the loop condition ends the task.
          Thread.currentThread().interrupt();
        }
      }
    };
  }
}
```

---

This example shows how the Rate Limiter pattern accommodates several throttling strategies and how each one responds under simulated traffic.

## When to Use Rate Limiting

- APIs receiving unpredictable traffic
- Shared cloud resources (e.g., DB, compute)
- Services requiring fair client usage
- Preventing DoS or abuse

---

## Real-World Applications

- **AWS API Gateway**
- **Google Cloud Functions**
- **Netflix Zuul API Gateway**
- **Stripe API Throttling**

---

## Benefits and Trade-offs

### Benefits

- Protects backend from overload
- Fair distribution of resources
- Better user experience under load

### Trade-offs

- May delay or reject valid requests
- Requires tuning of limits
- Could create bottlenecks if misused

---

## Related Java Design Patterns

- [Circuit Breaker](https://java-design-patterns.com/patterns/circuit-breaker/)
- [Retry](https://java-design-patterns.com/patterns/retry/)
- [Throttling Queue](https://java-design-patterns.com/patterns/throttling/)

---

## References and Credits

- [Microsoft Cloud Design Patterns](https://learn.microsoft.com/en-us/azure/architecture/patterns/throttling)
- [AWS API Gateway Throttling](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html)
- *Designing Data-Intensive Applications* by Martin Kleppmann
- [Resilience4j](https://resilience4j.readme.io/)
- Java Design Patterns Project: [java-design-patterns](https://github.com/iluwatar/java-design-patterns)