Rate Limiting using Spring Boot Resilience4j

Niraj Kumar
5 min read · Dec 16, 2024


What is a Rate Limiter?

A Rate Limiter is a mechanism used to control the number of requests or operations that are allowed to occur within a specified period. It is commonly used in software systems to ensure fair usage of resources, protect services from being overwhelmed, and maintain system stability.

Key Concepts of Rate Limiting

  1. Request Limit:
  • Defines how many requests are allowed in a given time window (e.g., 10 requests per second).

  2. Time Window:
  • The time interval during which the rate limit applies (e.g., one second, one minute).

  3. Exceeding the Limit:
  • If the number of requests exceeds the limit, further requests may be:
  • Rejected (returning an error such as HTTP 429 Too Many Requests).
  • Queued for later processing.
  • Delayed until the limit resets.

Why Use Rate Limiting?

  1. Prevent Overload:
  • Protects a service or system from being overwhelmed by excessive traffic.

  2. Fair Resource Usage:
  • Ensures fair distribution of resources among users or services.

  3. API Usage Control:
  • Enforces quotas for public or third-party APIs (e.g., a free tier allows 100 requests per hour).

  4. Prevent Abuse:
  • Stops malicious activities like brute-force attacks or denial-of-service (DoS) attacks.

  5. Traffic Shaping:
  • Smooths out traffic bursts by regulating request flow.

Where is Rate Limiting Used?

  1. APIs:
  • Public APIs often use rate limiting to control how clients interact with the API (e.g., 1000 requests/day per user).

  2. Web Applications:
  • Protect backend systems by limiting user requests (e.g., login attempts or form submissions).

  3. Distributed Systems:
  • Manage load across microservices by regulating requests between services.

  4. Messaging Systems:
  • Control the number of messages processed to avoid overwhelming consumers.

How Does Rate Limiting Work?

Common Algorithms for Rate Limiting

  1. Token Bucket:
  • A bucket is filled with tokens at a steady rate.
  • Each request removes a token from the bucket.
  • If the bucket is empty, the request is rejected or delayed (see the sketch after this list).

  2. Leaky Bucket:
  • Requests are added to a queue (the “bucket”).
  • The queue processes requests at a fixed rate.
  • Excess requests are discarded if the bucket overflows.

  3. Fixed Window:
  • Limits requests based on a fixed time window (e.g., 10 requests per second).
  • Simple but prone to burst traffic at window edges.

  4. Sliding Window:
  • Tracks requests over a rolling time window (e.g., the last 60 seconds).
  • More precise and smoother than fixed windows.

  5. Sliding Log:
  • Logs each request’s timestamp and checks whether the number of requests within the window exceeds the limit.
  • Accurate but can be memory-intensive.
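
To make the token-bucket idea concrete, here is a minimal, illustrative sketch (not Resilience4j's internal implementation; the class name, fields, and values are made up for this example). A fixed-capacity bucket is refilled at a steady rate, and each incoming request must consume one token to proceed.

// Minimal token-bucket sketch for illustration only (not the Resilience4j implementation).
public class TokenBucket {

    private final long capacity;          // maximum tokens the bucket can hold
    private final double refillPerNano;   // tokens added per nanosecond
    private double tokens;                // current token count
    private long lastRefillNanos;         // timestamp of the last refill

    public TokenBucket(long capacity, long refillTokensPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = refillTokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    // Returns true if a token was available and the request may proceed.
    public synchronized boolean tryConsume() {
        refill();
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false; // bucket empty: reject or delay the request
    }

    private void refill() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
    }
}

Resilience4j's rate limiter exposes a comparable permissions-per-period model through limitForPeriod and limitRefreshPeriod, covered later in this article.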

Implementing Rate Limiting

Client-Side Rate Limiting:

  • Controls the frequency of requests sent from the client.
  • Example: JavaScript libraries implementing debouncing or throttling.

Server-Side Rate Limiting:

  • Applied on the server to manage incoming requests.
  • Example: Using tools or libraries like Resilience4j, Redis, or API gateways (e.g., NGINX, AWS API Gateway).
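
As a rough illustration of server-side enforcement, the sketch below uses a naive fixed-window counter keyed by client IP inside a Spring filter and rejects excess requests with HTTP 429 (the class name and the 60-requests-per-minute limit are example choices, not a production recipe; counters are in-memory only and not shared across instances). A real setup would typically rely on Resilience4j, Redis, or a gateway instead.

import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Naive fixed-window limiter: at most LIMIT requests per client IP per minute.
@Component
public class SimpleRateLimitFilter extends OncePerRequestFilter {

    private static final int LIMIT = 60;
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();
    private volatile long windowStart = System.currentTimeMillis();

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        long now = System.currentTimeMillis();
        if (now - windowStart > 60_000) {   // start a new one-minute window
            counters.clear();
            windowStart = now;
        }
        int count = counters
                .computeIfAbsent(request.getRemoteAddr(), ip -> new AtomicInteger())
                .incrementAndGet();
        if (count > LIMIT) {
            response.setStatus(429);        // Too Many Requests
            response.setHeader("Retry-After", "60");
            return;
        }
        chain.doFilter(request, response);
    }
}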

Examples

Rate Limiting in Web APIs

If an API allows only 10 requests per minute, the 11th request will:

  1. Return an HTTP 429 (Too Many Requests) error.
  2. Include a Retry-After header indicating when the client can try again.

Example Response:

HTTP/1.1 429 Too Many Requests
Retry-After: 60
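
On the client side, a well-behaved consumer can honor the Retry-After header before retrying. Below is a small, hypothetical sketch using the JDK's built-in HttpClient (the single-retry policy and 60-second default are just example choices):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class RetryAfterClient {

    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Calls the given URL once; if the server answers 429, waits for the
    // Retry-After interval (defaulting to 60 seconds) and tries one more time.
    public static String callWithRetry(String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() == 429) {
            long waitSeconds = response.headers()
                    .firstValue("Retry-After")
                    .map(Long::parseLong)
                    .orElse(60L);
            Thread.sleep(Duration.ofSeconds(waitSeconds).toMillis());
            response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        }
        return response.body();
    }
}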

Benefits of Rate Limiting

  1. Improves System Reliability:
  • Avoids crashes caused by excessive traffic.

  2. Enhances User Experience:
  • Prevents degraded performance during traffic spikes.

  3. Prevents Misuse:
  • Stops malicious actors from overusing resources.

  4. Encourages Efficient Usage:
  • Incentivizes users to stay within allocated quotas.

Tools and Libraries for Rate Limiting

  1. For Java:
  • Resilience4j
  • Bucket4j

  2. For Python:
  • Flask-Limiter
  • Django Ratelimit

  3. API Gateways:
  • AWS API Gateway
  • Kong
  • NGINX

Rate limiting is a crucial tool in modern systems to protect resources, ensure stability, and maintain consistent performance under varying loads.

Resilience4j Rate Limiter

Resilience4j Rate Limiter is a module in the Resilience4j library that helps implement rate-limiting functionality in Spring Boot applications. It restricts the number of calls made to a service or method within a specified time interval, protecting downstream systems from overloads and ensuring fair resource allocation.

Here’s a comprehensive guide to integrating and using Resilience4j Rate Limiter in a Spring Boot application:

1. Adding Dependencies

Add the required Resilience4j Spring Boot Starter dependency in your pom.xml for Maven projects.

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot3</artifactId>
    <version>2.x.x</version> <!-- Use the latest version -->
</dependency>

For Gradle:

implementation 'io.github.resilience4j:resilience4j-spring-boot3:2.x.x'
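
Note that the annotation support described below is implemented with Spring AOP, so the application also needs spring-boot-starter-aop on the classpath if it is not already pulled in by another starter:

implementation 'org.springframework.boot:spring-boot-starter-aop'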

2. Configuration

Resilience4j allows configuration of rate limits in the application.yml or application.properties file.

Example Configuration in application.yml:

resilience4j:
  ratelimiter:
    instances:
      myService:
        limitForPeriod: 10         # Number of calls allowed in a time period
        limitRefreshPeriod: 1s     # Time period for resetting the limit
        timeoutDuration: 500ms     # Maximum wait time if the rate limit is exceeded

  • limitForPeriod: Maximum number of calls allowed in a time window.
  • limitRefreshPeriod: Duration after which the limit resets.
  • timeoutDuration: Maximum time a thread waits for a permission before a RequestNotPermitted exception is thrown.
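
If you prefer to configure these values in code rather than YAML, the same parameters can be set through RateLimiterConfig. A sketch (the values mirror the YAML above; wiring the registry as a Spring bean is left out):

import io.github.resilience4j.ratelimiter.RateLimiterConfig;
import io.github.resilience4j.ratelimiter.RateLimiterRegistry;

import java.time.Duration;

public class RateLimiterSetup {

    // Programmatic equivalent of the YAML configuration above.
    public static RateLimiterRegistry buildRegistry() {
        RateLimiterConfig config = RateLimiterConfig.custom()
                .limitForPeriod(10)                         // calls allowed per period
                .limitRefreshPeriod(Duration.ofSeconds(1))  // period length
                .timeoutDuration(Duration.ofMillis(500))    // max wait for a permit
                .build();
        return RateLimiterRegistry.of(config);
    }
}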

3. Using Rate Limiter

Resilience4j Rate Limiter can be applied programmatically or declaratively.

a. Declarative with @RateLimiter

Use the @RateLimiter annotation to apply rate-limiting to specific methods.

import io.github.resilience4j.ratelimiter.annotation.RateLimiter;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api")
public class MyController {

    @GetMapping("/data")
    @RateLimiter(name = "myService", fallbackMethod = "rateLimiterFallback")
    public String getData() {
        return "Rate Limiter Success!";
    }

    // Invoked when the rate limit is exceeded.
    public String rateLimiterFallback(Throwable throwable) {
        return "Rate limit exceeded. Please try again later.";
    }
}
  • The name parameter refers to the configured rate limiter instance in application.yml.
  • The fallbackMethod handles requests that exceed the rate limit.

b. Programmatic Usage

Use the RateLimiter API to apply rate-limiting logic manually.

import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterRegistry;
import org.springframework.stereotype.Service;

@Service
public class MyService {

    private final RateLimiter rateLimiter;

    public MyService(RateLimiterRegistry rateLimiterRegistry) {
        this.rateLimiter = rateLimiterRegistry.rateLimiter("myService");
    }

    // Wraps the actual call so it only runs when a permission is available.
    public String processRequest() {
        return RateLimiter.decorateSupplier(rateLimiter, this::actualMethod).get();
    }

    private String actualMethod() {
        return "Rate Limiter Success!";
    }
}

4. Monitoring and Metrics

Resilience4j integrates seamlessly with Micrometer for monitoring and metrics.

Expose the metrics endpoint in application.yml:

management:
  endpoints:
    web:
      exposure:
        include: metrics
  metrics:
    enable:
      resilience4j: true

Access Metrics

  1. Metrics are exposed via Spring Boot Actuator under /actuator/metrics/ (look for metric names starting with resilience4j.ratelimiter).
  2. For each rate limiter instance, the metrics report:
  • The number of currently available permissions.
  • The number of threads waiting for a permission.
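
You can also read a limiter's live state directly from its built-in metrics object, which is useful for logging or custom health indicators. A small sketch:

import io.github.resilience4j.ratelimiter.RateLimiter;

public class RateLimiterInspector {

    // Logs the live metrics Resilience4j keeps for a rate limiter instance.
    public static void logState(RateLimiter rateLimiter) {
        RateLimiter.Metrics metrics = rateLimiter.getMetrics();
        System.out.println("Available permissions: " + metrics.getAvailablePermissions());
        System.out.println("Threads waiting for a permission: " + metrics.getNumberOfWaitingThreads());
    }
}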

5. Advanced Configurations

Custom Fallback Methods

Fallback methods can accept:

  1. The same arguments as the original method.
  2. An additional Throwable argument for error details.

Example:

@RateLimiter(name = "myService", fallbackMethod = "customFallback")
public String myMethod(String input) {
    return "Processed: " + input;
}

// Same parameters as myMethod, plus the Throwable that triggered the fallback.
public String customFallback(String input, Throwable throwable) {
    return "Rate limit exceeded for input: " + input;
}

Multiple Rate Limiters

Define multiple instances in application.yml for different services or endpoints.

resilience4j:
  ratelimiter:
    instances:
      serviceA:
        limitForPeriod: 5
        limitRefreshPeriod: 2s
        timeoutDuration: 1s
      serviceB:
        limitForPeriod: 20
        limitRefreshPeriod: 1s
        timeoutDuration: 500ms

6. Testing Rate Limiter

You can test the behavior of your rate limiter using tools like Postman or JMeter. Ensure that:

  1. Requests exceeding the configured limits return an appropriate fallback response.
  2. The rate limiter resets after the limitRefreshPeriod.
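
One way to check both points is an integration test that fires more calls than limitForPeriod allows within a single refresh period and asserts that some of them receive the fallback response. A hedged sketch using Spring's TestRestTemplate (the endpoint path, fallback text, and limit of 10 assume the examples shown earlier):

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.web.client.TestRestTemplate;

import static org.junit.jupiter.api.Assertions.assertTrue;

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
class RateLimiterIntegrationTest {

    @Autowired
    private TestRestTemplate restTemplate;

    @Test
    void requestsBeyondTheLimitFallBack() {
        int fallbackResponses = 0;
        // Fire more calls than limitForPeriod (10) within one refresh period.
        for (int i = 0; i < 15; i++) {
            String body = restTemplate.getForObject("/api/data", String.class);
            if (body != null && body.startsWith("Rate limit exceeded")) {
                fallbackResponses++;
            }
        }
        assertTrue(fallbackResponses > 0, "expected some calls to be rate limited");
    }
}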

7. Common Exceptions

  1. RequestNotPermitted exception:
  • Thrown when a request exceeds the configured rate limit and the timeout period elapses (a sample handler follows this list).

  2. Troubleshooting:
  • Ensure the rate limiter instance name matches the configuration.
  • Validate the limitRefreshPeriod and timeoutDuration settings.
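
If you use the programmatic API, or skip the fallback method, RequestNotPermitted propagates to the caller. One common pattern is to translate it into an HTTP 429 response with a controller advice, for example:

import io.github.resilience4j.ratelimiter.RequestNotPermitted;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

// Translates Resilience4j's RequestNotPermitted into an HTTP 429 response.
@RestControllerAdvice
public class RateLimitExceptionHandler {

    @ExceptionHandler(RequestNotPermitted.class)
    public ResponseEntity<String> handleRateLimit(RequestNotPermitted ex) {
        return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS)
                .header("Retry-After", "1") // matches the 1s limitRefreshPeriod configured earlier
                .body("Rate limit exceeded. Please try again later.");
    }
}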

8. Example Workflow

  1. Configure rate limit rules in application.yml.
  2. Annotate specific methods with @RateLimiter or use the programmatic API.
  3. Monitor and debug using Actuator endpoints or logs.
  4. Handle exceptions gracefully with fallback methods.

Key Benefits

  • Protects downstream services from being overwhelmed.
  • Improves system stability under high load.
  • Fully customizable and lightweight.

By implementing Resilience4j Rate Limiter in Spring Boot, you can ensure your application handles traffic efficiently while providing a better user experience during peak loads.
