Implementing Rate Limiting and Throttling in Spring Boot APIs with Bucket4j and Redis

Aman Jha

Let me be upfront: if your API doesn’t have proper rate limiting, it’s not a matter of if something goes wrong, it’s a matter of when.

A while back, one of my projects had one of those moments where everything started falling apart. One of our internal services went rogue: not intentionally, but because of a simple logic bug, it ended up calling a single endpoint in a tight loop. In under a minute, the entire service became sluggish, the database connection pool maxed out, and alerts started pouring in. That’s when it hit me: this wasn’t a security breach or a traffic spike from outside. It was our own system misbehaving.

So I set out to fix this. In this post, I’ll share exactly how I implemented rate limiting and throttling in my Spring Boot APIs using Bucket4j and Redis, based on real-world lessons rather than just reading the docs.

The Problem: Too Many Requests, Too Little Defence

My APIs were being accessed in a range of ways — public clients, mobile users, and internal microservices. Some were well-behaved, others… not so much. And at that point, there were zero restrictions on how many requests someone could make.

Here’s what I personally ran into:

A public client started sending over 200 requests per second.
A developer accidentally hit “Send” repeatedly in Postman.
A microservice began retrying calls too aggressively.
A staging script made its way to production and overwhelmed it.

I was, quite literally, flooding myself. That’s when I knew I needed a solid, flexible, and distributed rate-limiting mechanism.

What is Rate Limiting and Throttling?

Rate limiting is the process of controlling how many requests a client can make within a specific period. Think of it as your API saying, “You’ve made enough requests for now; come back in a minute.”

Throttling is what happens when a client hits that limit – the API says, “Too many requests”, and the request gets rejected with an appropriate response (usually HTTP 429).

Rate limiting protects backend systems from traffic spikes and abuse. It keeps services reliable, reduces downtime risk, and helps manage infrastructure costs.

Why I Chose Bucket4j?

I explored several options: Spring Cloud Gateway, Guava’s RateLimiter, Resilience4j, and even writing our own filters.

But Bucket4j stood out. Here’s why:

It uses the Token Bucket algorithm, which works really well for burst traffic
The library is lightweight and doesn’t drag along heavy dependencies
Redis support is built-in, so it’s perfect for multi-instance deployment
It integrates smoothly into Spring Boot via filters, AOP, or straight service calls

Most importantly, Bucket4j supports shared rate limits across instances via Redis — a must for cloud-native setups.
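To make the Token Bucket idea concrete, here is a minimal, framework-free sketch in plain Java. It is not Bucket4j’s implementation, and the class name `TokenBucketSketch` is made up for illustration. The bucket starts full, each request takes one token, and tokens come back at a steady rate, which is why short bursts are tolerated while the average rate stays bounded. The clock is passed in as a parameter purely so the behaviour is easy to test.

```java
import java.time.Duration;

/** Illustrative token bucket (not Bucket4j's code): up to `capacity` tokens, refilled gradually. */
final class TokenBucketSketch {
    private final long capacity;
    private final long nanosPerToken;
    private long tokens;
    private long lastRefillNanos;

    /** Allows `capacity` requests per `period`, handing tokens back one at a time. */
    TokenBucketSketch(long capacity, Duration period, long nowNanos) {
        this.capacity = capacity;
        this.nanosPerToken = period.toNanos() / capacity;
        this.tokens = capacity;            // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    /** Takes one token if available; returns false when the caller should be throttled. */
    synchronized boolean tryConsume(long nowNanos) {
        refill(nowNanos);
        if (tokens == 0) {
            return false;
        }
        tokens--;
        return true;
    }

    /** Adds the tokens earned since the last refill, never exceeding the capacity. */
    private void refill(long nowNanos) {
        long newTokens = (nowNanos - lastRefillNanos) / nanosPerToken;
        if (newTokens > 0) {
            tokens = Math.min(capacity, tokens + newTokens);
            // Advance only by whole tokens so leftover time counts toward the next one.
            lastRefillNanos += newTokens * nanosPerToken;
        }
    }

    public static void main(String[] args) {
        long now = System.nanoTime();
        TokenBucketSketch bucket = new TokenBucketSketch(3, Duration.ofSeconds(1), now);
        for (int i = 1; i <= 4; i++) {
            System.out.println("request " + i + " allowed: " + bucket.tryConsume(now));
        }
    }
}
```

In the real application Bucket4j handles this bookkeeping; the sketch only shows why a full bucket absorbs a burst and an empty one rejects requests until time has passed.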

Real Scenarios We Had to Handle

I didn’t want a one-size-fits-all solution. So, I mapped out our different consumers and designed rules accordingly:

Public APIs: Free tier users were allowed 100 requests per minute, while paid tier users had up to 1000.
Internal microservices: Allowed burst traffic but controlled average load over time.
Mobile clients: Rate limited based on IP address to prevent abuse from shared networks.
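One way to express these rules in code is a small lookup from consumer type to limit policy, sketched below in plain Java. The type names (`ConsumerType`, `LimitPolicy`, `LimitPolicies`) are hypothetical. The free and paid numbers come from the list above; the internal-service and mobile numbers are placeholders I am assuming for illustration, since the real values depend on your traffic.

```java
import java.time.Duration;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical consumer categories matching the scenarios above. */
enum ConsumerType { PUBLIC_FREE, PUBLIC_PAID, INTERNAL_SERVICE, MOBILE }

/** Hypothetical policy: how large a burst may be, and how fast tokens come back. */
final class LimitPolicy {
    final long burstCapacity;
    final long refillTokens;
    final Duration refillPeriod;

    LimitPolicy(long burstCapacity, long refillTokens, Duration refillPeriod) {
        this.burstCapacity = burstCapacity;
        this.refillTokens = refillTokens;
        this.refillPeriod = refillPeriod;
    }
}

final class LimitPolicies {
    private static final Map<ConsumerType, LimitPolicy> POLICIES = new EnumMap<>(ConsumerType.class);

    static {
        // From the article: 100 requests per minute for free users, 1000 for paid users.
        POLICIES.put(ConsumerType.PUBLIC_FREE, new LimitPolicy(100, 100, Duration.ofMinutes(1)));
        POLICIES.put(ConsumerType.PUBLIC_PAID, new LimitPolicy(1000, 1000, Duration.ofMinutes(1)));
        // Assumed numbers: bursts of up to 200 calls, but only 50 per second sustained.
        POLICIES.put(ConsumerType.INTERNAL_SERVICE, new LimitPolicy(200, 50, Duration.ofSeconds(1)));
        // Assumed number: 60 requests per minute for each client IP.
        POLICIES.put(ConsumerType.MOBILE, new LimitPolicy(60, 60, Duration.ofMinutes(1)));
    }

    private LimitPolicies() {}

    static LimitPolicy policyFor(ConsumerType type) {
        return POLICIES.get(type);
    }

    /** Mobile clients are keyed by IP address; everyone else by API key or service name. */
    static String bucketKey(ConsumerType type, String clientId, String clientIp) {
        return type == ConsumerType.MOBILE ? "ip:" + clientIp : "client:" + clientId;
    }
}
```

Keeping the policy and the bucket key separate means the same limiter code can serve all three consumer groups; only the lookup changes.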

How I started (Implementation Steps):

Step 1: Dependencies

In my pom.xml file:

Required dependencies for rate limiting.

Step 2: Redis Setup

In my application.yml file:

Redis lets me share rate limit state between app instances, essential for horizontal scaling.

Step 3: Set up your Rate limiter Properties & Filter Config

Here are the rate limiter properties:

Rate limiter properties

After that, add your filter config:

Filter configuration that sets the filter order

After adding these configurations, set the request limit that suits your application in the application.yml file:

application.yml with the Redis host and the request limit configuration

Step 4: Add a Filter or Aspect for Limiting

Request filter that applies the rate limit.

This filter intercepts every incoming HTTP request and enforces rate limiting based on the client’s IP and HTTP method. If the allowed request quota is exceeded, it responds with a 429 status code and blocks further access temporarily.
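The decision such a filter makes can be sketched without Spring or Bucket4j. The plain-Java example below is an assumption-laden illustration, not the filter from this project. `RateLimitFilterSketch` and its methods are hypothetical names, and a simple fixed-window counter stands in for the Bucket4j bucket to keep the code short. Reading the client IP from X-Forwarded-For is also an assumption; only trust that header when the app sits behind a proxy you control.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Framework-free sketch of the per-request decision a rate limiting filter makes. */
final class RateLimitFilterSketch {
    /** Outcome the real filter would translate into an HTTP response. */
    static final class Decision {
        final int status;             // 200 = continue down the chain, 429 = reject
        final long retryAfterSeconds; // value for a Retry-After header, only set for 429

        Decision(int status, long retryAfterSeconds) {
            this.status = status;
            this.retryAfterSeconds = retryAfterSeconds;
        }
    }

    /** Fixed-window counter, used here as a simple stand-in for a token bucket. */
    private static final class Window {
        long startMillis;
        int count;

        Window(long startMillis) {
            this.startMillis = startMillis;
        }
    }

    private final int limitPerWindow;
    private final long windowMillis;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    RateLimitFilterSketch(int limitPerWindow, long windowMillis) {
        this.limitPerWindow = limitPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Uses the first X-Forwarded-For entry when present, otherwise the socket address. */
    static String clientIp(String forwardedFor, String remoteAddr) {
        if (forwardedFor != null) {
            int comma = forwardedFor.indexOf(',');
            String first = (comma >= 0 ? forwardedFor.substring(0, comma) : forwardedFor).trim();
            if (!first.isEmpty()) {
                return first;
            }
        }
        return remoteAddr;
    }

    /** One counter per "ip:method" key, so GET and POST from the same client are limited separately. */
    Decision check(String forwardedFor, String remoteAddr, String method, long nowMillis) {
        String key = clientIp(forwardedFor, remoteAddr) + ":" + method;
        Window window = windows.computeIfAbsent(key, k -> new Window(nowMillis));
        synchronized (window) {
            if (nowMillis - window.startMillis >= windowMillis) {
                window.startMillis = nowMillis; // the old window expired, start a fresh one
                window.count = 0;
            }
            if (window.count < limitPerWindow) {
                window.count++;
                return new Decision(200, 0);
            }
            long waitMillis = window.startMillis + windowMillis - nowMillis;
            return new Decision(429, (waitMillis + 999) / 1000); // round up to whole seconds
        }
    }
}
```

With a limit of 10 per 60,000 ms, the first ten GET calls from one IP return a 200 decision and the eleventh returns 429 with a Retry-After hint, while a POST from the same IP still passes because it uses a different key.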

Real Test Scenarios for Rate Limiter

Suppose the limit is 10 requests per minute, as in the application.yml configuration above. When you hit your endpoint rapidly with Postman, the first 10 requests within the minute succeed. From the 11th request onwards, you receive 429 Too Many Requests.

Testing after implementation of GET requests

Redis for Distributed Setup

Without Redis, rate limits would be per instance, so if you run 3 instances, each has its own limit. Redis lets you share the token bucket across all instances, ensuring global enforcement.
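The difference is easy to see in a small simulation. The plain-Java sketch below uses an in-memory map as a stand-in for Redis, so it involves no Bucket4j or Redis calls, and all names in it are hypothetical. It sends requests round-robin to three instances within a single limit window, once with per-instance state and once with shared state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Simulates app instances behind a load balancer, with and without shared limiter state. */
final class SharedStateSketch {
    /** One app instance; `counters` stands in for wherever the bucket state is stored. */
    static final class Instance {
        private final Map<String, AtomicInteger> counters;
        private final int limit;

        Instance(Map<String, AtomicInteger> counters, int limit) {
            this.counters = counters;
            this.limit = limit;
        }

        boolean allow(String clientKey) {
            int used = counters.computeIfAbsent(clientKey, k -> new AtomicInteger()).incrementAndGet();
            return used <= limit;
        }
    }

    /** Each instance keeps its own map, like in-memory buckets without Redis. */
    static Instance[] localInstances(int n, int limit) {
        Instance[] instances = new Instance[n];
        for (int i = 0; i < n; i++) {
            instances[i] = new Instance(new ConcurrentHashMap<>(), limit);
        }
        return instances;
    }

    /** All instances share one map, which plays the role of Redis here. */
    static Instance[] sharedInstances(int n, int limit) {
        Map<String, AtomicInteger> shared = new ConcurrentHashMap<>();
        Instance[] instances = new Instance[n];
        for (int i = 0; i < n; i++) {
            instances[i] = new Instance(shared, limit);
        }
        return instances;
    }

    /** Sends requests round-robin, as a load balancer might, and counts how many pass. */
    static int allowedCount(Instance[] instances, String clientKey, int requests) {
        int allowed = 0;
        for (int i = 0; i < requests; i++) {
            if (instances[i % instances.length].allow(clientKey)) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        System.out.println("local state:  " + allowedCount(localInstances(3, 10), "client-a", 60));
        System.out.println("shared state: " + allowedCount(sharedInstances(3, 10), "client-a", 60));
    }
}
```

With a limit of 10 and 60 requests, the per-instance setup lets 30 through (10 on each of the three instances), while the shared setup lets only 10 through.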

This was crucial for us in production. Without Redis, users could have bypassed the limits simply because the load balancer spread their requests across different instances.

Final Thoughts

Rate limiting is not just for public APIs; it’s a critical layer of protection in any system dealing with concurrency, scale, or user-generated traffic. Using Bucket4j with Redis in Spring Boot helped us build a resilient and fair API, ensuring that one bad actor doesn’t ruin it for everyone.

References

Source Code

https://www.baeldung.com/spring-bucket4
