Shamil

Software Engineer | Berlin, Germany


A while back, I built a tiny service designed to fetch DNS information from various geographic locations. The service was initially built with simplicity in mind, and admittedly, a bit of laziness. At launch, I hadn’t bothered implementing rate limiting, partly because I assumed the niche use case wouldn’t attract many heavy users and partly out of sheer procrastination.

And indeed, for some time, peak traffic hovered around ~4000 requests per minute. Recently, however, I began noticing unexpected spikes, sometimes as high as ~20000 requests within a few minutes. The service runs on a moderate general-purpose VM, and the sudden bursts of requests were pushing the system to its limits.

After combing through the logs, it became clear that these bursts originated from a single IP address. It turned out that one client was checking DNS records for about 17000 domains every hour. Without a batch processing API, they resorted to making individual requests for each domain, creating the burst. Had I provided a batch API capable of accepting multiple domains per request, say around 500 domains per API call, these 17000 domain checks would translate to only about 34 API calls per hour. This would have significantly smoothed out the traffic spikes, distributed load evenly, and alleviated the pressure on the application.

Implementing a proper rate-limiting strategy, tailored not just to IPs but to user subscription tiers, feels long overdue.

Architecture

Let’s briefly revisit the system architecture, which is quite basic.

The service comprises two stateless instances deployed behind an Nginx load balancer, each connected to Redis (in reality, it’s KeyDB).

While Nginx conveniently supports basic IP-based rate limiting out of the box, my goal is not quite this. I want granular control based on subscription plans, which would enable distinct request limits per user:

  • Free plan: 100 requests/minute.
  • Starter plan: 3000 requests/minute.
  • Higher-tier plans: scaled accordingly to give users their money’s worth.

Since these user-specific limits are subscription-based, determining them requires additional context, which naturally resides within our application instances. Given our stateless architecture, Redis provides the ideal central storage for maintaining request counters.

Algorithm Overview

There are various rate limiting algorithms out there, each with its own strengths and suited for specific problems. I opted for the Sliding Window Log algorithm. While there are plenty of excellent write-ups available online, I’d like to explain it the way I understand it.

In a nutshell, the algorithm works as follows:

  • We start by defining a window of interest for tracking requests. This window is defined by a specific time frame; in our case, it’s 1 minute.
  • For each user, a log of request timestamps is maintained.
  • When a new request arrives, any timestamps in the log that are outside the current window of interest are removed.
  • After clearing outdated timestamps, the new request’s timestamp is appended to the log.
  • We then check the size of the log. If the number of timestamps in the log exceeds the predefined limit (i.e., the maximum number of requests allowed in the time frame), the request is denied. A minimal sketch of this follows below.
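To make this concrete before we distribute it, here’s the algorithm as a minimal, single-process Go sketch. It’s purely illustrative (the type and names are mine, and it isn’t safe for concurrent use); the Redis-backed version the service actually uses follows in the next section.

package main

import (
	"fmt"
	"time"
)

// slidingWindowLog is an in-memory sliding window log for one user.
type slidingWindowLog struct {
	window     time.Duration
	limit      int
	timestamps []time.Time // request log, oldest first
}

// allow reports whether a request arriving at 'now' is within the limit,
// recording it if so.
func (s *slidingWindowLog) allow(now time.Time) bool {
	// 1. Drop timestamps that fall outside the current window.
	cutoff := now.Add(-s.window)
	i := 0
	for i < len(s.timestamps) && !s.timestamps[i].After(cutoff) {
		i++
	}
	s.timestamps = s.timestamps[i:]

	// 2. Deny if the log is already full; otherwise record and allow.
	if len(s.timestamps) >= s.limit {
		return false
	}
	s.timestamps = append(s.timestamps, now)
	return true
}

func main() {
	rl := &slidingWindowLog{window: time.Minute, limit: 3}
	now := time.Now()
	for i := 0; i < 5; i++ {
		fmt.Println(rl.allow(now)) // true, true, true, false, false
	}
}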

Atomic Implementation with Redis + Lua

In order to ensure our rate limiting works properly across all our app instances, we require atomic operations. If multiple requests from the same user arrive simultaneously on different instances, they must all correctly reflect the user’s usage against the limit. Otherwise, things get messy and inaccurate.

This is where I find Lua scripts in Redis to be extremely valuable. Sure, Redis transactions could do the job too, but in my experience, they can get complicated pretty fast. Lua scripts keep things neat and tidy, with the entire algorithm logic in one place.

Here’s the Lua script implementing the aforementioned logic:

local key = KEYS[1]                   -- unique sorted set key per user
local limit = tonumber(ARGV[1])       -- allowed number of requests within the window
local windowSize = tonumber(ARGV[2])  -- window size in milliseconds, e.g. 60000 (1 minute)
local currentTime = tonumber(ARGV[3]) -- current time as epoch milliseconds

-- Remove elements that have fallen outside the current window
redis.call('ZREMRANGEBYSCORE', key, '-inf', '(' .. (currentTime - windowSize))

-- Get the current count from the sorted set
local currentCount = redis.call('ZCARD', key)

if currentCount >= limit then
	-- Deny without recording the request; return the would-be count so the
	-- caller can tell that this request exceeds the limit
	return currentCount + 1
end

-- Build a unique member by appending an atomic counter to the timestamp,
-- so that requests arriving in the same millisecond don't collide
local member = currentTime .. '-' .. redis.call('INCR', key .. ':counter')

redis.call('ZADD', key, currentTime, member)

-- Expire both keys after one window so inactive users don't leak memory
redis.call('PEXPIRE', key, windowSize)
redis.call('PEXPIRE', key .. ':counter', windowSize)

return currentCount + 1

A few insights into this script:

  • Passing currentTime externally (rather than fetching from Redis) simplifies testing and debugging.
  • The early check if currentCount >= limit ensures that only allowed requests are recorded in the log; denied attempts don’t consume the user’s quota. The script still returns the would-be count (currentCount + 1) so the application can tell that the limit was exceeded.
  • Each request is added to a Redis sorted set using ZADD, with the request’s timestamp as the score. Ordering by timestamp is what lets us drop entries that fall outside the sliding window.
  • Sorted set members must be unique, and using just the timestamp would cause collisions when several requests arrive in the same millisecond. To avoid this, we append an atomic counter to the timestamp. The counter lives in its own key (the sliding window’s key plus “:counter”, making it unique per user) and is incremented atomically with Redis’s INCR command.
  • Finally, we apply PEXPIRE to both keys so that an inactive user’s data expires after one window period, preventing unbounded growth. You can sanity-check all of this from the command line, as shown below.
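For a quick sanity check, you can exercise the script directly with redis-cli; this assumes you’ve saved it as ratelimit.lua, and the key name and timestamps are made up. Keys go before the comma, arguments (limit, window size, current time) after it:

$ redis-cli --eval ratelimit.lua user42-rate-limit-v1 , 3 60000 1700000000000
(integer) 1
$ redis-cli --eval ratelimit.lua user42-rate-limit-v1 , 3 60000 1700000000100
(integer) 2
$ redis-cli --eval ratelimit.lua user42-rate-limit-v1 , 3 60000 1700000000200
(integer) 3
$ redis-cli --eval ratelimit.lua user42-rate-limit-v1 , 3 60000 1700000000300
(integer) 4

Any return value greater than the limit (3 here) means the request must be rejected.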

Integration with Go Application

Let’s dive into the application code that ties everything together.

My service is built using the Fiber framework. It uses the go-redis library to talk to Redis. However, adapting this logic to other frameworks or plain Go apps should be straightforward.

Middleware Chain

Requests in the application flow through several Fiber middlewares before reaching the main handlers. The request flow looks like this:

  • The Authentication middleware extracts an API key from each incoming request, resolves the associated user, and binds this user information into the request context. The user data bound to context looks like this:
    type ApiUser struct {
        UserId string
        // ... other fields
    }
    
  • Once the user is authenticated, the Subscription middleware retrieves the subscription details associated with the resolved user. This subscription information is also attached to the request context for downstream middlewares and handlers. The subscription struct looks like this:
    type Subscription struct {
        UserId            string
        RequestsPerMinute int64
        // ... other fields
    }
    
  • Finally, the request reaches our Rate-limit middleware. At this point, we have all the necessary context: the authenticated user and their subscription plan, including their allowed request limits. With this information, we can apply the rate-limiting algorithm to decide whether to allow or deny the request. A sketch of the full chain wiring follows below.
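Wired together, the chain looks roughly like this. To be clear, lookupUserByApiKey, loadSubscription, and the X-Api-Key header name are stand-ins of mine rather than the service’s actual code, and rateLimiter is the Redis-backed limiter built in the next section:

app := fiber.New()

// Authentication middleware: resolve the user from the API key.
app.Use(func(c *fiber.Ctx) error {
	user, err := lookupUserByApiKey(c.Get("X-Api-Key")) // stand-in lookup
	if err != nil {
		return c.SendStatus(fiber.StatusUnauthorized)
	}
	c.Locals("user", user)
	return c.Next()
})

// Subscription middleware: load the plan for the resolved user.
app.Use(func(c *fiber.Ctx) error {
	user := c.Locals("user").(*ApiUser)
	sub, err := loadSubscription(user.UserId) // stand-in query
	if err != nil {
		return c.SendStatus(fiber.StatusInternalServerError)
	}
	c.Locals("subscription", sub)
	return c.Next()
})

// Rate limiting runs last, once the subscription is in context.
app.Use(rateLimiter.ApplyRateLimit)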

If you’re not using Fiber, setting up a middleware chain in a plain Go application should be quite trivial.
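For instance, with nothing but net/http, a middleware is just a function that wraps an http.Handler, and the chain is built by composing those wrappers. A minimal sketch:

package main

import (
	"fmt"
	"net/http"
)

// middleware is the conventional net/http middleware shape.
type middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed runs first.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

func main() {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})

	// A stand-in middleware; authentication, subscription lookup, and
	// rate limiting would each follow this same pattern.
	logRequests := func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			fmt.Println(r.Method, r.URL.Path)
			next.ServeHTTP(w, r)
		})
	}

	http.ListenAndServe(":8080", chain(handler, logRequests))
}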

Implementing Rate-Limit Middleware

It’s time to implement the actual Redis-backed rate-limiting logic in our middleware.

First, we load the Lua script into a reusable redis.Script instance:

rateLimitScript := redis.NewScript(luaScript)

This rateLimitScript is a *redis.Script that we can hold on to and reuse for the lifetime of the app. Under the hood, go-redis runs it with the cheaper EVALSHA and transparently falls back to EVAL when the script isn’t cached on the Redis server yet.

Next, using the subscription details from context, we construct our Redis keys and arguments:

userDataKey := subscription.UserId + "-rate-limit-v1"

now := time.Now().UTC()
duration := time.Minute
expiresAt := now.Add(duration)

keys := []string{userDataKey}
values := []interface{}{
    subscription.RequestsPerMinute,
    duration.Milliseconds(),
    now.UnixMilli(),
}
  • For each user, a unique Redis key is created by appending “-rate-limit-v1” to the user ID. This ensures that the rate limit data is isolated from other data stored for that user.
  • Our window of interest is 1 minute, or 60000 milliseconds. I prefer working in UTC, although it makes no practical difference here: UnixMilli returns the same epoch value regardless of the wall-clock time zone.

Then we execute the Lua script to determine how many requests this user has made within the current sliding window:

requestsInWindow, err := rateLimitScript.Run(c.Context(), redisClient, keys, values).Int64()

requestsInWindow now holds the number of requests this client has made in the current window (i.e., the last minute), including the current one. Comparing it against the number of permitted requests tells us whether this request should be allowed.

if requestsInWindow > subscription.RequestsPerMinute {
	return c.Status(fiber.StatusTooManyRequests).JSON(fiber.Map{
		"error_code": "rate_limit_exceeded",
	})
}

If the user is over their limit, we respond with a 429 Too Many Requests error.

Regardless of the outcome, I also like to send rate limit headers to the client, the way Twitch or Okta do. These are simple to implement, since we already have the necessary data. The only catch is that they must be set before the early 429 return, which is how the complete code at the end orders things.

c.Set("RateLimit-Limit", strconv.FormatInt(sub.RequestsPerMinute, 10))
c.Set("RateLimit-Remaining", strconv.FormatInt(sub.RequestsPerMinute-requestsInWindow, 10))
c.Set("RateLimit-Reset", strconv.FormatInt(expiresAt.Unix(), 10))

And that’s it! We have a fairly simple, but scalable rate limiter in place.

Closing Thoughts

I want to emphasize that this approach is designed specifically for subscription-plan granularity rather than as a general-purpose solution. To cover the basics, I’ve complemented it with an overarching IP-based rate limiter in Nginx, which caps each IP at 50000 requests per minute, comfortably above the highest subscription tier’s limit of 20000 requests per minute.
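For reference, that Nginx layer boils down to a few lines of configuration. What follows is a sketch rather than my exact config; the zone name, zone size, burst value, and upstream name are all assumptions:

limit_req_zone $binary_remote_addr zone=per_ip:10m rate=50000r/m;

server {
    location / {
        # Allow short bursts; reject the rest with 429 instead of queueing.
        limit_req zone=per_ip burst=2000 nodelay;
        limit_req_status 429;
        proxy_pass http://app_upstream;
    }
}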

So far, this layered rate-limiting strategy has proven effective and efficient.

Cheers!


The (almost?) complete code is here:

import (
	"strconv"
	"time"

	"github.com/gofiber/fiber/v2"
	"github.com/redis/go-redis/v9"
)

// luaScript holds the sliding window Lua script from the section above.
var luaScript = ``

// Subscription is the struct bound to the request context by the
// subscription middleware shown earlier.
type Subscription struct {
	UserId            string
	RequestsPerMinute int64
	// ... other fields
}

type RateLimiter struct {
	redisClient     *redis.Client
	rateLimitScript *redis.Script
}

func NewRateLimiter(rdb *redis.Client) *RateLimiter {
	return &RateLimiter{
		redisClient:     rdb,
		rateLimitScript: redis.NewScript(luaScript),
	}
}

func (r *RateLimiter) ApplyRateLimit(c *fiber.Ctx) error {
	sub, ok := c.Locals("subscription").(*Subscription)
	if !ok {
		// The subscription middleware should always have run first.
		return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
			"error_code": "unexpected_api_server_error",
		})
	}

	userDataKey := sub.UserId + "-rate-limit-v1"

	now := time.Now().UTC()
	duration := time.Minute
	expiresAt := now.Add(duration)

	keys := []string{userDataKey}
	values := []interface{}{sub.RequestsPerMinute, duration.Milliseconds(), now.UnixMilli()}

	requestsInWindow, err := r.rateLimitScript.Run(c.Context(), r.redisClient, keys, values).Int64()
	if err != nil {
		return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
			"error_code": "unexpected_api_server_error",
		})
	}

	allowedRate := sub.RequestsPerMinute

	remaining := allowedRate - requestsInWindow
	if remaining < 0 {
		remaining = 0 // a denied request would otherwise produce -1
	}

	// Headers are set before the check so 429 responses carry them too.
	c.Set("RateLimit-Limit", strconv.FormatInt(allowedRate, 10))
	c.Set("RateLimit-Remaining", strconv.FormatInt(remaining, 10))
	c.Set("RateLimit-Reset", strconv.FormatInt(expiresAt.Unix(), 10))

	if requestsInWindow > allowedRate {
		return c.Status(fiber.StatusTooManyRequests).JSON(fiber.Map{
			"error_code": "rate_limit_exceeded",
		})
	}

	return c.Next()
}