An alternative approach to rate limiting
If you are building web applications at consumer scale, you may find Figma’s approach to building its rate limiter quite helpful. A rate limiter caps how many requests a sender , a user or an IP address , can issue in a specific window of time. Nikrad Mahdi considered the existing implementations such as Token bucket, Fixed window counters and Sliding Windows log but none seemed to solve effectively for two factors that were important for the team:
- accurately limit excessive use of their web application
- use as little memory as possible
The token bucket algorithm’s Redis operations aren’t atomic. In a distributed environment, the “read-and-then-write” behavior creates a race condition. This means the rate limiter can at times be too lenient. Redis operations are atomic for a fixed window approach but it can sometimes let through twice the number of allowed requests per minute. For example, if the rate limits are 5 requests per minute and a user makes 5 requests at 11:00:59, they could make 5 more requests at 11:01:00 because a new counter begins at the start of each minute. And while the precision of the sliding window log approach is useful, it leaves a considerably large memory footprint because it stores a value for every request.
As none of these met the team’s requirement, Nikrad implemented an alternative approach by combining fixed window counters and sliding window log. Requests from each sender are counted using multiple fixed time windows 1/60th the size of the rate limit’s time window. To reduce the memory footprint, they store the counters in a Redis hash which offer efficient storage when they have fewer than 100 keys. When each request increments a counter in the hash, it also sets the hash to expire an hour later. If they have a rate limit of 500 requests per day per user on a website with 10,000 active users, they would at most need to store ~600,000 values in Redis. That comes out to a memory footprint of ~2.4 MB.
Check out the post to understand the practical implications of implementing a rate limiting algo.
8 mins read