For public web services, to throttle excess requests of a client we employ a technique called Rate Limiting. Rate limiting accept and reject a client request based on a certain quota.
A typical application can have many resources with the need for rate limiting.
For a simple one-machine system, we can use the Token Bucket algorithm, (Other Algos like a sliding window can also be used) to restrict client requests for a resource beyond a quota.
The token bucket algorithm works by accumulating tokens in a bucket at a fixed rate. Each token represents a certain amount of data or allowed transmission. Data packets can be transmitted only when there are enough tokens in the bucket to cover their size. If the bucket is empty, data transmission is delayed until new tokens are generated at the fixed rate.
We require the following components:
- Quota Rules Manager – It manages resources and quota mapping.
- Token Bucket Manager
- Rules Retriever – It gets quota details from Quota Rules Manager.
- Token Bucket Manager – Use rules and manage the token bucket for distinct clients and requests. It also periodically adds tokens to the bucket and deletes idle token buckets to control memory utilization.
- Client Identifier Generator – This component can use requesting client user id, IP address, or any other attribute to generate a unique identity.
- Rate Limiter – This component will take resource id and client unique identification and decide if request access is allowed or rejected by checking with the token bucket.
In the distributed environment a single request is handled by multiple cluster nodes. A client request handled at one node will impact the token bucket on all nodes. To implement a mechanism for cross-node communication, we can use the following mechanism
- Full mesh – Let everyone know everyone. Good for a small no. of nodes.
- Gossip Protocol
- Distributed cache design like Redis
- Leader Election- Elected leader receives the update and relays the update
Integration of the Rate Limiter is a crucial aspect and we can use Rate Limiter with service or within the same host as service as a separate process.