TokenLimiter
This library limits token usage per minute across multiple goroutines, with optional delay-until-fit behavior.
Problem to Solve
Solution
Typical Use Cases
- Rate limiting API calls across concurrent processes
- Controlling resource consumption in distributed systems
- Managing token usage for third-party API integrations
- Ensuring compliance with API rate limits in high-throughput applications
Used By
- LinkResearchTools
- A large AI brand
Version History
- Review and update to latest libraries
- Fixed compatibility issues with older systems
- Review and update to latest libraries
License Terms
Features
- Thread-safe token limiting across multiple goroutines
- Option to delay until fit: when enabled, a call that would exceed the token usage limit waits until enough time has passed for the usage to fit within the limit
- Error when limit exceeded: when delay until fit is not enabled, the call will return an error if the token usage limit is exceeded
- Track average usage: keep track of the average token usage over a series of recent data points (e.g., the last 100 usages)
- Get last minute: a method that returns the most recent minute in which tokens were added
Limitations
This TokenLimiter works only on a single machine. If it runs on multiple nodes of a cluster, the combined usage can exceed the token limit. One workaround is to give each node a fixed portion of the tokens-per-minute (TPM) quota. This works if each node is known to have similar utilization. However, if a node handles fewer requests or is not operational, its share of the quota goes unused, which degrades cluster performance. The proper solution would be to implement a cluster of TokenLimiters that manages the distribution of token limits per machine dynamically.
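The static workaround amounts to simple arithmetic, sketched below. `SplitQuota` is a hypothetical helper that is not part of this library, and the 90,000 TPM figure is only an example value.

```go
package main

import "fmt"

// SplitQuota divides a tokens-per-minute budget across nodes as evenly as
// possible. Hypothetical helper for illustration only.
func SplitQuota(tpm, nodes int) []int {
	if nodes <= 0 || tpm < 0 {
		return nil
	}
	shares := make([]int, nodes)
	base, extra := tpm/nodes, tpm%nodes
	for i := range shares {
		shares[i] = base
		if i < extra {
			shares[i]++ // hand out the remainder one token at a time
		}
	}
	return shares
}

func main() {
	shares := SplitQuota(90000, 4)
	fmt.Println(shares) // [22500 22500 22500 22500]

	// If one node is idle, its share cannot be used by the others.
	usable := 0
	for _, s := range shares[1:] {
		usable += s
	}
	fmt.Println("usable TPM with one idle node:", usable) // 67500
}
```

Each node would then run its own limiter with its share as the limit. The second printed value shows the drawback described above: a quarter of the quota is stranded while one of four nodes is idle.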
How is this different from Rate Limiters
Rate limiters and my TokenLimiter both limit resource usage, but there are key differences. Rate limiters typically limit the number of requests that can be made in a given time period and often handle concurrency in isolation. My TokenLimiter is specifically designed for token-based systems (e.g., API tokens) where each action may consume a different number of tokens, and it provides thread-safe management across concurrent operations.
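The distinction can be made concrete with a toy comparison. Both functions below are hypothetical illustrations within a single time window, not code from this library or from any particular rate limiter: one admits requests by count, the other by the sum of their token costs.

```go
package main

import "fmt"

// admitByCount models a request-count limiter: it admits the first
// maxRequests requests no matter how many tokens each one costs.
func admitByCount(costs []int, maxRequests int) (admitted, tokens int) {
	for _, c := range costs {
		if admitted == maxRequests {
			break
		}
		admitted++
		tokens += c
	}
	return admitted, tokens
}

// admitByTokens models a token-based limiter: a request is admitted only
// if its cost still fits in the remaining token budget; otherwise it is
// rejected and later, cheaper requests may still pass.
func admitByTokens(costs []int, maxTokens int) (admitted, tokens int) {
	for _, c := range costs {
		if tokens+c > maxTokens {
			continue
		}
		admitted++
		tokens += c
	}
	return admitted, tokens
}

func main() {
	costs := []int{3000, 2500, 100, 100, 100} // token cost of each request

	n, used := admitByCount(costs, 3)
	fmt.Printf("count-based: %d requests, %d tokens\n", n, used) // 3 requests, 5600 tokens

	n, used = admitByTokens(costs, 4000)
	fmt.Printf("token-based: %d requests, %d tokens\n", n, used) // 4 requests, 3300 tokens
}
```

With a budget of 4,000 tokens, the count-based limiter admits three requests that together consume 5,600 tokens, while the token-based one rejects the second large request and stays within budget.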