Unkey’s rate limiting is designed for global, low-latency enforcement across distributed systems.
Architecture
When you call limiter.limit(identifier):
- Request hits the nearest Unkey location
- Counter is checked and updated
- Decision returned in ~30ms globally
Sliding window algorithm
Unkey uses a sliding window algorithm that provides smooth rate limiting without the “burst at window start” problem of fixed windows.
Fixed window problem:
- Limit: 100/minute
- User sends 100 requests at 0:59
- Window resets at 1:00
- User sends 100 more at 1:01
- Result: 200 requests in 2 seconds ❌
Sliding window solution:
- Limit: 100/minute
- Considers requests from the past 60 seconds at any point
- No burst exploitation possible
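
To make the comparison above concrete, here is a minimal sketch that replays the 0:59 / 1:01 burst against a fixed-window counter and a sliding-window log. It is illustrative only and is not Unkey's implementation: the class names `FixedWindowLimiter` and `SlidingWindowLimiter` and the helper `countAllowed` are hypothetical, and a production limiter would likely use a cheaper approximation than storing every timestamp.

```typescript
// Illustrative sketch, NOT Unkey's implementation. All names are hypothetical.

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private windowStart = 0;
  private count = 0;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(nowMs: number): boolean {
    // Windows are aligned to multiples of windowMs; the counter resets at each boundary.
    const start = Math.floor(nowMs / this.windowMs) * this.windowMs;
    if (start !== this.windowStart) {
      this.windowStart = start;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count++;
    return true;
  }
}

class SlidingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private timestamps: number[] = [];

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(nowMs: number): boolean {
    // Keep only requests inside the trailing window (nowMs - windowMs, nowMs].
    const cutoff = nowMs - this.windowMs;
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}

// Replays a list of request times (ms) and counts how many were allowed.
function countAllowed(limiter: { allow(nowMs: number): boolean }, times: number[]): number {
  return times.filter((t) => limiter.allow(t)).length;
}

// 100 requests at 0:59, then 100 more at 1:01.
const burst: number[] = [
  ...Array.from({ length: 100 }, () => 59000),
  ...Array.from({ length: 100 }, () => 61000),
];

const fixedAllowed = countAllowed(new FixedWindowLimiter(100, 60000), burst);
const slidingAllowed = countAllowed(new SlidingWindowLimiter(100, 60000), burst);
```

With a limit of 100 per 60 seconds, `fixedAllowed` is 200 because the counter resets at 1:00, while `slidingAllowed` is 100 because the requests from 0:59 are still inside the trailing 60-second window at 1:01.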
Global consistency
Rate limits are enforced consistently across all regions. A user can’t bypass limits by hitting different geographic endpoints.
Cross-region denial propagation
When an identifier crosses its limit in any region, every other region picks up the denial within a few seconds and starts rejecting the same identifier locally, even before that region sees any of the abusive traffic firsthand. The window is honored end to end: as the offending window decays, every region releases the identifier at the same time.
This means a single attacker hitting your API from multiple geographies can’t multiply their effective limit by the number of regions they hit. Once any region denies them, every region denies them. You don’t have to enable or configure anything; propagation runs automatically for every namespace.
Cross-region enforcement applies to limits with a window of at least 1 minute. Shorter windows (for example, per-second burst limits) are enforced per region only, because the propagation round trip takes longer than the window itself.
Response fields
Every rate limit check returns:

| Field | Type | Description |
|---|---|---|
| success | boolean | true if request is allowed |
| limit | number | The configured limit |
| remaining | number | Requests left in current window |
| reset | number | Unix timestamp (ms) when window resets |
Handling the response
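
One way to act on these fields is to translate them into an HTTP status and headers. The sketch below assumes only the four fields listed in the table above; the `RatelimitResult` and `HttpDecision` types and the `toHttpDecision` helper are hypothetical names, and the `X-RateLimit-*` headers are a common convention rather than something Unkey is stated to require.

```typescript
// Shape of a rate limit check result, mirroring the fields in the table above.
interface RatelimitResult {
  success: boolean;
  limit: number;
  remaining: number;
  reset: number; // Unix timestamp in milliseconds
}

// Hypothetical helper type: what the HTTP layer should send back.
interface HttpDecision {
  status: number;
  headers: Record<string, string>;
}

function toHttpDecision(result: RatelimitResult, nowMs: number = Date.now()): HttpDecision {
  // Conventional (non-standard) headers so clients can see their budget.
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(result.limit),
    "X-RateLimit-Remaining": String(Math.max(0, result.remaining)),
    "X-RateLimit-Reset": String(result.reset),
  };

  if (result.success) {
    return { status: 200, headers };
  }

  // Retry-After is expressed in whole seconds; round up so clients do not retry early.
  const retryAfterSeconds = Math.max(0, Math.ceil((result.reset - nowMs) / 1000));
  headers["Retry-After"] = String(retryAfterSeconds);
  return { status: 429, headers };
}
```

In a request handler you would pass the result of the limit check to `toHttpDecision` and return early with the 429 response when `success` is false.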
Cost-based limiting
Not all requests are equal. Use cost to deduct more from the limit for expensive operations. For example, with a limit of 100 and a cost of 5 for expensive operations, a user can make:
- 100 normal requests, OR
- 20 expensive requests, OR
- Mix of both
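
The arithmetic behind the list above can be sketched with a simple budget counter. This is a local illustration only, not the Unkey SDK: `CostBudget` is a hypothetical class, the cost of 5 for an expensive request is an assumption inferred from the 100 versus 20 figures, and window expiry is left out to keep the example short.

```typescript
// Illustrative budget counter, NOT the Unkey SDK. Window reset is omitted.
class CostBudget {
  private readonly limit: number;
  private used = 0;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Deducts `cost` units if they fit in the remaining budget; otherwise denies
  // the request and leaves the budget unchanged.
  consume(cost: number = 1): { success: boolean; remaining: number } {
    if (this.used + cost > this.limit) {
      return { success: false, remaining: this.limit - this.used };
    }
    this.used += cost;
    return { success: true, remaining: this.limit - this.used };
  }
}

const EXPENSIVE_COST = 5; // assumed value: 100 / 20 = 5

// A mix of both: 50 normal requests (cost 1) plus 10 expensive ones (cost 5).
const mixed = new CostBudget(100);
let mixedAllowed = 0;
for (let i = 0; i < 50; i++) if (mixed.consume().success) mixedAllowed++;
for (let i = 0; i < 10; i++) if (mixed.consume(EXPENSIVE_COST).success) mixedAllowed++;
```

Here all 60 requests are allowed and the budget is then exhausted, since 50 × 1 + 10 × 5 = 100.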
Timeout and fallback
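
As a rough illustration of the idea, the sketch below wraps any rate limit check in a timeout and returns a caller-supplied fallback if the check is slow or fails. It is a generic pattern built on `Promise.race`, not the SDK's own configuration surface: `limitWithTimeout`, `RatelimitResult`, and the `check` parameter are hypothetical names introduced here.

```typescript
// Same four fields as the response table; the type name is hypothetical.
interface RatelimitResult {
  success: boolean;
  limit: number;
  remaining: number;
  reset: number; // Unix timestamp in milliseconds
}

// Runs `check`, but resolves with `fallback` if it takes longer than
// `timeoutMs` or throws (for example, on a network error).
async function limitWithTimeout(
  check: () => Promise<RatelimitResult>,
  timeoutMs: number,
  fallback: RatelimitResult,
): Promise<RatelimitResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<RatelimitResult>((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  try {
    // Whichever settles first wins: the real check or the timeout.
    return await Promise.race([check(), timeout]);
  } catch {
    return fallback;
  } finally {
    // Always clear the timer so it does not keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

The fallback decides whether you fail open or fail closed: a fallback with `success: true` keeps traffic flowing during an outage at the cost of temporarily unenforced limits, while `success: false` protects the backend but rejects legitimate users.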
Configure behavior when Unkey is unreachable.
Next steps
- Custom overrides: Give specific users different limits
- SDK reference: Full SDK documentation

