To ensure availability, stability, and fair usage for all users, the tipee API enforces rate limiting on all requests.

Token bucket

Rate limiting is based on a token bucket mechanism. Each tenant has a bucket of tokens that is consumed by requests and replenished continuously over time.

Pool size: 500 tokens. This is the maximum token pool size at your disposal.

Refill rate: 4 tokens per second. Your pool continuously replenishes at this pace.
Each API request consumes one token from the bucket. When the bucket is empty, further requests are rejected until enough tokens have been replenished. These limits apply to all endpoints available in this documentation.
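As an illustration only (not the server's actual implementation), the bucket mechanics described above can be sketched like this:

```python
import time

class TokenBucket:
    """Illustrative token bucket: capacity 500, refilling at 4 tokens/second."""

    def __init__(self, capacity=500, refill_rate=4.0):
        self.capacity = capacity
        self.refill_rate = refill_rate          # tokens added per second
        self.tokens = float(capacity)           # bucket starts full
        self.last_refill = time.monotonic()

    def _refill(self):
        # Add tokens for the time elapsed, never exceeding capacity.
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def try_consume(self):
        """Take one token if available; return False when the bucket is empty."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With the documented parameters, a client that sends fewer than 4 requests per second on average never empties the bucket; bursts above that rate draw the pool down until it is replenished.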

Concurrency limits

In addition to rate limiting, the API limits the number of requests that can be processed simultaneously for a given tenant.
Parameter                      Value
Maximum concurrent requests    32 per tenant
Queue capacity                 128 additional requests
When all concurrent slots are in use, additional requests are held in a queue. If the queue is also full, the API responds immediately with an error.

These are upper bounds: during periods of high load, effective concurrency may be reduced to ensure fairness across tenants. Reaching these limits is also likely to affect human users interacting with tipee on the same instance.
In practice, we recommend keeping concurrent requests below 10.
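One simple way to enforce such a client-side cap is a semaphore. This is a sketch; `do_request` stands in for whatever function performs your HTTP call:

```python
import threading

# Client-side cap, kept below the recommended maximum of 10
# (the API itself enforces 32 concurrent / 128 queued per tenant).
MAX_CONCURRENT = 8

slots = threading.Semaphore(MAX_CONCURRENT)

def call_api(do_request):
    """Run do_request() while holding one of the MAX_CONCURRENT slots.

    Threads beyond the cap block here until a slot frees up, so at most
    MAX_CONCURRENT requests are ever in flight at once.
    """
    with slots:
        return do_request()
```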

Response headers

Every API response includes headers describing your current rate limit status:
Header                       Description
X-RateLimit-Limit            Maximum number of requests allowed
X-RateLimit-Remaining        Requests remaining before the limit is reached
X-RateLimit-Rate-Amount      Number of requests replenished per interval
X-RateLimit-Rate-Interval    Duration of each replenishment interval, in seconds
X-RateLimit-Retry-After      Seconds to wait before a request can be accepted (0 when within limits)
Example:
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 456
X-RateLimit-Retry-After: 0
We strongly recommend using these headers in your application to proactively manage your request frequency and avoid hitting the limit.
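For example, a client could compute a pause from these headers before sending its next request. This is a sketch; the 20% threshold is an arbitrary choice for illustration, not something the API prescribes:

```python
def throttle_delay(remaining, limit, rate_amount, rate_interval):
    """Compute a pause (in seconds) from the X-RateLimit-* header values.

    While more than 20% of the pool remains, proceed immediately.
    Below that, wait long enough for one token to be replenished
    before the next request (rate_interval / rate_amount seconds).
    """
    if remaining > limit * 0.2:
        return 0.0
    return rate_interval / rate_amount
```

With the documented values (limit 500, 4 tokens per 1-second interval), a client that drops below 100 remaining tokens would pause 0.25 seconds between requests, matching the refill rate.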

Exceeding the limit

When the limit is reached, the API responds with HTTP 429 Too Many Requests and includes a Retry-After header indicating how long to wait:
HTTP/1.1 429 Too Many Requests
Retry-After: 1
You should not attempt to retry the request until the Retry-After duration has passed.
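A minimal retry loop that honors Retry-After might look like this. It is a sketch: `do_request` stands in for your HTTP call and is assumed to return an object with `status_code` and `headers` attributes:

```python
import time

def request_with_retry(do_request, max_attempts=5):
    """Call do_request(); on a 429, wait for Retry-After and try again."""
    for attempt in range(max_attempts):
        response = do_request()
        if response.status_code != 429:
            return response
        # Wait the full duration the server asked for before retrying.
        wait = int(response.headers.get("Retry-After", 1))
        time.sleep(wait)
    raise RuntimeError("rate limited on every attempt")
```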

Best practices

Use response headers to self-throttle

Monitor X-RateLimit-Remaining and reduce your request rate as it decreases. This is more reliable than reacting to 429 errors after the fact.

Limit concurrency

Keep parallel requests below 10 to avoid contention and minimize impact on human users of the same tipee instance.

Respect Retry-After

When you receive a 429 response, always wait for the full duration specified in the header before retrying.

Spread requests over time

A steady flow of requests uses the refill rate more efficiently than large bursts followed by idle periods.
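For example, a client can pace a batch of work at or below the refill rate instead of sending it all at once (a sketch; `process` stands in for your per-item API call):

```python
import time

def paced(items, process, rate=4.0):
    """Process items at a steady `rate` per second instead of in a burst.

    Sleeps between calls so the average request rate never exceeds the
    bucket's refill rate, keeping the token pool from draining.
    """
    interval = 1.0 / rate
    results = []
    for item in items:
        start = time.monotonic()
        results.append(process(item))
        elapsed = time.monotonic() - start
        if elapsed < interval:
            time.sleep(interval - elapsed)
    return results
```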

Implement exponential backoff

For transient errors (5xx) or repeated rate limit responses, increase the delay between retries rather than retrying immediately.
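A common way to implement this is exponential backoff with full jitter, sketched below; the base and cap values are illustrative, not prescribed by the API:

```python
import random

def backoff_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff with full jitter.

    Returns a random delay in [0, min(cap, base * 2**attempt)] seconds,
    so retry waves from many clients do not synchronize.
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```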