Rate Limiting
API requests are rate limited to ensure fair usage and system stability. Learn how to monitor and handle rate limits in your applications.
About Rate Limiting
Rate limiting helps maintain API performance and ensures fair resource distribution among all users. When you exceed the rate limit, you'll receive a 429 Too Many Requests status code.
Key Points:
- Rate limits are applied per API key
- All responses include rate limit headers for monitoring
- Different API endpoints may have different rate limits
- Rate limits reset at specific intervals
Default Rate Limits
The following rate limits apply by default to different API categories:
| API Category | Rate Limit | Description |
|---|---|---|
| Public API | 100 requests/minute | Standard API endpoints for general use |
| Studio API | 60 requests/minute | Video studio and editing endpoints |
| Inference API | 10 requests/minute | AI inference and generation endpoints |
| Widget Sessions | 20 sessions/hour | Widget session creation endpoints |
Note: Rate limits may vary based on your subscription plan. Enterprise customers may have custom rate limits. Check your account dashboard or contact support for specific limits.
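As a rough illustration of what these defaults mean for a client, the sketch below converts a per-window limit into the minimum spacing between evenly paced requests. `DEFAULT_LIMITS` and `minIntervalMs` are hypothetical names, not part of the API, and the values only mirror the table above; your actual limits may differ by plan.

```javascript
// Hypothetical client-side constants mirroring the default limits table.
// Actual limits depend on your plan; treat these values as assumptions.
const DEFAULT_LIMITS = {
  publicApi: { max: 100, windowMs: 60_000 },
  studioApi: { max: 60, windowMs: 60_000 },
  inferenceApi: { max: 10, windowMs: 60_000 },
  widgetSessions: { max: 20, windowMs: 3_600_000 },
};

// Minimum delay between requests if they are spread evenly over the window.
function minIntervalMs({ max, windowMs }) {
  return Math.ceil(windowMs / max);
}
```

Under these assumptions, a client calling the Public API would space requests at least 600 ms apart, and one calling the Inference API at least 6 seconds apart.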
Rate Limit Headers
All API responses include rate limit information in HTTP headers. Monitor these headers to track your usage and avoid hitting rate limits.
Header Format
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200

| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum number of requests allowed in the current time window | 100 |
| X-RateLimit-Remaining | Number of requests remaining in the current time window | 95 |
| X-RateLimit-Reset | Unix timestamp (seconds) when the rate limit window resets | 1640995200 |
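The three headers in the table can be read into a single object, as in the minimal sketch below. The helper name `parseRateLimitHeaders` and the shape of its return value are assumptions made for illustration; the only thing taken from this page is the header names and the fact that the reset value is a Unix timestamp in seconds.

```javascript
// Reads the rate limit headers from a fetch-style Headers object
// (anything with a get(name) method). Hypothetical helper, not part of the API.
// Returns null if any header is missing or is not a number.
function parseRateLimitHeaders(headers) {
  const limit = Number.parseInt(headers.get('X-RateLimit-Limit'), 10);
  const remaining = Number.parseInt(headers.get('X-RateLimit-Remaining'), 10);
  const resetSeconds = Number.parseInt(headers.get('X-RateLimit-Reset'), 10);
  if ([limit, remaining, resetSeconds].some(Number.isNaN)) return null;
  // Convert the Unix timestamp (seconds) to a JavaScript Date (milliseconds).
  return { limit, remaining, resetAt: new Date(resetSeconds * 1000) };
}
```

A caller might use it as `const info = parseRateLimitHeaders(response.headers);` and pause before sending more requests when `info.remaining` reaches 0. For the example value 1640995200, `resetAt` corresponds to 2022-01-01T00:00:00Z.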
Handling Rate Limits
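When you receive a 429 Too Many Requests status code, retry the request after the rate limit window resets. The following JavaScript example reads the rate limit headers and waits until the time given in X-RateLimit-Reset before retrying. For repeated failures, use exponential backoff (see Best Practices).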
async function makeAPICall(url, options, retriesLeft = 3) {
  try {
    const response = await fetch(url, options);
    // Check rate limit headers (present on every response, not only 429s)
    const limit = response.headers.get('X-RateLimit-Limit');
    const remaining = response.headers.get('X-RateLimit-Remaining');
    // X-RateLimit-Reset is a Unix timestamp in seconds; resetMs is NaN if the header is missing
    const resetMs = Number.parseInt(response.headers.get('X-RateLimit-Reset'), 10) * 1000;
    console.log(`Rate limit: ${remaining}/${limit} requests remaining`);
    if (!Number.isNaN(resetMs)) {
      console.log(`Reset time: ${new Date(resetMs).toISOString()}`);
    }
    if (response.status === 429 && retriesLeft > 0) {
      // Wait until the window resets; fall back to 1 second if the header is missing
      const waitTime = Number.isNaN(resetMs) ? 1000 : Math.max(0, resetMs - Date.now());
      console.log(`Rate limited. Waiting ${waitTime}ms`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      return makeAPICall(url, options, retriesLeft - 1); // Retry with a smaller budget
    }
    // Once the retry budget is exhausted, the 429 response is returned to the caller
    return response;
  } catch (error) {
    console.error('API call failed:', error);
    throw error;
  }
}

Rate Limit Error Response
When you exceed the rate limit, you'll receive a 429 status code with the following response format:
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Too many requests. Rate limit exceeded.",
    "details": {
      "limit": 100,
      "remaining": 0,
      "reset_at": "2024-01-01T00:00:00Z"
    }
  },
  "request_id": "req-123456"
}

Best Practices
- Monitor rate limit headers: Always check X-RateLimit-Remaining to track your usage and avoid hitting limits.
- Implement retry logic: When you receive a 429 status, wait until the reset time before retrying the request.
- Use exponential backoff: For multiple retries, implement exponential backoff to avoid overwhelming the API.
- Cache responses: Cache API responses when possible to reduce the number of requests.
- Batch requests: When possible, batch multiple operations into a single request.
- Queue requests: For high-volume applications, implement a request queue to manage rate limits.
- Monitor usage: Track your API usage patterns and adjust your application behavior accordingly.
- Plan for limits: Design your application to gracefully handle rate limit errors without breaking user experience.
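The exponential backoff practice above can be sketched as follows. This is one possible approach under stated assumptions, not the API's prescribed method: the names `backoffDelayMs` and `withBackoff`, the 500 ms base delay, the 30 second cap, the limit of 5 retries, and the use of random jitter are all illustrative choices. `requestFn` is a hypothetical caller-supplied function that returns a fetch-style response.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
// Jitter picks a random delay in [0, cap) so that clients do not retry in lockstep.
function backoffDelayMs(attempt, { baseMs = 500, maxMs = 30_000, random = Math.random } = {}) {
  const cap = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * cap);
}

// Calls requestFn and retries on 429, up to maxRetries times.
// Returns the last response, which may still be a 429 if retries run out.
async function withBackoff(requestFn, { maxRetries = 5, sleep = defaultSleep, ...delayOptions } = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await requestFn();
    if (response.status !== 429 || attempt >= maxRetries) return response;
    await sleep(backoffDelayMs(attempt, delayOptions));
  }
}
```

A call might look like `await withBackoff(() => fetch(url, options))`. The `random` and `sleep` parameters exist only so the timing can be replaced in tests.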
Notes
- Rate limits are enforced per API key, not per IP address.
- Rate limit windows reset at specific intervals (typically every minute or hour depending on the endpoint).
- Different endpoints may have different rate limits based on their resource requirements.
- Rate limit headers are included in all responses, even successful ones, so you can monitor usage proactively.
- If you consistently hit rate limits, consider upgrading your plan or contacting support for custom limits.
- For error handling details, see Error Responses.
- For authentication information, see Authentication.