Rate Limiting

API requests are rate limited to ensure fair usage and system stability. Learn how to monitor and handle rate limits in your applications.

About Rate Limiting

Rate limiting helps maintain API performance and ensures fair resource distribution among all users. When you exceed the rate limit, you'll receive a 429 Too Many Requests status code.

Key Points:

  • Rate limits are applied per API key
  • All responses include rate limit headers for monitoring
  • Different API endpoints may have different rate limits
  • Rate limits reset at fixed intervals (typically every minute or hour, depending on the endpoint)

Default Rate Limits

The following rate limits apply by default to different API categories:

| API Category | Rate Limit | Description |
| --- | --- | --- |
| Public API | 100 requests/minute | Standard API endpoints for general use |
| Studio API | 60 requests/minute | Video studio and editing endpoints |
| Inference API | 10 requests/minute | AI inference and generation endpoints |
| Widget Sessions | 20 sessions/hour | Widget session creation endpoints |

Note: Rate limits may vary based on your subscription plan. Enterprise customers may have custom rate limits. Check your account dashboard or contact support for specific limits.
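
One way to stay under a limit is to space requests evenly across the window. The sketch below assumes the default limits from the table above (your plan may differ). `DEFAULT_LIMITS` and `minIntervalMs` are illustrative names, not part of the API.

```javascript
// Default limits copied from the table above; actual limits depend on your plan.
const DEFAULT_LIMITS = {
    public: { max: 100, windowMs: 60 * 1000 },            // 100 requests/minute
    studio: { max: 60, windowMs: 60 * 1000 },             // 60 requests/minute
    inference: { max: 10, windowMs: 60 * 1000 },          // 10 requests/minute
    widgetSessions: { max: 20, windowMs: 60 * 60 * 1000 } // 20 sessions/hour
};

// Smallest delay between requests that keeps evenly spaced traffic within the limit.
function minIntervalMs(category) {
    const { max, windowMs } = DEFAULT_LIMITS[category];
    return Math.ceil(windowMs / max);
}

console.log(minIntervalMs('public'));    // 600
console.log(minIntervalMs('inference')); // 6000
```

At the default Public API limit this works out to one request every 600 ms. For the Inference API it is one request every 6 seconds.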

Rate Limit Headers

All API responses include rate limit information in HTTP headers. Monitor these headers to track your usage and avoid hitting rate limits.

Header Format

http
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200

| Header | Description | Example |
| --- | --- | --- |
| X-RateLimit-Limit | Maximum number of requests allowed in the current time window | 100 |
| X-RateLimit-Remaining | Number of requests remaining in the current time window | 95 |
| X-RateLimit-Reset | Unix timestamp (seconds) when the rate limit window resets | 1640995200 |
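
The headers arrive as strings, so it helps to convert them once into numbers and a date. The following is a minimal sketch. `parseRateLimitHeaders` is a hypothetical helper name, and the `Map` in the usage lines only stands in for a real response's `headers` object.

```javascript
// Parse the three rate limit headers into numbers and a Date.
// `headers` is anything with a get(name) method, such as response.headers from fetch.
function parseRateLimitHeaders(headers) {
    const toInt = (name) => {
        const value = Number.parseInt(headers.get(name), 10);
        return Number.isNaN(value) ? null : value; // null when missing or malformed
    };
    const limit = toInt('X-RateLimit-Limit');
    const remaining = toInt('X-RateLimit-Remaining');
    const resetSeconds = toInt('X-RateLimit-Reset');
    return {
        limit,
        remaining,
        // The header is in Unix seconds; Date expects milliseconds.
        resetAt: resetSeconds === null ? null : new Date(resetSeconds * 1000),
    };
}

// Usage with the example values shown above (a Map stands in for response.headers)
const info = parseRateLimitHeaders(new Map([
    ['X-RateLimit-Limit', '100'],
    ['X-RateLimit-Remaining', '95'],
    ['X-RateLimit-Reset', '1640995200'],
]));
console.log(info.remaining, info.resetAt.toISOString()); // 95 2022-01-01T00:00:00.000Z
```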

Handling Rate Limits

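When you receive a 429 Too Many Requests status code, wait before retrying instead of resending the request immediately. The JavaScript example below reads the rate limit headers and, on a 429, waits until the reset time before retrying. If retries keep failing, use exponential backoff.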

javascript
// retriesLeft bounds the number of retries; the default of 3 is an arbitrary choice.
async function makeAPICall(url, options, retriesLeft = 3) {
    try {
        const response = await fetch(url, options);

        // Read the rate limit headers (get() returns null if a header is absent)
        const limit = response.headers.get('X-RateLimit-Limit');
        const remaining = response.headers.get('X-RateLimit-Remaining');
        const reset = Number.parseInt(response.headers.get('X-RateLimit-Reset'), 10);

        console.log(`Rate limit: ${remaining}/${limit} requests remaining`);
        if (Number.isFinite(reset)) {
            console.log(`Reset time: ${new Date(reset * 1000).toISOString()}`);
        }

        if (response.status === 429 && retriesLeft > 0) {
            // Wait until the window resets; fall back to 1 second if the header is missing or invalid
            const waitTime = Number.isFinite(reset)
                ? Math.max(0, reset * 1000 - Date.now())
                : 1000;

            console.log(`Rate limited. Waiting ${waitTime}ms`);
            await new Promise(resolve => setTimeout(resolve, waitTime));

            return makeAPICall(url, options, retriesLeft - 1); // Retry
        }

        // Either a non-429 response, or a 429 after the retries are used up
        return response;
    } catch (error) {
        console.error('API call failed:', error);
        throw error;
    }
}
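
Waiting for the reset time works when X-RateLimit-Reset is present. As a complement, exponential backoff can be sketched as follows. The base delay, cap, retry count, and the "equal jitter" variant are assumptions chosen for illustration, and `backoffDelayMs` and `fetchWithBackoff` are hypothetical names.

```javascript
// Delay before retry number `attempt` (0-based).
// The ceiling doubles each attempt (baseMs * 2^attempt) up to capMs; the returned
// delay falls between half the ceiling and the full ceiling ("equal jitter").
function backoffDelayMs(attempt, baseMs = 500, capMs = 30000, random = Math.random) {
    const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
    return Math.floor(ceiling / 2 + random() * (ceiling / 2));
}

// Retry on 429 up to maxRetries times, backing off between attempts.
// After the last retry, the 429 response is returned to the caller.
async function fetchWithBackoff(url, options, maxRetries = 5) {
    for (let attempt = 0; ; attempt++) {
        const response = await fetch(url, options);
        if (response.status !== 429 || attempt >= maxRetries) {
            return response;
        }
        await new Promise(resolve => setTimeout(resolve, backoffDelayMs(attempt)));
    }
}
```

The random component spreads retries out, so several clients that were limited at the same moment do not all retry at the same moment.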

Rate Limit Error Response

When you exceed the rate limit, you'll receive a 429 status code with the following response format:

json
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Too many requests. Rate limit exceeded.",
    "details": {
      "limit": 100,
      "remaining": 0,
      "reset_at": "2024-01-01T00:00:00Z"
    }
  },
  "request_id": "req-123456"
}
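
The `reset_at` field in the error body can also be used to compute a wait time. The sketch below assumes the body has already been parsed from JSON and follows the format shown above. `waitFromErrorBody` is a hypothetical helper name.

```javascript
// Given a parsed 429 response body, return how many milliseconds to wait before retrying.
// Returns null if the body is not a rate limit error in the format shown above.
function waitFromErrorBody(body, now = Date.now()) {
    const error = body && body.error;
    if (!error || error.code !== 'RATE_LIMIT_EXCEEDED') {
        return null;
    }
    // reset_at is an ISO 8601 timestamp; Date.parse returns NaN if it is missing or invalid
    const resetAt = Date.parse(error.details && error.details.reset_at);
    if (Number.isNaN(resetAt)) {
        return null;
    }
    return Math.max(0, resetAt - now);
}

// Usage with the example body, 30 seconds before the reset time
const exampleBody = {
    error: {
        code: 'RATE_LIMIT_EXCEEDED',
        message: 'Too many requests. Rate limit exceeded.',
        details: { limit: 100, remaining: 0, reset_at: '2024-01-01T00:00:00Z' },
    },
    request_id: 'req-123456',
};
console.log(waitFromErrorBody(exampleBody, Date.parse('2023-12-31T23:59:30Z'))); // 30000
```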

Best Practices

  • Monitor rate limit headers: Always check X-RateLimit-Remaining to track your usage and avoid hitting limits.
  • Implement retry logic: When you receive a 429 status, wait until the reset time before retrying the request.
  • Use exponential backoff: For multiple retries, implement exponential backoff to avoid overwhelming the API.
  • Cache responses: Cache API responses when possible to reduce the number of requests.
  • Batch requests: When possible, batch multiple operations into a single request.
  • Queue requests: For high-volume applications, implement a request queue to manage rate limits.
  • Monitor usage: Track your API usage patterns and adjust your application behavior accordingly.
  • Plan for limits: Design your application to gracefully handle rate limit errors without breaking user experience.
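
The request queue suggested above can be sketched as a small class that runs tasks one at a time with a pause between them. `RequestQueue` is a hypothetical name, and this is one possible design under simple assumptions (a single process and one API key), not a prescribed implementation.

```javascript
// Runs queued tasks one at a time, leaving at least `intervalMs` between
// the end of one task and the start of the next.
class RequestQueue {
    constructor(intervalMs) {
        this.intervalMs = intervalMs;
        this.tail = Promise.resolve(); // resolves when the next slot is free
    }

    // `task` is a function returning a promise, e.g. () => fetch(url, options).
    // Returns a promise for the task's own result (or rejection).
    enqueue(task) {
        const result = this.tail.then(() => task());
        // Whether the task succeeds or fails, hold the slot for intervalMs.
        this.tail = result
            .catch(() => {})
            .then(() => new Promise(resolve => setTimeout(resolve, this.intervalMs)));
        return result;
    }
}

// Usage (not executed here): about one request every 600 ms
// const queue = new RequestQueue(600);
// const response = await queue.enqueue(() => fetch(url, options));
```

Because the pause is measured from the end of each task, the effective rate is slightly lower than the configured interval implies. That errs on the side of staying under the limit.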

Notes

  • Rate limits are enforced per API key, not per IP address.
  • Rate limit windows reset at specific intervals (typically every minute or hour depending on the endpoint).
  • Different endpoints may have different rate limits based on their resource requirements.
  • Rate limit headers are included in all responses, even successful ones, so you can monitor usage proactively.
  • If you consistently hit rate limits, consider upgrading your plan or contacting support for custom limits.
  • For error handling details, see Error Responses.
  • For authentication information, see Authentication.
