API Rate Limiting

How Rate Limiting Works

huhu.ai uses a sliding-window rate limiter to ensure fair usage across all customers. Each plan tier has a defined maximum number of API requests per hour: Free (100), Starter (500), Pro (2,000), and Enterprise (custom). Limits apply per API key, not per endpoint.
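
To make the sliding-window idea concrete, here is a minimal client-side sketch of a sliding-window log. It illustrates the general technique only and is not huhu.ai's server implementation. The class name SlidingWindowLimiter, its parameters, and the injectable clock are hypothetical. The hourly caps come from the plan tiers listed above.

```python
import time
from collections import deque

# Hourly caps from the plan tiers above; Enterprise is custom, so it is omitted.
PLAN_LIMITS = {"free": 100, "starter": 500, "pro": 2000}


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any trailing `window` seconds.

    Hypothetical client-side helper; the server's real algorithm may differ.
    """

    def __init__(self, limit, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.timestamps = deque()  # send times of calls still inside the window

    def allow(self):
        now = self.clock()
        # Drop calls that have aged out of the trailing window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

A client on the Pro tier could create SlidingWindowLimiter(PLAN_LIMITS["pro"]) and call allow() before each request, sending the request only when it returns True.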

Rate limits are separate from monthly generation credits. You can make many read-only API calls (e.g., listing projects) without consuming generation credits.

Rate Limit Headers

Every API response includes three rate-limit headers: X-RateLimit-Limit (your hourly cap), X-RateLimit-Remaining (requests left in the current window), and X-RateLimit-Reset (Unix timestamp when the window resets).

Read these headers on every response and slow your client down as X-RateLimit-Remaining approaches zero, rather than waiting for the API to reject a request.
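
One way to act on these headers is sketched below. The function throttle_delay and its low_water threshold are hypothetical names chosen for illustration. The sketch also assumes the headers arrive as a plain dict with the exact header names shown above and that X-RateLimit-Reset is expressed in seconds.

```python
import time


def throttle_delay(headers, now=None, low_water=10):
    """Return how many seconds to pause before the next request.

    Assumption: `low_water` is an arbitrary threshold, not a huhu.ai setting.
    """
    now = time.time() if now is None else now
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = float(headers["X-RateLimit-Reset"])  # Unix timestamp
    until_reset = max(0.0, reset_at - now)

    if remaining > low_water:
        return 0.0  # plenty of headroom, no need to slow down
    if remaining <= 0:
        return until_reset  # nothing left, wait for the reset
    # Spread the few remaining requests evenly over the time left.
    return until_reset / remaining
```

Calling time.sleep(throttle_delay(response_headers)) after each response keeps the client under the cap without ever triggering a rejection, at the cost of slower requests near the end of the window.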

Handling 429 Errors

When you exceed the limit, the API returns 429 Too Many Requests with a Retry-After header indicating how many seconds to wait. Implement exponential backoff in your client: on the first 429, wait for the Retry-After duration; on each subsequent 429, double the previous wait. Repeat until the request succeeds.

Avoid tight retry loops without backoff, as they will keep your client rate-limited longer and may trigger temporary IP-level blocks.
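
The backoff procedure described above could be sketched as follows. The send callable is a hypothetical stand-in for whatever HTTP client you use and is assumed to return a (status_code, headers, body) tuple. The max_attempts cap and the 1-second fallback for a missing Retry-After header are assumptions added for safety and are not part of the API.

```python
import time


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns something other than 429.

    `send` is a hypothetical callable returning (status_code, headers, body).
    """
    wait = None
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break  # give up instead of sleeping again
        # Assumption: fall back to 1 second if Retry-After is missing.
        retry_after = float(headers.get("Retry-After", 1))
        # First 429: wait the Retry-After duration. Later 429s: double the
        # previous wait, but never wait less than the server asks for.
        wait = retry_after if wait is None else max(wait * 2, retry_after)
        sleep(wait)
    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")
```

Capping the number of attempts means a persistent 429 surfaces as an error in your application rather than as a loop that never ends.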

Best Practices

Cache responses where possible to reduce redundant API calls.
Use webhooks instead of polling for asynchronous job results.
Distribute API calls evenly over time rather than bursting them all at once.
If you consistently hit your limit, consider upgrading to a higher plan tier with a larger allowance.
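
As an illustration of distributing calls evenly, the sketch below spaces requests at a fixed interval derived from the hourly cap (3600 seconds divided by the limit). The Pacer class and its clock and sleep parameters are hypothetical. This is one simple pacing strategy among several, not an official client.

```python
import time


class Pacer:
    """Space calls at least 3600 / hourly_limit seconds apart."""

    def __init__(self, hourly_limit, clock=time.monotonic, sleep=time.sleep):
        self.interval = 3600.0 / hourly_limit
        self.clock = clock
        self.sleep = sleep
        self.next_slot = None  # earliest time the next call may go out

    def wait_turn(self):
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            # Too early: sleep until the next slot opens.
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.interval
```

Calling wait_turn() before each request on the Free tier (100 requests per hour) would space requests 36 seconds apart. A batch job that tolerates latency can use this in place of bursting.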