The Semji API enforces rate limits to ensure fair usage and stable performance for all customers. Understanding these limits and building retry logic into your integration will prevent disruptions in production. This page explains the limits in place, the response headers you can use to track your usage, and recommended strategies for handling errors gracefully.
Limits
Two rate limits apply to every API key simultaneously:

| Limit | Window | Maximum |
|---|---|---|
| Hourly | 1 hour (rolling) | 1,000 requests |
| Burst | 1 second | 20 requests |
The health check (/health), OpenAPI spec (/openapi.json), and documentation (/docs) endpoints are excluded from rate limiting.
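To stay under both limits on the client side, you can track your own request timestamps. The following is a minimal sketch, not an official client: the class name `DualWindowLimiter` is hypothetical, and it assumes the one-second burst window behaves as a rolling window, which is the more conservative reading.

```python
import time
from collections import deque


class DualWindowLimiter:
    """Client-side guard for a burst limit and a rolling hourly limit."""

    def __init__(self, burst=20, hourly=1000, clock=time.monotonic):
        self.burst = burst
        self.hourly = hourly
        self.clock = clock
        self.sent = deque()  # timestamps of requests sent in the last hour

    def wait_time(self):
        """Return how many seconds to wait before sending the next request."""
        now = self.clock()
        # Drop timestamps that have left the rolling one-hour window.
        while self.sent and now - self.sent[0] >= 3600:
            self.sent.popleft()
        wait = 0.0
        # Hourly limit: wait until the oldest counted request is an hour old.
        if len(self.sent) >= self.hourly:
            wait = max(wait, self.sent[-self.hourly] + 3600 - now)
        # Burst limit (assumed rolling): wait until the request `burst`
        # positions back is at least one second old.
        if len(self.sent) >= self.burst:
            wait = max(wait, self.sent[-self.burst] + 1 - now)
        return wait

    def record(self):
        """Record that a request was sent now."""
        self.sent.append(self.clock())
```

Before each request, sleep for `wait_time()` seconds, send the request, then call `record()`.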
Rate limit headers
Every API response includes the following headers reflecting your current hourly limit status:
The maximum number of requests allowed per hour for this API key.
The number of requests remaining in the current rolling hour window.
The number of seconds until the oldest request in the rolling window falls out and your remaining count increases.
To see these headers, make a request with curl's -i flag:
Inspect rate limit headers
Response headers (excerpt)
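The same inspection can be done from Python. This is a rough sketch under one assumption: that all rate limit headers share the `X-RateLimit-` prefix. Only `X-RateLimit-Remaining` is named on this page, so treat the prefix filter as a guess. The function name `rate_limit_headers` and the sample values are made up for illustration.

```python
def rate_limit_headers(headers):
    """Return the rate limit headers from a response, keyed in lowercase."""
    return {
        name.lower(): value
        for name, value in headers.items()
        # Assumption: every rate limit header starts with "X-RateLimit-".
        if name.lower().startswith("x-ratelimit-")
    }


# Hypothetical excerpt of response headers; the values are invented.
sample = {
    "Content-Type": "application/json",
    "X-RateLimit-Remaining": "987",
}
print(rate_limit_headers(sample))
```

The function accepts any mapping with an `items()` method, so it should also work with the headers object of most HTTP client libraries.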
When the limit is exceeded
When you exceed either limit, the API responds with 429 Too Many Requests. The response body follows the standard error format:
429 Too Many Requests
On a 429, the response also includes a Retry-After header indicating the number of seconds to wait before retrying.
Handling 429 errors
Exponential backoff
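As a minimal sketch of exponential backoff with jitter in Python: the helper names, the `send` callable (assumed to return a `(status, headers, body)` tuple), and the default values for `base`, `cap`, `jitter`, and `max_retries` are all illustrative assumptions, not part of the Semji API. When a Retry-After value is present, the sketch uses it instead of the computed delay.

```python
import random
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, jitter=0.5):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Prefer the server's Retry-After value (in seconds) when present.
        delay = float(retry_after)
    else:
        # Double the delay on each attempt, up to `cap` seconds.
        delay = min(cap, base * 2 ** attempt)
    # Add a small random delay so parallel workers do not retry in lockstep.
    return delay + random.uniform(0, jitter)


def call_with_backoff(send, max_retries=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or retries run out.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_retries:
            return status, headers, body
        sleep(backoff_delay(attempt, retry_after=headers.get("Retry-After")))
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.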
The recommended approach is exponential backoff with jitter. Wait progressively longer between retries, and add a small random delay to avoid synchronized retry storms across multiple workers.
Proactively monitoring your budget
Instead of waiting for a 429, read the X-RateLimit-Remaining header on each response and slow down when your budget runs low:
Proactive throttling (Python)
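A minimal sketch of this check, assuming a threshold of 50 remaining requests and a one-second pause. Both values and the function name `throttle_delay` are illustrative choices, not recommendations from the API.

```python
def throttle_delay(headers, threshold=50, pause=1.0):
    """Return seconds to sleep before the next request, based on remaining budget."""
    # Header names are case-insensitive, so normalise before looking one up.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("x-ratelimit-remaining")
    if remaining is None:
        return 0.0  # header absent: nothing to act on
    # Slow down once the remaining budget drops below the threshold.
    return pause if int(remaining) < threshold else 0.0
```

After each response, call `time.sleep(throttle_delay(response_headers))` before sending the next request, where `response_headers` is the header mapping from your HTTP client.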
Best practices
Cache responses where possible. Resources like workspaces, brand voices, and knowledge documents rarely change. Cache their IDs and names for the duration of your session rather than fetching them on every run.
Use pagination efficiently. Fetch only the pages you need. Use the maximum limit=100 when you need to process all items in a collection, rather than making many small requests.
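To illustrate the pagination advice, here is a rough sketch that walks a collection with the largest page size. The `fetch_page` callable and its `page` parameter are hypothetical stand-ins for whatever pagination mechanism the endpoint actually uses. Only `limit=100` comes from this page.

```python
def fetch_all(fetch_page, limit=100):
    """Collect every item using as few requests as possible.

    `fetch_page(page, limit)` is a hypothetical callable that returns
    the list of items on the given 1-based page.
    """
    items = []
    page = 1
    while True:
        batch = fetch_page(page=page, limit=limit)
        items.extend(batch)
        # A short (or empty) page means the collection is exhausted.
        if len(batch) < limit:
            return items
        page += 1
```

With 250 items this makes three requests at `limit=100`, compared with 25 or more at `limit=10`.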
Batch reads before writes. If your workflow reads several resources before creating or updating one, perform all the reads first. This groups your writes into a smaller time window and leaves more headroom for subsequent operations.
Avoid retrying 4xx errors other than 429. Every other 4xx status indicates a problem with the request itself (bad input, missing permissions, not found). Retrying an unchanged request will not help and only wastes your quota. Fix the request first, then send it again.
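This rule can be expressed as a small helper. It is a sketch only: the function name `should_retry` is hypothetical, and treating 5xx responses as retryable is an assumption that this page does not address.

```python
def should_retry(status):
    """Decide whether a response status is worth retrying."""
    if status == 429:
        return True  # rate limited: retry after waiting
    if 400 <= status < 500:
        return False  # problem with the request itself: fix it first
    # Assumption: server-side errors (5xx) may be transient.
    return status >= 500
```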