What is rate limiting and how is it communicated in REST APIs?

Answer

Rate limiting caps how many requests a client can make in a given time window, protecting the API from abuse and keeping usage fair across clients. Common algorithms include token bucket (allows short bursts up to the bucket capacity while enforcing an average rate), sliding window (accurate, no boundary spikes), and fixed window (simple, but a client can send up to twice the limit in a short span that straddles a window boundary). Rate limit status is communicated via conventional headers that are widely used but not part of the HTTP standard: X-RateLimit-Limit: 1000 (total allowed in the window), X-RateLimit-Remaining: 42 (left in the current window), X-RateLimit-Reset: 1609459200 (commonly a Unix timestamp for when the window resets, though some APIs send the number of seconds remaining instead). When the limit is exceeded, return 429 Too Many Requests, ideally with a Retry-After header giving the number of seconds to wait (an HTTP date is also a valid value). Apply different limits by tier (free vs paid), by endpoint (write operations cost more), and by IP for unauthenticated requests.
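
As a rough illustration of how the headers and the 429 response fit together, here is a minimal fixed-window sketch in Python. The class name FixedWindowLimiter, the check method, and the injectable clock parameter are all hypothetical names chosen for this example. It keeps counts in process memory and is not thread-safe; a real deployment would more likely use a shared store.

```python
import math
import time


class FixedWindowLimiter:
    """Hypothetical in-memory fixed-window limiter (single process, not thread-safe)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit                    # max requests per window
        self.window_seconds = window_seconds  # window length in seconds
        self.clock = clock                    # injectable so tests can control time
        self._counts = {}                     # client_id -> (window_start, count)

    def check(self, client_id):
        """Return (status_code, headers) for one request from client_id."""
        now = self.clock()
        # Windows are aligned to multiples of window_seconds since the Unix epoch.
        window_start = int(now // self.window_seconds) * self.window_seconds
        start, count = self._counts.get(client_id, (window_start, 0))
        if start != window_start:
            # A new window has begun, so the counter starts over.
            start, count = window_start, 0

        reset_at = window_start + self.window_seconds
        allowed = count < self.limit
        if allowed:
            count += 1
        self._counts[client_id] = (start, count)

        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(self.limit - count),
            "X-RateLimit-Reset": str(reset_at),  # Unix timestamp of the next window
        }
        if allowed:
            return 200, headers
        # Over the limit: tell the client how many seconds to wait.
        headers["Retry-After"] = str(max(1, math.ceil(reset_at - now)))
        return 429, headers


if __name__ == "__main__":
    limiter = FixedWindowLimiter(limit=2, window_seconds=60)
    for _ in range(3):
        status, headers = limiter.check("client-123")  # "client-123" is a made-up ID
        print(status, headers)
```

With a limit of 2, the third call inside the same window returns 429 along with a Retry-After value. Per-tier or per-endpoint limits could be approximated by keeping one limiter instance per tier or endpoint and choosing the key accordingly (user ID for authenticated requests, IP address otherwise), though that wiring is left out here.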