TooManyRequests (429)
Kubernetes · WARNING · Notable · API Error · HIGH confidence

API server rate limit exceeded

Production Risk

Controllers and operators may fail to reconcile desired state; deployments may be delayed.

What this means

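HTTP 429 from the Kubernetes API server indicates that the server is overloaded or that the client has exceeded the limits enforced by API Priority and Fairness (APF). APF classifies each request with a FlowSchema (matching on attributes such as user, group, verb, and resource) and assigns it to a priority level that holds a bounded share of the server's concurrency. Requests beyond that share are queued and, if the queue is full or the wait times out, rejected with 429. Large clusters with many controllers, CI/CD pipelines, and monitoring agents are particularly prone to this. The response includes a Retry-After header that tells the client how long to wait before retrying.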
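A client that receives a 429 should wait before retrying rather than resend immediately. The following is a minimal sketch of that decision, assuming the Retry-After value arrives as a whole number of seconds. The function name `retry_delay` and its `base` and `cap` parameters are hypothetical and not part of any Kubernetes client library.

```python
import random


def retry_delay(retry_after, attempt, base=1.0, cap=60.0, rng=random.random):
    """Return how many seconds to wait before retrying a 429 response.

    retry_after: raw Retry-After header value (string), or None if absent.
    attempt:     zero-based count of retries already made.
    base, cap:   illustrative backoff parameters (assumptions, tune as needed).
    """
    # Prefer the server's hint when it is a plain non-negative integer.
    if retry_after is not None:
        value = retry_after.strip()
        if value.isascii() and value.isdigit():
            return min(float(value), cap)
    # Otherwise fall back to capped exponential backoff with full jitter,
    # so that many clients do not all retry at the same moment.
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

The jitter matters in large clusters: if every throttled controller retried after exactly the same delay, the retries would arrive as another burst.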

Why it happens
  1. High-frequency kubectl polling in scripts or CI/CD pipelines
  2. Operator or controller using watch-and-reconcile at too high a rate
  3. Monitoring or audit tooling making excessive LIST calls
  4. APF flow control misconfigured with insufficient shares for the workload type
How to reproduce

kubectl calls or controller reconcile loops return 429; operations are delayed.

trigger — this will error
kubectl get pods
# Error from server (TooManyRequests): the server has received too many requests
# and is asking clients to slow down, please wait and retry later

kubectl get flowschemas
kubectl get prioritylevelconfigurations

expected output

Error from server (TooManyRequests): the server has received too many requests and is asking clients to slow down

Fix 1

Use watch instead of polling

WHEN Scripts or tools are polling the API at high frequency

# Instead of re-running `kubectl get pods` in a loop, open a single watch
kubectl get pods --watch

# In client-go, use Informers with shared cache
# rather than direct List/Get calls in tight loops

Why this works

Watches establish a single long-lived connection and receive pushed updates, eliminating repeated polling.
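The informer pattern mentioned above builds on this: watch events are applied to a local cache, and reads are served from that cache instead of the API server. The following is a minimal sketch of the idea, not client-go's implementation. The `PodCache` class is hypothetical, and the event list is hand-written sample data standing in for a real watch stream.

```python
class PodCache:
    """Local view of pods kept current by watch events (informer-style)."""

    def __init__(self):
        self._pods = {}

    def apply(self, event):
        """Apply one watch event of the form {"type": ..., "object": {...}}."""
        obj = event["object"]
        key = (obj["metadata"]["namespace"], obj["metadata"]["name"])
        if event["type"] in ("ADDED", "MODIFIED"):
            self._pods[key] = obj
        elif event["type"] == "DELETED":
            self._pods.pop(key, None)

    def get(self, namespace, name):
        """Read from the local cache; no API request is made."""
        return self._pods.get((namespace, name))

    def __len__(self):
        return len(self._pods)


def pod(name, phase):
    """Build a tiny pod-shaped dict for the sample events below."""
    return {"metadata": {"namespace": "default", "name": name},
            "status": {"phase": phase}}


# Sample events (made up for illustration) in the order a watch might deliver them.
events = [
    {"type": "ADDED", "object": pod("web", "Pending")},
    {"type": "ADDED", "object": pod("job", "Running")},
    {"type": "MODIFIED", "object": pod("web", "Running")},
    {"type": "DELETED", "object": pod("job", "Running")},
]

cache = PodCache()
for event in events:
    cache.apply(event)

print(cache.get("default", "web")["status"]["phase"])
```

However many times the controller calls `get`, the API server sees only the one watch connection.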

Fix 2

Review and adjust API Priority and Fairness

WHEN Legitimate workloads are being rate-limited

kubectl get flowschemas
kubectl get prioritylevelconfigurations
kubectl describe prioritylevelconfiguration workload-high

Why this works

APF configuration controls how much of the API server's concurrency each priority level receives; raising a level's concurrency shares allows more throughput for critical workloads, at the expense of the other levels.
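To see what a change in shares does, it can help to work through the arithmetic. The Kubernetes API reference gives a priority level's nominal concurrency limit as ceil(ServerCL * shares / sum of all shares), where ServerCL is the server's total concurrency limit. The sketch below applies that formula. The share values and the server limit of 600 are illustrative assumptions, not readings from a real cluster, and `everything-else` is a hypothetical stand-in for all remaining priority levels.

```python
def nominal_concurrency(server_limit, shares):
    """Split server_limit across priority levels in proportion to their shares.

    Implements ceil(server_limit * share / total_shares) with integer math.
    """
    total = sum(shares.values())
    return {name: -(-server_limit * share // total)
            for name, share in shares.items()}


# Assumed total concurrency (for example, the sum of --max-requests-inflight
# and --max-mutating-requests-inflight on the API server).
SERVER_LIMIT = 600

# Illustrative shares only; inspect your own cluster for the real values.
shares_before = {"workload-high": 40, "workload-low": 100, "everything-else": 105}
shares_after = dict(shares_before, **{"workload-high": 80})

before = nominal_concurrency(SERVER_LIMIT, shares_before)
after = nominal_concurrency(SERVER_LIMIT, shares_after)

print(before["workload-high"], "->", after["workload-high"])
print(before["workload-low"], "->", after["workload-low"])
```

Because the limit is proportional, doubling one level's shares does not double its concurrency, and every other level's limit shrinks slightly.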

What not to do

