API server rate limit exceeded

Production Risk

Controllers and operators may fail to reconcile desired state; deployments may be delayed.

What this means

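HTTP 429 from the Kubernetes API server means the server is shedding load and has rejected the request, most commonly through API Priority and Fairness (APF). APF matches each request to a FlowSchema based on attributes such as user, group, verb, and resource, assigns it to a priority level, and limits how many requests each level may execute concurrently; requests that can be neither executed nor queued are rejected with 429. Large clusters with many controllers, CI/CD pipelines, and monitoring agents are particularly prone to this. The response includes a Retry-After header that tells the client how long to wait before retrying.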
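For scripts that shell out to kubectl, a small retry wrapper is one way to back off when a 429 comes back. The sketch below is illustrative only: `retry_on_429`, `MAX_ATTEMPTS`, and `INITIAL_DELAY` are hypothetical names, and because a script calling kubectl does not normally see the Retry-After header, it falls back to exponential backoff and detects the error by matching the TooManyRequests text.

```shell
#!/usr/bin/env bash
# retry_on_429 is a hypothetical helper, not part of kubectl.
# It reruns a command only when its output mentions TooManyRequests.
retry_on_429() {
  local max_attempts="${MAX_ATTEMPTS:-5}"
  local delay="${INITIAL_DELAY:-1}"   # seconds; doubled after each 429
  local attempt=1 out rc

  while :; do
    # Capture stdout and stderr together so the error text can be inspected.
    if out="$("$@" 2>&1)"; then
      printf '%s\n' "$out"
      return 0
    else
      rc=$?
    fi

    # Failures other than TooManyRequests are returned immediately.
    case "$out" in
      *TooManyRequests*) ;;
      *) printf '%s\n' "$out" >&2; return "$rc" ;;
    esac

    # Give up once the attempt budget is spent.
    if [ "$attempt" -ge "$max_attempts" ]; then
      printf '%s\n' "$out" >&2
      return "$rc"
    fi

    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

Usage would look like `retry_on_429 kubectl get pods -n default`. Backing off only reduces pressure from this one caller; it does not address a script that issues too many requests in the first place.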

Why it happens

1. High-frequency kubectl polling in scripts or CI/CD pipelines
2. Operator or controller using watch-and-reconcile at too high a rate
3. Monitoring or audit tooling making excessive LIST calls
4. APF flow control misconfigured with insufficient shares for the workload type
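To narrow down which of these causes applies, the API server's APF metrics can be grouped by priority level. The sketch below assumes the caller is allowed to read `/metrics` and that the server exposes the `apiserver_flowcontrol_rejected_requests_total` counter with a `priority_level` label; `rejected_by_priority_level` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# rejected_by_priority_level is a hypothetical helper.
# It reads Prometheus text-format metrics on stdin and prints
# "<priority level> <rejected request count>" per line, sorted by name.
rejected_by_priority_level() {
  awk '
    /^apiserver_flowcontrol_rejected_requests_total[{]/ {
      if (match($0, /priority_level="[^"]*"/)) {
        # Skip the 16 characters of priority_level=" and drop the closing quote.
        level = substr($0, RSTART + 16, RLENGTH - 17)
        total[level] += $NF
      }
    }
    END { for (level in total) printf "%s %d\n", level, total[level] }
  ' | sort
}
```

A typical invocation would be `kubectl get --raw /metrics | rejected_by_priority_level`. A priority level with a steadily growing count is the one whose clients are being throttled.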

How to reproduce

kubectl calls or controller reconcile loops return 429; operations are delayed.

trigger (this will error)

kubectl get pods
# Error from server (TooManyRequests): the server has received too many requests
# and is asking clients to slow down, please wait and retry later

# Inspect the APF objects that classify and limit requests
kubectl get flowschemas
kubectl get prioritylevelconfigurations

expected output

Error from server (TooManyRequests): the server has received too many requests and is asking clients to slow down

Fix 1

Use watch instead of polling

WHEN: Scripts or tools are polling the API at high frequency

# Instead of re-running `kubectl get pods` in a loop, stream changes over one connection
kubectl get pods --watch

# In client-go, use Informers with a shared cache
# rather than direct List/Get calls in tight loops

Why this works

Watches establish a single long-lived connection and receive pushed updates, eliminating repeated polling.
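A common source of polling is a script that loops on `kubectl get` until a pod becomes Ready. As a sketch, `kubectl wait` can replace such a loop; to my understanding it blocks on a watch rather than re-issuing GET requests. The function name `wait_for_pod_ready`, its defaults, and the overridable `KUBECTL` variable are assumptions made for this example.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with a stub in tests); defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# Polling version to avoid: one GET per second until the pod reports Ready.
#   until kubectl get pod "$pod" -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' \
#       | grep -q True; do sleep 1; done

# wait_for_pod_ready is a hypothetical helper around `kubectl wait`.
# Usage: wait_for_pod_ready POD [NAMESPACE] [TIMEOUT]
wait_for_pod_ready() {
  local pod="$1" namespace="${2:-default}" timeout="${3:-120s}"
  # Blocks until the Ready condition holds or the timeout expires.
  "$KUBECTL" wait --for=condition=Ready "pod/${pod}" \
    --namespace "$namespace" --timeout="$timeout"
}
```

For example, `wait_for_pod_ready web-0 prod 60s` returns non-zero if the pod is not Ready within 60 seconds, so the caller can still fail fast without a polling loop.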

Fix 2

Review and adjust API Priority and Fairness

WHEN: Legitimate workloads are being rate-limited

kubectl get flowschemas
kubectl get prioritylevelconfigurations
kubectl describe prioritylevelconfiguration workload-high

Why this works

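APF divides the API server's request concurrency among priority levels according to their configured shares. Giving a priority level more shares, or routing critical clients to a higher-priority level with a FlowSchema, leaves more capacity for critical workloads.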
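One possible adjustment is to route a critical client to a higher priority level with its own FlowSchema. The sketch below prints such a manifest so it can be reviewed before applying. It assumes a cluster that serves `flowcontrol.apiserver.k8s.io/v1` (older clusters use a beta version of the API); `render_flowschema`, the example names, and the `matchingPrecedence` value are hypothetical and need to be adapted.

```shell
#!/usr/bin/env bash
# render_flowschema is a hypothetical helper; all argument values are examples.
# Usage: render_flowschema NAME SA_NAMESPACE SA_NAME [PRIORITY_LEVEL]
render_flowschema() {
  local name="$1" sa_namespace="$2" sa_name="$3" priority_level="${4:-workload-high}"
  cat <<EOF
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: ${name}
spec:
  priorityLevelConfiguration:
    name: ${priority_level}
  # Lower values are matched first; 1000 is an assumed value for this sketch.
  matchingPrecedence: 1000
  distinguisherMethod:
    type: ByUser
  rules:
  - subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: ${sa_name}
        namespace: ${sa_namespace}
    resourceRules:
    - verbs: ["*"]
      apiGroups: ["*"]
      resources: ["*"]
      namespaces: ["*"]
      clusterScope: true
EOF
}
```

The output can be checked first with `render_flowschema my-operator operators my-operator-sa | kubectl apply --dry-run=server -f -` and applied by dropping the dry-run flag. Moving one client to a higher level takes capacity from others at that level, so the rejected-request counts are worth watching afterwards.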

At a glance

Platform: Kubernetes Pod States & Errors

Code: TooManyRequests (429)

Class: API Error

Severity: WARNING

Tier: Notable

Confidence: HIGH
