Job pod failure matched a pod failure policy rule

Q: How do I fix the Kubernetes PodFailurePolicyMatch error (Job pod failure matched a pod failure policy rule)?

Review and adjust the podFailurePolicy rules: inspecting the policy shows which exit codes trigger FailJob and which are retried, so you can tune the rules to match your application's exit code semantics.

Add an Ignore rule for node-level disruptions: ignoring failures of pods that carry the DisruptionTarget condition prevents node evictions from consuming the Job's backoffLimit.

Production Risk

Medium: a misconfigured FailJob rule can permanently fail a batch Job on the first transient error.

What this means

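A Job pod terminated, and a container exit code or pod condition matched a rule in the Job's podFailurePolicy. Depending on the matched rule's action, the Job controller either counts the failure toward backoffLimit and retries (Count), replaces the pod without counting the failure (Ignore), or marks the whole Job as failed without further retries (FailJob).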

Why it happens

1. An application container exited with an exit code listed in a podFailurePolicy rule (onExitCodes).
2. A pod condition (e.g. DisruptionTarget) matched a policy rule (onPodConditions) with action FailJob or Ignore.
3. A policy rule with action Count matched, so the failure was counted against the Job's backoffLimit.
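
The three causes above correspond to the rule shapes a policy can contain. The fragment below is a minimal sketch with one rule per action; the exit codes 42 and 1 are hypothetical placeholders, not values taken from any particular Job.

```yaml
# Fragment of a Job spec (batch/v1). Rules are evaluated in order and
# the first matching rule decides the action.
podFailurePolicy:
  rules:
  # Cause 1: a container exit code matches, so the whole Job fails at once.
  - action: FailJob
    onExitCodes:
      operator: In
      values: [42]          # hypothetical non-retriable exit code
  # Cause 2: a pod condition matches, so the pod is replaced without being counted.
  - action: Ignore
    onPodConditions:
    - type: DisruptionTarget
  # Cause 3: the failure is counted toward backoffLimit (the default handling).
  - action: Count
    onExitCodes:
      operator: In
      values: [1]           # hypothetical retriable exit code
```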

How to reproduce

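A Job pod fails, and the Job controller evaluates the podFailurePolicy rules in order against the pod's container exit codes and conditions. The first matching rule determines the action; if no rule matches, the default backoffLimit handling applies.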

trigger — this will error

kubectl describe job my-job
# Events: PodFailurePolicyMatch rule matched, action: FailJob

expected output

Status:
  Failed: 1
  Conditions:
    Type: Failed
    Reason: PodFailurePolicyMatch

Fix 1

Review and adjust podFailurePolicy rules

WHEN Job is failing faster than expected due to a policy match

kubectl get job my-job -o yaml | grep -A 20 podFailurePolicy

Why this works

Inspecting the policy shows which exit codes trigger FailJob vs. retry so you can tune the rules to match your application's exit code semantics.
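
As an illustration of such tuning, one possible adjustment is to narrow a FailJob rule so that only an exit code the application reserves for permanent errors ends the Job. The sketch below assumes a hypothetical container named main that exits with 42 on unrecoverable input; any other failure falls through to the normal backoffLimit retries.

```yaml
podFailurePolicy:
  rules:
  - action: FailJob
    onExitCodes:
      containerName: main   # hypothetical name; restricts the check to this container
      operator: In          # In or NotIn
      values: [42]          # hypothetical "permanent error" code; 0 is not allowed with In
```

Keeping the values list short and specific is what keeps transient failures on the retry path.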

Fix 2

Add an Ignore rule for node-level disruptions

WHEN Node evictions are counting against job retries

podFailurePolicy:
  rules:
  - action: Ignore
    onPodConditions:
    - type: DisruptionTarget

Why this works

Ignoring DisruptionTarget failures prevents node evictions from consuming the Job's backoffLimit.
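
For context, the podFailurePolicy fragment sits under the Job's spec, next to backoffLimit. The manifest below is a minimal sketch of a complete Job that uses the Ignore rule; the image and command are hypothetical placeholders. Note that podFailurePolicy is only accepted when the pod template sets restartPolicy: Never.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job                 # same placeholder name as in the kubectl commands
spec:
  backoffLimit: 6              # consumed only by failures that are counted
  podFailurePolicy:
    rules:
    - action: Ignore           # disrupted pods are replaced without being counted
      onPodConditions:
      - type: DisruptionTarget
  template:
    spec:
      restartPolicy: Never     # required when podFailurePolicy is set
      containers:
      - name: main             # hypothetical container
        image: busybox:1.36    # hypothetical image
        command: ["sh", "-c", "echo working && sleep 30"]
```

Saved to a file (for example job.yaml, an assumed name), this could be applied with kubectl apply -f job.yaml.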

Version notes

Kubernetes 1.25

podFailurePolicy introduced as alpha.

Kubernetes 1.26

podFailurePolicy promoted to beta.

Kubernetes 1.31

podFailurePolicy promoted to stable (GA).

At a glance

Platform: Kubernetes Pod States & Errors

Code: PodFailurePolicyMatch

Class: Workloads

Severity: WARNING

Tier: Notable

Confidence: HIGH
