DevOps Dictionary

Kubernetes PodDisruptionBudget (PDB)

A PodDisruptionBudget (PDB) is a Kubernetes policy object that limits voluntary pod evictions so your application keeps enough replicas running during maintenance, node drains, cluster autoscaling, or upgrades. In practical terms, a PDB tells Kubernetes, “do not evict too many matching pods at the same time.”

What a PodDisruptionBudget does

A PDB protects availability during planned or controlled disruptions. It applies to pods selected by labels, usually pods managed by a Deployment, StatefulSet, or ReplicaSet.

Common voluntary disruptions include:

  • Running kubectl drain before replacing or patching a node
  • Cluster autoscaler removing underused nodes
  • Managed Kubernetes upgrades that move workloads between nodes
  • Maintenance workflows that use the Kubernetes eviction API

For example, if you run a web API with 5 replicas and define a PDB with minAvailable: 4, Kubernetes can evict only 1 matching pod at a time through the eviction API.

How it works

A PDB uses a label selector to find the pods it protects. It then calculates how many of those pods must remain available before Kubernetes allows another voluntary eviction.

You define one of these fields:

  • minAvailable: the minimum number or percentage of pods that must stay available.
  • maxUnavailable: the maximum number or percentage of pods that may be unavailable.

A single PDB can set only one of these fields; the API rejects a spec that sets both. For most stateless services, maxUnavailable is often easier to reason about. For quorum-based systems, minAvailable may be clearer.

Simple example

This PDB protects pods labeled app: payments-api and requires at least 2 of them to stay available:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: payments-api

If the Deployment has 3 replicas and all of them are ready, Kubernetes can voluntarily evict only 1 pod at a time. If only 2 pods are healthy, the eviction API refuses to evict another matching pod, so a node drain waits and retries until availability improves or the drain times out.
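
The same budget can also be written in terms of maxUnavailable. The sketch below is an alternative to the manifest above, not an addition to it. It reuses the app: payments-api label and assumes the same 3-replica Deployment.

```yaml
# Alternative form of the payments-api budget.
# Apply this OR the minAvailable version, not both.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-api-pdb
spec:
  maxUnavailable: 1        # at most 1 matching pod may be unavailable at a time
  selector:
    matchLabels:
      app: payments-api
```

With 3 replicas the two forms behave the same. They diverge when the workload scales: maxUnavailable: 1 still allows one eviction at a time at 10 replicas, while minAvailable: 2 would allow up to 8.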

Common use cases

  • Node upgrades: Keep critical services available while nodes are cordoned, drained, and replaced. PDBs are especially useful when planning Kubernetes upgrades for startup environments.
  • Cluster autoscaling: Stop the autoscaler from removing a node if doing so would take too many replicas offline.
  • Stateful workloads: Protect databases, queues, and consensus systems that need a minimum number of members available.
  • Platform workloads: Reduce downtime risk for ingress controllers, DNS components, controllers, and internal developer platform services.
  • Production apps: Add a safety guard for APIs, workers, and background services during routine infrastructure changes.
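
For the stateful case, a quorum-oriented budget might look like the sketch below. The name and the app: kv-store label are hypothetical, and it assumes a 3-member StatefulSet that needs 2 members to keep quorum.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kv-store-pdb        # hypothetical name
spec:
  minAvailable: 2           # quorum for an assumed 3-member cluster
  selector:
    matchLabels:
      app: kv-store         # hypothetical label; must match the StatefulSet's pod labels
```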

Key parts of a PDB

  • Selector: Matches the pods covered by the budget. If labels are wrong, the PDB may protect nothing or the wrong workload.
  • Desired availability: Defined with minAvailable or maxUnavailable.
  • Eviction API: Kubernetes checks the PDB when a tool requests pod eviction through this API.
  • Healthy pods: Kubernetes counts a matching pod as healthy when its Ready condition is true, so readiness probes directly affect how many disruptions the budget allows.
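
These parts show up in the status that Kubernetes maintains on each PDB. The fragment below is an illustrative excerpt of what kubectl get pdb payments-api-pdb -o yaml might report. The values are assumed for a workload with 3 ready replicas and minAvailable: 2; they are not captured from a real cluster.

```yaml
status:
  expectedPods: 3         # pods counted by this budget
  currentHealthy: 3       # matching pods that are currently healthy
  desiredHealthy: 2       # minimum number of healthy pods the budget requires
  disruptionsAllowed: 1   # evictions the API will currently permit
```

When disruptionsAllowed is 0, any eviction request for a matching pod is refused.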

What PDBs do not protect against

A PDB does not guarantee uptime in every failure scenario. It controls voluntary evictions, not all pod terminations.

A PDB does not prevent:

  • Node hardware failure
  • Kernel crashes or cloud VM failure
  • Pod crashes caused by application bugs
  • OOM kills, or kubelet evictions triggered by node memory pressure
  • Direct pod deletion that bypasses the eviction API
  • Bad rollout settings that make too many new pods unhealthy

You still need enough replicas, readiness probes, resource requests, topology spread, and safe rollout settings. If you manage Kubernetes objects through IaC, include PDBs alongside Deployments and Services when you deploy Kubernetes resources using Terraform.

PDB vs Deployment rolling update settings

A PDB and a Deployment rolling update strategy solve related but different problems.

  • Deployment maxUnavailable: Controls how many pods the Deployment can make unavailable during an application rollout.
  • PDB maxUnavailable: Controls how many matching pods can be voluntarily evicted by cluster operations.

For example, a Deployment might allow 1 unavailable pod during a rollout, while the PDB also allows 1 unavailable pod during node maintenance. Configure both with your real replica count in mind.
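
A minimal sketch of the two settings side by side, assuming a hypothetical web-api Deployment with 3 replicas. The names, label, and image are placeholders.

```yaml
# Rollout limit: lives on the Deployment and applies to application rollouts.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api                 # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1         # a rollout may take 1 pod out of service at a time
      maxSurge: 1               # a rollout may add 1 pod above the replica count
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
    spec:
      containers:
        - name: web-api
          image: registry.example.com/web-api:1.0.0   # placeholder image
---
# Eviction limit: lives on the PDB and applies to drains and other evictions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
spec:
  maxUnavailable: 1             # a drain may evict 1 matching pod at a time
  selector:
    matchLabels:
      app: web-api              # same label as the Deployment's pod template
```

The Deployment controller is not limited by the PDB during a rollout, but pods that a rollout makes unavailable still count against the budget, so a drain that overlaps a rollout may have to wait.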

Real-world example

Suppose you run Apache Airflow on EKS with webserver, scheduler, and worker components. The scheduler is critical because it coordinates DAG execution. If you run 2 scheduler replicas, you may add a PDB with minAvailable: 1 so maintenance does not evict both at the same time. This is a practical addition when you deploy Apache Airflow on AWS EKS.
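
A sketch of that scheduler budget follows. The name, namespace, and component: scheduler label are assumptions; the labels on your scheduler pods depend on how Airflow was installed, so check them with kubectl get pods --show-labels before relying on this selector.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: airflow-scheduler-pdb   # hypothetical name
  namespace: airflow            # assumed namespace
spec:
  minAvailable: 1               # with 2 scheduler replicas, only 1 may be evicted at a time
  selector:
    matchLabels:
      component: scheduler      # assumed label; verify against your scheduler pods
```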

Practical guidance

  • Set at least 2 replicas before adding a strict PDB. With a single replica, minAvailable: 1 or maxUnavailable: 0 leaves no allowed disruptions, so node drains block.
  • Use maxUnavailable: 1 for many small stateless services with 2 or more replicas.
  • Use percentages carefully. For small replica counts, rounding can surprise you.
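  • Check PDB status with kubectl get pdb before maintenance and confirm that ALLOWED DISRUPTIONS is at least 1 for the workloads on the nodes you plan to drain.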
  • Make sure readiness probes reflect whether the pod can actually serve traffic.
  • Avoid one broad PDB that matches unrelated workloads. Keep selectors precise.
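
To make the rounding point concrete: Kubernetes rounds percentage values up to a whole number of pods. The sketch below reuses the app: payments-api label from the earlier example and assumes 7 replicas.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-api-pdb
spec:
  minAvailable: "50%"    # with 7 replicas, 3.5 rounds up: 4 pods must stay available
  selector:
    matchLabels:
      app: payments-api
```

With a single replica, the same 50% rounds up to 1 required pod, which leaves no room for eviction.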

Related Kubernetes concepts

  • Deployment: Manages stateless replicas and rolling updates.
  • StatefulSet: Manages stable identities for stateful pods.
  • Readiness probe: Tells Kubernetes whether a pod should receive traffic and count as available.
  • Node drain: Safely evicts pods from a node before maintenance.
  • Cluster autoscaler: May remove nodes when workloads can move elsewhere without violating PDBs.

A PodDisruptionBudget is a small Kubernetes object with a clear purpose: protect service availability during planned disruption. Used well, it gives platform and SRE teams safer maintenance workflows without hiding real capacity or reliability problems.
