Kubernetes StatefulSet

Kubernetes StatefulSet definition

A StatefulSet is a Kubernetes workload API object for running stateful applications. It manages pods that need a stable network identity, stable persistent storage, and ordered deployment or scaling.

In practical terms, a StatefulSet is useful when each pod is not interchangeable. A database replica, message broker node, or distributed system member often needs its own name, storage volume, and startup order. StatefulSet gives Kubernetes a way to manage those requirements.

What a StatefulSet does

A StatefulSet manages a set of pods with predictable identities. If you create a StatefulSet named postgres with 3 replicas, Kubernetes creates pods named:

  • postgres-0
  • postgres-1
  • postgres-2

Those names stay consistent across rescheduling. If postgres-1 crashes and gets recreated on another node, it still comes back as postgres-1. When the StatefulSet defines a volumeClaimTemplate, the recreated pod also reattaches to its own persistent volume.
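The naming rule is simple enough to sketch in a few lines. The helper below is hypothetical (it is not part of any Kubernetes client library) and assumes ordinals start at 0, which is the default.

```python
def statefulset_pod_names(name: str, replicas: int) -> list[str]:
    """Return the pod names a StatefulSet would assign.

    Assumption: ordinals start at 0 and run to replicas - 1.
    """
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


if __name__ == "__main__":
    for pod in statefulset_pod_names("postgres", 3):
        print(pod)
```

Running it for postgres with 3 replicas prints the three names listed above.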

How StatefulSet works

A StatefulSet usually works with three Kubernetes features:

  • Pods: The running instances of your application.
  • PersistentVolumeClaims: Storage claims that give each pod its own persistent disk or volume.
  • Headless Service: A Service with clusterIP: None that provides stable DNS records for each pod.

For example, with a headless Service named postgres in the prod namespace, pod DNS names may look like:

  • postgres-0.postgres.prod.svc.cluster.local
  • postgres-1.postgres.prod.svc.cluster.local
  • postgres-2.postgres.prod.svc.cluster.local
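As a rough illustration, the per-pod DNS pattern can be written as a small function. The function name is hypothetical, and the sketch assumes the cluster domain is cluster.local, which is the common default but can be configured differently per cluster.

```python
def pod_dns_names(
    statefulset: str,
    service: str,
    namespace: str,
    replicas: int,
    cluster_domain: str = "cluster.local",  # assumption: default cluster domain
) -> list[str]:
    """Build the stable DNS name of each pod behind a headless Service."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


if __name__ == "__main__":
    for host in pod_dns_names("postgres", "postgres", "prod", 3):
        print(host)
```

An application can use these names in its peer or replica configuration instead of pod IP addresses, which change when a pod is rescheduled.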

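StatefulSet also handles ordering. By default (the OrderedReady pod management policy), Kubernetes creates pods in order, starting with -0, then -1, then -2, and waits for each pod to be Running and Ready before creating the next. During scale-down, it removes pods in reverse order. Deleting the StatefulSet itself gives no ordering guarantee. To terminate pods in order, scale the StatefulSet to 0 replicas before deleting it.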
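The ordering can be modeled roughly as follows. This is a simplified sketch of the default behavior only: it ignores the wait for each pod to become ready, and the function names are hypothetical.

```python
def creation_order(current: int, target: int) -> list[int]:
    """Ordinals created, lowest first, when scaling up from current to target."""
    return list(range(current, target))


def removal_order(current: int, target: int) -> list[int]:
    """Ordinals removed, highest first, when scaling down from current to target."""
    return list(range(current - 1, target - 1, -1))


if __name__ == "__main__":
    print("scale 0 -> 3 creates:", creation_order(0, 3))
    print("scale 3 -> 1 removes:", removal_order(3, 1))
```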

Common use cases

StatefulSets are commonly used for applications that rely on stable identity or persistent data, such as:

  • Databases such as PostgreSQL, MySQL, MongoDB, or Cassandra
  • Message brokers such as Kafka or RabbitMQ
  • Search systems such as Elasticsearch or OpenSearch
  • Distributed coordination systems such as ZooKeeper or etcd
  • Applications that require one persistent volume per replica

For data platforms running on Kubernetes, a StatefulSet may be part of the wider deployment design. For example, an Apache Airflow deployment on AWS EKS may use Kubernetes workloads together with external databases, queues, and persistent storage depending on the architecture.

StatefulSet vs Deployment

A Deployment is the usual choice for stateless applications. A StatefulSet is the better fit when pod identity and storage must stay consistent.

  • Deployment: Creates interchangeable pods. Any pod can replace another pod.
  • StatefulSet: Creates pods with stable names, stable storage, and predictable ordering.

For example, an API service with 10 identical replicas usually belongs in a Deployment. A 3-node Kafka cluster usually belongs in a StatefulSet because each broker has its own identity and storage.

Benefits

  • Stable pod names: Each replica gets a predictable ordinal name, such as app-0.
  • Stable network identity: Each pod can have a consistent DNS name through a headless Service.
  • Persistent storage per pod: Each replica can keep its own volume across restarts and rescheduling.
  • Ordered rollout behavior: Kubernetes can create, update, and scale down pods in a predictable sequence.

Tradeoffs and limitations

  • More operational complexity: You need to think carefully about storage, backups, recovery, and upgrade order.
  • Scaling may require application knowledge: Adding or removing replicas can affect quorum, replication, partitions, or cluster membership.
  • Storage is not deleted automatically by default: PersistentVolumeClaims created for a StatefulSet remain after the StatefulSet is scaled down or deleted, which is useful for data safety but requires cleanup planning.
  • Rolling updates can be slower: Stateful workloads often need cautious, ordered updates to avoid data loss or downtime.
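To see why scaling can need application knowledge, consider a system that uses simple majority quorum. That is an assumption for this sketch; the quorum rule of your application may differ, and the function names are hypothetical.

```python
def majority_quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    if members < 1:
        raise ValueError("members must be at least 1")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority remains."""
    return members - majority_quorum(members)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(
            f"{n} members: quorum={majority_quorum(n)}, "
            f"tolerates {tolerated_failures(n)} failure(s)"
        )
```

Under this rule, going from 3 to 4 replicas raises the quorum from 2 to 3 without tolerating any additional failure, so changing the replica count is not always an improvement.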

Simple real-world example

Suppose you run a 3-node PostgreSQL-compatible database cluster on Kubernetes. Each node needs its own disk because it stores data locally. The primary and replicas also need stable names so they can identify each other during replication.

A StatefulSet can create:

  • db-0 with volume data-db-0
  • db-1 with volume data-db-1
  • db-2 with volume data-db-2

If db-1 moves to another Kubernetes node after a failure, it can keep the same identity and reconnect to its own storage. That behavior is the main reason to use a StatefulSet instead of a Deployment.
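A minimal sketch of what such a StatefulSet could look like, built as a Python dictionary and printed as JSON. The image name, mount path, and storage size are placeholders chosen for illustration, and the sketch assumes a headless Service named db already exists.

```python
import json


def build_statefulset(
    name: str = "db",
    replicas: int = 3,
    image: str = "example.com/db:1.0",  # placeholder image, not a real one
    storage: str = "10Gi",              # placeholder size
) -> dict:
    """Build a minimal StatefulSet manifest with one volume per replica."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # assumes a headless Service with this name
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # placeholder mount path
                            "volumeMounts": [
                                {"name": "data", "mountPath": "/var/lib/data"}
                            ],
                        }
                    ]
                },
            },
            # One PersistentVolumeClaim is created per pod from this template.
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": storage}},
                    },
                }
            ],
        },
    }


def pvc_names(manifest: dict) -> list[str]:
    """Claim names follow <template name>-<pod name>."""
    name = manifest["metadata"]["name"]
    template = manifest["spec"]["volumeClaimTemplates"][0]["metadata"]["name"]
    return [f"{template}-{name}-{i}" for i in range(manifest["spec"]["replicas"])]


if __name__ == "__main__":
    print(json.dumps(build_statefulset(), indent=2))
```

Because kubectl accepts JSON as well as YAML, the output could be piped to kubectl apply -f - in a test cluster. Real database images usually need more configuration than shown here, such as credentials, probes, and resource requests.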

Operational considerations

Before using StatefulSet in production, plan these details:

  • Storage class: Choose the right storage backend for your latency, durability, and availability needs.
  • Backup and restore: Test restore procedures before you need them during an incident.
  • Pod disruption budgets: Prevent too many replicas from going down at once during node maintenance.
  • Readiness probes: Make sure Kubernetes sends traffic only to pods that are ready.
  • Upgrade strategy: Validate application-specific upgrade steps, especially for databases and clustered systems.
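As one example from the list above, a pod disruption budget for a 3-replica workload might look like the sketch below. The object name and the app label are hypothetical, and the right maxUnavailable value depends on how many replicas your application can lose safely.

```python
import json


def build_pdb(
    name: str = "db-pdb",      # hypothetical object name
    app_label: str = "db",     # hypothetical label; must match the pod labels
    max_unavailable: int = 1,
) -> dict:
    """Build a PodDisruptionBudget limiting voluntary disruptions."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # At most this many matching pods may be evicted at the same time.
            "maxUnavailable": max_unavailable,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_pdb(), indent=2))
```

A budget like this applies to voluntary disruptions such as node drains. It does not protect against node crashes or other involuntary failures.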

If your team manages manifests through infrastructure as code, you can define StatefulSets alongside other Kubernetes objects in Terraform. If your StatefulSet depends on cloud resources such as databases, IAM roles, or object storage, tools like Crossplane can help manage those dependencies from inside Kubernetes.

StatefulSet and Kubernetes upgrades

Stateful workloads need extra care during cluster upgrades. Node drains, storage driver changes, and version compatibility can affect availability. For startups and lean platform teams, it helps to test upgrades in a non-production cluster, review PodDisruptionBudgets, and confirm backup coverage before touching production. This is especially important when following practical Kubernetes upgrade steps for small teams.

In short

Use a Kubernetes StatefulSet when your application needs stable pod identity, persistent storage per replica, and ordered lifecycle management. Use a Deployment when your pods are stateless and interchangeable. For production stateful systems, treat the StatefulSet as one part of the design. Storage, backups, failover behavior, and upgrade procedures matter just as much as the Kubernetes object itself.
