DevOps Dictionary

Scaling

Scaling is the practice of adjusting a system’s compute capacity to match workload demand, so applications stay responsive and cost-effective as traffic, data volume, or job throughput changes. In DevOps and platform engineering, scaling typically works by adding or removing resources such as application instances, containers, or database replicas based on signals like CPU, memory, request rate, or queue length. It can be done vertically (bigger machines) or horizontally (more machines). With scaling, teams can absorb spikes and planned growth while maintaining service levels and predictable performance. Without it, systems tend to either overload during peaks (timeouts, failed deployments, incident churn) or run overprovisioned during quiet periods (wasted spend). This gap exists because demand is variable, and automation is needed to translate real-time load into safe, timely capacity changes.
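The horizontal case above can be sketched as a small decision function: scale the replica count in proportion to observed load versus a target, clamped to safe bounds. This is a simplified illustration (the same general shape of formula as Kubernetes' Horizontal Pod Autoscaler uses), not any specific tool's implementation; the function name and parameters are hypothetical.

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Proportional horizontal-scaling decision (illustrative sketch).

    If the observed metric (e.g. average CPU) is above the target,
    scale out; if below, scale in. Clamp to [min_replicas, max_replicas]
    so automation stays within safe bounds.
    """
    raw = current_replicas * (current_metric / target_metric)
    return max(min_replicas, min(max_replicas, math.ceil(raw)))

# 4 replicas at 90% CPU against a 60% target -> scale out to 6
print(desired_replicas(4, current_metric=90, target_metric=60))  # 6
# Load drops to 30% -> scale in to 2
print(desired_replicas(4, current_metric=30, target_metric=60))  # 2
```

Real autoscalers add stabilization windows and cooldowns on top of a rule like this, so brief metric spikes don't cause thrashing between scale-out and scale-in.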

A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z