Q11 of 38 · Performance

What's the difference between load testing and stress testing in how you actually execute them?

Performance · Mid · performance · load-testing · stress-testing · ramp · execution

Short answer

Load testing runs at the expected target RPS and sustains it through a long hold; the result is pass/fail against SLOs, and no surprises are wanted. Stress testing ramps past the target until the system breaks; the output is the failure mode (errors, latency cliff, queue saturation, resource exhaustion). The two use different ramp shapes and different success criteria.

Detail

Two distinct exercises with different outputs.

Load test execution:

  • Ramp up gradually (5-10 min) to target traffic.
  • Hold at target for 30+ minutes (long enough to expose connection pool exhaustion, GC patterns, cache eviction).
  • Ramp down gradually, or extend the hold if more observation time is needed.
  • Pass criteria: p95 latency under its SLO threshold, error rate under its SLO threshold, no resource alarms triggered.
  • Output: a yes/no on whether the system meets its SLOs at expected load.
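
The pass criteria above can be sketched as a small check over recorded samples. This is a minimal illustration, not any tool's API: `evaluateLoadTest`, the sample shape `{ durationMs, ok }`, the `slo` fields, the `resourceAlarms` input, and the nearest-rank percentile method are all assumptions made for the example.

```javascript
// Hypothetical helper: nearest-rank percentile over an ascending array.
// pct is an integer from 1 to 100.
function percentile(sortedValues, pct) {
  const rank = Math.ceil((pct * sortedValues.length) / 100);
  return sortedValues[Math.max(rank, 1) - 1];
}

// Hypothetical helper: pass/fail verdict for the hold phase of a load test.
// samples:        [{ durationMs: number, ok: boolean }]
// slo:            { p95Ms: number, maxErrorRate: number }
// resourceAlarms: names of alarms that fired during the hold (assumed input)
function evaluateLoadTest(samples, slo, resourceAlarms = []) {
  if (samples.length === 0) throw new Error('no samples recorded');
  const durations = samples.map((s) => s.durationMs).sort((a, b) => a - b);
  const p95Ms = percentile(durations, 95);
  const errorRate = samples.filter((s) => !s.ok).length / samples.length;
  const pass =
    p95Ms < slo.p95Ms &&
    errorRate < slo.maxErrorRate &&
    resourceAlarms.length === 0;
  return { p95Ms, errorRate, pass };
}
```

All three conditions must hold for the verdict to be a yes, which matches the yes/no output a load test is meant to produce.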

Stress test execution:

  • Ramp from baseline through and past target — keep going until something breaks.
  • Don't hold; the failure point is the answer.
  • Stop at "system fails gracefully" or "system collapses ungracefully."
  • Pass criteria: graceful degradation (rate limiting, queueing, and circuit breakers engage instead of a cascading failure).
  • Output: capacity number (RPS or VU count at which SLOs break) plus the failure mode.
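
One way to derive the capacity number from a stepped stress run is to scan the per-step results for the first step that breaks an SLO. The sketch below assumes a hypothetical summary format (`rps`, `p95Ms`, `errorRate` per step, ordered by increasing load); `findBreakingPoint` and its field names are illustrative, not part of k6 or JMeter.

```javascript
// Hypothetical helper: finds where SLOs first break in a stepped stress run.
// steps: [{ rps, p95Ms, errorRate }] ordered by increasing load
// slo:   { p95Ms, maxErrorRate }
function findBreakingPoint(steps, slo) {
  let lastHealthyRps = null;
  for (const step of steps) {
    const reasons = [];
    if (step.p95Ms >= slo.p95Ms) reasons.push('latency');
    if (step.errorRate >= slo.maxErrorRate) reasons.push('errors');
    if (reasons.length > 0) {
      // Capacity lies between the last healthy step and this one.
      return { lastHealthyRps, breakingRps: step.rps, reasons };
    }
    lastHealthyRps = step.rps;
  }
  // The ramp never broke an SLO: it did not go far enough to find the limit.
  return { lastHealthyRps, breakingRps: null, reasons: [] };
}
```

A result with `breakingRps: null` means the ramp stopped too early to answer the question, so the run should be repeated with higher steps.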

Failure modes worth naming:

  • Cliff: latency stays flat, then suddenly explodes (CPU saturates, a queue overflows).
  • Drift: latency slowly worsens over time (memory leak, growing index).
  • Cascade: one component fails and trips the next (DB slows, app threads block, downstream calls time out).
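
As a rough illustration of how the first two shapes differ in the data, the sketch below labels a series of per-interval p95 latencies. It is a heuristic under stated assumptions: the function name, the `jumpFactor` and `driftFactor` defaults, and the labels are invented for the example, and latencies are assumed to be positive. A cascade cannot be detected from one series, since it shows up as ordered failures across several components.

```javascript
// Hypothetical heuristic: label the shape of a p95 latency series.
// p95Series:   p95 latency per measurement interval, in time order
// jumpFactor:  interval-to-interval growth that counts as a cliff (assumed)
// driftFactor: first-to-last growth that counts as drift (assumed)
function classifyLatencyShape(p95Series, { jumpFactor = 3, driftFactor = 1.5 } = {}) {
  if (p95Series.length < 2) return 'unknown';
  for (let i = 1; i < p95Series.length; i++) {
    // A single sudden jump between adjacent intervals is the cliff signature.
    if (p95Series[i] >= p95Series[i - 1] * jumpFactor) return 'cliff';
  }
  const first = p95Series[0];
  const last = p95Series[p95Series.length - 1];
  // No sudden jump, but a large cumulative rise, suggests drift.
  return last >= first * driftFactor ? 'drift' : 'stable';
}
```

The thresholds would need tuning per system; the point is only that a cliff is a property of adjacent intervals, while drift is a property of the whole run.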

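Tools handle them differently. In k6, the ramping-arrival-rate executor starts iterations at a set rate regardless of how slowly the system responds (up to its VU limit), and a threshold only aborts a run when abortOnFail is set, so leaving that off (or omitting thresholds) lets a stress test continue past failure to see how bad it gets. In JMeter, the Throughput Shaping Timer and Stepping Thread Group plugins let you build a ramp-and-hold staircase.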
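
A minimal sketch of a k6 ramping-arrival-rate scenario for a stress ramp follows. The scenario fields (`executor`, `startRate`, `timeUnit`, `preAllocatedVUs`, `maxVUs`, `stages`) are k6 scenario options; the `buildStressScenario` helper, its parameter names, and the doubling schedule are assumptions for illustration.

```javascript
// Hypothetical helper: builds a k6 ramping-arrival-rate scenario whose
// target rate doubles each stage (e.g. 100, 200, 400, ...).
function buildStressScenario({ startRate, steps, stepDuration, preAllocatedVUs, maxVUs }) {
  const stages = [];
  let rate = startRate;
  for (let i = 0; i < steps; i++) {
    // Each stage moves linearly from the previous rate to `target`.
    stages.push({ target: rate, duration: stepDuration });
    rate *= 2;
  }
  return {
    executor: 'ramping-arrival-rate',
    startRate,        // iterations started per timeUnit at the beginning
    timeUnit: '1s',   // so rates read as iterations per second
    preAllocatedVUs,  // VUs created before the run starts
    maxVUs,           // ceiling k6 may scale to when responses slow down
    stages,
  };
}
```

In a k6 script this would be wired in as `export const options = { scenarios: { stress: buildStressScenario({ ... }) } };`. Because the first stage's target equals `startRate`, the run holds at baseline for one stage before ramping.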

// EXAMPLE

k6-stress-vs-load.js

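import http from 'k6/http';

// Load test: ramp to expected load, hold, ramp down.
// With top-level `stages`, `target` is a VU count, not RPS.
const loadOptions = {
  stages: [
    { duration: '5m', target: 100 },   // ramp
    { duration: '30m', target: 100 },  // hold
    { duration: '5m', target: 0 },     // ramp down
  ],
  thresholds: { http_req_duration: ['p(95)<500'] },  // p95 under 500 ms
};

// Stress test: keep ramping until it breaks. No thresholds, so nothing gates the run.
const stressOptions = {
  stages: [
    { duration: '2m', target: 100 },
    { duration: '2m', target: 200 },
    { duration: '2m', target: 400 },
    { duration: '2m', target: 800 },
    { duration: '2m', target: 1600 },  // find the cliff
  ],
};

// k6 only reads the export named `options`, so pick one profile per run.
// PROFILE and TARGET_URL are placeholder env var names, passed with `k6 run -e NAME=value`.
export const options = __ENV.PROFILE === 'stress' ? stressOptions : loadOptions;

export default function () {
  http.get(__ENV.TARGET_URL);  // the system under test
}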

// WHAT INTERVIEWERS LOOK FOR

Different ramp shapes, different success criteria, and awareness that stress tests exist to find *failure modes*, not to produce a pass/fail verdict.

// COMMON PITFALL

Calling a stress test that immediately spikes to 10x expected load 'realistic'. You skip the warm-up and ramp-saturation phases real production sees, and the failure mode you observe isn't the production one.