Q19 of 38 · Performance

How do you build a sustainable performance test suite that runs in CI without becoming a bottleneck?

Performance · Senior · performance · ci · test-strategy · tiering · sustainability

Short answer

Tier the suite: fast smoke perf on every PR (under 5 min, narrow scope), full load nightly, soak weekly. Set thresholds that fail loudly on real regressions but tolerate run-to-run noise. Run on dedicated infra so dev pipelines aren't fighting the load generator for resources.

Detail

Three tiers, each with different goals and budgets:

Tier 1 — PR smoke perf (every PR, < 5 min)

  • Hits 5-10 critical endpoints with a small VU count (10-20).
  • Asserts thresholds on absolute latency (e.g. p95 < 500ms) — if you can't pass these on a tiny load, something's badly wrong.
  • Goal: catch egregious regressions before merge — a query that went from 50ms to 5s.
  • Budget: tight time, high signal. Anything noisy gets pulled out of this tier.
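As a rough illustration of the Tier 1 gate, the sketch below checks an absolute p95 limit per endpoint. It assumes latency samples have already been collected into a dict keyed by endpoint; the function names, the nearest-rank percentile method, and the 500 ms default are illustrative choices, not part of any particular load tool.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is in (0, 100]."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def check_smoke(latencies_ms, p95_limit_ms=500.0):
    """Return (passed, failures) for per-endpoint latency samples.

    latencies_ms: dict mapping endpoint name -> list of latencies in ms
    (hypothetical input shape). failures maps endpoint -> observed p95.
    """
    failures = {}
    for endpoint, samples in latencies_ms.items():
        p95 = percentile(samples, 95)
        if p95 >= p95_limit_ms:
            failures[endpoint] = p95
    return (not failures, failures)
```

An absolute limit like this is deliberately crude: it ignores small drifts and only trips on the 50ms-to-5s class of regression, which is what keeps the PR tier low-noise.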

Tier 2 — Nightly load tests (every night, 30-60 min)

  • Full target load against staging.
  • Real SLOs as thresholds.
  • Trend-tracked: today's p95 vs. last week's. Page on regression.
  • Goal: catch slow-creep regressions and dependency drift.
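One possible way to express the nightly trend check is to compare today's p95 with the median of recent nightly runs and flag only changes beyond a tolerance. The 10% tolerance and the function name are assumptions for illustration; the right tolerance depends on how noisy the staging environment is.

```python
from statistics import median


def regression_vs_baseline(today_p95_ms, previous_p95_ms, tolerance=0.10):
    """Compare today's p95 with the median of previous runs.

    Returns (regressed, relative_change). The median is used as the
    baseline so a single noisy night does not shift it much.
    """
    if not previous_p95_ms:
        raise ValueError("no baseline runs")
    baseline = median(previous_p95_ms)
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    change = (today_p95_ms - baseline) / baseline
    return change > tolerance, change
```

Using a relative change against a rolling baseline is what lets this tier catch slow creep that an absolute threshold would miss.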

Tier 3 — Weekly soak / quarterly capacity (scheduled)

  • 8-24h soak at moderate load — catches memory leaks, log fill-up.
  • Capacity test (stress to break) once a quarter or before peak events.
  • Goal: catch issues PR/nightly tests can't surface.
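A soak run is mostly useful if something inspects the resource curve afterwards. The sketch below fits a least-squares line to memory samples and flags sustained growth. The input shape (hours elapsed, resident memory in MB) and the 5 MB/hour limit are hypothetical; a real check would also need to discount warm-up and cache fill.

```python
def slope_per_hour(samples):
    """Least-squares slope of memory over time.

    samples: list of (hours_elapsed, rss_mb) tuples.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples share the same timestamp")
    return num / den


def looks_like_leak(samples, max_growth_mb_per_hour=5.0):
    """Flag a run whose memory grows faster than the allowed rate."""
    return slope_per_hour(samples) > max_growth_mb_per_hour
```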

Infra discipline:

  • Run perf tests on dedicated runners. CI runners are noisy and shared — load gen needs predictable CPU/network.
  • The target environment must be stable too. Random staging noise = random false alarms = devs ignoring perf failures.
  • Result storage: dump JSON to S3 or Grafana Cloud for trend tracking. Without trend data, "did this run regress?" is unanswerable.
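For result storage, a minimal sketch is one JSON record per run, written as a line that can be appended to whatever store is used. The field names here are assumptions, and the upload step (S3, Grafana Cloud, or anything else) is deliberately left out.

```python
import json


def to_record_line(run_id, tier, p95_ms, error_rate):
    """Serialize one run summary as a single JSON line (hypothetical schema)."""
    record = {
        "run_id": run_id,
        "tier": tier,
        "p95_ms": p95_ms,
        "error_rate": error_rate,
    }
    return json.dumps(record, sort_keys=True)


def p95_history(lines, tier):
    """Return the p95 values for one tier, in the order the runs were stored."""
    history = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        if record["tier"] == tier:
            history.append(record["p95_ms"])
    return history
```

Keeping each run as a flat, self-describing record makes the history easy to query later, regardless of which storage backend holds it.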

Process discipline:

  • Owned: a named team (or rotation) triages perf failures. Untriaged failures get retried and ignored.
  • Quarantine: flaky perf tests go to a separate job that doesn't block, but is tracked weekly.
  • Budget: review the perf suite quarterly — what's covered, what's worth more, what's worth less.
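The quarantine rule can be sketched as a small gate: failures in quarantined tests are reported but do not fail the job, while any other failure does. The result shape and the function name are illustrative assumptions; how the quarantine list is stored and reviewed is up to the team.

```python
def gate(results, quarantined):
    """Decide the CI exit code from perf test results.

    results: dict mapping test name -> True if the test passed.
    quarantined: set of test names that must not block the pipeline.
    """
    blocking = sorted(
        name for name, ok in results.items()
        if not ok and name not in quarantined
    )
    tracked = sorted(
        name for name, ok in results.items()
        if not ok and name in quarantined
    )
    return {
        "exit_code": 1 if blocking else 0,
        "blocking_failures": blocking,
        "quarantined_failures": tracked,
    }
```

Reporting quarantined failures separately, rather than dropping them, is what makes the weekly review possible.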

Senior signal: treating perf testing as a product with users (devs, SRE, product), an SLA (PR feedback in 5 min), and a maintenance plan. Without that, the suite rots within months.

// WHAT INTERVIEWERS LOOK FOR

Tiering, owned triage, dedicated infra, trend tracking, quarantine process. The signal of seniority is treating the test suite itself like a product that needs investment.

// COMMON PITFALL

Cramming a 30-minute load test into PR CI. Devs disable it within a week, the suite is theatre, and real regressions ship.