Q23 of 38 · Performance
How would you justify investment in a dedicated performance testing function?
Short answer
Tie the investment to three things: incidents avoided (the cost of one Black Friday outage), customer impact (latency complaints, churn correlation with p95), and engineering time saved (centralised expertise vs. every team relearning). A single P1 outage typically dwarfs the annual investment; the value is in *not* having that outage.
Detail
Performance investment has a hidden problem: it pays off by absence, in outages that didn't happen. Justifying the spend means reframing that absence in terms leadership already tracks: incidents, revenue, engineering time, and risk.
Frame 1 — Cost of incidents avoided. Compute the cost of one prevented P1: lost revenue per hour during the outage, post-incident engineering time, customer credits, and brand damage (harder to quantify but real). For a marketplace at peak, a 2-hour outage on Black Friday can cost £1-10M+. A perf team that prevents one such incident per year pays for itself many times over.
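A rough sketch of the Frame 1 arithmetic, in Python. Every figure below is a hypothetical placeholder rather than a benchmark, and the function name and parameters are made up for illustration. Brand damage is deliberately left out because it resists a clean number.

```python
def outage_cost(revenue_per_hour, outage_hours, eng_hours, eng_hourly_rate, customer_credits):
    """Rough direct cost of one P1 outage, excluding brand damage."""
    lost_revenue = revenue_per_hour * outage_hours
    post_incident_eng = eng_hours * eng_hourly_rate
    return lost_revenue + post_incident_eng + customer_credits


# Hypothetical peak-day inputs (illustrative only, not real data).
cost = outage_cost(
    revenue_per_hour=1_500_000,  # assumed peak revenue per hour
    outage_hours=2,              # the 2-hour outage from the text
    eng_hours=400,               # assumed incident response + follow-up effort
    eng_hourly_rate=100,         # assumed loaded cost per engineering hour
    customer_credits=250_000,    # assumed goodwill credits and refunds
)
print(f"Estimated direct cost of one P1: £{cost:,.0f}")
```

Swapping in your own finance and incident data is the whole point: the structure of the sum matters more than these particular inputs.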
Frame 2 — Customer impact. Pull RUM data and compare conversion for fast sessions (around p50 latency) against slow ones (p95 and beyond). A 1-3% conversion delta per 100ms is a commonly quoted range; confirm it against your own data before presenting it. Multiply by revenue: a 200ms p95 improvement on the checkout page is often worth £1M+ annually. Performance is revenue, not infrastructure.
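One way to turn Frame 2 into numbers, sketched in Python. The session data, the 1000ms threshold, the revenue figure, and all function names are assumptions for illustration. The uplift formula is a deliberately simple linear model, not an established standard.

```python
def conversion_rate(sessions):
    """Fraction of sessions that converted. Each session is (latency_ms, converted)."""
    if not sessions:
        return 0.0
    return sum(1 for _, converted in sessions if converted) / len(sessions)


def split_by_latency(sessions, threshold_ms):
    """Split sessions into (fast, slow) around a latency threshold."""
    fast = [s for s in sessions if s[0] <= threshold_ms]
    slow = [s for s in sessions if s[0] > threshold_ms]
    return fast, slow


def annual_uplift(annual_revenue, improvement_ms, delta_per_100ms):
    """Linear estimate: revenue x (improvement in 100ms units) x relative conversion delta."""
    return annual_revenue * (improvement_ms / 100) * delta_per_100ms


# Tiny made-up RUM sample; a real export would have many thousands of sessions.
sessions = [
    (300, True), (350, False), (400, True), (450, True),
    (1200, False), (1500, True), (1800, False), (2000, False),
]
fast, slow = split_by_latency(sessions, threshold_ms=1000)
print(f"fast: {conversion_rate(fast):.0%}, slow: {conversion_rate(slow):.0%}")

# Hypothetical: £50M annual checkout revenue, 200ms improvement, 1% delta per 100ms.
uplift = annual_uplift(50_000_000, improvement_ms=200, delta_per_100ms=0.01)
print(f"Estimated annual uplift: £{uplift:,.0f}")
```

The gap between the two buckets is a correlation: slow sessions may also skew towards older devices or poor networks. A controlled experiment gives a stronger causal figure if leadership pushes back on the estimate.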
Frame 3 — Engineering time saved. Without a centralised function, every team rediscovers JMeter, builds half-baked tooling, and learns the same lessons the hard way. Centralised expertise provides shared frameworks, a single CI integration, and common dashboards. Time-to-first-load-test for a new service drops from weeks to days.
Frame 4 — Risk reduction. The team provides pre-prod sign-off for major launches, capacity planning before peak events, and vendor evaluation (could you switch from Postgres to Aurora? The perf team answers in a week). It is an insurance policy against a class of risks the org otherwise can't quantify.
What the investment looks like (typical mid-size organisation):
- 2-3 senior engineers focused on perf.
- Tooling: load infra (k6 Cloud, dedicated runners), APM, RUM.
- Process: pre-prod perf sign-off for major changes, quarterly capacity review, peak-event game days.
Common counter-arguments and responses:
- "Devs can write their own tests" — they can, and often badly. Centralised expertise raises the floor.
- "We don't have perf problems" — meaning we don't measure them. RUM data usually surfaces problems on first look.
- "It's expensive" — quantify the prevented incident at the rates above; the function pays for itself with one save.
Senior leadership signal: the framing is in revenue and risk, not in test runs. "We're investing £X to reduce £Y in expected outage cost and unlock £Z in conversion uplift." That's the language that gets funding.
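The X / Y / Z sentence can be backed by a small expected-value calculation. The sketch below is in Python and assumes the outage risk can be expressed as an annual probability before and after the investment. All inputs and names are hypothetical, and the probabilities in particular are judgment calls rather than measurements.

```python
def business_case(investment, p_outage_before, p_outage_after, outage_cost, conversion_uplift):
    """Summarise the pitch: spend X, cut expected outage cost by Y, unlock uplift Z."""
    expected_outage_reduction = (p_outage_before - p_outage_after) * outage_cost
    total_benefit = expected_outage_reduction + conversion_uplift
    return {
        "investment_x": investment,
        "expected_outage_reduction_y": expected_outage_reduction,
        "conversion_uplift_z": conversion_uplift,
        "net_benefit": total_benefit - investment,
        "benefit_ratio": total_benefit / investment,
    }


# Hypothetical inputs: replace with figures agreed with finance and incident history.
case = business_case(
    investment=600_000,           # assumed annual cost of team plus tooling
    p_outage_before=0.5,          # assumed yearly chance of a peak-event P1 today
    p_outage_after=0.25,          # assumed chance with the function in place
    outage_cost=4_000_000,        # assumed cost of one such P1
    conversion_uplift=1_000_000,  # assumed annual uplift from latency work
)
print(
    f"We're investing £{case['investment_x']:,.0f} to reduce "
    f"£{case['expected_outage_reduction_y']:,.0f} in expected outage cost "
    f"and unlock £{case['conversion_uplift_z']:,.0f} in conversion uplift."
)
```

Presenting a low and a high scenario alongside this one tends to be more credible than a single point estimate.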