How would you justify investment in a dedicated performance testing function?

Question

Accepted Answer

Tie the case to incidents avoided (the cost of one Black Friday outage), customer impact (latency complaints, churn correlated with p95), and engineering time saved (centralised expertise vs. every team relearning). A single P1 outage typically dwarfs the annual investment; the value is in *not* having that outage.

The hidden problem with performance investment is that it pays off by absence: outages that didn't happen. Justifying the spend requires reframing.

Frame 1: Cost of incidents avoided. Compute the cost of one prevented P1: lost revenue per hour during the outage, post-incident engineering time, customer credits, and brand damage (harder to quantify but real). For a marketplace at peak, a 2-hour outage on Black Friday can cost £1-10M+. A perf team that prevents one such incident per year pays for itself many times over.

Frame 2: Customer impact. Pull RUM data: what's the conversion drop between p50-fast and p95-slow users? Many companies find a 1-3% conversion delta per 100ms. Multiply by revenue: a 200ms p95 improvement
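The arithmetic behind the two frames can be sketched in a few lines of Python. This is a minimal illustration, not a definitive model: the class and function names (`OutageScenario`, `latency_revenue_uplift`, `payback_ratio`) and every figure in the example are hypothetical. The uplift estimate assumes the conversion delta scales linearly with latency, which is a simplification you should check against your own RUM data.

```python
from dataclasses import dataclass


@dataclass
class OutageScenario:
    """Hypothetical inputs for one prevented P1 outage (illustrative only)."""
    duration_hours: float
    revenue_per_hour: float      # lost revenue per hour during the outage
    engineer_hours: float        # post-incident engineering time
    engineer_hourly_cost: float
    customer_credits: float

    def cost(self) -> float:
        # Brand damage is left out on purpose: real, but hard to quantify.
        return (
            self.duration_hours * self.revenue_per_hour
            + self.engineer_hours * self.engineer_hourly_cost
            + self.customer_credits
        )


def latency_revenue_uplift(
    annual_revenue: float,
    improvement_ms: float,
    conversion_delta_per_100ms: float,
    affected_share: float = 1.0,
) -> float:
    """Linear estimate of annual revenue gained from a latency improvement.

    affected_share is the fraction of revenue that comes from users who
    actually experience the improvement (an assumption you must supply).
    """
    steps_of_100ms = improvement_ms / 100
    return annual_revenue * affected_share * conversion_delta_per_100ms * steps_of_100ms


def payback_ratio(annual_benefit: float, annual_team_cost: float) -> float:
    """How many times over the team pays for itself in a year."""
    return annual_benefit / annual_team_cost


if __name__ == "__main__":
    # All numbers below are made up for illustration.
    outage = OutageScenario(
        duration_hours=2,
        revenue_per_hour=1_000_000,
        engineer_hours=200,
        engineer_hourly_cost=100,
        customer_credits=50_000,
    )
    uplift = latency_revenue_uplift(
        annual_revenue=50_000_000,
        improvement_ms=200,
        conversion_delta_per_100ms=0.01,
    )
    benefit = outage.cost() + uplift
    print(f"Outage cost avoided: {outage.cost():,.0f}")
    print(f"Latency uplift:      {uplift:,.0f}")
    print(f"Payback ratio:       {payback_ratio(benefit, 600_000):.1f}x")
```

Keeping the two frames as separate functions lets you present each number on its own and show stakeholders which assumption drives the result.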
