Q46 of 48 · Cypress
How do you set quality bars and metrics for an automation team owning Cypress?
Short answer
Track flake rate (target <1%), CI duration (target <30 min), escape rate (production bugs that automation should have caught; target <10%), and coverage of top user journeys (target 100%). Publish the numbers weekly. Give each bar an explicit owner, and put quarantined flaky tests on a 7-day SLA. The bars must be visible to the broader engineering team, not just QA.
Detail
Without explicit bars, an automation team drifts into "we run tests" with no measure of success. Explicit, published bars create accountability and a shared definition of "the suite is healthy."
The four bars I'd set as a Cypress lead:
1. Flake rate. The percentage of test runs in which a test fails on the first attempt and passes on retry. Target: <1%. Compute it weekly per spec; any spec exceeding 1% goes into quarantine within 7 days. Cypress Cloud surfaces this; without it, parse the retries out of your JUnit reports.
2. CI duration. Wall time from PR open to all shards green, which is set by the slowest shard. Target: <30 minutes for the full suite (lower if smoke-only). Track p50 and p95. A regression triggers a sprint goal to investigate.
3. Escape rate. Production bugs that should have been caught by automation, divided by total prod bugs. Target: <10%. Each escape gets a "missing test" ticket; closing the ticket adds the test that would have caught it.
4. Top-journey coverage. The percentage of the team's defined critical user journeys (typically 10-15) that have working automated coverage. Target: 100%. A drop below target triggers a fix that takes priority over new feature work.
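A minimal sketch of how bars 1 and 2 might be computed without Cypress Cloud. It assumes a hypothetical `TestRun` record that you export from CI yourself; the field names are illustrative and are not a Cypress or JUnit schema. The percentile uses the nearest-rank method, which is one reasonable choice among several.

```typescript
// Hypothetical record: one entry per test execution in CI.
// These field names are assumptions, not a Cypress or JUnit format.
interface TestRun {
  spec: string;     // spec file the test belongs to
  attempts: number; // attempts used (1 means no retry was needed)
  passed: boolean;  // final outcome after retries
}

// A run is flaky when it failed at least once but passed on retry.
// Returns the flake rate per spec as a fraction (0.01 = 1%).
function flakeRateBySpec(runs: TestRun[]): Record<string, number> {
  const flaky: Record<string, number> = {};
  const total: Record<string, number> = {};
  for (const run of runs) {
    const isFlaky = run.passed && run.attempts > 1;
    total[run.spec] = (total[run.spec] ?? 0) + 1;
    flaky[run.spec] = (flaky[run.spec] ?? 0) + (isFlaky ? 1 : 0);
  }
  const rates: Record<string, number> = {};
  for (const spec of Object.keys(total)) {
    rates[spec] = flaky[spec] / total[spec];
  }
  return rates;
}

// Specs whose flake rate exceeds the bar (default 1%), sorted by name.
function specsToQuarantine(runs: TestRun[], threshold = 0.01): string[] {
  const rates = flakeRateBySpec(runs);
  return Object.keys(rates)
    .filter((spec) => rates[spec] > threshold)
    .sort();
}

// Nearest-rank percentile, e.g. p = 50 or p = 95 over CI durations in minutes.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}
```

Feeding a week of records into `specsToQuarantine` gives the quarantine candidates, and `percentile(durations, 95)` gives the p95 to compare against the 30-minute bar. A permanently failing test is deliberately not counted as flaky here, since it never passes on retry.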
Process around the bars:
- Weekly health report. A short Slack post or dashboard with the four numbers, trend arrows, and any quarantined tests. Visible to the whole eng org.
- Flake quarantine SLA. When a test is quarantined, it has a named owner and a 7-day deadline. Past 7 days, the owner is paged or the test is deleted. No stale quarantine.
- PR gate definitions. Smoke must pass on every PR; full regression must pass on PR-to-main. Skipping a test (e.g., it.skip) requires PR approval from a lead.
- Quarterly retrospectives. Bars reviewed quarterly. Numbers that aren't improving get explicit attention; numbers comfortably hit get tightened.
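The quarantine SLA above could be enforced with a small scheduled check. This is a sketch only: the `QuarantinedTest` shape and the idea of a quarantine registry are assumptions for illustration, and how you page the owner or delete the test is left to your own tooling.

```typescript
// Hypothetical registry entry for a quarantined test.
interface QuarantinedTest {
  title: string;
  owner: string;       // named owner, as the SLA requires
  quarantinedAt: Date; // when the test entered quarantine
}

const SLA_DAYS = 7;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days a test has spent in quarantine as of `now`.
function daysInQuarantine(test: QuarantinedTest, now: Date): number {
  return Math.floor((now.getTime() - test.quarantinedAt.getTime()) / MS_PER_DAY);
}

// Tests that have been quarantined for longer than the SLA.
function overdueQuarantines(tests: QuarantinedTest[], now: Date): QuarantinedTest[] {
  return tests.filter(
    (t) => now.getTime() - t.quarantinedAt.getTime() > SLA_DAYS * MS_PER_DAY
  );
}

// One line per overdue test, suitable for the weekly health report.
function overdueReport(tests: QuarantinedTest[], now: Date): string[] {
  return overdueQuarantines(tests, now).map(
    (t) => `${t.title} (owner: ${t.owner}) has been quarantined for ${daysInQuarantine(t, now)} days`
  );
}
```

Passing `now` in explicitly, rather than calling `new Date()` inside the functions, keeps the check deterministic and easy to test.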
Cultural points:
- Publish failure modes, not just success. "We had 3 escapes last quarter; here's what we added" builds credibility.
- Reward fixing flakes, not just shipping tests. A test that catches no bugs but never flakes is healthier than one that catches everything but fails 5% of the time.
- Make automation cost visible. "This regression suite costs £X/month in CI minutes; here's the value it produces" justifies investment to leadership.
The trap to avoid as a lead: setting too many metrics. Four numbers everyone knows beat twenty numbers nobody looks at.