What is a flaky test and how do you handle it in CI?

Question

Accepted Answer

A flaky test passes and fails on the same code without changes. Quarantine it (don't auto-retry), mark it owned, fix the root cause within a deadline, and delete or rewrite it if you can't. A flaky test is one whose outcome isn't determined by the code under test — it sometimes passes and sometimes fails for the same SHA. Common causes: hard-coded timing or sleeps, shared mutable state between tests, network or third-party dependencies, race conditions in async code, time-zone or date assumptions. The wrong response is to add an auto-retry that masks flakiness — the test still flaps, the bug it might be catching gets ignored, and trust in CI erodes. The right response is quarantine and triage: Detect: track flake rate per test in CI metadata. A test that fails ≥ 1% of runs on the main branch is flaky by definition. Quarantine: move flaky tests to a separate job that's allowed to fail without blocking PRs, but is tracked. Don't delete them yet — they might be catching real bugs. Own: as

What is a flaky test and how do you handle it in CI?

// MODEL ANSWER

// WHAT INTERVIEWERS LOOK FOR

// COMMON PITFALL

What is a flaky test and how do you handle it in CI?

Short answer

Detail

// MODEL ANSWER

// WHAT INTERVIEWERS LOOK FOR

// COMMON PITFALL