Q1 of 38 · CI/CD & DevOps
What is a flaky test and how do you handle it in CI?
Short answer
A flaky test passes and fails on the same code without any change. Quarantine it instead of hiding it behind auto-retries, assign it an owner, fix the root cause within a deadline, and delete or rewrite it if you can't.
Detail
A flaky test is one whose outcome isn't determined by the code under test — it sometimes passes and sometimes fails for the same SHA. Common causes: hard-coded timing or sleeps, shared mutable state between tests, network or third-party dependencies, race conditions in async code, time-zone or date assumptions.
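To make the first cause concrete, here is a minimal sketch of a timing-dependent test and the polling wait that replaces its sleep. `wait_until` and `BackgroundJob` are hypothetical names invented for this illustration, not part of any test framework.

```python
import threading
import time


def wait_until(condition, timeout=2.0, interval=0.01):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one last check at the deadline


class BackgroundJob:
    """Hypothetical stand-in for async work that finishes after `duration` seconds."""

    def __init__(self, duration):
        self.done = False
        self._timer = threading.Timer(duration, self._finish)

    def _finish(self):
        self.done = True

    def start(self):
        self._timer.start()


def test_job_completes():
    job = BackgroundJob(duration=0.05)
    job.start()
    # Flaky version: time.sleep(0.05); assert job.done
    # That races the timer thread and fails whenever the machine is slow.
    assert wait_until(lambda: job.done, timeout=2.0)
```

The generous timeout only costs time when the test is about to fail anyway; on the happy path the wait returns as soon as the condition holds.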
The wrong response is to add an auto-retry that masks flakiness — the test still flaps, the bug it might be catching gets ignored, and trust in CI erodes. The right response is quarantine and triage:
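- Detect: track flake rate per test in CI metadata. The defining signal is a test that both passes and fails on the same SHA; a failure rate of ≥ 1% of runs on the main branch is a reasonable threshold for treating it as flaky.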
- Quarantine: move flaky tests to a separate job that's allowed to fail without blocking PRs, but is tracked. Don't delete them yet — they might be catching real bugs.
- Own: assign every quarantined test to a person or team with a deadline (e.g. 2 weeks).
- Fix or delete: if the flake is a test bug, fix it (replace sleeps with explicit waits, isolate state, mock the network). If the flake is a real product bug surfacing intermittently, fix the product. If it is neither, and the test is exercising something genuinely unstable, delete it; a test that flakes without telling you anything is worse than no test.
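The detect step above can be sketched as follows, assuming CI run records are available as (test, SHA, outcome) tuples. The `Run` record, the `flake_report` function, and the sample data are all hypothetical; a real pipeline would read this from its CI provider's test reports.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    test: str
    sha: str
    passed: bool


def flake_report(runs, threshold=0.01):
    """Return {test: failure_rate} for tests that look flaky.

    A test is reported only if some single SHA has both a pass and a fail
    (same code, different outcome) and its overall failure rate is at or
    above `threshold`.
    """
    outcomes = defaultdict(set)  # (test, sha) -> set of outcomes seen
    totals = defaultdict(int)
    failures = defaultdict(int)
    for run in runs:
        outcomes[(run.test, run.sha)].add(run.passed)
        totals[run.test] += 1
        if not run.passed:
            failures[run.test] += 1

    # Tests with mixed outcomes on at least one SHA.
    mixed = {test for (test, _sha), seen in outcomes.items() if len(seen) == 2}

    report = {}
    for test in mixed:
        rate = failures[test] / totals[test]
        if rate >= threshold:
            report[test] = rate
    return report


runs = [
    Run("test_login", "abc", True),
    Run("test_login", "abc", False),     # same SHA, different outcome
    Run("test_login", "def", True),
    Run("test_login", "def", True),
    Run("test_checkout", "abc", False),  # fails consistently: a real failure
    Run("test_checkout", "abc", False),
    Run("test_search", "abc", True),
]
print(flake_report(runs))  # {'test_login': 0.25}
```

Requiring mixed outcomes on one SHA keeps consistently failing tests out of the report, since those point to a regression rather than a flake.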
Auto-retry is acceptable only as a temporary mitigation while a fix is in flight, never as the permanent answer.