A test suite that runs in CI but doesn't block merging when it fails is a suggestion, not a gate. Teams that have "tests run in CI but developers can still merge" inevitably develop a culture of ignoring red pipelines — the signal erodes until nobody trusts the suite at all. Quality gates are the mechanism that converts test results from advisory to mandatory: a failing gate means the code cannot merge, full stop.
What quality gates are
A quality gate is a pass/fail check that the pipeline evaluates before a PR can merge. The gate can check any measurable criterion: all tests pass, code coverage is above 80%, no critical security vulnerabilities, no SonarQube issues above a chosen severity threshold. If any configured gate fails, the PR is blocked until the failure is resolved.
The value is automatic consistency. Without gates, each PR's merge decision is made by a human reviewing CI results — which means it depends on the reviewer's attention, time pressure, and willingness to push back. With gates configured in the repository settings, the same standard applies to every PR, every time, regardless of who's reviewing.
How tests fail builds in GitHub Actions
When a test step exits with a non-zero code, GitHub Actions marks the step as failed. The job fails. The workflow fails. The PR status check shows a red X. If that job is configured as a required status check in branch protection, the PR is blocked.
This chain works automatically for most frameworks — Maven exits non-zero when mvn test has failures, npx playwright test exits non-zero when tests fail, pytest exits non-zero on failures. You don't need extra configuration to get the basic "failed tests block the PR" behaviour.
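The whole chain rests on one convention: exit code zero means success, anything else means failure. A minimal Python sketch of that convention follows. The `run_step` helper is hypothetical and only imitates how a runner might classify a step; it is not part of GitHub Actions.

```python
import subprocess
import sys


def run_step(command: list[str]) -> bool:
    """Run a command and report whether it 'passed' (exit code 0)."""
    result = subprocess.run(command)
    return result.returncode == 0


# Two stand-in "test commands": one exits 0, the other exits 1.
passing = [sys.executable, "-c", "import sys; sys.exit(0)"]
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]

if __name__ == "__main__":
    for name, cmd in [("passing", passing), ("failing", failing)]:
        status = "success" if run_step(cmd) else "failure"
        print(f"{name} step -> {status}")
```

A wrapper script that swallows the exit code (for example by always ending with a successful command) breaks this chain, so it is worth checking any custom test script for that.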
What you do need to configure explicitly: the branch protection rule.
Configuring required status checks
- Repository → Settings → Branches → Add rule (or edit the existing rule for main)
- Enable Require status checks to pass before merging
- Search for and add your workflow job names (e.g., selenium, playwright, cypress)
- Optionally enable Require branches to be up to date before merging; this prevents a PR from merging if main has advanced since the PR's checks ran
- Save
From this point, every PR must show green checks for all listed jobs before the merge button is active. A failed test anywhere in the chain blocks the merge.
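The same settings can also be applied through GitHub's REST API rather than the UI. The sketch below only builds the request body for the branch protection endpoint and does not send it. The `branch_protection_payload` helper is hypothetical, the `enforce_admins` choice is an assumption, and the field names should be checked against the current API documentation before use.

```python
import json


def branch_protection_payload(required_jobs: list[str],
                              require_up_to_date: bool = True) -> dict:
    """Build a body for PUT /repos/{owner}/{repo}/branches/{branch}/protection."""
    return {
        "required_status_checks": {
            # Corresponds to "Require branches to be up to date before merging".
            "strict": require_up_to_date,
            # Names of the status checks (workflow jobs) that must be green.
            "contexts": required_jobs,
        },
        # Assumption: apply the rule to administrators as well.
        "enforce_admins": True,
        # Left unset here: review requirements and push restrictions.
        "required_pull_request_reviews": None,
        "restrictions": None,
    }


if __name__ == "__main__":
    payload = branch_protection_payload(["selenium", "playwright", "cypress"])
    print(json.dumps(payload, indent=2))
```

Keeping this payload in version control gives the gate configuration the same review history as the code it protects.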
Custom quality gates
Beyond test pass/fail, you can encode any measurable standard as a gate:
Test pass rate threshold (reject if more than 5% of tests fail):
    - name: Evaluate test pass rate
      # Only runs if the earlier test step did not already fail the job,
      # e.g. mvn test -Dmaven.test.failure.ignore=true
      run: |
        # Total number of tests across all Surefire XML reports
        TOTAL=$(python3 -c "
        import xml.etree.ElementTree as ET, glob
        roots = [ET.parse(f).getroot() for f in glob.glob('target/surefire-reports/*.xml')]
        print(sum(int(r.get('tests', 0)) for r in roots))
        ")
        # Failures plus errors across the same reports
        FAILED=$(python3 -c "
        import xml.etree.ElementTree as ET, glob
        roots = [ET.parse(f).getroot() for f in glob.glob('target/surefire-reports/*.xml')]
        print(sum(int(r.get('failures', 0)) + int(r.get('errors', 0)) for r in roots))
        ")
        # Guard against division by zero when no reports were produced
        if [ "$TOTAL" -eq 0 ]; then
          echo "::error::No test results found in target/surefire-reports"
          exit 1
        fi
        RATE=$(( (TOTAL - FAILED) * 100 / TOTAL ))
        echo "Pass rate: ${RATE}%"
        if [ "$RATE" -lt 95 ]; then
          echo "::error::Pass rate ${RATE}% is below the 95% threshold"
          exit 1
        fi

Coverage minimum (covered in the next lesson; JaCoCo enforces this via mvn jacoco:check).
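The pass-rate check shown earlier can also live in a standalone script, which is easier to test than inline shell. This is a minimal sketch assuming JUnit-style XML reports like those Surefire writes; `pass_rate` and `gate_exit_code` are hypothetical names, and the demo uses a throwaway report rather than real results.

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET


def pass_rate(report_glob: str) -> float:
    """Return the percentage of passing tests across JUnit-style XML reports."""
    total = failed = 0
    for path in glob.glob(report_glob):
        root = ET.parse(path).getroot()
        # The root may be a single <testsuite> or a <testsuites> wrapper.
        suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
        for suite in suites:
            total += int(suite.get("tests", 0))
            failed += int(suite.get("failures", 0)) + int(suite.get("errors", 0))
    if total == 0:
        raise ValueError(f"no test results found for {report_glob!r}")
    return (total - failed) * 100 / total


def gate_exit_code(report_glob: str, threshold: float = 95.0) -> int:
    """Return 0 if the pass rate meets the threshold, 1 otherwise."""
    try:
        rate = pass_rate(report_glob)
    except ValueError as exc:
        print(f"::error::{exc}")
        return 1
    print(f"Pass rate: {rate:.1f}%")
    if rate < threshold:
        print(f"::error::Pass rate {rate:.1f}% is below the {threshold:.0f}% threshold")
        return 1
    return 0


if __name__ == "__main__":
    # Demo against a throwaway report: 20 tests, 1 failure, 0 errors.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "TEST-demo.xml"), "w") as fh:
            fh.write('<testsuite tests="20" failures="1" errors="0"/>')
        print("exit code:", gate_exit_code(os.path.join(tmp, "*.xml")))
```

In a workflow step, the script would end with `sys.exit(gate_exit_code("target/surefire-reports/*.xml"))` so that the returned code becomes the step's exit code. Treating an empty result set as a failure is a deliberate choice here: a gate that passes when no tests ran is not checking anything.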
SonarQube quality gate:
    - name: SonarQube analysis
      run: mvn sonar:sonar -Dsonar.projectKey=my-project
      env:
        SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
    - uses: sonarsource/sonarqube-quality-gate-action@master
      timeout-minutes: 5
      with:
        # The Maven scanner writes its metadata here rather than the default .scannerwork/ path
        scanMetadataReportFile: target/sonar/report-task.txt
      env:
        SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}

The SonarQube gate fails the workflow if the SonarQube project's quality gate (configured in the SonarQube server) is not met. The quality gate on the server can check: code coverage, duplicated code ratio, maintainability rating, reliability rating, and security hotspots.
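When the gate fails, it helps to know which condition tripped it. The sketch below interprets a quality gate status response, assuming the shape returned by SonarQube's `api/qualitygates/project_status` Web API endpoint. The sample JSON is hand-written for illustration, not real server output, and the field names should be verified against your server version.

```python
import json


def quality_gate_passed(response_body: str) -> tuple[bool, list[str]]:
    """Interpret a project_status response: (passed, failing metric keys)."""
    status = json.loads(response_body)["projectStatus"]
    failing = [
        condition["metricKey"]
        for condition in status.get("conditions", [])
        if condition.get("status") == "ERROR"
    ]
    return status["status"] == "OK", failing


# Hand-written sample shaped like a project_status response (illustrative only).
SAMPLE = json.dumps({
    "projectStatus": {
        "status": "ERROR",
        "conditions": [
            {"metricKey": "new_coverage", "status": "ERROR",
             "actualValue": "62.0", "errorThreshold": "80"},
            {"metricKey": "new_duplicated_lines_density", "status": "OK",
             "actualValue": "1.2", "errorThreshold": "3"},
        ],
    }
})

if __name__ == "__main__":
    passed, failing = quality_gate_passed(SAMPLE)
    print("gate passed:", passed, "| failing metrics:", failing)
```

Surfacing the failing metric names in the job log saves developers a trip to the SonarQube dashboard to find out why the PR is blocked.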
Handling flaky tests gracefully
A known-flaky test that fails 15% of the time creates a dilemma: make it a required gate and it blocks PRs randomly; leave it ungated and it has no value. The right approaches:
Retry on failure (don't let a single flaky test fail the gate):
    # Playwright
    - run: npx playwright test --retries=2
    # Maven Surefire
    - run: mvn test -Dsurefire.rerunFailingTestsCount=2 -B

With retries, a test must fail on 3 consecutive attempts (the initial run plus 2 retries) before the step fails. Genuinely flaky tests (failure rate < 30%) usually pass on retry. Consistently broken tests fail on all retries and correctly block the PR.
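The effect of retries can be estimated with simple arithmetic. Assuming each attempt fails independently with the same probability (a simplification, since real flakiness is often correlated with load or timing), the chance that a flaky test blocks a run is its failure rate raised to the number of attempts. The `blocked_probability` helper is a hypothetical name for this sketch.

```python
def blocked_probability(failure_rate: float, retries: int) -> float:
    """Chance that a flaky test fails the initial run and every retry."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if retries < 0:
        raise ValueError("retries must be non-negative")
    # One initial attempt plus `retries` further attempts, all failing.
    return failure_rate ** (retries + 1)


if __name__ == "__main__":
    for rate in (0.15, 0.30):
        for retries in (0, 1, 2):
            p = blocked_probability(rate, retries)
            print(f"failure rate {rate:.0%}, retries={retries}: blocks {p:.2%} of runs")
```

Under that independence assumption, a test that fails 15% of the time blocks roughly 0.3% of runs with two retries, while a test that always fails still blocks every run.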
Quarantine tag — tag flaky tests with @flaky and exclude them from the required gate job. Run them in a separate optional job so the failures are visible but don't block merging:
    jobs:
      required-gate:
        runs-on: ubuntu-latest
        steps:
          - run: npx playwright test --grep-invert @flaky  # excludes flaky tests
      flaky-watch:
        runs-on: ubuntu-latest
        continue-on-error: true  # doesn't block PR
        steps:
          - run: npx playwright test --grep @flaky

The flaky-watch job surfaces the failure without blocking the merge. The intent is to fix the flaky test; the separate job provides visibility without friction.
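The two jobs are meant to partition the suite: every test runs in exactly one of them. The sketch below illustrates that partition on hypothetical test titles. It is a plain substring match written for illustration, not Playwright's own matching logic, and `split_by_tag` is a made-up helper.

```python
import re


def split_by_tag(titles: list[str], tag: str = "@flaky") -> tuple[list[str], list[str]]:
    """Split test titles into (required, quarantined) by whether they carry the tag."""
    pattern = re.compile(re.escape(tag))
    required = [title for title in titles if not pattern.search(title)]
    quarantined = [title for title in titles if pattern.search(title)]
    return required, quarantined


# Hypothetical test titles; the tag is written into the title itself.
TITLES = [
    "login succeeds with valid credentials",
    "cart total updates after coupon @flaky",
    "checkout sends confirmation email",
]

if __name__ == "__main__":
    required, quarantined = split_by_tag(TITLES)
    print("required gate:", required)
    print("flaky watch:  ", quarantined)
```

Because the same tag drives both the inclusion and the exclusion, removing the tag from a fixed test moves it back into the required gate with no workflow changes.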
Calibrating strictness
The strictest possible gate — 100% test pass rate, 100% coverage, zero lint warnings — sounds ideal until it paralyses the team. Gates that fire on legitimate work teach developers to work around them (force-push to a different branch, get a quick approval, bypass the check). Once the team learns to route around a gate, it provides negative value — false confidence and friction.
The practical calibration: start with one gate (all tests pass on the smoke suite), enforce it consistently for two weeks, measure whether it's catching real issues and whether it's creating false blocks. Expand gates gradually once the team trusts the first one.
⚠️ Common mistakes
- Configuring required checks without enforcing "branches must be up to date." A PR can pass all checks on Monday, sit unmerged until Friday, and then merge — ignoring everything that merged to main in between, including a test that would now catch a conflict. Enable "Require branches to be up to date" alongside required checks.
- Making flaky tests required gates without retries. A test with a 20% failure rate blocks 1 in 5 PRs for no real quality reason. Either fix the flaky test, add retries, or quarantine it — don't leave it as a required gate in its current state.
- Adding gates without owners. A SonarQube gate that fires needs someone to review and resolve it. If nobody is assigned to review SonarQube issues, the gate fails every build, the team learns to ignore it, and it provides no value. Every gate needs an owner and a process for resolution.
🎯 Practice task
Configure a complete quality gate setup — 30 minutes.
- Confirm your test workflow exits non-zero on test failure (run a deliberate failing test locally and check the exit code: mvn test; echo $? or npx playwright test; echo $?).
- Add branch protection to your test repository: require your test workflow's job as a status check. Try to merge a PR with a failing test and confirm the button is greyed out.
- Add --retries=1 (Playwright) or -Dsurefire.rerunFailingTestsCount=1 (Maven) to your test command. Find a flaky test (or deliberately add Math.random() > 0.5 → fail) and confirm it passes on retry without blocking the PR.
- Stretch: create a second job in your workflow with continue-on-error: true. Move a flaky or slow test into it. Confirm the PR can merge even when this secondary job fails.
The next lesson adds quantitative measurement to your quality gate: code coverage reporting and the threshold checks that enforce it.