Q12 of 38 · Performance

How would you use k6 thresholds to fail a CI pipeline on regression?

Performance · Mid · performance, k6, ci, thresholds, regression

Short answer

Define thresholds per metric (p95 latency, error rate, custom metrics). When any threshold is breached, k6 exits with a non-zero code, which fails the pipeline step. Persist the end-of-test summary as JSON, compare it to a baseline, and make a green k6 run one of the required CI checks for merging.

Detail

k6's threshold mechanism is the bridge between performance testing and CI gating.

How thresholds work: the thresholds block in options defines pass/fail criteria as expressions over metrics. If any expression is false when the run ends, k6 exits with code 99, its dedicated exit code for failed thresholds. Any non-zero exit fails the CI step and blocks the pipeline. No glue code, no log scraping.

Useful thresholds:

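  • http_req_duration: ['p(95)<500', 'p(99)<1500'] (latency SLO)
  • http_req_failed: ['rate<0.001'] (error budget)
  • Custom metrics: a Trend for business-level timings (e.g. checkout-end-to-end), a Counter or Rate for business-level events
  • abortOnFail: true, set per threshold in the object form { threshold: 'p(95)<500', abortOnFail: true }, stops the test as soon as that threshold is breached instead of running the full duration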
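
A minimal sketch of how the thresholds above might look in a k6 test script. The metric name checkout_e2e, the /checkout path, the load profile (10 VUs for one minute), and the 2000 ms limit are illustrative assumptions, not values from a real system. BASE_URL is expected to come from the environment.

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Trend } from 'k6/metrics';

// Custom business-level timing. The name is a placeholder;
// the second argument marks the values as durations in milliseconds.
const checkoutE2E = new Trend('checkout_e2e', true);

export const options = {
  vus: 10,          // assumed load profile
  duration: '1m',
  thresholds: {
    // Latency SLO: both expressions must hold for the run to pass.
    http_req_duration: ['p(95)<500', 'p(99)<1500'],
    // Error budget in object form, so the run can abort early on breach.
    // delayAbortEval waits for some samples before evaluating the abort.
    http_req_failed: [
      { threshold: 'rate<0.001', abortOnFail: true, delayAbortEval: '10s' },
    ],
    // Threshold on the custom metric (assumed limit).
    checkout_e2e: ['p(95)<2000'],
  },
};

export default function () {
  const started = Date.now();
  const res = http.get(`${__ENV.BASE_URL}/checkout`); // hypothetical endpoint
  check(res, { 'status is 200': (r) => r.status === 200 });
  checkoutE2E.add(Date.now() - started);
  sleep(1);
}
```

The string form is enough for most thresholds. The object form is only needed where early abort is wanted, typically on the error rate, where continuing a failing run mostly wastes runner time.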

CI integration shape:

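  1. Run k6 run --summary-export=summary.json load-test.js. (By contrast, --out json=<file> streams every raw metric sample as JSON lines, not the end-of-test summary.)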
  2. On non-zero exit, fail the step.
  3. Upload summary.json as a build artifact for trend tracking.
  4. Optionally post a PR comment with p95 / error rate / regression vs. baseline.
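
Steps 1 and 3 need a machine-readable end-of-test summary. One way to produce it, sketched here, is k6's handleSummary() hook added to the test script. The output file name and the format of the one-line stdout report are assumptions chosen for illustration.

```javascript
// Add to the k6 test script. k6 calls handleSummary() once, after the run ends.
// Defining it replaces the default end-of-test summary, so this sketch
// also prints a short line to stdout.
export function handleSummary(data) {
  const p95 = data.metrics.http_req_duration.values['p(95)'];
  const errorRate = data.metrics.http_req_failed.values.rate;

  return {
    // Each key is an output target: a file path, or 'stdout' / 'stderr'.
    'summary.json': JSON.stringify(data, null, 2),
    stdout: `p95=${p95.toFixed(1)}ms error_rate=${(errorRate * 100).toFixed(2)}%\n`,
  };
}
```

The same two numbers are what a PR comment in step 4 would typically report.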

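Pitfalls in CI: the runner's CPU is shared with other jobs, so load-generation accuracy suffers and measured latencies become noisy. Either run k6 against a stable env from a stable host (dedicated runner, k6 Cloud), or judge each run relative to the same env's previous run rather than against absolute SLOs. k6 threshold expressions only compare against fixed values, so that relative comparison has to happen outside k6.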

The smart pattern: thresholds set conservatively (catch the obvious regressions) plus baseline comparison done outside k6 (e.g. Python script comparing today's p95 to last week's). The k6 threshold gates the build; the baseline diff catches drift before it becomes a regression.

// EXAMPLE

.github/workflows/perf.yml

name: Performance regression
on: [pull_request]
jobs:
  k6:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: grafana/setup-k6-action@v1
      - name: Run k6
        # A breached threshold makes k6 exit non-zero, which fails this step.
        run: k6 run --summary-export=summary.json tests/load.js
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: k6-summary
          path: summary.json

// WHAT INTERVIEWERS LOOK FOR

Knowing the threshold mechanism and exit-code behaviour, awareness that CI runners are noisy load generators, and the discipline of also tracking trend, not just pass/fail.

// COMMON PITFALL

Setting thresholds that fail randomly because the CI runner is overloaded — devs disable the check rather than fix the noise, and the gate becomes ceremonial.