Testing in CI/CD Pipelines

A modern engineering team ships dozens or hundreds of changes a day. The only way to do that safely is to automate the path from commit to production — the Continuous Integration / Continuous Delivery pipeline. Testing is the heart of that pipeline. Get it right and the team moves fast with confidence; get it wrong and every deploy becomes a coin flip.

What a typical pipeline looks like

The simplest healthy pipeline runs a series of gates, each of which has to pass before the change moves on:

1. Commit. A developer pushes a branch and opens a pull request.

2. Build. A clean build of the project runs — code compiles, dependencies resolve, container images get built.

3. Static checks. Linters, type checkers, and security scanners run. They are cheap and fast, and they catch a surprising amount of trouble before any test runs.

4. Unit tests. The smallest, fastest tests run first. They isolate individual functions or classes and execute in seconds. With parallel execution, a well-kept unit suite can run thousands of tests in under a minute.

5. Integration tests. Tests that exercise multiple components together — a service plus its database, two services talking over an API. Slower than unit tests, faster than end-to-end.

6. Deploy to staging. The build is pushed to a staging environment that mirrors production.

7. End-to-end tests. Tools like Cypress or Playwright drive a real browser through real user flows on staging. This is the slowest of the automated layers but the closest to real user experience.

8. Deploy to production. Often gated by a manual approval, sometimes fully automated for low-risk services. Modern teams favour gradual rollout — canary releases, feature flags, blue/green deploys.

9. Post-deploy checks. Smoke tests run against production after deploy. Monitoring watches error rates and latency. If something goes wrong, automated rollback kicks in.
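
The gating behaviour described in the steps above can be sketched in a few lines. This is a minimal illustration, not how any particular CI system is implemented: the Gate class, the run_pipeline function, and the lambda stand-ins for real build and test commands are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Gate:
    """One pipeline stage: a name plus a callable that returns True on success."""
    name: str
    run: Callable[[], bool]


def run_pipeline(gates: List[Gate]) -> Tuple[bool, List[str]]:
    """Run gates in order and stop at the first failure.

    Returns (succeeded, names of the gates that were actually executed).
    """
    executed: List[str] = []
    for gate in gates:
        executed.append(gate.name)
        if not gate.run():
            # A failed gate blocks every stage after it.
            return False, executed
    return True, executed


# Hypothetical stand-ins for real commands, ordered cheap-to-expensive.
gates = [
    Gate("build", lambda: True),
    Gate("static checks", lambda: True),
    Gate("unit tests", lambda: False),  # simulate a failing unit test
    Gate("integration tests", lambda: True),
    Gate("deploy to staging", lambda: True),
]

ok, executed = run_pipeline(gates)
print(ok, executed)
```

Because the unit-test gate fails in this example, the integration tests and the staging deploy never run, which is the point of ordering the gates from cheapest to most expensive.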

The pyramid you keep hearing about

The pipeline reflects the famous test pyramid: many fast unit tests at the bottom, a moderate number of integration tests in the middle, a small number of end-to-end tests at the top. Faster tests run earlier in the pipeline because feedback within seconds is cheap; feedback after a 30-minute end-to-end run is expensive. A flipped pyramid — heavy on E2E, light on unit — is one of the surest ways to slow a team down.
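
One rough way to check the shape of a suite is to count tests per layer and compare. The sketch below assumes you can obtain those counts from your test runner. The layer names and the strict "unit > integration > e2e" rule are illustrative assumptions, not an industry standard.

```python
from typing import Dict


def layer_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each layer's share of the total number of tests."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tests counted")
    return {layer: n / total for layer, n in counts.items()}


def is_pyramid(counts: Dict[str, int]) -> bool:
    """True when unit tests outnumber integration tests, which outnumber E2E tests."""
    return counts["unit"] > counts["integration"] > counts["e2e"]


# Hypothetical counts for two suites.
healthy = {"unit": 1200, "integration": 150, "e2e": 20}
flipped = {"unit": 40, "integration": 60, "e2e": 300}

print(is_pyramid(healthy), is_pyramid(flipped))
print(layer_shares(flipped))
```

Counts are only a proxy. A suite can pass this check and still spend most of its wall-clock time in a handful of slow E2E tests, so run time per layer is worth tracking as well.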

What the QA engineer owns

Modern QA engineers are usually responsible for the upper layers of the pyramid: the integration and end-to-end tests, plus exploratory testing on staging before promotion. Specifically:

  • Maintaining the E2E suite. Adding tests for new flows, fixing flaky tests, tuning timeouts, keeping selectors stable. The E2E suite is a living thing — it deteriorates without active care.
  • Owning the staging environment. Making sure the data is realistic, the integrations are wired up, and the environment behaves like production.
  • Defining "ready to deploy." What checks must pass? Which failures block, which warn? Where is the manual approval gate?
  • Smoke tests in production. A short, fast suite that runs after every deploy to verify the most critical paths still work — login, checkout, search.
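
A "ready to deploy" definition is easier to discuss once it is written down as data. The following is a minimal sketch under stated assumptions: the check names, the POLICY table, and the ready_to_deploy function are hypothetical, and a real team would source the results from its CI system.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Severity(Enum):
    BLOCK = "block"  # a failure stops the deploy
    WARN = "warn"    # a failure is reported but does not stop the deploy


# Hypothetical policy: which checks block and which only warn.
POLICY: Dict[str, Severity] = {
    "unit": Severity.BLOCK,
    "integration": Severity.BLOCK,
    "e2e": Severity.BLOCK,
    "accessibility-audit": Severity.WARN,
}


def ready_to_deploy(
    results: Dict[str, bool],
    policy: Dict[str, Severity] = POLICY,
) -> Tuple[bool, List[str], List[str]]:
    """Return (ready, blocking failures, warnings) for a set of check results."""
    blockers: List[str] = []
    warnings: List[str] = []
    for check, severity in policy.items():
        # A check with no reported result counts as failed, never as passed.
        if results.get(check, False):
            continue
        if severity is Severity.BLOCK:
            blockers.append(check)
        else:
            warnings.append(check)
    return (not blockers, blockers, warnings)


print(ready_to_deploy({"unit": True, "integration": True, "e2e": True,
                       "accessibility-audit": False}))
print(ready_to_deploy({"unit": True, "integration": True}))
```

Treating a missing result as a failure is a deliberate design choice in this sketch: a check that never ran should not be able to let a change through.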

Common pipeline failures

  • Slow pipelines. A pipeline that takes more than 15–20 minutes destroys flow and pushes developers to skip checks. Parallelisation, caching, and pruning redundant or flaky tests are the standard fixes.
  • Flaky pipelines. Tests that pass sometimes and fail sometimes erode trust. Once developers start re-running until green, the pipeline is no longer a quality gate. Track flake rates and quarantine offenders.
  • Long feedback loops. A defect found 40 minutes into the pipeline is much more expensive than one found in 40 seconds. Order tests cheap-to-expensive.
  • Pipelines that lie. Tests "pass" without actually exercising the change. This is common when the test infrastructure is broken but defaults to green, for example when a runner collects zero tests and still reports success.
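
Tracking flake rates, as suggested in the list above, needs a working definition of "flaky". The sketch below assumes one possible definition: a test is flaky on a commit if it both passed and failed on that same commit. The run-record format, the function names, and the 5% threshold are all hypothetical choices for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Run = Tuple[str, str, bool]  # (test name, commit id, passed)


def flake_rates(runs: Iterable[Run]) -> Dict[str, float]:
    """Per test: share of commits on which it both passed and failed."""
    outcomes: Dict[str, Dict[str, Set[bool]]] = defaultdict(lambda: defaultdict(set))
    for test, commit, passed in runs:
        outcomes[test][commit].add(passed)
    rates: Dict[str, float] = {}
    for test, by_commit in outcomes.items():
        # Two distinct outcomes on one commit means the code did not change
        # but the result did.
        flaky_commits = sum(1 for seen in by_commit.values() if len(seen) == 2)
        rates[test] = flaky_commits / len(by_commit)
    return rates


def quarantine(rates: Dict[str, float], threshold: float = 0.05) -> List[str]:
    """Names of tests whose flake rate exceeds the threshold, sorted."""
    return sorted(test for test, rate in rates.items() if rate > threshold)


# Hypothetical run history, including one re-run of test_checkout on commit a1.
runs = [
    ("test_login", "a1", True),
    ("test_login", "a2", True),
    ("test_checkout", "a1", False),
    ("test_checkout", "a1", True),
    ("test_checkout", "a2", True),
    ("test_search", "a1", False),
    ("test_search", "a1", False),
]

rates = flake_rates(runs)
print(rates)
print(quarantine(rates))
```

Note that test_search fails consistently on the same commit, so under this definition it is a real failure and not a flake. It should block the pipeline, not be quarantined.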

What this means for your career

Even if you start as a manual tester, exposure to CI/CD pipelines is one of the highest-leverage skills you can develop. Understanding what runs, when, and why makes every conversation with developers easier and opens the door to SDET roles. You do not have to be the person writing the YAML — but you do need to know what the YAML does.

The next chapter takes a step back from process and dives into the types of testing themselves: functional, non-functional, smoke, sanity, regression, and beyond.
