Test Reports in CI — JUnit XML, HTML, Allure


A CI pipeline that runs tests and silently exits with a red light is incomplete. The failure message, the stack trace, the screenshot of the broken UI, the trend showing this test has been flaky for three weeks — that information is what turns a red build into a fixed bug. Test reporting bridges the gap between "the build failed" and "here's exactly what went wrong and where."

Three formats for three audiences

Test reporting in CI serves three audiences: machines, humans in a hurry, and humans who need depth. Each needs a different format.

JUnit XML is for machines. Your CI server reads it to determine pass/fail counts, track trends, and annotate pull requests. It's not readable by humans, and that's fine — it's not meant to be.

HTML reports are for humans in a hurry. A developer who sees a failing PR status check clicks through to the report, finds the failure in seconds, reads the assertion message, and opens the failing test file. The report answers "what failed and why" without requiring access to CI logs.

Allure is for humans who need depth. It adds history across builds, failure categories, test timeline visualisation, behaviour-driven views (when @Epic/@Feature annotations are used), and per-test attachments. It's the right choice for teams that review test health as a regular practice.

JUnit XML — the universal CI format

Every major test framework can emit JUnit XML, either by default or with a single flag, and every major CI tool knows how to read it:

<testsuite name="LoginTest" tests="5" failures="1" errors="0" time="12.5">
    <testcase classname="LoginTest" name="testValidLogin" time="2.3"/>
    <testcase classname="LoginTest" name="testInvalidLogin" time="1.8">
        <failure message="Expected error message not displayed">
            org.opentest4j.AssertionFailedError: Expected error...
        </failure>
    </testcase>
</testsuite>

Maven Surefire writes this to target/surefire-reports/ automatically. pytest writes it when you pass --junitxml=results.xml. Jest writes it with --reporters=default --reporters=jest-junit once the jest-junit package is installed.
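To make "machines parse it" concrete, here is a minimal sketch of the kind of summary a CI tool derives from such a file. It uses only Python's standard library. The summarize function and the SAMPLE string are illustrative names, not part of any CI product, and the sample's tests count is set to 2 so it matches the two cases shown.

```python
import xml.etree.ElementTree as ET

# Sample report modelled on the excerpt above (hypothetical data, counts adjusted to two cases).
SAMPLE = """<testsuite name="LoginTest" tests="2" failures="1" errors="0" time="4.1">
    <testcase classname="LoginTest" name="testValidLogin" time="2.3"/>
    <testcase classname="LoginTest" name="testInvalidLogin" time="1.8">
        <failure message="Expected error message not displayed">
            org.opentest4j.AssertionFailedError: Expected error...
        </failure>
    </testcase>
</testsuite>"""


def summarize(xml_text):
    """Return (total, failed, failures); failures is a list of (test id, message)."""
    root = ET.fromstring(xml_text)
    total = 0
    failures = []
    # iter() finds <testcase> elements under a bare <testsuite> root or a <testsuites> wrapper.
    for case in root.iter("testcase"):
        total += 1
        # <failure> is an assertion failure, <error> an unexpected exception; both mean "not passed".
        problem = case.find("failure")
        if problem is None:
            problem = case.find("error")
        if problem is not None:
            test_id = f"{case.get('classname')}.{case.get('name')}"
            failures.append((test_id, problem.get("message", "")))
    return total, len(failures), failures


total, failed, failures = summarize(SAMPLE)
print(f"{total} tests, {failed} failed")
for test_id, message in failures:
    print(f"FAILED {test_id}: {message}")
```

Real CI integrations do considerably more (skipped tests, timing, stack traces), but the core is this walk over testcase elements.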

In GitHub Actions, parse JUnit XML and annotate the PR with individual test failures:

- uses: dorny/test-reporter@v1
  if: always()
  with:
    name: Test Results
    path: '**/surefire-reports/*.xml'
    reporter: java-junit
    fail-on-error: true

The dorny/test-reporter action reads the XML and renders a formatted test summary on the PR — test counts, failure messages, and links to the failing test — without requiring anyone to download a ZIP file.

In Jenkins, the JUnit plugin, which most installations already include, does the same thing in post { always { } }:

post {
    always {
        junit 'target/surefire-reports/*.xml'
    }
}

One line. Jenkins adds a trend graph, a per-test history view, and marks the build UNSTABLE if any tests fail.

HTML reports in CI

HTML reports need to be hosted somewhere accessible — they're useless as a raw ZIP on a developer's laptop.

GitHub Actions artifact (simplest):

- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: playwright-report
    path: playwright-report/
    retention-days: 14

The report appears as a downloadable ZIP on the workflow run page. Simple, always available, no infrastructure needed.

Jenkins HTML Publisher (covered in Chapter 3): serves the report directly from Jenkins with a sidebar link.

GitHub Pages (permanent, shareable URLs): useful for Allure reports that should be accessible without Jenkins or GitHub Actions access:

- uses: peaceiris/actions-gh-pages@v4
  if: always()
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    publish_dir: allure-history
    destination_dir: reports/${{ github.run_id }}

Each run gets a unique URL. Link to it from the Slack notification or PR comment so anyone can access it without GitHub Actions access.
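As a rough sketch of how a notification script might build that link, the helper below assembles the per-run URL. It assumes a default project site at https://<owner>.github.io/<repo>/ with no custom domain, and the reports/ prefix from the destination_dir above. The report_url function is a hypothetical helper, while GITHUB_REPOSITORY and GITHUB_RUN_ID are environment variables that GitHub Actions sets for every job.

```python
import os


def report_url(repository, run_id, base_dir="reports"):
    """Build the GitHub Pages URL for one run's report.

    Assumes a project site at https://<owner>.github.io/<repo>/ (no custom domain).
    """
    owner, repo = repository.split("/", 1)
    return f"https://{owner}.github.io/{repo}/{base_dir}/{run_id}/"


if __name__ == "__main__":
    # The fallback values are placeholders for running the script outside CI.
    repository = os.environ.get("GITHUB_REPOSITORY", "example-org/example-repo")
    run_id = os.environ.get("GITHUB_RUN_ID", "123456789")
    print(report_url(repository, run_id))
```

The printed URL can then be pasted into whatever Slack or PR-comment step the workflow already has.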

Allure in CI

Allure is worth the extra setup for teams running more than ~50 tests, because the trend history and failure categorisation are invaluable for identifying flakiness:

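- name: Run tests
  run: mvn test -B

# Fetch previously published reports so the new run extends the history.
# Assumes reports live on the gh-pages branch; the first run has none, hence continue-on-error.
- name: Get Allure history
  uses: actions/checkout@v4
  if: always()
  continue-on-error: true
  with:
    ref: gh-pages
    path: gh-pages

- name: Generate Allure report
  uses: simple-elf/allure-report-action@master
  if: always()
  with:
    allure_results: target/allure-results
    gh_pages: gh-pages
    allure_history: allure-history
    keep_reports: 20

- name: Deploy to GitHub Pages
  uses: peaceiris/actions-gh-pages@v4
  if: always()
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    publish_dir: allure-history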

The keep_reports: 20 parameter retains the last 20 builds' history on the GitHub Pages site. Each new run updates the trend graph with the latest results. A test that's been failing for 4 builds in a row is much easier to spot in Allure's history view than in individual CI logs.
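To illustrate why history matters, here is a small sketch of two questions that only cross-build data can answer: how long has a test been failing, and does its result keep flipping? This is not Allure's own algorithm. The function names, the flip threshold, and the sample data are all assumptions made for the example.

```python
def current_failure_streak(history):
    """Count how many of the most recent builds failed in a row.

    history is ordered oldest to newest; each entry is "passed" or "failed".
    """
    streak = 0
    for status in reversed(history):
        if status != "failed":
            break
        streak += 1
    return streak


def is_flaky(history):
    """Rough heuristic: the result changed between consecutive builds at least twice."""
    flips = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return flips >= 2


# Hypothetical per-test history across six builds, oldest first.
runs = {
    "testCheckout": ["passed", "passed", "failed", "failed", "failed", "failed"],
    "testSearch": ["passed", "failed", "passed", "passed", "failed", "passed"],
}

for name, history in runs.items():
    print(name, current_failure_streak(history), is_flaky(history))
```

Here testCheckout shows a streak of four with a single flip (a real regression), while testSearch has no current streak but four flips (a flakiness candidate). Neither pattern is visible from one build's log.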

For Playwright, set the Allure reporter in playwright.config.ts:

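reporter: [
  // The option name depends on the allure-playwright version: newer releases use resultsDir instead of outputFolder.
  ['allure-playwright', { outputFolder: 'allure-results' }],
  ['html', { open: 'never' }]
],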

Both reporters run simultaneously — Allure results for CI history, HTML for quick local review.

Test report formats: right tool for the right audience

JUnit XML

  • Audience: CI servers, tooling

    Machines parse it — humans don't read it

  • Generated by: Surefire, pytest, Jest

    Automatic in Surefire; one flag or reporter package in pytest and Jest

  • Used by: Jenkins, GitHub Actions, GitLab CI

    Universal format — every CI tool reads it

  • Shows: pass/fail counts, timing, error messages

    No screenshots, no history, no trends

HTML Reports

  • Audience: developers debugging failures

    Click through from failed status check

  • Generated by: Playwright, Cucumber, ExtentReports

    Framework-specific format and quality

  • Hosted via: artifact download or Jenkins sidebar

    Available for the lifetime of the artifact

  • Shows: test list, failure details, screenshots

    No cross-build history or trend data

Allure

  • Audience: QA team reviewing test health

    Best for regular test health reviews

  • Generated by: allure-testng, allure-playwright

    Requires adding a dependency; annotations are optional

  • Hosted via: GitHub Pages or Allure TestOps

    Permanent URL with build history

  • Shows: trends, timelines, flaky test detection

    Cross-build history is the key advantage

⚠️ Common mistakes

  • Only uploading reports on failure. A report from a green run has value — it confirms exactly what passed, documents the test run for auditing, and provides baseline data for flakiness detection. Use if: always() for report uploads, not if: failure().
  • Not publishing Allure history. Allure without history is just a fancy HTML report. The trend graph, flakiness detection, and failure categorisation only appear once you preserve results across builds. The keep_reports parameter and the GitHub Pages deploy step are not optional extras — they're what makes Allure worth using.
  • Expecting JUnit XML to replace readable failure messages. The XML records the failure. The human-readable test code and assertion message are what explain it. Write clear assertion messages (assertThat(status).as("Order status after checkout").isEqualTo("CONFIRMED")) so the XML failure element contains something actionable, not just AssertionError.
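The last point applies beyond AssertJ. As a sketch of the same idea in Python, where pytest records the assertion message in the report, the check below attaches context to a plain assert. The check_order_status function and the status values are hypothetical.

```python
def check_order_status(status):
    # The text after the comma becomes the assertion message recorded for the failure.
    assert status == "CONFIRMED", (
        f"Order status after checkout: expected 'CONFIRMED' but was {status!r}"
    )


try:
    check_order_status("PENDING")
except AssertionError as exc:
    print(exc)
```

Without the message, the same failure would report only the bare comparison, leaving the reader to work out which status was being checked and at what point in the flow.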

🎯 Practice task

Set up all three report types for your project — 40 minutes.

  1. JUnit XML: confirm your framework generates it (target/surefire-reports/ for Maven, the path you pass to --junitxml for pytest, such as results.xml). Add dorny/test-reporter@v1 to your GitHub Actions workflow with if: always(). Push a failing test and confirm the failure appears as a PR annotation.
  2. HTML artifact: add actions/upload-artifact@v4 pointing at your HTML report directory. Push. Download the artifact from the workflow run page and open it.
  3. Allure: add the Allure dependency for your framework, run tests, confirm target/allure-results/ is populated. Add the Allure GitHub Action and deploy to GitHub Pages. Open the published URL.
  4. Stretch: make a test fail on two consecutive runs. Open the Allure history view. Confirm the failure appears across both runs in the trend graph.

The next lesson turns reporting into enforcement: quality gates that block merging when the criteria you set aren't met.
