Maven Surefire Plugin and Parallel Execution


mvn test is what Jenkins runs. mvn test is what GitHub Actions runs. The plugin that actually runs your tests during that command is the Maven Surefire Plugin: it sits between Maven's build lifecycle and the TestNG runtime, picking up testng.xml, forking JVMs, and collecting reports. This lesson is the working knowledge: how Surefire is configured, how to make tests run in parallel via TestNG, and the ThreadLocal<WebDriver> pattern that lets parallel tests run without stomping on each other. By the end you'll have a suite that runs in roughly a quarter of its serial time without flakiness.

Surefire — the runner Maven uses

You configured Surefire briefly in chapter 1. The fuller version, with system-property pass-through, looks like:

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-surefire-plugin</artifactId>
            <version>3.2.5</version>
            <configuration>
                <suiteXmlFiles>
                    <suiteXmlFile>src/test/resources/${suiteFile}</suiteXmlFile>
                </suiteXmlFiles>
                <systemPropertyVariables>
                    <browser>${browser}</browser>
                    <headless>${headless}</headless>
                    <env>${env}</env>
                    <grid.url>${grid.url}</grid.url>
                </systemPropertyVariables>
                <argLine>-Xmx2048m</argLine>
            </configuration>
        </plugin>
    </plugins>
</build>

Three configuration items earn their place:

  • <suiteXmlFiles>: points Surefire at the TestNG suite file. The ${suiteFile} placeholder is filled from mvn test -DsuiteFile=smoke.xml (or from the default you set in <properties>).
  • <systemPropertyVariables>: sets system properties in the test JVM from Maven properties. Surefire forwards command-line -D flags on its own in most setups, but a default declared in <properties> only reaches System.getProperty("browser") inside your tests if it is mapped here.
  • <argLine>: JVM args for the forked test process. -Xmx2048m caps the heap at 2 GB. Without it the JVM picks a default from the machine's memory (commonly a quarter of physical RAM), which can be too small on a modest CI agent when the suite is large or screenshots are heavy.
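The defaults mentioned in the first bullet live in the pom's <properties> block. A minimal sketch follows; the property names come from the Surefire configuration above, while every value (the suite file name, the staging environment, the grid address) is a hypothetical placeholder to replace with your own:

```xml
<properties>
    <!-- Hypothetical defaults; a -D flag on the command line overrides each one -->
    <suiteFile>testng.xml</suiteFile>
    <browser>chrome</browser>
    <headless>false</headless>
    <env>staging</env>
    <grid.url>http://localhost:4444</grid.url>
</properties>
```

With these in place, a bare mvn test resolves every ${...} placeholder, and mvn test -Dbrowser=firefox changes only the browser.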

Run with overrides:

mvn test -DsuiteFile=cross-browser.xml -Dbrowser=firefox -Dheadless=true

Each -D flag travels Maven → Surefire → JVM → System.getProperty(...).
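On the receiving end, tests usually read these properties through one small helper rather than calling System.getProperty all over the codebase. The sketch below assumes a hypothetical TestConfig class (not part of this course's framework) and hypothetical fallbacks of "chrome" and "false" for when a property is missing or blank:

```java
// Hypothetical helper: reads the properties forwarded to the test JVM, with fallbacks.
final class TestConfig {

    private TestConfig() {}

    // Browser name from -Dbrowser, or "chrome" when the property is unset or blank
    static String browser() {
        return valueOrDefault("browser", "chrome");
    }

    // Headless flag from -Dheadless; anything other than "true" (case-insensitive) is false
    static boolean headless() {
        return Boolean.parseBoolean(valueOrDefault("headless", "false"));
    }

    private static String valueOrDefault(String key, String fallback) {
        String value = System.getProperty(key);
        return (value == null || value.trim().isEmpty()) ? fallback : value.trim();
    }

    public static void main(String[] args) {
        System.out.println(browser() + " headless=" + headless());
    }
}
```

A driver factory could then call TestConfig.browser() and TestConfig.headless() instead of hard-coding its arguments.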

Parallelism — turn it on in testng.xml

Surefire forks one JVM per Maven module by default. Within that JVM, TestNG decides whether to run tests sequentially or in parallel. The parallel attribute on <suite> is the switch:

<suite name="Parallel" parallel="methods" thread-count="4">
    <test name="All">
        <packages>
            <package name="com.mycompany.tests.tests"/>
        </packages>
    </test>
</suite>

Four valid parallel values:

  • methods — every @Test method on its own thread. Maximum concurrency, lowest wall-clock; demands the tightest thread-safety.
  • classes — methods in the same class run sequentially; different classes run in parallel. A reasonable middle ground when methods within a class share state.
  • tests — each <test> block on its own thread. Same idea as cross-browser tests in chapter 7.
  • instances — different @Test-class instances in parallel. Niche; rarely used.

thread-count caps the pool. With parallel="methods" thread-count="4", at most four tests run at once, however many tests the suite contains.

For a 200-test suite running 30 minutes serial, parallel="methods" thread-count="4" typically lands at ~8 minutes. The math isn't perfectly linear (some tests are slower, JVM warmup steals from the first 30 seconds), but the savings are real.
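The arithmetic behind that estimate can be sketched in a few lines. This is a deliberately crude model: it assumes the work divides evenly across threads and lumps warmup and scheduling cost into a single overhead figure. The 0.5-minute overhead in the example is a hypothetical number chosen for illustration, not a measurement:

```java
final class ParallelEstimate {

    private ParallelEstimate() {}

    // Ideal wall-clock time: serial time split evenly across threads, plus a fixed overhead
    static double estimateMinutes(double serialMinutes, int threads, double overheadMinutes) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be >= 1");
        }
        return serialMinutes / threads + overheadMinutes;
    }

    public static void main(String[] args) {
        System.out.println(estimateMinutes(30.0, 4, 0.0)); // prints 7.5 (perfectly linear case)
        System.out.println(estimateMinutes(30.0, 4, 0.5)); // prints 8.0 (with assumed overhead)
    }
}
```

Real suites drift from this model when a few slow tests dominate, because the run cannot finish before its longest test does.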

What parallel actually looks like

Four threads, four tests, four browser processes. As soon as one finishes, the next queued test grabs that thread. The constraint that makes this possible: each test has its own WebDriver. Sharing a single WebDriver field across threads is the single fastest way to break a parallel suite.
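The "free thread grabs the next queued test" behaviour is the same one a fixed-size thread pool gives you in plain Java. The sketch below is an analogy only, not TestNG's implementation: PoolDemo is a hypothetical class, and a 50 ms sleep stands in for browser work. It records the highest number of tasks that were ever running at the same moment:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PoolDemo {

    private PoolDemo() {}

    // Runs `tasks` fake tests on `threads` workers; returns the peak number running at once
    static int peakConcurrency(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);   // remember the highest count seen
                try {
                    Thread.sleep(50);                    // stand-in for browser work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }

        pool.shutdown();                                 // no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak = " + peakConcurrency(4, 12));
    }
}
```

With four threads and twelve tasks the peak never exceeds four: the remaining tasks wait in the queue until a worker frees up.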

The ThreadLocal pattern

@BeforeMethod creates the WebDriver. Where does that driver live? In a non-parallel suite, a protected WebDriver driver; field on BaseTest works fine. TestNG creates one instance of each test class and reuses it for every @Test method in that class, but the methods run one at a time, so @BeforeMethod simply reassigns the field before each test.

In a parallel-methods suite, the wires get tangled. Several methods of that same instance now run on different threads at once, and every @BeforeMethod writes to the same field. Each thread needs its own driver, and the convention is to use ThreadLocal<WebDriver>:

package com.mycompany.tests.base;
 
import org.openqa.selenium.WebDriver;
 
public final class DriverManager {
 
    private static final ThreadLocal<WebDriver> DRIVERS = new ThreadLocal<>();
 
    private DriverManager() {}
 
    public static void setDriver(WebDriver driver) {
        DRIVERS.set(driver);
    }
 
    public static WebDriver getDriver() {
        return DRIVERS.get();
    }
 
    public static void quitDriver() {
        WebDriver d = DRIVERS.get();
        if (d != null) {
            try {
                d.quit();
            } finally {
                DRIVERS.remove();   // critical: runs even if quit() throws, so no stale reference survives
            }
        }
    }
}

ThreadLocal<T> gives every thread its own private value. Thread A's DRIVERS.get() returns a different WebDriver than thread B's — without any locking, without any contention.
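The isolation is easy to see without Selenium. In the sketch below a String stands in for WebDriver, and ThreadLocalDemo is a hypothetical class written for this illustration. Two threads write to the same static ThreadLocal, wait until both writes have happened, and only then read back:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

final class ThreadLocalDemo {

    // A String stands in for WebDriver so the sketch runs without Selenium
    private static final ThreadLocal<String> DRIVERS = new ThreadLocal<>();

    private ThreadLocalDemo() {}

    // Starts two threads and records what each one reads back from DRIVERS
    static Map<String, String> run() {
        Map<String, String> seen = new ConcurrentHashMap<>();
        CountDownLatch bothSet = new CountDownLatch(2);

        Runnable work = () -> {
            String name = Thread.currentThread().getName();
            DRIVERS.set("driver-for-" + name);
            bothSet.countDown();
            try {
                bothSet.await();                 // read only after both threads have written
                seen.put(name, DRIVERS.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                DRIVERS.remove();
            }
        };

        Thread a = new Thread(work, "A");
        Thread b = new Thread(work, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Thread A reads back driver-for-A and thread B reads back driver-for-B, even though both wrote through the same static field before either one read.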

BaseTest plugs into it:

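package com.mycompany.tests.base;

import org.openqa.selenium.WebDriver;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;

public abstract class BaseTest {

    @BeforeMethod
    public void createDriver() {
        WebDriver driver = createDriverFor("chrome", true);   // factory from chapter 7
        DriverManager.setDriver(driver);
    }

    // alwaysRun = true: tear down even if an earlier configuration method failed or was skipped
    @AfterMethod(alwaysRun = true)
    public void quitDriver() {
        DriverManager.quitDriver();
    }

    protected WebDriver getDriver() {
        return DriverManager.getDriver();
    }
}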

Tests now access the driver via getDriver() instead of a driver field:

@Test
public void shouldLogIn() {
    new LoginPage(getDriver()).navigateTo().loginAs("standard_user", "secret_sauce");
}

Page objects take the driver in their constructors as before. The plumbing is invisible to the test author — they just call getDriver().

Why DRIVERS.remove() matters

ThreadLocal's sneakiest gotcha: a value that is never removed stays attached to its thread. TestNG's worker threads are pooled and reused across tests, so if thread 7 still holds the dead driver from the previous test, that reference lingers for as long as the thread lives. The next test's setup normally overwrites it, but if setup fails before calling setDriver(), DRIVERS.get() returns the old, already-quit driver instead of null. Always call .remove() in teardown.
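The stale-value effect can be reproduced with a single pooled thread. In this sketch StaleThreadLocalDemo is a hypothetical class and a String again stands in for the driver. The first task plays a test that set its driver; the second task plays a later test on the same thread that reads before anything has been set:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class StaleThreadLocalDemo {

    private static final ThreadLocal<String> DRIVERS = new ThreadLocal<>();

    private StaleThreadLocalDemo() {}

    // Returns what a second task sees when it reuses the first task's thread
    static String secondTaskSees(boolean removeInTeardown) {
        ExecutorService pool = Executors.newSingleThreadExecutor();   // one thread, reused

        Runnable firstTest = () -> {
            DRIVERS.set("quit-driver");          // the driver this "test" used and then quit
            if (removeInTeardown) {
                DRIVERS.remove();
            }
        };
        Callable<String> secondTest = DRIVERS::get;   // reads before any new driver is set

        try {
            pool.submit(firstTest).get();
            return pool.submit(secondTest).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // prints quit-driver
        System.out.println(secondTaskSees(true));  // prints null
    }
}
```

A null is the better failure mode: it fails fast at the first getDriver() call, whereas a quit driver fails later with a session error that points away from the real cause.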

Method ordering with parallel — don't depend on it

Sequential TestNG runs methods in a fixed order: by priority, and by default by method name among methods of equal priority (not by their position in the source file). Parallel TestNG runs them as soon as a thread is free, in whatever order the scheduler picks. Tests that depend on each other (@Test(dependsOnMethods = ...)) still respect the dependency, but unrelated tests run in an unpredictable order.

The discipline this enforces: every test must set up its own state and tear it down. Tests that "happen to work because the previous test logged in" break the moment they're parallelised. If you've followed the patterns from chapters 5 and 6 — @BeforeMethod driver setup, fresh page objects per test — you're already there.

Surefire's forkCount — process-level parallelism

parallel="methods" runs threads inside one JVM. Surefire also offers process-level parallelism via forkCount:

<configuration>
    <forkCount>3</forkCount>
    <reuseForks>false</reuseForks>
</configuration>

Up to three JVMs run at once, and Surefire hands test classes out among them. With reuseForks set to false, each test class gets a fresh JVM that exits when the class finishes. This is heavier (more memory, more startup time) but provides true isolation: a JVM crash in one fork doesn't affect the others. Most teams don't need it; TestNG's thread-level parallelism is sufficient. Reach for forkCount only when you have classes that pollute static state and can't be cleanly isolated otherwise.

A complete parallel test class

package com.mycompany.tests.tests;
 
import com.mycompany.tests.base.BaseTest;
import com.mycompany.tests.pages.InventoryPage;
import com.mycompany.tests.pages.LoginPage;
import org.testng.Assert;
import org.testng.annotations.Test;
 
public class ParallelLoginTest extends BaseTest {
 
    @Test
    public void shouldLogInAsStandardUser() {
        InventoryPage inv = new LoginPage(getDriver()).navigateTo()
            .loginAs("standard_user", "secret_sauce");
        Assert.assertEquals(inv.productCount(), 6);
    }
 
    @Test
    public void shouldLogInAsProblemUser() {
        InventoryPage inv = new LoginPage(getDriver()).navigateTo()
            .loginAs("problem_user", "secret_sauce");
        Assert.assertEquals(inv.productCount(), 6);
    }
 
    @Test
    public void shouldFailLoginForLockedOutUser() {
        LoginPage login = new LoginPage(getDriver()).navigateTo();
        login.fillUsername("locked_out_user");
        login.fillPassword("secret_sauce");
        login.submitExpectingError();
        Assert.assertTrue(login.errorText().contains("locked out"));
    }
}

In a parallel="methods" thread-count="3" suite, all three run simultaneously, each with its own driver. Run twenty times in a row — twenty greens. That's the property parallel suites should hold.

The Failsafe plugin — for integration tests

Surefire's sibling, maven-failsafe-plugin, is meant for integration tests:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-failsafe-plugin</artifactId>
    <executions>
        <execution>
            <goals>
                <goal>integration-test</goal>
                <goal>verify</goal>
            </goals>
        </execution>
    </executions>
</plugin>

The key difference: Failsafe's integration-test goal doesn't fail the build when a test fails; it only records the result. The separate verify goal, bound to the later verify phase, reads those results and fails the build. This matters when post-test cleanup (Docker shutdown, log archive) is bound to the post-integration-test phase and needs to run regardless of test outcome. Surefire fails the build in the test phase itself; Failsafe defers the failure. Many teams use both: Surefire for unit tests, Failsafe for Selenium suites that depend on running services.

The Selenium tool entry covers driver concerns; the TestNG cheat sheet covers parallelism attributes and the Surefire/Failsafe configuration that wraps them.

Comparison with Cypress and Playwright

// Cypress: parallelism handled at CI level (split spec files across runners)
// Cypress Cloud or cypress-parallel; serial within a single runner

// Playwright: built-in parallelism via workers
// playwright.config.ts:
import { defineConfig } from '@playwright/test';

export default defineConfig({
    workers: 4,    // four worker processes; test files are spread across them
});

Playwright's worker model is the cleanest of the three: out of the box, test files run in parallel across worker processes up to the worker count, and fullyParallel: true extends that to tests within a single file. Cypress traditionally relied on splitting specs across CI runners. Selenium's parallelism is the most explicit (you opt in via testng.xml) and the most demanding of test-side discipline (ThreadLocal driver, isolated state). The trade-off: Selenium gives you complete control; the modern frameworks give you sensible defaults.

⚠️ Common mistakes

  • A WebDriver driver; field shared across tests, whether static or a plain instance field on a class whose methods run in parallel. Sequential tests work; the first parallel run breaks because every test's @BeforeMethod overwrites the field. Symptoms: NullPointerException mid-test, actions landing in another test's browser, and NoSuchSessionException once another thread has quit the driver you were using. Fix: ThreadLocal<WebDriver>.
  • Forgetting DRIVERS.remove() in teardown. The driver is quit, but the ThreadLocal still references it. If the next test on that thread fails to set a new driver, it reads the stale reference and fails confusingly. The quit() + remove() pair is the rule; don't omit either.
  • Parallel methods without thread-safe Page Objects. A page object that holds mutable state (a cached List<WebElement> products) is fine sequentially. With parallel methods, any page object kept in a field of the test class is shared between threads; one thread's write corrupts another's read. Either keep page objects stateless (re-find on each method), or instantiate them per test inside the test method (which you already do).
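The first mistake in that list can be reproduced deterministically without a browser. In the sketch below SharedFieldDemo is a hypothetical class, a String stands in for the WebDriver field, and a latch forces both fake setups to finish before either fake test body runs:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedFieldDemo {

    // Stand-in for `protected static WebDriver driver;` (a String, so no Selenium is needed)
    private static volatile String driver;

    private SharedFieldDemo() {}

    // Runs two fake tests in parallel; counts how many end up holding the other test's driver
    static int testsHoldingWrongDriver() {
        CountDownLatch bothSetUp = new CountDownLatch(2);
        AtomicInteger wrong = new AtomicInteger();

        Runnable test = () -> {
            String mine = "driver-" + Thread.currentThread().getName();
            driver = mine;                       // "@BeforeMethod": overwrite the shared field
            bothSetUp.countDown();
            try {
                bothSetUp.await();               // both setups finish before either body runs
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            if (!mine.equals(driver)) {          // test body: is the field still my driver?
                wrong.incrementAndGet();
            }
        };

        Thread t1 = new Thread(test, "t1");
        Thread t2 = new Thread(test, "t2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return wrong.get();
    }

    public static void main(String[] args) {
        System.out.println(testsHoldingWrongDriver()); // prints 1
    }
}
```

Whichever setup wrote last wins, so exactly one of the two tests ends up driving the other's browser. In a real suite that is the test that clicks on the wrong page or hits a session another thread has already quit.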

🎯 Practice task

Make your suite parallel — and prove it. 35–45 minutes.

  1. Update your pom.xml's Surefire config to include the system-property pass-through and <argLine>-Xmx2048m</argLine> from this lesson.
  2. Create src/test/java/com/mycompany/tests/base/DriverManager.java with the ThreadLocal pattern. Update BaseTest to use it; expose getDriver() to subclasses; remove the protected WebDriver driver; field.
  3. Update at least one existing test to use getDriver() rather than driver. Run sequentially. It should still pass.
  4. Add parallel="methods" thread-count="4" to your suite XML. Run mvn test -DsuiteFile=.... Watch four browser windows pop up at once. Total wall-clock should be ~1/4 of the sequential time (slightly more due to driver-startup overhead per test).
  5. Force a flake. Briefly revert BaseTest to a shared protected WebDriver driver; field (or worse: a static field). Run with parallel="methods" thread-count="4". Watch tests fail mysteriously: NullPointerException, wrong page, sessions colliding. Restore the ThreadLocal version. Watch them go green again.
  6. Forget .remove(). Comment out DRIVERS.remove() in quitDriver(). Run a 50-test parallel suite. Inspect the failures (you may not see any on a short run; stale references accumulate in long-running JVMs where pooled threads are reused, such as forks kept alive with <reuseForks>true</reuseForks>, which is Surefire's default). Restore the .remove() call.
  7. Stretch — Failsafe. Add maven-failsafe-plugin to your pom.xml configured for integration tests with a name pattern like *IT.java. Rename one of your test classes to *IT.java. Run mvn verify. Failsafe runs the integration tests; the build fails on the verify goal if any failed — but a post-integration-test phase you add can still run cleanup beforehand. Useful for Selenium suites with Docker dependencies.

Chapter 8 is done. Your tests run in parallel, locally and on Jenkins and on GitHub Actions, with proper thread-safe driver management, system-property pass-through, and reports archived as artefacts. The framework is production-grade. Chapter 9 is the capstone — applying everything from chapters 1–8 to a real flight-booking application end to end.
