In chapter 5 you wrote a @DataProvider that returned an inline Object[][]. That's fine for ten rows. The moment business stakeholders want to own the test data — adding new login scenarios, tax-bracket combinations, currency conversions — they don't want to touch Java. They want a spreadsheet. Apache POI is the Java library that reads .xlsx files into your test code, and pairing it with @DataProvider is how teams give product managers and business analysts an editable test-data surface without exposing the codebase. This lesson walks the POI API, the cell-type handling that bites everyone once, and the practical pattern for wiring an Excel file into a TestNG suite.
When Excel is the right answer
Three honest cases:
- Non-developers maintain the data. PMs, BAs, business stakeholders edit
.xlsxhappily and edit JSON awkwardly. If your data shape will be edited weekly by people who don't open IntelliJ, Excel earns its place. - The data has formulas or formatting. Currency, dates, computed columns — Excel handles these natively. CSV doesn't.
- The data already exists in Excel. Migrating an existing test-data spreadsheet is faster than rewriting it as JSON.
The cost: .xlsx is a binary zip — Git can't diff it sensibly, file sizes balloon, and parsing is slower than CSV/JSON. For developer-owned data, JSON or CSV is almost always cleaner. Use Excel deliberately.
Adding Apache POI to the project
Add the POI dependency to pom.xml:
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>5.2.5</version>
<scope>test</scope>
</dependency>poi-ooxml includes both POI's core (poi) and the .xlsx (OOXML zip-based) reader. For the older .xls (binary) format, the same JAR's HSSFWorkbook class works — but .xls is genuinely legacy in 2026, prefer .xlsx.
A reusable ExcelReader utility
Wrap POI behind a small class so test code never sees XSSFCell:
package com.mycompany.tests.utils;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.FileInputStream;
import java.io.IOException;
public class ExcelReader {
public static Object[][] readData(String filePath, String sheetName) throws IOException {
try (FileInputStream fis = new FileInputStream(filePath);
XSSFWorkbook workbook = new XSSFWorkbook(fis)) {
Sheet sheet = workbook.getSheet(sheetName);
if (sheet == null) {
throw new IllegalArgumentException("Sheet not found: " + sheetName);
}
int rowCount = sheet.getLastRowNum(); // 0-indexed last row
int colCount = sheet.getRow(0).getLastCellNum();// header row's column count
Object[][] data = new Object[rowCount][colCount];
for (int rowIdx = 1; rowIdx <= rowCount; rowIdx++) {
Row row = sheet.getRow(rowIdx);
for (int colIdx = 0; colIdx < colCount; colIdx++) {
data[rowIdx - 1][colIdx] = cellValue(row.getCell(colIdx));
}
}
return data;
}
}
private static String cellValue(Cell cell) {
if (cell == null) return "";
return switch (cell.getCellType()) {
case STRING -> cell.getStringCellValue();
case NUMERIC -> {
double d = cell.getNumericCellValue();
yield d == Math.floor(d) ? String.valueOf((long) d) : String.valueOf(d);
}
case BOOLEAN -> String.valueOf(cell.getBooleanCellValue());
case FORMULA -> cell.getCellFormula();
default -> "";
};
}
}Three details every Selenium-with-Excel codebase eventually adds:
try-with-resourceson the workbook AND the input stream. POI doesn't auto-close them and leaked handles are a real CI flake source.- Sheet existence check. A sheet typo silently produces a
NullPointerExceptionten lines later. The early throw with the sheet name is much easier to debug. cellValuereturns String. Excel cells can be string, numeric, boolean, formula, blank — andgetStringCellValue()throws on a numeric cell. Coerce everything to String at the boundary; tests parse to int/boolean/etc. as needed.
The login-data spreadsheet
A sample src/test/resources/testdata/login-data.xlsx with a single sheet LoginTests:
| username | password | shouldSucceed | expectedFragment |
|---|---|---|---|
| standard_user | secret_sauce | true | /inventory.html |
| locked_out_user | secret_sauce | false | locked out |
| problem_user | secret_sauce | true | /inventory.html |
| performance_glitch_user | secret_sauce | true | /inventory.html |
| (empty) | secret_sauce | false | Username is required |
| standard_user | (empty) | false | Password is required |
| wrong_user | wrong_password | false | do not match |
First row = headers (we skip them in the loop). Subsequent rows = data. The PM can add a row in Excel without ever touching the test code.
Wiring Excel into a @DataProvider
package com.mycompany.tests.tests;
import com.mycompany.tests.base.BaseTest;
import com.mycompany.tests.pages.LoginPage;
import com.mycompany.tests.utils.ExcelReader;
import org.testng.Assert;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
import java.io.IOException;
public class ExcelDrivenLoginTest extends BaseTest {
@DataProvider(name = "loginScenarios")
public Object[][] loginScenarios() throws IOException {
return ExcelReader.readData(
"src/test/resources/testdata/login-data.xlsx",
"LoginTests"
);
}
@Test(dataProvider = "loginScenarios")
public void shouldHandleLoginScenario(
String username, String password, String shouldSucceed, String expectedFragment
) {
LoginPage login = new LoginPage(driver).navigateTo();
login.loginAs(username, password);
if ("true".equalsIgnoreCase(shouldSucceed)) {
Assert.assertTrue(driver.getCurrentUrl().contains(expectedFragment),
"Expected URL to contain " + expectedFragment);
} else {
Assert.assertTrue(login.errorText().contains(expectedFragment),
"Expected error to mention " + expectedFragment);
}
}
}Run it. TestNG produces seven test results (one per row) named with the parameter values. Adding an eighth scenario means adding a row in Excel — no Java change, no recompile.
The data flow, visualised
Excel rows → ExcelReader → DataProvider → seven test runs
| username | password | succeed? | expected | |
|---|---|---|---|---|
| row 1 | standard_user | secret_sauce | true | /inventory.html |
| row 2 | locked_out_user | secret_sauce | false | locked out |
| row 3 | problem_user | secret_sauce | true | /inventory.html |
| row 4 | performance_glitch_user | secret_sauce | true | /inventory.html |
| row 5 | (empty) | secret_sauce | false | Username is required |
| row 6 | standard_user | (empty) | false | Password is required |
| row 7 | wrong_user | wrong_password | false | do not match |
Add a row in Excel; TestNG runs an extra test on the next mvn test. That's the single best property of this pattern.
A typed wrapper — preferred for production
Returning Object[][] of strings works but loses type information. The next iteration: read into a typed POJO, expose getters:
public class LoginScenario {
public final String username;
public final String password;
public final boolean shouldSucceed;
public final String expectedFragment;
public LoginScenario(String username, String password, boolean ok, String fragment) {
this.username = username;
this.password = password;
this.shouldSucceed = ok;
this.expectedFragment = fragment;
}
}
@DataProvider(name = "loginObjects")
public Object[][] loginObjects() throws IOException {
Object[][] raw = ExcelReader.readData("...", "LoginTests");
return Arrays.stream(raw)
.map(r -> new Object[]{
new LoginScenario(
(String) r[0], (String) r[1],
Boolean.parseBoolean((String) r[2]), (String) r[3]
)
})
.toArray(Object[][]::new);
}
@Test(dataProvider = "loginObjects")
public void shouldHandleLoginScenarioTyped(LoginScenario scenario) { ... }Tests now take a single typed parameter. Adding a column means adding a field to the POJO — strong typing keeps the data shape and the test signature in lockstep.
Reading vs writing — POI handles both
POI also writes Excel. When a test produces results that need exporting (a flaky-test inventory, a dataset of generated test users), the writing API mirrors the reading one:
try (XSSFWorkbook wb = new XSSFWorkbook()) {
Sheet sheet = wb.createSheet("Results");
Row header = sheet.createRow(0);
header.createCell(0).setCellValue("Test");
header.createCell(1).setCellValue("Status");
Row r1 = sheet.createRow(1);
r1.createCell(0).setCellValue("shouldLogIn");
r1.createCell(1).setCellValue("PASSED");
try (FileOutputStream fos = new FileOutputStream("results.xlsx")) {
wb.write(fos);
}
}Useful for ad-hoc tooling, less useful for routine reporting (Allure/Extent already produce nicer reports).
Comparison with other languages
# pytest + openpyxl
import openpyxl
@pytest.fixture
def login_rows():
wb = openpyxl.load_workbook("login-data.xlsx")
return [row for row in wb["LoginTests"].iter_rows(min_row=2, values_only=True)]// Cypress / Playwright + xlsx (JavaScript)
import * as XLSX from "xlsx";
const wb = XLSX.readFile("login-data.xlsx");
const rows = XLSX.utils.sheet_to_json(wb.Sheets["LoginTests"]);The pattern is universal — every language has an Excel library. POI happens to be the most full-featured and the most standard in Java, with deep support for formulas, formatting, and the older .xls format.
The Selenium tool entry covers the test side; the TestNG cheat sheet covers @DataProvider patterns; the Core Java cheat sheet covers the Maven dependency-management fundamentals.
⚠️ Common mistakes
- Calling
getStringCellValue()on a numeric cell. POI'sgetStringCellValuethrowsIllegalStateExceptiononCellType.NUMERIC— and Excel often "helpfully" stores12345as a number even when the column is conceptually a string. Always checkgetCellType()first, or use a single coercion helper like the lesson'scellValue(...). - Forgetting to close the workbook and stream. POI's resources are heavy. A test that creates 100 ExcelReader instances and never closes them leaks file handles and on Linux/CI eventually fails with
Too many open files. Always wrap intry-with-resources. - Using Excel for developer-owned data. If only engineers maintain the file and Git diffs matter, JSON or CSV is much friendlier — text-diffable, smaller, faster to parse. Reserve Excel for stakeholders who genuinely need a spreadsheet UI.
🎯 Practice task
Build an Excel-driven test suite. 35–45 minutes.
- Add the POI dependency to your
pom.xml. Runmvn clean compileand confirm POI is on the classpath. - Create
src/test/resources/testdata/login-data.xlsxwith the seven rows from this lesson under a sheet namedLoginTests. Save and commit. - Add
ExcelReader.javafrom the lesson undersrc/test/java/com/mycompany/tests/utils/. - Build
ExcelDrivenLoginTest. Run it. Confirm seven test results, one per row. - Add a row in Excel. Open the spreadsheet, add an eighth scenario (any combination — try a Unicode username 😀). Save. Re-run
mvn test. The eighth test runs without code changes. - Force the type bug. Change the
shouldSucceedcolumn to actually containTRUE(Excel's boolean type) instead of the string"true". Run the test — note the failure message about cell type. UpdatecellValueto handleCellType.BOOLEAN(the lesson's version already does — confirm it works). The point: cell types matter. - Stretch — typed POJO. Add a
LoginScenarioPOJO and rewrite the data provider to return typed scenarios instead ofObject[][]. The test signature becomesvoid shouldHandleLoginScenario(LoginScenario s)— a single parameter, full IDE autocomplete on the fields. Compare with theObject[][]version; the typed one is much more pleasant to maintain.
Next lesson: JSON and CSV — the formats developer-owned test data should usually be in. We'll build the same login suite three times in three formats and compare them honestly.