CSS and XPath Selectors in Playwright


The previous lesson made the case for accessibility-first locators (getByRole, getByLabel, getByText), and you should reach for those first 90% of the time. But the remaining 10% is real: custom data widgets without semantic markup, structural relationships that no role describes, legacy DOMs you can't change. For those, Playwright still gives you full CSS and XPath support, plus a few selector engines of its own that are often more convenient than either. This lesson is about what each tool is genuinely good at and the patterns that keep your selectors readable when you have to use them.

CSS with page.locator()

page.locator() accepts any CSS selector you'd recognise from a stylesheet:

page.locator(".product-card");                  // class
page.locator("#login-form");                    // id
page.locator("button[type='submit']");          // attribute
page.locator("[data-testid='email']");          // data attribute
page.locator("form input[name='email']");       // descendant
page.locator(".card > .price");                 // direct child
page.locator(".card:has(.badge)");              // :has(): element containing a matching descendant
page.locator("li:nth-child(3)");                // structural position

The :has() pseudo-class is especially useful. It selects elements that contain a matching descendant at any depth, not only a direct child:

// Click the first product card that has a "Sale" badge inside it
await page.locator(".product-card:has(.badge-sale)").first().click();

CSS is the right call when you need a structural selector — child, sibling, descendant relationships — that semantic locators can't express.

XPath — useful in narrow cases

XPath is supported, but Playwright's docs recommend against it because selectors tied to DOM structure break when the markup changes. Treat it as a last resort:

page.locator("//button[@type='submit']");
page.locator("//div[@class='product-card']//span[@class='price']");
page.locator("xpath=//button[contains(text(), 'Sign in')]");

A selector that starts with // or .. is treated as XPath automatically; the xpath= engine prefix makes that explicit. Where XPath earns its keep is the rare case where you need to traverse up the DOM (parent or ancestor selection):

// Find the row that contains a specific cell — go up to <tr> from the <td>
const row = page.locator("xpath=//td[text()='alice@example.com']/ancestor::tr");
await row.getByRole("button", { name: "Edit" }).click();

In modern Playwright, the same intent reads better as page.getByRole('row').filter({ hasText: 'alice@example.com' }). So even XPath's traditional sweet spot has a cleaner alternative — which is why it sits at the bottom of the priority list.
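If you do end up writing XPath by hand, interpolating a value into the string is where it usually goes wrong: XPath 1.0 has no escape character, so a value containing a quote breaks the expression. The sketch below is one way to handle that. Both function names are hypothetical helpers, not part of Playwright, and the choice of normalize-space() over text() is an assumption about how tolerant of whitespace you want the match to be.

```typescript
// Hypothetical helpers (not a Playwright API): they only build XPath strings.

/** Quote an arbitrary string as an XPath 1.0 string literal. */
function xpathLiteral(value: string): string {
  if (value.indexOf("'") === -1) return "'" + value + "'";
  if (value.indexOf('"') === -1) return '"' + value + '"';
  // Both quote types are present. XPath 1.0 cannot escape quotes, so split on
  // single quotes and glue the pieces back together with concat().
  const parts = value.split("'").map((part) => "'" + part + "'");
  return "concat(" + parts.join(", \"'\", ") + ")";
}

/** XPath for the <tr> ancestor of a <td> whose trimmed text equals cellText. */
function rowContainingCell(cellText: string): string {
  return "xpath=//td[normalize-space()=" + xpathLiteral(cellText) + "]/ancestor::tr";
}
```

The result is an ordinary selector string, so it could be passed straight to page.locator(rowContainingCell("alice@example.com")). The .filter({ hasText }) version above sidesteps the quoting problem entirely, which is one more reason to prefer it.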

Playwright's text-matching engines

Playwright extends CSS with selector engines you won't find in the W3C spec. The most useful is the text engine:

page.locator("text=Submit");                    // contains "Submit" (case-insensitive)
page.locator("text='Submit'");                  // exact, case-sensitive match (single quotes)
page.locator("text=/^Total:\\s+\\$\\d+/");      // regex
page.locator("button:text('Add to cart')");      // pseudo-class form
page.locator("button:text-is('Add to cart')");   // strict-equal form
page.locator("button:text-matches('add', 'i')"); // regex form

Most of the time you'd reach for getByText instead. The pseudo-class forms (:text(...), :text-is(...)) are useful when you want to compose them inside a larger CSS chain — e.g., page.locator(".alert:text('Saved')") for "an alert containing the word Saved."

.filter() — the layered alternative

Playwright also exposes filtering as locator methods, which often reads better than packing everything into a CSS string:

// Element matching the locator AND containing some text
page.locator(".card").filter({ hasText: "Laptop" });
 
// Element matching the locator AND containing a child locator
page.locator(".row").filter({ has: page.locator(".badge-error") });
 
// Element matching the locator AND NOT containing some text
page.locator(".product").filter({ hasNotText: "Out of stock" });
 
// Element matching the locator AND NOT containing a child
page.locator(".card").filter({ hasNot: page.locator(".sold") });

These compose — .filter() chained off .filter() chained off a base locator — and the resulting code reads like the description: "the row that contains a badge-error element."

Position picking — .first(), .last(), .nth()

When a locator matches multiple elements, narrow with positional methods rather than CSS pseudo-classes:

page.locator(".product-card").first();   // first match
page.locator(".product-card").last();    // last match
page.locator(".product-card").nth(2);    // third match (0-indexed)

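These are not the same as :first-child, :last-child, and :nth-child(3). The CSS pseudo-classes test an element's position among its siblings, whereas .first(), .last(), and .nth() index into the list of elements the locator matched, in document order. That makes them easier to read, and they work even when the matches are not siblings in the DOM tree.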
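To see why positional methods and :nth-child can pick different elements, here is a toy model in plain TypeScript. Nothing in it is a Playwright API: ToyNode, matchedSet, and nthChild are hypothetical names, and the two-section grid is made-up data. It only illustrates "index into the matched list" versus "position among siblings".

```typescript
// Toy DOM model (hypothetical, for illustration only).
interface ToyNode {
  id: string;
  classes: string[];
  children: ToyNode[];
}

/** Every node carrying the class, in document order: the list .nth() indexes into. */
function matchedSet(root: ToyNode, cls: string): ToyNode[] {
  let result: ToyNode[] = root.classes.indexOf(cls) !== -1 ? [root] : [];
  for (const child of root.children) {
    result = result.concat(matchedSet(child, cls));
  }
  return result;
}

/** Nodes carrying the class that are the n-th child (1-based) of their own parent. */
function nthChild(root: ToyNode, cls: string, n: number): ToyNode[] {
  let result: ToyNode[] = [];
  root.children.forEach((child, index) => {
    if (index === n - 1 && child.classes.indexOf(cls) !== -1) result.push(child);
    result = result.concat(nthChild(child, cls, n));
  });
  return result;
}

const leaf = (id: string, cls: string): ToyNode => ({ id, classes: [cls], children: [] });

// Two sections, each with a heading followed by two product cards.
const grid: ToyNode = {
  id: "grid",
  classes: [],
  children: [
    { id: "A", classes: ["section"], children: [leaf("hA", "heading"), leaf("a1", "product-card"), leaf("a2", "product-card")] },
    { id: "B", classes: ["section"], children: [leaf("hB", "heading"), leaf("b1", "product-card"), leaf("b2", "product-card")] },
  ],
};

// Like .locator(".product-card").nth(2): the third match overall.
const thirdMatch = matchedSet(grid, "product-card").map((node) => node.id)[2]; // "b1"

// Like ".product-card:nth-child(3)": the third child of each parent.
const thirdChildren = nthChild(grid, "product-card", 3).map((node) => node.id); // ["a2", "b2"]
```

In this model the positional method returns a single card (b1), while the pseudo-class returns one card per section (a2 and b2), which is the kind of surprise that makes .nth() the safer default.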

locator.and() and locator.or()

Two combinators most teams forget exist:

// Element matching BOTH conditions: it is a button AND it carries the .btn-primary class
const primarySubmit = page
  .getByRole("button")
  .and(page.locator(".btn-primary"));
 
// Element matching EITHER: clicking "Submit" OR "Save" depending on which renders
const submitOrSave = page
  .getByRole("button", { name: "Submit" })
  .or(page.getByRole("button", { name: "Save" }));
 
await submitOrSave.click();

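.or() is particularly useful when the same flow has two valid UIs, for example a feature flag that renames a button. The test stays green through the rollout instead of breaking on the day the flag flips. One caveat: if both variants are on the page at the same time, the combined locator matches two elements and click() fails with a strict mode violation, so add .first() when that can happen.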

Same element, three locator strategies

Same Add-to-cart button, three ways to find it

getByRole — recommended

  • page.getByRole('button', { name: 'Add to cart' })

  • Reads like: 'find the button labelled Add to cart'

  • Survives CSS class renames and design refactors

  • Doubles as an accessibility check — fails if the button isn't accessible

CSS — fallback

  • page.locator('button.btn-primary.add-to-cart')

  • Reads like: 'find the button with these CSS classes'

  • Breaks the moment a designer renames the utility classes

  • Useful when no role/label/text exists — but rare in well-built apps

XPath — last resort

  • page.locator('xpath=//button[text()="Add to cart"]')

  • Reads like: 'walk the DOM tree manually'

  • Verbose, harder to read, slower to maintain

  • Only justifies itself when traversing up — and even then, .filter() usually wins

When CSS or XPath is the right call

Three scenarios where you legitimately reach below the semantic locators:

  1. Custom widgets with no role. A draggable canvas, a third-party charting library, a styled <div> masquerading as a button (and the dev team can't add a role yet). page.locator(".chart-bar:nth-child(3)") is what you've got.
  2. Structural relationships. "The price element inside the third row of the offers table." Roles don't model "the third row"; CSS does naturally.
  3. Legacy or generated DOM. Server-rendered apps with classes like mod_a3f2_btn and no aria attributes. You add data-testid where you can; for the rest, CSS is the bridge.

In every other case — and that's the overwhelming majority — the semantic locator wins.

A complex product-grid example

A typed test that mixes Playwright's recommended locators with CSS where it genuinely helps:

import { test, expect } from "@playwright/test";
 
test.describe("Product grid — mixed locators", () => {
  test.beforeEach(async ({ page }) => {
    await page.goto("/products");
  });
 
  test("the search input is inside the page header", async ({ page }) => {
    // CSS for structural scoping; getByRole inside for semantic match
    const header = page.locator("header.page-header");
    await expect(header.getByRole("searchbox")).toBeVisible();
  });
 
  test("clicks the page-3 pagination button", async ({ page }) => {
    // CSS scopes to the pagination buttons; the anchored regex avoids also matching 13 or 30
    const pageThree = page
      .locator(".pagination button")
      .filter({ hasText: /^\s*3\s*$/ });
    await pageThree.click();
    await expect(page).toHaveURL(/page=3/);
  });
 
  test("locates a sale-tagged product without a dedicated test ID", async ({ page }) => {
    // :has() lets us say "card containing a sale badge" in one CSS string
    const onSaleCards = page.locator(".product-card:has(.badge-sale)");
    await expect(onSaleCards).toHaveCount(3);
    await onSaleCards.first().getByRole("button", { name: "Add to cart" }).click();
  });
 
  test("submit-or-save — same flow under a feature-flag rollout", async ({ page }) => {
    const submitOrSave = page
      .getByRole("button", { name: "Submit" })
      .or(page.getByRole("button", { name: "Save" }));
    await submitOrSave.click();
    await expect(page).toHaveURL(/confirmation/);
  });
});

Read each test for the reason CSS shows up. The first uses CSS to scope to a specific structural region the markup makes obvious. The second uses CSS to scope to the pagination buttons, then .filter() to pick the one labelled 3. The third uses :has() to express "card containing a sale badge", which is neat in one line and awkward to express semantically. The fourth uses .or() to handle two valid UIs during a rollout. Every other interaction stays semantic.

Coming from Cypress?

Cypress teams often default to cy.get('[data-testid=...]') everywhere, and Playwright supports the same habit through page.getByTestId(...). The bigger shift is in CSS itself: where Cypress treated cy.get('.btn-primary') as a normal way to write tests, Playwright's docs discourage class-based selectors because they break when the DOM changes. The CSS escape hatch is still here when you need it; just don't reach for it first.
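By default getByTestId looks for the data-testid attribute. If a migrated suite uses a different attribute, the testIdAttribute option lets getByTestId follow it. The fragment below is a minimal sketch that assumes your markup uses data-cy; substitute whatever attribute your app actually ships.

```typescript
// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    // Assumption: elements are tagged with data-cy rather than data-testid.
    testIdAttribute: "data-cy",
  },
});
```

With that in place, page.getByTestId("email") would resolve against [data-cy="email"], so existing markup keeps working without hand-written attribute selectors.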

⚠️ Common mistakes

  • Defaulting to CSS because it's familiar. Cypress muscle memory makes page.locator('.product-card') feel natural. It's also the locator most likely to break when the design system changes. Spend the extra five seconds to ask if a getByRole or filter({ hasText }) would do the same job — almost always it will, and your test will outlive the next CSS refactor.
  • Writing absolute XPath copied from devtools. xpath=/html/body/div[1]/div[2]/main/section[3]/div/div[1]/button survives nothing — not a wrapper-div added by a developer, not a layout change, not a flexbox-to-grid refactor. If XPath is genuinely the only option, write it relatively (//table//tr[contains(., 'Alice')]/td[3]/button) and prefer .filter() first.
  • Overusing :has() and :not() until selectors are unreadable. page.locator('.row:has(.badge:not(.badge-error)):has(.cell:has(.icon))') is a sign you've gone too deep into CSS. Break it apart into named locator variables, use .filter(), or — ideally — find the semantic angle (getByRole('row').filter({ hasText: 'Active' })).
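The last two mistakes are mechanical enough to catch in code review or a small script. The function below is a rough sketch of such a check, not an existing Playwright or ESLint rule: selectorSmells is a hypothetical name, and the threshold of two :has()/:not() uses is an arbitrary assumption you would tune for your own codebase.

```typescript
// Hypothetical review helper: flags two of the selector smells described above.
function selectorSmells(selector: string): string[] {
  const smells: string[] = [];

  // Absolute XPath starts at the document root with a single slash
  // (for example xpath=/html/body/...), unlike a relative //... expression.
  if (/^(xpath=)?\/(?!\/)/.test(selector)) {
    smells.push("absolute-xpath");
  }

  // Count :has( and :not( as a rough proxy for how deeply the selector nests.
  const pseudoUses = (selector.match(/:(has|not)\(/g) || []).length;
  if (pseudoUses > 2) {
    smells.push("too-many-has-not");
  }

  return smells;
}
```

Run against the examples in the list, it would flag the devtools-copied absolute path and the four-deep :has()/:not() chain, while leaving the relative //table//tr[...] expression and a single :has() alone.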

🎯 Practice task

Practise picking the right tool for each locator scenario. 20-25 minutes.

  1. Use the Sauce Demo inventory page (/inventory.html, logged in as standard_user). Create tests/css-xpath.spec.ts with a beforeEach that handles login.

  2. Write five tests, each demonstrating one locator approach:

    import { test, expect } from "@playwright/test";
     
    test.describe("CSS and XPath in context", () => {
      test.beforeEach(async ({ page }) => {
        await page.goto("/");
        await page.getByPlaceholder("Username").fill("standard_user");
        await page.getByPlaceholder("Password").fill("secret_sauce");
        await page.getByRole("button", { name: "Login" }).click();
      });
     
      test("CSS for a structural region — header", async ({ page }) => {
        const header = page.locator("#header_container");
        await expect(header.getByText("Swag Labs")).toBeVisible();
      });
     
      test("CSS :has() for items that contain a price element", async ({ page }) => {
        // CSS cannot compare numbers, so "price > $20" is out of reach; :has() only checks structure.
        const items = page.locator(".inventory_item:has(.inventory_item_price)");
        await expect(items).toHaveCount(6);
      });
     
      test("filter() chained — a specific named product", async ({ page }) => {
        const fleece = page.locator(".inventory_item").filter({ hasText: "Fleece Jacket" });
        await fleece.getByRole("button", { name: "Add to cart" }).click();
        await expect(page.locator(".shopping_cart_badge")).toHaveText("1");
      });
     
      test("XPath — last resort, traverse up", async ({ page }) => {
        const backpackCard = page.locator(
          "xpath=//div[text()='Sauce Labs Backpack']/ancestor::div[contains(concat(' ', normalize-space(@class), ' '), ' inventory_item ')]"
        );
        await backpackCard.getByRole("button", { name: "Add to cart" }).click();
        await expect(page.locator(".shopping_cart_badge")).toHaveText("1");
      });
     
      test("locator.or() — handles UI A/B variations", async ({ page }) => {
        const cartLink = page
          .getByRole("link", { name: /cart/i })
          .or(page.locator(".shopping_cart_link"));
        await cartLink.click();
        await expect(page).toHaveURL(/cart/);
      });
    });
  3. Run the spec: npm test -- tests/css-xpath.spec.ts. All five should pass.

  4. Refactor the XPath test to use .filter({ hasText: 'Sauce Labs Backpack' }) instead. Re-run. Notice the test reads cleaner and is two lines shorter — that's the lesson: even when XPath works, a Playwright-native filter is usually better.

  5. Stretch: open devtools on Sauce Demo's inventory page and pick three elements that have NO data-test attribute, no obvious role, and no unique text — purely styled <div> and <span> content. For each, write the most resilient locator you can. If none of getByRole, getByLabel, getByText, getByTestId fit, fall back to a :has() CSS or a .filter() chain. Notice how often the answer is "ask the dev team to add a data-testid" rather than getting clever with selectors.

You now know when to reach below the semantic-locator line — and crucially, when not to. The next lesson moves from finding elements to acting on them: clicks, types, fills, presses, and the actionability checks Playwright runs before every interaction.
