Locator Strategies in Python — Playwright with Python

Every Playwright test boils down to three steps: find an element, do something with it, assert something about it. The "find an element" step is what this chapter is about, and Playwright's locator API is arguably one of the friendliest in browser automation. Where Selenium relies on CSS and XPath as the default and treats accessibility-based locators as opt-in, Playwright inverts the priority: role-, label-, and text-based locators are the recommended way, with CSS and XPath as fallbacks. In Python the API mirrors the one from the TypeScript course: the methods are renamed to snake_case and called from plain synchronous def tests, with no await.

What a Locator actually is

Before the API: a Playwright Locator is not the element itself. It's a lazy, queryable handle that re-runs its query every time you act or assert against it. Think of it as a recipe for finding an element, not a snapshot.

# This line does NOT touch the DOM
submit_button = page.get_by_role("button", name="Submit")
 
# This line queries the DOM and clicks
submit_button.click()
 
# This line queries the DOM AGAIN — fresh query, no stale element
expect(submit_button).to_be_visible()

Because locators re-query, they're immune to the "stale element reference" errors that plague Selenium suites. You can store a Locator in a variable, navigate to a different page, render new content, and the same Locator still works against the new DOM.
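To make the "recipe, not snapshot" idea concrete, here is a toy model in plain Python. It is not how Playwright is implemented: ToyLocator, its resolve() method, and the list-of-strings "DOM" are all invented for illustration. The only point is that the query is stored when the locator is created and re-run each time it is used.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyLocator:
    """Hypothetical stand-in for a locator: it stores a query, never a result."""

    dom: list[str]                    # a shared, mutable "page"
    predicate: Callable[[str], bool]  # the recipe for finding elements

    def resolve(self) -> list[str]:
        # Re-run the query against whatever the DOM contains right now.
        return [element for element in self.dom if self.predicate(element)]


dom = ["<h1>Products</h1>"]
submit = ToyLocator(dom, lambda element: "Submit" in element)  # nothing is queried yet

before = submit.resolve()              # the button does not exist yet
dom.append("<button>Submit</button>")  # the page renders new content
after = submit.resolve()               # the same locator now finds the button

print(before)
print(after)
```

Because resolve() never caches its result, there is nothing that can go stale between the two calls.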

Snake_case naming — the only TS-vs-Python difference you'll feel

Every locator method changes case from camelCase (TS) to snake_case (Python). Behaviour is identical — same engine, same DOM queries:

getByRole → get_by_role
getByLabel → get_by_label
getByText → get_by_text
getByTestId → get_by_test_id
getByPlaceholder → get_by_placeholder
getByAltText → get_by_alt_text
getByTitle → get_by_title

If you've worked through the Playwright TypeScript course, translating between the two is a mechanical rename of the method and dropping await. The argument shape is the same too — TS { name: 'Submit' } becomes Python name="Submit" (Python keyword arguments instead of an options object).
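The rename really is mechanical. As a small illustration, the sketch below converts camelCase names to snake_case with a regular expression. The helper ts_to_python_name is hypothetical and not part of Playwright; it only shows that the mapping in the list above follows a single rule.

```python
import re


def ts_to_python_name(ts_name: str) -> str:
    """Hypothetical helper: convert a camelCase TS name to snake_case."""
    # Insert an underscore before every capital letter except a leading one,
    # then lowercase the whole name.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", ts_name).lower()


for ts_name in ["getByRole", "getByTestId", "getByAltText", "hasNotText"]:
    print(f"{ts_name} -> {ts_to_python_name(ts_name)}")
```

The same rule covers option names, which is why TS hasNotText shows up in Python as the has_not_text keyword argument.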

The recommended locator priority

Playwright's docs are unusually opinionated about which locators to prefer: user-facing ones first, raw selectors last. Ranked from most resilient to least:

get_by_role — by accessibility role and accessible name (best practice).
get_by_label — form fields by their visible label.
get_by_placeholder — inputs by their placeholder text.
get_by_text — elements by visible text.
get_by_alt_text — images and elements by alt text.
get_by_title — elements by their title attribute.
get_by_test_id — by data-testid attribute.
CSS selectors — fallback for structural matches.
XPath — fallback of last resort.

The further up the list, the more your locator describes how a user finds the element, not how the developer happened to mark up the DOM. That alignment with user intent is what makes the locator survive CSS refactors, framework migrations, and design-system overhauls.

get_by_role — the workhorse

A web page is a tree of accessibility roles whether the developer thinks about it or not: every <button> has role button, every <a href> has role link, every <input type="text"> has role textbox. Screen readers navigate by role. So can your tests:

page.get_by_role("button", name="Submit").click()
page.get_by_role("link", name="Products").click()
page.get_by_role("heading", level=1).wait_for()
page.get_by_role("textbox", name="Email").fill("alice@test.com")
page.get_by_role("checkbox", name="Remember me").check()
page.get_by_role("listitem").filter(has_text="Laptop").click()

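The name keyword argument matches the element's accessible name, usually its visible text or its aria-label. By default the match is a case-insensitive substring match. Pass exact=True for a case-sensitive, whole-string match, or pass a compiled regex such as name=re.compile(r"^Submit$") for pattern matching. The two do not combine: exact is ignored when name is a regex.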

Why this is the best locator: if get_by_role("button", name="Submit") can't find a button, the button might not be properly accessible — which is itself a real bug worth catching. Tests that demand accessibility produce more accessible products.

get_by_label — forms first

For form fields, the most natural locator is the label the user reads:

page.get_by_label("Email address").fill("alice@test.com")
page.get_by_label("Password").fill("secret123")
page.get_by_label("Subscribe to newsletter").check()

This works for any input properly associated with a <label> (via for= or wrapping). It also works for aria-label and aria-labelledby patterns. Forms that pass get_by_label tests are forms that screen-reader users can fill in — another double-duty win.

get_by_text — visible content

When the element exists for the user as text rather than a control — banners, error messages, totals, "12 items" counters — get_by_text is the natural fit:

import re
 
expect(page.get_by_text("Order confirmed")).to_be_visible()
page.get_by_text("Add to cart").first.click()
expect(page.get_by_text(re.compile(r"total: \$[\d.]+", re.I))).to_be_visible()

The default is substring + case-insensitive. Pass exact=True to match the full string, or pass a compiled regex when the text is dynamic (timestamps, counts, prices). Note re.compile instead of TS's literal /regex/ syntax — Python pattern matching takes a Pattern object.
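The three matching modes can be summarised in a simplified pure-Python model. This is a mental model only, not Playwright's code: text_matches is a hypothetical function, and it assumes whitespace is collapsed before comparison and that a regex is searched anywhere in the text unless it is anchored.

```python
from __future__ import annotations

import re


def text_matches(element_text: str, query: str | re.Pattern, exact: bool = False) -> bool:
    """Simplified model of string, exact, and regex text matching."""
    normalized = " ".join(element_text.split())  # collapse runs of whitespace
    if isinstance(query, re.Pattern):
        # Compiled regex: search anywhere in the text; `exact` plays no role here.
        return query.search(normalized) is not None
    if exact:
        # exact=True: case-sensitive, whole-string comparison.
        return normalized == query
    # Default: case-insensitive substring.
    return query.lower() in normalized.lower()


price = re.compile(r"total: \$[\d.]+", re.I)

print(text_matches("  Order   confirmed! ", "order confirmed"))         # True
print(text_matches("Order confirmed!", "Order confirmed", exact=True))  # False
print(text_matches("Total: $129.99", price))                            # True
print(text_matches("Total: $129.99", r"total: \$[\d.]+"))               # False
```

The last line is the case to watch for: a raw string that merely looks like a pattern is compared as literal text, so it finds nothing.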

get_by_test_id — when the DOM doesn't help

Sometimes the element has no role, no label, and no stable text — a custom widget, a styled <div>, an analytics-only element. That's what data-testid is for:

page.get_by_test_id("product-card").first.click()
expect(page.get_by_test_id("checkout-total")).to_contain_text("$129.99")

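Playwright reads data-testid by default. If your team's convention is data-cy (common when migrating a Cypress suite), switch the attribute once per session with playwright.selectors.set_test_id_attribute(). With pytest-playwright, an autouse fixture in conftest.py that receives the plugin's playwright fixture is a convenient place for the call:

# tests/conftest.py
import pytest
from playwright.sync_api import Playwright

@pytest.fixture(scope="session", autouse=True)
def use_data_cy_test_ids(playwright: Playwright):
    # The test id attribute is set on the selectors registry, not passed as a browser-context option
    playwright.selectors.set_test_id_attribute("data-cy")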

Now page.get_by_test_id("submit") matches [data-cy="submit"]. One change in one place — every test inherits.

CSS and XPath — the fallback

When the page genuinely has no semantic anchor, fall back to CSS or XPath via page.locator(...):

page.locator(".product-card")
page.locator("#login-form button[type='submit']")
page.locator("//button[@type='submit']")

Reach for these last. A CSS class like .btn-primary is one design-system migration away from breaking; an XPath like //div[3]/span[2] is one DOM-tree refactor from breaking. Use them when nothing else fits, and document the choice with a one-line comment so the next reviewer doesn't assume it was lazy.
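A documented fallback might look like the sketch below. The chart widget, the #sales-chart selector, and the chart_tooltip helper are hypothetical; the part worth copying is the comment that records why a user-facing locator was not an option.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import Locator, Page


def chart_tooltip(page: Page) -> Locator:
    # CSS fallback: the (hypothetical) third-party chart renders its tooltip as a
    # bare <div> with no role, label, stable text, or test id we can add.
    return page.locator("#sales-chart .tooltip")
```

Wrapping the selector in one named function also means a future markup change is fixed in a single place.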

Locator priority, visualised

Locator strategies, ranked by resilience and user-friendliness (score out of 100):

get_by_role (accessibility role + name): 95
get_by_label (form fields by label text): 90
get_by_placeholder (inputs by placeholder): 80
get_by_text (visible content): 75
get_by_test_id (data-testid attribute): 70
CSS selectors (structural / fallback): 45
XPath (last resort): 25

The numbers are illustrative — the point is the order. Every team eventually drifts up the list as they internalise that user-facing locators outlive design refactors.

Filtering — narrowing without resorting to nth()

Real apps have lists, tables, and repeating components. .filter() narrows a multi-match locator to the one you want:

# All listitems containing "Laptop"
laptop_item = page.get_by_role("listitem").filter(has_text="Laptop")
 
# The row where the cell contains Alice's name
alice_row = page.get_by_role("row").filter(
    has=page.get_by_role("cell", name="Alice")
)
 
# Exclude a class of rows
enabled_buttons = page.get_by_role("button").filter(has_not_text="Coming soon")
 
alice_row.get_by_role("button", name="Edit").click()

has_text matches by visible text inside the locator. has filters by the presence of a child locator — perfect for "the row that contains this thing." has_not_text and has_not are the negations. Note the snake_case difference from TS's hasText / hasNot.
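The negated form has_not works the same way as has, just inverted. The sketch below keeps only the table rows that do not contain a given badge. The helper name rows_without_badge and the "Archived" badge text used in the usage note are assumptions made for illustration.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from playwright.sync_api import Locator, Page


def rows_without_badge(page: Page, badge_text: str) -> Locator:
    """Return the rows that do NOT contain an element with exactly `badge_text`."""
    badge = page.get_by_text(badge_text, exact=True)
    # has_not drops every row in which the inner locator finds a match.
    return page.get_by_role("row").filter(has_not=badge)
```

A test could then call rows_without_badge(page, "Archived") and chain .get_by_role("button", name="Edit") on the result, exactly as with alice_row above.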

Chaining — scoping down a tree

Locators chain. parent.get_by_role(...) searches inside parent, not the whole page:

product_card = page.get_by_test_id("product-card").first
product_card.get_by_role("button", name="Add to cart").click()
expect(product_card.get_by_role("heading")).to_contain_text("Laptop")

.first, .last, and .nth(n) (zero-indexed) pick a specific match without filtering. Note that .first and .last are attributes in Python (no parentheses), unlike TS's .first() and .last() method calls. .nth(2) is a method call because it takes an argument.

Strict mode — Playwright's safety net

By default, actions that expect a single element, such as click() or fill(), raise an error when the locator matches more than one:

# If 6 buttons say "Add to cart", this raises:
#   strict mode violation: locator resolved to 6 elements
page.get_by_role("button", name="Add to cart").click()

This is on purpose. Selenium silently picks the first match; Playwright forces you to be explicit. Use .first, .last, .nth(2), or filter:

page.get_by_role("button", name="Add to cart").first.click()
 
page.get_by_test_id("product-card") \
    .filter(has_text="Wireless Headphones") \
    .get_by_role("button", name="Add to cart") \
    .click()

Strict mode is one of the most underrated reliability features in Playwright. The first time you accidentally write a locator that matches three elements, the test screams instead of clicking the wrong thing — and you find a real bug.

A real product-listing example

A typed test class that combines the locator patterns above (relative paths such as /products assume a configured base_url):

import re

from playwright.sync_api import Page, expect
 
class TestProductListing:
    def test_page_header_and_search_input(self, page: Page):
        page.goto("/products")
        header = page.get_by_test_id("page-header")
        expect(header.get_by_role("heading", level=1)).to_contain_text("Products")
        expect(header.get_by_placeholder("Search products")).to_be_visible()
 
    def test_clicks_add_to_cart_on_first_card(self, page: Page):
        page.goto("/products")
        first_card = page.get_by_test_id("product-card").first
        first_card.get_by_role("button", name="Add to cart").click()
        expect(page.get_by_test_id("cart-count")).to_have_text("1")
 
    def test_adds_wireless_headphones_specifically(self, page: Page):
        page.goto("/products")
        headphones_card = page.get_by_test_id("product-card").filter(
            has_text="Wireless Headphones"
        )
        headphones_card.get_by_role("button", name="Add to cart").click()
        expect(page.get_by_test_id("cart-count")).to_have_text("1")
 
    def test_paginates(self, page: Page):
        page.goto("/products")
        page.get_by_test_id("pagination").get_by_role("button", name="Next").click()
        expect(page).to_have_url(re.compile(r"page=2"))

Read each test from the outside in. The first scopes locators to the page header. The second uses .first to disambiguate. The third uses .filter(has_text=...) to find a specific card by product name. The fourth scopes pagination clicks to the pagination region. None of them rely on a CSS class.

⚠️ Common mistakes

Reaching for CSS classes when a role would work. page.locator(".btn-primary.submit-large") is one design-system migration away from breaking. page.get_by_role("button", name="Submit") survives every CSS refactor your team will ever do. Get into the habit of asking "how would a user find this?" first, every time.
Forgetting Playwright is strict by default. A locator that matches multiple elements raises — that's a feature, not a bug. Don't paper over it with .first reflexively; ask whether your locator is actually unambiguous. Often the right fix is .filter(has_text=...) to narrow to the correct match, not just some match.
Translating TS regex syntax literally. getByText(/total: \$\d+/i) in TS is get_by_text(re.compile(r"total: \$\d+", re.I)) in Python. Forgetting re.compile and writing get_by_text(r"total: \$\d+") searches for that literal string, backslashes included, instead of applying the pattern. Import re at the top of any test that uses pattern matching.

🎯 Practice task

Wire up the locator toolkit on Sauce Demo. 25-30 minutes.

With base_url = https://www.saucedemo.com set in pytest.ini, add an autouse fixture at the top of the test file that logs in before each test (we'll formalise this in chapter 3; for now it plays the role of a beforeEach hook):

# tests/test_locators.py
import re
import pytest
from playwright.sync_api import Page, expect
 
@pytest.fixture(autouse=True)
def login(page: Page):
    page.goto("/")
    page.get_by_placeholder("Username").fill("standard_user")
    page.get_by_placeholder("Password").fill("secret_sauce")
    page.get_by_role("button", name="Login").click()
    expect(page).to_have_url(re.compile(r"inventory"))

Below the fixture, write five tests, one per locator technique:

class TestLocatorToolkit:
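    def test_test_id_six_product_cards(self, page: Page):
        # Sauce Demo marks elements with data-test rather than data-testid,
        # so an attribute selector stands in for get_by_test_id here.
        expect(page.locator("[data-test='inventory-item']")).to_have_count(6)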
 
    def test_role_first_add_to_cart(self, page: Page):
        page.locator(".inventory_item").first.get_by_role(
            "button", name="Add to cart"
        ).click()
        expect(page.locator(".shopping_cart_badge")).to_have_text("1")
 
    def test_filter_chain_backpack_anywhere(self, page: Page):
        backpack = page.locator(".inventory_item").filter(has_text="Sauce Labs Backpack")
        backpack.get_by_role("button", name="Add to cart").click()
        expect(page.locator(".shopping_cart_badge")).to_have_text("1")
 
    def test_text_locator_heading(self, page: Page):
        expect(page.get_by_text("Products", exact=True)).to_be_visible()
 
    def test_strict_mode_disambiguate(self, page: Page):
        page.locator(".inventory_item").filter(has_text="Fleece Jacket") \
            .get_by_role("button", name="Add to cart").click()
        expect(page.locator(".shopping_cart_badge")).to_have_text("1")

Run the tests headlessly: pytest tests/test_locators.py -v. All five should pass on Chromium.
Force a strict-mode error. In one test, change the click to page.get_by_role("button", name="Add to cart").click() (no .first, no filter). Run again. Read the error: strict mode violation: locator resolved to 6 elements. This is Playwright telling you exactly what's wrong. Fix it by adding .first or .filter(...).
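Stretch: add a sixth test that adds three different items by name (Backpack, Fleece Jacket, Bolt T-Shirt) using .filter(has_text=...) and get_by_role, then asserts the cart badge shows "3". Use "Bolt T-Shirt" rather than plain "T-Shirt": two products on the page contain "T-Shirt", so the shorter text would trigger a strict-mode violation. This is the pattern you'll repeat dozens of times in a real e-commerce suite. Once the card itself is located by role or test id instead of the .inventory_item class, it's also the test that will not break when the design team renames .inventory_item to .product-tile next quarter.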

Locators are the foundation. The next lesson covers the actions you do with those locators — click, fill, check, select_option, and the keyboard and mouse APIs that round out the toolbox.