Self-Healing Selectors with AI Assistance

9 min read

Selector breakage is the single largest source of test-maintenance pain in any UI suite. A developer renames id="submit" to data-testid="submit-button", ten tests turn red, and a tester spends a Tuesday morning hunting for the new locator. Playwright MCP collapses that triage into a one-prompt session: the assistant visits the page, finds the renamed element by its visible role and name, and proposes the updated selector. This lesson covers the heal-by-suggestion pattern, where it earns its keep, the boundaries of fully automated healing tools, and — most importantly — why the real fix is using locators that don't break in the first place.

A frame to keep in mind throughout: AI-assisted healing is treatment, not prevention. The patient still has the disease. The locator strategy you already know from your Playwright fundamentals is the prevention; this lesson is what to do when prevention has already failed.

A typical heal session

The test broke yesterday with this error:

locator.click: Timeout 30000ms exceeded.
Call log: waiting for locator('#submit')

You hand it to the assistant:

This Playwright test is failing:
 
await page.locator('#submit').click();
 
Error: locator.click: Timeout 30000ms exceeded.
 
Visit https://myapp.com/checkout, find the new locator for the "Submit order" button,
and propose an updated test line. Prefer getByRole. Confirm the new locator works
by clicking it before suggesting it.

The assistant navigates to /checkout, snapshots the page, finds a button with the accessible name "Submit order", clicks it to confirm the locator resolves, and replies:

// Old (broken):
await page.locator('#submit').click();
 
// New (verified):
await page.getByRole('button', { name: 'Submit order' }).click();

The verification step matters. The assistant didn't just suggest a plausible selector — it tested it against the live page before recommending. That round-trip is the difference between "this looks right" and "this works."
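
Because the prompt above has a fixed shape, it can be templated so that every heal request carries the same instructions, including the verification step. The following is a minimal sketch, assuming a hypothetical helper named buildHealPrompt and a hypothetical HealRequest shape; neither is part of Playwright or Playwright MCP.

```typescript
// Hypothetical helper: the field names and wording are illustrative only.
interface HealRequest {
  failingLine: string;        // the test line that broke
  errorMessage: string;       // the error Playwright reported
  pageUrl: string;            // where the assistant should look
  elementDescription: string; // how a user would describe the target element
}

// Builds the same prompt text used in the session above from structured input.
function buildHealPrompt(req: HealRequest): string {
  return [
    'This Playwright test is failing:',
    '',
    req.failingLine,
    '',
    `Error: ${req.errorMessage}`,
    '',
    `Visit ${req.pageUrl}, find the new locator for ${req.elementDescription},`,
    'and propose an updated test line. Prefer getByRole. Confirm the new locator works',
    'by clicking it before suggesting it.',
  ].join('\n');
}

const healPrompt = buildHealPrompt({
  failingLine: "await page.locator('#submit').click();",
  errorMessage: 'locator.click: Timeout 30000ms exceeded.',
  pageUrl: 'https://myapp.com/checkout',
  elementDescription: 'the "Submit order" button',
});
```

Keeping the "confirm before suggesting" sentence inside the template means nobody has to remember to ask for verification.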

Healing in batch

When a deploy moves a whole page's markup, you usually have several broken tests at once. One prompt covers them all:

These five tests all fail on the checkout page after today's deploy:
 
1. await page.locator('#submit').click();
2. await page.locator('.coupon-input').fill('SAVE10');
3. await page.locator('div.order-summary > p:nth-child(3)').textContent();
4. await page.click('input[name="postcode"]');
5. await expect(page.locator('.success-banner')).toBeVisible();
 
Visit https://myapp.com/checkout (logged in as the test user) and propose stable
replacements for each. Verify each new locator resolves before suggesting it.

The assistant runs through each, confirms the new locators, and returns a side-by-side diff. You apply, run the tests locally, and the suite is green again. What used to be a half-day of investigation is fifteen minutes of prompt + review + commit.

Where automated healing tools sit

Some tools, such as the open-source Healenium and the commercial platform Mabl, go a step further: they detect a broken selector at runtime and substitute a healed one without a human in the loop. That sounds great until a developer renames a button and also changes its purpose; the healed selector silently binds to the wrong element, and the test passes when it should have failed.

The trade-off is real:

  • Manual heal-by-suggestion (this lesson): slower but keeps the safety net. Every change is reviewed in a PR before landing.
  • Fully automated healing: faster, lower-touch, but quietly hides the kind of UX changes the suite was supposed to catch.

For a regression suite that exists to catch product regressions, suggestion-with-PR is the right default. For internal tooling where speed matters more than precision, full automation can be acceptable. Pick deliberately.

Healing inside the test code itself

A pattern some teams reach for: catch a broken selector at test time and fall back to AI healing.

import { test } from '@playwright/test';

test('place order', async ({ page }) => {
  await page.goto('/checkout');
  try {
    // Short timeout so a missing selector fails fast instead of waiting the default 30 s.
    await page.locator('#submit').click({ timeout: 5_000 });
  } catch (e) {
    // Log enough context to start a heal session, then rethrow so the test
    // still fails, both locally and in CI.
    console.warn('Selector #submit failed: healing candidate. Page state:');
    console.warn(
      await page.locator('main').innerHTML({ timeout: 1_000 }).catch(() => 'unavailable'),
    );
    throw e;
  }
});

Useful pattern, but understand what it is: a signal that something needs healing, not the heal itself. Don't try to call out to an LLM at runtime in a CI job — slow, costly, non-deterministic, and exactly the wrong shape for a deterministic suite. Capture the signal, fail loudly, fix in a PR.
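
One way to make that signal easier to act on is to capture it as a structured record rather than free-form log lines. The sketch below is an assumption about how a team might shape such a record; HealCandidate, toHealCandidate, and MAX_HTML_CHARS are hypothetical names, not Playwright APIs.

```typescript
// Hypothetical record shape for a "needs healing" signal.
interface HealCandidate {
  testTitle: string;
  selector: string;
  url: string;
  errorMessage: string;
  htmlExcerpt: string;
}

// Cap the captured markup so reports stay readable (the limit is arbitrary).
const MAX_HTML_CHARS = 2_000;

function toHealCandidate(
  testTitle: string,
  selector: string,
  url: string,
  error: unknown,
  html: string,
): HealCandidate {
  // Caught values are not guaranteed to be Error instances.
  const errorMessage = error instanceof Error ? error.message : String(error);
  const htmlExcerpt =
    html.length > MAX_HTML_CHARS ? html.slice(0, MAX_HTML_CHARS) + ' [truncated]' : html;
  return { testTitle, selector, url, errorMessage, htmlExcerpt };
}
```

Inside the catch block, the record could be serialized with JSON.stringify and attached to the report (for example with Playwright's testInfo.attach) before rethrowing, so the failing run carries everything the later heal prompt needs.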

Brittle vs resilient selectors

The deeper fix isn't faster healing — it's locators that don't need to be healed.

Brittle (heal often)

  • page.locator('#submit-btn-v2-final')
  • page.locator('.css-7d8jk2')
  • page.locator('div > section:nth-child(3) > button')
  • Breaks on rename, refactor, or framework upgrade

Resilient (heal rarely)

  • page.getByRole('button', { name: 'Submit order' })
  • page.getByLabel('Email')
  • page.getByTestId('submit-order')
  • Survives styling changes; only breaks on real UX changes

The resilient column tracks user-facing identity — the role, the label, the explicit testid that the team agreed represents the element. Those break only when the element's meaning changes, which is exactly when a test should break. Class-name and DOM-position selectors break on every refactor; that's noise, not signal.

Using the AI heal as a feedback loop

Every healing session is a small bug report on your locator strategy. The assistant just told you a particular line of test code is fragile. Don't just patch it — note the pattern. After three or four heal sessions across a sprint, the patterns are stark:

  • Tests that target by id heal often → ban id-based locators in code review.
  • Tests that target by class heal often → switch to data-testid for the affected components.
  • Tests that target by deep DOM paths heal often → introduce stable testids on the relevant section roots.

Use the AI to fix the immediate failure and to harvest the data on what broke, so that the next cycle of healing is shorter than the last.
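
The harvesting step can be as simple as tallying healed selectors by kind. The following is a rough sketch, assuming you keep a list of the selector strings that needed healing; classifySelector and tallyHeals are hypothetical helpers, and the classification rules are deliberately crude heuristics.

```typescript
type SelectorKind = 'id' | 'class' | 'dom-path' | 'other';

// Crude heuristic classification of a CSS selector string.
function classifySelector(selector: string): SelectorKind {
  const s = selector.trim();
  // Structural selectors: child combinators or positional pseudo-classes.
  if (/>|:nth-(child|of-type)\(/.test(s)) return 'dom-path';
  if (s.startsWith('#')) return 'id';
  if (s.startsWith('.')) return 'class';
  return 'other';
}

// Counts how many healed selectors fall into each kind.
function tallyHeals(healedSelectors: string[]): Record<SelectorKind, number> {
  const counts: Record<SelectorKind, number> = { id: 0, class: 0, 'dom-path': 0, other: 0 };
  for (const selector of healedSelectors) {
    counts[classifySelector(selector)] += 1;
  }
  return counts;
}

// The five broken selectors from the batch example earlier in this lesson.
const sprintHeals = [
  '#submit',
  '.coupon-input',
  'div.order-summary > p:nth-child(3)',
  'input[name="postcode"]',
  '.success-banner',
];
const tally = tallyHeals(sprintHeals);
// tally is { id: 1, class: 2, 'dom-path': 1, other: 1 }
```

Here class-based selectors lead the tally, which points at the second rule in the list above as the first one to act on.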

⚠️ Common mistakes

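  • Healing without verifying the new locator targets the right element. The AI suggestion looks plausible, say page.getByRole('button', { name: 'Submit' }), but the page now has two Submit buttons (one for the order, one for a feedback form). Playwright's strict mode rejects an action on a locator that matches both at once, but it cannot help when only one of them is rendered at that step, or when the suggestion sidesteps the ambiguity with .first(). The healed test then passes against the wrong button and silently misses regressions. Always run the healed test against a known-good and known-bad case before committing.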
  • Auto-merging healed tests on every CI failure. Tempting, but exactly the failure mode commercial auto-healing tools have to defend against. A renamed button isn't always a renamed button — sometimes it's a redesigned flow that should fail the test. Keep humans in the merge loop for any healing diff.
  • Treating healing as a substitute for fixing locator strategy. Three heals on the same component in two months is not "the AI saved us three times" — it's "we have a brittle locator and we're paying the AI to dodge the underlying fix." Add a stable testid, retire the brittle locator, and stop paying that tax.
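
For the first mistake, a cheap guard is to check how many elements a healed locator matches before accepting it. The sketch below is a hedged illustration: checkHealedLocator and CountableLocator are hypothetical names, and the interface only mirrors the count() method that Playwright's Locator provides, so the helper can be exercised without a browser.

```typescript
// Minimal structural type: Playwright's Locator has a count() method with this
// shape, so a real Locator can be passed in without importing Playwright here.
interface CountableLocator {
  count(): Promise<number>;
}

type HealVerdict = 'unique' | 'missing' | 'ambiguous';

// Reports whether a proposed locator matches exactly one element right now.
async function checkHealedLocator(locator: CountableLocator): Promise<HealVerdict> {
  const matches = await locator.count();
  if (matches === 0) return 'missing';
  if (matches > 1) return 'ambiguous';
  return 'unique';
}
```

In a real test this might be called as checkHealedLocator(page.getByRole('button', { name: 'Submit order', exact: true })). A 'unique' verdict is necessary but not sufficient: it shows the locator is unambiguous, not that it points at the element the test cares about, so the human review step still applies.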

🎯 Practice task

Heal a real broken test, then prevent the next break. 30 minutes.

  1. Find a broken test in your suite — or create one by deliberately changing a stable selector to a brittle one (e.g., replace getByRole('button', { name: 'Save' }) with locator('.btn-primary-v2')) and confirm it now fails on a small UI tweak.
  2. Hand the failure to the assistant using the prompt template above. Read the proposed heal, verify the new locator resolves to the right element on the live page, and apply.
  3. Run the test. Confirm it passes. Commit the heal.
  4. Pattern audit: open the diff. What kind of selector broke (id, class, testid, role, DOM path)? Add a code-review check that flags new code introducing the same pattern. The aim is to make the next failure a different kind of failure, not the same one again.
  5. Stretch: use the assistant in a meta way — "Search the tests/ directory for any selectors that are likely to break in the next refactor (id-based, class-based, deep DOM paths). Propose a stable replacement for each and order the list by risk." Pick the top three and refactor proactively.
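
For the code-review check in step 4, a small script can flag brittle selectors before the assistant is ever involved. This is a minimal sketch under stated assumptions: scanForBrittleSelectors and brittleReason are hypothetical helpers, the scan only looks at string arguments of .locator(...) calls, and a regex pass like this will miss selectors built dynamically or passed to other methods.

```typescript
interface Finding {
  line: number;     // 1-based line number in the scanned text
  selector: string; // the selector string found in a .locator(...) call
  reason: string;   // why it is considered brittle
}

// Returns a reason if the selector looks brittle, or null if it passes.
function brittleReason(selector: string): string | null {
  if (/>|:nth-(child|of-type)\(/.test(selector)) return 'depends on DOM position';
  if (selector.startsWith('#')) return 'id-based';
  if (selector.startsWith('.')) return 'class-based';
  return null;
}

// Scans test source text line by line for brittle .locator('...') arguments.
function scanForBrittleSelectors(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, index) => {
    // Fresh regex per line so lastIndex state never leaks between lines.
    const pattern = /\.locator\(\s*(['"`])(.+?)\1/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      const selector = match[2];
      const reason = brittleReason(selector);
      if (reason !== null) {
        findings.push({ line: index + 1, selector, reason });
      }
    }
  });
  return findings;
}

const sample = [
  "await page.locator('.btn-primary-v2').click();",
  "await page.getByRole('button', { name: 'Save' }).click();",
  "await page.locator('div > section:nth-child(3) > button').click();",
].join('\n');
const findings = scanForBrittleSelectors(sample);
```

On the sample, the getByRole line is left alone and the two locator lines are flagged. A lint rule or an assistant-driven audit would be more thorough; the point of the sketch is that the patterns from the audit are mechanical enough to check automatically.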

That closes Chapter 3. The next chapter shifts from authoring tests to executing them in AI-driven sessions: exploratory testing, bug reproduction, visual verification, and AI-assisted debugging.
