Your Lighthouse score isn't an accessibility test

qa.codes · 16 December 2025 · 7 min read

Beginner

accessibility · lighthouse · a11y

A 100 Lighthouse accessibility score doesn't mean your site is accessible. Lighthouse can't see keyboard navigation, can't see focus order, can't hear what a screen reader says. The score is a smoke alarm — useful, but not a test. Here's what it actually measures, and what you still need to check manually.

Part of Accessibility for QA

What Lighthouse accessibility is actually checking

Lighthouse's accessibility audit runs a subset of axe-core rules against the rendered page. That's the complete technical description of what it does. The score isn't a holistic accessibility assessment — it's a weighted percentage of axe-core rules the page passes, calculated from a static DOM snapshot.

The rules Lighthouse evaluates are structural: ARIA role usage on interactive elements, alt text presence on images, label associations for form inputs, colour contrast ratios between text and background, heading hierarchy, document language declaration, and bypass links for navigation. These are genuinely important, and catching them automatically is valuable. But they represent a specific, bounded category of accessibility failure — failures you can detect by inspecting a DOM snapshot without any user interaction.
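Colour contrast is a good example of why this category suits automation: it reduces to arithmetic on two colour values. The sketch below follows the WCAG 2.x relative luminance and contrast ratio formulas, assuming opaque 6-digit hex colours. The function names are illustrative and are not taken from Lighthouse or axe-core, which also have to handle transparency, background images, and text size.

```typescript
// WCAG 2.x contrast ratio for two opaque sRGB colours given as 6-digit hex.
// Function names are illustrative, not part of Lighthouse or axe-core.
type Rgb = { r: number; g: number; b: number };

function parseHex(hex: string): Rgb {
  const h = hex.startsWith("#") ? hex.slice(1) : hex;
  if (!/^[0-9a-fA-F]{6}$/.test(h)) {
    throw new Error(`Expected a 6-digit hex colour, got "${hex}"`);
  }
  return {
    r: parseInt(h.slice(0, 2), 16),
    g: parseInt(h.slice(2, 4), 16),
    b: parseInt(h.slice(4, 6), 16),
  };
}

// Convert one 0-255 channel to its linear-light value.
function linearChannel(value: number): number {
  const c = value / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance({ r, g, b }: Rgb): number {
  return (
    0.2126 * linearChannel(r) +
    0.7152 * linearChannel(g) +
    0.0722 * linearChannel(b)
  );
}

// Ratio of the lighter to the darker luminance, each offset by 0.05.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(parseHex(foreground));
  const b = relativeLuminance(parseHex(background));
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG AA asks for at least 4.5:1 for normal-size text.
const passesAA = contrastRatio("#767676", "#ffffff") >= 4.5; // true
const failsAA = contrastRatio("#777777", "#ffffff") >= 4.5; // false, just under 4.5
```

A one-step change in grey value flips the result, which is exactly the kind of failure a tool catches reliably and a human reviewer does not.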

The scoring is weighted, which matters more than most teams realise. Each rule either passes or fails, and not all rules contribute equally to the final score. A site that fails color-contrast, which Lighthouse weights heavily, will score much lower than one that fails several low-weight rules. A 100 score means every weighted rule the tool checks passed. It says nothing about what the tool doesn't know how to check, which is considerable.
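To make the weighting concrete, here is a minimal sketch of a weighted pass/fail score. The rule ids and weights are made up for illustration and are not Lighthouse's real values, and returning 100 when no rules apply is an assumption of this sketch.

```typescript
// Weighted pass/fail score, as a percentage.
// Rule ids and weights below are hypothetical, not Lighthouse's actual values.
type AuditResult = { id: string; weight: number; passed: boolean };

function weightedScore(audits: AuditResult[]): number {
  const total = audits.reduce((sum, a) => sum + a.weight, 0);
  if (total === 0) return 100; // assumption: nothing applicable counts as a pass
  const earned = audits.reduce((sum, a) => sum + (a.passed ? a.weight : 0), 0);
  return Math.round((earned / total) * 100);
}

// Build a result set in which the listed rule ids fail and the rest pass.
function withFailures(failedIds: string[]): AuditResult[] {
  const rules = [
    { id: "heavy-rule", weight: 10 },
    { id: "light-rule-a", weight: 2 },
    { id: "light-rule-b", weight: 2 },
    { id: "light-rule-c", weight: 2 },
    { id: "medium-rule", weight: 4 },
  ];
  return rules.map((r) => ({ ...r, passed: !failedIds.includes(r.id) }));
}

const oneHeavyFailure = weightedScore(withFailures(["heavy-rule"])); // 50
const threeLightFailures = weightedScore(
  withFailures(["light-rule-a", "light-rule-b", "light-rule-c"]),
); // 70
```

One failing heavy rule costs more than three failing light ones, so the score reflects which rules failed rather than how many.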

The surface Lighthouse can't see

Keyboard-only navigation. Lighthouse doesn't press Tab. It doesn't verify whether every interactive element is reachable via keyboard, whether the tab order follows a logical visual sequence, whether skip links function correctly, or whether custom interactive widgets (comboboxes, date pickers, drag-and-drop interfaces) respond correctly to keyboard input. A site can score 100 on Lighthouse and be entirely unusable for someone who relies on a keyboard because they have a motor impairment.
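One reason tab order drifts away from visual order is the tabindex attribute. The sketch below models the sequential focus order browsers are expected to follow: elements with a positive tabindex come first in ascending order, then elements with tabindex 0 in DOM order, while negative values are skipped. The Focusable type and element names are hypothetical, and the model ignores details such as disabled controls and shadow DOM.

```typescript
// Simplified model of sequential keyboard focus order driven by tabindex.
// The Focusable type and the element names are hypothetical.
type Focusable = { name: string; tabIndex: number };

function tabSequence(elementsInDomOrder: Focusable[]): string[] {
  // Negative tabindex: focusable from script, but skipped by the Tab key.
  const reachable = elementsInDomOrder.filter((el) => el.tabIndex >= 0);
  // Positive values jump the queue, lowest first; ties keep DOM order
  // because Array.prototype.sort is stable.
  const positive = reachable
    .filter((el) => el.tabIndex > 0)
    .sort((a, b) => a.tabIndex - b.tabIndex);
  // tabindex 0 (and natively focusable elements) follow in DOM order.
  const natural = reachable.filter((el) => el.tabIndex === 0);
  return [...positive, ...natural].map((el) => el.name);
}

const order = tabSequence([
  { name: "logo-link", tabIndex: 0 },
  { name: "search-input", tabIndex: 0 },
  { name: "custom-dropdown", tabIndex: -1 },
  { name: "subscribe-button", tabIndex: 1 },
  { name: "footer-link", tabIndex: 0 },
]);
// order: subscribe-button, logo-link, search-input, footer-link
```

Here the subscribe button receives focus first although it sits near the bottom of the page, and the custom dropdown is never reached. A model like this only explains the mechanism; confirming the real order still means pressing Tab through the page.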

Focus visibility and management. Lighthouse's automated audits don't evaluate focus indicators; its report lists focus-related checks, such as logical tab order and focus traps, as items to verify manually. It cannot verify that a visible focus ring exists and has sufficient contrast against the element background, that a modal dialog traps focus correctly so keyboard users can't tab out of it, or that focus is returned to the correct element when a dialog closes.

Screen reader output. Lighthouse reads the DOM structure. It does not listen to what NVDA, JAWS, or VoiceOver actually announce when a user navigates or interacts. A button with aria-label="close dialog" passes Lighthouse's accessible-name check. That check confirms the button has a name, not that the label is meaningful in context or that a screen reader user would understand what they're closing without visual context.

Dynamic content and ARIA live regions. Lighthouse inspects the page as rendered at a single point in time. It doesn't click, type, or wait for responses. Whether a live region correctly announces search results, error messages, toast notifications, or form submission feedback is invisible to a static DOM inspection: whether those announcements fire at all, whether they're appropriately verbose, and whether they fire so often that they become noise.

Cognitive and language accessibility. WCAG 2.1 includes success criteria for reading level (3.1.5), consistent navigation (3.2.3), error identification and suggestion (3.3.1, 3.3.3), and instructions that don't rely solely on sensory characteristics (1.3.3). None of these can be evaluated by automated inspection. They require a human to read the content and judge whether it communicates clearly.

Why this matters: the false confidence problem

None of this is a criticism of Lighthouse itself — every automated tool has limitations, and Lighthouse's are honestly documented. The problem is what happens when the score gets used as an accessibility certification.

The pattern appears frequently: a team ships a feature, someone runs a Lighthouse audit, gets 100, and the accessibility checkbox in the ticket is marked complete. No one navigated with a keyboard. No one tested with a screen reader. The tab order through the new modal is illogical. The modal doesn't trap focus, so keyboard users can tab through the content behind it. A form error appears but the live region announcing it is misconfigured and screen readers never hear it. All of this passes Lighthouse with a perfect score.

The score creates false confidence that is particularly harmful because it looks authoritative: a number out of 100 with a green badge feels like evidence. The people most likely to be harmed by accessibility failures (users with visual, motor, or cognitive disabilities) are the least likely to be in the room when the score is reviewed.

The actual minimum accessibility test suite

This is what I consider a meaningful baseline — not everything, but the minimum before calling a feature accessible.

Automated (run in CI):

axe-core integrated into your E2E test suite, scoped to critical user paths
Lighthouse CI in regression mode: use it to catch drops from a previous baseline, not as an absolute pass/fail gate
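For the axe-core item, one possible way to turn results into a CI gate is to fail only on higher-impact violations. The sketch below assumes a results object shaped like axe-core's output (a violations array whose entries carry an id, an impact, and a list of nodes); the local types mirror only that subset, and blockingViolations is a hypothetical helper rather than part of any library. In a real suite the results would come from your axe integration, for example the analyze() call in @axe-core/playwright.

```typescript
// Filter axe-style results down to the violations that should fail a build.
// The types mirror a small subset of axe-core's result shape;
// blockingViolations is a hypothetical helper, not a library function.
type Impact = "minor" | "moderate" | "serious" | "critical";
type Violation = { id: string; impact?: Impact | null; nodes: unknown[] };
type AxeLikeResults = { violations: Violation[] };

const IMPACT_RANK: Record<Impact, number> = {
  minor: 1,
  moderate: 2,
  serious: 3,
  critical: 4,
};

function blockingViolations(
  results: AxeLikeResults,
  minImpact: Impact = "serious",
): string[] {
  return results.violations
    .filter(
      (v) => v.impact != null && IMPACT_RANK[v.impact] >= IMPACT_RANK[minImpact],
    )
    .map((v) => `${v.id} (${v.impact}, ${v.nodes.length} node(s))`);
}

// Hand-written sample data standing in for a real scan.
const sampleResults: AxeLikeResults = {
  violations: [
    { id: "color-contrast", impact: "serious", nodes: [{}, {}] },
    { id: "region", impact: "moderate", nodes: [{}] },
  ],
};

const blocking = blockingViolations(sampleResults);
// blocking: ["color-contrast (serious, 2 node(s))"]
```

In a test you would assert that the returned list is empty and print it on failure. The default threshold of "serious" is a judgment call: a stricter team can pass "moderate" or "minor".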

Manual (run before release):

Keyboard-only navigation pass: starting from a fresh page load, can you reach and operate every interactive element using Tab, Shift+Tab, Enter, Space, and arrow keys? Can you open and close every modal and dropdown? When a dialog closes, does focus return to the element that opened it?
Screen reader smoke check: navigate the critical path with VoiceOver on macOS (or NVDA on Windows, or TalkBack on Android). Do the page headings make sense without visual context? Do interactive elements announce their purpose? Do form errors get announced when they appear?

These last two items are not optional and cannot be automated. axe-core catches roughly 30–40% of WCAG failures. Lighthouse catches a subset of those. The rest require a human.

Lighthouse's defensible use case

None of this is an argument for skipping Lighthouse. There are two use cases where it provides clear value.

Regression detection. If a page scores 100 in CI today and 80 tomorrow, something changed. That delta is a reliable signal that a structural accessibility issue was introduced. Lighthouse CI — which tracks scores across commits and fails the build on drops below a threshold — is a solid regression gate. Use it this way, not as a "we tested accessibility" claim.
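The comparison behind a regression gate is simple enough to sketch. This is an illustration of the logic only, assuming you have stored a baseline score from an earlier run; checkScoreRegression and its tolerance parameter are hypothetical and are not Lighthouse CI's API, which is configured through its own assertion settings.

```typescript
// Compare the current accessibility score with a stored baseline.
// checkScoreRegression is a hypothetical helper, not part of Lighthouse CI.
type RegressionVerdict = { ok: boolean; delta: number; message: string };

function checkScoreRegression(
  baseline: number,
  current: number,
  tolerance = 0,
): RegressionVerdict {
  const delta = current - baseline;
  // Improvements and drops within the tolerance both pass.
  const ok = delta >= -tolerance;
  const message = ok
    ? `Accessibility score ${current} is within ${tolerance} of baseline ${baseline}`
    : `Accessibility score dropped from ${baseline} to ${current}: investigate what changed`;
  return { ok, delta, message };
}

const verdict = checkScoreRegression(100, 80);
// verdict.ok is false and verdict.delta is -20
```

A small tolerance can absorb run-to-run noise, but keep the wording of the failure honest: a pass here means "no structural regression detected", not "accessibility tested".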

Early-stage awareness. A team that has never thought about accessibility will find Lighthouse surfacing real, immediately actionable issues: images without alt text, inputs without labels, contrast ratios below the minimum. Going from a Lighthouse score of 50 to 100 represents real improvements that benefit real users, even though 100 isn't the finish line.

The framing that I find most accurate: Lighthouse is a smoke alarm. It detects a specific and important category of problem. It does not detect all problems, and a quiet alarm doesn't mean the building is safe — it means the specific class of fire the alarm is sensitive to isn't present. The rest of the safety check still needs to happen.
