How to write test cases developers actually read
Test cases that get read are short, scannable, and written for the person who has to act on them. Here is the format I use.
Blog
First-person walkthroughs of solving one specific problem — what we tried, what worked, what didn't. The blog version: dated, narrative, ours. For structured lessons in chapters, see Learn.
A high-value checklist: the twelve API bugs that surface most often, from wrong status codes to idempotency failures.
A ten-minute accessibility pass any QA can run before release — keyboard, focus, contrast, and the obvious screen-reader checks.
The auth and session bugs that show up in normal functional testing — no exploit tooling required.
A sign-off checklist short enough that people actually use it — and specific enough to catch the things that block releases.
A practical evaluation pass for AI chat features: hallucinations, refusals, prompt injection, and the cases with no single right answer.
The specific bugs that hide in paginated, filtered, and sorted endpoints — off-by-one pages, unstable sorts, and filter leaks.
Tab through the page. That single habit catches more accessibility bugs than most automated scans.
Password reset is a deceptively risky flow — token reuse, expiry, enumeration, and session handling all hide here.
A short, device-real smoke pass: permissions, offline, rotation, interruptions, and the update path.
AI writes plausible Playwright tests that pass for the wrong reasons. Here is the review checklist that catches them.
What take-home reviewers actually score, and how to spend your limited time on the parts that count.
How to scope a regression pass to the change in front of you instead of re-running the entire suite by hand.
A session that lives too long is a hole; one that survives logout defeats the point. Here is the session-expiry pass: idle, absolute, logout, reset, remember-me, and fixation.
Notifications behave differently when the app is foregrounded, backgrounded, or killed, and they can deep-link to the wrong place when they arrive. The killed-app cold start is where it breaks.
The interesting offline bugs are in the transitions, not the offline state: double-submits on reconnect, in-flight requests that die, optimistic UI that never rolls back.
A bug report exists to get the bug fixed. Specific title, minimal repro steps, explicit expected-vs-actual, evidence, and environment — the format that prevents "can't reproduce".
Catch the blatant screen-reader failures in fifteen minutes with the reader already on your machine — meaningful names, sensible images, labelled fields, announced changes.
Forms are where accessibility breaks hardest: labels, required state, announced errors, focus management, and keyboard-operable custom widgets. This is the form-specific pass.
A test strategy is a short set of project-specific decisions, not a generic thirty-page document. Scope, risk, levels, automation split, data, ownership, and what "done" means.
A screenshot isn't a repro when outputs vary. Capture the full assembled prompt, retrieved context, model version, and parameters so an AI bug is actually reproducible.
A charter-driven, time-boxed template for exploratory testing: 5 minutes to charter, 35 to test, 10 to debrief — and notes someone can read.
The full multi-factor auth test surface: bypass, wrong/expired/reused codes, brute-force lockout, recovery, and the usability cases most teams skip.
Permission bugs live in deny, revoke, and 'ask every time' — not the grant happy path. The per-permission, per-platform matrix that catches them.
Not a full load test — a fast, fixed, repeatable check on a few critical endpoints, compared to baseline, that catches gross regressions before sign-off.
Concrete test cases for AI hallucination — unanswerable questions, false premises, invented entities, citations — and how to judge answers with no 'correct' value.
Test the full rate-limit contract — enforcement, 429, Retry-After headers, recovery, scope — with a low configurable limit and a dedicated key, not by flooding shared staging.
An OpenAPI spec is a ready-made test plan — every param and status code is a case — and its gaps (missing errors, unbounded fields, drift) predict the bugs.
Concrete, reusable accessibility cases for the two highest-consequence flows — keyboard completion, labels, announced errors, focus management — where a barrier blocks a core task.
Treat the auth token as an input: test that it expires, dies on logout, can't cross scope or user, doesn't leak, and rejects tampering — all with your normal API client.
A security report has extra duties: private channel, impact over exploit, test data only, redacted evidence, clear severity — getting it fixed without making it worse.
Which k6 metrics matter and which mislead: check the error rate first, read p95/p99 not the average, confirm the load profile, and compare to a baseline.
A short, risk-first status format: lead with a one-line risk verdict, then what's at risk, key findings, light coverage numbers, and explicit asks — built to drive a decision.
Write an operating manual for the person taking over, not a diary: current state, setup, known issues with status, gotchas, and pointers, so they can carry on without asking you.
Get the speed of an AI agent on your test repo without the mess: work on a branch, review every change like a junior's PR, and make tests fail first to catch assert-nothing tests.
Most teams over-abstract too early. Four custom commands are worth writing on every Cypress project — login, seed, intercept, visit. The rest can wait.
cy.intercept is the most powerful command in Cypress and the one teams most often misuse. Here's the playbook: when to alias, when to stub, when to spy, and the race-condition-shaped bug that intercepts usually catch.
Most explanations of Playwright fixtures lean on React-hook metaphors that miss the point. Fixtures are scoped factories. Here's what to do with them and the three every project should have.
The practical playbook for AI-assisted test writing in 2026. The prompts that work, the prompts that don't, and the human-in-the-loop checkpoints that keep AI from writing tests that pass for the wrong reasons.
axe-core is the engine behind most accessibility testing in 2026 — and it's surprisingly approachable. Here's a practical walkthrough of integrating axe with Playwright, what it catches, and what it misses.