Generating Tests from Natural Language

Generating tests from natural language is the feature most QA engineers try first — and the one where prompt quality matters most. A vague request produces a generic test. A grounded request produces something you would have written yourself. This lesson covers the anatomy of a prompt that works, examples across frameworks, and what to do after Claude Code writes the file.

The anatomy of a good generation prompt

Three things make a generation prompt effective:

  1. What: the feature or scenario to test, including specific test cases (not "comprehensive coverage")
  2. Where: the exact file path and the existing code to use (page objects, helpers, base classes)
  3. How: an existing test file for Claude Code to match in style

Generate a Playwright test for the login flow at https://staging.myapp.com/login.
 
Save it to tests/auth/login.spec.ts.
Read tests/auth/registration.spec.ts first and match its style.
Use the LoginPage from src/pages/LoginPage.ts.
Cover: valid login, wrong password, locked account, MFA prompt.

This prompt answers all three questions. It does not say "write a comprehensive test" — it names the scenarios. It does not say "use best practices" — it points at an existing file that already embodies your team's practices.
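If your team writes many prompts like this, the three parts can be captured in a small template. The sketch below is only an illustration: `GenerationPrompt` and `buildGenerationPrompt` are hypothetical names, not part of Claude Code, and the output is plain text that you paste into the session yourself.

```typescript
// Hypothetical helper for assembling a What / Where / How prompt.
// None of these names come from Claude Code; they exist only for illustration.
interface GenerationPrompt {
  framework: string;      // e.g. "Playwright"
  feature: string;        // What: the feature or flow under test
  scenarios: string[];    // What: the specific cases to cover
  outputPath: string;     // Where: the file to write
  existingCode: string[]; // Where: page objects, helpers, base classes to reuse
  referenceTest: string;  // How: an existing test whose style should be matched
}

function buildGenerationPrompt(p: GenerationPrompt): string {
  // Refuse to build a prompt that names no scenarios at all.
  if (p.scenarios.length === 0) {
    throw new Error("Name at least one scenario instead of asking for comprehensive coverage.");
  }
  const lines: string[] = [
    `Generate a ${p.framework} test for ${p.feature}.`,
    `Save it to ${p.outputPath}.`,
    `Read ${p.referenceTest} first and match its style.`,
  ];
  for (const item of p.existingCode) {
    lines.push(`Use ${item}.`);
  }
  lines.push(`Cover: ${p.scenarios.join(", ")}.`);
  return lines.join("\n");
}

// Rebuilds the login prompt shown above from its parts.
const loginPrompt = buildGenerationPrompt({
  framework: "Playwright",
  feature: "the login flow at https://staging.myapp.com/login",
  scenarios: ["valid login", "wrong password", "locked account", "MFA prompt"],
  outputPath: "tests/auth/login.spec.ts",
  existingCode: ["the LoginPage from src/pages/LoginPage.ts"],
  referenceTest: "tests/auth/registration.spec.ts",
});

console.log(loginPrompt);
```

The guard on an empty scenario list is a design choice: it blocks the one prompt the lesson warns against, a request for coverage with no named cases.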

Example — Playwright with TypeScript

Read tests/checkout/ to understand how I structure checkout tests.
Then generate a test for the guest checkout flow.
 
Save to tests/checkout/guest-checkout.spec.ts.
The flow starts at /cart. Cover:
- Successful checkout as a guest
- Out-of-stock item blocking checkout
- Invalid credit card number error
- Missing required address field validation

Claude Code reads your existing checkout tests, matches their page object usage and assertion style, and generates a test that fits the rest of the suite. Review it before it goes into CI.

Example — Selenium with Java

Add a TestNG class for the password reset flow.
Save to src/test/java/com/myapp/auth/PasswordResetTest.java.
 
- Extend BaseTest for setup and teardown
- Use AuthPage from the existing page object layer
- Cover: valid email address, unknown email, expired reset token,
  malformed token, rate limiting after 5 attempts

Claude Code reads your BaseTest, finds your AuthPage, and writes a class that fits the existing hierarchy — not its own invented base class or import pattern.

Example — Cypress with custom commands

Look at cypress/support/commands.ts to understand my custom commands.
Generate a Cypress test for the account settings page.
 
Save to cypress/e2e/account/settings.cy.ts.
Use cy.loginAs() from my custom commands for setup.
Cover: updating display name, changing email, enabling two-factor auth.

Pointing Claude Code at your commands.ts means it uses cy.loginAs() rather than reinventing setup inline.

After generation — the essential review

Every generated test needs a read before it runs. Specifically check:

  • Are the selectors real? If Playwright MCP is not connected, Claude Code cannot inspect the rendered page, so it guesses selectors from the markup it infers. Verify them against the actual page.
  • Do the assertions test the right thing? A test that clicks "submit" and then checks only expect(page.url()).toContain('/success') passes even if the success page renders an error or the submission was never saved. Ask yourself: if this feature broke, would this assertion catch it?
  • Are negative cases actually negative? Generated error-path tests sometimes assert the happy-path success condition instead. Read each one.
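To make the "would this assertion catch it?" question concrete, here is a minimal simulation in plain TypeScript, with no browser involved. The `PageState` shape and its field names are assumptions made for illustration. In a real suite these values would come from the page under test.

```typescript
// Simulated page state after clicking "submit".
// The shape and field names are hypothetical, chosen only for this sketch.
interface PageState {
  url: string;
  confirmationVisible: boolean;
  errorBannerText: string | null;
}

// Weak check: looks only at the URL, like expect(page.url()).toContain('/success').
function weakCheck(state: PageState): boolean {
  return state.url.indexOf("/success") !== -1;
}

// Stronger check: the URL, a visible confirmation, and no error banner.
function strongCheck(state: PageState): boolean {
  return (
    state.url.indexOf("/success") !== -1 &&
    state.confirmationVisible &&
    state.errorBannerText === null
  );
}

const healthy: PageState = {
  url: "https://staging.myapp.com/checkout/success",
  confirmationVisible: true,
  errorBannerText: null,
};

// A broken feature: the app still lands on /success but shows an error.
const broken: PageState = {
  url: "https://staging.myapp.com/checkout/success",
  confirmationVisible: false,
  errorBannerText: "Payment failed",
};

console.log(weakCheck(broken));    // true: the weak check misses the failure
console.log(strongCheck(broken));  // false: the stronger check catches it
console.log(strongCheck(healthy)); // true
```

The same reasoning applies to negative cases: an error-path test should assert on the error itself, not only on the absence of a redirect.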

Iterating on a generated test

> The test for the locked account case uses cy.get('#error-message'). 
  Our convention is data-testid. Update it to use cy.findByTestId('login-error').
> Add a beforeEach that clears cookies and localStorage to prevent test pollution.
> The MFA test is missing. After correct credentials, the app shows
  a 6-digit code input at /mfa. Add that scenario.

Each iteration edits the file in place, and Claude Code shows exactly what changed. This is typically faster than writing from scratch or waiting on a traditional code review cycle.
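A convention like the data-testid rule in the first refinement can also be checked mechanically before you ask for changes. The sketch below is a rough heuristic, not a real linter: `findNonTestIdSelectors` is a hypothetical helper, and it assumes selectors appear as the first string argument of `cy.get(...)` or `page.locator(...)`.

```typescript
// Hypothetical audit: lists selectors in a test file's source text that do
// not mention data-testid. It is a regex heuristic, not a parser, so it
// misses selectors built from variables or passed to other helper calls.
function findNonTestIdSelectors(source: string): string[] {
  // Captures the first quoted argument of cy.get(...) or page.locator(...).
  const callPattern = /(?:cy\.get|page\.locator)\(\s*(['"`])(.*?)\1/g;
  const flagged: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = callPattern.exec(source)) !== null) {
    const selector = match[2];
    if (selector.indexOf("data-testid") === -1) {
      flagged.push(selector);
    }
  }
  return flagged;
}

// Sample source text standing in for a generated test file.
const generatedSource = [
  "cy.get('#error-message').should('be.visible');",
  "cy.get('[data-testid=\"login-error\"]').should('contain', 'locked');",
  "page.locator('.submit-btn').click();",
].join("\n");

// Flags '#error-message' and '.submit-btn'; the data-testid selector passes.
console.log(findNonTestIdSelectors(generatedSource));
```

Each flagged selector is a candidate for a refinement request like the first one above.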

Orient Claude Code

Ask it to read one or two existing tests that represent your conventions. This sets the style context before generation.

⚠️ Common Mistakes

  • Accepting without reading. The biggest risk with AI-generated tests is not that they fail — it is that they pass while testing the wrong thing. Read every generated file before running it.
  • Not giving reference examples. "Write a Playwright test in TypeScript" produces generic boilerplate. "Follow the pattern in tests/auth/registration.spec.ts" produces a test that belongs in your project.
  • Asking for "comprehensive" coverage without naming scenarios. Claude Code will invent scenarios that sound reasonable but miss your domain's actual edge cases. Name the cases that matter.

🎯 Practice Task

Generate a real test for a real project. 20 minutes.

  1. Pick a feature that lacks test coverage.
  2. Find one existing test that represents your conventions — this is your reference.
  3. Write a prompt that names: the feature, the file path, the reference test to match, and at least three specific scenarios.
  4. Let Claude Code generate the test. Read it before running it.
  5. Make at least one refinement based on what you read.
  6. Note the round-trip time: how long did generation + review + refinement take compared to writing from scratch?

The next lesson tackles the other side of the coin — not writing new tests, but fixing and improving the ones that already exist.
