Writing Effective Gherkin — Best Practices and Anti-Patterns

By now you can write syntactically valid Gherkin. But valid Gherkin and good Gherkin are different things. This lesson is about the gap between them — the patterns that make feature files a team asset and the anti-patterns that make them a maintenance burden nobody reads.

The goal: scenarios a product owner can review

Ask yourself this before committing a feature file: could a product owner, a business analyst, or a new engineer unfamiliar with the codebase read these scenarios and verify they match their understanding of how the system should behave?

If the answer is no, the scenarios are not doing their job. They may run correctly as tests, but they've lost the communication value that makes BDD worth the overhead.

What good Gherkin looks like

# Good
Scenario: Successful checkout creates an order
  Given the user has 2 items in the cart
  When the user completes checkout with valid payment
  Then an order confirmation email should be sent
  And the order should appear in the user's order history
 
# Good
Scenario: Adding an item to the cart increases the cart count
  Given the user is browsing the product catalogue
  When the user adds "Laptop" to the cart
  Then the cart icon should show 1 item

Both scenarios are short, describe one behaviour each, and could be understood by anyone who uses the application. No element IDs. No URLs. No implementation details.

The declarative vs imperative divide

This is the single most important concept in effective Gherkin. Compare:

# Bad — imperative
Scenario: User adds item to cart
  Given the user navigates to "https://shop.example.com/products"
  When the user finds the element with CSS selector ".product-card:first-child"
  And the user clicks the button with data-testid "add-to-cart-btn"
  And the user waits 2 seconds for the animation to complete
  Then the element with id "cart-count" should contain the text "1"

# Good — declarative
Scenario: User adds item to cart
  Given the user is on the product catalogue
  When the user adds the first product to the cart
  Then the cart should show 1 item

The imperative version is 5 steps of Selenium script embedded in Gherkin. When the CSS class changes from .product-card to .product-tile, the feature file breaks. When the animation delay changes, the feature file becomes incorrect. When a non-technical stakeholder reads it, they see technical noise, not business behaviour.

The declarative version has 3 steps. None of them contain a CSS selector, a URL, or a wait. All implementation details live in the step definitions — the only place they belong.

The anti-patterns, named

UI-step addiction — narrating every click, every field clear, every animation. Steps like When the user clicks the button with id "submit" are test scripts in Gherkin clothing. Rewrite as When the user submits the form.

Leaking technical details — URLs, API endpoints, CSS selectors, database column names, HTTP status codes in the Given/When/Then steps. These belong in step definitions. A feature file that references /api/v2/users becomes incorrect the moment the API moves to v3.

Scenario overloading — testing 5 behaviours in 1 scenario. A scenario with 20 steps is testing a flow, not a behaviour. Split at natural boundaries. "Successful registration" and "Successful first login after registration" are two scenarios, not one.

Missing Then — a scenario with only Given and When steps makes no assertion. It passes as long as no step throws an error, even if the outcome the scenario is named after never happens. Every scenario needs at least one Then.

Scenario as setup — using a scenario purely to create test data for other scenarios. Scenarios should be independent. Test data setup belongs in @Before hooks or Given step definitions.

Over-specified preconditions — the Given block describes the exact database state, API call sequence, and user history needed. Business scenarios don't usually need this detail. Summarise: Given the user has an active subscription, not Given the user has a subscription created on 2024-01-15 with plan ID 42 and status code 1.
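
The sketch below shows how several of these fixes might look together in one feature file. The feature name and any step wording not quoted above are illustrative assumptions rather than steps from a real suite.

```gherkin
# Illustrative sketch: wording beyond the steps quoted in the lesson is assumed.
Feature: Account registration and access

  # Scenario overloading avoided: registration and first login are separate
  # scenarios, and each one ends in a Then so it can actually fail.
  Scenario: Successful registration
    Given a visitor without an account
    When the visitor registers with valid details
    Then the visitor should see a registration confirmation

  # Scenario as setup avoided: this Given creates its own registered user
  # instead of relying on the scenario above having run first.
  Scenario: Successful first login after registration
    Given a user who has just registered
    When the user logs in with valid credentials
    Then the user should see their account dashboard

  # Over-specified preconditions avoided: no dates, plan IDs or status codes.
  Scenario: Subscriber can read premium content
    Given the user has an active subscription
    When the user opens a premium article
    Then the full article should be displayed
```

The Given steps here are expected to build their data inside the step definition or an @Before hook, so any scenario can run on its own and in any order.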

The "three perspectives" test

Good scenarios satisfy three perspectives simultaneously:

Product owner: "Does this describe what users actually do and care about?"
Developer: "Are the steps abstract enough that I can implement them in multiple ways?"
Tester: "Does this cover the right behaviour, and could it realistically fail when something breaks?"

If a scenario fails any of these, it needs work.

Step reuse as a design constraint

The step definition @When("the user logs in with valid credentials") is reusable across every feature that needs a logged-in user. The step definition @When("the user types \"alice@test.com\" in the field labelled \"Email address\" and clicks the blue button") is reusable by nobody. Good Gherkin produces naturally reusable steps.

Count how many step definitions you have after writing 10 scenarios. If the count is close to 10 × N, where N is the average number of steps per scenario, then nearly every step is unique and your Gherkin is too specific. A healthy suite reuses steps heavily: a new scenario should mostly assemble existing steps rather than require new ones.
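
To make the count concrete, here is a small sketch that rearranges steps from the checkout example earlier in this lesson. The feature name and scenario titles are assumptions.

```gherkin
Feature: Checkout

  Scenario: Completed order appears in order history
    Given the user has 2 items in the cart
    When the user completes checkout with valid payment
    Then the order should appear in the user's order history

  Scenario: Completed checkout sends a confirmation email
    Given the user has 2 items in the cart
    When the user completes checkout with valid payment
    Then an order confirmation email should be sent
```

These two scenarios contain 6 step lines but only 4 distinct step texts, so they need 4 step definitions, not 6. If the cart step is bound to a pattern such as the user has {int} items in the cart (an assumed pattern), a later scenario that starts with 5 items adds no new definition for that step.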

Good vs bad: side by side

Anti-patterns to avoid | Patterns to follow
---------------------- | ------------------
Technical details: CSS selectors, IDs, URLs | Business language: 'the user submits the form'
20+ steps per scenario | 3–7 steps per scenario
Multiple behaviours per scenario | One behaviour per scenario
No Then assertion | At least one Then assertion per scenario
Duplicate setup across scenarios (no Background) | Background for shared preconditions
Literal values hardcoded in step text (not parameterised) | Parameters: {string} and {int} for variable values
Written by QA alone, after development ends | Written by Three Amigos before development starts
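
As a rough illustration of the Background and parameter items above, the following sketch moves a shared precondition into Background and keeps variable values as parameters. The product names and the Cucumber Expression patterns mentioned in the comments are assumptions, not definitions from an existing project.

```gherkin
Feature: Shopping cart

  # Shared precondition, stated once instead of in every scenario.
  Background:
    Given the user is on the product catalogue

  # Assumed patterns behind these steps:
  #   the user adds {string} to the cart
  #   the cart should show {int} item(s)
  Scenario: Adding one product updates the cart count
    When the user adds "Laptop" to the cart
    Then the cart should show 1 item

  Scenario: Adding two products updates the cart count
    When the user adds "Laptop" to the cart
    And the user adds "Mouse" to the cart
    Then the cart should show 2 items
```

Background steps run before every scenario in the file, so keep them to preconditions that every scenario genuinely shares.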

The review habit

Make feature file review a regular practice. Before every sprint, run a 20-minute session where the team reads the previous sprint's new scenarios out loud. Whenever someone can't follow a scenario without asking "what does this step mean?", that step should be rewritten.

Feature files that nobody reads drift towards becoming test scripts. Feature files that the team actively maintains become the most honest documentation the project has.

⚠️ Common mistakes

Thinking good Gherkin is about Cucumber. You can write excellent feature files before knowing what step definitions are. Good Gherkin is a writing skill, not an automation skill.
Short scenarios that are still imperative. A 3-step scenario with a CSS selector is still an anti-pattern. Length alone doesn't make Gherkin good — abstraction does.
Not getting stakeholder review. The purpose of plain English scenarios is that non-technical people can read them. If you never show them to a product owner or business analyst, you've removed the feedback loop that makes the language choice worthwhile.

🎯 Practice task

Audit and refactor existing feature files for clarity. 30–40 minutes.

Review your login.feature and checkout.feature from previous lessons. Read each scenario from the perspective of a non-technical product owner. Mark any step that contains a technical detail (ID, URL, selector, status code).
Rewrite each marked step to be declarative. For example, When the user clicks element with id "submit" becomes When the user submits the login form.
Check scenario length. Any scenario over 7 steps: split it into two scenarios or extract shared steps to Background.
Confirm every scenario has at least one Then step with a meaningful assertion.
Stretch: write 3 new scenarios for a feature you haven't covered yet (e.g., product filtering, user profile update, password change). Write them with a product owner in mind — no technical details allowed. Then write the step definitions and run them. You'll find that good declarative Gherkin naturally produces cleaner, more reusable step definitions.

This wraps up Chapter 2. Chapter 3 dives into step definitions: parameter expressions, hooks, dependency injection for sharing state between steps, and integrating Cucumber with the Page Object Model.