Stop writing BDD tests you don't actually need
Cucumber and Gherkin make sense when non-technical stakeholders write tests. They don't make sense when engineers write tests for engineers. Here's the pragmatic test: who actually reads your tests? If the answer is 'just engineers', BDD is overhead with no upside.
What BDD actually solved
Dan North began developing BDD around 2003 as an evolution of TDD and laid it out publicly in his 2006 article "Introducing BDD". The core insight: tests that read like English sentences can be written and reviewed by non-technical stakeholders (business analysts, product managers, QA leads without coding backgrounds). The dream was executable specifications: a product manager writes a .feature file describing how a feature should behave, engineers implement it, and the file becomes the living documentation.
Feature: User checkout
  Scenario: Guest user can place an order
    Given I am on the product page for "Mechanical Keyboard"
    And I have not logged in
    When I click "Add to cart"
    And I proceed to checkout
    And I enter my shipping details
    Then I should see an order confirmation
    And I should receive a confirmation email
That's a real value proposition. A product manager can read that, verify it matches their understanding of the feature, and even catch bugs in the spec before implementation. The .feature file becomes a contract between the business and engineering.
The early BDD teams used Given/When/Then scenarios this way, first in tools like JBehave and later in Gherkin. It worked. Those teams had non-engineers writing and reviewing scenarios as part of their normal workflow.
The BDD cargo cult
Most teams using Cucumber today are not those teams. Most teams using Cucumber have engineers writing .feature files that no non-engineer reads. The product manager doesn't know the .feature file exists. The QA lead learned Gherkin because it was in the job description but writes scenarios by reading what engineers write. The business analyst writes tickets in Jira, not scenarios in Gherkin.
In this situation — which is the majority of BDD teams I've encountered — Gherkin is purely overhead. You write a scenario:
When I click "Submit"
Then I should see "Order placed successfully"
Then you write a step definition that implements it:
import { When, Then } from '@cucumber/cucumber';
import { expect } from '@playwright/test';
// `page` is assumed to come from shared test setup (hooks or the world object), not shown here.
When('I click {string}', async (buttonText) => {
  await page.getByRole('button', { name: buttonText }).click();
});
Then('I should see {string}', async (text) => {
  await expect(page.getByText(text)).toBeVisible();
});
And then you have two files, two maintenance surfaces, and a layer of indirection between the test intention and the test implementation. When the test fails, you debug through the scenario, find the step, read the step definition, and trace the actual Playwright/Cypress call. Every hop costs time.
The cargo cult version of BDD adds all the cost of Gherkin — the file structure, the step definitions, the regex matching, the shared step state — and captures none of the benefit (stakeholder collaboration), because the stakeholders aren't collaborating on the test files.
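To make those hops concrete, here is a deliberately simplified model of what a step runner does with each scenario line. It is a sketch, not Cucumber's actual implementation: the World shape, the cart steps, and runScenario are all hypothetical names invented for illustration.

```typescript
// Simplified model of a step runner. NOT Cucumber's real implementation:
// World, the cart steps, and runScenario are hypothetical.
type World = { cartCount: number };

type StepDefinition = {
  pattern: RegExp;
  run: (world: World, ...args: string[]) => void;
};

const stepDefinitions: StepDefinition[] = [
  {
    pattern: /^I have (\d+) items in my cart$/,
    run: (world, count) => {
      world.cartCount = Number(count);
    },
  },
  {
    pattern: /^I remove (\d+) items?$/,
    run: (world, count) => {
      world.cartCount -= Number(count);
    },
  },
];

function runScenario(lines: string[]): World {
  // One mutable object is shared by every step in the scenario.
  const world: World = { cartCount: 0 };
  for (const line of lines) {
    // Hop 1: scenario text -> the first step definition whose pattern matches.
    const definition = stepDefinitions.find((d) => d.pattern.test(line));
    if (!definition) throw new Error(`Undefined step: ${line}`);
    // Hop 2: capture groups become the definition's arguments.
    const match = definition.pattern.exec(line);
    const args = match ? match.slice(1) : [];
    // Hop 3: the definition reads and mutates the shared state.
    definition.run(world, ...args);
  }
  return world;
}

const finalWorld = runScenario(['I have 3 items in my cart', 'I remove 1 item']);
console.log(finalWorld.cartCount); // 2
```

Nothing in the scenario text tells you which definition will run or what state it touches. You only find out by following the pattern match, which is the indirection a plain test function doesn't have.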
The cost: real and accumulated
The step-definition file explosion: in every mature Cucumber project I've reviewed, step definitions are the most tangled files in the codebase. Steps get added for every new scenario. Similar steps proliferate ("I click 'Submit'" and "I press the submit button" and "I tap the Submit CTA"). The matching logic becomes a regex graveyard that no one can confidently modify.
The broken-step graveyard: step definitions written for a scenario that has since been deleted. The definition matches no current scenario but stays in the file, because nobody is sure whether something still references it. In large suites this can account for 20–30% of all step code.
Slower onboarding: a new engineer joins and needs to write a test. In a plain TypeScript Playwright suite, they read five lines of existing test code and understand the pattern. In a Cucumber suite, they need to understand .feature files, the step-definition pattern, regex capture groups, the world object, and Cucumber's Before/After hooks. The ramp-up is noticeably longer.
Undetectable duplication: two scenarios can test the same thing with different Gherkin phrasing and never trigger a duplicate warning. Plain code would make this obvious. Gherkin hides it behind natural-language variation.
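If you suspect your suite has a broken-step graveyard, you can measure it instead of guessing. The sketch below checks a list of step patterns against the step lines that actually appear in scenarios and reports the patterns nothing uses. The inputs are hard-coded and hypothetical; a real script would have to collect them from your step-definition and .feature files. cucumber-js also ships a usage formatter that reports similar information, so check your version's documentation before rolling your own.

```typescript
// Hypothetical, hard-coded inputs. A real audit would extract these from
// step-definition files and .feature files.
const stepPatterns: RegExp[] = [
  /^I click "([^"]*)"$/,
  /^I should see "([^"]*)"$/,
  /^I press the submit button$/,
];

const featureSteps: string[] = [
  'I click "Submit"',
  'I should see "Order placed successfully"',
];

// Return every pattern that matches none of the scenario step lines.
function findUnusedPatterns(patterns: RegExp[], steps: string[]): RegExp[] {
  return patterns.filter(
    (pattern) => !steps.some((step) => pattern.test(step)),
  );
}

const unusedPatterns = findUnusedPatterns(stepPatterns, featureSteps);
console.log(unusedPatterns.map((pattern) => pattern.source));
```

Here the only pattern reported is the one for "I press the submit button", because no scenario line uses that phrasing any more. Anything such an audit flags is a candidate for deletion, not proof: confirm against the real suite first.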
When BDD genuinely pays off
Regulated industries where acceptance criteria are auditable artefacts. Medical device software, financial services, aviation — contexts where a .feature file is a compliance document that gets reviewed by someone with a regulatory, not engineering, lens.
Genuinely cross-functional teams where product managers write or review scenarios as part of sprint planning. If the .feature file is in the sprint ticket as the definition of done, and the PM has approved the scenarios before implementation starts, BDD is doing exactly what it was designed to do.
Large organisations with dedicated QA analysts who are expert in Gherkin and who collaborate with developers on scenario design. There are teams where this works. They're the minority.
The question to ask your team: "Who reads our .feature files?" If the answer is "engineers, to know what the step definitions need to do" — BDD is not buying you anything.
The pragmatic replacement
If you want test code that reads like documentation without the Gherkin overhead, named custom commands in Cypress or descriptive Playwright helper functions get you 80% of the readability with none of the indirection:
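// Instead of Gherkin + step definitions:
//   Given I am logged in as an admin
//   When I visit the user management page
//   Then I should see the user list
// Plain TypeScript, equally readable and directly executable:
import { expect } from '@playwright/test';
import { test } from './fixtures'; // hypothetical local module that extends test with authenticatedPage
test('admin can see user list', async ({ authenticatedPage }) => {
  await authenticatedPage.goto('/admin/users');
  await expect(authenticatedPage.getByTestId('user-table')).toBeVisible();
});
That test reads clearly, fails with a direct stack trace pointing to the exact line, and has one maintenance surface. No step definitions, no regex, no world object. Here authenticatedPage is a custom fixture, defined once with Playwright's test.extend, that hands the test an already logged-in page.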
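The example above leans on a fixture. The other option, descriptive helper functions, might look like the sketch below. The helper names are hypothetical, and PageLike and LocatorLike are minimal stand-ins I've defined so the snippet runs on its own; in a real suite you would type the parameter as Playwright's Page instead.

```typescript
// Minimal structural stand-ins for the parts of a Playwright page used here.
// These interfaces are assumptions for this sketch, not Playwright's types.
interface LocatorLike {
  click(): Promise<void>;
  fill(value: string): Promise<void>;
}

interface PageLike {
  goto(url: string): Promise<unknown>;
  getByRole(role: 'button', options: { name: string }): LocatorLike;
  getByTestId(testId: string): LocatorLike;
}

// Hypothetical helpers: the function name carries the intent that a
// Gherkin step would, with no pattern matching in between.
async function addProductToCart(page: PageLike, slug: string): Promise<void> {
  await page.goto(`/products/${slug}`);
  await page.getByRole('button', { name: 'Add to cart' }).click();
}

async function checkOutAsGuest(page: PageLike, email: string): Promise<void> {
  await page.goto('/checkout');
  await page.getByTestId('email').fill(email);
  await page.getByRole('button', { name: 'Place order' }).click();
}
```

Inside a test, the body then reads as addProductToCart(page, 'keyboard') followed by checkOutAsGuest(page, 'guest@example.com'), each awaited. A failure points at a line inside the helper, and your editor's go-to-definition takes you there in one jump.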
If you need longer narrative test descriptions, test.step() in Playwright or nested describe blocks in Cypress let you add structure without adding a separate file format:
import { test, expect } from '@playwright/test';
test('guest user can complete checkout', async ({ page }) => {
  await test.step('add item to cart', async () => {
    await page.goto('/products/keyboard');
    await page.getByRole('button', { name: 'Add to cart' }).click();
  });
  await test.step('complete checkout', async () => {
    await page.goto('/checkout');
    // Locator-based fill, which Playwright recommends over page.fill()
    await page.getByTestId('email').fill('guest@example.com');
    await page.getByRole('button', { name: 'Place order' }).click();
  });
  await expect(page.getByTestId('confirmation-message')).toBeVisible();
});
Test steps show in the Playwright trace viewer and HTML reporter. You get structured, readable output, the thing BDD advocates usually point to as a reason for Gherkin, without the separate file format.
The one-question test
Would your product manager open the .feature file to review acceptance criteria? If yes: BDD is appropriate for your team. Keep it.
If no — if the .feature files are an engineer-to-engineer communication layer dressed up in business language — remove Cucumber from your project and write plain test code. You'll ship tests faster, debug failures faster, and onboard new team members faster. The Gherkin was a translation layer between the test intent and the test code. Remove the layer; keep the intent.