Feature Files and Gherkin Syntax — Cucumber BDD Framework

A feature file is a plain text document with a .feature extension. It contains one or more scenarios written in Gherkin, a structured language that reads like English but has a small, precise vocabulary that Cucumber can parse. This lesson covers the core Gherkin keywords (Feature, Scenario, Given, When, Then, And, But) and the rules that make feature files worth reading.

The anatomy of a feature file

Feature: User Registration
  As a new visitor
  I want to register an account
  So that I can access premium features
 
  Scenario: Successful registration with valid data
    Given the registration page is open
    When the user fills in "Alice Smith" as name
    And the user fills in "alice@test.com" as email
    And the user fills in "SecurePass123" as password
    And the user clicks the register button
    Then the user should be on the welcome page
    And a welcome email should be sent to "alice@test.com"
 
  Scenario: Registration fails when email is already taken
    Given the registration page is open
    And an account with email "alice@test.com" already exists
    When the user fills in "alice@test.com" as email
    And the user clicks the register button
    Then the user should see the error "Email already registered"

Every part of this has a specific job:

Feature: — a one-line name for the feature. Cucumber uses it in reports but doesn't execute it. Everything after the name until the first Scenario: is a description block — optional, not executed, purely for documentation.

The description block (As a / I want / So that) — the user story format. Not required by Cucumber but strongly recommended: it records who needs the feature, what they want to do, and why it matters. This is what stops feature files from becoming test-only artifacts.

Scenario: — one specific test case. Each scenario is a single, independent example of the feature's behaviour. A feature file can have as many scenarios as needed, but each should test exactly one thing.

Given describes the state the system is in before the user acts. This is your setup: a user exists, the page is open, the database has certain records. The step definitions behind Given steps create this state programmatically.

When — the action the user takes. Exactly one main action per scenario is the ideal: the user submits a form, calls an API endpoint, clicks a button. If you have three When steps, consider whether you're testing multiple behaviours in one scenario.

Then — the observable outcome. What the user sees, what the system does, what the database contains. Assertions live in Then step definitions.

And / But — continuation keywords. And continues the previous keyword's role: And the user fills in "alice" as name after a When is still a When-style step. But is semantically identical to And but reads better for negative cases: But the password field should not be pre-filled.
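
A short sketch of how And and But inherit their role from the preceding keyword. The scenario is hypothetical and most of its step wording is invented for illustration; only the final But step is taken from the text above.

```gherkin
Scenario: Failed login keeps the email but clears the password
  Given the login page is open
  # "And" after Given: still part of the setup
  And an account with email "alice@test.com" already exists
  When the user logs in as "alice@test.com" with a wrong password
  Then the user should see the error "Invalid credentials"
  # "And" after Then: another expected outcome
  And the email field should still contain "alice@test.com"
  # "But" after Then: also an expected outcome, phrased as a negative
  But the password field should not be pre-filled
```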

One scenario, one behaviour

The most common Gherkin mistake is stuffing multiple behaviours into a single scenario. This:

Scenario: Registration and login flow
  Given the registration page is open
  When the user registers as "alice@test.com"
  Then the user should be on the welcome page
  When the user logs out
  And the user navigates to login
  When the user logs in as "alice@test.com"
  Then the user should see the dashboard

...is two scenarios (registration + login) stitched together. Split it. Each scenario should answer one question: "Does the system do X when Y?" Not "Does it do X, then Y, then Z?"

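The practical reason beyond readability: when a step fails, Cucumber skips the remaining steps of that scenario. If registration breaks in the stitched scenario above, the login steps never run, so the report cannot tell you whether login still works. Short, focused scenarios fail independently, and each failure points at exactly one behaviour.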
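
One way to split the stitched scenario above, as a sketch. The step wording is reused from that example; the Given step of the second scenario is an assumption about how the login precondition would be created without going through registration first.

```gherkin
Scenario: Successful registration
  Given the registration page is open
  When the user registers as "alice@test.com"
  Then the user should be on the welcome page

Scenario: Successful login with an existing account
  # Assumed setup step: the account is created directly, not via the UI
  Given an account with email "alice@test.com" already exists
  When the user logs in as "alice@test.com"
  Then the user should see the dashboard
```

Each scenario now sets up its own state, so either one can run and fail on its own.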

Declarative vs imperative steps

There are two ways to write the same scenario. Both work. One is much more maintainable.

Imperative — describes every click and keystroke:

When the user types "alice@test.com" in the input with placeholder "Email"
And the user types "SecurePass123" in the input with placeholder "Password"
And the user clicks the button labelled "Create Account"

Declarative — describes the business action:

When the user registers with email "alice@test.com" and password "SecurePass123"

The declarative step survives a UI redesign: if "Create Account" is renamed "Sign Up" and the inputs get new placeholders, the feature file doesn't change — only the step definition does. The imperative step needs updating every time the UI shifts.

Write declarative Gherkin. Put the implementation details in step definitions.
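
Applied to the registration scenario from the start of this lesson, a declarative version might look like the sketch below. The combined When step is hypothetical wording; its step definition would be responsible for filling in the three fields and clicking the button.

```gherkin
Scenario: Successful registration with valid data
  Given the registration page is open
  # One business action replaces four UI-level steps
  When the user registers with name "Alice Smith", email "alice@test.com" and password "SecurePass123"
  Then the user should be on the welcome page
  And a welcome email should be sent to "alice@test.com"
```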

File naming and organisation

Name feature files after the feature they test: login.feature, user-registration.feature, checkout.feature, product-search.feature. One feature per file is the convention. When a feature grows to 20+ scenarios, consider splitting by scenario category rather than creating a single massive file.

For larger projects, group feature files into subdirectories:

src/test/resources/features/
├── auth/
│   ├── login.feature
│   └── registration.feature
├── checkout/
│   ├── cart.feature
│   └── payment.feature
└── api/
    ├── users-api.feature
    └── orders-api.feature

Tags (covered in Chapter 2 Lesson 3) let you run any combination of these without touching the directory structure.

Comments

Gherkin supports comments. A comment is a whole line that starts with #, optionally indented:

# This scenario covers the happy path only — edge cases in registration-edge-cases.feature
Scenario: Successful registration
  ...

Use comments sparingly. If you need one to explain a scenario, the scenario's wording is probably unclear, so improve the wording instead.
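
A sketch of where comments are recognised, assuming standard Gherkin parsing in which a comment occupies its own line. The step wording is illustrative.

```gherkin
# A comment on its own line before the scenario is ignored
Scenario: Successful registration
  Given the registration page is open
  # An indented comment between steps is ignored too
  When the user registers as "alice@test.com"
  Then the user should be on the welcome page
```

A # that appears after step text on the same line is not treated as a comment. It becomes part of the step text, so keep comments on separate lines.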

Non-ASCII characters

Feature files are read as UTF-8 by default, so names and step text can contain non-ASCII characters. You can also write the keywords themselves in another language: Cucumber has built-in support for over 70 languages. Add # language: fr on the first line of a file to use French keywords (Fonctionnalité, Scénario, Etant donné, Quand, Alors). Most English-language teams never need this.
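
A minimal sketch of a French feature file. The scenario is a hypothetical translation of the registration example; the keywords are the ones listed above, with their English equivalents noted in comments.

```gherkin
# language: fr
# Fonctionnalité = Feature, Scénario = Scenario
Fonctionnalité: Inscription d'un utilisateur

  Scénario: Inscription réussie avec des données valides
    # Etant donné = Given
    Etant donné la page d'inscription ouverte
    # Quand = When
    Quand l'utilisateur s'inscrit avec l'e-mail "alice@test.com"
    # Alors = Then
    Alors l'utilisateur est sur la page de bienvenue
```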

Gherkin structure at a glance

Feature File
– Name (required)
– Description (As a / I want / So that)
– One per .feature file

Steps
– Given: preconditions
– When: the action
– Then: expected outcome
– And / But: continuation

Guidelines
– One behaviour per scenario
– Declarative over imperative
– No technical details in Gherkin

Other constructs
– Background: shared Given steps
– Scenario Outline: data-driven
– Tags: selective execution

⚠️ Common mistakes

Mixing UI details into Gherkin. Steps like When the user clicks the element with id "submitBtn" are unmaintainable. If the button's ID changes, the feature file breaks. Use When the user submits the registration form — the ID is the step definition's problem.
Overloading a scenario. A scenario with 15 steps is testing too many things. Future debugging will be painful. Aim for 3–7 steps per scenario. If the scenario needs more, it's testing multiple behaviours.
Omitting the feature description. Feature: Login is technically valid but loses the "who, what, why" context. New team members reading the file six months from now have no idea why this feature matters to users.
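And as the first step. And is meant to follow a Given, When, or Then and inherit its role. A scenario that opens with And has nothing to inherit from. Depending on the tool and version, it is either rejected or accepted with an undefined role; either way it is a mistake. Start the scenario with Given, or with When if there is no setup.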

🎯 Practice task

Write feature files for a real application — no automation yet, just Gherkin. 35 minutes.

Open the Sauce Demo app (a free Selenium practice site). Spend 5 minutes exploring: login, browse products, add to cart, checkout.
Write a login.feature file with 3 scenarios: successful login, login with wrong password, login with empty credentials. Use declarative steps throughout.
Write a checkout.feature file with 2 scenarios: successful checkout, attempting checkout with an empty cart. Focus on business behaviour, not clicks.
Review your scenarios: could a non-technical product owner read them and verify they describe correct system behaviour? If not, rewrite the steps that are too technical.
Stretch: write a products.feature file with a scenario that tests filtering or sorting products. Think about what the Given needs to set up (a logged-in user with products visible) and what Then needs to verify (products are in the expected order or filtered set).

Next lesson: your first step definitions — Java code that maps these Gherkin steps to real actions.