
Test Design Techniques

The formal techniques for picking which test cases to write — so your suite catches the bugs that matter without exploding into combinatorial chaos.

Equivalence Partitioning

Inputs that should behave the same way go in the same partition. Test one value from each partition — a value that fails for one usually fails for all.

Rules

  • Identify every distinct valid partition.
  • Identify every distinct invalid partition (different ways an input can be rejected each get their own).
  • Pick one representative value per partition. One is enough; more is waste.
  • A test case can cover multiple partitions if their behaviour is independent.

Worked example — Age field accepts 18–65

Partition             | Sample value | Expected                    | Notes
----------------------|--------------|-----------------------------|---------------------------
Valid: 18–65          | 30           | accepted                    | The happy path
Invalid: < 18         | 10           | rejected (under-age)        | Too young
Invalid: > 65         | 80           | rejected (over-age)         | Too old
Invalid: non-numeric  | "abc"        | rejected (type error)       | Different rejection branch
Invalid: empty / null | "" / null    | rejected (required field)   | Different rejection branch
Invalid: float        | 25.5         | rejected (type error)       | If field requires int
Invalid: negative     | -5           | rejected (under-age + sign) | Often the same branch as < 18, but worth confirming

That's seven test cases instead of every possible value 0–999. Each one represents a class of inputs the system should handle the same way.
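As a rough sketch of how the table turns into executable checks, the snippet below pairs one representative value with each partition. The `validate_age` function and its return strings are hypothetical stand-ins for whatever the system under test exposes, and the empty and null samples are listed as separate rows.

```python
def validate_age(raw):
    """Hypothetical validator: returns "accepted" or a rejection reason."""
    if raw is None or raw == "":
        return "rejected: required"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(raw, bool) or not isinstance(raw, int):
        return "rejected: type"
    if raw < 18:
        return "rejected: under-age"
    if raw > 65:
        return "rejected: over-age"
    return "accepted"


# One representative value per partition, mirroring the table above.
PARTITION_CASES = [
    ("valid 18-65", 30, "accepted"),
    ("below 18", 10, "rejected: under-age"),
    ("above 65", 80, "rejected: over-age"),
    ("non-numeric", "abc", "rejected: type"),
    ("empty", "", "rejected: required"),
    ("null", None, "rejected: required"),
    ("float", 25.5, "rejected: type"),
    ("negative", -5, "rejected: under-age"),
]


def failed_partitions():
    """Return the names of partitions whose expectation was not met."""
    return [name for name, value, expected in PARTITION_CASES
            if validate_age(value) != expected]
```

An empty list from `failed_partitions()` means every partition behaved as the table expects.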

When to apply it

Surface        | Partitions usually look like
---------------|----------------------------------------------------------------------
Form fields    | valid / too short / too long / wrong type / empty
API parameters | valid range / out of range / wrong type / missing / null
Dropdowns      | each enum value + missing/invalid option
File uploads   | accepted type / wrong type / oversize / corrupt / empty
Date inputs    | valid range / before min / after max / wrong format / non-existent date

Common mistakes

  • Treating all "invalid" inputs as one partition. A user who types letters into a number field hits a different code path than a user who submits an empty form. They're separate partitions.
  • Picking edge values as the "representative". Boundary values are a separate technique — combine them deliberately, don't conflate.
  • Skipping the empty/null partition. Most production bugs live there.

Boundary Value Analysis

Errors cluster at the edges of equivalence partitions. Off-by-one, < vs <=, inclusive vs exclusive — these are the most common bugs in any range check.

Two-value BVA (standard)

For each boundary, test the boundary itself and one step outside.

       invalid     valid     invalid
   ────────|─────|─────|─────|────────
      min-1   min   max   max+1

Three-value BVA (robust)

Adds one step inside the boundary too.

   invalid    valid             valid    invalid
   ───|────|─────|─────...─────|─────|────|───
   min-1  min  min+1         max-1  max  max+1

Worth the extra two cases when off-by-one bugs would be expensive (auth, billing, safety-critical fields).

Worked example — Password field accepts 8–20 characters

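Test           | Length   | Expected | Notes
---------------|----------|----------|-----------------
Below min      | 7 chars  | invalid  | one short
At min         | 8 chars  | valid    | exact lower edge
Just above min | 9 chars  | valid    | inside
Just below max | 19 chars | valid    | inside
At max         | 20 chars | valid    | exact upper edge
Above max      | 21 chars | invalid  | one over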

Six test cases, all targeting code that does a length check. If < should have been <=, exactly one of these will catch it.
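The boundary points can be generated rather than typed by hand. The sketch below assumes an inclusive integer range; `length_ok` and `length_ok_buggy` are hypothetical checks written only to show which boundary exposes an off-by-one slip.

```python
def boundary_values(minimum, maximum, three_value=False):
    """Return the sorted BVA test points for an inclusive integer range."""
    points = {minimum - 1, minimum, maximum, maximum + 1}
    if three_value:
        points |= {minimum + 1, maximum - 1}
    return sorted(points)


def length_ok(password):
    """Hypothetical check under test: 8 to 20 characters inclusive."""
    return 8 <= len(password) <= 20


def length_ok_buggy(password):
    """Same check with "<" where "<=" was intended at the lower edge."""
    return 8 < len(password) <= 20


def failing_lengths(check, minimum=8, maximum=20):
    """Boundary lengths at which `check` disagrees with the inclusive spec."""
    return [n for n in boundary_values(minimum, maximum, three_value=True)
            if check("x" * n) != (minimum <= n <= maximum)]
```

With these definitions, `failing_lengths(length_ok_buggy)` reports only length 8, the "At min" row of the table.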

Combining with equivalence partitioning

For each partition, take a typical mid-value (EP) plus the values at and around the boundaries (BVA). For the password example:

Partition      | Sample value                      | Why
---------------|-----------------------------------|----------------------------------
Empty          | ""                                | type/required failure
Below boundary | "abc" (3 chars)                   | well inside the invalid partition
At boundary    | "12345678" (8 chars)              | min, the most error-prone case
Mid-valid      | "abcdefgh12345!" (14 chars)       | confirms typical case
At boundary    | "12345678901234567890" (20 chars) | max
Above boundary | 21 chars                          | one over

Apply it to anything with a range

Range                          | Boundaries
-------------------------------|------------------------------------------------------------------
Numeric (min..max)             | min-1, min, min+1, max-1, max, max+1
String length (minLen..maxLen) | same, in characters
Date range                     | day before, first day, day after; same for end of range
File size                      | 0 bytes, 1 byte, just below limit, exactly at limit, 1 byte over
Pagination                     | page=0, page=1, page=lastPage, page=lastPage+1, very large page
Decimal precision              | values just over the rounding edge in both directions
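The same idea carries over to non-numeric ranges. As a small illustration for the date-range row, the helper below (a hypothetical name, assuming an inclusive start and end date) returns the six dates around both edges.

```python
from datetime import date, timedelta


def date_boundaries(start, end):
    """Three-value boundary dates for an inclusive date range."""
    one_day = timedelta(days=1)
    return [
        start - one_day, start, start + one_day,  # around the first day
        end - one_day, end, end + one_day,        # around the last day
    ]


# Example range: March 2024. The day before the start is 29 February,
# because 2024 is a leap year.
march_2024 = date_boundaries(date(2024, 3, 1), date(2024, 3, 31))
```

Letting the date library do the arithmetic avoids hand-computing month ends and leap days, which is exactly where manual test data tends to go wrong.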

Decision Table Testing

When an outcome depends on multiple conditions combining, list every combination and the action it produces. Decision tables guarantee you don't forget a rule.

Components

  • Conditions — input flags (rows, top half).
  • Actions — outcomes (rows, bottom half).
  • Rules — columns. Each column is one combination of condition values.

Process

  1. List every condition.
  2. List every possible action.
  3. Build a column for every combination of condition values (2^n for n boolean conditions).
  4. Fill in the action(s) for each rule.
  5. Reduce: collapse columns where some conditions don't matter (mark those cells as "don't care", conventionally with a dash); drop impossible combinations.
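Steps 3 and 5 are mechanical enough to script. The sketch below builds every column for a list of boolean conditions and optionally filters out impossible ones. The condition names and the feasibility rule are invented for illustration only.

```python
from itertools import product


def build_rules(conditions, is_possible=lambda rule: True):
    """Return one dict per column: every True/False combination that is possible."""
    rules = []
    for values in product([True, False], repeat=len(conditions)):
        rule = dict(zip(conditions, values))
        if is_possible(rule):
            rules.append(rule)
    return rules


# Hypothetical conditions, for illustration only.
CONDITIONS = ["account_active", "account_locked", "two_factor_enabled"]

all_rules = build_rules(CONDITIONS)  # 2^3 columns

# Assumed constraint for the example: an inactive account is never locked.
feasible_rules = build_rules(
    CONDITIONS,
    lambda rule: rule["account_active"] or not rule["account_locked"],
)
```

Filling in the action for each remaining column (step 4) still needs the spec; the script only guarantees that no combination is forgotten.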

Worked example — Shipping cost calculator

Conditions: order over £50, premium member.

Full table (4 rules from 2 booleans):

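Condition          | R1 | R2 | R3 | R4
-------------------|----|----|----|----
Order > £50        | Y  | Y  | N  | N
Premium member     | Y  | N  | Y  | N
Action             |    |    |    |
Free shipping      | X  |    | X  |
Standard rate (£3) |    | X  |    |
Full rate (£8)     |    |    |    | X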

Collapsed — premium membership alone unlocks free shipping, so when "Premium member = Y" we don't care about order size:

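Condition          | R1 | R2 | R3
-------------------|----|----|----
Order > £50        | -  | Y  | N
Premium member     | Y  | N  | N
Action             |    |    |
Free shipping      | X  |    |
Standard rate (£3) |    | X  |
Full rate (£8)     |    |    | X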

That's three test cases instead of four — but only because we proved R1 and R3 (in the original) produce the same action.
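One way to back up that claim is to check the collapsed table against every full combination. In the sketch below, `shipping_cost` is a hypothetical implementation under test, `None` stands for "don't care", and the assignment of the £3 rate to non-premium orders over £50 and the £8 rate to smaller ones is an assumption made for the example.

```python
from itertools import product


def shipping_cost(order_over_50, premium_member):
    """Hypothetical implementation under test; returns the cost in pounds."""
    if premium_member:
        return 0
    return 3 if order_over_50 else 8  # assumed rate assignment


# Collapsed decision table; None marks a "don't care" condition.
COLLAPSED_RULES = [
    {"order_over_50": None, "premium_member": True, "cost": 0},
    {"order_over_50": True, "premium_member": False, "cost": 3},
    {"order_over_50": False, "premium_member": False, "cost": 8},
]


def matching_rules(order_over_50, premium_member):
    """Return the collapsed rules that apply to one concrete input."""
    inputs = {"order_over_50": order_over_50, "premium_member": premium_member}
    return [rule for rule in COLLAPSED_RULES
            if all(rule[name] is None or rule[name] == value
                   for name, value in inputs.items())]


def table_is_sound():
    """Each full combination matches exactly one rule, and the code agrees."""
    for combo in product([True, False], repeat=2):
        rules = matching_rules(*combo)
        if len(rules) != 1 or shipping_cost(*combo) != rules[0]["cost"]:
            return False
    return True
```

If collapsing had merged two columns with different actions, some combination would match zero or two rules and `table_is_sound()` would return `False`.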

A larger worked example — Login authorisation

Conditions: valid email format, password matches, account active, account locked, 2FA enabled.

That's 2^5 = 32 combinations, but most collapse out:

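Condition            | Valid login | Wrong password | Locked | Inactive | 2FA needed | Bad email
---------------------|-------------|----------------|--------|----------|------------|----------
Valid email format   | Y           | Y              | Y      | Y        | Y          | N
Password matches     | Y           | N              | -      | -        | Y          | -
Account active       | Y           | -              | -      | N        | Y          | -
Account locked       | N           | -              | Y      | -        | N          | -
2FA enabled          | N           | -              | -      | -        | Y          | -
Action               |             |                |        |          |            |
Grant access         | X           |                |        |          |            |
Show password error  |             | X              |        |          |            |
Show locked banner   |             |                | X      |          |            |
Show inactive notice |             |                |        | X        |            |
Prompt for 2FA code  |             |                |        |          | X          |
Show format error    |             |                |        |          |            | X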

Six rules → six test cases that cover every meaningful combination.
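Each rule becomes one test once the conditions that don't matter for that rule are given concrete values. The sketch below assumes a hypothetical `login_outcome` function and an evaluation order (format, password, lock, active, 2FA) that the table itself does not specify.

```python
def login_outcome(valid_email, password_ok, active, locked, two_fa):
    """Hypothetical login decision; the check order is an assumption."""
    if not valid_email:
        return "show format error"
    if not password_ok:
        return "show password error"
    if locked:
        return "show locked banner"
    if not active:
        return "show inactive notice"
    if two_fa:
        return "prompt for 2FA code"
    return "grant access"


# One concrete test per rule. Conditions that are irrelevant to a rule are
# set to their "healthy" value so only the rule's own condition differs.
RULE_CASES = {
    "valid login":    ((True, True, True, False, False), "grant access"),
    "wrong password": ((True, False, True, False, False), "show password error"),
    "locked":         ((True, True, True, True, False), "show locked banner"),
    "inactive":       ((True, True, False, False, False), "show inactive notice"),
    "2FA needed":     ((True, True, True, False, True), "prompt for 2FA code"),
    "bad email":      ((False, True, True, False, False), "show format error"),
}


def failed_rules():
    """Return the rule names whose expected action was not produced."""
    return [name for name, (inputs, expected) in RULE_CASES.items()
            if login_outcome(*inputs) != expected]
```

Choosing healthy values for the irrelevant conditions keeps each test pointed at a single rule; if two failure conditions were set at once, the result would depend on the evaluation order rather than on the rule being tested.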

When to use decision tables

  • Business rules with multiple conditions (pricing, eligibility, approval flows).
  • Complex form validation with conditional fields.
  • Authorisation matrices (role × resource × action).
  • Insurance / loan / tax calculations.
  • Any spec that uses the words "if … and … but only when …".

State Transition Testing

For features with discrete states, model the state machine and pick tests systematically — at the right depth.

Components

  • States — distinct conditions the system can be in.
  • Events — inputs that may cause a transition.
  • Transitions: (state, event) → next state.
  • Guards — conditions that gate a transition (if balance > 0).
  • Actions — side effects performed during a transition (send email, debit balance).

State transition table

Logged Out ──login (valid)──→ Logged In
Logged Out ──login (bad)────→ Logged Out  (action: show error)
Logged In  ──logout─────────→ Logged Out
Logged In  ──timeout────────→ Logged Out  (action: redirect)

From       | Event   | Guard               | Action         | To
-----------|---------|---------------------|----------------|-----------
Logged Out | Login   | valid credentials   | show dashboard | Logged In
Logged Out | Login   | invalid credentials | show error     | Logged Out
Logged In  | Logout  | -                   | clear session  | Logged Out
Logged In  | Timeout | session > 30 min    | redirect       | Logged Out
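A transition table like this maps naturally onto a lookup keyed by (state, event). The sketch below is one minimal way to encode it; folding the guard into the event name (`login_valid`, `login_invalid`) is a simplification chosen for the example, and all identifiers are hypothetical.

```python
# (state, event) -> (next state, action). Guards are folded into event names.
LOGIN_TRANSITIONS = {
    ("Logged Out", "login_valid"): ("Logged In", "show dashboard"),
    ("Logged Out", "login_invalid"): ("Logged Out", "show error"),
    ("Logged In", "logout"): ("Logged Out", "clear session"),
    ("Logged In", "timeout"): ("Logged Out", "redirect"),
}


def step(state, event):
    """Return (next_state, action); unknown pairs leave the state unchanged."""
    return LOGIN_TRANSITIONS.get((state, event), (state, None))


def run(events, state="Logged Out"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state, _action = step(state, event)
    return state
```

Each row of the table is then a one-line assertion on `step`, and longer scenarios are assertions on `run`.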

Coverage levels

Level          | What it covers                                        | When to use
---------------|-------------------------------------------------------|------------------------------------------------------------
State coverage | Visit every state at least once                       | Smoke testing
0-switch       | Cover every valid transition (every row of the table) | Default for most features
1-switch       | Cover every pair of consecutive transitions           | Critical workflows; chains where intermediate state matters
All-paths      | Cover every full path from start to end               | Short, finite state machines (wizards, finite workflows)

Worked example — Order status

Draft ──submit──→ Submitted ──approve──→ Approved ──ship──→ Shipped ──deliver──→ Delivered
                  │                       │                  │
                  └─reject→ Rejected      └─cancel→ Cancelled└─return→ Returned

0-switch coverage (every valid transition):

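# | From      | Event   | To
--|-----------|---------|----------
1 | Draft     | submit  | Submitted
2 | Submitted | approve | Approved
3 | Submitted | reject  | Rejected
4 | Approved  | ship    | Shipped
5 | Approved  | cancel  | Cancelled
6 | Shipped   | deliver | Delivered
7 | Shipped   | return  | Returned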

Negative testing — invalid transitions

Equally important: the system rejects transitions that aren't on the diagram. For each state, attempt every event that shouldn't work.

From      | Event   | Expected
----------|---------|--------------------------------
Delivered | submit  | rejected (terminal state)
Cancelled | ship    | rejected (already cancelled)
Draft     | ship    | rejected (must approve first)
Rejected  | approve | rejected (already rejected)

Negative transitions catch state-machine bugs that valid-only testing misses (e.g. a webhook racing the state and re-shipping a cancelled order).
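Both the valid chains and the invalid attempts can be derived from a single transition map instead of being listed by hand. The sketch below encodes the order-status diagram; the helper names are hypothetical, and `fire` raising `ValueError` is just one possible way for a system to reject a transition.

```python
ORDER_TRANSITIONS = {
    ("Draft", "submit"): "Submitted",
    ("Submitted", "approve"): "Approved",
    ("Submitted", "reject"): "Rejected",
    ("Approved", "ship"): "Shipped",
    ("Approved", "cancel"): "Cancelled",
    ("Shipped", "deliver"): "Delivered",
    ("Shipped", "return"): "Returned",
}


def all_states(transitions):
    """Every state that appears as a source or a target."""
    return {source for source, _ in transitions} | set(transitions.values())


def consecutive_pairs(transitions):
    """Every chain of two transitions: (start, first event, second event, end)."""
    pairs = []
    for (start, first), middle in transitions.items():
        for (source, second), end in transitions.items():
            if source == middle:
                pairs.append((start, first, second, end))
    return pairs


def invalid_attempts(transitions):
    """Every (state, event) combination that is not on the diagram."""
    events = {event for _, event in transitions}
    return sorted((state, event)
                  for state in all_states(transitions)
                  for event in events
                  if (state, event) not in transitions)


def fire(state, event, transitions=ORDER_TRANSITIONS):
    """Apply an event; raise ValueError for a transition not on the diagram."""
    if (state, event) not in transitions:
        raise ValueError(f"{event!r} is not allowed from {state!r}")
    return transitions[(state, event)]
```

For this diagram the map yields 6 two-step chains and 49 invalid (state, event) combinations, so the four rows in the negative table are a sample; which of the rest deserve a test is a risk judgment.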

Edge conditions worth probing

  • Mid-transition crash — kill the process between "auth captured" and "order updated". What state is the order in?
  • Concurrent transitions — two admins click "Ship" within 100ms. One wins? Both succeed and double-ship?
  • Replayed event — payment webhook delivered twice. Does the order go to Paid once?
  • Timer-driven transitions — abandoned cart, expired session. Does the timer fire when the user is idle? When they're active in another tab?

Pairwise / Combinatorial Testing

Most defects are triggered by a single parameter value or by the interaction of two parameters, and only rarely by three or more. Pairwise testing covers all pairs of parameter values without testing every combination.

Why it works

Empirical studies (Kuhn et al., NIST) found that, depending on the system, roughly 70 % or more of failures were triggered by a single input value or a pair of values. By covering every pair, you catch the large majority of bugs at a fraction of the cost.

Combinatorial explosion

3 parameters × 3 values each = 27 full combinations. 4 × 4 × 4 × 4 = 256 full. 6 × 6 × 6 × 6 × 6 = 7,776 full.

Pairwise replaces these with roughly 9, 16, and a little over 36 tests respectively. Most projects can't afford full combinatorial; pairwise is what you can actually run.
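The arithmetic behind these figures is easy to reproduce. In the sketch below (function names are illustrative), the lower bound relies on the fact that every pair of values from the two largest parameters needs its own test; real generators may need somewhat more than this bound.

```python
from itertools import combinations
from math import prod


def full_combinations(value_counts):
    """Number of tests needed to cover every combination."""
    return prod(value_counts)


def pairs_to_cover(value_counts):
    """Number of distinct value pairs across all pairs of parameters."""
    return sum(a * b for a, b in combinations(value_counts, 2))


def pairwise_lower_bound(value_counts):
    """No pairwise suite can be smaller than the product of the two largest counts."""
    largest, second = sorted(value_counts, reverse=True)[:2]
    return largest * second


for counts in ([3, 3, 3], [4, 4, 4, 4], [6, 6, 6, 6, 6]):
    print(counts, full_combinations(counts), pairs_to_cover(counts),
          pairwise_lower_bound(counts))
```

For the three configurations above, the full counts are 27, 256 and 7,776, while the lower bounds on a pairwise suite are 9, 16 and 36.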

Worked example — Browser compatibility

Parameters:

  • Browser: Chrome, Firefox, Safari
  • OS: Windows, macOS, Linux
  • Language: EN, FR, ES

Full combinatorial = 27. Pairwise — 9 tests cover every (browser × OS), (browser × lang), and (OS × lang) pair:

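# | Browser | OS      | Language
--|---------|---------|---------
1 | Chrome  | Windows | EN
2 | Chrome  | macOS   | FR
3 | Chrome  | Linux   | ES
4 | Firefox | Windows | FR
5 | Firefox | macOS   | ES
6 | Firefox | Linux   | EN
7 | Safari  | Windows | ES
8 | Safari  | macOS   | EN
9 | Safari  | Linux   | FR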

Every browser appears with every OS at least once. Every browser appears with every language at least once. Every OS appears with every language at least once. All pairs covered, 9 tests instead of 27.
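Rather than checking pair coverage by eye, a few lines of code can confirm it. The sketch below takes the nine tests from the table and lists any value pair that no test exercises; the function name is an illustrative choice.

```python
from itertools import combinations, product

PARAMETERS = {
    "Browser": ["Chrome", "Firefox", "Safari"],
    "OS": ["Windows", "macOS", "Linux"],
    "Language": ["EN", "FR", "ES"],
}

TESTS = [
    ("Chrome", "Windows", "EN"), ("Chrome", "macOS", "FR"), ("Chrome", "Linux", "ES"),
    ("Firefox", "Windows", "FR"), ("Firefox", "macOS", "ES"), ("Firefox", "Linux", "EN"),
    ("Safari", "Windows", "ES"), ("Safari", "macOS", "EN"), ("Safari", "Linux", "FR"),
]


def uncovered_pairs(parameters, tests):
    """Return every (param, value, param, value) pair that no test contains."""
    names = list(parameters)
    missing = []
    for i, j in combinations(range(len(names)), 2):
        seen = {(test[i], test[j]) for test in tests}
        for a, b in product(parameters[names[i]], parameters[names[j]]):
            if (a, b) not in seen:
                missing.append((names[i], a, names[j], b))
    return missing
```

`uncovered_pairs(PARAMETERS, TESTS)` comes back empty for the full set. Dropping any single test leaves three pairs uncovered, because in this particular set every pair appears exactly once.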

Tools that generate pairwise sets

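Tool              | How
------------------|----------------------------------------------------------------
Microsoft PICT    | CLI, txt config; pict params.txt outputs the test set
AllPairs (Python) | pip install allpairspy; programmatic generator
pairwise.org      | Catalogue of pairwise tools and articles, useful for finding a web-based generator for one-off sets
Hexawise          | Commercial, supports constraints and seeding
PICTMaster        | Excel-based generator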

PICT input file (params.txt):

Browser:  Chrome, Firefox, Safari
OS:       Windows, macOS, Linux
Language: EN, FR, ES

Run it from the command line:

pict params.txt

Constraints

Real systems have impossible combinations — Safari on Linux doesn't ship. Tools support constraints:

IF [Browser] = "Safari" THEN [OS] <> "Linux";

The generator skips infeasible combinations while still covering all valid pairs.
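To make the effect of a constraint concrete, here is a deliberately naive greedy generator in plain Python. It is a sketch for small parameter spaces only and is not how PICT or any other tool works internally; the function names and the `is_valid` callback are assumptions made for the example.

```python
from itertools import combinations, product

PARAMETERS = {
    "Browser": ["Chrome", "Firefox", "Safari"],
    "OS": ["Windows", "macOS", "Linux"],
    "Language": ["EN", "FR", "ES"],
}


def safari_not_on_linux(test):
    """Constraint from the text: Safari never runs on Linux."""
    return not (test["Browser"] == "Safari" and test["OS"] == "Linux")


def pairs_of(test):
    """All ((index, value), (index, value)) pairs covered by one test tuple."""
    return set(combinations(enumerate(test), 2))


def greedy_pairwise(parameters, is_valid=lambda test: True):
    """Repeatedly pick the valid combination that covers the most new pairs."""
    candidates = [combo for combo in product(*parameters.values())
                  if is_valid(dict(zip(parameters, combo)))]
    # Only pairs that occur in at least one valid combination must be covered.
    uncovered = set().union(*(pairs_of(combo) for combo in candidates))
    suite = []
    while uncovered:
        best = max(candidates, key=lambda combo: len(pairs_of(combo) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


suite = greedy_pairwise(PARAMETERS, safari_not_on_linux)
```

The resulting suite never pairs Safari with Linux, yet Linux still appears with every language and Safari with every language. A greedy pass like this is usually a few tests larger than the optimum, which is one reason to prefer a dedicated tool for anything beyond a toy model.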

When pairwise fits

  • Configuration testing — browsers × OSes × screen sizes × locales.
  • Form fields with many independent dropdowns / toggles.
  • API parameters — many optional query params with several valid values.
  • Feature flags matrix — handful of flags, each on/off.
  • Compatibility — versions of dependencies, plugins, integrations.

When not to use it

  • Two parameters strongly interact in known ways → enumerate explicitly.
  • The state machine has dependencies between values → use state transition testing.
  • The combinations encode business rules → use a decision table.

Error Guessing & Experience-Based Testing

Formal techniques cover the specifiable test cases. Error guessing fills the gap with judgment — the test cases that come from "I bet this is going to break."

Common error categories

  • Empty / null / undefined — most common production bug source.
  • Whitespace — leading/trailing, all-whitespace, tab characters in name fields.
  • Special characters — quotes, angle brackets, semicolons, emoji, RTL text, zero-width spaces.
  • Numeric edges: 0, -1, INT_MAX, INT_MAX + 1, NaN, Infinity, 0.1 + 0.2.
  • Boundary timing — DST transitions, leap years, leap seconds, month/year rollover, midnight UTC vs midnight local.
  • Concurrency — two clicks within 100ms, page refresh during a long action, concurrent edits.
  • Network — offline submit, slow 3G, request abort, mid-upload disconnect, DNS failure.
  • Auth edges — expired token mid-action, revoked permission while session is active, role downgrade.
  • Storage limits — quota exceeded, IndexedDB unavailable, cookie disabled, browser private mode.
  • Internationalisation — multi-byte characters in strings, language affecting numeric format (1,234.56 vs 1.234,56), RTL text.
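Several of these categories are easy to keep as reusable test data. The sketch below shows a few whitespace and numeric edges in Python; `is_blank` and `fits_int32` are hypothetical helpers, and `INT32_MAX` stands in for INT_MAX on a 32-bit signed integer.

```python
import math

INT32_MAX = 2**31 - 1  # stand-in for INT_MAX (32-bit signed)


def is_blank(text):
    """True when the text is empty or contains only ordinary whitespace."""
    return text.strip() == ""


def fits_int32(number):
    """True when the integer fits a signed 32-bit column or field."""
    return -2**31 <= number <= INT32_MAX


# Inputs mapped to whether a naive blank check treats them as blank.
# The zero-width space (U+200B) is not whitespace to str.strip(), so an
# invisible name slips past the check.
WHITESPACE_CASES = {"": True, "   ": True, "\t": True, " a ": False, "\u200b": False}

# Classic floating-point surprises worth a test each.
float_sum_is_exact = (0.1 + 0.2 == 0.3)           # False
float_sum_is_close = math.isclose(0.1 + 0.2, 0.3)  # True
nan_equals_itself = (float("nan") == float("nan"))  # False
```

A list like this works well as parametrized input: run every string and number through each field that accepts free text or numeric values.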

Maintain a personal checklist

Every team grows a "what bites us most" list. Capture yours:

□ Empty + whitespace inputs
□ Leading/trailing whitespace stripped where it shouldn't be
□ Pagination off-by-one (page 0 vs page 1)
□ Timezone mismatch between client and server
□ Stale browser cache after deploy
□ Optimistic UI inconsistent with server state on failure
□ Email verification race after change-of-email
□ Soft-deleted user re-registering

The list grows over years and pays for itself every release.

Combining with formal techniques

Run formal techniques first (EP, BVA, decision tables, state transitions, pairwise) — they give you systematic coverage. Then run error guessing on top — it catches what spec-driven design can't see.

Use Case Testing

Derive tests directly from how a real user accomplishes a goal. Each use case yields one main success scenario plus alternative and exception flows.

Structure of a use case

Title:           Place an order
Actor:           Authenticated customer
Preconditions:   Cart contains at least one in-stock item; payment method on file

Main success scenario:
  1. Customer reviews cart
  2. Customer proceeds to checkout
  3. System validates stock and pricing
  4. Customer confirms shipping address
  5. System charges payment method
  6. System creates order and sends confirmation
  7. System shows confirmation page

Alternative flows:
  4a. Customer applies a coupon
       → System recalculates total, returns to step 5
  4b. Customer changes shipping method
       → System recalculates shipping, returns to step 5
  6a. Customer requests gift wrapping
       → System adds line item, returns to step 5

Exception flows:
  3a. Item out of stock
       → System shows alert, removes item, customer continues with rest
  5a. Payment declined
       → System shows error, customer enters new method, returns to step 5
  5b. Network error mid-payment
       → System retries; on second failure, shows recovery instructions
       → Customer's cart is preserved
  *.  Session timeout at any step
       → System asks customer to re-authenticate, returns to current step

Coverage

Layer                    | Tests
-------------------------|-----------------------------------------------------------
Main success scenario    | 1 (the happy path)
Each alternative flow    | 1 each (4a, 4b, 6a → 3 cases)
Each exception flow      | 1 each (3a, 5a, 5b, * → 4 cases)
Cross-cutting variations | logged-out user; mobile vs desktop; saved card vs new card

The main flow plus all alternatives and exceptions is usually 5–15 tests per use case. Worth tracking against the use-case document for traceability — every alternative and exception flow should map to at least one test case.
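Traceability of this kind can be checked automatically. The sketch below lists the flow identifiers from the use case and maps hypothetical test names to the flows they exercise; the session-timeout flow is left unmapped on purpose to show what a gap looks like.

```python
# Flow identifiers from the "Place an order" use case.
FLOWS = ["main", "4a", "4b", "6a", "3a", "5a", "5b", "*"]

# Hypothetical test names mapped to the flows they exercise.
TEST_TRACE = {
    "test_place_order_happy_path": ["main"],
    "test_apply_coupon": ["4a"],
    "test_change_shipping_method": ["4b"],
    "test_gift_wrapping": ["6a"],
    "test_item_out_of_stock": ["3a"],
    "test_payment_declined_then_retry": ["5a"],
    "test_network_error_mid_payment": ["5b"],
}


def untested_flows(flows, trace):
    """Return the flows that no test claims to cover, in document order."""
    covered = {flow for flow_ids in trace.values() for flow in flow_ids}
    return [flow for flow in flows if flow not in covered]


gaps = untested_flows(FLOWS, TEST_TRACE)
```

Here `gaps` contains only `"*"`, the session-timeout flow. Running a check like this in CI keeps the test suite and the use-case document from drifting apart.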

When to favour use case testing

  • Workflows where the order of steps matters (checkout, onboarding, KYC, multi-step forms).
  • Scenarios driven by persona / role — admin onboarding flow vs end-user.
  • Acceptance testing — UAT scripts written from use cases read naturally.

State transitions, decision tables, and pairwise complement use cases — once you've identified the flows, those techniques tell you which inputs to drive at each step.