
Test Design Techniques

The formal techniques for picking which test cases to write — so your suite catches the bugs that matter without exploding into combinatorial chaos.

Equivalence Partitioning

Inputs that the system should treat the same way belong to the same partition. Test one value from each partition: if one value in a partition fails, the rest usually fail too.

Rules

Identify every distinct valid partition.
Identify every distinct invalid partition (different ways an input can be rejected each get their own).
Pick one representative value per partition. One is enough; more is waste.
A test case can cover several valid partitions at once if their behaviour is independent. Test invalid partitions one at a time, so one rejection can't mask another.

Worked example — Age field accepts 18–65

Partition	Sample value	Expected	Notes
Valid: 18–65	`30`	accepted	The happy path
Invalid: < 18	`10`	rejected (under-age)	Too young
Invalid: > 65	`80`	rejected (over-age)	Too old
Invalid: non-numeric	`"abc"`	rejected (type error)	Different rejection branch
Invalid: empty / null	`""` / `null`	rejected (required field)	Different rejection branch
Invalid: float	`25.5`	rejected (type error)	If field requires int
Invalid: negative	`-5`	rejected (under-age + sign)	Often the same branch as `< 18`, but worth confirming

That's seven test cases instead of every possible value 0–999. Each one represents a class of inputs the system should handle the same way.
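A minimal sketch of how the partitions above could drive a table-driven test. `validate_age` and its result strings are hypothetical stand-ins for the system under test, and the integer-only rule is an assumption taken from the "if field requires int" note.

```python
def validate_age(raw):
    """Hypothetical validator for an integer age field that accepts 18-65."""
    if raw is None or raw == "":
        return "required"
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool) or not isinstance(raw, int):
        return "type error"
    if raw < 18:
        return "under-age"
    if raw > 65:
        return "over-age"
    return "accepted"


# One representative per partition, taken from the table above.
PARTITIONS = [
    ("valid 18-65", 30, "accepted"),
    ("below 18", 10, "under-age"),
    ("above 65", 80, "over-age"),
    ("non-numeric", "abc", "type error"),
    ("empty", "", "required"),
    ("null", None, "required"),
    ("float", 25.5, "type error"),
    ("negative", -5, "under-age"),
]


def run_partitions():
    """Return the names of partitions whose result differs from the expectation."""
    return [name for name, value, expected in PARTITIONS
            if validate_age(value) != expected]
```

An empty list from `run_partitions()` means every partition behaved as expected. A non-empty list names the class of input that broke.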

When to apply it

Surface	Partitions usually look like
Form fields	valid / too short / too long / wrong type / empty
API parameters	valid range / out of range / wrong type / missing / null
Dropdowns	each enum value + missing/invalid option
File uploads	accepted type / wrong type / oversize / corrupt / empty
Date inputs	valid range / before min / after max / wrong format / non-existent date

Common mistakes

Treating all "invalid" inputs as one partition. A user who types letters into a number field hits a different code path than a user who submits an empty form. They're separate partitions.
Picking edge values as the "representative". Boundary values are a separate technique — combine them deliberately, don't conflate.
Skipping the empty/null partition. A large share of production bugs live there.

Boundary Value Analysis

Errors cluster at the edges of equivalence partitions. Off-by-one, < vs <=, inclusive vs exclusive — these are the most common bugs in any range check.

Two-value BVA (standard)

For each boundary, test the boundary itself and one step outside.

       invalid     valid     invalid
   ────────|─────|─────|─────|────────
      min-1   min   max   max+1

Three-value BVA (robust)

Adds one step inside the boundary too.

   invalid    valid             valid    invalid
   ───|────|─────|─────...─────|─────|────|───
   min-1  min  min+1         max-1  max  max+1

Worth the extra two cases when off-by-one bugs would be expensive (auth, billing, safety-critical fields).
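Both variants are mechanical enough to generate. A small sketch, assuming an inclusive range and a fixed step between adjacent values; `boundary_values` is a hypothetical helper name.

```python
def boundary_values(lo, hi, three_value=False, step=1):
    """Return BVA test inputs for the inclusive range lo..hi."""
    # Two-value BVA: each boundary plus one step outside it.
    values = {lo - step, lo, hi, hi + step}
    if three_value:
        # Three-value BVA adds one step inside each boundary.
        values |= {lo + step, hi - step}
    return sorted(values)


print(boundary_values(18, 65))                   # [17, 18, 65, 66]
print(boundary_values(18, 65, three_value=True)) # [17, 18, 19, 64, 65, 66]
```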

Worked example — Password field accepts 8–20 characters

Test	Length	Expected	Notes
Below min	7 chars	invalid	one short
At min	8 chars	valid	exact lower edge
Just above min	9 chars	valid	inside
Just below max	19 chars	valid	inside
At max	20 chars	valid	exact upper edge
Above max	21 chars	invalid	one over

Six test cases, all targeting code that does a length check. If < should have been <=, exactly one of these will catch it.
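To see the six cases catch an off-by-one, compare a correct check with a deliberately broken one. Both functions below are hypothetical illustrations, not code from any real validator.

```python
def is_valid_length(password, min_len=8, max_len=20):
    """Correct check: both boundaries are inclusive."""
    return min_len <= len(password) <= max_len


def is_valid_length_buggy(password, min_len=8, max_len=20):
    """Deliberate off-by-one: '<' where '<=' was intended at the upper edge."""
    return min_len <= len(password) < max_len


# (length, expected validity) for the six rows of the table above.
CASES = [(7, False), (8, True), (9, True), (19, True), (20, True), (21, False)]


def failing_lengths(check):
    """Return the lengths for which the check disagrees with the expectation."""
    return [n for n, expected in CASES if check("x" * n) != expected]


print(failing_lengths(is_valid_length))        # []
print(failing_lengths(is_valid_length_buggy))  # [20]
```

Only the "at max" case exposes the bug. A suite built from mid-range values alone would pass.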

Combining with equivalence partitioning

For each partition, take a typical mid-value (EP) plus the values at and around the boundaries (BVA). For the password example:

Partition	Sample value	Why
Empty	`""`	type/required failure
Below boundary	`"abc"` (3 chars)	well inside the invalid partition
At boundary	`"12345678"` (8 chars)	min — the most error-prone case
Mid-valid	`"abcdefgh1234"` (12 chars)	confirms typical case
At boundary	`"12345678901234567890"` (20 chars)	max
Above boundary	21 chars	one over

Apply it to anything with a range

Range	Boundaries
Numeric (`min..max`)	`min-1, min, min+1, max-1, max, max+1`
String length (`minLen..maxLen`)	same, in characters
Date range	day before, first day, day after; same for end of range
File size	0 bytes, 1 byte, just below limit, exactly at limit, 1 byte over
Pagination	`page=0`, `page=1`, `page=lastPage`, `page=lastPage+1`, very large `page`
Decimal precision	values just over the rounding edge in both directions

Decision Table Testing

When an outcome depends on multiple conditions combining, list every combination and the action it produces. Decision tables guarantee you don't forget a rule.

Components

Conditions — input flags (rows, top half).
Actions — outcomes (rows, bottom half).
Rules — columns. Each column is one combination of condition values.

Process

List every condition.
List every possible action.
Build a column for every combination of condition values (2^n for n boolean conditions).
Fill in the action(s) for each rule.
Reduce: collapse columns where some conditions don't matter (mark them —); drop impossible combinations.
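The "column for every combination" step can be generated rather than typed by hand. A sketch, assuming boolean conditions only; `all_rules` and the condition names are hypothetical.

```python
from itertools import product


def all_rules(conditions):
    """Yield one dict per rule: every Y/N combination of the given conditions."""
    for combo in product([True, False], repeat=len(conditions)):
        yield dict(zip(conditions, combo))


for rule in all_rules(["order_over_50", "premium_member"]):
    print(rule)
```

Filling in actions and collapsing don't-care columns still needs a human who knows the business rule.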

Worked example — Shipping cost calculator

Conditions: order over £50, premium member.

Full table (4 rules from 2 booleans):

Condition	R1	R2	R3	R4
Order > £50	Y	Y	N	N
Premium member	Y	N	Y	N
Action
Free shipping	✓		✓
Standard rate (£3)		✓
Full rate (£8)				✓

Collapsed — premium membership alone unlocks free shipping, so when "Premium member = Y" we don't care about order size:

Condition	R1	R2	R3
Order > £50	—	Y	N
Premium member	Y	N	N
Action
Free shipping	✓
Standard rate (£3)		✓
Full rate (£8)			✓

That's three test cases instead of four — but only because we proved R1 and R3 (in the original) produce the same action.
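One way to turn the full four-rule table into tests. `shipping_cost` is a hypothetical implementation of the rules, and the order totals (60 and 20) are arbitrary values on either side of £50.

```python
def shipping_cost(order_total, premium_member):
    """Hypothetical calculator; rates in pounds, taken from the full table."""
    if premium_member:
        return 0  # free shipping regardless of order size
    if order_total > 50:
        return 3  # standard rate
    return 8      # full rate


# (rule, order_total, premium_member, expected_cost)
RULES = [
    ("R1", 60, True, 0),
    ("R2", 60, False, 3),
    ("R3", 20, True, 0),
    ("R4", 20, False, 8),
]


def failed_rules():
    """Return the rules whose computed cost differs from the table."""
    return [rule for rule, total, premium, expected in RULES
            if shipping_cost(total, premium) != expected]
```

Running all four rules once is a cheap way to justify the collapse: R1 and R3 returning the same cost is the evidence that order size doesn't matter for premium members.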

A larger worked example — Login authorisation

Conditions: valid email format, password matches, account active, account locked, 2FA enabled.

That's 2^5 = 32 combinations, but most collapse out:

Condition	Valid login	Wrong password	Locked	Inactive	2FA needed	Bad email
Valid email format	Y	Y	Y	Y	Y	N
Password matches	Y	N	—	—	Y	—
Account active	Y	—	—	N	Y	—
Account locked	N	—	Y	—	N	—
2FA enabled	N	—	—	—	Y	—
Action
Grant access	✓
Show password error		✓
Show locked banner			✓
Show inactive notice				✓
Prompt for 2FA code					✓
Show format error						✓

Six rules → six test cases that cover every meaningful combination.
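A sketch of the six rules as code. The table does not say which rule wins when several could apply (for example a locked account with a wrong password), so the order of the checks below is an assumption. `login_outcome` and the outcome strings are hypothetical.

```python
from itertools import product


def login_outcome(valid_email, password_ok, active, locked, two_fa):
    """Assumed precedence: format, locked, inactive, password, 2FA, grant."""
    if not valid_email:
        return "format error"
    if locked:
        return "locked banner"
    if not active:
        return "inactive notice"
    if not password_ok:
        return "password error"
    if two_fa:
        return "2FA prompt"
    return "grant access"


# One concrete input per rule; don't-care cells are filled with arbitrary values.
# Tuple order: valid_email, password_ok, active, locked, two_fa.
RULES = [
    ("Valid login", (True, True, True, False, False), "grant access"),
    ("Wrong password", (True, False, True, False, False), "password error"),
    ("Locked", (True, True, True, True, False), "locked banner"),
    ("Inactive", (True, True, False, False, False), "inactive notice"),
    ("2FA needed", (True, True, True, False, True), "2FA prompt"),
    ("Bad email", (False, True, True, False, False), "format error"),
]


def mismatched_rules():
    """Return the rule names whose outcome differs from the table."""
    return [name for name, inputs, expected in RULES
            if login_outcome(*inputs) != expected]


def distinct_outcomes():
    """Outcomes reachable across all 32 raw combinations."""
    return {login_outcome(*combo) for combo in product([True, False], repeat=5)}
```

`distinct_outcomes()` has six members, which matches the six collapsed rules. If your spec defines a different precedence, reorder the checks and add a test for each overlapping pair.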

When to use decision tables

Business rules with multiple conditions (pricing, eligibility, approval flows).
Complex form validation with conditional fields.
Authorisation matrices (role × resource × action).
Insurance / loan / tax calculations.
Any spec that uses the words "if … and … but only when …".

State Transition Testing

For features with discrete states, model the state machine and pick tests systematically — at the right depth.

Components

States — distinct conditions the system can be in.
Events — inputs that may cause a transition.
Transitions — (state, event) → next state.
Guards — conditions that gate a transition (if balance > 0).
Actions — side effects performed during a transition (send email, debit balance).

State transition table

Logged Out ──login (valid)──→ Logged In
Logged Out ──login (bad)────→ Logged Out  (action: show error)
Logged In  ──logout─────────→ Logged Out
Logged In  ──timeout────────→ Logged Out  (action: redirect)

From	Event	Guard	Action	To
Logged Out	Login	valid credentials	show dashboard	Logged In
Logged Out	Login	invalid credentials	show error	Logged Out
Logged In	Logout	—	clear session	Logged Out
Logged In	Timeout	session > 30 min	redirect	Logged Out
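The table maps directly onto a lookup keyed by (state, event). In this sketch the guards are folded into the event names (`login_valid`, `login_invalid`), which is a simplification. `step` and `run` are hypothetical helpers.

```python
TRANSITIONS = {
    ("Logged Out", "login_valid"): ("Logged In", "show dashboard"),
    ("Logged Out", "login_invalid"): ("Logged Out", "show error"),
    ("Logged In", "logout"): ("Logged Out", "clear session"),
    ("Logged In", "timeout"): ("Logged Out", "redirect"),
}


def step(state, event):
    """Return (next_state, action); raise on a transition the table doesn't define."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events, state="Logged Out"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state, _action = step(state, event)
    return state
```

Keeping the model as data means the same dictionary can drive both the tests and a coverage report.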

Coverage levels

Level	What it covers	When to use
All-states	Visit every state at least once	Smoke testing
0-switch (valid transitions)	Cover every valid transition (every row of the table)	Default for most features
1-switch	Cover every pair of consecutive transitions	Critical workflows; chains where intermediate state matters
All-paths	Cover every full path from start to end	Short, finite state machines (wizards, finite workflows)

Worked example — Order status

Draft ──submit──→ Submitted ──approve──→ Approved ──ship──→ Shipped ──deliver──→ Delivered
                  │                       │                  │
                  └─reject→ Rejected      └─cancel→ Cancelled└─return→ Returned

0-switch coverage (every valid transition):

#	From	Event	To
1	Draft	submit	Submitted
2	Submitted	approve	Approved
3	Submitted	reject	Rejected
4	Approved	ship	Shipped
5	Approved	cancel	Cancelled
6	Shipped	deliver	Delivered
7	Shipped	return	Returned
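Going one level deeper means covering every pair of consecutive transitions. Those pairs can be derived from the same seven rows. A sketch, with `transition_pairs` as a hypothetical helper:

```python
ORDER_TRANSITIONS = [
    ("Draft", "submit", "Submitted"),
    ("Submitted", "approve", "Approved"),
    ("Submitted", "reject", "Rejected"),
    ("Approved", "ship", "Shipped"),
    ("Approved", "cancel", "Cancelled"),
    ("Shipped", "deliver", "Delivered"),
    ("Shipped", "return", "Returned"),
]


def transition_pairs(transitions):
    """Pair each transition with every transition that can directly follow it."""
    return [(first, second)
            for first in transitions
            for second in transitions
            if first[2] == second[0]]  # first ends where second starts


for first, second in transition_pairs(ORDER_TRANSITIONS):
    print(f"{first[0]} -{first[1]}-> {first[2]} -{second[1]}-> {second[2]}")
```

For this diagram there are six such pairs, for example Draft, submit, Submitted, reject, Rejected.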

Negative testing — invalid transitions

Equally important: verify that the system rejects transitions that aren't on the diagram. For each state, attempt every event that shouldn't work.

From	Event	Expected
Delivered	submit	rejected — terminal state
Cancelled	ship	rejected — already cancelled
Draft	ship	rejected — must approve first
Rejected	approve	rejected — already rejected

Negative transitions catch state-machine bugs that valid-only testing misses (e.g. a webhook racing the state and re-shipping a cancelled order).

Edge conditions worth probing

Mid-transition crash — kill the process between "auth captured" and "order updated". What state is the order in?
Concurrent transitions — two admins click "Ship" within 100ms. One wins? Both succeed and double-ship?
Replayed event — payment webhook delivered twice. Does the order go to Paid once?
Timer-driven transitions — abandoned cart, expired session. Does the timer fire when the user is idle? When they're active in another tab?
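The replayed-event probe is easy to script. The sketch below assumes each webhook carries a unique event ID and that the handler deduplicates on it. The `Order` class, its states, and `handle_payment_webhook` are all hypothetical.

```python
class Order:
    """Toy order that records how many times a payment was applied."""

    def __init__(self):
        self.state = "Pending"
        self.seen_events = set()
        self.times_paid = 0

    def handle_payment_webhook(self, event_id):
        """Apply the payment once; ignore replays of the same event_id."""
        if event_id in self.seen_events:
            return False
        self.seen_events.add(event_id)
        self.state = "Paid"
        self.times_paid += 1
        return True


order = Order()
first = order.handle_payment_webhook("evt_1")
replay = order.handle_payment_webhook("evt_1")  # same delivery, sent twice
print(first, replay, order.state, order.times_paid)  # True False Paid 1
```

The test that matters is the second call: the state stays Paid and the side effect runs exactly once.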

Pairwise / Combinatorial Testing

Most defects are triggered by a single parameter or by the interaction of two; interactions of three or more are much rarer. Pairwise testing covers every pair of parameter values without running every combination.

Why it works

Empirical studies (Kuhn et al., NIST) found that, depending on the application studied, roughly 70 % to over 90 % of failures were triggered by a single input or a pair of inputs. By covering every pair, you catch most interaction bugs at a fraction of the cost.

Combinatorial explosion

The full combination count is the product of the value counts: 3 × 3 × 3 = 27, 4 × 4 × 4 × 4 = 256, 6 × 6 × 6 × 6 × 6 = 7,776.

Pairwise replaces these with roughly 9, 16, and a little over 36 tests. Most projects can't afford full combinatorial; pairwise is what you can actually run.
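The arithmetic behind those numbers, as a sketch. The pairwise figure computed here is only a lower bound (the two largest parameters alone need that many tests to cover their pairs), and real generators may produce a few more. The helper names are hypothetical.

```python
from math import prod


def full_combinations(sizes):
    """Number of tests needed to run every combination of values."""
    return prod(sizes)


def min_pairwise_tests(sizes):
    """Lower bound for a pairwise set: product of the two largest value counts."""
    largest, second = sorted(sizes, reverse=True)[:2]
    return largest * second


for sizes in ([3, 3, 3], [4, 4, 4, 4], [6, 6, 6, 6, 6]):
    print(sizes, full_combinations(sizes), min_pairwise_tests(sizes))
```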

Worked example — Browser compatibility

Parameters:

Browser: Chrome, Firefox, Safari
OS: Windows, macOS, Linux
Language: EN, FR, ES

Full combinatorial = 27. Pairwise — 9 tests cover every (browser × OS), (browser × lang), and (OS × lang) pair:

#	Browser	OS	Language
1	Chrome	Windows	EN
2	Chrome	macOS	FR
3	Chrome	Linux	ES
4	Firefox	Windows	FR
5	Firefox	macOS	ES
6	Firefox	Linux	EN
7	Safari	Windows	ES
8	Safari	macOS	EN
9	Safari	Linux	FR

Every browser appears with every OS at least once. Every browser appears with every language at least once. Every OS appears with every language at least once. All pairs covered, 9 tests instead of 27.
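Claims like "all pairs covered" are worth checking mechanically, especially after someone edits the set by hand. A sketch of such a check, with `uncovered_pairs` as a hypothetical helper:

```python
from itertools import combinations, product

PARAMS = {
    "Browser": ["Chrome", "Firefox", "Safari"],
    "OS": ["Windows", "macOS", "Linux"],
    "Language": ["EN", "FR", "ES"],
}

# The nine tests from the table above, in (Browser, OS, Language) order.
TESTS = [
    ("Chrome", "Windows", "EN"), ("Chrome", "macOS", "FR"), ("Chrome", "Linux", "ES"),
    ("Firefox", "Windows", "FR"), ("Firefox", "macOS", "ES"), ("Firefox", "Linux", "EN"),
    ("Safari", "Windows", "ES"), ("Safari", "macOS", "EN"), ("Safari", "Linux", "FR"),
]


def uncovered_pairs(params, tests):
    """Return every value pair that no test exercises together."""
    names = list(params)
    missing = []
    for i, j in combinations(range(len(names)), 2):
        seen = {(test[i], test[j]) for test in tests}
        for pair in product(params[names[i]], params[names[j]]):
            if pair not in seen:
                missing.append((names[i], names[j], pair))
    return missing


print(uncovered_pairs(PARAMS, TESTS))  # []
```

Dropping any single row makes the list non-empty, which shows that nine is the minimum for this parameter set.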

Tools that generate pairwise sets

Tool	How
Microsoft PICT	CLI, txt config — `pict params.txt` outputs the test set
AllPairs (Python)	`pip install allpairspy` — programmatic generator
pairwise.org	Directory of pairwise tools and background articles
Hexawise	Commercial, supports constraints and seeding
PICTMaster	Excel-based generator

PICT input file:

Browser:  Chrome, Firefox, Safari
OS:       Windows, macOS, Linux
Language: EN, FR, ES

pict params.txt

Constraints

Real systems have impossible combinations — Safari on Linux doesn't ship. Tools support constraints:

IF [Browser] = "Safari" THEN [OS] <> "Linux";

The generator skips infeasible combinations while still covering all valid pairs.
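To show what "skip infeasible combinations, still cover all valid pairs" means, here is a naive greedy generator in plain Python. It is an illustration only: it is not how PICT or other tools work internally, it enumerates every combination up front, and it will not scale to large models. `pairwise` and `no_safari_on_linux` are hypothetical names.

```python
from itertools import combinations, product


def pairwise(params, is_valid=lambda row: True):
    """Greedy sketch: repeatedly pick the valid row covering the most uncovered pairs."""
    names = list(params)
    rows = [dict(zip(names, combo)) for combo in product(*params.values())]
    rows = [row for row in rows if is_valid(row)]

    def pairs_of(row):
        return {((a, row[a]), (b, row[b])) for a, b in combinations(names, 2)}

    # Only pairs that occur in at least one valid row need covering.
    uncovered = set()
    for row in rows:
        uncovered |= pairs_of(row)

    tests = []
    while uncovered:
        best = max(rows, key=lambda row: len(pairs_of(row) & uncovered))
        tests.append(best)
        uncovered -= pairs_of(best)
    return tests


PARAMS = {
    "Browser": ["Chrome", "Firefox", "Safari"],
    "OS": ["Windows", "macOS", "Linux"],
    "Language": ["EN", "FR", "ES"],
}


def no_safari_on_linux(row):
    """Same rule as the PICT constraint above."""
    return not (row["Browser"] == "Safari" and row["OS"] == "Linux")


TESTS = pairwise(PARAMS, is_valid=no_safari_on_linux)
for test in TESTS:
    print(test)
```

The constraint removes the Safari/Linux pair from the target set, so the generator never needs a row that contains it. For real projects, prefer a maintained tool with proper constraint handling.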

When pairwise fits

Configuration testing — browsers × OSes × screen sizes × locales.
Form fields with many independent dropdowns / toggles.
API parameters — many optional query params with several valid values.
Feature flags matrix — handful of flags, each on/off.
Compatibility — versions of dependencies, plugins, integrations.

When not to use it

Two parameters strongly interact in known ways → enumerate explicitly.
Valid values depend on the current state of a state machine → use state transition testing.
The combinations encode business rules → use a decision table.

Error Guessing & Experience-Based Testing

Formal techniques cover the specifiable test cases. Error guessing fills the gap with judgment — the test cases that come from "I bet this is going to break."

Common error categories

Empty / null / undefined — most common production bug source.
Whitespace — leading/trailing, all-whitespace, tab characters in name fields.
Special characters — quotes, angle brackets, semicolons, emoji, RTL text, zero-width spaces.
Numeric edges — 0, -1, INT_MAX, INT_MAX + 1, NaN, Infinity, 0.1 + 0.2.
Boundary timing — DST transitions, leap years, leap seconds, month/year rollover, midnight UTC vs midnight local.
Concurrency — two clicks within 100ms, page refresh during a long action, concurrent edits.
Network — offline submit, slow 3G, request abort, mid-upload disconnect, DNS failure.
Auth edges — expired token mid-action, revoked permission while session is active, role downgrade.
Storage limits — quota exceeded, IndexedDB unavailable, cookie disabled, browser private mode.
Internationalisation — multi-byte characters in strings, language affecting numeric format (1,234.56 vs 1.234,56), RTL text.
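Several of these categories can be kept as reusable input lists. A sketch, assuming a 32-bit signed `INT_MAX`; the list contents and helper names are illustrative, not exhaustive.

```python
import math

# Numeric edges: zero, negative, INT_MAX (assumed 32-bit), INT_MAX + 1, NaN, Infinity.
NUMERIC_EDGES = [0, -1, 2**31 - 1, 2**31, float("nan"), float("inf")]

# String edges: empty, whitespace, padding, quotes, markup, separators, zero-width space.
STRING_EDGES = ["", "   ", "\t", " padded ", "O'Brien", "<b>x</b>", "a;b", "\u200b"]


def is_effectively_empty(text):
    """True if nothing remains after removing whitespace and zero-width spaces."""
    return text.replace("\u200b", "").strip() == ""


def floats_equal(a, b):
    """Compare floats with a tolerance instead of ==."""
    return math.isclose(a, b, rel_tol=1e-9)


print(0.1 + 0.2 == 0.3)              # False
print(floats_equal(0.1 + 0.2, 0.3))  # True
print("\u200b".strip() == "")        # False: strip() leaves the zero-width space
print(is_effectively_empty("\u200b"))  # True
```

The zero-width space is the interesting one: it passes a naive "not blank" check and still renders as nothing.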

Maintain a personal checklist

Every team grows a "what bites us most" list. Capture yours:

□ Empty + whitespace inputs
□ Leading/trailing whitespace stripped where it shouldn't be
□ Pagination off-by-one (page 0 vs page 1)
□ Timezone mismatch between client and server
□ Stale browser cache after deploy
□ Optimistic UI inconsistent with server state on failure
□ Email verification race after change-of-email
□ Soft-deleted user re-registering

The list grows over years and pays for itself every release.

Combining with formal techniques

Run formal techniques first (EP, BVA, decision tables, state transitions, pairwise) — they give you systematic coverage. Then run error guessing on top — it catches what spec-driven design can't see.

Use Case Testing

Derive tests directly from how a real user accomplishes a goal. Each use case yields one main success scenario plus alternative and exception flows.

Structure of a use case

Title:           Place an order
Actor:           Authenticated customer
Preconditions:   Cart contains at least one in-stock item; payment method on file

Main success scenario:
  1. Customer reviews cart
  2. Customer proceeds to checkout
  3. System validates stock and pricing
  4. Customer confirms shipping address
  5. System charges payment method
  6. System creates order and sends confirmation
  7. System shows confirmation page

Alternative flows:
  4a. Customer applies a coupon
       → System recalculates total, returns to step 5
  4b. Customer changes shipping method
       → System recalculates shipping, returns to step 5
  4c. Customer requests gift wrapping
       → System adds line item, returns to step 5

Exception flows:
  3a. Item out of stock
       → System shows alert, removes item, customer continues with rest
  5a. Payment declined
       → System shows error, customer enters new method, returns to step 5
  5b. Network error mid-payment
       → System retries; on second failure, shows recovery instructions
       → Customer's cart is preserved
  *.  Session timeout at any step
       → System asks customer to re-authenticate, returns to current step

Coverage

Layer	Tests
Main success scenario	1 (the happy path)
Each alternative flow	1 each (4a, 4b, 4c → 3 cases)
Each exception flow	1 each (3a, 5a, 5b, * → 4 cases)
Cross-cutting variations	logged-out user; mobile vs desktop; saved card vs new card

The main flow plus all alternatives and exceptions is usually 5–15 tests per use case. Worth tracking against the use-case document for traceability — every alternative and exception flow should map to at least one test case.
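Traceability can be checked with a few lines. In this sketch `FLOWS` lists a subset of the flow IDs from the use case above, and the test names in `TEST_COVERAGE` are hypothetical.

```python
# A subset of flow IDs from the use case: main scenario, alternatives, exceptions.
FLOWS = ["main", "4a", "4b", "3a", "5a", "5b", "*"]

# Hypothetical test names mapped to the flow IDs they exercise.
TEST_COVERAGE = {
    "test_place_order_happy_path": ["main"],
    "test_apply_coupon": ["4a"],
    "test_change_shipping_method": ["4b"],
    "test_item_out_of_stock": ["3a"],
    "test_payment_declined_then_retry": ["5a"],
}


def untested_flows(flows, coverage):
    """Return the flows that no test claims to exercise."""
    covered = {flow for flow_ids in coverage.values() for flow in flow_ids}
    return [flow for flow in flows if flow not in covered]


print(untested_flows(FLOWS, TEST_COVERAGE))  # ['5b', '*']
```

Run a check like this in CI and the use-case document and the suite can't silently drift apart.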

When to favour use case testing

Workflows where the order of steps matters (checkout, onboarding, KYC, multi-step forms).
Scenarios driven by persona / role — admin onboarding flow vs end-user.
Acceptance testing — UAT scripts written from use cases read naturally.

State transitions, decision tables, and pairwise complement use cases — once you've identified the flows, those techniques tell you which inputs to drive at each step.
