Heuristics, tours, and prompts for finding bugs that scripted tests miss. Pull one off the shelf when you're staring at a feature and don't know where to start.
SFDPOT — San Francisco Depot
A Bach-and-Bolton coverage heuristic. For each dimension, ask the listed questions and you'll surface test ideas you would have missed.
Structure — what is it made of?
Ask
Examples
What components make up this feature?
Front-end form, API, validation rules, DB tables, queue, worker
What files / modules does it touch?
auth.ts, User.java, migration 0042_add_phone.sql
What database tables / collections?
users, sessions, audit_log
What configuration files?
app.yml, feature flags, environment variables
What third-party libraries?
Stripe SDK, Auth0, AWS S3
Function — what does it do?
Ask
Examples
What can a user accomplish?
Create account, reset password, upload avatar
What can the system do automatically?
Send digest email, expire stale sessions, archive old data
What functions exist for admins / power users?
Impersonate, export, bulk-delete
What's missing that should be here?
"Forgot password" link, undo, confirmation dialog
Data — what does it process?
Ask
Examples
What inputs flow in?
Form fields, query params, headers, file uploads
What outputs flow out?
API responses, emails, downloads, audit log entries
What is stored, where, and for how long?
Session token in cookie 30d, password hash in DB, image in S3
Different roles, locales, plans, OS, browser, device
Repaired
Recently fixed
Verify the fix and check for fresh regressions around it
Chronic
Persistently flaky / buggy
Areas that bite every release no matter what
FEW HICCUPPS — Consistency Heuristics
Bach and Kaner's consistency oracles. The product should be consistent with each of these — call out any divergence as a bug, observation, or question.
Letter
Consistency with…
Probe
Familiar
Similar products
Does it follow conventions a user already knows?
Explainability
A clear explanation
Could you explain this behaviour to a customer with a straight face?
World
Reality
Does it model real-world physics, money, dates, geography correctly?
History
Past versions
Has long-standing behaviour silently changed?
Image
Brand and reputation
Does this match the polish customers expect from us?
Comparable
Comparable products
How does the closest competitor handle this case?
Claims
Marketing and docs
Do the screenshots, docs, and copy still match the product?
User expectations
What users expect
Is the surprise factor low for the typical user?
Purpose
Stated purpose
Does this support the feature's reason for existing?
Product
Itself
Are similar things in the app handled the same way internally?
Standards
Applicable standards
RFCs, W3C, WCAG, ISO — does it comply where required?
Touring Heuristics (James Whittaker)
Run a tour when you need a focused mode of exploration. Each tour has a different goal and yields different bugs.
Guidebook Tour — Walk through the user-facing documentation step by step. Verify every example works as written; flag wording that's misleading or stale.
Money Tour — Test the features the company makes money on (checkout, subscription upgrade, paid-only flows). Bugs here have the highest blast radius.
Landmark Tour — List the 5–10 features users care about most, then visit each. Useful as a smoke-test charter.
Intellectual Tour — Find the hardest features to test (state machines, concurrent editing, billing). Spend a session on the one you're most afraid of.
FedEx Tour — Follow data through the system end to end: input → validation → processing → storage → output → notification. Look for places it can be lost or transformed wrong.
Garbage Collector Tour — Feed the system input it doesn't want: invalid types, oversize files, malformed JSON, unexpected encodings. Check how cleanly errors propagate.
Bad Neighborhood Tour — Spend the session in areas where bugs cluster historically. Look at recent bug reports and test their neighbourhood.
Museum Tour — Test legacy features no one has touched in years. They often quietly broke during refactors.
Back Alley Tour — Explore the least-used features (admin tools, edge-case settings, obscure shortcuts). Expect more bugs per minute than in the main flows.
All-Nighter Tour — Leave the app running overnight. Memory leaks, expired sessions, day-rollover bugs, scheduled-job failures often only show up after hours.
Supermodel Tour — Test only the UI: layout, alignment, colour, contrast, hover/focus states. Don't click anything that submits.
Couch Potato Tour — Do the bare minimum. Accept defaults. Click "OK" without reading. Surfaces what happens to users who don't pay attention.
Session-Based Test Management (SBTM)
A way to make exploratory testing accountable without scripting it.
The structure
Concept
Definition
Session
An uninterrupted block of testing time, usually 60–90 minutes
Charter
A single mission for the session, written before it starts
Debrief
A short conversation after the session — surface findings and refine the next charter
Example charters
Explore the login flow with invalid credentials across Chrome, Safari, and Firefox.
Probe the new bulk-import feature with the Garbage Collector tour for 60 minutes.
Verify the password-reset email survives quoted-printable encoding in Outlook.
Run the Money Tour on the upgrade flow with a previously-cancelled subscription.
Session sheet template
Charter: Explore the bulk-import feature using the Garbage Collector tour
Tester: Vimal
Date / time: 2026-05-03 / 10:00–11:00
Duration: 60 minutes (45 testing, 15 setup/notes)
Areas covered: CSV upload UI, /api/imports, validation messages, error reporting
Bugs found:
#1234 Empty CSV is accepted, creates 0-row import (silent)
#1235 CSV with BOM at start strips first column header
#1236 100 MB file kills the upload progress bar (frozen at 0%)
Issues / questions:
- Spec doesn't say whether duplicate rows should error or merge
- 415 vs 422 inconsistency between front-end and API for "wrong file type"
Notes:
- Drag-and-drop works on Chrome but not Safari iOS
- Progress bar resets on tab switch — confirm if intentional
Useful metrics (don't overdo them)
Sessions completed vs planned
Charter completion % — did you finish the mission, or get distracted?
Bugs / session — track trend, not absolute (rises with familiarity, then falls)
Coverage map — which areas have been visited recently, which are stale
Input Variation Techniques
What to throw at every input field, every API parameter, every config value.
Boundary values
For any numeric or length-bounded input, test six points:
min - 1 ← invalid
min ← valid (lower edge)
min + 1 ← valid (just inside)
max - 1 ← valid (just inside)
max ← valid (upper edge)
max + 1 ← invalid
Equivalence partitions
Pick one representative from each class instead of testing every value.
0 chars
1 char
255 chars legacy varchar boundary
256 chars
1000+ chars paragraph / paste from doc
1 MB paste from a log file
Numbers
0
-0 different from 0 in some systems
-1
1
INT_MAX (2147483647), INT_MAX + 1
INT_MIN, INT_MIN - 1
3.14159, 1e308, 1e-308
NaN, Infinity, -Infinity
Dates and times
1970-01-01T00:00:00Z Unix epoch
1969-12-31T23:59:59Z pre-epoch (often breaks)
2000-02-29 leap day
2024-02-29 vs 2025-02-29 leap year vs non-leap
2038-01-19T03:14:07Z 32-bit Unix overflow
DST forward / backward day timezone math
Dec 31 → Jan 1 year rollover
Feb 28 → Mar 1 month rollover
local time in non-UTC zone server vs client time mismatch
File uploads
Probe
What it tests
Wrong extension (.exe renamed to .png)
MIME-type validation vs extension check
Empty file (0 bytes)
"did the upload succeed?" branch
Huge file (1 GB+)
Server limits, progress, timeout, memory
No extension
Default handling
Double extension (avatar.png.exe)
Sanitisation
Path-traversal name (../../evil.png)
Filename sanitisation
Same filename twice
Conflict / overwrite policy
File during network drop
Resume / retry
State Transition Testing
For any feature with discrete states (orders, accounts, subscriptions, document workflow), explicitly test the transitions.
A four-step recipe
Enumerate every state the system can be in.
Map every transition the spec says is valid.
Test each valid transition with the action that triggers it.
Attempt every invalid transition — expect a clean refusal, not a crash.