Test Estimation Techniques

7 min read

"How long will testing take?" is the question every tester is asked, and the one most testers answer worst. Underestimate and you blow the sprint and miss bugs. Overestimate and you slow the team. Estimation is a learnable skill, and the techniques here will let you give numbers you can defend.

Why estimation matters

Three real decisions ride on your estimate:

  • Sprint planning: how many stories fit in the sprint once testing time is included.
  • Release planning: roadmaps assume a predictable testing cadence, and bad estimates compound across releases.
  • Resource allocation: does this feature need one tester for two days or two testers for three?

A tester who estimates well is taken seriously in those meetings; a tester who never quite finishes on time loses influence over scope and dates.

Four techniques you should know

1. Work breakdown structure (WBS). Break the feature into the smallest testable units, estimate each, and sum the estimates. For a checkout: payment A (4h), payment B (3h), guest checkout (2h), logged-in (2h), shipping option 1 (1h), shipping option 2 (1h), for a total of 13 hours. The result is auditable: someone can ask "why two hours for guest checkout?" and you can answer with the scenarios you plan to cover.

2. Expert judgement. A senior tester says "this will take three days," based on pattern recognition from similar features. Faster than WBS but only as reliable as the expert's memory. Use early in sprint planning; refine with WBS once requirements solidify.

3. Historical data. "Last release this took 16 hours, so this will too." Cheap and surprisingly accurate when the work really is similar. Most teams underuse this — tracking actuals is a small investment with a long payoff.

4. Three-point estimation. For each task, estimate optimistic (O), most likely (L), and pessimistic (P). The expected value is (O + 4L + P) / 6. The 4× weight on L anchors the estimate on the most likely case, while O and P pull it toward the best and worst cases. If O and P are far apart, the task is risky, and that is itself useful information.
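To make technique 1 concrete, here is a minimal sketch of a work breakdown kept as data, using the checkout numbers from the WBS example above. The dictionary layout and variable names are illustrative choices, not a prescribed format.

```python
# Work breakdown for the checkout example: testable unit -> estimated hours.
wbs = {
    "payment A": 4,
    "payment B": 3,
    "guest checkout": 2,
    "logged-in checkout": 2,
    "shipping option 1": 1,
    "shipping option 2": 1,
}

total_hours = sum(wbs.values())

# Print the largest items first so a reviewer sees where the time goes.
for unit, hours in sorted(wbs.items(), key=lambda item: item[1], reverse=True):
    print(f"{unit:<20} {hours}h")
print(f"{'total':<20} {total_hours}h")
```

Keeping each unit as its own line is what makes the estimate auditable: any single number can be questioned and defended without reopening the whole total.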

Three-point estimation in practice

For a single task, testing the new card payment integration, the three estimates are 3, 5, and 9 hours. The expected value is (3 + 4×5 + 9) / 6 ≈ 5.3 hours. That number is more defensible than any single guess because it shows your reasoning. If you do this for each piece of work in the feature, you get a defensible total and a list of the riskiest items, which is exactly what a sprint planning meeting needs.
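The same calculation can be sketched in a few lines of Python. Only the card payment row (3, 5, 9) comes from the example above; the other two rows are made-up values, and ranking by the gap between P and O is one simple way to surface risky items, not the only one.

```python
def three_point(optimistic, likely, pessimistic):
    """Expected effort using the (O + 4L + P) / 6 weighting."""
    return (optimistic + 4 * likely + pessimistic) / 6


# (O, L, P) in hours. Rows marked hypothetical are illustration only.
tasks = {
    "card payment integration": (3, 5, 9),
    "Apple Pay (hypothetical)": (2, 4, 12),
    "gift card (hypothetical)": (2, 3, 4),
}

expected = {name: three_point(*olp) for name, olp in tasks.items()}
total = sum(expected.values())

# Rank tasks by the gap between pessimistic and optimistic, widest first.
by_risk = sorted(tasks, key=lambda name: tasks[name][2] - tasks[name][0], reverse=True)

for name in by_risk:
    o, _, p = tasks[name]
    print(f"{name:<28} expected {expected[name]:.1f}h, spread {p - o}h")
print(f"total expected: {total:.1f}h")
```

The output gives both things the meeting needs: a total, and an ordering that puts the widest O-to-P gap at the top.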

Factors that quietly inflate the estimate

Pure WBS or three-point estimates tend to come in low because testers forget the work that surrounds the testing itself:

  • Test data setup: 30 accounts with different states (verified, suspended, premium). Often 10–20% of effort.
  • Environment availability: shared staging means hours lost to other people deploying over your build.
  • Dependencies: a Stripe sandbox test takes longer when the Stripe sandbox is flaky. Real integrations are rarely as reliable as you hope.
  • Bug verification: every bug found generates a re-test cycle. A feature with 8 bugs costs at least 8 cycles you did not estimate.
  • Tester experience: a senior can move about twice as fast through a known codebase as a tester in their first month.
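Two of the factors above, test data setup and bug verification, are easy to put numbers on. The sketch below is a rough illustration under stated assumptions: the function name, the 20 hours of core testing, and the half hour per re-test cycle are all hypothetical, and the 15% data setup share is simply the midpoint of the 10–20% range, applied here to core testing hours.

```python
def surrounding_work(core_hours, bugs_expected, retest_hours_per_bug, data_setup_share):
    """Hours for test data setup and bug re-tests on top of core test execution."""
    data_setup = core_hours * data_setup_share
    bug_verification = bugs_expected * retest_hours_per_bug
    return data_setup + bug_verification


# Hypothetical inputs: 20h of core testing, 8 bugs at 0.5h per re-test cycle,
# and test data setup at 15% of core effort.
core = 20
extra = surrounding_work(
    core,
    bugs_expected=8,
    retest_hours_per_bug=0.5,
    data_setup_share=0.15,
)
print(f"core {core}h + surrounding {extra:.1f}h = {core + extra:.1f}h")
```

Environment downtime, flaky dependencies, and tester experience are harder to model this way and are usually better handled through the buffer.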

A worked example: a new checkout flow

Imagine you need to estimate testing for a new checkout that supports 3 payment methods (card, Apple Pay, gift card), 2 shipping options (standard, express), and 2 customer states (guest, logged-in). The combinations alone are 3 × 2 × 2 = 12 flows.
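If it helps to see the 12 flows written out, a short sketch with Python's standard `itertools.product` enumerates them. The list names are illustrative; the values are the ones named in the paragraph above.

```python
from itertools import product

payment_methods = ["card", "Apple Pay", "gift card"]
shipping_options = ["standard", "express"]
customer_states = ["guest", "logged-in"]

# Every combination of payment method, shipping option, and customer state.
flows = list(product(payment_methods, shipping_options, customer_states))

print(f"{len(flows)} flows")
for payment, shipping, customer in flows:
    print(f"{customer} / {payment} / {shipping}")
```

A generated list like this is also a handy checklist for deciding which combinations get a full test and which only a smoke check.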

  • Test data setup: 4 hours (12 representative accounts and inventory states).
  • Per-payment-method functional tests: 5h, 4h, 3h (three-point expected). Total: 12 hours.
  • Shipping option tests: 2 hours each. Total: 4 hours.
  • Guest vs logged-in: add 3 hours of cross-state edge cases.
  • Cross-browser smoke (3 browsers × 1 happy path): 3 hours.
  • Exploratory testing time-box: 4 hours.
  • Bug verification buffer: 4 hours.

Total: 34 hours, roughly 4–5 working days for one tester. Now apply a buffer.
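The arithmetic behind that total can be checked with a small sketch. The line items are the ones listed above; the hours-per-day values are assumptions, since how many focused testing hours fit in a working day varies by team.

```python
# Line items from the checkout breakdown, in hours.
line_items = {
    "test data setup": 4,
    "payment method tests (5 + 4 + 3)": 12,
    "shipping option tests (2 x 2)": 4,
    "guest vs logged-in edge cases": 3,
    "cross-browser smoke": 3,
    "exploratory time-box": 4,
    "bug verification buffer": 4,
}

total_hours = sum(line_items.values())
print(f"total: {total_hours}h")

# Assumed focused testing hours per working day; adjust to your team.
for hours_per_day in (8, 7):
    print(f"at {hours_per_day}h/day: {total_hours / hours_per_day:.1f} days")
```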

The "add 20%" rule and why it is not enough

Many teams add a flat 20% buffer to their estimates. That is better than nothing, but it is usually too small for one specific reason: it covers known unknowns (a few bugs, a couple of late questions) but not unknown unknowns (an integration that suddenly breaks, a requirement that pivots mid-sprint). For first-of-its-kind features and integrations, 30–50% is closer to right. For well-understood, repeated work, 10–15% is enough. Be honest with yourself about which kind of work you are estimating, and document your reasoning so the buffer is defensible.
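As a rough sketch, the buffer ranges above can be applied to the 34-hour checkout estimate to show how much the choice matters. The labels "novel" and "repeated" and the helper name are hypothetical; the percentages are the ones given in this section.

```python
# Buffer ranges (low, high) as fractions of the base estimate.
BUFFER_RANGES = {
    "novel": (0.30, 0.50),     # first-of-its-kind features and integrations
    "repeated": (0.10, 0.15),  # well-understood, repeated work
}


def buffered_range(base_hours, work_type):
    """Return (low, high) total hours after applying the buffer range."""
    low, high = BUFFER_RANGES[work_type]
    return base_hours * (1 + low), base_hours * (1 + high)


base = 34  # checkout estimate from the worked example
flat = base * 1.20
print(f"flat 20%: {flat:.1f}h")

for work_type in BUFFER_RANGES:
    low_total, high_total = buffered_range(base, work_type)
    print(f"{work_type}: {low_total:.1f}h to {high_total:.1f}h")
```

Reporting the result as a range rather than a single number keeps the uncertainty visible to whoever reads the estimate.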

⚠️ Common mistakes

  • Estimating only the "fun" parts. Testers naturally estimate the scenarios they will run and forget the test data, environment friction, and bug verification cycles. The work surrounding testing is often 30–40% of the total.
  • Caving when a manager pushes back. Defending an estimate is a learnable skill. If the work cannot fit in the available time, the answer is to reduce scope or extend the timeline, not to silently agree to a number you know is wrong.
  • Never tracking actuals. You cannot calibrate estimation without tracking how long the work actually took. Even a simple spreadsheet of estimate vs actual, updated each sprint, will make you 30–40% more accurate within a quarter.
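The estimate-versus-actual spreadsheet can be as simple as the sketch below. The feature names and hours are invented for illustration, and scaling new estimates by the overall actual-to-estimate ratio is one simple calibration approach among several.

```python
# Hypothetical log of (feature, estimated hours, actual hours).
history = [
    ("login redesign", 16, 20),
    ("search filters", 10, 13),
    ("profile page", 6, 7),
]

total_estimated = sum(estimated for _, estimated, _ in history)
total_actual = sum(actual for _, _, actual in history)

# A ratio above 1 means past work took longer than estimated.
ratio = total_actual / total_estimated


def calibrated(raw_estimate):
    """Scale a new estimate by the historical actual-to-estimate ratio."""
    return raw_estimate * ratio


print(f"actual/estimate ratio: {ratio:.2f}")
print(f"raw 20h -> calibrated {calibrated(20):.1f}h")
```

With only a few rows the ratio is noisy, so treat it as a prompt to ask why the estimates ran over rather than as a precise correction factor.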

🎯 Practice task

Pick a feature you have tested recently or that you use daily — a sign-up flow, a payment form, a search filter. Spend 25 minutes producing a complete, defensible estimate:

  1. List every testable unit using a work breakdown structure. Aim for 8–15 items.
  2. For the three highest-effort items, do three-point estimation: write down O, L, and P, then compute (O + 4L + P) / 6.
  3. For the rest, use expert judgement — single-number estimates with a one-line reason.
  4. Add explicit lines for test data setup, bug verification, and a buffer (be honest about whether 20% is enough).
  5. Sum the total and write a one-paragraph "estimation memo" that you could hand to a manager. Include your top two assumptions and your top risk.

You now have a number you can defend. Even better, you have the reasoning — and reasoning travels with you to the next feature, the next sprint, and the next role.
