Q13 of 26 · SQL

When would you use UNION vs UNION ALL in a QA context?

SQL · Mid · sql · union · union-all · set-operations

Short answer

UNION removes duplicate rows, which costs an extra sort or hash pass over the combined result. UNION ALL keeps every row, duplicates included, and skips that work, so it is usually faster. Use UNION ALL unless you specifically need deduplication.

Detail

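Both operators combine the results of two SELECT statements. The two SELECTs must return the same number of columns, in the same order, with compatible data types; the column names in the result come from the first SELECT.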

-- UNION: deduplicates (slow on large sets)
SELECT email FROM users_us
UNION
SELECT email FROM users_eu;

-- UNION ALL: keeps every row, including duplicates (faster)
SELECT email FROM users_us
UNION ALL
SELECT email FROM users_eu;
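
When the two tables do not share exactly the same columns, the shorter side can be padded so both SELECTs line up. The sketch below assumes, purely for illustration, that users_eu has a phone column and users_us does not; that schema is not part of the original example.

```sql
-- Hypothetical schema: users_eu(email, phone), users_us(email).
-- Pad the missing column with NULL so both branches return two columns.
-- The result's column names (email, phone) come from the first SELECT.
SELECT email, NULL AS phone
FROM   users_us
UNION ALL
SELECT email, phone
FROM   users_eu;
```

Some engines may need an explicit CAST on the NULL placeholder to settle the column type, so treat the bare NULL as a starting point.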

QA uses of UNION ALL:

  1. Merging test data from multiple environments: combine rows from staging and prod-mirror tables for comparison.
  2. Comparing expected vs actual with full context: tag each side with a label column so mismatches can be filtered out of the combined set afterwards:
SELECT 'expected' AS source, order_id, status FROM expected_orders
UNION ALL
SELECT 'actual'   AS source, order_id, status FROM orders;
  3. Combining audit rows across multiple log tables: useful when logs are partitioned by month.
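
The expected-vs-actual pattern in item 2 only tags the rows; one possible way to reduce the tagged set to mismatches is to group on the compared columns and keep groups that appear on one side only. This is a minimal sketch, assuming order_id is unique within each table; the found_in and tagged aliases are made up for the example.

```sql
-- Rows whose (order_id, status) pair exists in only one of the two tables.
-- Assumption: each table holds at most one row per order_id.
SELECT MIN(source) AS found_in,   -- 'expected' or 'actual': the side that has the row
       order_id,
       status
FROM (
    SELECT 'expected' AS source, order_id, status FROM expected_orders
    UNION ALL
    SELECT 'actual'   AS source, order_id, status FROM orders
) AS tagged
GROUP BY order_id, status
HAVING COUNT(*) = 1               -- matched pairs appear twice and are dropped
ORDER BY order_id, found_in;
```

An order whose status differs shows up as two rows, one per side. An order missing from one table shows up as a single row. UNION ALL matters here: plain UNION would still work because the source label keeps the rows distinct, but it would pay for a deduplication pass that changes nothing.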

When to use UNION (with deduplication): when you want unique emails across two user tables, or unique error codes from two error-log tables, and duplicates would skew your count.
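
As a rough illustration of how duplicates skew a count, the two queries below count the same users_us and users_eu tables with and without deduplication. The combined alias and the output column names are arbitrary.

```sql
-- Unique emails across both regions. UNION collapses repeats,
-- including repeats that occur inside a single table.
SELECT COUNT(*) AS unique_emails
FROM (
    SELECT email FROM users_us
    UNION
    SELECT email FROM users_eu
) AS combined;

-- Total rows with duplicates kept, useful as a sanity check:
-- total_rows minus unique_emails is the number of repeated rows.
SELECT COUNT(*) AS total_rows
FROM (
    SELECT email FROM users_us
    UNION ALL
    SELECT email FROM users_eu
) AS combined;
```

If the two counts match, there are no duplicate emails and UNION ALL alone would have been enough.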

// EXAMPLE

-- Tag rows by source so the expected and actual rows for each order_id sort next to each other
SELECT 'expected' AS source, order_id, status, total_amount
FROM   expected_orders
UNION ALL
SELECT 'actual'   AS source, order_id, status, total_amount
FROM   orders
ORDER BY order_id, source;

// WHAT INTERVIEWERS LOOK FOR

Knowing that UNION ALL is faster because it skips deduplication, plus a concrete use case. The expected-vs-actual labelling pattern is the strongest answer.