Q11 of 26 · SQL
What is a subquery and when would you use one instead of a JOIN in QA work?
Short answer
Short answer: A subquery is a SELECT nested inside another query. Use it when the inner result is a single value or a list that filters the outer query, and when a JOIN would produce duplicate rows that are hard to deduplicate.
Detail
Subqueries come in three flavours:
Scalar subquery — returns a single value, used in SELECT or WHERE:
-- Orders whose total exceeds the average order amount
SELECT order_id, total_amount
FROM orders
WHERE total_amount > (SELECT AVG(total_amount) FROM orders);
List subquery — returns a column of values, used with IN/NOT IN:
-- Users who have NEVER placed an order
SELECT id, email
FROM users
WHERE id NOT IN (SELECT DISTINCT customer_id FROM orders);
Correlated subquery — references the outer query, runs once per outer row:
-- Most recent order date per user
SELECT u.id, u.email,
(SELECT MAX(created_at) FROM orders o WHERE o.customer_id = u.id) AS last_order
FROM users u;
JOIN vs subquery: JOINs are generally faster and the query planner optimises them well. Subqueries are clearer when the inner logic is a filter condition rather than a data source. NOT IN subqueries have a NULL gotcha — if any value in the subquery result is NULL, the whole NOT IN returns no rows; NOT EXISTS is safer.
// EXAMPLE
-- Users with no orders (safer than NOT IN when customer_id can be NULL)
SELECT id, email
FROM users u
WHERE NOT EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = u.id
);