Q17 of 26 · SQL
How do you verify referential integrity using SQL after a data migration?
SQLMidsqlreferential-integritymigrationdata-validationforeign-key
Short answer
Short answer: Run LEFT JOIN checks for every foreign key relationship to find child rows with no matching parent, and compare row counts against the source to catch missing records.
Detail
A migration checklist using SQL:
1. Row count comparison (source vs target):
SELECT 'orders' AS tbl, COUNT(*) FROM source_db.orders
UNION ALL
SELECT 'orders', COUNT(*) FROM target_db.orders;
-- Counts should match
2. Orphan check for every FK relationship:
-- order_items with no matching order
SELECT COUNT(*) AS orphaned_items
FROM order_items oi
LEFT JOIN orders o ON o.id = oi.order_id
WHERE o.id IS NULL;
-- Expect 0
-- orders with no matching customer
SELECT COUNT(*) AS orphaned_orders
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id
WHERE c.id IS NULL;
3. Duplicate primary key check:
SELECT id, COUNT(*) FROM orders GROUP BY id HAVING COUNT(*) > 1;
-- Expect 0 rows
4. NULL where NOT NULL is expected:
SELECT COUNT(*) FROM orders WHERE customer_id IS NULL;
-- Expect 0
5. Value range / enum validation:
SELECT DISTINCT status FROM orders
WHERE status NOT IN ('pending','paid','shipped','completed','cancelled');
-- Expect 0 rows (no unknown statuses)
These checks form a post-migration test suite that catches common data corruption patterns before the application goes live.
// WHAT INTERVIEWERS LOOK FOR
A structured checklist approach: row counts, orphan checks, PK uniqueness, NOT NULL, enum validity. Framing it as a repeatable test suite, not a one-off query.